@josstei/maestro 1.6.4-rc.1 → 1.6.4-rc.3

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (42)
  1. package/CHANGELOG.md +2 -1
  2. package/EXAMPLES.md +2 -2
  3. package/GEMINI.md +46 -26
  4. package/QWEN.md +63 -30
  5. package/claude/.claude-plugin/plugin.json +1 -1
  6. package/claude/src/platforms/shared/agent-names.js +10 -5
  7. package/claude/src/skills/shared/delegation/SKILL.md +18 -1
  8. package/claude/src/skills/shared/design-dialogue/SKILL.md +1 -1
  9. package/claude/src/skills/shared/execution/SKILL.md +1 -1
  10. package/claude/src/skills/shared/implementation-planning/SKILL.md +30 -26
  11. package/claude/src/skills/shared/session-management/SKILL.md +4 -4
  12. package/claude/src/version.json +1 -1
  13. package/docs/architecture.md +24 -11
  14. package/docs/cicd.md +26 -15
  15. package/docs/flow.md +14 -3
  16. package/docs/maestro-cheatsheet.md +8 -0
  17. package/docs/overview.md +2 -2
  18. package/docs/runtime-codex.md +12 -12
  19. package/docs/runtime-gemini.md +5 -2
  20. package/docs/runtime-qwen.md +9 -6
  21. package/docs/usage.md +11 -8
  22. package/gemini-extension.json +2 -1
  23. package/package.json +1 -1
  24. package/plugins/maestro/.codex-plugin/plugin.json +1 -1
  25. package/plugins/maestro/.mcp.json +1 -1
  26. package/plugins/maestro/README.md +2 -2
  27. package/plugins/maestro/src/platforms/shared/agent-names.js +10 -5
  28. package/plugins/maestro/src/skills/shared/delegation/SKILL.md +18 -1
  29. package/plugins/maestro/src/skills/shared/design-dialogue/SKILL.md +1 -1
  30. package/plugins/maestro/src/skills/shared/execution/SKILL.md +1 -1
  31. package/plugins/maestro/src/skills/shared/implementation-planning/SKILL.md +30 -26
  32. package/plugins/maestro/src/skills/shared/session-management/SKILL.md +4 -4
  33. package/plugins/maestro/src/version.json +1 -1
  34. package/qwen-extension.json +2 -1
  35. package/scripts/npm-publish-idempotent.js +153 -0
  36. package/src/platforms/metadata-shared.js +3 -1
  37. package/src/platforms/shared/agent-names.js +10 -5
  38. package/src/skills/shared/delegation/SKILL.md +18 -1
  39. package/src/skills/shared/design-dialogue/SKILL.md +1 -1
  40. package/src/skills/shared/execution/SKILL.md +1 -1
  41. package/src/skills/shared/implementation-planning/SKILL.md +30 -26
  42. package/src/skills/shared/session-management/SKILL.md +4 -4
package/CHANGELOG.md CHANGED
@@ -15,10 +15,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
  ### Changed
 
- - **npm package identity**: renamed the planned npm package to `@josstei/maestro`, added `hello@josstei.dev` to public author metadata, and moved the stable release publish path toward GitHub Actions trusted publishing.
+ - **npm package identity**: renamed the planned npm package to `@josstei/maestro`, added `hello@josstei.dev` to public author metadata, and moved the stable release publish path into GitHub Actions with npm token authentication.
 
  ### Fixed
 
+ - **Stable npm release recovery**: Release now uses `NPM_TOKEN` for stable publishes, supports manual recovery from an existing `vX.Y.Z` tag and target SHA, and enforces a stable-only `latest` dist-tag through the idempotent npm publish helper.
  - **Codex plugin MCP server fails to start**: corrected `npx` args in `plugins/maestro/.mcp.json` — added `-p`/`--package` flag so `maestro-mcp-server` is resolved as the binary name rather than an argument to the package's default binary.
  - **Release metadata drift**: runtime manifests, marketplace entries, detached payload versions, and Codex MCP package specs are now generated from `package.json` so stable and prerelease packages stay self-consistent.
 
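The stable-only `latest` rule in the new changelog entry can be sketched as a small version check. This is a minimal illustration, not the contents of `scripts/npm-publish-idempotent.js`; the function name `distTagFor` and the `next` tag used for prereleases are assumptions made for the example.

```javascript
// Hypothetical helper: choose an npm dist-tag from a semver string.
// Assumption: prereleases go to a "next" tag; the real helper may differ.
function distTagFor(version) {
  // A semver prerelease carries a hyphenated suffix after MAJOR.MINOR.PATCH,
  // e.g. "1.6.4-rc.3". Build metadata ("+sha") does not affect stability.
  const match = /^(\d+)\.(\d+)\.(\d+)(-[0-9A-Za-z.-]+)?(\+[0-9A-Za-z.-]+)?$/.exec(version);
  if (!match) {
    throw new Error(`Not a semver version: ${version}`);
  }
  const isPrerelease = Boolean(match[4]);
  return isPrerelease ? 'next' : 'latest';
}

console.log(distTagFor('1.6.4'));      // stable version
console.log(distTagFor('1.6.4-rc.3')); // prerelease version
```

Keeping the decision in a pure function makes it easy to check before any publish command runs.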
package/EXAMPLES.md CHANGED
@@ -247,9 +247,9 @@ Source: `justfile`, `package.json`
  ```bash
  # edit README.md, EXAMPLES.md, docs/*.md, or canonical src/ docs as appropriate
  node --test tests/unit/doc-drift-guard.test.js
- just check
+ node scripts/generate.js --diff
  ```
 
- Expected outcome: user-facing docs remain aligned with command names, runtime counts, MCP tool names, and generated-output rules.
+ Expected outcome: user-facing docs remain aligned with command names, runtime counts, MCP tool names, and generated-output rules, and the generator reports no additional pending runtime output. In CI or a clean worktree, `just check` covers the same drift check with `git diff --exit-code`.
 
  Source: `tests/unit/doc-drift-guard.test.js`
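Conceptually, a drift check compares what a generator would write against what is already on disk. The sketch below shows only that comparison; `findPendingOutputs`, its `Map` inputs, and the sample file contents are hypothetical and do not describe the actual interface of `scripts/generate.js`.

```javascript
// Hypothetical drift check: compare generated file contents with the
// contents currently on disk and list the paths that would change.
function findPendingOutputs(generated, onDisk) {
  const pending = [];
  for (const [path, content] of generated) {
    // A file is pending when it is missing or its content differs.
    if (!onDisk.has(path) || onDisk.get(path) !== content) {
      pending.push(path);
    }
  }
  return pending.sort();
}

// Illustrative contents, not copied from the package.
const generated = new Map([
  ['claude/src/version.json', '{"version":"1.6.4-rc.3"}'],
  ['plugins/maestro/src/version.json', '{"version":"1.6.4-rc.3"}'],
]);
const onDisk = new Map([
  ['claude/src/version.json', '{"version":"1.6.4-rc.3"}'],
  ['plugins/maestro/src/version.json', '{"version":"1.6.4-rc.1"}'],
]);

console.log(findPendingOutputs(generated, onDisk));
```

An empty result corresponds to "no additional pending runtime output".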
package/GEMINI.md CHANGED
@@ -76,7 +76,7 @@ For each domain, determine if the task has needs that warrant specialist involve
 
  | Domain | Signal questions | Candidate agents |
  | --- | --- | --- |
- | Engineering | Does the task involve code, infrastructure, or data? | `architect`, `api_designer`, `coder`, `code_reviewer`, `tester`, `refactor`, `data_engineer`, `debugger`, `devops_engineer`, `performance_engineer`, `security_engineer`, `technical_writer` |
+ | Engineering | Does the task involve code, infrastructure, APIs, data, or delivery? | `architect`, `api_designer`, `coder`, `code_reviewer`, `tester`, `refactor`, `data_engineer`, `database_administrator`, `debugger`, `devops_engineer`, `integration_engineer`, `platform_engineer`, `cloud_architect`, `solutions_architect`, `site_reliability_engineer`, `observability_engineer`, `performance_engineer`, `security_engineer`, `technical_writer`, `release_manager` |
  | Product | Are requirements unclear, or does success depend on user outcomes? | `product_manager` |
  | Design | Does the deliverable have a user-facing interface or interaction? | `ux_designer`, `accessibility_specialist`, `design_system_engineer` |
  | Content | Does the task produce or modify user-visible text, copy, or media? | `content_strategist`, `copywriter` |
@@ -84,13 +84,16 @@ For each domain, determine if the task has needs that warrant specialist involve
  | Compliance | Does the task handle user data, payments, or operate in a regulated domain? | `compliance_reviewer` |
  | Internationalization | Must the deliverable support multiple locales? | `i18n_specialist` |
  | Analytics | Does success need to be measured, or does the feature need instrumentation? | `analytics_engineer` |
+ | ML/AI | Does the task involve model training, inference, prompts, or model operations? | `ml_engineer`, `mlops_engineer`, `prompt_engineer` |
+ | Mobile | Does the task target iOS, Android, React Native, Flutter, or mobile release constraints? | `mobile_engineer` |
+ | Mainframe / IBM | Does the task involve COBOL, JCL, DB2 for z/OS or IBM i, HLASM, RACF, CICS, IMS, or USS? | `cobol_engineer`, `db2_dba`, `zos_sysprog`, `hlasm_assembler_specialist`, `ibm_i_specialist` |
 
  Skip domains where the answer is clearly "no." For relevant domains, include appropriate agents in the phase plan alongside engineering agents. Domain agents participate at whatever phase makes sense — design, implementation, or post-build audit — based on the specific task.
 
  Apply domain analysis proportional to `task_complexity`:
  - `simple`: Engineering domain only. Skip other domains unless explicitly requested.
  - `medium`: Engineering + domains with clear signals from the task description.
- - `complex`: Full 8-domain sweep (current behavior).
+ - `complex`: Full domain sweep (current behavior).
 
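The `task_complexity` rule above can be read as a filter over the domain table. A minimal sketch, assuming a hypothetical `selectDomains` helper and a caller-supplied set of domains whose signal question was answered "yes"; the domain list covers only the rows visible in this hunk, and explicit user requests for extra domains are not modeled.

```javascript
// Domains visible in this hunk of the table (the full file may list more).
const DOMAINS = [
  'Engineering', 'Product', 'Design', 'Content', 'Compliance',
  'Internationalization', 'Analytics', 'ML/AI', 'Mobile', 'Mainframe / IBM',
];

// Hypothetical helper: decide which domains to analyze.
// `signaled` holds domains whose signal question was answered "yes".
function selectDomains(taskComplexity, signaled = new Set()) {
  switch (taskComplexity) {
    case 'simple':
      // Engineering only (explicit requests are out of scope for this sketch).
      return ['Engineering'];
    case 'medium':
      // Engineering plus every domain with a clear signal, in table order.
      return DOMAINS.filter((d) => d === 'Engineering' || signaled.has(d));
    case 'complex':
      // Full sweep over all known domains.
      return [...DOMAINS];
    default:
      throw new Error(`Unknown task_complexity: ${taskComplexity}`);
  }
}

console.log(selectDomains('medium', new Set(['Mobile', 'Analytics'])));
```

Because the `complex` branch returns the whole list, adding a domain row needs no change to the sweep itself, which fits the wording change from "8-domain sweep" to "Full domain sweep".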
 
  ## Native Parallel Contract
@@ -141,7 +144,7 @@ CORRECT — Delegating via the agent's own tool:
 
  When building delegation prompts:
 
- 1. Call the agent's registered tool by its exact name from the Agent Roster (e.g., `coder`, `tester`, `design_system_engineer`). Use `get_agent` to load the full methodology body and declared tool restrictions for the matching kebab-case agent.
+ 1. Call the agent's registered tool by its exact name from the Agent Roster (e.g., `coder`, `tester`, `design_system_engineer`). Use `get_agent` to load the full methodology body, declared tool restrictions, and runtime `tool_name` for the matching canonical agent.
  2. Do not rely on Maestro-level model, temperature, turn, or timeout overrides. Use agent frontmatter and runtime-level agent configuration for native tuning.
  3. Inject shared protocols from `get_skill_content` with resources: `["agent-base-protocol", "filesystem-safety-protocol"]`.
  4. Include dependency downstream context from session state.
@@ -189,30 +192,47 @@ All agent names use **snake_case** (underscores, not hyphens). When delegating,
 
  ## Agent Roster
 
- | Agent | Focus | Key Tool Profile |
+ | Agent | Focus | Capability Tier |
  | --- | --- | --- |
- | `architect` | System design | Read tools + web search/fetch |
- | `api_designer` | API contracts | Read tools + web search/fetch |
- | `code_reviewer` | Code quality review | Read-only |
- | `coder` | Feature implementation | Read/write/shell + todos + skill activation |
- | `data_engineer` | Schema/data/queries | Read/write/shell + todos + web search |
- | `debugger` | Root cause analysis | Read + shell + todos |
- | `devops_engineer` | CI/CD and infra | Read/write/shell + todos + web search/fetch |
- | `performance_engineer` | Performance profiling | Read + shell + todos + web search/fetch |
- | `refactor` | Structural refactoring | Read/write/shell + todos + skill activation |
- | `security_engineer` | Security auditing | Read + shell + todos + web search/fetch |
- | `technical_writer` | Documentation | Read/write + todos + web search |
- | `tester` | Test implementation | Read/write/shell + todos + skill activation + web search |
- | `seo_specialist` | Technical SEO auditing | Read + shell + web search/fetch + todos |
- | `copywriter` | Marketing copy & content | Read/write |
- | `content_strategist` | Content planning & strategy | Read + web search/fetch |
- | `ux_designer` | User experience design | Read/write + web search |
- | `accessibility_specialist` | WCAG compliance auditing | Read + shell + web search + todos |
- | `product_manager` | Requirements & product strategy | Read/write + web search |
- | `analytics_engineer` | Tracking & measurement | Read/write/shell + web search + todos |
- | `i18n_specialist` | Internationalization | Read/write/shell + todos |
- | `design_system_engineer` | Design tokens & theming | Read/write/shell + todos + skill activation |
- | `compliance_reviewer` | Legal & regulatory compliance | Read + web search/fetch |
+ | `accessibility_specialist` | WCAG compliance auditing, ARIA review | Read + shell |
+ | `analytics_engineer` | Event tracking, conversion funnels | Full access |
+ | `api_designer` | API contracts and endpoint design | Read-only |
+ | `architect` | System design and architecture decisions | Read-only |
+ | `cloud_architect` | AWS/GCP/Azure topology, IaC, multi-region design | Read-only |
+ | `cobol_engineer` | Mainframe COBOL, JCL, CICS/IMS on z/OS | Full access |
+ | `code_reviewer` | Code quality review and bug identification | Read-only |
+ | `coder` | Feature implementation | Full access |
+ | `compliance_reviewer` | Legal and regulatory compliance | Read-only |
+ | `content_strategist` | Content planning and strategy | Read-only |
+ | `copywriter` | Marketing copy and landing-page content | Read + write |
+ | `data_engineer` | Schema design, queries, and data pipelines | Full access |
+ | `database_administrator` | RDBMS tuning, indexes, and migration safety | Read + shell |
+ | `db2_dba` | DB2 for z/OS and LUW, REORG, RUNSTATS, bind/rebind | Read + shell |
+ | `debugger` | Root cause analysis and defect investigation | Read + shell |
+ | `design_system_engineer` | Design tokens and theming | Full access |
+ | `devops_engineer` | CI/CD, containerization, and deployment | Full access |
+ | `hlasm_assembler_specialist` | IBM HLASM for z/OS, macros, SVCs | Full access |
+ | `i18n_specialist` | Internationalization and locale management | Full access |
+ | `ibm_i_specialist` | IBM i RPG/CL, DB2 for i, OS/400 | Full access |
+ | `integration_engineer` | B2B APIs, ETL, and message brokers | Full access |
+ | `ml_engineer` | Model training, feature pipelines, and evaluation | Full access |
+ | `mlops_engineer` | Model registry, CI/CD for models, drift detection | Full access |
+ | `mobile_engineer` | iOS/Android/React Native/Flutter platform work | Full access |
+ | `observability_engineer` | Metrics, logs, traces, OpenTelemetry, dashboards | Full access |
+ | `performance_engineer` | Performance profiling and optimization | Read + shell |
+ | `platform_engineer` | Internal developer platforms and paved paths | Full access |
+ | `product_manager` | Requirements and product strategy | Read + write |
+ | `prompt_engineer` | LLM prompt design, few-shot, and RAG tuning | Read + write |
+ | `refactor` | Structural refactoring and technical debt | Full access |
+ | `release_manager` | Release notes, changelogs, rollout planning | Read + write |
+ | `security_engineer` | Security assessment and vulnerability analysis | Read + shell |
+ | `seo_specialist` | Technical SEO auditing and structured data | Read + shell |
+ | `site_reliability_engineer` | SLOs, error budgets, runbooks, postmortems | Read + shell |
+ | `solutions_architect` | Enterprise integration and cross-team architecture | Read-only |
+ | `technical_writer` | Documentation and technical writing | Read + write |
+ | `tester` | Test implementation and coverage analysis | Full access |
+ | `ux_designer` | User experience design | Read + write |
+ | `zos_sysprog` | z/OS systems programming, JCL, USS, RACF | Read + shell |
 
  ## Hooks
 
package/QWEN.md CHANGED
@@ -34,20 +34,33 @@ Before running orchestration commands:
 
  - Extension settings from `qwen-extension.json` are exposed as `MAESTRO_*` env vars via Qwen Code extension settings; honor them as runtime source of truth.
  - Maestro slash commands are file commands loaded from `commands/maestro/*.toml`; they are expected to resolve as `/maestro:*`.
- - Hook entries must remain `type: "command"` in `hooks/hooks.json` for compatibility with current Qwen Code hook validation.
+ - Hook entries must remain `type: "command"` in `qwen/hooks.json` for compatibility with current Qwen Code hook validation.
  - Extension workflows run only when the extension is linked/enabled and workspace trust allows extension assets.
  - Keep `ask_user_question` header fields short (aim for 16 characters or fewer) to fit the UI chip display. Short headers like `Database`, `Auth`, `Approach` work best.
  - The extension contributes deny/ask policy rules from `policies/maestro.toml`. Treat these as safety rails that complement, but do not replace, prompt-level instructions.
 
  ## Qwen Tool Name Mapping
 
- This extension was authored for Qwen Code. When following agent methodology files that reference Gemini tool names, use the following mapping:
+ This extension was authored for Qwen Code. When following agent methodology files that reference canonical tool names, use the runtime mapping from `src/platforms/qwen/runtime-config.js`:
 
  | Source (raw file) | Qwen tool |
  |---|---|
+ | `read_file` | `read_file` |
+ | `read_many_files` | `read_many_files` |
+ | `list_directory` | `list_directory` |
+ | `glob` | `glob` |
+ | `grep_search` | `grep_search` |
+ | `google_web_search` | `web_search` |
+ | `web_fetch` | `web_fetch` |
+ | `write_file` | `write_file` |
+ | `replace` | `edit` |
+ | `run_shell_command` | `run_shell_command` |
  | `ask_user` | `ask_user_question` |
-
- **Known residual gap:** Other Gemini tool names (e.g., `google_web_search`, `write_todos`, `activate_skill`) may still appear in raw source agent files. Where encountered, apply the same mapping principle if the Qwen-equivalent tool exists and is semantically equivalent. Do not assume equivalence for tools with different invocation semantics — verify behavior first.
+ | `write_todos` | `todo_write` |
+ | `activate_skill` | `skill` |
+ | `enter_plan_mode` | `enter_plan_mode` |
+ | `exit_plan_mode` | `exit_plan_mode` |
+ | `codebase_investigator` | `codebase_investigator` |
 
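The mapping table lends itself to a simple lookup. The sketch below copies only the rows whose two columns differ and falls back to the source name otherwise. The `toQwenTool` helper is hypothetical, and `src/platforms/qwen/runtime-config.js` may structure this differently. Treating names that are not in the table as unchanged is also an assumption.

```javascript
// Rows from the table above where the Qwen name differs from the source name.
const QWEN_TOOL_OVERRIDES = new Map([
  ['google_web_search', 'web_search'],
  ['replace', 'edit'],
  ['ask_user', 'ask_user_question'],
  ['write_todos', 'todo_write'],
  ['activate_skill', 'skill'],
]);

// Hypothetical helper: translate a canonical tool name to its Qwen name.
// Names without an override map to themselves, as the identity rows do.
function toQwenTool(name) {
  return QWEN_TOOL_OVERRIDES.get(name) ?? name;
}

console.log(toQwenTool('replace'));   // a renamed tool
console.log(toQwenTool('read_file')); // an identity row
```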
  ## Context Budget
 
@@ -86,7 +99,7 @@ For each domain, determine if the task has needs that warrant specialist involve
 
  | Domain | Signal questions | Candidate agents |
  | --- | --- | --- |
- | Engineering | Does the task involve code, infrastructure, or data? | `architect`, `api_designer`, `coder`, `code_reviewer`, `tester`, `refactor`, `data_engineer`, `debugger`, `devops_engineer`, `performance_engineer`, `security_engineer`, `technical_writer` |
+ | Engineering | Does the task involve code, infrastructure, APIs, data, or delivery? | `architect`, `api_designer`, `coder`, `code_reviewer`, `tester`, `refactor`, `data_engineer`, `database_administrator`, `debugger`, `devops_engineer`, `integration_engineer`, `platform_engineer`, `cloud_architect`, `solutions_architect`, `site_reliability_engineer`, `observability_engineer`, `performance_engineer`, `security_engineer`, `technical_writer`, `release_manager` |
  | Product | Are requirements unclear, or does success depend on user outcomes? | `product_manager` |
  | Design | Does the deliverable have a user-facing interface or interaction? | `ux_designer`, `accessibility_specialist`, `design_system_engineer` |
  | Content | Does the task produce or modify user-visible text, copy, or media? | `content_strategist`, `copywriter` |
@@ -94,13 +107,16 @@ For each domain, determine if the task has needs that warrant specialist involve
  | Compliance | Does the task handle user data, payments, or operate in a regulated domain? | `compliance_reviewer` |
  | Internationalization | Must the deliverable support multiple locales? | `i18n_specialist` |
  | Analytics | Does success need to be measured, or does the feature need instrumentation? | `analytics_engineer` |
+ | ML/AI | Does the task involve model training, inference, prompts, or model operations? | `ml_engineer`, `mlops_engineer`, `prompt_engineer` |
+ | Mobile | Does the task target iOS, Android, React Native, Flutter, or mobile release constraints? | `mobile_engineer` |
+ | Mainframe / IBM | Does the task involve COBOL, JCL, DB2 for z/OS or IBM i, HLASM, RACF, CICS, IMS, or USS? | `cobol_engineer`, `db2_dba`, `zos_sysprog`, `hlasm_assembler_specialist`, `ibm_i_specialist` |
 
  Skip domains where the answer is clearly "no." For relevant domains, include appropriate agents in the phase plan alongside engineering agents. Domain agents participate at whatever phase makes sense — design, implementation, or post-build audit — based on the specific task.
 
  Apply domain analysis proportional to `task_complexity`:
  - `simple`: Engineering domain only. Skip other domains unless explicitly requested.
  - `medium`: Engineering + domains with clear signals from the task description.
- - `complex`: Full 8-domain sweep (current behavior).
+ - `complex`: Full domain sweep (current behavior).
 
 
  ## Native Parallel Contract
@@ -151,7 +167,7 @@ CORRECT — Delegating via the agent's own tool:
 
  When building delegation prompts:
 
- 1. Call the agent's registered tool by its exact name from the Agent Roster (e.g., `coder`, `tester`, `design_system_engineer`). Use `get_agent` to load the full methodology body and declared tool restrictions for the matching kebab-case agent.
+ 1. Call the agent's registered tool by its exact name from the Agent Roster (e.g., `coder`, `tester`, `design_system_engineer`). Use `get_agent` to load the full methodology body, declared tool restrictions, and runtime `tool_name` for the matching canonical agent.
  2. Do not rely on Maestro-level model, temperature, turn, or timeout overrides. Use agent frontmatter and runtime-level agent configuration for native tuning.
  3. Inject shared protocols from `get_skill_content` with resources: `["agent-base-protocol", "filesystem-safety-protocol"]`.
  4. Include dependency downstream context from session state.
@@ -199,30 +215,47 @@ All agent names use **snake_case** (underscores, not hyphens). When delegating,
 
  ## Agent Roster
 
- | Agent | Focus | Key Tool Profile |
+ | Agent | Focus | Capability Tier |
  | --- | --- | --- |
- | `architect` | System design | Read tools + web search/fetch |
- | `api_designer` | API contracts | Read tools + web search/fetch |
- | `code_reviewer` | Code quality review | Read-only |
- | `coder` | Feature implementation | Read/write/shell + todos + skill activation |
- | `data_engineer` | Schema/data/queries | Read/write/shell + todos + web search |
- | `debugger` | Root cause analysis | Read + shell + todos |
- | `devops_engineer` | CI/CD and infra | Read/write/shell + todos + web search/fetch |
- | `performance_engineer` | Performance profiling | Read + shell + todos + web search/fetch |
- | `refactor` | Structural refactoring | Read/write/shell + todos + skill activation |
- | `security_engineer` | Security auditing | Read + shell + todos + web search/fetch |
- | `technical_writer` | Documentation | Read/write + todos + web search |
- | `tester` | Test implementation | Read/write/shell + todos + skill activation + web search |
- | `seo_specialist` | Technical SEO auditing | Read + shell + web search/fetch + todos |
- | `copywriter` | Marketing copy & content | Read/write |
- | `content_strategist` | Content planning & strategy | Read + web search/fetch |
- | `ux_designer` | User experience design | Read/write + web search |
- | `accessibility_specialist` | WCAG compliance auditing | Read + shell + web search + todos |
- | `product_manager` | Requirements & product strategy | Read/write + web search |
- | `analytics_engineer` | Tracking & measurement | Read/write/shell + web search + todos |
- | `i18n_specialist` | Internationalization | Read/write/shell + todos |
- | `design_system_engineer` | Design tokens & theming | Read/write/shell + todos + skill activation |
- | `compliance_reviewer` | Legal & regulatory compliance | Read + web search/fetch |
+ | `accessibility_specialist` | WCAG compliance auditing, ARIA review | Read + shell |
+ | `analytics_engineer` | Event tracking, conversion funnels | Full access |
+ | `api_designer` | API contracts and endpoint design | Read-only |
+ | `architect` | System design and architecture decisions | Read-only |
+ | `cloud_architect` | AWS/GCP/Azure topology, IaC, multi-region design | Read-only |
+ | `cobol_engineer` | Mainframe COBOL, JCL, CICS/IMS on z/OS | Full access |
+ | `code_reviewer` | Code quality review and bug identification | Read-only |
+ | `coder` | Feature implementation | Full access |
+ | `compliance_reviewer` | Legal and regulatory compliance | Read-only |
+ | `content_strategist` | Content planning and strategy | Read-only |
+ | `copywriter` | Marketing copy and landing-page content | Read + write |
+ | `data_engineer` | Schema design, queries, and data pipelines | Full access |
+ | `database_administrator` | RDBMS tuning, indexes, and migration safety | Read + shell |
+ | `db2_dba` | DB2 for z/OS and LUW, REORG, RUNSTATS, bind/rebind | Read + shell |
+ | `debugger` | Root cause analysis and defect investigation | Read + shell |
+ | `design_system_engineer` | Design tokens and theming | Full access |
+ | `devops_engineer` | CI/CD, containerization, and deployment | Full access |
+ | `hlasm_assembler_specialist` | IBM HLASM for z/OS, macros, SVCs | Full access |
+ | `i18n_specialist` | Internationalization and locale management | Full access |
+ | `ibm_i_specialist` | IBM i RPG/CL, DB2 for i, OS/400 | Full access |
+ | `integration_engineer` | B2B APIs, ETL, and message brokers | Full access |
+ | `ml_engineer` | Model training, feature pipelines, and evaluation | Full access |
+ | `mlops_engineer` | Model registry, CI/CD for models, drift detection | Full access |
+ | `mobile_engineer` | iOS/Android/React Native/Flutter platform work | Full access |
+ | `observability_engineer` | Metrics, logs, traces, OpenTelemetry, dashboards | Full access |
+ | `performance_engineer` | Performance profiling and optimization | Read + shell |
+ | `platform_engineer` | Internal developer platforms and paved paths | Full access |
+ | `product_manager` | Requirements and product strategy | Read + write |
+ | `prompt_engineer` | LLM prompt design, few-shot, and RAG tuning | Read + write |
+ | `refactor` | Structural refactoring and technical debt | Full access |
+ | `release_manager` | Release notes, changelogs, rollout planning | Read + write |
+ | `security_engineer` | Security assessment and vulnerability analysis | Read + shell |
+ | `seo_specialist` | Technical SEO auditing and structured data | Read + shell |
+ | `site_reliability_engineer` | SLOs, error budgets, runbooks, postmortems | Read + shell |
+ | `solutions_architect` | Enterprise integration and cross-team architecture | Read-only |
+ | `technical_writer` | Documentation and technical writing | Read + write |
+ | `tester` | Test implementation and coverage analysis | Full access |
+ | `ux_designer` | User experience design | Read + write |
+ | `zos_sysprog` | z/OS systems programming, JCL, USS, RACF | Read + shell |
 
  ## Hooks
 
package/claude/.claude-plugin/plugin.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "maestro",
- "version": "1.6.4-rc.1",
+ "version": "1.6.4-rc.3",
  "description": "Multi-agent development orchestration platform — 39 specialists, 4-phase orchestration, native parallel subagents, persistent sessions, and standalone review/debug/security/perf/seo/a11y/compliance commands",
  "author": {
  "name": "josstei",
package/claude/src/platforms/shared/agent-names.js CHANGED
@@ -1,10 +1,15 @@
  module.exports = {
  agentNames: [
  'accessibility-specialist', 'analytics-engineer', 'api-designer', 'architect',
- 'code-reviewer', 'coder', 'compliance-reviewer', 'content-strategist',
- 'copywriter', 'data-engineer', 'debugger', 'design-system-engineer',
- 'devops-engineer', 'i18n-specialist', 'performance-engineer', 'product-manager',
- 'refactor', 'security-engineer', 'seo-specialist', 'technical-writer',
- 'tester', 'ux-designer',
+ 'cloud-architect', 'cobol-engineer', 'code-reviewer', 'coder',
+ 'compliance-reviewer', 'content-strategist', 'copywriter', 'data-engineer',
+ 'database-administrator', 'db2-dba', 'debugger', 'design-system-engineer',
+ 'devops-engineer', 'hlasm-assembler-specialist', 'i18n-specialist',
+ 'ibm-i-specialist', 'integration-engineer', 'ml-engineer', 'mlops-engineer',
+ 'mobile-engineer', 'observability-engineer', 'performance-engineer',
+ 'platform-engineer', 'product-manager', 'prompt-engineer', 'refactor',
+ 'release-manager', 'security-engineer', 'seo-specialist',
+ 'site-reliability-engineer', 'solutions-architect', 'technical-writer',
+ 'tester', 'ux-designer', 'zos-sysprog',
  ],
  };
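The names in this list are kebab-case, while the GEMINI.md and QWEN.md hunks in this diff state that agent tool names use snake_case. A minimal sketch of that conversion follows. The `toToolName` helper is hypothetical and is not confirmed to exist in the package, which may instead read a stored `tool_name` value.

```javascript
// Hypothetical helper: derive a snake_case tool name from a canonical
// kebab-case agent name by swapping hyphens for underscores.
function toToolName(agentName) {
  return agentName.replace(/-/g, '_');
}

// A few entries taken from the list above.
const sample = ['hlasm-assembler-specialist', 'db2-dba', 'coder'];
console.log(sample.map(toToolName));
```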
@@ -130,17 +130,26 @@ Explicitly state what the agent must NOT do:
130
130
  | Task Domain | Agent | Key Capability |
131
131
  |-------------|-------|---------------|
132
132
  | System architecture, component design | `architect` | Read-only analysis, architecture patterns |
133
+ | Cloud architecture, multi-region topology | `cloud-architect` | Read-only cloud/IaC architecture |
134
+ | Enterprise integration architecture | `solutions-architect` | Read-only cross-team architecture |
133
135
  | API contracts, endpoint design | `api-designer` | Read-only, REST/GraphQL expertise |
134
136
  | Feature implementation, coding | `coder` | Full read/write/shell access |
135
137
  | Code quality assessment | `code-reviewer` | Read-only, verified findings |
136
138
  | Database schema, queries, ETL | `data-engineer` | Full read/write/shell access |
139
+ | RDBMS tuning, indexes, migration safety | `database-administrator` | Read + shell for database analysis |
140
+ | DB2 operations and tuning | `db2-dba` | Read + shell for DB2-specific work |
137
141
  | Bug investigation, root cause | `debugger` | Read + shell for investigation |
138
142
  | CI/CD, infrastructure, deployment | `devops-engineer` | Full read/write/shell access |
143
+ | Internal platforms, paved paths | `platform-engineer` | Full platform implementation access |
144
+ | B2B APIs, ETL, message brokers | `integration-engineer` | Full integration implementation access |
145
+ | SLOs, runbooks, reliability | `site-reliability-engineer` | Read + shell reliability analysis |
146
+ | Metrics, logs, traces, dashboards | `observability-engineer` | Full observability implementation access |
139
147
  | Performance analysis, profiling | `performance-engineer` | Read + shell for profiling |
140
148
  | Code restructuring, modernization | `refactor` | Read/write/shell, skill activation |
141
149
  | Security assessment, vulnerability | `security-engineer` | Read + shell for scanning |
142
150
  | Test creation, TDD, coverage | `tester` | Full read/write/shell access |
143
151
  | Documentation, READMEs, guides | `technical-writer` | Read/write, no shell |
152
+ | Release notes, changelogs, rollout | `release-manager` | Read/write for release artifacts |
144
153
  | Technical SEO auditing | `seo-specialist` | Read + shell + web search/fetch |
145
154
  | Marketing copy, content writing | `copywriter` | Read/write |
146
155
  | Content planning, strategy | `content-strategist` | Read + web search/fetch |
@@ -151,6 +160,14 @@ Explicitly state what the agent must NOT do:
151
160
  | Internationalization | `i18n-specialist` | Full read/write/shell access |
152
161
  | Design tokens, theming | `design-system-engineer` | Full read/write/shell access |
153
162
  | Legal, regulatory compliance | `compliance-reviewer` | Read + web search/fetch |
163
+ | Mobile platform work | `mobile-engineer` | Full mobile implementation access |
164
+ | Model training and inference integration | `ml-engineer` | Full ML implementation access |
165
+ | Model operations and model CI/CD | `mlops-engineer` | Full MLOps implementation access |
166
+ | Prompt design, few-shot, RAG tuning | `prompt-engineer` | Read/write prompt and eval design |
167
+ | Mainframe COBOL, JCL, CICS/IMS | `cobol-engineer` | Full mainframe implementation access |
168
+ | IBM HLASM for z/OS | `hlasm-assembler-specialist` | Full assembly implementation access |
169
+ | IBM i RPG/CL, DB2 for i | `ibm-i-specialist` | Full IBM i implementation access |
170
+ | z/OS systems programming, JCL, RACF | `zos-sysprog` | Read + shell for z/OS system work |
154
171
 
155
172
  ## Agent Tool Dispatch Contract
156
173
 
@@ -269,7 +286,7 @@ Before each agent dispatch, a hook tracks which agent is currently executing:
269
286
  - Preferred signal: the required `Agent: <agent_name>` header in the delegation prompt
270
287
  - Legacy fallbacks: `MAESTRO_CURRENT_AGENT` from the environment, then regex-based detection of patterns like `delegate to <agent>` or `@<agent>`
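The resolution order described above (header first, then environment, then regex) can be sketched as follows. This is a minimal illustration, not the package's hook code: the function name `detectAgent`, the returned `source` field, and the exact regular expressions are assumptions made for the example.

```javascript
// Hypothetical helper: resolves the active agent for a delegation prompt.
// Order follows the text: `Agent:` header, then MAESTRO_CURRENT_AGENT, then legacy regex.
function detectAgent(prompt, env = process.env) {
  // 1. Preferred signal: a line of the form `Agent: <agent_name>`.
  const header = prompt.match(/^Agent:\s*([a-z0-9][a-z0-9-]*)\s*$/im);
  if (header) return { agent: header[1].toLowerCase(), source: "header" };

  // 2. Legacy fallback: environment variable (empty values are ignored).
  const fromEnv = (env.MAESTRO_CURRENT_AGENT || "").trim();
  if (fromEnv) return { agent: fromEnv, source: "env" };

  // 3. Legacy fallback: patterns like `delegate to <agent>` or `@<agent>`.
  //    The pattern here is an assumption; the real hook may be stricter.
  const legacy = prompt.match(/(?:delegate to\s+|@)([a-z0-9][a-z0-9-]*)/i);
  if (legacy) return { agent: legacy[1].toLowerCase(), source: "regex" };

  // No signal found: the caller decides how to handle an unknown agent.
  return null;
}
```

Returning the signal source alongside the name makes it easy to log when a prompt still relies on a legacy fallback.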
271
288
 
272
- The detected agent name is persisted to `/tmp/maestro-hooks/<session-id>/active-agent` and cleared by the post-delegation hook on every allowed response (both successful validation and retry allow-through). On deny (malformed output), the active agent is preserved to enable re-validation on retry.
289
+ The detected agent name is persisted to `${MAESTRO_HOOKS_DIR:-<os.tmpdir()>/maestro-hooks-<uid>}/<session-id>/active-agent` and cleared by the post-delegation hook on every allowed response (both successful validation and retry allow-through). On deny (malformed output), the active agent is preserved to enable re-validation on retry.
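The path expression `${MAESTRO_HOOKS_DIR:-<os.tmpdir()>/maestro-hooks-<uid>}` reads like shell default expansion: use the environment variable when it is set and non-empty, otherwise fall back to a per-user directory under the OS temp dir. A rough Node.js sketch of that resolution follows. The helper names are hypothetical, and both the source of `<uid>` (`process.getuid()`) and the `"shared"` suffix used where that call is unavailable (for example on Windows) are assumptions, not confirmed package behavior.

```javascript
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper: base directory for transient hook state.
function resolveHooksDir(env = process.env) {
  // Mirrors `${VAR:-default}`: unset or empty falls through to the default.
  if (env.MAESTRO_HOOKS_DIR) return env.MAESTRO_HOOKS_DIR;

  // Assumption: <uid> is the numeric user id; process.getuid is absent on Windows.
  const uid = typeof process.getuid === "function" ? process.getuid() : "shared";
  return path.join(os.tmpdir(), `maestro-hooks-${uid}`);
}

// Hypothetical helper: location of the per-session active-agent file.
function activeAgentPath(sessionId, env = process.env) {
  return path.join(resolveHooksDir(env), sessionId, "active-agent");
}
```

Scoping the default directory by user id keeps one user's hook state separate from another's on a shared machine, which the fixed `/tmp/maestro-hooks/` path did not.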
273
290
 
274
291
  ### Session Context Injection
275
292
 
@@ -257,7 +257,7 @@ Apply depth-gated reasoning enrichment to design section content during the conv
257
257
 
258
258
  The write path depends on whether your runtime provides a Plan Mode surface (check `get_runtime_context`, loaded at session start, step 0):
259
259
 
260
- - **Plan Mode active**: Some runtimes restrict writes to a temporary staging directory during Plan Mode. Write the design document there. After `exit_plan_mode` approval in Phase 2, copy it to the permanent location.
260
+ - **Plan Mode active**: Some runtimes restrict writes to a temporary staging directory during Plan Mode. Write the design document there first, then exit Plan Mode and complete the design approval handoff. When the runtime's Plan Mode path is not visible to the MCP server, use the `record_design_approval` content variant so the server materializes the canonical copy under `<state_dir>/plans/`.
261
261
  - **Plan Mode not active or not available**: Write directly to the permanent location. If your runtime does not provide Plan Mode, track design progress using the plan-update mechanism from runtime context and use the user-prompt tool from runtime context for section approvals and final signoff.
262
262
 
263
263
  Permanent location: `<state_dir>/plans/YYYY-MM-DD-<topic-slug>-design.md` (where `<state_dir>` resolves from `MAESTRO_STATE_DIR`, default `docs/maestro`).
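As an illustration of the permanent location rule, the sketch below builds the design document path from `MAESTRO_STATE_DIR` with the documented default of `docs/maestro`. The function name `designDocPath` is hypothetical, and taking the date from UTC is an assumption; the source does not say which time zone the `YYYY-MM-DD` prefix uses.

```javascript
const path = require("node:path");

// Hypothetical helper: <state_dir>/plans/YYYY-MM-DD-<topic-slug>-design.md
function designDocPath(topicSlug, date = new Date(), env = process.env) {
  // <state_dir> resolves from MAESTRO_STATE_DIR, defaulting to docs/maestro.
  const stateDir = env.MAESTRO_STATE_DIR || "docs/maestro";

  // Assumption: the date prefix is the UTC calendar date.
  const day = date.toISOString().slice(0, 10);

  return path.join(stateDir, "plans", `${day}-${topicSlug}-design.md`);
}
```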
@@ -121,7 +121,7 @@ Hooks fire automatically at agent boundaries. The orchestrator does not invoke t
121
121
 
122
122
  The hooks system tracks which agent is currently executing. Before each agent dispatch, a hook resolves the active agent identity from the required `Agent:` header first, then falls back to legacy env/regex detection, and injects compact session context. After completion, a hook validates that the response contains both `Task Report` and `Downstream Context`; it requests one retry on the first malformed response.
123
123
 
124
- The hook state directory under `/tmp/maestro-hooks/<session-id>/` is transient and separate from orchestration state.
124
+ The hook state directory under `${MAESTRO_HOOKS_DIR:-<os.tmpdir()>/maestro-hooks-<uid>}/<session-id>/` is transient and separate from orchestration state.
125
125
 
126
126
  ## Sequential Execution Protocol
127
127
 
@@ -77,12 +77,12 @@ Before finalizing agent assignments, verify each phase's agent can deliver its r
77
77
 
78
78
  | Phase Deliverable | Required Tier | Compatible Agents |
79
79
  |-------------------|--------------|-------------------|
80
- | Creates/modifies files | Full Access or Read+Write | coder, data-engineer, devops-engineer, tester, refactor, design-system-engineer, i18n-specialist, analytics-engineer, technical-writer, product-manager, ux-designer, copywriter |
81
- | Runs shell commands | Full Access or Read+Shell | coder, data-engineer, devops-engineer, tester, refactor, design-system-engineer, i18n-specialist, analytics-engineer, debugger, performance-engineer, security-engineer, seo-specialist, accessibility-specialist |
80
+ | Creates/modifies files | Full Access or Read+Write | analytics-engineer, cobol-engineer, coder, copywriter, data-engineer, design-system-engineer, devops-engineer, hlasm-assembler-specialist, i18n-specialist, ibm-i-specialist, integration-engineer, ml-engineer, mlops-engineer, mobile-engineer, observability-engineer, platform-engineer, product-manager, prompt-engineer, refactor, release-manager, technical-writer, tester, ux-designer |
81
+ | Runs shell commands | Full Access or Read+Shell | accessibility-specialist, analytics-engineer, cobol-engineer, coder, data-engineer, database-administrator, db2-dba, debugger, design-system-engineer, devops-engineer, hlasm-assembler-specialist, i18n-specialist, ibm-i-specialist, integration-engineer, ml-engineer, mlops-engineer, mobile-engineer, observability-engineer, performance-engineer, platform-engineer, refactor, security-engineer, seo-specialist, site-reliability-engineer, tester, zos-sysprog |
82
82
  | Analysis/review only | Any tier | All agents |
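The compatibility table can be checked mechanically before agent assignments are finalized. The sketch below encodes the rc.3 agent lists from the two added rows. The names `canDeliver` and `incompatiblePhases`, the deliverable keys (`"write"`, `"shell"`, `"analysis"`), and the phase object shape are assumptions for illustration, not part of the package.

```javascript
// Agents listed as compatible with "Creates/modifies files" in rc.3.
const WRITE_CAPABLE = new Set([
  "analytics-engineer", "cobol-engineer", "coder", "copywriter", "data-engineer",
  "design-system-engineer", "devops-engineer", "hlasm-assembler-specialist",
  "i18n-specialist", "ibm-i-specialist", "integration-engineer", "ml-engineer",
  "mlops-engineer", "mobile-engineer", "observability-engineer", "platform-engineer",
  "product-manager", "prompt-engineer", "refactor", "release-manager",
  "technical-writer", "tester", "ux-designer",
]);

// Agents listed as compatible with "Runs shell commands" in rc.3.
const SHELL_CAPABLE = new Set([
  "accessibility-specialist", "analytics-engineer", "cobol-engineer", "coder",
  "data-engineer", "database-administrator", "db2-dba", "debugger",
  "design-system-engineer", "devops-engineer", "hlasm-assembler-specialist",
  "i18n-specialist", "ibm-i-specialist", "integration-engineer", "ml-engineer",
  "mlops-engineer", "mobile-engineer", "observability-engineer",
  "performance-engineer", "platform-engineer", "refactor", "security-engineer",
  "seo-specialist", "site-reliability-engineer", "tester", "zos-sysprog",
]);

// Hypothetical check: can this agent produce this kind of deliverable?
function canDeliver(agent, deliverable) {
  if (deliverable === "write") return WRITE_CAPABLE.has(agent);
  if (deliverable === "shell") return SHELL_CAPABLE.has(agent);
  if (deliverable === "analysis") return true; // "Any tier | All agents"
  throw new Error(`Unknown deliverable: ${deliverable}`);
}

// Returns the phases whose assigned agent cannot deliver everything required.
function incompatiblePhases(phases) {
  return phases.filter((p) => !p.deliverables.every((d) => canDeliver(p.agent, d)));
}
```

A phase that pairs `architect` with a `"write"` deliverable would be flagged here, which is the case the split-into-spec-then-implement guidance addresses.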
83
83
 
84
84
  <HARD-GATE>
85
- Read-Only agents (architect, api-designer, code-reviewer, content-strategist, compliance-reviewer)
85
+ Read-Only agents (architect, api-designer, cloud-architect, code-reviewer, compliance-reviewer, content-strategist, solutions-architect)
86
86
  CANNOT be assigned to phases that create or modify files. If a phase requires file creation
87
87
  and domain expertise from a Read-Only agent, split it: the Read-Only agent produces a spec
88
88
  or analysis, then a write-capable agent (typically coder) implements the files based on that output.
@@ -180,17 +180,26 @@ If `validate_plan` is available, review its `parallelization_profile` and `redun
180
180
  | Task Domain | Primary Agent | Secondary Agent | Rationale |
181
181
  |-------------|--------------|-----------------|-----------|
182
182
  | System design, architecture | `architect` | - | Read-only analysis, design expertise |
183
+ | Cloud architecture, multi-region topology | `cloud-architect` | `devops-engineer` | Architecture first, implementation second |
184
+ | Enterprise integration architecture | `solutions-architect` | `integration-engineer` | Cross-team design before implementation |
183
185
  | API contracts, endpoints | `api-designer` | `coder` | Design then implement |
184
186
  | Feature implementation | `coder` | - | Full implementation access |
185
187
  | Code quality review | `code-reviewer` | - | Read-only verification |
186
188
  | Database schema, queries | `data-engineer` | - | Schema + implementation |
189
+ | RDBMS tuning, indexes, migration safety | `database-administrator` | `data-engineer` | DBA analysis before schema/code changes |
190
+ | DB2 administration | `db2-dba` | `data-engineer` | DB2-specific operations and design |
187
191
  | Bug investigation | `debugger` | - | Read + shell for investigation |
188
192
  | CI/CD, infrastructure | `devops-engineer` | - | Full DevOps access |
193
+ | Internal platforms, paved paths | `platform-engineer` | `devops-engineer` | Platform conventions and implementation |
194
+ | B2B integrations, ETL, message brokers | `integration-engineer` | - | Full integration implementation |
195
+ | SLOs, runbooks, reliability | `site-reliability-engineer` | `observability-engineer` | Reliability assessment plus telemetry implementation |
196
+ | Observability, metrics, traces | `observability-engineer` | - | Full telemetry implementation |
189
197
  | Performance analysis | `performance-engineer` | - | Read + shell for profiling |
190
198
  | Code restructuring | `refactor` | - | Write + shell access (for validation) |
191
199
  | Security assessment | `security-engineer` | - | Read + shell for scanning |
192
200
  | Test creation | `tester` | - | Full test implementation |
193
201
  | Documentation | `technical-writer` | - | Write access for docs |
202
+ | Release notes, changelogs, rollout | `release-manager` | - | Write access for release artifacts |
194
203
  | Technical SEO audit | `seo-specialist` | - | Read + shell + web search |
195
204
  | Marketing copy, content | `copywriter` | - | Read/write |
196
205
  | Content planning | `content-strategist` | - | Read + web search/fetch |
@@ -201,6 +210,14 @@ If `validate_plan` is available, review its `parallelization_profile` and `redun
201
210
  | Internationalization | `i18n-specialist` | `coder` | Implement then localize |
202
211
  | Design tokens, theming | `design-system-engineer` | `coder` | Tokens then consume |
203
212
  | Legal, regulatory | `compliance-reviewer` | - | Read + web search/fetch |
213
+ | Mobile platform work | `mobile-engineer` | `tester` | Mobile implementation plus validation |
214
+ | Model training, inference integration | `ml-engineer` | `tester` | ML implementation plus evaluation |
215
+ | Model registry, drift, model CI/CD | `mlops-engineer` | `devops-engineer` | Model operations and deployment |
216
+ | Prompt design, few-shot, RAG tuning | `prompt-engineer` | `coder` | Prompt spec before integration |
217
+ | Mainframe COBOL, JCL, CICS/IMS | `cobol-engineer` | `tester` | Mainframe implementation and validation |
218
+ | IBM HLASM for z/OS | `hlasm-assembler-specialist` | - | Assembly implementation |
219
+ | IBM i RPG/CL, DB2 for i | `ibm-i-specialist` | - | IBM i implementation |
220
+ | z/OS systems programming, JCL, RACF | `zos-sysprog` | `security-engineer` | System-level analysis and controls |
204
221
 
205
222
  ### Assignment Rules
206
223
  1. Match the primary task domain to the agent specialization
@@ -216,32 +233,19 @@ Estimate token consumption per phase based on:
216
233
  - Agent's max_turns limit as upper bound
217
234
  - Historical averages: ~500 input tokens per file read, ~200 output tokens per file written
218
235
 
219
- ### Cost Estimation
236
+ ### Resource Estimation
220
237
 
221
- #### Per-Phase Cost Factors
222
- - **Model tier**: Pro agents (~$0.01/1K input, ~$0.04/1K output) vs Flash agents (~$0.001/1K input, ~$0.004/1K output)
223
- - **Input complexity**: Number of files read, average file size, context from previous phases
224
- - **Output complexity**: Lines of code generated, number of files created/modified
225
- - **Retry budget**: Add 50% buffer per phase for potential retries (max 2 retries)
238
+ Do not invent provider pricing or model tiers. Agent model selection is runtime-owned through agent frontmatter and runtime configuration. Estimate execution size in stable, codebase-derived terms instead:
226
239
 
227
- #### Estimation Formula
228
- ```
229
- Phase Cost = (input_tokens × input_rate + output_tokens × output_rate) × retry_multiplier
230
- ```
231
-
232
- Where:
233
- - `input_tokens` = files_to_read × 500 + context_tokens
234
- - `output_tokens` = files_to_write × 200 + validation_output
235
- - `retry_multiplier` = 1.5 (accounts for up to 2 retries)
240
+ - **Input complexity**: number of files likely to be read, average file size, and prior-phase context
241
+ - **Output complexity**: number of files created or modified, validation output volume, and expected handoff detail
242
+ - **Retry budget**: note phases likely to need retries because of broad file ownership, external dependencies, or uncertain validation
236
243
 
237
- #### Plan-Level Cost Summary
238
- Include this table in every implementation plan:
244
+ Include a lightweight plan-level resource summary when useful:
239
245
 
240
- | Phase | Agent | Model | Est. Input | Est. Output | Est. Cost |
241
- |-------|-------|-------|-----------|------------|----------|
242
- | 1 | [agent] | [model] | [tokens] | [tokens] | [$X.XX] |
243
- | ... | ... | ... | ... | ... | ... |
244
- | **Total** | | | **[sum]** | **[sum]** | **[$X.XX]** |
246
+ | Phase | Agent | Est. Files Read | Est. Files Written | Retry Risk | Notes |
247
+ |-------|-------|-----------------|--------------------|------------|-------|
248
+ | 1 | [agent] | [N] | [N] | LOW/MEDIUM/HIGH | [why] |
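For plans that do include the resource summary, the table can be generated from per-phase estimates instead of written by hand. The following is a minimal sketch; the function name `renderResourceSummary` and the input field names (`agent`, `filesRead`, `filesWritten`, `retryRisk`, `notes`) are assumptions chosen to match the column headings.

```javascript
// Hypothetical renderer for the plan-level resource summary table.
function renderResourceSummary(phases) {
  const header = [
    "| Phase | Agent | Est. Files Read | Est. Files Written | Retry Risk | Notes |",
    "|-------|-------|-----------------|--------------------|------------|-------|",
  ];
  const allowedRisk = new Set(["LOW", "MEDIUM", "HIGH"]);

  const rows = phases.map((p, i) => {
    // Retry risk is restricted to the three levels shown in the template row.
    if (!allowedRisk.has(p.retryRisk)) {
      throw new Error(`Phase ${i + 1}: retryRisk must be LOW, MEDIUM, or HIGH`);
    }
    return `| ${i + 1} | ${p.agent} | ${p.filesRead} | ${p.filesWritten} | ${p.retryRisk} | ${p.notes} |`;
  });

  return [...header, ...rows].join("\n");
}
```

The estimates stay in codebase-derived units (file counts and a coarse risk level), consistent with the instruction not to invent pricing or model tiers.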
245
249
 
246
250
  ## Plan Document Generation
247
251
 
@@ -296,7 +300,7 @@ After writing the implementation plan:
296
300
  1. Confirm the file path to the user
297
301
  2. Present the dependency graph and execution strategy
298
302
  3. Highlight parallel execution opportunities
299
- 4. Provide token budget estimates
303
+ 4. Provide resource estimates when useful
300
304
  5. If your runtime provides Plan Mode, call `exit_plan_mode` with the plan path to present the plan for user approval. If Plan Mode is not available, present the completed plan for user approval using the user-prompt tool from runtime context.
301
305
  6. Ensure the approved plan is at `<state_dir>/plans/YYYY-MM-DD-<slug>-impl-plan.md` as the permanent project reference (copy from the staging directory if Plan Mode was used)
302
306
  7. Ask if the user is ready to proceed to execution (Phase 3)
@@ -18,12 +18,12 @@ Detection: check whether MCP state tools appear in your available tools. If they
18
18
 
19
19
  ## Hook-Level Session State
20
20
 
21
- Maestro hooks maintain a separate, transient state directory at `/tmp/maestro-hooks/<session-id>/` that is distinct from orchestration state in `<MAESTRO_STATE_DIR>`:
21
+ Maestro hooks maintain a separate, transient state directory under `${MAESTRO_HOOKS_DIR:-<os.tmpdir()>/maestro-hooks-<uid>}/<session-id>/` that is distinct from orchestration state in `<MAESTRO_STATE_DIR>`:
22
22
 
23
23
  | Concern | Orchestration State | Hook State |
24
24
  | --- | --- | --- |
25
- | Location | `<MAESTRO_STATE_DIR>/state/` | `/tmp/maestro-hooks/<session-id>/` (Unix) or `<os.tmpdir()>/maestro-hooks/<session-id>/` (Windows) |
26
- | Lifecycle | Created in Phase 2, archived in Phase 4 | Directory created by the session-start hook when an active session exists; active-agent file written by the pre-delegation hook and cleared by the post-delegation hook; stale directories pruned by both session-start and pre-delegation hooks |
25
+ | Location | `<MAESTRO_STATE_DIR>/state/` | `${MAESTRO_HOOKS_DIR:-<os.tmpdir()>/maestro-hooks-<uid>}/<session-id>/` |
26
+ | Lifecycle | Created at execution setup, archived in Phase 4 | Directory created by the session-start hook when an active session exists; active-agent file written by the pre-delegation hook and cleared by the post-delegation hook; stale directories pruned by both session-start and pre-delegation hooks |
27
27
  | Contents | Session metadata, phase tracking, token usage, file manifests | Active agent tracking file (`active-agent`) |
28
28
  | Persistence | Survives session restarts (supports `/maestro:resume`) | Ephemeral — lost on session end or system reboot |
29
29
  | Managed by | Orchestrator via session-management skill | The runtime's pre-delegation and post-delegation hooks |
@@ -35,7 +35,7 @@ The orchestrator does not read or write hook-level state directly. It interacts
35
35
  ## Session Creation Protocol
36
36
 
37
37
  ### When to Create
38
- For Standard workflow, create a new session when beginning Phase 2 (Team Assembly & Planning) of orchestration, after the design document has been approved. For Express workflow, create a session after the structured brief is approved (see Express Workflow section in the orchestrator template).
38
+ For Standard workflow, create a new session at execution setup after the design document and implementation plan are approved and the execution mode gate has resolved. For Express workflow, create a session after the structured brief is approved (see Express Workflow section in the orchestrator template).
39
39
 
40
40
  ### Session ID Format
41
41
  `YYYY-MM-DD-<topic-slug>`
@@ -1,3 +1,3 @@
1
1
  {
2
- "version": "1.6.4-rc.1"
2
+ "version": "1.6.4-rc.3"
3
3
  }