oh-my-customcode 0.37.2 → 0.38.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (27)
  1. package/README.md +20 -5
  2. package/dist/cli/index.js +1189 -99
  3. package/dist/index.js +4 -1
  4. package/package.json +3 -1
  5. package/templates/.claude/hooks/hooks.json +23 -11
  6. package/templates/.claude/hooks/scripts/context-budget-advisor.sh +1 -1
  7. package/templates/.claude/hooks/scripts/eval-core-batch-save.sh +23 -0
  8. package/templates/.claude/hooks/scripts/session-env-check.sh +20 -0
  9. package/templates/.claude/hooks/scripts/stuck-detector.sh +1 -1
  10. package/templates/.claude/hooks/scripts/task-outcome-recorder.sh +13 -1
  11. package/templates/.claude/rules/MAY-optimization.md +12 -0
  12. package/templates/.claude/rules/MUST-agent-design.md +45 -7
  13. package/templates/.claude/rules/MUST-completion-verification.md +81 -0
  14. package/templates/.claude/rules/SHOULD-memory-integration.md +81 -0
  15. package/templates/.claude/skills/de-lead-routing/SKILL.md +8 -92
  16. package/templates/.claude/skills/deep-plan/SKILL.md +55 -4
  17. package/templates/.claude/skills/dev-lead-routing/SKILL.md +9 -21
  18. package/templates/.claude/skills/dev-refactor/SKILL.md +34 -1
  19. package/templates/.claude/skills/evaluator-optimizer/SKILL.md +53 -0
  20. package/templates/.claude/skills/qa-lead-routing/SKILL.md +7 -242
  21. package/templates/.claude/skills/research/SKILL.md +74 -7
  22. package/templates/.claude/skills/sauron-watch/SKILL.md +81 -0
  23. package/templates/.claude/skills/secretary-routing/SKILL.md +3 -18
  24. package/templates/.claude/skills/structured-dev-cycle/SKILL.md +20 -3
  25. package/templates/guides/claude-code/index.yaml +5 -0
  26. package/templates/manifest.json +3 -3
  27. package/templates/.claude/hooks/scripts/session-compliance-report.sh +0 -65
@@ -5,14 +5,14 @@ scope: core
  version: 1.0.0
  user-invocable: true
  argument-hint: "<topic-or-issue>"
- context: fork
+ teams-compatible: true
  ---
 
  # Deep Plan Skill
 
  Research-validated planning that eliminates the gap between research assumptions and actual code. Orchestrates a 3-phase cycle: Discovery Research → Reality-Check Planning → Plan Verification.
 
- **Orchestrator-only** — only the main conversation uses this skill (R010). All phases execute as subagents.
+ **Teams-compatible** — works both from the main conversation (R010) and inside Agent Teams members. When used in Teams, the member directly executes the 3-phase workflow without Skill tool invocation.
 
  ## Usage
 
@@ -50,7 +50,11 @@ Phase 1: Discovery Research
  └── Output: research report (artifact)
  ```
 
- **Execution**: Delegates to `/research` skill via `Skill(research, args="<topic>")`. The orchestrator waits for completion before proceeding to Phase 2.
+ **Execution**:
+ - **Orchestrator mode**: Delegates to `/research` skill via `Skill(research, args="<topic>")`.
+ - **Teams mode**: Executes the research workflow inline (see Teams Mode section). The member spawns research teams directly as sub-agents.
+
+ The executor waits for completion before proceeding to Phase 2.
 
  **Output**: Full research report with ADOPT/ADAPT/AVOID taxonomy.
 
@@ -251,7 +255,7 @@ Phase 1 delegation to `/research` means Agent Teams decisions are handled by the
 
  | Component | Integration |
  |-----------|-------------|
- | `/research` | Phase 1 full invocation + Phase 3 reduced invocation pattern |
+ | `/research` | Phase 1 full invocation (via Skill tool or inline in Teams mode) + Phase 3 reduced invocation pattern |
  | EnterPlanMode/ExitPlanMode | Phase 2 plan creation and user approval |
  | Explore agents | Phase 2 codebase verification (up to 3 parallel) |
  | R009 | Phase 1 (10 teams batched), Phase 2 (3 Explore), Phase 3 (3 teams) |
@@ -271,6 +275,53 @@ Phase 1 delegation to `/research` means Agent Teams decisions are handled by the
  | Explore agent failure | Reduce parallel count, retry with remaining |
  | Partial team failure | Synthesize from available results, note gaps |
 
+ ## Teams Mode
+
+ When running inside an Agent Teams member (not via Skill tool), the deep-plan workflow operates identically but with these adaptations:
+
+ ### How It Works
+
+ The orchestrator reads this SKILL.md and includes the deep-plan instructions directly in the Teams member's prompt. The member then:
+
+ 1. Phase 1: Executes research workflow inline (not via `Skill(research)`) — spawns 10 research teams as sub-agents
+ 2. Phase 2: Uses EnterPlanMode/ExitPlanMode and Explore agents normally
+ 3. Phase 3: Spawns 3 verification teams as sub-agents
+ 4. Delivers final verified plan via `SendMessage` to team lead
+
+ ### Prompt Embedding Pattern
+
+ ```
+ # When spawning a Teams member for deep-plan:
+ Agent(
+   name: "planner-1",
+   team_name: "my-team",
+   prompt: """
+     You are a deep-plan agent. Follow the deep-plan skill workflow below:
+     {contents of deep-plan/SKILL.md}
+
+     Also follow this research workflow for Phase 1:
+     {contents of research/SKILL.md}
+
+     Topic: {user's planning topic}
+     Deliver verified plan via SendMessage to team lead when complete.
+   """
+ )
+ ```
+
+ ### Differences from Orchestrator Mode
+
+ | Aspect | Orchestrator Mode | Teams Mode |
+ |--------|------------------|------------|
+ | Invocation | `Skill(deep-plan)` | Prompt embedding |
+ | Phase 1 research | `Skill(research)` | Inline execution |
+ | Result delivery | Return to main conversation | `SendMessage` to team lead |
+ | Plan approval | User via ExitPlanMode | Team lead via SendMessage |
+ | Context isolation | Previously used `context: fork` | Standard context (no fork) |
+
+ ### Why No context: fork
+
+ `context: fork` was removed to enable Teams compatibility. Fork blocks sub-agent spawning, which is essential for Phase 1 (10 research teams) and Phase 3 (3 verification teams). Without fork, deep-plan operates in the standard context, which is required for both orchestrator and Teams usage.
+
  ## Artifact Persistence
 
  Phase 1 research artifact is persisted by the `/research` skill.
@@ -18,6 +18,7 @@ context: fork
  | Tooling | tool-npm-expert, tool-optimizer, tool-bun-expert |
  | Database | db-supabase-expert, db-postgres-expert, db-redis-expert |
  | Architect | arch-documenter, arch-speckit-agent |
+ | Security | sec-codeql-expert |
  | Infra | infra-docker-expert, infra-aws-expert |
 
  ## File Extension Mapping
@@ -67,6 +68,7 @@ context: fork
  | supabase, rls, edge function | db-supabase-expert |
  | docker, dockerfile, container, compose | infra-docker-expert |
  | aws, cloudformation, vpc, iam, s3, lambda, cdk, terraform | infra-aws-expert |
+ | security, codeql, cve, vulnerability, sarif, sast, security audit | sec-codeql-expert |
  | architecture, adr, openapi, swagger, diagram | arch-documenter |
  | spec, specification, tdd, requirements | arch-speckit-agent |
 
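The keyword mapping above (including the new `sec-codeql-expert` row) behaves like a first-match lookup. A minimal sketch, assuming case-insensitive substring matching; the function name and matching strategy are illustrative, not part of the package:

```shell
#!/usr/bin/env sh
# Illustrative first-match keyword router; names and strategy are assumptions.
route_expert() {
  q=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$q" in
    *codeql*|*cve*|*vulnerabilit*|*sarif*|*sast*|*"security audit"*|*security*) echo sec-codeql-expert ;;
    *docker*|*dockerfile*|*container*|*compose*) echo infra-docker-expert ;;
    *aws*|*cloudformation*|*terraform*|*lambda*) echo infra-aws-expert ;;
    *supabase*|*rls*|*"edge function"*) echo db-supabase-expert ;;
    *) echo no-match ;;
  esac
}

route_expert "Run a CodeQL security audit"   # → sec-codeql-expert
```

Arm order matters: the security arm comes first so "codeql security audit" does not fall through to a weaker match.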
@@ -97,10 +99,11 @@ Check if Agent Teams is available (`CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1` or T
  For **new file creation**, **boilerplate**, or **test code generation**:
 
  1. Check `/tmp/.claude-env-status-*` for codex availability
- 2. If codex available → suggest hybrid workflow:
+ 2. If codex available AND the task involves new file creation → automatically delegate to `/codex-exec` for scaffolding:
+    - Display: `[Codex Hybrid] Delegating to codex-exec...`
     - codex-exec generates initial code (strength: fast generation)
-    - Claude expert reviews and refines (strength: reasoning, quality)
- 3. If codex unavailable → use Claude expert directly
+    - Selected Claude expert reviews and refines codex output (strength: reasoning, quality)
+ 3. If codex unavailable → display `[Codex] Unavailable — proceeding with {expert} directly` and use the Claude expert directly
 
  **Suitable**: New file creation, boilerplate, scaffolding, test code
  **Unsuitable**: Existing code modification, architecture decisions, bug fixes
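The availability check and both display branches above can be sketched in shell. Only the `/tmp/.claude-env-status-*` path and the display strings come from the skill; the status-file format (a `codex=available` marker line) and the expert name are assumptions:

```shell
#!/usr/bin/env sh
# Sketch of steps 1-3; the `codex=available` marker is a hypothetical format.
codex_available() {
  for f in /tmp/.claude-env-status-*; do
    [ -f "$f" ] && grep -q '^codex=available$' "$f" && return 0
  done
  return 1
}

expert="ts-expert"   # hypothetical expert selected earlier in routing
if codex_available; then
  echo "[Codex Hybrid] Delegating to codex-exec..."
else
  echo "[Codex] Unavailable — proceeding with ${expert} directly"
fi
```

If no status file matches, the unmatched glob stays literal, the `-f` test fails, and the fallback branch runs, mirroring step 3.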
@@ -110,26 +113,11 @@ Route to appropriate language/framework expert based on file extension and keywo
 
  ### Step 4: Ontology-RAG Enrichment (R019)
 
- After agent selection, enrich the spawned agent's prompt with ontology context:
+ If `get_agent_for_task` MCP tool is available, call it with the original query and inject `suggested_skills` into the agent prompt. Skip silently on failure.
 
- 1. Call `get_agent_for_task(original_query)` via MCP
- 2. Extract `suggested_skills` from response
- 3. If `suggested_skills` non-empty, prepend to spawned agent prompt:
-    `"Ontology context suggests these skills may be relevant: {suggested_skills}"`
- 4. On MCP failure: skip silently, proceed with unmodified prompt
+ ### Step 5: Soul Injection (R006)
 
- **This step is advisory only it never changes which agent is selected.**
-
- ### Step 5: Soul Injection
-
- If the selected agent has `soul: true` in its frontmatter:
-
- 1. Read `.claude/agents/souls/{agent-name}.soul.md`
- 2. If file exists, prepend soul content to the agent's prompt:
-    `"Identity context:\n{soul content}\n\n---\n\n"`
- 3. If file doesn't exist → skip silently (no error, no injection)
-
- **This step runs after ontology-RAG enrichment. Soul content is identity context, not capability instructions.**
+ If the selected agent has `soul: true` in frontmatter, read and prepend `.claude/agents/souls/{agent-name}.soul.md` content to the prompt. Skip silently if file doesn't exist.
 
  ## Routing Rules
 
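The condensed Step 5 keeps the exact prepend format spelled out in the removed lines (`Identity context:\n{soul content}\n\n---\n\n`). A minimal sketch of that behavior; the helper name is illustrative:

```shell
#!/usr/bin/env sh
# Illustrative soul-injection helper; mirrors the documented skip-silently behavior.
inject_soul() {
  agent="$1"; prompt="$2"
  soul=".claude/agents/souls/${agent}.soul.md"
  if [ -f "$soul" ]; then
    printf 'Identity context:\n%s\n\n---\n\n%s\n' "$(cat "$soul")" "$prompt"
  else
    printf '%s\n' "$prompt"   # file missing: no error, no injection
  fi
}

inject_soul missing-agent "Review the PR"   # → Review the PR (unmodified)
```

A missing soul file leaves the prompt untouched, matching the "skip silently" rule.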
@@ -2,7 +2,7 @@
  name: dev-refactor
  description: Refactor code for better structure and patterns
  scope: core
- argument-hint: "<file-or-directory> [--lang <language>]"
+ argument-hint: "<file-or-directory> [--lang <language>] [--spec]"
  ---
 
  # Code Refactoring Skill
@@ -198,3 +198,36 @@ Summary:
 
  Recommendation: Run tests to verify changes.
  ```
+
+ ## Spec Mode (`--spec`)
+
+ When the `--spec` flag is present, refactoring is guided by the target's canonical specification:
+
+ ### Workflow
+
+ 1. **Load spec**: Read `.claude/specs/<agent-name>.spec.md` (generated by `/omcustom:takeover`)
+    - If spec doesn't exist, run takeover first: `/omcustom:takeover <name>`
+ 2. **Extract invariants**: Parse the spec's `## Invariants` section as pre-flight guard constraints
+ 3. **Refactor**: Perform normal refactoring (per existing workflow)
+ 4. **Verify invariants**: After refactoring, check each invariant still holds:
+    ```
+    [Spec Verification]
+    ├── ✓ Invariant 1: {description} — PASS
+    ├── ✓ Invariant 2: {description} — PASS
+    └── ✗ Invariant 3: {description} — FAIL (reason)
+    ```
+ 5. **Regenerate spec**: If refactoring changed the contract, run `/omcustom:takeover <name>` to update
+
+ ### When to Use
+
+ | Scenario | Use `--spec`? |
+ |----------|--------------|
+ | Refactoring agent internals | Yes — preserves declared invariants |
+ | Renaming/restructuring skill | Yes — ensures contract stability |
+ | Simple code cleanup | No — overhead not justified |
+ | Adding new capability | No — spec will change anyway |
+
+ ### Prerequisites
+
+ - `.claude/specs/<name>.spec.md` must exist (or will be auto-generated via takeover)
+ - Target must be an agent or skill (not arbitrary code)
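Step 2 of the workflow (parsing the `## Invariants` section) could look like the following. The spec layout, a `## Invariants` heading followed by `-` bullets that ends at the next `## ` heading, is inferred from this section, and the helper name is hypothetical:

```shell
#!/usr/bin/env sh
# Pull invariant bullets from a spec file; stops at the next `## ` heading.
extract_invariants() {
  awk '/^## Invariants$/ { grab = 1; next }
       /^## /            { grab = 0 }
       grab && /^- /     { print }' "$1"
}

cat > /tmp/demo.spec.md <<'EOF'
# demo spec

## Invariants
- argument-hint stays backward compatible
- scope remains core

## Notes
- not an invariant
EOF

extract_invariants /tmp/demo.spec.md
```

Each printed bullet then becomes one line of the `[Spec Verification]` checklist.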
@@ -46,6 +46,14 @@ evaluator-optimizer:
  | `quality_gate.threshold` | No | `0.8` | Score threshold (for `score_threshold` type) |
  | `max_iterations` | No | `3` | Max refinement loops (hard cap: 5) |
 
+ ### Model Selection Guidance
+
+ For model selection within the evaluator-optimizer loop, follow the [reasoning-sandwich](/skills/reasoning-sandwich) pattern:
+
+ - **Generator**: Use `sonnet` (default) — optimized for content generation
+ - **Evaluator**: Use `opus` (default) — benefits from stronger reasoning for quality assessment
+ - **Override**: For simpler domains, `sonnet`/`sonnet` is acceptable; for critical domains, consider `opus`/`opus`
+
  ## Quality Gate Types
 
  | Type | Behavior |
@@ -148,6 +156,7 @@ The evaluator MUST return a structured verdict in this format:
  | Documentation | `arch-documenter` | opus reviewer | Completeness, clarity, accuracy |
  | Architecture | Plan agent | opus reviewer | No SPOFs, no circular deps |
  | Test plans | `qa-planner` | `qa-engineer` | Coverage, edge cases, feasibility |
+ | Test coverage | `qa-writer` | `qa-engineer` + coverage tool | `coverage >= target%` |
  | Agent creation | `mgr-creator` | opus reviewer | Frontmatter validity, R006 compliance |
  | Security audit | `sec-codeql-expert` | opus reviewer | Vulnerability coverage, false positive rate |
 
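The generator/evaluator pairings in the table follow the reasoning-sandwich defaults described above (sonnet generator, opus evaluator, with overrides at either end). A hedged sketch of that selection logic; the domain labels and function name are illustrative, not an API:

```shell
#!/usr/bin/env sh
# Illustrative model pairing per the reasoning-sandwich defaults and overrides.
select_models() {
  case "$1" in
    critical) echo "generator=opus evaluator=opus" ;;     # critical domains
    simple)   echo "generator=sonnet evaluator=sonnet" ;; # simpler domains
    *)        echo "generator=sonnet evaluator=opus" ;;   # default sandwich
  esac
}

select_models documentation   # → generator=sonnet evaluator=opus
```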
@@ -208,6 +217,50 @@ evaluator-optimizer:
    max_iterations: 3
  ```
 
+ ### Domain: Test Coverage Optimization
+
+ ```yaml
+ evaluator-optimizer:
+   generator:
+     agent: qa-writer
+     model: sonnet
+   evaluator:
+     agent: qa-engineer
+     model: sonnet
+   rubric:
+     - criterion: line_coverage
+       weight: 0.4
+       description: "Percentage of code lines exercised by tests"
+     - criterion: branch_coverage
+       weight: 0.3
+       description: "Percentage of conditional branches tested"
+     - criterion: edge_cases
+       weight: 0.2
+       description: "Critical edge cases explicitly tested"
+     - criterion: test_quality
+       weight: 0.1
+       description: "Tests are meaningful, not just hitting lines"
+   quality_gate:
+     type: score_threshold
+     threshold: 0.8
+   max_iterations: 5
+   parameters:
+     target_coverage: 80  # Minimum coverage percentage
+     max_iterations: 5    # Hard cap (matches skill-level cap)
+ ```
+
+ **Workflow**:
+ 1. qa-writer generates test cases targeting uncovered code
+ 2. qa-engineer runs tests and measures coverage
+ 3. If coverage < target: qa-writer generates additional tests for uncovered paths
+ 4. Repeat until target reached or max_iterations exhausted
+
+ **Parameters**:
+ | Parameter | Default | Description |
+ |-----------|---------|-------------|
+ | `target_coverage` | 80% | Minimum acceptable coverage |
+ | `max_iterations` | 5 | Hard cap on refinement loops |
+
  ## Integration
 
  | Rule | Integration |
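The `score_threshold` gate for the rubric above reduces to a weighted sum (0.4/0.3/0.2/0.1) compared against 0.8. A sketch of that arithmetic; the helper names are illustrative:

```shell
#!/usr/bin/env sh
# Weighted rubric score: 0.4*line + 0.3*branch + 0.2*edge + 0.1*quality.
weighted_score() {
  awk -v l="$1" -v b="$2" -v e="$3" -v q="$4" \
      'BEGIN { printf "%.2f", 0.4*l + 0.3*b + 0.2*e + 0.1*q }'
}

gate_passes() {   # exit 0 when the combined score meets the 0.8 threshold
  awk -v s="$(weighted_score "$1" "$2" "$3" "$4")" 'BEGIN { exit !(s >= 0.8) }'
}

weighted_score 0.9 0.8 0.7 1.0   # → 0.84 (passes the 0.8 gate)
```

When the gate fails, the loop re-enters the generator (workflow step 3) until `max_iterations` is exhausted.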
@@ -47,261 +47,26 @@ full_qa_cycle → all agents (sequential)
 
  ### Ontology-RAG Enrichment (R019)
 
- After agent selection, enrich the spawned agent's prompt with ontology context:
+ If `get_agent_for_task` MCP tool is available, call it with the original query and inject `suggested_skills` into the agent prompt. Skip silently on failure.
 
- 1. Call `get_agent_for_task(original_query)` via MCP
- 2. Extract `suggested_skills` from response
- 3. If `suggested_skills` non-empty, prepend to spawned agent prompt:
-    `"Ontology context suggests these skills may be relevant: {suggested_skills}"`
- 4. On MCP failure: skip silently, proceed with unmodified prompt
+ ### Step 5: Soul Injection (R006)
 
- **This step is advisory only it never changes which agent is selected.**
+ If the selected agent has `soul: true` in frontmatter, read and prepend `.claude/agents/souls/{agent-name}.soul.md` content to the prompt. Skip silently if file doesn't exist.
 
- ### Step 5: Soul Injection
+ ## Sequential Workflow Ordering
 
- If the selected agent has `soul: true` in its frontmatter:
-
- 1. Read `.claude/agents/souls/{agent-name}.soul.md`
- 2. If file exists, prepend soul content to the agent's prompt:
-    `"Identity context:\n{soul content}\n\n---\n\n"`
- 3. If file doesn't exist → skip silently (no error, no injection)
-
- **This step runs after ontology-RAG enrichment. Soul content is identity context, not capability instructions.**
-
- ## Routing Rules
-
- ### 1. Test Planning
-
- ```
- User: "Create test plan for feature X"
-
- Route:
-   Agent(qa-planner role → create test plan, model: "sonnet")
-
- Output:
-   - Test scenarios
-   - Coverage targets
-   - Acceptance criteria
-   - Risk assessment
- ```
-
- ### 2. Test Documentation
-
- ```
- User: "Document test cases for API"
-
- Route:
-   Agent(qa-writer role → document test cases, model: "sonnet")
-
- Output:
-   - Test case specifications
-   - Test data requirements
-   - Expected results
-   - Test templates
- ```
-
- ### 3. Test Execution
-
- ```
- User: "Execute tests for module Y"
-
- Route:
-   Agent(qa-engineer role → execute tests, model: "sonnet")
-
- Output:
-   - Test execution results
-   - Pass/fail metrics
-   - Defect reports
-   - Coverage reports
- ```
-
- ### 4. Quality Analysis
-
- When analysis is needed (parallel execution):
-
- ```
- User: "Analyze quality metrics"
-
- Route (parallel):
-   Agent(qa-planner role → analyze strategy, model: "sonnet")
-   Agent(qa-engineer role → analyze results, model: "sonnet")
-
- Aggregate:
-   Strategy insights + execution data
- ```
-
- ### 5. Full QA Cycle (Sequential)
-
- For complete quality assurance workflow:
-
- ```
- User: "Run full QA cycle for feature Z"
-
- Route (sequential):
-   1. Agent(qa-planner role → create test plan, model: "sonnet")
-   2. Agent(qa-writer role → document test cases, model: "sonnet")
-   3. Agent(qa-engineer role → execute tests, model: "sonnet")
-   4. Agent(qa-writer role → generate report, model: "sonnet")
-
- Aggregate and present final report
- ```
-
- ## Full QA Cycle Workflow
-
- ```
- 1. Planning Phase (qa-planner)
-    - Analyze requirements
-    - Define test scenarios
-    - Set acceptance criteria
-    - Identify risks
-
- 2. Documentation Phase (qa-writer)
-    - Write test cases
-    - Define test data
-    - Document expected results
-    - Create templates
-
- 3. Execution Phase (qa-engineer)
-    - Execute test cases
-    - Record results
-    - Report defects
-    - Calculate coverage
-
- 4. Reporting Phase (qa-writer)
-    - Aggregate results
-    - Generate reports
-    - Document findings
-    - Provide recommendations
-
- 5. Aggregation (qa-lead routing)
-    - Combine all phases
-    - Present unified status
-    - Highlight critical issues
- ```
-
- ## Sequential vs Parallel Execution
-
- ### Sequential (typical for QA workflow)
-
- QA workflow is typically sequential because each phase depends on the previous:
- - Planning must complete before documentation
- - Documentation must complete before execution
- - Execution must complete before reporting
+ Full QA cycle follows sequential phases (each depends on the previous):
 
  ```
  qa-planner → qa-writer → qa-engineer → qa-writer
    (plan)      (document)   (execute)    (report)
  ```
 
- ### Parallel (rare, for independent analyses)
-
- Only when tasks are truly independent:
- - Quality analysis (strategy + results)
- - Multi-module testing (independent modules)
-
- ```
- Example:
-   Agent(qa-engineer role → test module A, model: "sonnet")
-   Agent(qa-engineer role → test module B, model: "sonnet")
-   Agent(qa-engineer role → test module C, model: "sonnet")
- ```
+ Parallel execution only for independent analyses (e.g., multi-module testing). See R009.
 
  ## Sub-agent Model Selection
 
- ### Model Mapping
-
- | Agent | Recommended Model | Reason |
- |-------|-------------------|--------|
- | qa-planner | `sonnet` | Strategy requires balanced reasoning |
- | qa-writer | `sonnet` | Documentation quality matters |
- | qa-engineer | `sonnet` | Test execution needs accuracy |
-
- All QA agents typically use `sonnet` for balanced quality output.
-
- ### Agent Call Examples
-
- ```
- # Test planning
- Agent(
-   subagent_type: "general-purpose",
-   prompt: "Create comprehensive test plan for authentication feature following qa-planner guidelines",
-   model: "sonnet"
- )
-
- # Test documentation
- Agent(
-   subagent_type: "general-purpose",
-   prompt: "Document test cases for API endpoints following qa-writer guidelines",
-   model: "sonnet"
- )
-
- # Test execution
- Agent(
-   subagent_type: "general-purpose",
-   prompt: "Execute integration tests and report results following qa-engineer guidelines",
-   model: "sonnet"
- )
- ```
-
- ## Display Format
-
- ### Full QA Cycle
-
- ```
- [Planning] Delegating to qa-planner...
- → Test plan created (15 scenarios)
-
- [Documentation] Delegating to qa-writer...
- → 15 test cases documented
-
- [Execution] Delegating to qa-engineer...
- → 13 passed, 2 failed
-
- [Report] Generating summary...
-   Coverage: 85%
-   Pass Rate: 87%
-   Defects: 2 (1 High, 1 Medium)
-
- [Done] QA cycle completed
- ```
-
- ### Parallel Quality Analysis
-
- ```
- [Analyzing] Spawning parallel analysis...
-
- [Instance 1] strategy-analysis:sonnet → qa-planner
- [Instance 2] results-analysis:sonnet → qa-engineer
-
- [Progress] ████████████ 2/2
-
- [Summary]
-   Strategy: Coverage targets met, add edge cases
-   Results: 85% pass rate, 2 critical defects
-
- Analysis completed.
- ```
-
- ## Integration with Other Agents
-
- - **Receives requirements from**: arch-speckit-agent (sw-architect)
- - **Reports quality status to**: dev-lead
- - **Coordinates with**: Language experts for automated tests
- - **Provides feedback to**: Development team via dev-lead
-
- ## Metrics Tracking
-
- QA lead routing should aggregate these metrics:
-
- ```yaml
- metrics:
-   test_coverage: percentage
-   pass_rate: percentage
-   defect_count: number
-   defect_severity: [critical, high, medium, low]
-   execution_time: duration
-   test_case_count: number
- ```
+ All QA agents use `sonnet` by default for balanced quality output.
 
  ## No Match Fallback
 
@@ -3,13 +3,14 @@ name: research
  description: 10-team parallel deep analysis with cross-verification for any topic, repository, or technology. Use when user invokes /research or asks for comprehensive research.
  scope: core
  user-invocable: true
+ teams-compatible: true
  ---
 
  # Research Skill
 
  Orchestrates 10 parallel research teams for comprehensive deep analysis of any topic, GitHub repository, or technology. Produces a structured report with ADOPT/ADAPT/AVOID taxonomy.
 
- **Orchestrator-only** — only the main conversation uses this skill (R010). Teams execute as subagents.
+ **Teams-compatible** — works both from the main conversation (R010) and inside Agent Teams members. When used in Teams, the member directly executes the research workflow without Skill tool invocation.
 
  ## Usage
 
@@ -162,8 +163,21 @@ Batch 3: T9, T10 (Innovation)
 
  ### Phase 2: Cross-Verification Loop (min 2, max 30 rounds)
 
+ #### Codex Availability Check
+
+ Before starting verification rounds, check codex availability:
+
+ ```bash
+ # Run this check once before Phase 2 begins
+ which codex &>/dev/null && [ -n "$OPENAI_API_KEY" ]
+ # Exit 0 → codex available: enable dual-model verification (opus + codex)
+ # Exit 1 → codex unavailable: display notice and proceed with opus-only
+ ```
+
+ If unavailable, display: `[Phase 2] Codex unavailable — opus-only verification`
+
  ```
- Team findings ──→ opus 4.6 verification ──→ codex-exec xhigh verification
+ Team findings ──→ opus 4.6 verification ──→ codex-exec xhigh verification (if available)
          │                    │
          └── Contradiction detected? ── YES ──→ Round N+1
                               NO ──→ Consensus reached → Phase 3
@@ -171,7 +185,8 @@ Team findings ──→ opus 4.6 verification ──→ codex-exec xhigh verific
 
  Each round:
  1. **opus 4.6**: Deep reasoning verification — checks logical consistency, identifies gaps, challenges assumptions
- 2. **codex-exec xhigh** (if available): Independent code-level verification — validates technical claims, tests feasibility
+ 2. **codex-exec xhigh** (when available): Independent code-level verification — validates technical claims, tests feasibility
+    - If unavailable: display `[Phase 2] Round {N}: Codex unavailable, proceeding with opus verification only`
  3. **Contradiction resolution**: Reconcile divergent findings between teams and verifiers
  4. **Convergence check**: All major claims verified with no outstanding contradictions → proceed
 
@@ -271,10 +286,16 @@ Round N:
    - Internal consistency (breadth ↔ depth alignment)
    - Cross-domain consistency (security ↔ architecture)
    - Evidence quality (claims without backing)
- Step 2: codex-exec validates technical claims:
-   - Code patterns actually exist
-   - Benchmarks are reproducible
-   - Dependencies resolve correctly
+ Step 2: codex-exec validates technical claims (when available):
+   a. Invoke: /codex-exec with findings from all teams
+   b. Prompt: "Validate technical claims: {findings}.
+      Check code patterns, benchmark reproducibility,
+      dependency resolution."
+   c. Effort: --effort xhigh
+   d. Parse: contradictions → merge with opus findings
+   e. On timeout/error: log "[Phase 2] Round {N}: codex-exec error — {reason},
+      continuing with opus results only"
+   If unavailable: log "[Phase 2] Round {N}: Codex unavailable, proceeding with opus verification only"
  Step 3: Compile contradiction list
    - 0 contradictions → CONVERGED
    - >0 contradictions → feedback to relevant teams → Round N+1
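The round structure above (minimum 2 rounds, maximum 30, converge at zero contradictions) can be sketched as a loop. The `count_contradictions` callback is a stub standing in for the opus/codex verification calls and is not a real API:

```shell
#!/usr/bin/env sh
# Convergence loop sketch: min 2 rounds, max 30, stop at zero contradictions.
run_verification() {
  rounds=0
  while [ "$rounds" -lt 30 ]; do
    rounds=$((rounds + 1))
    contradictions=$("$@" "$rounds")   # stub for opus + codex verification
    if [ "$contradictions" -eq 0 ] && [ "$rounds" -ge 2 ]; then
      echo "CONVERGED after $rounds rounds"
      return 0
    fi
  done
  echo "NOT CONVERGED after 30 rounds"
  return 1
}

# Hypothetical stub: contradictions clear from round 3 onward.
count_contradictions() { if [ "$1" -ge 3 ]; then echo 0; else echo 2; fi; }
run_verification count_contradictions   # → CONVERGED after 3 rounds
```

Note the `-ge 2` guard: even a clean first round cannot converge, enforcing the 2-round minimum.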
@@ -385,6 +406,52 @@ Progress:
  └── T9-T10: ○ Pending
  ```
 
+ ## Teams Mode
+
+ When running inside an Agent Teams member (not via Skill tool), the research workflow operates identically but with these adaptations:
+
+ ### How It Works
+
+ The orchestrator reads this SKILL.md and includes the research instructions directly in the Teams member's prompt. The member then:
+
+ 1. Executes Phases 1-4 autonomously using its own Agent tool access
+ 2. Spawns research teams as sub-agents (Teams members CAN spawn sub-agents)
+ 3. Delivers results via `SendMessage` to the team lead instead of returning to orchestrator
+
+ ### Prompt Embedding Pattern
+
+ ```
+ # When spawning a Teams member for research:
+ Agent(
+   name: "researcher-1",
+   team_name: "my-team",
+   prompt: """
+     You are a research agent. Follow the research skill workflow below:
+     {contents of research/SKILL.md}
+
+     Topic: {user's research topic}
+     Deliver results via SendMessage to team lead when complete.
+   """
+ )
+ ```
+
+ ### Differences from Orchestrator Mode
+
+ | Aspect | Orchestrator Mode | Teams Mode |
+ |--------|------------------|------------|
+ | Invocation | `Skill(research)` | Prompt embedding |
+ | Result delivery | Return to main conversation | `SendMessage` to team lead |
+ | Artifact persistence | Teams member writes artifact | Same |
+ | GitHub issue creation | Orchestrator handles | Teams member handles directly |
+ | Phase management | Orchestrator manages phases | Member manages phases autonomously |
+
+ ### Constraints
+
+ - Each Teams member running research still respects R009 (max 4 concurrent sub-agents)
+ - Batching order remains: T1-T4 → T5-T8 → T9-T10
+ - Cost is identical to orchestrator mode (~$8-15 per research invocation)
+ - Multiple Teams members running research simultaneously will multiply costs proportionally
+
  ## Integration
 
  | Rule | Integration |