oh-my-customcode 0.158.0 → 0.159.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +4 -4
- package/dist/cli/index.js +1 -1
- package/dist/index.js +1 -1
- package/package.json +1 -1
- package/templates/.claude/rules/MUST-agent-design.md +1 -1
- package/templates/.claude/rules/MUST-agent-teams.md +1 -16
- package/templates/.claude/skills/de-lead-routing/SKILL.md +1 -21
- package/templates/.claude/skills/dev-lead-routing/SKILL.md +1 -21
- package/templates/.claude/skills/intent-detection/SKILL.md +3 -9
- package/templates/.claude/skills/intent-detection/patterns/agent-triggers.yaml +3 -12
- package/templates/.claude/skills/research/SKILL.md +8 -35
- package/templates/.claude/skills/roundtable-debate/SKILL.md +6 -7
- package/templates/.claude/skills/structured-dev-cycle/SKILL.md +1 -12
- package/templates/CLAUDE.md +3 -2
- package/templates/guides/agent-design/google-agents-cli-patterns.md +1 -2
- package/templates/guides/multi-agent-debate-patterns/README.md +1 -10
- package/templates/guides/multi-provider-exec/README.md +2 -15
- package/templates/manifest.json +2 -2
- package/templates/.claude/skills/agora/SKILL.md +0 -288
- package/templates/.claude/skills/codex-exec/SKILL.md +0 -259
- package/templates/.claude/skills/codex-exec/scripts/codex-wrapper.cjs +0 -430
- package/templates/.claude/skills/gemini-exec/SKILL.md +0 -215
- package/templates/.claude/skills/gemini-exec/scripts/gemini-wrapper.cjs +0 -485
|
@@ -1,288 +0,0 @@
|
|
|
1
|
-
---
|
|
2
|
-
name: omcustom:agora
|
|
3
|
-
description: "Multi-LLM adversarial consensus loop — 3+ LLMs compete to find flaws in designs/specs until unanimous agreement is reached"
|
|
4
|
-
user-invocable: true
|
|
5
|
-
argument-hint: "<document-path> [--rounds N] [--severity-threshold HIGH]"
|
|
6
|
-
effort: max
|
|
7
|
-
scope: core
|
|
8
|
-
version: 1.0.0
|
|
9
|
-
source:
|
|
10
|
-
type: external
|
|
11
|
-
origin: github
|
|
12
|
-
url: https://github.com/baekenough/baekenough-skills
|
|
13
|
-
version: 1.0.0
|
|
14
|
-
---
|
|
15
|
-
|
|
16
|
-
# Agora: Multi-LLM Adversarial Consensus
|
|
17
|
-
|
|
18
|
-
3개 이상의 LLM(Claude, Codex/GPT, Gemini)이 경쟁적으로 설계/문서의 결함을 찾고, 만장일치 합의에 도달할 때까지 반복하는 적대적 교차 검증 스킬.
|
|
19
|
-
|
|
20
|
-
## Prerequisites
|
|
21
|
-
|
|
22
|
-
- `codex-exec` skill (Codex/GPT 호출)
|
|
23
|
-
- `gemini-exec` skill (Gemini 호출)
|
|
24
|
-
- Agent Teams enabled (`CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1`) or Agent tool available
|
|
25
|
-
|
|
26
|
-
## Usage
|
|
27
|
-
|
|
28
|
-
```
|
|
29
|
-
/agora docs/design.md # Default: 3 LLMs, unlimited rounds
|
|
30
|
-
/agora docs/design.md --rounds 10 # Max 10 rounds
|
|
31
|
-
/agora docs/design.md --severity-threshold HIGH # Exit when no HIGH+ findings
|
|
32
|
-
/agora docs/design.md --models claude,codex # 2 LLMs only
|
|
33
|
-
```
|
|
34
|
-
|
|
35
|
-
## Workflow
|
|
36
|
-
|
|
37
|
-
### Phase 1: Setup
|
|
38
|
-
1. Read the target document
|
|
39
|
-
2. Create Agent Team: `TeamCreate("agora-review")`
|
|
40
|
-
3. Create review tasks per focus area
|
|
41
|
-
|
|
42
|
-
### Phase 2: Spawn Reviewers (parallel)
|
|
43
|
-
Spawn 3 reviewers as Agent Team members:
|
|
44
|
-
|
|
45
|
-
```
|
|
46
|
-
Agent(name: "claude-critic", model: opus, effort: max)
|
|
47
|
-
→ 20-point deep adversarial review
|
|
48
|
-
|
|
49
|
-
Agent(name: "codex-critic", model: opus)
|
|
50
|
-
→ Invoke Skill(codex-exec) for GPT perspective + independent Claude analysis
|
|
51
|
-
|
|
52
|
-
Agent(name: "gemini-critic", model: opus)
|
|
53
|
-
→ Invoke Skill(gemini-exec) for Gemini perspective + independent Claude analysis
|
|
54
|
-
```
|
|
55
|
-
|
|
56
|
-
### Phase 3: Independent Review
|
|
57
|
-
Each reviewer performs adversarial review with this template:
|
|
58
|
-
|
|
59
|
-
```
|
|
60
|
-
For EACH review point:
|
|
61
|
-
### Round N: [Topic]
|
|
62
|
-
**Severity**: CRITICAL / HIGH / MEDIUM / LOW
|
|
63
|
-
**Flaw**: [Specific, concrete problem description]
|
|
64
|
-
**Evidence**: [Why this is real, not theoretical]
|
|
65
|
-
**Impact**: [What happens if not addressed]
|
|
66
|
-
**Counter-argument**: [Best case FOR the current design]
|
|
67
|
-
**Verdict**: KEEP / MODIFY / REJECT
|
|
68
|
-
```
|
|
69
|
-
|
|
70
|
-
Review areas (adapt to document type):
|
|
71
|
-
- Architecture fundamentals
|
|
72
|
-
- Component/service design
|
|
73
|
-
- Data architecture
|
|
74
|
-
- Security & resilience
|
|
75
|
-
- Feasibility & deployment
|
|
76
|
-
- Testing strategy
|
|
77
|
-
- Operational complexity
|
|
78
|
-
|
|
79
|
-
### Phase 4: Cross-Review (Peer-to-Peer)
|
|
80
|
-
Each reviewer sends findings to the other two via `SendMessage`.
|
|
81
|
-
|
|
82
|
-
Counter-review template:
|
|
83
|
-
1. Which findings do you **AGREE** with? (and why)
|
|
84
|
-
2. Which findings do you **DISAGREE** with? (evidence-based rebuttal)
|
|
85
|
-
3. What did they **MISS** that you caught?
|
|
86
|
-
4. What did they catch that you **MISSED**?
|
|
87
|
-
5. **SEVERITY** adjustments — upgrade or downgrade with justification
|
|
88
|
-
|
|
89
|
-
### Phase 5: Synthesis
|
|
90
|
-
Team lead aggregates all findings:
|
|
91
|
-
|
|
92
|
-
```
|
|
93
|
-
UNANIMOUS CRITICAL: [findings all 3 agreed on]
|
|
94
|
-
STRONG AGREEMENT: [findings 2/3 agreed on]
|
|
95
|
-
SPLIT DECISIONS: [findings with disagreement + resolution]
|
|
96
|
-
```
|
|
97
|
-
|
|
98
|
-
Determine verdict:
|
|
99
|
-
- **BUILD**: No CRITICAL, no unresolved HIGH
|
|
100
|
-
- **BUILD WITH CHANGES**: No CRITICAL, HIGH findings have accepted mitigations
|
|
101
|
-
- **REDESIGN**: Any unresolved CRITICAL findings
|
|
102
|
-
- **ABANDON**: Fundamental concept is flawed
|
|
103
|
-
|
|
104
|
-
### Phase 6: Loop (if REDESIGN)
|
|
105
|
-
1. Team lead produces/delegates redesign addressing ALL critical findings
|
|
106
|
-
2. New version sent to ALL reviewers: `SendMessage(to: "*")`
|
|
107
|
-
3. Reviewers re-review → GOTO Phase 4
|
|
108
|
-
4. Repeat until EXIT criteria met
|
|
109
|
-
|
|
110
|
-
### Phase 7: Exit (consensus reached)
|
|
111
|
-
When ALL reviewers agree BUILD or BUILD WITH CHANGES:
|
|
112
|
-
1. Produce final consensus report
|
|
113
|
-
2. Write to `.claude/outputs/sessions/{date}/agora-{topic}-{time}.md`
|
|
114
|
-
|
|
115
|
-
### Tool: Writing artifacts under .claude/outputs/
|
|
116
|
-
|
|
117
|
-
Direct Write to `.claude/outputs/sessions/{date}/agora-{topic}-{time}.md` is permitted under `mode: "bypassPermissions"` (CC v2.1.121+, #1101). Write tool auto-creates parent directory. For CC < v2.1.121, see git history for the legacy /tmp bypass pattern.
|
|
118
|
-
|
|
119
|
-
3. Shut down team: `SendMessage(to: "*", message: {type: "shutdown_request"})`
|
|
120
|
-
|
|
121
|
-
## Reviewer Principles
|
|
122
|
-
|
|
123
|
-
1. **NEUTRAL** — no reviewer has home team advantage
|
|
124
|
-
2. **COMPETITIVE** — find flaws others missed
|
|
125
|
-
3. **CRITICAL** — "fewer than 5 CRITICAL flaws = not looking hard enough"
|
|
126
|
-
4. **EVIDENCE-BASED** — every finding cites specific evidence
|
|
127
|
-
5. **CONSTRUCTIVE** — every flaw includes recommended fix
|
|
128
|
-
6. **CONVERGENT** — goal is consensus, not endless disagreement
|
|
129
|
-
|
|
130
|
-
## Consensus Criteria
|
|
131
|
-
|
|
132
|
-
| Condition | Required |
|
|
133
|
-
|-----------|----------|
|
|
134
|
-
| CRITICAL findings resolved | ALL |
|
|
135
|
-
| HIGH findings resolved or accepted | ALL |
|
|
136
|
-
| All reviewers rate BUILD or BUILD WITH CHANGES | YES |
|
|
137
|
-
| Cross-review disagreements resolved | ALL |
|
|
138
|
-
|
|
139
|
-
## Output Format
|
|
140
|
-
|
|
141
|
-
```markdown
|
|
142
|
-
# Agora Consensus Report
|
|
143
|
-
|
|
144
|
-
## Document: [path]
|
|
145
|
-
## Rounds: [N]
|
|
146
|
-
## Reviewers: [list with LLM models used]
|
|
147
|
-
|
|
148
|
-
## Verdict: [BUILD / BUILD WITH CHANGES / REDESIGN]
|
|
149
|
-
|
|
150
|
-
## Unanimous Findings
|
|
151
|
-
| # | Finding | Severity | All 3 Agree |
|
|
152
|
-
|---|---------|----------|-------------|
|
|
153
|
-
|
|
154
|
-
## Required Changes Before Build
|
|
155
|
-
1. [change with source reviewer]
|
|
156
|
-
2. ...
|
|
157
|
-
|
|
158
|
-
## Accepted Risks
|
|
159
|
-
- [finding accepted with justification]
|
|
160
|
-
|
|
161
|
-
## Unique Contributions Per Reviewer
|
|
162
|
-
| Reviewer | Findings Others Missed |
|
|
163
|
-
|----------|----------------------|
|
|
164
|
-
|
|
165
|
-
## Process Metrics
|
|
166
|
-
- Rounds: N
|
|
167
|
-
- Total findings: N
|
|
168
|
-
- Cross-adopted: N
|
|
169
|
-
- Severity upgrades: N
|
|
170
|
-
- Severity downgrades: N
|
|
171
|
-
- Disagreements raised: N
|
|
172
|
-
- Disagreements resolved: N/N
|
|
173
|
-
```
|
|
174
|
-
|
|
175
|
-
## Configuration
|
|
176
|
-
|
|
177
|
-
```yaml
|
|
178
|
-
# Default settings
|
|
179
|
-
agora:
|
|
180
|
-
max_rounds: unlimited # Set --rounds to limit
|
|
181
|
-
severity_threshold: HIGH # EXIT when no findings >= threshold
|
|
182
|
-
models:
|
|
183
|
-
- claude (opus, max effort)
|
|
184
|
-
- codex (via codex-exec skill)
|
|
185
|
-
- gemini (via gemini-exec skill)
|
|
186
|
-
review_points: 20 # Per reviewer
|
|
187
|
-
cross_review: true # Peer-to-peer sharing
|
|
188
|
-
auto_redesign: true # Auto-produce redesign on REDESIGN verdict
|
|
189
|
-
```
|
|
190
|
-
|
|
191
|
-
## Anti-Patterns
|
|
192
|
-
|
|
193
|
-
| Anti-Pattern | Why Wrong | Correct |
|
|
194
|
-
|-------------|-----------|---------|
|
|
195
|
-
| Single LLM review | Misses blind spots | 3+ LLMs find complementary flaws |
|
|
196
|
-
| No cross-review | Reviewers don't challenge each other | Peer-to-peer sharing surfaces disagreements |
|
|
197
|
-
| Accepting first BUILD | May miss edge cases | Loop until ALL agree |
|
|
198
|
-
| Ignoring split decisions | Unresolved disagreements fester | Resolve every split with evidence |
|
|
199
|
-
| Push for consensus too fast | Premature agreement | Let reviewers challenge freely |
|
|
200
|
-
|
|
201
|
-
When spawning agents via the Agent tool during this skill's execution, always pass `mode: "bypassPermissions"`. The Agent tool default (`acceptEdits`) overrides agent frontmatter `permissionMode`, causing permission prompts during unattended execution.
|
|
202
|
-
|
|
203
|
-
## Ontology Convergence (PoC)
|
|
204
|
-
|
|
205
|
-
> Source: #993 (from ouroboros #966 re-evaluation, Option C deferred → PoC 섹션으로 내재화)
|
|
206
|
-
> Status: Experimental — default disabled
|
|
207
|
-
|
|
208
|
-
agora는 기본적으로 만장일치 기반으로 종료하지만, 의미적 유사도 기반 조기 종료를 **PoC로 지원**합니다.
|
|
209
|
-
|
|
210
|
-
### Rationale
|
|
211
|
-
|
|
212
|
-
여러 라운드 후 모든 에이전트의 마지막 응답이 의미상 거의 동일하면(semantic similarity ≥ threshold), 만장일치를 기다리지 않고 조기 수렴으로 판단하여 토큰 비용을 절감합니다.
|
|
213
|
-
|
|
214
|
-
### Configuration
|
|
215
|
-
|
|
216
|
-
```yaml
|
|
217
|
-
ontology_convergence:
|
|
218
|
-
enabled: false # 기본 비활성 (PoC)
|
|
219
|
-
threshold: 0.95 # cosine similarity 최소값
|
|
220
|
-
min_rounds: 2 # 최소 라운드 (너무 이른 종료 방지)
|
|
221
|
-
embedding_model: voyage-3.5 # 또는 openai-text-embedding-3
|
|
222
|
-
```
|
|
223
|
-
|
|
224
|
-
### Algorithm
|
|
225
|
-
|
|
226
|
-
1. 각 라운드 종료 시 participant 응답의 embedding 계산
|
|
227
|
-
2. Pairwise cosine similarity 매트릭스 생성
|
|
228
|
-
3. 최소 유사도(min pairwise similarity) 계산
|
|
229
|
-
4. `min_sim ≥ threshold` AND `rounds ≥ min_rounds` → 조기 종료
|
|
230
|
-
|
|
231
|
-
### Trade-offs
|
|
232
|
-
|
|
233
|
-
| 장점 | 단점 |
|
|
234
|
-
|------|------|
|
|
235
|
-
| 토큰 절감 (수렴 시 2-3 라운드 단축) | embedding 계산 오버헤드 |
|
|
236
|
-
| 만장일치 편향 완화 (의미 일치만으로 충분) | threshold 튜닝 필요 (프로젝트마다 다름) |
|
|
237
|
-
| 정량적 수렴 지표 | 오분류 시 조기 종료 리스크 |
|
|
238
|
-
|
|
239
|
-
### Activation
|
|
240
|
-
|
|
241
|
-
현재 PoC 단계. 활성화 시 `agora` 스킬 호출 파라미터에 `--ontology-convergence=true` 추가. 프로덕션 승격 결정은 3개월 후 데이터 기반 재평가 (연계: #992 PAL Router Defer+observe 전략과 동일 원칙).
|
|
242
|
-
|
|
243
|
-
### Cross-references
|
|
244
|
-
|
|
245
|
-
- #993 (source)
|
|
246
|
-
- #966 ouroboros 재평가
|
|
247
|
-
- guides/agent-design/pal-cost-routing-analysis.md (유사한 Defer+observe 전략)
|
|
248
|
-
|
|
249
|
-
## Anti-Groupthink Mode (Optional)
|
|
250
|
-
|
|
251
|
-
`agora`의 기본 워크플로우는 만장일치 수렴(convergence)이 목표지만, 토론 과정에서 집단사고(Groupthink) 위험이 있을 때 anti-groupthink mode를 활성화할 수 있습니다.
|
|
252
|
-
|
|
253
|
-
### Activation
|
|
254
|
-
|
|
255
|
-
스킬 호출 시 인자로 활성화:
|
|
256
|
-
```
|
|
257
|
-
/agora docs/design.md --mode anti-groupthink
|
|
258
|
-
```
|
|
259
|
-
|
|
260
|
-
### Mechanisms
|
|
261
|
-
|
|
262
|
-
| 메커니즘 | 동작 |
|
|
263
|
-
|---------|------|
|
|
264
|
-
| Devil's Advocate slot | 리뷰어 1명이 전담 반대자 역할 — 합의 형성 시도에 항상 반대 입장 견지 |
|
|
265
|
-
| Minority opinion protection | 1명만 주장하는 의견도 보존, 기각 시 명시적 정당화(3개 근거) 필수 |
|
|
266
|
-
| Round soft cap | 라운드 3회 도달 시 합의 미도달 영역은 "합의 없음 — 분기 결정 필요"로 종결 (기본 워크플로우는 무한 루프 가능) |
|
|
267
|
-
|
|
268
|
-
### Reviewer Role Adjustment
|
|
269
|
-
|
|
270
|
-
기본 모드(3 reviewers)에 anti-groupthink mode 적용 시:
|
|
271
|
-
- `claude-critic` → Devil's Advocate 전담 (모든 합의 시도에 반대 입장)
|
|
272
|
-
- `codex-critic`, `gemini-critic` → 일반 리뷰 (변경 없음)
|
|
273
|
-
|
|
274
|
-
Round soft cap이 작동하면 최종 보고서에 "UNRESOLVED — BRANCHING DECISION NEEDED" 섹션이 추가됩니다.
|
|
275
|
-
|
|
276
|
-
### When to Use Anti-Groupthink Mode vs roundtable-debate
|
|
277
|
-
|
|
278
|
-
| 상황 | 권장 스킬 |
|
|
279
|
-
|------|---------|
|
|
280
|
-
| 합의가 *필요*하지만 위험 발굴도 필요 | `agora --mode anti-groupthink` |
|
|
281
|
-
| 합의 자체가 *불필요*, 다양한 시각이 산출물 | `roundtable-debate` |
|
|
282
|
-
| 단순 검증 (통과/실패) | `agora` (기본 모드) |
|
|
283
|
-
|
|
284
|
-
자세한 비교는 `guides/multi-agent-debate-patterns/` 가이드 참조 (별도 wave에서 생성 예정).
|
|
285
|
-
|
|
286
|
-
### Attribution
|
|
287
|
-
|
|
288
|
-
Devil's Advocate + minority protection 메커니즘은 cc-roundtable 패턴에서 차용되었습니다. (`roundtable-debate` 스킬과 공유 메커니즘)
|
|
@@ -1,259 +0,0 @@
|
|
|
1
|
-
---
|
|
2
|
-
name: codex-exec
|
|
3
|
-
description: Execute OpenAI Codex CLI prompts and return results
|
|
4
|
-
scope: core
|
|
5
|
-
argument-hint: "<prompt> [--json] [--output <path>] [--model <name>] [--timeout <ms>] [--effort <level>]"
|
|
6
|
-
user-invocable: true
|
|
7
|
-
---
|
|
8
|
-
|
|
9
|
-
# Codex Exec Skill
|
|
10
|
-
|
|
11
|
-
Execute OpenAI Codex CLI prompts in non-interactive mode and return structured results. Enables Claude + Codex hybrid workflows.
|
|
12
|
-
|
|
13
|
-
## Options
|
|
14
|
-
|
|
15
|
-
```
|
|
16
|
-
<prompt> Required. The prompt to send to Codex CLI
|
|
17
|
-
--json Return structured JSON Lines output
|
|
18
|
-
--output <path> Save final message to file
|
|
19
|
-
--model <name> Model override (default: Codex CLI default model)
|
|
20
|
-
--timeout <ms> Execution timeout (default: 120000, max: 600000)
|
|
21
|
-
--full-auto Enable auto-approval mode (codex -a full-auto)
|
|
22
|
-
--working-dir Working directory for Codex execution
|
|
23
|
-
--effort <level> Set reasoning effort level (minimal, low, medium, high, xhigh)
|
|
24
|
-
Maps to Codex CLI's model_reasoning_effort config
|
|
25
|
-
Default: uses Codex CLI's configured default
|
|
26
|
-
Recommended: xhigh for research/analysis tasks
|
|
27
|
-
```
|
|
28
|
-
|
|
29
|
-
## Workflow
|
|
30
|
-
|
|
31
|
-
```
|
|
32
|
-
1. Pre-checks
|
|
33
|
-
- Verify `codex` binary is installed (which codex || npx codex --version)
|
|
34
|
-
- Verify authentication (OPENAI_API_KEY or logged in)
|
|
35
|
-
2. Build command
|
|
36
|
-
- Base: codex exec --ephemeral "<prompt>"
|
|
37
|
-
- Apply options: --json, --model, --full-auto, -C <dir>
|
|
38
|
-
- Set --working-dir if specified
|
|
39
|
-
3. Execute
|
|
40
|
-
- Run via Bash tool with timeout (default 2min, max 10min)
|
|
41
|
-
- Or use helper script: node .claude/skills/codex-exec/scripts/codex-wrapper.cjs
|
|
42
|
-
4. Parse output
|
|
43
|
-
- Text mode: return raw stdout
|
|
44
|
-
- JSON mode: parse JSON Lines, extract final assistant message
|
|
45
|
-
5. Report results
|
|
46
|
-
- Format output with execution metadata
|
|
47
|
-
```
|
|
48
|
-
|
|
49
|
-
## Safety Defaults
|
|
50
|
-
|
|
51
|
-
- `--ephemeral`: No session persistence (conversations not saved)
|
|
52
|
-
- Default mode: Normal approval (Codex prompts for confirmation)
|
|
53
|
-
- Override with `--full-auto` only when explicitly requested
|
|
54
|
-
|
|
55
|
-
## Output Format
|
|
56
|
-
|
|
57
|
-
### Success (Text Mode)
|
|
58
|
-
```
|
|
59
|
-
[Codex Exec] Completed
|
|
60
|
-
|
|
61
|
-
Model: (default)
|
|
62
|
-
Duration: 23.4s
|
|
63
|
-
Working Dir: /path/to/project
|
|
64
|
-
|
|
65
|
-
--- Output ---
|
|
66
|
-
{codex response text}
|
|
67
|
-
```
|
|
68
|
-
|
|
69
|
-
### Success (JSON Mode)
|
|
70
|
-
```
|
|
71
|
-
[Codex Exec] Completed (JSON)
|
|
72
|
-
|
|
73
|
-
Model: (default)
|
|
74
|
-
Duration: 23.4s
|
|
75
|
-
Events: 12
|
|
76
|
-
|
|
77
|
-
--- Final Message ---
|
|
78
|
-
{extracted final assistant message}
|
|
79
|
-
```
|
|
80
|
-
|
|
81
|
-
### Failure
|
|
82
|
-
```
|
|
83
|
-
[Codex Exec] Failed
|
|
84
|
-
|
|
85
|
-
Error: {error_message}
|
|
86
|
-
Exit Code: {code}
|
|
87
|
-
Suggested Fix: {suggestion}
|
|
88
|
-
```
|
|
89
|
-
|
|
90
|
-
## Helper Script
|
|
91
|
-
|
|
92
|
-
For complex executions, use the wrapper script:
|
|
93
|
-
```bash
|
|
94
|
-
node .claude/skills/codex-exec/scripts/codex-wrapper.cjs --prompt "your prompt" [options]
|
|
95
|
-
```
|
|
96
|
-
|
|
97
|
-
The wrapper provides:
|
|
98
|
-
- Environment validation (binary + auth checks)
|
|
99
|
-
- Safe command construction
|
|
100
|
-
- JSON Lines parsing with event extraction
|
|
101
|
-
- Structured JSON output
|
|
102
|
-
- Timeout handling with graceful termination
|
|
103
|
-
|
|
104
|
-
## Examples
|
|
105
|
-
|
|
106
|
-
```bash
|
|
107
|
-
# Simple text prompt
|
|
108
|
-
codex-exec "explain what this project does"
|
|
109
|
-
|
|
110
|
-
# JSON output with model override
|
|
111
|
-
codex-exec "list all TODO items" --json
|
|
112
|
-
|
|
113
|
-
# Save output to file
|
|
114
|
-
codex-exec "generate a README" --output ./README.md
|
|
115
|
-
|
|
116
|
-
# Full auto mode with custom timeout
|
|
117
|
-
codex-exec "fix the failing tests" --full-auto --timeout 300000
|
|
118
|
-
|
|
119
|
-
# Specify working directory
|
|
120
|
-
codex-exec "analyze the codebase" --working-dir /path/to/project
|
|
121
|
-
```
|
|
122
|
-
|
|
123
|
-
## Integration
|
|
124
|
-
|
|
125
|
-
Works with the orchestrator pattern:
|
|
126
|
-
- Main conversation delegates Codex execution via this skill
|
|
127
|
-
- Results are returned to the main conversation for further processing
|
|
128
|
-
- Can be chained with other skills (e.g., dev-review after Codex generates code)
|
|
129
|
-
|
|
130
|
-
## Availability Check
|
|
131
|
-
|
|
132
|
-
codex-exec requires the Codex CLI binary to be installed and authenticated. The skill is only usable when:
|
|
133
|
-
|
|
134
|
-
1. `codex` binary is found in PATH (`which codex` succeeds)
|
|
135
|
-
2. Authentication is valid (OPENAI_API_KEY set or `codex` logged in)
|
|
136
|
-
|
|
137
|
-
If either check fails, this skill cannot be used. Fall back to Claude agents for the task.
|
|
138
|
-
|
|
139
|
-
> **Note**: This skill is invoked via `/codex-exec` command, delegated by the orchestrator, or suggested by routing skills when codex is available. The intent-detection system can trigger it for research (xhigh) and code generation (hybrid) workflows.
|
|
140
|
-
|
|
141
|
-
## Agent Teams Integration
|
|
142
|
-
|
|
143
|
-
When used within Agent Teams (requires explicit invocation):
|
|
144
|
-
|
|
145
|
-
1. **As delegated task**: orchestrator explicitly delegates codex-exec for code generation
|
|
146
|
-
2. **Hybrid workflow**: Claude team member analyzes → orchestrator invokes codex-exec → Claude reviews
|
|
147
|
-
3. **Iteration**: Team messaging enables review-fix cycles between Claude and Codex outputs
|
|
148
|
-
|
|
149
|
-
```
|
|
150
|
-
Orchestrator delegates generation task
|
|
151
|
-
→ /codex-exec invoked explicitly
|
|
152
|
-
→ Output returned to orchestrator
|
|
153
|
-
→ Reviewer validates quality
|
|
154
|
-
→ Iterate if needed
|
|
155
|
-
```
|
|
156
|
-
|
|
157
|
-
## Research Workflow
|
|
158
|
-
|
|
159
|
-
When the orchestrator or intent-detection detects a research/information gathering request (routing_rule in agent-triggers.yaml):
|
|
160
|
-
|
|
161
|
-
1. **Check Codex availability**: Verify `codex` binary and `OPENAI_API_KEY`
|
|
162
|
-
2. **If available**: Execute with xhigh reasoning effort for thorough research
|
|
163
|
-
3. **If unavailable**: Fall back to Claude's WebFetch/WebSearch
|
|
164
|
-
|
|
165
|
-
### Research Command Pattern
|
|
166
|
-
|
|
167
|
-
```
|
|
168
|
-
/codex-exec "Research and analyze: {topic}. Provide structured findings with sources." --effort xhigh --full-auto --json
|
|
169
|
-
```
|
|
170
|
-
|
|
171
|
-
### Effort Level Guide
|
|
172
|
-
|
|
173
|
-
| Level | Use Case | Speed | Depth |
|
|
174
|
-
|-------|----------|-------|-------|
|
|
175
|
-
| minimal | Quick lookups | Fastest | Surface |
|
|
176
|
-
| low | Simple queries | Fast | Basic |
|
|
177
|
-
| medium | General tasks | Balanced | Standard |
|
|
178
|
-
| high | Complex analysis | Slower | Deep |
|
|
179
|
-
| xhigh | Research & investigation | Slowest | Maximum |
|
|
180
|
-
|
|
181
|
-
## Code Generation Workflow
|
|
182
|
-
|
|
183
|
-
When routing skills detect a code generation task and codex is available:
|
|
184
|
-
|
|
185
|
-
1. **Check availability**: Verify codex CLI via `/tmp/.claude-env-status-*`
|
|
186
|
-
2. **If available + new file creation**: Suggest hybrid workflow
|
|
187
|
-
3. **Hybrid pattern**:
|
|
188
|
-
- codex-exec generates initial code (fast, broad generation)
|
|
189
|
-
- Claude expert reviews for quality, patterns, best practices
|
|
190
|
-
- Iterate if needed
|
|
191
|
-
|
|
192
|
-
### Suitable Tasks
|
|
193
|
-
- New file scaffolding
|
|
194
|
-
- Boilerplate generation
|
|
195
|
-
- Test stub creation
|
|
196
|
-
- Documentation generation
|
|
197
|
-
|
|
198
|
-
### Unsuitable Tasks
|
|
199
|
-
- Modifying existing code (Claude expert better at understanding context)
|
|
200
|
-
- Architecture decisions (requires reasoning, not generation)
|
|
201
|
-
- Bug fixes (requires deep code understanding)
|
|
202
|
-
|
|
203
|
-
### Code Generation Command Pattern
|
|
204
|
-
```
|
|
205
|
-
/codex-exec "Generate {description} following {framework} best practices" --effort high --full-auto
|
|
206
|
-
```
|
|
207
|
-
|
|
208
|
-
## Browser Verify Workflow (Codex + claude-in-chrome 협업 루프)
|
|
209
|
-
|
|
210
|
-
Codex가 생성/수정한 프론트엔드 결과를 시각적으로 검증하는 루프 패턴. 신규 스킬 불요 — 기존 도구 조합.
|
|
211
|
-
|
|
212
|
-
### Pattern
|
|
213
|
-
|
|
214
|
-
```
|
|
215
|
-
codex-exec "build/fix frontend"
|
|
216
|
-
→ bun dev / npm run dev (로컬 서버 기동)
|
|
217
|
-
→ claude-in-chrome:navigate(localhost:port)
|
|
218
|
-
→ claude-in-chrome:gif_creator(action capture)
|
|
219
|
-
→ claude-in-chrome:read_console_messages (오류 감지)
|
|
220
|
-
→ claude-in-chrome:read_network_requests (실패 호출 감지)
|
|
221
|
-
→ 오류 있으면: codex-exec "fix: {error context}" → 루프 재진입
|
|
222
|
-
→ 오류 없으면: 종료 + 결과 보고
|
|
223
|
-
```
|
|
224
|
-
|
|
225
|
-
### When to Use
|
|
226
|
-
|
|
227
|
-
| 상황 | 권장 |
|
|
228
|
-
|------|------|
|
|
229
|
-
| 단순 코드 생성 | `codex-exec` 단독 |
|
|
230
|
-
| 프론트엔드 시각 검증 필요 | **이 루프** |
|
|
231
|
-
| API/백엔드 검증 | `deep-verify` skill |
|
|
232
|
-
| 복잡한 디자인 시스템 | `design-shotgun` 병행 |
|
|
233
|
-
|
|
234
|
-
### Loop Termination Rules
|
|
235
|
-
|
|
236
|
-
- 최대 반복 3회 (degeneration 방지, R013/agora 패턴 차용)
|
|
237
|
-
- console error 0개 + network failure 0개 → 종료
|
|
238
|
-
- 동일 오류 반복 시 즉시 종료 + 사용자 보고
|
|
239
|
-
|
|
240
|
-
### Tool Composition
|
|
241
|
-
|
|
242
|
-
| 단계 | 도구 |
|
|
243
|
-
|------|------|
|
|
244
|
-
| Build/Fix | `codex-exec` |
|
|
245
|
-
| Server | `Bash(bun dev)` (background) |
|
|
246
|
-
| Visual | `mcp__claude-in-chrome__navigate` + `gif_creator` |
|
|
247
|
-
| Diagnose | `read_console_messages` + `read_network_requests` |
|
|
248
|
-
|
|
249
|
-
자세한 구현 패턴: `guides/browser-automation/01-browser-automation-patterns.md` 참조.
|
|
250
|
-
|
|
251
|
-
> **Tool**: Use the **Write tool** for any artifact files this loop produces — never Bash mkdir on `.claude/outputs/`.
|
|
252
|
-
|
|
253
|
-
### Tool: Writing artifacts under .claude/outputs/
|
|
254
|
-
|
|
255
|
-
Direct Write to `.claude/outputs/codex/sessions/{date}/codex-{HHmmss}.{ext}` is permitted under `mode: "bypassPermissions"` (CC v2.1.121+, #1101). Write tool auto-creates parent directory. For CC < v2.1.121, see git history for the legacy /tmp bypass pattern.
|
|
256
|
-
|
|
257
|
-
### Attribution
|
|
258
|
-
|
|
259
|
-
Pattern source: Codex Browser Use (https://x.com/jameszmsun/status/2047522852854026378), scout #1009.
|