oh-my-customcodex 0.5.8 → 0.5.10
- package/README.md +4 -4
- package/dist/cli/index.js +1 -1
- package/dist/index.js +1 -1
- package/package.json +1 -1
- package/templates/.claude/hooks/scripts/agent-teams-advisor.sh +4 -1
- package/templates/.claude/rules/MUST-agent-teams.md +85 -246
- package/templates/.claude/rules/MUST-completion-verification.md +12 -0
- package/templates/.claude/rules/MUST-orchestrator-coordination.md +20 -0
- package/templates/.claude/rules/MUST-permissions.md +6 -0
- package/templates/.claude/rules/MUST-safety.md +7 -0
- package/templates/.claude/rules/SHOULD-interaction.md +11 -0
- package/templates/.claude/skills/de-lead-routing/SKILL.md +6 -13
- package/templates/.claude/skills/dev-lead-routing/SKILL.md +6 -13
- package/templates/.claude/skills/intent-detection/SKILL.md +7 -9
- package/templates/.claude/skills/research/SKILL.md +8 -23
- package/templates/.claude/skills/roundtable-debate/SKILL.md +3 -4
- package/templates/.claude/skills/skill-extractor/SKILL.md +165 -73
- package/templates/.claude/skills/structured-dev-cycle/SKILL.md +7 -10
- package/templates/AGENTS.md.en +1 -2
- package/templates/AGENTS.md.ko +1 -2
- package/templates/CLAUDE.md +2 -2
- package/templates/CLAUDE.md.en +1 -2
- package/templates/CLAUDE.md.ko +1 -2
- package/templates/README.md +2 -2
- package/templates/guides/claude-code/15-version-compatibility.md +24 -0
- package/templates/guides/multi-agent-debate-patterns/README.md +1 -1
- package/templates/guides/multi-provider-exec/README.md +9 -79
- package/templates/manifest.json +3 -3
- package/templates/.claude/skills/agora/SKILL.md +0 -209
- package/templates/.claude/skills/codex-exec/SKILL.md +0 -218
- package/templates/.claude/skills/codex-exec/scripts/codex-wrapper.cjs +0 -433
- package/templates/.claude/skills/gemini-exec/SKILL.md +0 -215
- package/templates/.claude/skills/gemini-exec/scripts/gemini-wrapper.cjs +0 -485
@@ -2,6 +2,30 @@
 
 This guide records Claude Code release-note impact that affects the Claude compatibility template. The Codex-native runtime still uses `.codex/**` and OMX as the primary surface.
 
+## v2.1.158
+
+Published: 2026-05-30.
+
+Source: upstream oh-my-customcode #1264, Codex port #1436.
+
+| Change | Impact on oh-my-customcodex | Action |
+|--------|------------------------------|--------|
+| Auto mode is available on Bedrock, Vertex, and Foundry for Opus 4.7 and Opus 4.8 via `CLAUDE_CODE_ENABLE_AUTO_MODE=1` | Claude compatibility sessions can opt into provider-backed auto mode for those Opus surfaces. Codex-native model routing and approval policy are unchanged. | Document as Claude provider compatibility only. Do not infer Codex auto-mode behavior from this env var. |
+
+## v2.1.157
+
+Published: 2026-05-29.
+
+Source: upstream oh-my-customcode #1265, Codex port #1437.
+
+| Change | Impact on oh-my-customcodex | Action |
+|--------|------------------------------|--------|
+| Plugins under `.claude/skills` auto-load, `claude plugin init <name>` scaffolds plugins there, and `/plugin` has argument autocomplete | Useful for Claude compatibility plugin setup. Codex-native skills remain under `.codex/skills`, `.agents/skills`, and OMX skill roots. | Keep `.codex/OMX` primary. Mention `.claude/skills` only when documenting Claude plugin compatibility. |
+| `claude agents` honors the `agent` field in `settings.json`, with `--agent <name>` as an override | Claude dispatched sessions can inherit a configured default agent unless explicitly overridden. | Do not mirror this into Codex routing. Codex native subagents still follow prompt routing, role metadata, and explicit delegation. |
+| `EnterWorktree` can switch between Claude-managed worktrees mid-session, and Claude-managed worktrees are left unlocked for `git worktree remove`/`prune` cleanup | Claude worktree lifecycle is more flexible and less likely to leave locked cleanup blockers. | Keep auto-dev work in clean worktrees, verify `git status`, and do not treat Claude-managed worktree state as OMX state. |
+| `tool_decision` telemetry can include `tool_parameters` such as Bash commands and MCP/skill names when `OTEL_LOG_TOOL_DETAILS=1` | Telemetry may contain more detailed operational data, including command and tool-parameter strings. | Treat logs as potentially sensitive. Avoid exporting transcripts or telemetry that may expose secrets, credentials, or privileged commands. |
+| Background/session fixes cover parked subagents, leaked background shells, orphaned `.claude/worktrees`, resume state, date after sleep/wake, fullscreen picker cleanup, current linked worktree return, image placeholders, network prompts, and tmux clipboard behavior | Reduces false blockers and stale-state surprises in Claude compatibility sessions. | No Codex runtime change. Continue using OMX state, Codex worktree checks, and direct command evidence for Codex-native completion claims. |
+
 ## v2.1.156
 
 Published: 2026-05-29.
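The v2.1.158 and v2.1.157 rows in the hunk above name two environment variables, `CLAUDE_CODE_ENABLE_AUTO_MODE` and `OTEL_LOG_TOOL_DETAILS`. As a rough illustration of how a session pre-check might surface their implications, here is a minimal sketch. The function name `describeClaudeCompatEnv` and the wording of the notes are hypothetical and are not part of the package; only the two variable names and the value `1` come from the guide.

```javascript
// Hypothetical helper (not shipped by oh-my-customcodex): reports which of the
// two environment flags named in the compatibility guide are switched on.
function describeClaudeCompatEnv(env) {
  const notes = [];
  if (env.CLAUDE_CODE_ENABLE_AUTO_MODE === "1") {
    // Per the guide: Claude provider compatibility only, no Codex implication.
    notes.push("auto mode opt-in is set (Claude provider compatibility only; Codex approval policy is unchanged)");
  }
  if (env.OTEL_LOG_TOOL_DETAILS === "1") {
    // Per the guide: telemetry may now carry command and tool-parameter strings.
    notes.push("tool_decision telemetry may include tool_parameters; treat logs as sensitive");
  }
  return notes;
}

// Example: inspect the current process environment.
console.log(describeClaudeCompatEnv(process.env));
```

The helper takes the environment as a plain object rather than reading `process.env` directly, so it can be exercised with fixed inputs.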
@@ -4,7 +4,7 @@
 
 | Pattern | Goal | Use When |
 |---------|------|----------|
-| `
+| `roundtable-debate` | Preserve dissent while reaching a bounded decision | Release gates, design approval, high-risk specs with minority risks |
 | `roundtable-debate` | Preserve dissent | Strategy choices, tradeoffs, ambiguous product or architecture decisions |
 
 ## Failure Modes
@@ -1,83 +1,13 @@
-#
+# External Interop Guidance
 
-
+The packaged multi-provider exec skills have been retired. For Codex interoperability, use the official Claude Code plugin `openai/codex-plugin-cc` only when it is explicitly installed and requested.
 
-
+## Current Paths
 
-
+| Need | Preferred path | Notes |
+|------|----------------|-------|
+| Codex interop | `openai/codex-plugin-cc` | Official plugin path; opt-in only. |
+| Token-optimized local command output | `rtk-exec` | Existing RTK proxy remains supported. |
+| Research or independent review | `researcher`, expert agents, or `roundtable-debate` | Prefer in-repo agent workflows unless plugin interop is requested. |
 
-
-
-| Provider | Skill | CLI Dependency | Model | Strengths |
-|----------|-------|---------------|-------|-----------|
-| OpenAI (Codex) | `codex-exec` | `codex` CLI | GPT-5.4 | Code generation, broad knowledge |
-| Google (Gemini) | `gemini-exec` | `gemini` CLI | Gemini 2.5 Pro | Long context, multimodal |
-| RTK (proxy) | `rtk-exec` | `rtk` CLI | Configurable | Token-optimized output, cost reduction |
-
-## Availability Detection
-
-The `session-env-check.sh` hook (SessionStart) auto-detects available providers:
-
-```
-[SessionStart] Checking external CLI availability...
-codex: ✓ available
-gemini: ✗ not found
-rtk: ✓ available
-```
-
-Providers are opt-in; missing CLIs are silently skipped.
-
-## Usage Patterns
-
-### Direct Invocation
-
-```
-/codex-exec "Review this function for security issues"
-/gemini-exec "Analyze this architecture diagram"
-/rtk-exec "List files matching pattern X"
-```
-
-### Provider Selection Guide
-
-| Task | Recommended Provider | Rationale |
-|------|---------------------|-----------|
-| Second opinion on code review | codex-exec | Independent model reduces confirmation bias |
-| Long document analysis | gemini-exec | 1M+ context window |
-| Token-heavy batch operations | rtk-exec | Compressed output reduces context cost |
-| Security audit cross-check | codex-exec | Different training data catches different patterns |
-| Multi-model verification | All three | `/multi-model-verification` skill orchestrates this |
-
-### Integration with Existing Skills
-
-| Skill | Uses Provider | How |
-|-------|--------------|-----|
-| `multi-model-verification` | codex-exec + gemini-exec | Parallel verification with severity classification |
-| `reasoning-sandwich` | Any exec skill | Pre/post reasoning with different models |
-| `model-escalation` | Claude models only | Internal escalation (haiku→sonnet→opus), not cross-provider |
-
-## Relationship to Multi-Model Routing
-
-| Aspect | Multi-Model Routing | Multi-Provider Exec |
-|--------|--------------------|--------------------|
-| Scope | Claude model selection | Cross-provider execution |
-| Models | haiku / sonnet / opus | GPT-5.4 / Gemini 2.5 / RTK proxy |
-| Mechanism | `model` frontmatter field | Exec skill invocation |
-| Use case | Cost/quality optimization within Claude | Independent verification, specialized tasks |
-| Guide | `guides/multi-model-routing/` | `guides/multi-provider-exec/` |
-
-## Configuration
-
-No global configuration required. Each exec skill reads its own CLI configuration:
-
-| Skill | Config Source |
-|-------|-------------|
-| codex-exec | `~/.codex/config` or CODEX_API_KEY env |
-| gemini-exec | `~/.gemini/config` or GEMINI_API_KEY env |
-| rtk-exec | RTK proxy running on localhost |
-
-## Limitations
-
-- Provider availability depends on user's CLI installations
-- Cross-provider results are advisory; Claude remains the primary execution engine
-- No automatic fallback between providers (by design; explicit selection preferred)
-- Rate limits and costs are provider-specific and not tracked by oh-my-customcodex
+Do not auto-delegate to retired provider wrapper skills. Keep expert agents responsible for reviewing any plugin-assisted output.
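The "Current Paths" table added in the hunk above is effectively a small routing rule: Codex interop goes through `openai/codex-plugin-cc` only when the plugin is both installed and requested, and otherwise in-repo agent workflows are preferred. A minimal sketch of that rule follows. The function `choosePath`, the need keys, and the option names `pluginInstalled` / `pluginRequested` are hypothetical labels chosen for illustration; the package does not define them.

```javascript
// Hypothetical lookup mirroring the "Current Paths" table. The keys are
// illustrative labels; the values are the paths named in the guide.
const PREFERRED_PATHS = {
  "codex-interop": "openai/codex-plugin-cc",
  "local-command-output": "rtk-exec",
  "research-or-review": "researcher, expert agents, or roundtable-debate",
};

function choosePath(need, opts = {}) {
  const { pluginInstalled = false, pluginRequested = false } = opts;
  if (!Object.prototype.hasOwnProperty.call(PREFERRED_PATHS, need)) {
    throw new Error(`unknown need: ${need}`);
  }
  if (need === "codex-interop" && !(pluginInstalled && pluginRequested)) {
    // Plugin interop is opt-in only, so fall back to in-repo agent workflows.
    return PREFERRED_PATHS["research-or-review"];
  }
  return PREFERRED_PATHS[need];
}

console.log(choosePath("codex-interop", { pluginInstalled: true, pluginRequested: true }));
console.log(choosePath("codex-interop", { pluginInstalled: true }));
```

Requiring both flags reflects the guide's wording ("explicitly installed and requested"). A retired wrapper skill such as `codex-exec` is never returned.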
package/templates/manifest.json
CHANGED
@@ -1,11 +1,11 @@
 {
-"version": "0.5.
+"version": "0.5.10",
 "requiresCC": ">=2.1.121",
 "claudeCode": {
 "minimumVersion": "2.1.121",
 "protectedPathBypassVersion": "2.1.126"
 },
-"lastUpdated": "2026-
+"lastUpdated": "2026-06-01T00:00:00.000Z",
 "components": [
 {
 "name": "rules",
@@ -23,7 +23,7 @@
 "name": "skills",
 "path": ".agents/skills",
 "description": "Reusable skill modules (project-scoped repo skills)",
-"files":
+"files": 120
 },
 {
 "name": "guides",
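The manifest hunk above carries two Claude Code version thresholds, `minimumVersion` (2.1.121) and `protectedPathBypassVersion` (2.1.126). One plausible way to evaluate a detected Claude Code version against them is a plain numeric comparison of the dotted parts, sketched below. The helper names `atLeast` and `claudeCodeSupport` are hypothetical, and the sketch assumes simple `major.minor.patch` strings with no pre-release suffixes; it is not how the package itself is known to perform the check.

```javascript
// Hypothetical helpers; assumes plain "major.minor.patch" version strings.
function parseVersion(version) {
  return version.split(".").map((part) => Number.parseInt(part, 10));
}

// True when `version` is greater than or equal to `minimum`.
function atLeast(version, minimum) {
  const a = parseVersion(version);
  const b = parseVersion(minimum);
  for (let i = 0; i < 3; i += 1) {
    const x = a[i] || 0;
    const y = b[i] || 0;
    if (x !== y) return x > y;
  }
  return true;
}

// Threshold values copied from the "claudeCode" object in manifest.json.
const manifestClaudeCode = {
  minimumVersion: "2.1.121",
  protectedPathBypassVersion: "2.1.126",
};

function claudeCodeSupport(version, thresholds = manifestClaudeCode) {
  return {
    supported: atLeast(version, thresholds.minimumVersion),
    protectedPathBypass: atLeast(version, thresholds.protectedPathBypassVersion),
  };
}

console.log(claudeCodeSupport("2.1.121"));
console.log(claudeCodeSupport("2.1.126"));
```

Comparing the parts as integers matters here: a string comparison would rank `2.1.99` above `2.1.121`.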
@@ -1,209 +0,0 @@
----
-name: omcustomcodex:agora
-description: "Multi-LLM adversarial consensus loop: 3+ LLMs compete to find flaws in designs/specs until unanimous agreement is reached"
-user-invocable: true
-argument-hint: "<document-path> [--rounds N] [--severity-threshold HIGH]"
-effort: max
-scope: core
-version: 1.0.0
-source:
-type: external
-origin: github
-url: https://github.com/baekenough/baekenough-skills
-version: 1.0.0
----
-
-# Agora: Multi-LLM Adversarial Consensus
-
-An adversarial cross-verification skill in which three or more LLMs (Claude, Codex/GPT, Gemini) compete to find flaws in designs and documents, iterating until unanimous consensus is reached.
-
-## Prerequisites
-
-- `codex-exec` skill (invokes Codex/GPT)
-- `gemini-exec` skill (invokes Gemini)
-- Agent Teams enabled (`OMCODEX_AGENT_TEAMS=1`) or Agent tool available
-
-## Sensitive-Path Delegation
-
-Sensitive-path compatibility note: if this skill delegates work that touches `.claude/**`, `.claude/outputs/**`, `templates/.claude/**`, or read-only measurements of those paths, keep `.codex/**` edits on the normal Codex path. On Claude Code v2.1.121+ with `bypassPermissions`, direct writes to `.claude/skills/`, `.claude/agents/`, and `.claude/commands/` are allowed; on v2.1.126+ that extends to broader protected paths. Only use `/tmp/{skill}-{timestamp}.md` as a legacy fallback when the target runtime is older or still prompts.
-
-## Usage
-
-```
-/omcustomcodex:agora docs/design.md # Default: 3 LLMs, unlimited rounds
-/omcustomcodex:agora docs/design.md --rounds 10 # Max 10 rounds
-/omcustomcodex:agora docs/design.md --severity-threshold HIGH # Exit when no HIGH+ findings
-/omcustomcodex:agora docs/design.md --models claude,codex # 2 LLMs only
-```
-
-## Workflow
-
-### Phase 1: Setup
-1. Read the target document
-2. Create Agent Team: `TeamCreate("agora-review")`
-3. Create review tasks per focus area
-
-### Phase 2: Spawn Reviewers (parallel)
-Spawn 3 reviewers as Agent Team members:
-
-```
-
-### Anti-Groupthink Mode
-
-Use `--anti-groupthink` when consensus itself is a risk:
-
-1. Reviewers submit independent findings before seeing peer output.
-2. One reviewer is assigned as devil's advocate.
-3. Minority findings are preserved unless the synthesis explicitly rejects them with evidence.
-4. Debate is capped at two challenge rounds before the lead either decides or requests more facts.
-
-For decisions where dissent preservation is the main goal, use `roundtable-debate` directly instead of `agora`.
-Agent(name: "claude-critic", model: opus, effort: max)
-→ 20-point deep adversarial review
-
-Agent(name: "codex-critic", model: opus)
-→ Invoke Skill(codex-exec) for GPT perspective + independent Claude analysis
-
-Agent(name: "gemini-critic", model: opus)
-→ Invoke Skill(gemini-exec) for Gemini perspective + independent Claude analysis
-```
-
-### Phase 3: Independent Review
-Each reviewer performs adversarial review with this template:
-
-```
-For EACH review point:
-### Round N: [Topic]
-**Severity**: CRITICAL / HIGH / MEDIUM / LOW
-**Flaw**: [Specific, concrete problem description]
-**Evidence**: [Why this is real, not theoretical]
-**Impact**: [What happens if not addressed]
-**Counter-argument**: [Best case FOR the current design]
-**Verdict**: KEEP / MODIFY / REJECT
-```
-
-Review areas (adapt to document type):
-- Architecture fundamentals
-- Component/service design
-- Data architecture
-- Security & resilience
-- Feasibility & deployment
-- Testing strategy
-- Operational complexity
-
-### Phase 4: Cross-Review (Peer-to-Peer)
-Each reviewer sends findings to the other two via `SendMessage`.
-
-Counter-review template:
-1. Which findings do you **AGREE** with? (and why)
-2. Which findings do you **DISAGREE** with? (evidence-based rebuttal)
-3. What did they **MISS** that you caught?
-4. What did they catch that you **MISSED**?
-5. **SEVERITY** adjustments: upgrade or downgrade with justification
-
-### Phase 5: Synthesis
-Team lead aggregates all findings:
-
-```
-UNANIMOUS CRITICAL: [findings all 3 agreed on]
-STRONG AGREEMENT: [findings 2/3 agreed on]
-SPLIT DECISIONS: [findings with disagreement + resolution]
-```
-
-Determine verdict:
-- **BUILD**: No CRITICAL, no unresolved HIGH
-- **BUILD WITH CHANGES**: No CRITICAL, HIGH findings have accepted mitigations
-- **REDESIGN**: Any unresolved CRITICAL findings
-- **ABANDON**: Fundamental concept is flawed
-
-### Phase 6: Loop (if REDESIGN)
-1. Team lead produces/delegates redesign addressing ALL critical findings
-2. New version sent to ALL reviewers: `SendMessage(to: "*")`
-3. Reviewers re-review → GOTO Phase 4
-4. Repeat until EXIT criteria met
-
-### Phase 7: Exit (consensus reached)
-When ALL reviewers agree BUILD or BUILD WITH CHANGES:
-1. Produce final consensus report
-2. Write to `.codex/outputs/sessions/{date}/agora-{topic}-{time}.md`
-3. Shut down team: `SendMessage(to: "*", message: {type: "shutdown_request"})`
-
-## Reviewer Principles
-
-1. **NEUTRAL**: no reviewer has home team advantage
-2. **COMPETITIVE**: find flaws others missed
-3. **CRITICAL**: "fewer than 5 CRITICAL flaws = not looking hard enough"
-4. **EVIDENCE-BASED**: every finding cites specific evidence
-5. **CONSTRUCTIVE**: every flaw includes recommended fix
-6. **CONVERGENT**: goal is consensus, not endless disagreement
-
-## Consensus Criteria
-
-| Condition | Required |
-|-----------|----------|
-| CRITICAL findings resolved | ALL |
-| HIGH findings resolved or accepted | ALL |
-| All reviewers rate BUILD or BUILD WITH CHANGES | YES |
-| Cross-review disagreements resolved | ALL |
-
-## Output Format
-
-```markdown
-# Agora Consensus Report
-
-## Document: [path]
-## Rounds: [N]
-## Reviewers: [list with LLM models used]
-
-## Verdict: [BUILD / BUILD WITH CHANGES / REDESIGN]
-
-## Unanimous Findings
-| # | Finding | Severity | All 3 Agree |
-|---|---------|----------|-------------|
-
-## Required Changes Before Build
-1. [change with source reviewer]
-2. ...
-
-## Accepted Risks
-- [finding accepted with justification]
-
-## Unique Contributions Per Reviewer
-| Reviewer | Findings Others Missed |
-|----------|----------------------|
-
-## Process Metrics
-- Rounds: N
-- Total findings: N
-- Cross-adopted: N
-- Severity upgrades: N
-- Severity downgrades: N
-- Disagreements raised: N
-- Disagreements resolved: N/N
-```
-
-## Configuration
-
-```yaml
-# Default settings
-agora:
-max_rounds: unlimited # Set --rounds to limit
-severity_threshold: HIGH # EXIT when no findings >= threshold
-models:
-- claude (opus, max effort)
-- codex (via codex-exec skill)
-- gemini (via gemini-exec skill)
-review_points: 20 # Per reviewer
-cross_review: true # Peer-to-peer sharing
-auto_redesign: true # Auto-produce redesign on REDESIGN verdict
-```
-
-## Anti-Patterns
-
-| Anti-Pattern | Why Wrong | Correct |
-|-------------|-----------|---------|
-| Single LLM review | Misses blind spots | 3+ LLMs find complementary flaws |
-| No cross-review | Reviewers don't challenge each other | Peer-to-peer sharing surfaces disagreements |
-| Accepting first BUILD | May miss edge cases | Loop until ALL agree |
-| Ignoring split decisions | Unresolved disagreements fester | Resolve every split with evidence |
-| Push for consensus too fast | Premature agreement | Let reviewers challenge freely |
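The verdict rules in Phase 5 and the exit condition in Phase 7 of the removed `agora` skill can be read as a small decision function. The sketch below is one possible reading, not code from the package. The finding shape (`severity`, `status`) and the status values `open`, `resolved`, and `mitigated` are assumptions made for illustration. The skill text does not say what happens with an unresolved HIGH finding, so this sketch assumes it blocks the build in the same way as an unresolved CRITICAL. ABANDON is left out because the skill describes it as a judgment about the concept rather than a rule over findings.

```javascript
// Hypothetical data shape: { severity: "CRITICAL" | "HIGH" | "MEDIUM" | "LOW",
//                            status: "open" | "resolved" | "mitigated" }
function determineVerdict(findings) {
  const hasOpen = (severity) =>
    findings.some((f) => f.severity === severity && f.status === "open");

  if (hasOpen("CRITICAL")) return "REDESIGN";
  // Assumption: an unresolved HIGH finding also blocks the build.
  if (hasOpen("HIGH")) return "REDESIGN";

  const hasMitigatedHigh = findings.some(
    (f) => f.severity === "HIGH" && f.status === "mitigated"
  );
  return hasMitigatedHigh ? "BUILD WITH CHANGES" : "BUILD";
}

// Phase 7 exit condition: every reviewer rates BUILD or BUILD WITH CHANGES.
function consensusReached(verdicts) {
  return (
    verdicts.length > 0 &&
    verdicts.every((v) => v === "BUILD" || v === "BUILD WITH CHANGES")
  );
}

console.log(determineVerdict([{ severity: "HIGH", status: "mitigated" }]));
console.log(consensusReached(["BUILD", "BUILD WITH CHANGES", "BUILD"]));
```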
@@ -1,218 +0,0 @@
----
-name: codex-exec
-description: Execute OpenAI Codex CLI prompts and return results
-scope: core
-argument-hint: "<prompt> [--json] [--output <path>] [--model <name>] [--timeout <ms>] [--effort <level>]"
-user-invocable: true
----
-
-# Codex Exec Skill
-
-Execute OpenAI Codex CLI prompts in non-interactive mode and return structured results. Enables Claude + Codex hybrid workflows.
-
-## Options
-
-```
-<prompt> Required. The prompt to send to Codex CLI
---json Return structured JSON Lines output
---output <path> Save final message to file
---model <name> Model override (default: Codex CLI default model)
---timeout <ms> Execution timeout (default: 120000, max: 600000)
---full-auto Enable auto-approval mode (codex -a full-auto)
---working-dir Working directory for Codex execution
---effort <level> Set reasoning effort level (minimal, low, medium, high, xhigh)
-Maps to Codex CLI's model_reasoning_effort config
-Default: uses Codex CLI's configured default
-Recommended: xhigh for research/analysis tasks
-```
-
-## Workflow
-
-```
-1. Pre-checks
-- Verify `codex` binary is installed (which codex || npx codex --version)
-- Verify authentication (`OPENAI_API_KEY`, `CODEX_API_KEY`, or stored `codex login` / ChatGPT login)
-2. Build command
-- Base: codex exec --ephemeral "<prompt>"
-- Apply options: --json, --model, --full-auto, -C <dir>
-- Set --working-dir if specified
-3. Execute
-- Run via Bash tool with timeout (default 2min, max 10min)
-- Or use helper script: node .codex/skills/codex-exec/scripts/codex-wrapper.cjs
-4. Parse output
-- Text mode: return raw stdout
-- JSON mode: parse JSON Lines, extract final assistant message
-5. Report results
-- Format output with execution metadata
-```
-
-## Safety Defaults
-
-- `--ephemeral`: No session persistence (conversations not saved)
-- Default mode: Normal approval (Codex prompts for confirmation)
-- Override with `--full-auto` only when explicitly requested
-
-## Output Format
-
-### Success (Text Mode)
-```
-[Codex Exec] Completed
-
-Model: (default)
-Duration: 23.4s
-Working Dir: /path/to/project
-
---- Output ---
-{codex response text}
-```
-
-### Success (JSON Mode)
-```
-[Codex Exec] Completed (JSON)
-
-Model: (default)
-Duration: 23.4s
-Events: 12
-
---- Final Message ---
-{extracted final assistant message}
-```
-
-### Failure
-```
-[Codex Exec] Failed
-
-Error: {error_message}
-Exit Code: {code}
-Suggested Fix: {suggestion}
-```
-
-## Helper Script
-
-For complex executions, use the wrapper script:
-```bash
-node .codex/skills/codex-exec/scripts/codex-wrapper.cjs --prompt "your prompt" [options]
-```
-
-The wrapper provides:
-- Environment validation (binary + auth checks)
-- Safe command construction
-- JSON Lines parsing with event extraction
-- Structured JSON output
-- Timeout handling with graceful termination
-
-## Examples
-
-```bash
-# Simple text prompt
-codex-exec "explain what this project does"
-
-# JSON output with model override
-codex-exec "list all TODO items" --json
-
-# Save output to file
-codex-exec "generate a README" --output ./README.md
-
-# Full auto mode with custom timeout
-codex-exec "fix the failing tests" --full-auto --timeout 300000
-
-# Specify working directory
-codex-exec "analyze the codebase" --working-dir /path/to/project
-```
-
-## Integration
-
-Works with the orchestrator pattern:
-- Main conversation delegates Codex execution via this skill
-- Results are returned to the main conversation for further processing
-- Can be chained with other skills (e.g., dev-review after Codex generates code)
-
-## Availability Check
-
-codex-exec requires the Codex CLI binary to be installed and authenticated. The skill is only usable when:
-
-1. `codex` binary is found in PATH (`which codex` succeeds)
-2. Authentication is valid (`OPENAI_API_KEY`, `CODEX_API_KEY`, or stored auth from `codex login --api-key` / ChatGPT login)
-
-If either check fails, this skill cannot be used. Fall back to Claude agents for the task.
-
-> **Note**: This skill is invoked via `/codex-exec` command, delegated by the orchestrator, or suggested by routing skills when codex is available. The intent-detection system can trigger it for research (xhigh) and code generation (hybrid) workflows.
-
-## Agent Teams Integration
-
-When used within Agent Teams (requires explicit invocation):
-
-1. **As delegated task**: orchestrator explicitly delegates codex-exec for code generation
-2. **Hybrid workflow**: Claude team member analyzes → orchestrator invokes codex-exec → Claude reviews
-3. **Iteration**: Team messaging enables review-fix cycles between Claude and Codex outputs
-
-```
-Orchestrator delegates generation task
-→ /codex-exec invoked explicitly
-→ Output returned to orchestrator
-→ Reviewer validates quality
-→ Iterate if needed
-```
-
-## Research Workflow
-
-When the orchestrator or intent-detection detects a research/information gathering request (routing_rule in agent-triggers.yaml):
-
-1. **Check Codex availability**: Verify `codex` binary plus `OPENAI_API_KEY`, `CODEX_API_KEY`, or stored `codex login` auth
-2. **If available**: Execute with xhigh reasoning effort for thorough research
-3. **If unavailable**: Fall back to Claude's WebFetch/WebSearch
-
-### Research Command Pattern
-
-```
-/codex-exec "Research and analyze: {topic}. Provide structured findings with sources." --effort xhigh --full-auto --json
-```
-
-### Effort Level Guide
-
-| Level | Use Case | Speed | Depth |
-|-------|----------|-------|-------|
-| minimal | Quick lookups | Fastest | Surface |
-| low | Simple queries | Fast | Basic |
-| medium | General tasks | Balanced | Standard |
-| high | Complex analysis | Slower | Deep |
-| xhigh | Research & investigation | Slowest | Maximum |
-
-## Code Generation Workflow
-
-When routing skills detect a code generation task and codex is available:
-
-1. **Check availability**: Verify the codex CLI directly (`command -v codex`) or via current session diagnostics
-2. **If available + new file creation**: Suggest hybrid workflow
-3. **Hybrid pattern**:
-- codex-exec generates initial code (fast, broad generation)
-- Claude expert reviews for quality, patterns, best practices
-- Iterate if needed
-
-### Suitable Tasks
-- New file scaffolding
-- Boilerplate generation
-- Test stub creation
-- Documentation generation
-
-### Unsuitable Tasks
-- Modifying existing code (Claude expert better at understanding context)
-- Architecture decisions (requires reasoning, not generation)
-- Bug fixes (requires deep code understanding)
-
-### Code Generation Command Pattern
-```
-/codex-exec "Generate {description} following {framework} best practices" --effort high --full-auto
-```
-
-## Browser Verify Workflow
-
-For frontend or browser-visible changes, use a Build + Vision + Verify loop instead of stopping at a successful build:
-
-1. Build or start the local dev server.
-2. Open the target in the available browser automation surface.
-3. Capture screenshot evidence and console/network errors.
-4. If the visual state or console is wrong, run `codex-exec` with the concrete evidence and repeat.
-5. Stop only when build, browser render, and error checks all pass.
-
-This pattern composes with the Codex App Browser Use plugin or any local browser MCP. Keep the loop evidence-driven: screenshot, console output, network status, and the exact command that produced the build.
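The "Build command" step of the removed `codex-exec` skill describes a base invocation (`codex exec --ephemeral "<prompt>"`), a handful of optional flags, and a timeout with a 120000 ms default and a 600000 ms ceiling. The sketch below shows how such an argument list might be assembled. It does not reproduce the deleted `codex-wrapper.cjs`, whose contents are not shown in this diff. The function names and option keys are hypothetical, the flags are taken only from the removed SKILL.md and have not been checked against any particular Codex CLI release, and placing the prompt last is a choice made here.

```javascript
// Timeout bounds as stated in the removed SKILL.md options table.
const DEFAULT_TIMEOUT_MS = 120000;
const MAX_TIMEOUT_MS = 600000;

// Hypothetical helper: fall back to the default for unusable values and cap
// anything above the documented maximum.
function clampTimeout(ms) {
  if (!Number.isFinite(ms) || ms <= 0) return DEFAULT_TIMEOUT_MS;
  return Math.min(ms, MAX_TIMEOUT_MS);
}

// Hypothetical helper: returns an argv array (not a shell string), so the
// prompt never needs shell quoting when passed to a process-spawning API.
function buildCodexArgs(prompt, opts = {}) {
  if (typeof prompt !== "string" || prompt.trim() === "") {
    throw new Error("prompt is required");
  }
  const args = ["exec", "--ephemeral"];
  if (opts.json) args.push("--json");
  if (opts.model) args.push("--model", opts.model);
  if (opts.fullAuto) args.push("--full-auto"); // only when explicitly requested
  if (opts.workingDir) args.push("-C", opts.workingDir);
  args.push(prompt);
  return args;
}

console.log(buildCodexArgs("list all TODO items", { json: true }));
console.log(clampTimeout(900000));
```

Leaving `fullAuto` off unless the caller sets it matches the skill's stated safety default of normal approval.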