@wazir-dev/cli 1.0.0 → 1.2.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (163)
  1. package/CHANGELOG.md +100 -2
  2. package/README.md +6 -6
  3. package/docs/concepts/architecture.md +1 -1
  4. package/docs/concepts/roles-and-workflows.md +2 -0
  5. package/docs/concepts/why-wazir.md +59 -0
  6. package/docs/decisions/2026-03-19-deferred-items.md +564 -0
  7. package/docs/decisions/2026-03-19-enhancement-decisions.md +300 -0
  8. package/docs/plans/2026-03-15-cli-pipeline-integration-plan.md +1 -1
  9. package/docs/readmes/INDEX.md +21 -5
  10. package/docs/readmes/features/expertise/README.md +2 -2
  11. package/docs/readmes/features/exports/README.md +2 -2
  12. package/docs/readmes/features/schemas/README.md +3 -0
  13. package/docs/readmes/features/skills/README.md +17 -0
  14. package/docs/readmes/features/skills/clarifier.md +5 -0
  15. package/docs/readmes/features/skills/claude-cli.md +5 -0
  16. package/docs/readmes/features/skills/codex-cli.md +5 -0
  17. package/docs/readmes/features/skills/dispatching-parallel-agents.md +5 -0
  18. package/docs/readmes/features/skills/executing-plans.md +5 -0
  19. package/docs/readmes/features/skills/executor.md +5 -0
  20. package/docs/readmes/features/skills/finishing-a-development-branch.md +5 -0
  21. package/docs/readmes/features/skills/gemini-cli.md +5 -0
  22. package/docs/readmes/features/skills/humanize.md +5 -0
  23. package/docs/readmes/features/skills/init-pipeline.md +5 -0
  24. package/docs/readmes/features/skills/receiving-code-review.md +5 -0
  25. package/docs/readmes/features/skills/requesting-code-review.md +5 -0
  26. package/docs/readmes/features/skills/reviewer.md +5 -0
  27. package/docs/readmes/features/skills/subagent-driven-development.md +5 -0
  28. package/docs/readmes/features/skills/using-git-worktrees.md +5 -0
  29. package/docs/readmes/features/skills/wazir.md +5 -0
  30. package/docs/readmes/features/skills/writing-skills.md +5 -0
  31. package/docs/readmes/features/workflows/prepare-next.md +1 -1
  32. package/docs/reference/configuration-reference.md +47 -6
  33. package/docs/reference/launch-checklist.md +4 -4
  34. package/docs/reference/review-loop-pattern.md +538 -0
  35. package/docs/reference/roles-reference.md +1 -0
  36. package/docs/reference/skill-tiers.md +147 -0
  37. package/docs/reference/tooling-cli.md +5 -1
  38. package/docs/truth-claims.yaml +18 -0
  39. package/expertise/antipatterns/process/ai-coding-antipatterns.md +97 -1
  40. package/exports/hosts/claude/.claude/agents/clarifier.md +3 -0
  41. package/exports/hosts/claude/.claude/agents/designer.md +3 -0
  42. package/exports/hosts/claude/.claude/agents/executor.md +2 -0
  43. package/exports/hosts/claude/.claude/agents/planner.md +3 -0
  44. package/exports/hosts/claude/.claude/agents/researcher.md +2 -0
  45. package/exports/hosts/claude/.claude/agents/reviewer.md +5 -1
  46. package/exports/hosts/claude/.claude/agents/specifier.md +3 -0
  47. package/exports/hosts/claude/.claude/commands/clarify.md +4 -0
  48. package/exports/hosts/claude/.claude/commands/design-review.md +4 -0
  49. package/exports/hosts/claude/.claude/commands/design.md +4 -0
  50. package/exports/hosts/claude/.claude/commands/discover.md +4 -0
  51. package/exports/hosts/claude/.claude/commands/execute.md +4 -0
  52. package/exports/hosts/claude/.claude/commands/plan-review.md +4 -0
  53. package/exports/hosts/claude/.claude/commands/plan.md +4 -0
  54. package/exports/hosts/claude/.claude/commands/spec-challenge.md +4 -0
  55. package/exports/hosts/claude/.claude/commands/specify.md +4 -0
  56. package/exports/hosts/claude/.claude/commands/verify.md +4 -0
  57. package/exports/hosts/claude/.claude/settings.json +9 -0
  58. package/exports/hosts/claude/CLAUDE.md +1 -1
  59. package/exports/hosts/claude/export.manifest.json +22 -20
  60. package/exports/hosts/claude/host-package.json +3 -1
  61. package/exports/hosts/codex/AGENTS.md +1 -1
  62. package/exports/hosts/codex/export.manifest.json +22 -20
  63. package/exports/hosts/codex/host-package.json +3 -1
  64. package/exports/hosts/cursor/.cursor/hooks.json +4 -0
  65. package/exports/hosts/cursor/.cursor/rules/wazir-core.mdc +1 -1
  66. package/exports/hosts/cursor/export.manifest.json +22 -20
  67. package/exports/hosts/cursor/host-package.json +3 -1
  68. package/exports/hosts/gemini/GEMINI.md +1 -1
  69. package/exports/hosts/gemini/export.manifest.json +22 -20
  70. package/exports/hosts/gemini/host-package.json +3 -1
  71. package/hooks/context-mode-router +191 -0
  72. package/hooks/definitions/context_mode_router.yaml +19 -0
  73. package/hooks/definitions/loop_cap_guard.yaml +1 -1
  74. package/hooks/hooks.json +43 -0
  75. package/hooks/protected-path-write-guard +8 -0
  76. package/hooks/routing-matrix.json +45 -0
  77. package/hooks/session-start +62 -1
  78. package/llms-full.txt +905 -132
  79. package/package.json +3 -3
  80. package/roles/clarifier.md +3 -0
  81. package/roles/designer.md +3 -0
  82. package/roles/executor.md +2 -0
  83. package/roles/planner.md +3 -0
  84. package/roles/researcher.md +2 -0
  85. package/roles/reviewer.md +5 -1
  86. package/roles/specifier.md +3 -0
  87. package/schemas/hook.schema.json +2 -1
  88. package/schemas/phase-report.schema.json +80 -0
  89. package/schemas/usage.schema.json +25 -1
  90. package/schemas/wazir-manifest.schema.json +19 -0
  91. package/skills/brainstorming/SKILL.md +20 -56
  92. package/skills/clarifier/SKILL.md +243 -0
  93. package/skills/claude-cli/SKILL.md +320 -0
  94. package/skills/codex-cli/SKILL.md +260 -0
  95. package/skills/debugging/SKILL.md +24 -1
  96. package/skills/design/SKILL.md +13 -0
  97. package/skills/dispatching-parallel-agents/SKILL.md +13 -0
  98. package/skills/executing-plans/SKILL.md +28 -2
  99. package/skills/executor/SKILL.md +129 -0
  100. package/skills/finishing-a-development-branch/SKILL.md +13 -0
  101. package/skills/gemini-cli/SKILL.md +260 -0
  102. package/skills/humanize/SKILL.md +13 -0
  103. package/skills/init-pipeline/SKILL.md +76 -78
  104. package/skills/prepare-next/SKILL.md +81 -10
  105. package/skills/receiving-code-review/SKILL.md +21 -0
  106. package/skills/requesting-code-review/SKILL.md +38 -5
  107. package/skills/reviewer/SKILL.md +423 -0
  108. package/skills/run-audit/SKILL.md +13 -0
  109. package/skills/scan-project/SKILL.md +13 -0
  110. package/skills/self-audit/SKILL.md +197 -16
  111. package/skills/subagent-driven-development/SKILL.md +38 -2
  112. package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +2 -0
  113. package/skills/subagent-driven-development/implementer-prompt.md +8 -0
  114. package/skills/subagent-driven-development/spec-reviewer-prompt.md +7 -0
  115. package/skills/tdd/SKILL.md +21 -0
  116. package/skills/using-git-worktrees/SKILL.md +13 -0
  117. package/skills/using-skills/SKILL.md +13 -0
  118. package/skills/verification/SKILL.md +13 -0
  119. package/skills/wazir/SKILL.md +286 -262
  120. package/skills/writing-plans/SKILL.md +44 -4
  121. package/skills/writing-skills/SKILL.md +13 -0
  122. package/templates/artifacts/implementation-plan.md +3 -0
  123. package/templates/artifacts/tasks-template.md +133 -0
  124. package/templates/examples/phase-report.example.json +48 -0
  125. package/templates/examples/wazir-manifest.example.yaml +1 -1
  126. package/tooling/src/adapters/composition-engine.js +256 -0
  127. package/tooling/src/adapters/model-router.js +84 -0
  128. package/tooling/src/capture/command.js +111 -2
  129. package/tooling/src/capture/run-config.js +23 -0
  130. package/tooling/src/capture/store.js +24 -0
  131. package/tooling/src/capture/usage.js +106 -0
  132. package/tooling/src/checks/ac-matrix.js +256 -0
  133. package/tooling/src/checks/brand-truth.js +3 -6
  134. package/tooling/src/checks/command-registry.js +13 -0
  135. package/tooling/src/checks/docs-truth.js +1 -1
  136. package/tooling/src/checks/runtime-surface.js +3 -7
  137. package/tooling/src/checks/skills.js +111 -0
  138. package/tooling/src/cli.js +17 -3
  139. package/tooling/src/commands/stats.js +161 -0
  140. package/tooling/src/commands/validate.js +5 -1
  141. package/tooling/src/export/compiler.js +33 -37
  142. package/tooling/src/gating/agent.js +145 -0
  143. package/tooling/src/guards/phase-prerequisite-guard.js +127 -0
  144. package/tooling/src/hooks/routing-logic.js +69 -0
  145. package/tooling/src/init/auto-detect.js +260 -0
  146. package/tooling/src/init/command.js +161 -0
  147. package/tooling/src/input/scanner.js +46 -0
  148. package/tooling/src/reports/command.js +103 -0
  149. package/tooling/src/reports/phase-report.js +323 -0
  150. package/tooling/src/state/command.js +160 -0
  151. package/tooling/src/state/db.js +287 -0
  152. package/tooling/src/status/command.js +53 -1
  153. package/wazir.manifest.yaml +26 -17
  154. package/workflows/clarify.md +4 -0
  155. package/workflows/design-review.md +4 -0
  156. package/workflows/design.md +4 -0
  157. package/workflows/discover.md +4 -0
  158. package/workflows/execute.md +4 -0
  159. package/workflows/plan-review.md +4 -0
  160. package/workflows/plan.md +4 -0
  161. package/workflows/spec-challenge.md +4 -0
  162. package/workflows/specify.md +4 -0
  163. package/workflows/verify.md +4 -0
@@ -0,0 +1,320 @@
1
+ ---
2
+ name: wz:claude-cli
3
+ description: How to use Claude Code CLI programmatically for reviews, automation, and non-interactive operations within Wazir pipelines.
4
+ ---
5
+
6
+ # Claude Code CLI Integration
7
+
8
+ ## Command Routing
9
+ Follow the Canonical Command Matrix in `hooks/routing-matrix.json`.
10
+ - Large commands (test runners, builds, diffs, dependency trees, linting) → context-mode tools
11
+ - Small commands (git status, ls, pwd, wazir CLI) → native Bash
12
+ - If context-mode unavailable, fall back to native Bash with warning
13
+
14
+ ## Codebase Exploration
15
+ 1. Query `wazir index search-symbols <query>` first
16
+ 2. Use `wazir recall file <path> --tier L1` for targeted reads
17
+ 3. Fall back to direct file reads ONLY for files identified by index queries
18
+ 4. Maximum 10 direct file reads without a justifying index query
19
+ 5. If no index exists: `wazir index build && wazir index summarize --tier all`
20
+
21
+ Reference for using the Claude Code CLI (Anthropic's official CLI for Claude) in Wazir pipelines. Claude Code is an agentic coding tool that operates in your terminal with access to tools like file operations, search, and bash execution.
22
+
23
+ ## Commands
24
+
25
+ ### claude (interactive)
26
+
27
+ Launch the interactive TUI for ad-hoc work.
28
+
29
+ ```bash
30
+ claude
31
+ claude "Fix the failing test in src/auth.ts"
32
+ ```
33
+
34
+ ### Print Mode (non-interactive)
35
+
36
+ The `-p` (or `--print`) flag is the primary mode for Wazir automation. It runs Claude Code non-interactively, outputs the result to stdout, and exits.
37
+
38
+ ```bash
39
+ # Basic non-interactive prompt
40
+ claude -p "Explain the architecture of this project"
41
+
42
+ # Pipe data from stdin (avoids command-line length limits)
43
+ git diff main | claude -p "Review this diff for bugs"
44
+
45
+ # Chain with other tools
46
+ claude -p "List all exported functions" --output-format json | jq '.result'
47
+
48
+ # Save output to file
49
+ claude -p "Summarize the test coverage" > summary.md
50
+ ```
51
+
52
+ **Key flags (all work with `-p`):**
53
+
54
+ | Flag | Description |
55
+ |------|-------------|
56
+ | `-p, --print` | Run non-interactively; print response to stdout and exit |
57
+ | `--model <MODEL>` | Select model: `opus`, `sonnet`, `haiku`, or full name (e.g., `claude-opus-4-6`) |
58
+ | `--fallback-model <MODEL>` | Fallback model when the primary is overloaded (print mode) |
59
+ | `--output-format <FORMAT>` | Output format: `text` (default), `json`, `stream-json` |
60
+ | `--json-schema <SCHEMA>` | Enforce structured output conforming to a JSON schema (requires `--output-format json`) |
61
+ | `--allowedTools <TOOLS...>` | Pre-approve specific tools without prompting (space-separated) |
62
+ | `--disallowedTools <TOOLS...>` | Block specific tools from being used |
63
+ | `--max-turns <N>` | Limit agentic turns in a session |
64
+ | `--append-system-prompt <TEXT>` | Add custom instructions while keeping default capabilities (safe choice) |
65
+ | `--system-prompt <TEXT>` | Replace the entire system prompt (removes all defaults; use with caution) |
66
+ | `--dangerously-skip-permissions` | Bypass all permission barriers (CI/CD and dev containers only) |
67
+ | `--verbose` | Enable verbose output for debugging |
68
+ | `--input-format <FORMAT>` | Input format: `text` (default) |
69
+
70
+ ### Session Management
71
+
72
+ ```bash
73
+ # Continue most recent session in current directory
74
+ claude -c
75
+
76
+ # Resume a specific session by ID
77
+ claude -r <SESSION_ID>
78
+
79
+ # Resume from a PR
80
+ claude --from-pr <NUMBER>
81
+
82
+ # Fork a session into a new thread
83
+ claude --fork-session <SESSION_ID>
84
+
85
+ # Set a custom session ID
86
+ claude --session-id <ID>
87
+ ```
88
+
89
+ ### Subcommands
90
+
91
+ | Subcommand | Description |
92
+ |------------|-------------|
93
+ | `claude mcp add` | Add an MCP server (`--transport http\|stdio`, `-s` for scope, `-e` for env vars) |
94
+ | `claude mcp serve` | Expose Claude Code itself as an MCP server |
95
+ | `claude agents` | List all configured agents |
96
+ | `claude config` | Manage configuration settings |
97
+ | `claude remote-control` | Serve your local environment for external builds |
98
+
99
+ ## Model Selection
100
+
101
+ | Model | Best For | Notes |
102
+ |-------|----------|-------|
103
+ | `opus` / `claude-opus-4-6` | Complex reasoning, architecture, multi-step tasks | Most capable, highest cost |
104
+ | `sonnet` / `claude-sonnet-4-6` | Daily coding, balanced performance | Recommended default |
105
+ | `haiku` / `claude-haiku-4-5` | Quick tasks, fast responses, high volume | Lowest cost, fastest |
106
+
107
+ **Select via:**
108
+ - CLI flag: `--model opus` or `--model claude-opus-4-6`
109
+ - Interactive: `/model` slash command
110
+ - Environment variable: `CLAUDE_MODEL`
111
+ - Settings: `.claude/settings.json`
112
+
113
+ **Fallback:** Use `--fallback-model haiku` with `-p` to auto-switch when the primary model is overloaded.
114
+
115
+ ## Permission Management
116
+
117
+ ### allowedTools
118
+
119
+ Pre-approve tools to avoid interactive permission prompts. Critical for non-interactive automation.
120
+
121
+ ```bash
122
+ # Allow specific tools
123
+ claude -p --allowedTools "Read" "Grep" "Glob" "Bash(npm run test:*)" \
124
+ "Review the test suite"
125
+
126
+ # Allow all tools (equivalent to dangerously-skip-permissions but scoped)
127
+ claude -p --allowedTools "Read" "Write" "Edit" "Bash" "Grep" "Glob" \
128
+ "Fix the bug in auth.ts"
129
+ ```
130
+
131
+ **Tool name patterns:**
132
+ - Exact: `"Read"`, `"Write"`, `"Edit"`, `"Bash"`, `"Grep"`, `"Glob"`
133
+ - Scoped: `"Bash(npm run test:*)"` allows only matching bash commands
134
+ - MCP tools: `"mcp__servername__toolname"`
135
+
136
+ ### disallowedTools
137
+
138
+ Block specific tools:
139
+
140
+ ```bash
141
+ claude -p --disallowedTools "Write" "Edit" "Bash" \
142
+ "Analyze this codebase for security issues"
143
+ ```
144
+
145
+ ### Project permissions
146
+
147
+ Store permanent permissions in `.claude/settings.json`:
148
+
149
+ ```json
150
+ {
151
+ "permissions": {
152
+ "allowedTools": ["Read", "Grep", "Glob"],
153
+ "deny": ["Bash(rm *)"]
154
+ }
155
+ }
156
+ ```
157
+
158
+ ## Non-Interactive Usage
159
+
160
+ ### Piping data
161
+
162
+ ```bash
163
+ # Pipe a diff for review
164
+ git diff main | claude -p "Review this diff for correctness"
165
+
166
+ # Pipe file content
167
+ cat src/auth.ts | claude -p "Find potential bugs"
168
+
169
+ # Pipe combined context
170
+ { echo "## Spec"; cat spec.md; echo "## Code"; cat src/main.ts; } | \
171
+ claude -p "Does the code match the spec?"
172
+ ```
173
+
174
+ ### Structured output
175
+
176
+ ```bash
177
+ # JSON output with full metadata (tool calls, token usage)
178
+ claude -p --output-format json "List all API endpoints"
179
+
180
+ # Streaming JSONL (real-time events)
181
+ claude -p --output-format stream-json "Analyze the codebase"
182
+
183
+ # Schema-enforced structured output
184
+ claude -p --output-format json --json-schema '{"type":"object","properties":{"findings":{"type":"array"},"summary":{"type":"string"}}}' \
185
+ "Review this code and return findings"
186
+ ```
187
+
188
+ ### System prompt customization
189
+
190
+ ```bash
191
+ # Append instructions (keeps default Claude Code capabilities)
192
+ claude -p --append-system-prompt "You are a security auditor. Focus only on vulnerabilities." \
193
+ "Review src/auth.ts"
194
+
195
+ # Full override (removes all defaults; use when you need a clean slate)
196
+ claude -p --system-prompt "You are a JSON-only responder. Return only valid JSON." \
197
+ --output-format json "List all functions in this file"
198
+ ```
199
+
200
+ ## MCP Server Integration
201
+
202
+ Claude Code supports MCP (Model Context Protocol) servers for extended capabilities.
203
+
204
+ ### Adding MCP servers
205
+
206
+ ```bash
207
+ # stdio transport
208
+ claude mcp add github -- npx -y @modelcontextprotocol/server-github
209
+
210
+ # HTTP transport
211
+ claude mcp add api --transport http https://api.example.com
212
+
213
+ # With scope (project vs user)
214
+ claude mcp add myserver -s project -- node server.js
215
+
216
+ # With environment variables
217
+ claude mcp add myserver -e API_KEY=xxx -- node server.js
218
+ ```
219
+
220
+ ### Using MCP tools in automation
221
+
222
+ ```bash
223
+ # Allow specific MCP tools in print mode
224
+ claude -p --allowedTools "mcp__github__create_pull_request" "mcp__github__list_issues" \
225
+ "Create a PR for the current changes"
226
+ ```
227
+
228
+ ### Claude Code as MCP server
229
+
230
+ ```bash
231
+ # Expose Claude Code as an MCP server for other tools
232
+ claude mcp serve
233
+ ```
234
+
235
+ **Tool Search (lazy loading):** Since early 2026, Claude Code uses Tool Search for MCP tools by default, loading tool schemas on demand rather than all at once. This reduces context usage by ~95%.
236
+
237
+ ## Built-in Slash Commands
238
+
239
+ | Command | Description |
240
+ |---------|-------------|
241
+ | `/model` | Switch model |
242
+ | `/cost` | Show token usage and cost |
243
+ | `/clear` | Clear conversation context |
244
+ | `/compact` | Compress conversation to save tokens |
245
+ | `/help` | Display available commands |
246
+ | `/review` | Review current changes |
247
+ | `/debug` | Debug a failing test or error |
248
+ | `/effort` | Set reasoning effort level |
249
+
250
+ ## Wazir Integration Patterns
251
+
252
+ ### Secondary Review (used by wz:reviewer)
253
+
254
+ ```bash
255
+ CLAUDE_MODEL=$(jq -r '.multi_tool.claude.model // empty' .wazir/state/config.json 2>/dev/null)
256
+ CLAUDE_MODEL=${CLAUDE_MODEL:-sonnet}
257
+
258
+ # Review uncommitted changes
259
+ git diff | claude -p --model "$CLAUDE_MODEL" \
260
+ --allowedTools "Read" "Grep" "Glob" \
261
+ "Review this diff against these acceptance criteria: <criteria>" \
262
+ 2>&1 | tee .wazir/runs/latest/reviews/claude-review.md
263
+
264
+ # Review a spec or design artifact
265
+ cat artifact.md | claude -p --model "$CLAUDE_MODEL" \
266
+ "Review this spec against these criteria: <criteria>" \
267
+ 2>&1 | tee .wazir/runs/latest/reviews/claude-review.md
268
+ ```
269
+
270
+ ### Structured Review Output
271
+
272
+ ```bash
273
+ claude -p --model "$CLAUDE_MODEL" --output-format json \
274
+ --json-schema '{"type":"object","properties":{"findings":{"type":"array","items":{"type":"object","properties":{"severity":{"type":"string"},"description":{"type":"string"},"location":{"type":"string"}}}},"summary":{"type":"string"}}}' \
275
+ "Review the changes in src/auth/" \
276
+ > .wazir/runs/latest/reviews/claude-review.json
277
+ ```
278
+
279
+ ### Parallel Execution as External Validator
280
+
281
+ ```bash
282
+ # Run Claude review in background
283
+ git diff main | claude -p --model haiku \
284
+ --allowedTools "Read" "Grep" \
285
+ "Quick security scan of this diff" \
286
+ > .wazir/runs/latest/reviews/claude-security.md 2>&1 &
287
+ ```
288
+
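Reviewer note (not part of the package diff): the background invocation above leaves open how the pipeline later learns whether the validator succeeded. The following is a minimal sketch of one way to do that with plain `wait`, assuming a POSIX shell. The helper name `run_parallel` and the `name=command` argument convention are hypothetical, and the commands in the usage note are placeholders standing in for the `claude -p ...` call.

```bash
# Sketch: start several validators in the background, then wait for each
# one and report its exit status. Names must not contain spaces or colons.
# Usage: run_parallel <outdir> <name>=<command> ...
run_parallel() {
  outdir=$1; shift
  mkdir -p "$outdir"
  jobs_list=""

  for spec in "$@"; do
    name=${spec%%=*}   # text before the first "="
    cmd=${spec#*=}     # text after the first "="
    # Each validator writes stdout and stderr to its own file.
    sh -c "$cmd" > "$outdir/$name.md" 2>&1 &
    jobs_list="$jobs_list $name:$!"
  done

  # Collect results in launch order; wait returns that job's exit status.
  for job in $jobs_list; do
    name=${job%%:*}
    pid=${job#*:}
    if wait "$pid"; then status=0; else status=$?; fi
    echo "$name $status"
  done
}
```

A call such as `run_parallel .wazir/runs/latest/reviews "security=<validator command>"` would print one `name status` line per validator, which the caller can inspect before merging findings.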
289
+ ### Multi-Turn Programmatic Sessions
290
+
291
+ For complex automation requiring multiple turns:
292
+
293
+ ```bash
294
+ # Limit turns to prevent runaway sessions
295
+ claude -p --max-turns 5 --allowedTools "Read" "Grep" "Bash(npm test)" \
296
+ "Run the tests, analyze any failures, and suggest fixes"
297
+ ```
298
+
299
+ ## Error Handling
300
+
301
+ | Error | Handling |
302
+ |-------|----------|
303
+ | **Non-zero exit** (auth/rate-limit/transport) | Log full stderr, mark pass as `claude-unavailable`, use self-review only. Next pass re-attempts. |
304
+ | **Timeout** | Wrap with `timeout 120 claude -p ...`. Treat timeout as `claude-unavailable`. |
305
+ | **Model overloaded** | `--fallback-model haiku` auto-switches. Without it, retry after backoff. |
306
+ | **Permission denied** | Add required tools to `--allowedTools` or use `--dangerously-skip-permissions` in CI. |
307
+ | **Max turns reached** | Increase `--max-turns` or break the task into smaller prompts. |
308
+
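Reviewer note (not part of the package diff): the timeout and non-zero-exit rows above can be combined into one wrapper. This is a hedged sketch, assuming GNU coreutils `timeout` (which exits with status 124 when the limit is hit). The function name `run_review` and the status strings other than the `-unavailable` suffix are hypothetical.

```bash
# Sketch: run a review command under a time limit, capture all of its
# output in a file, and print a one-line status for the pipeline.
# Usage: run_review <seconds> <label> <outfile> <command> [args...]
run_review() {
  limit=$1; label=$2; outfile=$3; shift 3

  # stdout and stderr both go to the file so the full error text is logged.
  timeout "$limit" "$@" > "$outfile" 2>&1
  rc=$?

  case $rc in
    0)   echo "ok" ;;
    124) echo "${label}-unavailable timeout" ;;    # GNU timeout: limit exceeded
    *)   echo "${label}-unavailable exit=$rc" ;;   # auth, rate limit, transport, ...
  esac
}
```

In a pipeline the command would be the `claude -p ...` invocation, for example `run_review 120 claude review.md claude -p "Review this diff"`, and any result other than `ok` would trigger the self-review-only path described in the table.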
309
+ ## Configuration
310
+
311
+ Claude Code reads configuration from (highest to lowest precedence):
312
+ 1. CLI flags (`--model`, `--allowedTools`, etc.)
313
+ 2. Environment variables (`CLAUDE_MODEL`, `ANTHROPIC_API_KEY`)
314
+ 3. `.claude/settings.json` (project-level)
315
+ 4. `~/.claude/settings.json` (user-level)
316
+ 5. `.claude/rules/` directory (modular rule files)
317
+ 6. `CLAUDE.md` (project instructions)
318
+ 7. Auto Memory (persisted learnings)
319
+
320
+ Key config fields in `settings.json`: `model`, `maxTokens`, `permissions.allowedTools`, `permissions.deny`, `env`.
@@ -0,0 +1,260 @@
1
+ ---
2
+ name: wz:codex-cli
3
+ description: How to use Codex CLI programmatically for reviews, execution, and sandbox operations within Wazir pipelines.
4
+ ---
5
+
6
+ # Codex CLI Integration
7
+
8
+ ## Command Routing
9
+ Follow the Canonical Command Matrix in `hooks/routing-matrix.json`.
10
+ - Large commands (test runners, builds, diffs, dependency trees, linting) → context-mode tools
11
+ - Small commands (git status, ls, pwd, wazir CLI) → native Bash
12
+ - If context-mode unavailable, fall back to native Bash with warning
13
+
14
+ ## Codebase Exploration
15
+ 1. Query `wazir index search-symbols <query>` first
16
+ 2. Use `wazir recall file <path> --tier L1` for targeted reads
17
+ 3. Fall back to direct file reads ONLY for files identified by index queries
18
+ 4. Maximum 10 direct file reads without a justifying index query
19
+ 5. If no index exists: `wazir index build && wazir index summarize --tier all`
20
+
21
+ Reference for using the OpenAI Codex CLI in Wazir pipelines. Codex is a terminal-based coding agent that reads your codebase, suggests or implements changes, and executes commands with OS-level sandboxing.
22
+
23
+ ## Commands
24
+
25
+ ### codex (interactive)
26
+
27
+ Launch the interactive TUI. Default mode for ad-hoc work.
28
+
29
+ ```bash
30
+ codex "Fix the failing test in src/auth.ts"
31
+ ```
32
+
33
+ ### codex exec
34
+
35
+ Run Codex non-interactively (alias: `codex e`). Streams results to stdout or JSONL. This is the primary command for Wazir automation.
36
+
37
+ ```bash
38
+ # Basic non-interactive run
39
+ codex exec "Refactor the auth module to use async/await"
40
+
41
+ # Pipe a long prompt from stdin
42
+ cat prompt.md | codex exec -
43
+
44
+ # JSON event stream for programmatic consumption
45
+ codex exec --json "Add error handling to all API routes"
46
+
47
+ # Write final assistant message to a file
48
+ codex exec --output-file result.md "Summarize the codebase architecture"
49
+
50
+ # Enforce structured output with a JSON schema
51
+ codex exec --output-schema schema.json "List all exported functions"
52
+
53
+ # Ephemeral run (no session files persisted)
54
+ codex exec --ephemeral "Quick check: does this file import lodash?"
55
+ ```
56
+
57
+ **Key flags:**
58
+
59
+ | Flag | Description |
60
+ |------|-------------|
61
+ | `--model <MODEL>` | Override the configured model (e.g., `gpt-5.4`, `gpt-5.4-mini`) |
62
+ | `--json` | Emit newline-delimited JSON events (JSONL) instead of formatted text |
63
+ | `--output-file <PATH>` | Write the final assistant message to a file |
64
+ | `--output-schema <PATH>` | Enforce structured output conforming to a JSON schema |
65
+ | `--ephemeral` | Run without persisting session rollout files |
66
+ | `--full-auto` | Apply the low-friction automation preset (`workspace-write` sandbox + `on-request` approvals) |
67
+ | `--dangerously-bypass-approvals-and-sandbox` / `--yolo` | Bypass all approval prompts and sandboxing (use only inside isolated runners) |
68
+ | `-c key=value` | Set a config override (e.g., `-c model=gpt-5.4`) |
69
+
70
+ **JSONL event types:** `thread.started`, `turn.started`, `turn.completed`, `turn.failed`, `item.message`, `item.command`, `item.file_change`, `item.mcp_tool_call`.
71
+
72
+ ### codex review
73
+
74
+ Dedicated code review command. This is what Wazir uses for secondary review in the reviewer skill.
75
+
76
+ ```bash
77
+ # Review uncommitted changes (staged + unstaged + untracked)
78
+ codex review --uncommitted
79
+
80
+ # Review changes against a base branch
81
+ codex review --base main
82
+
83
+ # Review a specific commit
84
+ codex review --commit d5853d9
85
+
86
+ # Review with custom instructions
87
+ codex review --uncommitted "Check for security vulnerabilities and missing error handling"
88
+
89
+ # Review with model override
90
+ codex review --model gpt-5.4 --base main "Review against these acceptance criteria: ..."
91
+ ```
92
+
93
+ **Key flags:**
94
+
95
+ | Flag | Description |
96
+ |------|-------------|
97
+ | `--uncommitted` | Review staged, unstaged, and untracked changes |
98
+ | `--base <BRANCH>` | Review changes against a given base branch |
99
+ | `--commit <SHA>` | Review the changes introduced by a specific commit |
100
+ | `--model <MODEL>` | Override the model for this review |
101
+ | `[PROMPT]` | Optional custom review instructions (positional argument) |
102
+
103
+ ### codex resume
104
+
105
+ Resume a previous interactive session.
106
+
107
+ ```bash
108
+ # Open session picker
109
+ codex resume
110
+
111
+ # Resume most recent session in current directory
112
+ codex resume --last
113
+
114
+ # Resume most recent session from any directory
115
+ codex resume --last --all
116
+
117
+ # Resume a specific session by ID
118
+ codex resume <SESSION_ID>
119
+ ```
120
+
121
+ ### codex fork
122
+
123
+ Fork a previous session into a new thread, preserving the original transcript. Useful for exploring alternative approaches in parallel.
124
+
125
+ ### codex cloud
126
+
127
+ Browse or execute Codex Cloud tasks from the terminal without opening the TUI.
128
+
129
+ ### codex execpolicy
130
+
131
+ Evaluate execpolicy rule files and check whether a command would be allowed, prompted, or blocked.
132
+
133
+ ### codex auth
134
+
135
+ Authenticate Codex using ChatGPT OAuth, device auth, or an API key piped over stdin.
136
+
137
+ ### codex completion
138
+
139
+ Generate shell completion scripts for Bash, Zsh, Fish, or PowerShell.
140
+
141
+ ## Sandbox Modes
142
+
143
+ Codex provides OS-level sandboxing to protect your system.
144
+
145
+ | Sandbox Mode | Description |
146
+ |-------------|-------------|
147
+ | `read-only` | Codex can only read files, no writes or commands |
148
+ | `workspace-write` | Codex can write files in the workspace but external commands are sandboxed |
149
+ | `danger-full-access` | Full system access with no restrictions (use only in VMs/containers) |
150
+
151
+ ## Approval Policies
152
+
153
+ | Policy | Description |
154
+ |--------|-------------|
155
+ | `suggest` | Default. Every action requires explicit approval before execution |
156
+ | `auto-edit` | Auto-approves file edits but still requires approval for shell commands |
157
+ | `full-auto` | Autonomous mode; executes everything without confirmation |
158
+
159
+ **Preset shortcut:** `--full-auto` sets `sandbox_mode=workspace-write` + `approval_policy=on-request`.
160
+
161
+ **Granular control:** Set `approval_policy` to `"on-request"`, `"untrusted"`, or `"never"` in config. You can also define per-command allow/reject rules.
162
+
163
+ ## Model Selection
164
+
165
+ | Model | Best For | Notes |
166
+ |-------|----------|-------|
167
+ | `gpt-5.4` | Complex coding, reasoning, professional workflows | Recommended default |
168
+ | `gpt-5.4-mini` | Faster, lower-cost tasks, subagents | Uses ~30% of gpt-5.4 quota |
169
+ | `gpt-5.3-codex-spark` | Near-instant real-time coding iteration | Research preview, Pro subscribers |
170
+
171
+ Select via `--model <model>` flag or `-c model=<model>` config override.
172
+
173
+ ## Non-Interactive Usage
174
+
175
+ ### Piping prompts
176
+
177
+ ```bash
178
+ # Pipe from stdin
179
+ cat prompt.md | codex exec -
180
+
181
+ # Pipe a diff for review
182
+ git diff main | codex exec - "Review this diff for bugs"
183
+ ```
184
+
185
+ ### Structured output
186
+
187
+ ```bash
188
+ # JSONL event stream
189
+ codex exec --json "Analyze the test coverage"
190
+
191
+ # Schema-enforced output
192
+ codex exec --output-schema schema.json "List all API endpoints"
193
+ ```
194
+
195
+ ### Capturing output
196
+
197
+ ```bash
198
+ # Write final message to file
199
+ codex exec --output-file result.md "Summarize changes since v2.0"
200
+
201
+ # Tee for both display and capture
202
+ codex exec "Review this code" 2>&1 | tee review-output.md
203
+ ```
204
+
205
+ ## Wazir Integration Patterns
206
+
207
+ ### Secondary Review (used by wz:reviewer)
208
+
209
+ ```bash
210
+ CODEX_MODEL=$(jq -r '.multi_tool.codex.model // empty' .wazir/state/config.json 2>/dev/null)
211
+ CODEX_MODEL=${CODEX_MODEL:-gpt-5.4}
212
+
213
+ # Review uncommitted changes
214
+ codex review -c model="$CODEX_MODEL" --uncommitted \
215
+ "Review against these acceptance criteria: <criteria>" \
216
+ 2>&1 | tee .wazir/runs/latest/reviews/codex-review.md
217
+
218
+ # Review committed changes against a base branch
219
+ codex review -c model="$CODEX_MODEL" --base main \
220
+ "Review against these acceptance criteria: <criteria>" \
221
+ 2>&1 | tee .wazir/runs/latest/reviews/codex-review.md
222
+ ```
223
+
224
+ ### Non-Code Artifact Review
225
+
226
+ For reviewing specs, designs, and plans (not code diffs):
227
+
228
+ ```bash
229
+ cat artifact.md | codex exec -c model="$CODEX_MODEL" - \
230
+ "Review this spec against these criteria: <criteria>" \
231
+ 2>&1 | tee .wazir/runs/latest/reviews/codex-review.md
232
+ ```
233
+
234
+ ### Parallel Execution
235
+
236
+ Use Codex as an external validator alongside the primary Wazir review:
237
+
238
+ ```bash
239
+ # Run Codex review in background, merge findings later
240
+ codex review --uncommitted "Check for security issues" \
241
+ > .wazir/runs/latest/reviews/codex-security.md 2>&1 &
242
+ ```
243
+
244
+ ## Error Handling
245
+
246
+ | Error | Handling |
247
+ |-------|----------|
248
+ | **Non-zero exit** (auth/rate-limit/transport) | Log full stderr, mark pass as `codex-unavailable`, use self-review only for that pass. Next pass re-attempts Codex (transient failures may recover). |
249
+ | **Timeout** | Set reasonable timeouts via shell (`timeout 120 codex review ...`). If exceeded, treat as `codex-unavailable`. |
250
+ | **Model unavailable** | Fall back to `gpt-5.4-mini` if primary model is overloaded. |
251
+ | **Rate limiting** | Respect retry-after headers. Space sequential calls by at least 5 seconds. |
252
+
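The first two rows of the table can be combined into one wrapper: run the pass under a time limit, log everything, and classify the result. This is a sketch under stated assumptions. `classify_review_pass` is a hypothetical helper, the `ok` label is illustrative (only `codex-unavailable` comes from the table), and `timeout` is assumed to be the GNU coreutils version, which exits with 124 when the limit is exceeded.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run one review pass with a time limit and print
# "ok" or "codex-unavailable".
# Usage: classify_review_pass <seconds> <log-file> <command> [args...]
classify_review_pass() {
  local limit_seconds="$1"
  local log_file="$2"
  shift 2
  local status=0
  # stdout and stderr are appended to the log so the full error text is kept.
  timeout "$limit_seconds" "$@" >> "$log_file" 2>&1 || status=$?
  if [ "$status" -eq 0 ]; then
    echo "ok"
  else
    # A timeout (exit 124) and any other non-zero exit are handled the same
    # way: the caller falls back to self-review for this pass.
    echo "exit status $status" >> "$log_file"
    echo "codex-unavailable"
  fi
}

# Example (commented out because it needs the Codex CLI):
# classify_review_pass 120 codex-review.log codex review --uncommitted "Review"
```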
253
+ ## Configuration
254
+
255
+ Codex CLI reads configuration from:
256
+ - `~/.codex/config.yaml` or `~/.codex/config.json` (global)
257
+ - `.codex/config.yaml` in the project root (project-level)
258
+ - Command-line flags and `-c key=value` overrides (highest precedence)
259
+
260
+ Key config fields: `model`, `approval_policy`, `sandbox_mode`, `providers`.
@@ -5,6 +5,19 @@ description: Use when behavior is wrong or verification fails. Follow an observe
 
  # Debugging
 
+ ## Command Routing
+ Follow the Canonical Command Matrix in `hooks/routing-matrix.json`.
+ - Large commands (test runners, builds, diffs, dependency trees, linting) → context-mode tools
+ - Small commands (git status, ls, pwd, wazir CLI) → native Bash
+ - If context-mode unavailable, fall back to native Bash with warning
+
+ ## Codebase Exploration
+ 1. Query `wazir index search-symbols <query>` first
+ 2. Use `wazir recall file <path> --tier L1` for targeted reads
+ 3. Fall back to direct file reads ONLY for files identified by index queries
+ 4. Maximum 10 direct file reads without a justifying index query
+ 5. If no index exists: `wazir index build && wazir index summarize --tier all`
+
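Rule 4 in the list above is a budget, and a budget is easy to track with a counter. The sketch below is illustrative only: `record_direct_read`, `direct_reads`, and `MAX_DIRECT_READS` are hypothetical names, not part of the Wazir CLI, and how the skill actually counts reads is not stated in the source.

```bash
#!/usr/bin/env bash
# Hypothetical budget for direct file reads made without an index query.
MAX_DIRECT_READS=10
direct_reads=0

# Call before each direct file read that no index query justified.
# Returns 0 and increments the counter while budget remains.
# Returns 1 once the budget is spent, meaning: go back to the index.
# Call it in the current shell (not inside $(...)) so the counter persists.
record_direct_read() {
  if [ "$direct_reads" -ge "$MAX_DIRECT_READS" ]; then
    return 1
  fi
  direct_reads=$((direct_reads + 1))
}

# Example:
# if record_direct_read; then
#   cat path/to/file
# else
#   echo "read budget spent; query the index first" >&2
# fi
```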
  > **Note:** This skill uses Wazir CLI commands for symbol-first code
  > exploration. If the CLI index is unavailable, fall back to direct file reads —
  > the generic OBSERVE methodology (read files, inspect state, gather evidence)
@@ -43,7 +56,17 @@ Follow this order:
 
  Apply the minimum corrective change, then rerun the failing check and the relevant broader verification set.
 
- Rules:
+ ## Loop Cap Awareness
+
+ Debugging loops respect the loop cap when running inside a pipeline:
+ - **Pipeline mode** (`.wazir/runs/latest/` exists): use `wazir capture loop-check` to track iteration count. If the cap is reached (exit 43), escalate to the user with all evidence collected so far.
+ - **Standalone mode** (no `.wazir/runs/latest/`): the loop runs for `pass_counts[depth]` passes (quick=3, standard=5, deep=7) with no cap guard. Track iteration count manually.
+
+ In standalone mode, any debug logs go to `docs/plans/` alongside the artifact.
+
+ See `docs/reference/review-loop-pattern.md` for cap guard integration.
+
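The two modes differ only in where the iteration limit comes from, which a short sketch can make concrete. `detect_loop_mode` and `passes_for_depth` are hypothetical helper names. The directory check and the pass counts (quick=3, standard=5, deep=7) are taken from the text above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print "pipeline" when a run directory exists under
# the given project root (default: current directory), else "standalone".
detect_loop_mode() {
  local project_root="${1:-.}"
  if [ -e "$project_root/.wazir/runs/latest" ]; then
    echo "pipeline"
  else
    echo "standalone"
  fi
}

# Hypothetical helper: map a depth name to its standalone pass count.
passes_for_depth() {
  case "$1" in
    quick)    echo 3 ;;
    standard) echo 5 ;;
    deep)     echo 7 ;;
    *)        echo "unknown depth: $1" >&2; return 1 ;;
  esac
}

# Example: in standalone mode, count passes manually.
# if [ "$(detect_loop_mode)" = "standalone" ]; then
#   max_passes=$(passes_for_depth standard)
#   for pass in $(seq 1 "$max_passes"); do
#     echo "debug pass $pass of $max_passes"
#   done
# fi
# In pipeline mode the text above delegates counting to
# `wazir capture loop-check`, where exit 43 means the cap was reached.
```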
+ ## Rules
 
  - change one thing at a time
  - keep evidence for each failed hypothesis
@@ -7,6 +7,19 @@ description: Guide the designer role through open-pencil MCP workflow to produce
 
  Use open-pencil MCP tools to create visual designs from the approved spec.
 
+ ## Command Routing
+ Follow the Canonical Command Matrix in `hooks/routing-matrix.json`.
+ - Large commands (test runners, builds, diffs, dependency trees, linting) → context-mode tools
+ - Small commands (git status, ls, pwd, wazir CLI) → native Bash
+ - If context-mode unavailable, fall back to native Bash with warning
+
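The routing rule above reads as a three-way decision, sketched here as a shell function. This is not the contents of `hooks/routing-matrix.json`, which the diff does not show. `route_command` is a hypothetical helper, only the small-command list comes from the text, and treating every other command as "large" is a simplifying assumption.

```bash
#!/usr/bin/env bash
# Hypothetical router: print "native-bash" or "context-mode" for a command.
# Usage: route_command "<command line>" [yes|no]
# The second argument says whether context-mode tools are available.
route_command() {
  local cmd="$1"
  local context_mode_available="${2:-yes}"
  case "$cmd" in
    # Small commands named in the text: git status, ls, pwd, wazir CLI.
    "git status"|"git status "*|ls|"ls "*|pwd|wazir|"wazir "*)
      echo "native-bash"
      return 0
      ;;
  esac
  # Assumption: anything else may produce large output.
  if [ "$context_mode_available" = "yes" ]; then
    echo "context-mode"
  else
    echo "warning: context-mode unavailable, using native Bash" >&2
    echo "native-bash"
  fi
}

# Example:
# route_command "git status"    # native-bash
# route_command "npm test"      # context-mode
# route_command "npm test" no   # native-bash, with a warning on stderr
```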
+ ## Codebase Exploration
+ 1. Query `wazir index search-symbols <query>` first
+ 2. Use `wazir recall file <path> --tier L1` for targeted reads
+ 3. Fall back to direct file reads ONLY for files identified by index queries
+ 4. Maximum 10 direct file reads without a justifying index query
+ 5. If no index exists: `wazir index build && wazir index summarize --tier all`
+
  ## Prerequisites
 
  - open-pencil MCP server running (`openpencil-mcp` or `openpencil-mcp-http`)
@@ -5,6 +5,19 @@ description: Use when facing 2+ independent tasks that can be worked on without
 
  # Dispatching Parallel Agents
 
+ ## Command Routing
+ Follow the Canonical Command Matrix in `hooks/routing-matrix.json`.
+ - Large commands (test runners, builds, diffs, dependency trees, linting) → context-mode tools
+ - Small commands (git status, ls, pwd, wazir CLI) → native Bash
+ - If context-mode unavailable, fall back to native Bash with warning
+
+ ## Codebase Exploration
+ 1. Query `wazir index search-symbols <query>` first
+ 2. Use `wazir recall file <path> --tier L1` for targeted reads
+ 3. Fall back to direct file reads ONLY for files identified by index queries
+ 4. Maximum 10 direct file reads without a justifying index query
+ 5. If no index exists: `wazir index build && wazir index summarize --tier all`
+
  ## Overview
 
  You delegate tasks to specialized agents with isolated context. By precisely crafting their instructions and context, you ensure they stay focused and succeed at their task. They should never inherit your session's context or history — you construct exactly what they need. This also preserves your own context for coordination work.