@wazir-dev/cli 1.1.0 → 1.3.0

Files changed (138)
  1. package/CHANGELOG.md +74 -10
  2. package/README.md +15 -15
  3. package/assets/demo.cast +47 -0
  4. package/assets/demo.gif +0 -0
  5. package/docs/anti-patterns/AP-23-skipping-enabled-workflows.md +28 -0
  6. package/docs/anti-patterns/AP-24-clarifier-deciding-scope.md +34 -0
  7. package/docs/concepts/architecture.md +1 -1
  8. package/docs/concepts/roles-and-workflows.md +2 -0
  9. package/docs/concepts/why-wazir.md +59 -0
  10. package/docs/decisions/2026-03-19-deferred-items.md +564 -0
  11. package/docs/decisions/2026-03-19-enhancement-decisions.md +300 -0
  12. package/docs/readmes/INDEX.md +21 -5
  13. package/docs/readmes/features/expertise/README.md +2 -2
  14. package/docs/readmes/features/exports/README.md +2 -2
  15. package/docs/readmes/features/hooks/pre-compact-summary.md +1 -1
  16. package/docs/readmes/features/schemas/README.md +3 -0
  17. package/docs/readmes/features/skills/README.md +17 -0
  18. package/docs/readmes/features/skills/clarifier.md +5 -0
  19. package/docs/readmes/features/skills/claude-cli.md +5 -0
  20. package/docs/readmes/features/skills/codex-cli.md +5 -0
  21. package/docs/readmes/features/skills/dispatching-parallel-agents.md +5 -0
  22. package/docs/readmes/features/skills/executing-plans.md +5 -0
  23. package/docs/readmes/features/skills/executor.md +5 -0
  24. package/docs/readmes/features/skills/finishing-a-development-branch.md +5 -0
  25. package/docs/readmes/features/skills/gemini-cli.md +5 -0
  26. package/docs/readmes/features/skills/humanize.md +5 -0
  27. package/docs/readmes/features/skills/init-pipeline.md +5 -0
  28. package/docs/readmes/features/skills/receiving-code-review.md +5 -0
  29. package/docs/readmes/features/skills/requesting-code-review.md +5 -0
  30. package/docs/readmes/features/skills/reviewer.md +5 -0
  31. package/docs/readmes/features/skills/subagent-driven-development.md +5 -0
  32. package/docs/readmes/features/skills/using-git-worktrees.md +5 -0
  33. package/docs/readmes/features/skills/wazir.md +5 -0
  34. package/docs/readmes/features/skills/writing-skills.md +5 -0
  35. package/docs/readmes/features/workflows/prepare-next.md +1 -1
  36. package/docs/reference/configuration-reference.md +47 -6
  37. package/docs/reference/hooks.md +1 -0
  38. package/docs/reference/launch-checklist.md +4 -4
  39. package/docs/reference/review-loop-pattern.md +119 -9
  40. package/docs/reference/roles-reference.md +1 -0
  41. package/docs/reference/skill-tiers.md +147 -0
  42. package/docs/reference/tooling-cli.md +3 -1
  43. package/docs/truth-claims.yaml +12 -0
  44. package/expertise/antipatterns/process/ai-coding-antipatterns.md +214 -1
  45. package/exports/hosts/claude/.claude/commands/plan-review.md +3 -1
  46. package/exports/hosts/claude/.claude/commands/verify.md +30 -1
  47. package/exports/hosts/claude/.claude/settings.json +9 -0
  48. package/exports/hosts/claude/CLAUDE.md +1 -1
  49. package/exports/hosts/claude/export.manifest.json +6 -4
  50. package/exports/hosts/claude/host-package.json +3 -1
  51. package/exports/hosts/codex/AGENTS.md +1 -1
  52. package/exports/hosts/codex/export.manifest.json +6 -4
  53. package/exports/hosts/codex/host-package.json +3 -1
  54. package/exports/hosts/cursor/.cursor/hooks.json +4 -0
  55. package/exports/hosts/cursor/.cursor/rules/wazir-core.mdc +1 -1
  56. package/exports/hosts/cursor/export.manifest.json +6 -4
  57. package/exports/hosts/cursor/host-package.json +3 -1
  58. package/exports/hosts/gemini/GEMINI.md +1 -1
  59. package/exports/hosts/gemini/export.manifest.json +6 -4
  60. package/exports/hosts/gemini/host-package.json +3 -1
  61. package/hooks/context-mode-router +191 -0
  62. package/hooks/definitions/context_mode_router.yaml +19 -0
  63. package/hooks/hooks.json +31 -6
  64. package/hooks/protected-path-write-guard +8 -0
  65. package/hooks/routing-matrix.json +45 -0
  66. package/hooks/session-start +62 -1
  67. package/llms-full.txt +937 -134
  68. package/package.json +2 -4
  69. package/schemas/hook.schema.json +2 -1
  70. package/schemas/phase-report.schema.json +89 -0
  71. package/schemas/usage.schema.json +25 -1
  72. package/schemas/wazir-manifest.schema.json +19 -0
  73. package/skills/brainstorming/SKILL.md +32 -157
  74. package/skills/clarifier/SKILL.md +289 -111
  75. package/skills/claude-cli/SKILL.md +320 -0
  76. package/skills/codex-cli/SKILL.md +260 -0
  77. package/skills/debugging/SKILL.md +13 -0
  78. package/skills/design/SKILL.md +13 -0
  79. package/skills/dispatching-parallel-agents/SKILL.md +13 -0
  80. package/skills/executing-plans/SKILL.md +13 -0
  81. package/skills/executor/SKILL.md +139 -19
  82. package/skills/finishing-a-development-branch/SKILL.md +13 -0
  83. package/skills/gemini-cli/SKILL.md +260 -0
  84. package/skills/humanize/SKILL.md +13 -0
  85. package/skills/init-pipeline/SKILL.md +72 -164
  86. package/skills/prepare-next/SKILL.md +81 -10
  87. package/skills/receiving-code-review/SKILL.md +13 -0
  88. package/skills/requesting-code-review/SKILL.md +13 -0
  89. package/skills/reviewer/SKILL.md +369 -24
  90. package/skills/run-audit/SKILL.md +13 -0
  91. package/skills/scan-project/SKILL.md +13 -0
  92. package/skills/self-audit/SKILL.md +217 -16
  93. package/skills/skill-research/SKILL.md +188 -0
  94. package/skills/subagent-driven-development/SKILL.md +13 -0
  95. package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +2 -0
  96. package/skills/subagent-driven-development/implementer-prompt.md +8 -0
  97. package/skills/subagent-driven-development/spec-reviewer-prompt.md +7 -0
  98. package/skills/tdd/SKILL.md +13 -0
  99. package/skills/using-git-worktrees/SKILL.md +13 -0
  100. package/skills/using-skills/SKILL.md +13 -0
  101. package/skills/verification/SKILL.md +54 -3
  102. package/skills/wazir/SKILL.md +464 -381
  103. package/skills/writing-plans/SKILL.md +14 -1
  104. package/skills/writing-skills/SKILL.md +13 -0
  105. package/templates/artifacts/implementation-plan.md +3 -0
  106. package/templates/artifacts/tasks-template.md +133 -0
  107. package/templates/examples/phase-report.example.json +48 -0
  108. package/tooling/src/adapters/composition-engine.js +256 -0
  109. package/tooling/src/adapters/model-router.js +84 -0
  110. package/tooling/src/capture/command.js +41 -2
  111. package/tooling/src/capture/run-config.js +3 -1
  112. package/tooling/src/capture/store.js +56 -0
  113. package/tooling/src/capture/usage.js +106 -0
  114. package/tooling/src/capture/user-input.js +66 -0
  115. package/tooling/src/checks/ac-matrix.js +256 -0
  116. package/tooling/src/checks/command-registry.js +12 -0
  117. package/tooling/src/checks/docs-truth.js +1 -1
  118. package/tooling/src/checks/security-sensitivity.js +69 -0
  119. package/tooling/src/checks/skills.js +111 -0
  120. package/tooling/src/cli.js +31 -20
  121. package/tooling/src/commands/stats.js +161 -0
  122. package/tooling/src/commands/validate.js +5 -1
  123. package/tooling/src/export/compiler.js +33 -37
  124. package/tooling/src/gating/agent.js +145 -0
  125. package/tooling/src/guards/phase-prerequisite-guard.js +185 -0
  126. package/tooling/src/hooks/routing-logic.js +69 -0
  127. package/tooling/src/init/auto-detect.js +258 -0
  128. package/tooling/src/init/command.js +38 -170
  129. package/tooling/src/input/scanner.js +46 -0
  130. package/tooling/src/reports/command.js +103 -0
  131. package/tooling/src/reports/phase-report.js +323 -0
  132. package/tooling/src/state/command.js +160 -0
  133. package/tooling/src/state/db.js +287 -0
  134. package/tooling/src/status/command.js +58 -1
  135. package/tooling/src/verify/proof-collector.js +299 -0
  136. package/wazir.manifest.yaml +26 -14
  137. package/workflows/plan-review.md +3 -1
  138. package/workflows/verify.md +30 -1
@@ -5,13 +5,53 @@ description: Run the execution phase — implement the approved plan with TDD, q
5
5
 
6
6
  # Executor
7
7
 
8
- Run Phase 2 (Execute) for the current project.
8
+ ## Model Annotation
9
+ When multi-model mode is enabled, the executor phase uses:
10
+ - **Sonnet** for per-task implementation (write-implementation)
11
+ - **Sonnet** for per-task review (task-review)
12
+ - **Sonnet** for test execution (run-tests)
13
+ - **Opus** for orchestration decisions
14
+
15
+ ## Command Routing
16
+ Follow the Canonical Command Matrix in `hooks/routing-matrix.json`.
17
+ - Large commands (test runners, builds, diffs, dependency trees, linting) → context-mode tools
18
+ - Small commands (git status, ls, pwd, wazir CLI) → native Bash
19
+ - If context-mode unavailable, fall back to native Bash with warning
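The routing rule above can be sketched as a small classifier. This is an illustration only: `route_command` and its patterns are hypothetical, they are not read from `hooks/routing-matrix.json`, and the default for unknown commands is an assumption.

```shell
# Hypothetical sketch: classify a command line as "native" or "context-mode".
# The patterns below are illustrative assumptions, not the real routing matrix.
route_command() {
  local cmd="$1"
  case "$cmd" in
    "git status"*|ls|"ls "*|pwd|wazir|"wazir "*)
      echo "native" ;;        # small commands stay in native Bash
    "npm test"*|"npm run build"*|"git diff"*|"npm ls"*|"npx eslint"*)
      echo "context-mode" ;;  # large-output commands go to context-mode tools
    *)
      echo "native" ;;        # unknown commands default to native in this sketch
  esac
}
```

For example, `route_command "git diff main"` prints `context-mode`, while `route_command "wazir index build"` prints `native`.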
20
+
21
+ ## Codebase Exploration
22
+ 1. Query `wazir index search-symbols <query>` first
23
+ 2. Use `wazir recall file <path> --tier L1` for targeted reads
24
+ 3. Fall back to direct file reads ONLY for files identified by index queries
25
+ 4. Maximum 10 direct file reads without a justifying index query
26
+ 5. If no index exists: `wazir index build && wazir index summarize --tier all`
27
+
28
+ Run the Executor phase — implement the approved plan, then verify all claims.
29
+
30
+ ## Phase Prerequisites (Hard Gate)
31
+
32
+ Before proceeding, verify these artifacts exist. Check each file. If ANY file is missing, **STOP immediately** and report:
33
+
34
+ > **Cannot start Executor phase: missing prerequisite artifacts.**
35
+ >
36
+ > Missing:
37
+ > - [list missing files]
38
+ >
39
+ > Run `/wazir:clarifier` to produce the missing artifacts.
40
+
41
+ Required artifacts:
42
+ - [ ] `.wazir/runs/latest/clarified/clarification.md`
43
+ - [ ] `.wazir/runs/latest/clarified/spec-hardened.md`
44
+ - [ ] `.wazir/runs/latest/clarified/design.md`
45
+ - [ ] `.wazir/runs/latest/clarified/execution-plan.md`
46
+
47
+ **This is a hard gate. Do NOT proceed without all artifacts. Do NOT rationalize that the input is "clear enough" to skip phases. The existence of detailed input does NOT replace the pipeline's clarification, specification, design, and planning phases.**
48
+
49
+ **Standalone mode exception:** If `.wazir/runs/latest/` does not exist at all, operate in standalone mode (skip this check).
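The hard gate described above could be checked with a script along these lines. This is a minimal sketch, not part of the package: `check_executor_prereqs` is a hypothetical helper, and only the four artifact paths and the standalone exception come from the text.

```shell
# Hypothetical helper: verify the four clarified artifacts before executing.
# Returns 0 in standalone mode (no run directory) or when all files exist,
# and returns 1 after listing whatever is missing.
check_executor_prereqs() {
  local run_dir="${1:-.wazir/runs/latest}"
  local missing=()
  local name

  # Standalone mode exception: no run directory at all means skip the gate.
  if [ ! -e "$run_dir" ]; then
    echo "standalone mode: skipping prerequisite check"
    return 0
  fi

  for name in clarification.md spec-hardened.md design.md execution-plan.md; do
    [ -f "$run_dir/clarified/$name" ] || missing+=("$run_dir/clarified/$name")
  done

  if [ "${#missing[@]}" -gt 0 ]; then
    echo "Cannot start Executor phase: missing prerequisite artifacts."
    echo "Missing:"
    printf -- '- %s\n' "${missing[@]}"
    echo "Run /wazir:clarifier to produce the missing artifacts."
    return 1
  fi
  return 0
}
```

Calling `check_executor_prereqs` with no argument checks `.wazir/runs/latest`; a non-zero return is the signal to stop.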
9
50
 
10
51
  ## Prerequisites
11
52
 
12
- 1. Check `.wazir/runs/latest/clarified/execution-plan.md` exists. If not, tell the user to run `/wazir:clarifier` first.
13
- 2. Read the execution plan and task specs from `.wazir/runs/latest/tasks/`.
14
- 3. Read `.wazir/state/config.json` for team_mode and depth settings.
53
+ 1. Read the execution plan from `.wazir/runs/latest/clarified/execution-plan.md`.
54
+ 2. Read `.wazir/state/config.json` for depth settings.
15
55
 
16
56
  ## Pre-Execution Validation
17
57
 
@@ -21,41 +61,93 @@ Run these checks before implementing:
21
61
 
22
62
  If either fails, surface the failure and do NOT proceed until resolved.
23
63
 
24
- ## Execution
64
+ > **Output to the user** before execution begins:
65
+ > Each task is implemented with TDD (test first, then code) and reviewed before commit. This catches correctness bugs, missing tests, wiring errors, and spec drift at the task level — before they compound across tasks and become expensive to fix.
66
+
67
+ ## Security Awareness
68
+
69
+ Before implementing each task, check if the task touches security-sensitive areas. Run `detectSecurityPatterns` (from `tooling/src/checks/security-sensitivity.js`) mentally against the planned changes. If security patterns are detected (auth, token, password, session, SQL, fetch, upload, secret, env, API key, cookie, CORS, CSRF, JWT, OAuth, encrypt, decrypt, hash, salt):
70
+
71
+ - Load security expertise from the composition map for the relevant concern
72
+ - Apply defense-in-depth: validate inputs, parameterize queries, escape outputs, use secure defaults
73
+ - The per-task reviewer will automatically add security dimensions when patterns are detected — expect and address security findings
74
+
75
+ ## Execute (execute workflow)
25
76
 
26
77
  Implement tasks in the order defined by the execution plan.
27
78
 
28
79
  For each task:
29
80
 
30
- 1. **Read** the task spec at `.wazir/runs/latest/tasks/task-NNN/spec.md`
81
+ **Before starting each task, output to the user:**
82
+
83
+ > **Implementing Task [NNN]: [task title]** — This enables [what downstream tasks or user-facing features depend on this task].
84
+ >
85
+ > **Looking for:** [Key technical concerns for this specific task — e.g., "correct API contract", "database migration safety", "backwards compatibility"]
86
+
87
+ 1. **Read** the task from the execution plan
31
88
  2. **Implement** using TDD (write test first, make it pass, refactor)
32
- 3. **Verify** — run tests, type checks, linting as appropriate
89
+ 3. **Verify locally** — run tests, type checks, linting as appropriate
33
90
  4. **Review BEFORE commit** (per-task review, NOT final review):
34
91
  - Reviewer runs task-review loop with `--mode task-review` using 5 task-execution dimensions (correctness, tests, wiring, drift, quality)
35
- - Reads the Codex model from config: `CODEX_MODEL=$(jq -r '.multi_tool.codex.model // empty' .wazir/state/config.json 2>/dev/null); CODEX_MODEL=${CODEX_MODEL:-gpt-5.4}`
92
+ - Reads Codex model from config: `CODEX_MODEL=$(jq -r '.multi_tool.codex.model // empty' .wazir/state/config.json 2>/dev/null); CODEX_MODEL=${CODEX_MODEL:-gpt-5.4}`
36
93
  - Uses `codex review -c model="$CODEX_MODEL" --uncommitted` for the current task's changes
37
- - Codex error handling: if codex exits non-zero, log error, mark pass as `codex-unavailable`, use self-review only for that pass. Do NOT treat a Codex failure as a clean review. Do NOT skip the pass. The next pass still attempts Codex (transient failures may recover).
94
+ - Codex error handling: if codex exits non-zero, log error, mark pass as `codex-unavailable`, use self-review only for that pass. Do NOT skip. Next pass still attempts Codex.
38
95
  - Executor resolves findings, reviewer re-reviews
39
96
  - Loop runs for `pass_counts[depth]` passes (quick=3, standard=5, deep=7). No extension.
40
97
  - Review logs: `.wazir/runs/latest/reviews/execute-task-<NNN>-review-pass-<N>.md`
41
- - Loop cap tracking: `wazir capture loop-check --task-id <NNN>` (each task has its own cap counter)
98
+ - Loop cap tracking: `wazir capture loop-check --task-id <NNN>`
42
99
  - See `docs/reference/review-loop-pattern.md` for full protocol
43
- - NOTE: this is the per-task review (5 dims), not the final scored review (7 dims) which runs later in `/wazir:reviewer --mode final`
100
+ - NOTE: this is the per-task review (5 dims), not the final scored review (7 dims) which runs in Phase 4
44
101
  5. **Commit** — only after review passes, commit with conventional commit format: `<type>(<scope>): <description>`
45
- 6. **CHANGELOG** if the change is user-facing (new feature, behavior change, bug fix visible to users), update `CHANGELOG.md` `[Unreleased]` section. If not user-facing (refactor, internal tooling, tests), skip.
102
+ - **HARD RULE: One task = one commit.** Commit after EACH task completes its review. Never batch multiple tasks into a single commit. If the reviewer detects multi-task batching, the commit is REJECTED.
103
+ 6. **CHANGELOG** — if user-facing change, update `CHANGELOG.md` under `[Unreleased]` using keepachangelog types: Added, Changed, Fixed, Removed, Deprecated, Security.
46
104
  7. **Record** evidence at `.wazir/runs/latest/artifacts/task-NNN/`
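The fixed-length review loop in step 4 can be sketched as follows. The pass counts (quick=3, standard=5, deep=7) and the `codex-unavailable` marker come from the text; `pass_count_for_depth` and `review_task` are hypothetical names, and the reviewer command is passed in as arguments so the sketch does not assume how Codex is invoked.

```shell
# Hypothetical helper: map a depth setting to its fixed number of review passes.
pass_count_for_depth() {
  case "$1" in
    quick)    echo 3 ;;
    standard) echo 5 ;;
    deep)     echo 7 ;;
    *) echo "unknown depth: $1" >&2; return 1 ;;
  esac
}

# Hypothetical loop: run the reviewer command once per pass. A failing pass is
# marked codex-unavailable (self-review only) and the next pass still retries.
review_task() {
  local depth="$1"
  shift
  local passes pass
  passes=$(pass_count_for_depth "$depth") || return 1
  for ((pass = 1; pass <= passes; pass++)); do
    if "$@" >/dev/null 2>&1; then
      echo "pass $pass: codex"
    else
      echo "pass $pass: codex-unavailable"
    fi
  done
}
```

A call might look like `review_task standard codex review -c model="$CODEX_MODEL" --uncommitted`, which always runs five passes regardless of how many of them fall back to self-review.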
47
105
 
48
- Review loops follow the pattern in `docs/reference/review-loop-pattern.md`. Code review scoping: review uncommitted changes before commit. If changes are already committed (subagent workflow), use `codex review -c model="$CODEX_MODEL" --base <pre-task-sha>`.
106
+ **After completing each task, output to the user:**
49
107
 
50
- If `team_mode: parallel` in config, spawn Agent Teams for independent tasks. Otherwise, tasks run sequentially.
108
+ > **Completed Task [NNN]: [task title].**
109
+ >
110
+ > **Changed:** [List of files created/modified, tests added, key implementation decisions]
111
+ >
112
+ > **Without this task:** [Concrete risk — e.g., "no auth middleware means all routes are publicly accessible", "no migration means schema change would require manual DB intervention"]
113
+ >
114
+ > **Review result:** [N] findings in [N] review passes, [N] fixed before commit
115
+
116
+ Review loops follow `docs/reference/review-loop-pattern.md`. Code review scoping: review uncommitted changes before commit. If changes are committed, use `--base <pre-task-sha>`.
117
+
118
+ Tasks always run sequentially.
119
+
120
+ **Standalone mode:** When no `.wazir/runs/latest/` exists, review logs go to `docs/plans/`.
121
+
122
+ > **Output to the user** before verification:
123
+ > Verification produces deterministic proof — actual command output, not claims. It confirms that tests pass, types check, linters are clean, and every acceptance criterion has evidence. This is the evidence gate that separates "I think it works" from "here is proof it works."
51
124
 
52
- **Standalone mode:** When no `.wazir/runs/latest/` exists, review logs go to `docs/plans/` alongside the artifact.
125
+ ## Verify (verify workflow)
126
+
127
+ After all tasks are complete, run deterministic verification:
128
+
129
+ 1. Run the full test suite
130
+ 2. Run type checks (if applicable)
131
+ 3. Run linters
132
+ 4. Verify all acceptance criteria from the spec have evidence
133
+ 5. Produce verification proof at `.wazir/runs/latest/artifacts/verification-proof.md`
134
+
135
+ This is NOT a review loop — it produces proof, not findings. If verification fails, report which criteria lack evidence and offer to fix.
53
136
 
54
137
  ## Context Retrieval
55
138
 
56
139
  - Use `wazir index search-symbols <query>` to locate relevant code before reading
57
140
  - Read full files directly when editing or verifying
58
141
  - Use `wazir recall file <path> --tier L1` for files you need to understand but not modify
142
+ - When dispatching subagents, include: "Use wazir index search-symbols before direct file reads."
143
+
144
+ ## Interaction Mode Awareness
145
+
146
+ Read `interaction_mode` from run-config at the start of execution:
147
+
148
+ - **`auto`:** Skip user checkpoints. On escalation, write reason to `.wazir/runs/<id>/escalations/` and STOP (do not proceed without user). Gating agent evaluates phase reports.
149
+ - **`guided`:** Standard behavior — ask user on escalation, show per-task completion summaries.
150
+ - **`interactive`:** Before implementing each task, briefly describe the approach and ask: "About to implement [task] using [approach] — sound right?" Show more detail in per-task summaries.
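How escalation differs by mode can be sketched as below. This is an assumption-laden illustration: `escalate` is a hypothetical function, the `escalation-<timestamp>.md` file name is invented for the example, and treating `interactive` like `guided` on escalation is inferred rather than stated.

```shell
# Hypothetical sketch of escalation handling per interaction mode.
# The escalation file name is an assumption made for illustration.
escalate() {
  local mode="$1" run_dir="$2" reason="$3"
  case "$mode" in
    auto)
      # auto mode records the reason on disk and stops without asking
      mkdir -p "$run_dir/escalations"
      printf '%s\n' "$reason" >"$run_dir/escalations/escalation-$(date +%s).md"
      echo "stopped"
      ;;
    guided|interactive)
      # both modes hand the decision to the user
      echo "ask-user"
      ;;
    *)
      echo "unknown interaction_mode: $mode" >&2
      return 1
      ;;
  esac
}
```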
59
151
 
60
152
  ## Escalation
61
153
 
@@ -64,13 +156,41 @@ Pause and ask the user when:
64
156
  - Implementation would require unapproved scope change
65
157
  - A task's acceptance criteria can't be met
66
158
 
159
+ When escalating, use this pattern:
160
+
161
+ Ask the user via AskUserQuestion:
162
+ - **Question:** "[Describe the specific blocker or conflict]"
163
+ - **Options:**
164
+ 1. "Adjust the plan to work around the blocker" *(Recommended)*
165
+ 2. "Expand scope to handle the new requirement"
166
+ 3. "Skip this task and continue with the rest"
167
+ 4. "Abort the run"
168
+
169
+ Wait for the user's selection before continuing.
170
+
171
+ ## Reasoning Output
172
+
173
+ Throughout the executor phase, produce reasoning at two layers:
174
+
175
+ **Conversation (Layer 1):** Before each task, explain what you're about to implement and why. After each task, state what would have gone wrong without this task.
176
+
177
+ **File (Layer 2):** Write `.wazir/runs/<id>/reasoning/phase-executor-reasoning.md` with structured entries per implementation decision:
178
+ - **Trigger** — what prompted the decision (e.g., "task spec requires auth middleware")
179
+ - **Options considered** — implementation alternatives
180
+ - **Chosen** — selected approach
181
+ - **Reasoning** — why this approach over alternatives
182
+ - **Confidence** — high/medium/low
183
+ - **Counterfactual** — what would break without this decision
184
+
185
+ Key executor reasoning moments: architecture choices, library selections, API design decisions, test strategy decisions, and any deviation from the plan.
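A Layer 2 entry could be appended with a helper like the one below. The six field names come from the list above; the function name `append_reasoning_entry` and the exact Markdown layout of an entry are assumptions for illustration.

```shell
# Hypothetical helper: append one structured reasoning entry to a file.
# The entry layout is an assumption; only the field names come from the skill.
append_reasoning_entry() {
  local file="$1" trigger="$2" options="$3" chosen="$4"
  local reasoning="$5" confidence="$6" counterfactual="$7"

  # Confidence is restricted to the three documented levels.
  case "$confidence" in
    high|medium|low) ;;
    *) echo "confidence must be high, medium, or low" >&2; return 1 ;;
  esac

  mkdir -p "$(dirname "$file")"
  {
    echo "- **Trigger:** $trigger"
    echo "- **Options considered:** $options"
    echo "- **Chosen:** $chosen"
    echo "- **Reasoning:** $reasoning"
    echo "- **Confidence:** $confidence"
    echo "- **Counterfactual:** $counterfactual"
    echo
  } >>"$file"
}
```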
186
+
67
187
  ## Done
68
188
 
69
- When all tasks are complete, present:
189
+ When all tasks are complete and verified:
70
190
 
71
- > **Execution complete.**
191
+ > **Executor phase complete.**
72
192
  >
73
193
  > - Tasks: [completed]/[total] implemented
74
- > - Artifacts: `.wazir/runs/latest/artifacts/`
194
+ > - Verification: proof at `.wazir/runs/latest/artifacts/verification-proof.md`
75
195
  >
76
- > **Next:** Run `/wazir:reviewer --mode final` to review the changes, or `/wazir` for the full pipeline.
196
+ > **Next:** Run `/reviewer --mode final` to review against the original input.
@@ -5,6 +5,19 @@ description: Use when implementation is complete, all tests pass, and you need t
5
5
 
6
6
  # Finishing a Development Branch
7
7
 
8
+ ## Command Routing
9
+ Follow the Canonical Command Matrix in `hooks/routing-matrix.json`.
10
+ - Large commands (test runners, builds, diffs, dependency trees, linting) → context-mode tools
11
+ - Small commands (git status, ls, pwd, wazir CLI) → native Bash
12
+ - If context-mode unavailable, fall back to native Bash with warning
13
+
14
+ ## Codebase Exploration
15
+ 1. Query `wazir index search-symbols <query>` first
16
+ 2. Use `wazir recall file <path> --tier L1` for targeted reads
17
+ 3. Fall back to direct file reads ONLY for files identified by index queries
18
+ 4. Maximum 10 direct file reads without a justifying index query
19
+ 5. If no index exists: `wazir index build && wazir index summarize --tier all`
20
+
8
21
  ## Overview
9
22
 
10
23
  Guide completion of development work by presenting clear options and handling chosen workflow.
@@ -0,0 +1,260 @@
1
+ ---
2
+ name: wz:gemini-cli
3
+ description: How to use Gemini CLI programmatically for headless reviews, automation, and sandbox operations within Wazir pipelines.
4
+ ---
5
+
6
+ # Gemini CLI Integration
7
+
8
+ ## Command Routing
9
+ Follow the Canonical Command Matrix in `hooks/routing-matrix.json`.
10
+ - Large commands (test runners, builds, diffs, dependency trees, linting) → context-mode tools
11
+ - Small commands (git status, ls, pwd, wazir CLI) → native Bash
12
+ - If context-mode unavailable, fall back to native Bash with warning
13
+
14
+ ## Codebase Exploration
15
+ 1. Query `wazir index search-symbols <query>` first
16
+ 2. Use `wazir recall file <path> --tier L1` for targeted reads
17
+ 3. Fall back to direct file reads ONLY for files identified by index queries
18
+ 4. Maximum 10 direct file reads without a justifying index query
19
+ 5. If no index exists: `wazir index build && wazir index summarize --tier all`
20
+
21
+ Reference for using the Google Gemini CLI in Wazir pipelines. Gemini CLI is an open-source AI agent that uses a ReAct (reason and act) loop with built-in tools and MCP servers to complete tasks directly in your terminal.
22
+
23
+ ## Commands
24
+
25
+ ### gemini (interactive)
26
+
27
+ Launch the interactive TUI for ad-hoc work.
28
+
29
+ ```bash
30
+ gemini
31
+ ```
32
+
33
+ ### Headless Mode (non-interactive)
34
+
35
+ Headless mode is the primary mode for Wazir automation. It is triggered when providing a prompt with the `-p` (or `--prompt`) flag, or when the CLI runs in a non-TTY environment.
36
+
37
+ ```bash
38
+ # Basic headless prompt
39
+ gemini -p "Explain the architecture of this project"
40
+
41
+ # Pipe data from stdin
42
+ git diff main | gemini -p "Review this diff for bugs and security issues"
43
+
44
+ # Chain with other tools
45
+ gemini -p "List all exported functions" --output-format json | jq '.response'
46
+
47
+ # Save output to file
48
+ gemini -p "Summarize the test coverage" > summary.md
49
+ ```
50
+
51
+ **Key flags:**
52
+
53
+ | Flag | Description |
54
+ |------|-------------|
55
+ | `-p, --prompt <PROMPT>` | Run in headless mode; print response to stdout and exit |
56
+ | `-m, --model <MODEL>` | Specify the model to use (alias or full name) |
57
+ | `--output-format json` | Output a single structured JSON object with the complete result |
58
+ | `--output-format stream-json` | Stream real-time JSONL events as they occur |
59
+ | `-s, --sandbox` | Enable sandboxed execution for shell commands and file modifications |
60
+ | `-y, --yolo` | Auto-approve all operations (enables sandbox by default) |
61
+ | `--approval-mode <MODE>` | Set approval mode: `default`, `auto_edit`, `plan`, `yolo` |
62
+ | `--checkpoint` | Enable checkpoint mode for long-running tasks |
63
+
64
+ **Headless mode limitations:**
65
+ - No follow-up questions or continued conversation
66
+ - Cannot authorize tools (including WriteFile) or run shell commands unless `--yolo` is used
67
+ - For tool-using automation, combine `-p` with `--yolo` or `--approval-mode auto_edit`
68
+
69
+ ## Slash Commands
70
+
71
+ | Command | Description |
72
+ |---------|-------------|
73
+ | `/model` | Switch model (Pro, Flash, Auto, or Manual selection) |
74
+ | `/yolo` | Toggle YOLO mode (auto-approve all tool calls) |
75
+ | `/stats` | Show token usage and session statistics |
76
+ | `/export` | Export conversation to Markdown or JSON |
77
+ | `/help` | Display available commands |
78
+ | `/settings` | Open settings editor |
79
+
80
+ ## Approval Modes
81
+
82
+ | Mode | Description |
83
+ |------|-------------|
84
+ | `default` | Prompts for approval on every tool use |
85
+ | `auto_edit` | Auto-approves file reads/writes, still prompts for shell commands |
86
+ | `plan` | Read-only mode; no writes or commands executed |
87
+ | `yolo` | Auto-approves everything; enables sandbox by default |
88
+
89
+ **Enable YOLO mode:**
90
+ - CLI flag: `--yolo` or `-y`
91
+ - Interactive toggle: `Ctrl+Y`
92
+ - Slash command: `/yolo`
93
+ - Environment variable: `GEMINI_YOLO=1`
94
+
95
+ **Granular command auto-approval:** Configure specific commands to run without prompts:
96
+ ```json
97
+ {
98
+ "tools": {
99
+ "shell": {
100
+ "autoApprove": ["git ", "npm test", "ls "]
101
+ }
102
+ }
103
+ }
104
+ ```
105
+
106
+ ## Sandbox Mode
107
+
108
+ Sandboxing isolates shell commands and file modifications from your host system. Disabled by default except when using YOLO mode.
109
+
110
+ **Enable sandbox:**
111
+ - CLI flag: `--sandbox` or `-s`
112
+ - Environment variable: `GEMINI_SANDBOX=1`
113
+ - Automatic with `--yolo` or `--approval-mode=yolo`
114
+
115
+ Sandbox uses a pre-built `gemini-cli-sandbox` Docker image for isolation.
116
+
117
+ **Safety configuration:** Set `requireApprovals: true` in settings to disallow YOLO mode and "Always allow" options entirely.
118
+
119
+ ## Model Selection
120
+
121
+ | Model | Best For | Notes |
122
+ |-------|----------|-------|
123
+ | `gemini-3-pro` | Complex reasoning, coding, multi-step tasks | Latest Pro model |
124
+ | `gemini-3-flash` | Fast responses, lighter tasks | Lower latency |
125
+ | `gemini-3.1-pro-preview` | Cutting-edge features | Rolling preview access |
126
+ | `gemini-2.5-pro` | Legacy stable | Still available |
127
+ | `gemini-2.5-flash` | Legacy fast | Still available |
128
+ | `auto` | Recommended; CLI picks best model per task | Default with Google login |
129
+
130
+ **Select via:**
131
+ - CLI flag: `-m <model>` or `--model <model>`
132
+ - Environment variable: `export GEMINI_MODEL="gemini-3-pro"`
133
+ - Interactive: `/model` slash command
134
+ - Config: `settings.json` model field
135
+
136
+ **Note:** With a Google login (not API key), the CLI may auto-blend Pro and Flash models based on task complexity and system capacity.
137
+
138
+ ## Non-Interactive Usage
139
+
140
+ ### Piping data
141
+
142
+ ```bash
143
+ # Pipe a diff for review
144
+ git diff main | gemini -p "Review this diff for correctness and security"
145
+
146
+ # Pipe file content
147
+ cat src/auth.ts | gemini -p "Find potential bugs in this code"
148
+
149
+ # Multi-file context
150
+ cat src/types.ts src/auth.ts | gemini -p "Are these types used correctly in auth?"
151
+ ```
152
+
153
+ ### Structured output
154
+
155
+ ```bash
156
+ # JSON output (single object)
157
+ gemini -p "List all API endpoints" --output-format json | jq '.response'
158
+
159
+ # Streaming JSONL (real-time events)
160
+ gemini -p "Analyze codebase" --output-format stream-json
161
+ ```
162
+
163
+ ### Output in scripts
164
+
165
+ ```bash
166
+ # Capture to variable
167
+ RESULT=$(gemini -p "What does this function do?" --output-format json | jq -r '.response')
168
+
169
+ # Save to file
170
+ gemini -p "Generate a test plan" > test-plan.md
171
+ ```
172
+
173
+ ## MCP Server Integration
174
+
175
+ Gemini CLI supports MCP servers for extended tool capabilities.
176
+
177
+ **Configuration in `settings.json`:**
178
+ ```json
179
+ {
180
+ "mcpServers": {
181
+ "my-server": {
182
+ "command": "npx",
183
+ "args": ["-y", "@my-org/mcp-server"],
184
+ "env": { "API_KEY": "..." }
185
+ }
186
+ }
187
+ }
188
+ ```
189
+
190
+ **Extensions:** Gemini CLI extensions package prompts, MCP servers, and custom commands into installable bundles via `gemini-extension.json`. Extensions use a secure tool-merge approach where exclusions are combined and inclusions are intersected (most restrictive policy wins).
191
+
192
+ **Tool control:**
193
+ - `includeTools` / `excludeTools` in extension or settings config
194
+ - MCP tools appear alongside built-in tools once configured
195
+
196
+ ## Built-in Tools
197
+
198
+ Gemini CLI includes these tools out of the box:
199
+ - **Google Search grounding** (web search)
200
+ - **File operations** (read, write, list)
201
+ - **Shell commands** (subject to approval mode)
202
+ - **Web fetching** (retrieve URL content)
203
+
204
+ ## Wazir Integration Patterns
205
+
206
+ ### Secondary Review (used by wz:reviewer)
207
+
208
+ ```bash
209
+ GEMINI_MODEL=$(jq -r '.multi_tool.gemini.model // empty' .wazir/state/config.json 2>/dev/null)
210
+ GEMINI_MODEL=${GEMINI_MODEL:-gemini-3-pro}
211
+
212
+ # Review uncommitted changes
213
+ git diff | gemini -m "$GEMINI_MODEL" -p \
214
+ "Review this diff against these acceptance criteria: <criteria>" \
215
+ 2>&1 | tee .wazir/runs/latest/reviews/gemini-review.md
216
+
217
+ # Review a spec or design artifact
218
+ cat artifact.md | gemini -m "$GEMINI_MODEL" -p \
219
+ "Review this spec against these criteria: <criteria>" \
220
+ 2>&1 | tee .wazir/runs/latest/reviews/gemini-review.md
221
+ ```
222
+
223
+ ### Automation with Tool Access
224
+
225
+ When the review needs tool access (e.g., reading additional files for context):
226
+
227
+ ```bash
228
+ gemini -m "$GEMINI_MODEL" --yolo -p \
229
+ "Review the changes in src/auth/ for security issues. Read related test files for context." \
230
+ 2>&1 | tee .wazir/runs/latest/reviews/gemini-review.md
231
+ ```
232
+
233
+ ### Structured Review Output
234
+
235
+ ```bash
236
+ gemini -m "$GEMINI_MODEL" -p \
237
+ "Review this code and return JSON with fields: findings (array), severity, summary" \
238
+ --output-format json | jq '.response' \
239
+ > .wazir/runs/latest/reviews/gemini-review.json
240
+ ```
241
+
242
+ ## Error Handling
243
+
244
+ | Error | Handling |
245
+ |-------|----------|
246
+ | **Non-zero exit** (auth/quota/transport) | Log full stderr, mark pass as `gemini-unavailable`, use self-review only. Next pass re-attempts. |
247
+ | **Timeout** | Wrap with `timeout 120 gemini -p ...`. Treat timeout as `gemini-unavailable`. |
248
+ | **Model unavailable** | Fall back to `gemini-3-flash` if Pro model is overloaded. |
249
+ | **Rate limiting** | Respect backoff. Free-tier users share capacity; API key users have dedicated quota. |
250
+ | **Headless tool denial** | If a headless prompt needs tool access, re-run with `--yolo` or `--approval-mode auto_edit`. |
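The timeout and non-zero-exit rows of the table above can be combined in one wrapper. This is a minimal sketch assuming GNU coreutils `timeout` is installed (it exits with 124 when the limit is hit); `run_review_pass` and its `status=` output format are hypothetical.

```shell
# Hypothetical wrapper: run a reviewer command under a time limit, send its
# output to a log file, and report whether the pass counts as unavailable.
run_review_pass() {
  local limit="$1" log="$2"
  shift 2
  local rc=0
  timeout "$limit" "$@" >"$log" 2>&1 || rc=$?
  if [ "$rc" -eq 0 ]; then
    echo "status=ok"
  elif [ "$rc" -eq 124 ]; then
    # GNU timeout uses exit code 124 to signal that the limit was reached.
    echo "status=gemini-unavailable reason=timeout"
  else
    echo "status=gemini-unavailable reason=exit-$rc"
  fi
}
```

A pass might then be run as `run_review_pass 120 .wazir/runs/latest/reviews/gemini-review.md gemini -p "Review this diff"`, with any `gemini-unavailable` status falling back to self-review for that pass only.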
251
+
252
+ ## Configuration
253
+
254
+ Gemini CLI reads configuration from:
255
+ - `~/.gemini/settings.json` (global)
256
+ - `.gemini/settings.json` in the project root (project-level)
257
+ - Environment variables (`GEMINI_MODEL`, `GEMINI_SANDBOX`, `GEMINI_YOLO`, `GOOGLE_API_KEY`)
258
+ - CLI flags (highest precedence)
259
+
260
+ Key config fields: `model`, `approvalMode`, `sandbox`, `mcpServers`, `tools`, `requireApprovals`.
@@ -7,6 +7,19 @@ description: Use when reviewing or editing any text artifact (specs, plans, code
7
7
 
8
8
  Remove AI writing patterns from text artifacts using a 4-phase corrective pipeline. This skill operates on text that has already been generated. For rules that prevent AI patterns during generation, the composition engine loads expertise modules from `expertise/humanize/` into role context automatically.
9
9
 
10
+ ## Command Routing
11
+ Follow the Canonical Command Matrix in `hooks/routing-matrix.json`.
12
+ - Large commands (test runners, builds, diffs, dependency trees, linting) → context-mode tools
13
+ - Small commands (git status, ls, pwd, wazir CLI) → native Bash
14
+ - If context-mode unavailable, fall back to native Bash with warning
15
+
16
+ ## Codebase Exploration
17
+ 1. Query `wazir index search-symbols <query>` first
18
+ 2. Use `wazir recall file <path> --tier L1` for targeted reads
19
+ 3. Fall back to direct file reads ONLY for files identified by index queries
20
+ 4. Maximum 10 direct file reads without a justifying index query
21
+ 5. If no index exists: `wazir index build && wazir index summarize --tier all`
22
+
10
23
  ## Phase 1: Scan
11
24
 
12
25
  Detect the domain and scan for AI patterns.