specpipe 1.0.0
- package/README.md +1319 -0
- package/bin/devkit.js +3 -0
- package/package.json +61 -0
- package/src/cli.js +76 -0
- package/src/commands/check.js +33 -0
- package/src/commands/diff.js +84 -0
- package/src/commands/init-adopt.js +54 -0
- package/src/commands/init-agents.js +118 -0
- package/src/commands/init-global.js +102 -0
- package/src/commands/init.js +311 -0
- package/src/commands/list.js +54 -0
- package/src/commands/remove.js +133 -0
- package/src/commands/upgrade.js +215 -0
- package/src/lib/agent-guards.js +100 -0
- package/src/lib/agent-install.js +161 -0
- package/src/lib/agents.js +280 -0
- package/src/lib/claude-global.js +183 -0
- package/src/lib/detector.js +93 -0
- package/src/lib/hasher.js +21 -0
- package/src/lib/installer.js +213 -0
- package/src/lib/logger.js +16 -0
- package/src/lib/manifest.js +102 -0
- package/src/lib/reconcile.js +56 -0
- package/templates/.claude/CLAUDE.md +79 -0
- package/templates/.claude/hooks/comment-guard.js +126 -0
- package/templates/.claude/hooks/file-guard.js +216 -0
- package/templates/.claude/hooks/glob-guard.js +104 -0
- package/templates/.claude/hooks/path-guard.sh +118 -0
- package/templates/.claude/hooks/self-review.sh +27 -0
- package/templates/.claude/hooks/sensitive-guard.sh +227 -0
- package/templates/.claude/settings.json +68 -0
- package/templates/docs/WORKFLOW.md +325 -0
- package/templates/docs/specs/.gitkeep +0 -0
- package/templates/hooks/specpipe-read-guard.sh +42 -0
- package/templates/hooks/specpipe-shell-guard.sh +65 -0
- package/templates/rules/specpipe-guards.md +40 -0
- package/templates/scripts/test-hooks.sh +66 -0
- package/templates/skills/sp-build/SKILL.md +776 -0
- package/templates/skills/sp-challenge/SKILL.md +255 -0
- package/templates/skills/sp-commit/SKILL.md +174 -0
- package/templates/skills/sp-explore/SKILL.md +730 -0
- package/templates/skills/sp-fix/SKILL.md +266 -0
- package/templates/skills/sp-humanize/SKILL.md +212 -0
- package/templates/skills/sp-investigate/SKILL.md +648 -0
- package/templates/skills/sp-md-render/SKILL.md +200 -0
- package/templates/skills/sp-md-render/components.md +415 -0
- package/templates/skills/sp-md-render/template.html +283 -0
- package/templates/skills/sp-plan/SKILL.md +947 -0
- package/templates/skills/sp-review/SKILL.md +268 -0
- package/templates/skills/sp-scaffold/SKILL.md +237 -0
- package/templates/skills/sp-scaffold/references/ARCHITECTURE.md.tmpl +228 -0
- package/templates/skills/sp-scaffold/references/DESIGN.md.tmpl +113 -0
- package/templates/skills/sp-scaffold/references/adr/NNNN-template.md +92 -0
- package/templates/skills/sp-scaffold/references/stack-profiles/react.md +36 -0
- package/templates/skills/sp-spec-render/SKILL.md +254 -0
- package/templates/skills/sp-spec-render/components.md +418 -0
- package/templates/skills/sp-spec-render/examples/user-auth.html +749 -0
- package/templates/skills/sp-spec-render/examples/user-auth.md +114 -0
- package/templates/skills/sp-spec-render/template.html +222 -0
- package/templates/skills/sp-voices/SKILL.md +1184 -0
@@ -0,0 +1,1184 @@
---
description: |
  Multi-voice review — orchestrate multiple LLMs (Claude + Codex + others) to
  independently evaluate any input, synthesize consensus and disagreements
  into actionable output.
  Use when asked to "multi-voice review", "second opinion", "ý kiến nhiều mô hình",
  "hỏi nhiều LLM", "ask multiple LLMs", "voices review", or "what do other models think".
  Proactively suggest for high-stakes or controversial decisions — irreversible
  architecture choices, security trade-offs, "are we sure about this design"
  moments — where a single model's confidence is not enough.
  Skip for trivial questions or work where one perspective is sufficient.
  Works on code, specs, plans, ideas, or any text input.
allowed-tools: Read, Bash, Glob, Grep, Write, AskUserQuestion
---
# /sp-voices — Multi-Voice Review

Get independent perspectives from multiple LLMs on anything —
code, ideas, documents, architecture, skills, decisions.

Target: $ARGUMENTS

---

## How It Works

```
1. Understand what you're asking           (Phase 1)
2. Find available reviewers                 (Phase 2)
3. Ask them — open-ended, not templated     (Phase 3)
4. Synthesize their responses               (Phase 4)
5. Show you what matters for YOUR decision  (Phase 5)
```

---

## Phase 1: Understand Intent

Read `$ARGUMENTS`. Don't classify into a box — understand what the user
is trying to DECIDE.

### 1.1 — What is the user trying to decide?

```
Parse $ARGUMENTS for decision intent:

"what do you think about..."  → User wants: opinions + consensus on direction
"review code/diff"            → User wants: bugs, risks, merge/block decision
"check this doc"              → User wants: readiness assessment, gaps
"is this approach ok"         → User wants: validation or alternatives
"any issues with this"        → User wants: risk identification
"compare A vs B"              → User wants: trade-off analysis
"this strategy"               → User wants: go/pivot/stop signal

If unclear → ask 1 question:
  "What decision are you trying to make from this review?"
  Don't ask "what type of review" — ask "what decision".
```
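The phrase-to-intent mapping above could be sketched as a small shell function. This is only an illustration: `classify_intent` and its intent labels (`direction`, `bug_risk`, `readiness`, `validation`, `comparison`, `unclear`) are hypothetical names, not part of specpipe, and in practice the model reads intent from context rather than from fixed patterns. An `unclear` result corresponds to asking the single clarifying question.

```bash
# Hypothetical helper (not part of specpipe): map a request to a decision intent.
classify_intent() {
  local request
  # Lowercase first so the patterns match regardless of case
  request=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$request" in
    *"compare "*|*" vs "*)                          echo "comparison" ;;
    *"review code"*|*"review diff"*|*"any issues"*) echo "bug_risk" ;;
    *"check this doc"*)                             echo "readiness" ;;
    *"is this approach ok"*)                        echo "validation" ;;
    *"what do you think"*|*"strategy"*)             echo "direction" ;;
    *)                                              echo "unclear" ;;  # ask "what decision?"
  esac
}

classify_intent "Compare Redis vs Memcached for session storage"  # prints: comparison
classify_intent "Can you look at this?"                           # prints: unclear
```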

### 1.2 — What material is involved?

```bash
# If $ARGUMENTS points to file(s): read and measure.
# FILE is a placeholder for the path taken from $ARGUMENTS.
MATERIAL=$(cat "$FILE" 2>/dev/null)
LINES=$(echo "$MATERIAL" | wc -l | xargs)
echo "Material: $FILE, $LINES lines"

# If $ARGUMENTS is about git diff
MATERIAL=$(git diff main...HEAD 2>/dev/null)
[ -z "$MATERIAL" ] && MATERIAL=$(git diff HEAD~1 2>/dev/null)

# If $ARGUMENTS is a question/idea (no file)
# Material = the question itself + any referenced context
```

If material > 32KB → chunk by logical sections.
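
A minimal sketch of the size check and one possible chunking strategy follows. Both function names are hypothetical. Reading "32KB" as 32768 bytes and treating level-2 Markdown headings (`## `) as the "logical sections" are assumptions; source code or diffs would need a different boundary (per file or per hunk, for example).

```bash
# Hypothetical helpers (not part of specpipe).
MAX_BYTES=32768   # assumption: "32KB" means 32 * 1024 bytes

# Succeeds (exit 0) when the material passed as $1 is larger than the threshold.
needs_chunking() {
  local bytes
  bytes=$(printf '%s' "$1" | wc -c | tr -d ' ')
  [ "$bytes" -gt "$MAX_BYTES" ]
}

# Reads Markdown on stdin and writes chunk-000.md, chunk-001.md, ... into the
# directory given as $1. A new chunk starts at every "## " heading; text before
# the first heading lands in chunk-000.md.
split_by_heading() {
  local out_dir=$1
  awk -v dir="$out_dir" '
    BEGIN { n = 0 }
    /^## / { n++ }
    {
      file = sprintf("%s/chunk-%03d.md", dir, n)
      print >> file
      close(file)   # avoid keeping many files open on long inputs
    }
  '
}
```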

### 1.3 — Confirm before proceeding

**Always confirm intent in 1 line before spawning voices.**
Include voice count + which voice(s).

```
Simple (1 voice, auto-selected):
  "Asking Perplexity if anyone has solved a similar problem. Ok?"
  "Having Claude review auth.ts for bugs. Ok?"

Medium (2 voices, auto-selected):
  "Getting 2 opinions: Claude (code logic) + Perplexity (security/CVEs). Ok?"
  "Asking GPT (business logic) + Claude (technical feasibility). Ok?"

Complex (N voices, user picks via AskUserQuestion):
  "Complex problem — I'll ask you to pick voices. First, confirm:
   you want to evaluate [intent summary] — correct?"
```

**If user corrects → adjust intent + voice selection.**
**If user says "add voice" or "fewer voices" → adjust.**

---

## Phase 2: Find Reviewers

### 2.1 — Probe Availability

```bash
echo "=== Reviewer availability ==="

# External LLMs
command -v openai &>/dev/null && echo "OPENAI_CLI: available" || \
  ([ -n "$OPENAI_API_KEY" ] && echo "OPENAI_API: key set" || echo "OPENAI: ✗")
# Codex needs binary AND auth (one of: $CODEX_API_KEY, $OPENAI_API_KEY,
# or ${CODEX_HOME:-~/.codex}/auth.json). Binary alone isn't enough.
if command -v codex &>/dev/null; then
  _CODEX_AUTH_FILE="${CODEX_HOME:-$HOME/.codex}/auth.json"
  if [ -n "$CODEX_API_KEY" ] || [ -n "$OPENAI_API_KEY" ] || [ -f "$_CODEX_AUTH_FILE" ]; then
    echo "CODEX_CLI: available"
  else
    echo "CODEX: ✗ (binary present, no auth — run 'codex login')"
  fi
else
  echo "CODEX: ✗"
fi
# Gemini API (generativelanguage / AI Studio) is a hosted REST endpoint — needs
# $GEMINI_API_KEY. NOTE: the standalone `gemini` CLI was retired 2026-06-18 and
# folded into Antigravity CLI (`agy`) — probe that separately below, not here.
[ -n "$GEMINI_API_KEY" ] && echo "GEMINI_API: key set" || echo "GEMINI: ✗"
[ -n "$PERPLEXITY_API_KEY" ] && echo "PERPLEXITY: available" || echo "PERPLEXITY: ✗"
# Antigravity CLI (`agy`) — Google's agentic terminal coding agent, the successor
# to the retired `gemini` CLI. Agentic like Codex: reads code, runs commands.
# Needs binary AND auth (one of: $ANTIGRAVITY_API_KEY, $GEMINI_API_KEY — both
# accepted by agy — or OS-keyring/OAuth state from a prior interactive `agy`
# login under ~/.gemini/antigravity-cli/). Binary alone isn't enough.
if command -v agy &>/dev/null; then
  if [ -n "$ANTIGRAVITY_API_KEY" ] || [ -n "$GEMINI_API_KEY" ] || [ -d "$HOME/.gemini/antigravity-cli" ]; then
    echo "ANTIGRAVITY_CLI: available"
  else
    echo "ANTIGRAVITY: ✗ (binary present, no auth — run 'agy' once to log in)"
  fi
else
  echo "ANTIGRAVITY: ✗"
fi
[ -n "$ANTHROPIC_API_KEY" ] && echo "ANTHROPIC_API: key set" || echo "ANTHROPIC: host only"
command -v ollama &>/dev/null && echo "OLLAMA: available" || echo "OLLAMA: ✗"
command -v claude &>/dev/null && echo "SELF_SPAWN: available" || echo "SELF_SPAWN: ✗"

echo "==========================="
```

### 2.2 — Auth Probe (Tier 1 voices only)

Before building expensive prompts, verify that API keys are actually valid.
A set key does not mean a working key.

```bash
# Lightweight auth probe — only for voices that will be used
# Each probe: small request, < 10 tokens, just check for 401/403

# OpenAI
if [ -n "$OPENAI_API_KEY" ]; then
  _OAI_STATUS=$(curl -s -o /dev/null -w "%{http_code}" \
    -H "Authorization: Bearer $OPENAI_API_KEY" \
    https://api.openai.com/v1/models 2>/dev/null)
  [ "$_OAI_STATUS" = "200" ] && echo "OPENAI_AUTH: valid" || echo "OPENAI_AUTH: FAILED ($_OAI_STATUS)"
fi

# Perplexity — SKIPPED: Perplexity has no free auth-probe endpoint.
# A real chat completion (even max_tokens:1) is billed per request, so probing
# every run wastes money. Trust the key is set; if invalid, the actual review
# call will return 401 and Phase 3.5 (Post-Response Checks) flags it as
# "auth failed". Net cost: 1 wasted real call vs N probe calls per session.
if [ -n "$PERPLEXITY_API_KEY" ]; then
  echo "PERPLEXITY_AUTH: assumed valid (probe skipped — would cost money)"
fi

# Gemini API — use header auth (x-goog-api-key) to keep the key out of URLs/logs.
if [ -n "$GEMINI_API_KEY" ]; then
  _GEM_STATUS=$(curl -s -o /dev/null -w "%{http_code}" \
    -H "x-goog-api-key: $GEMINI_API_KEY" \
    https://generativelanguage.googleapis.com/v1beta/models 2>/dev/null)
  [ "$_GEM_STATUS" = "200" ] && echo "GEMINI_AUTH: valid" || echo "GEMINI_AUTH: FAILED ($_GEM_STATUS)"
fi

# Antigravity CLI — no cheap REST auth-probe endpoint; it authenticates the agent
# harness on first call. If $ANTIGRAVITY_API_KEY / $GEMINI_API_KEY is set or OAuth
# state exists, trust it; a dead key surfaces as an error on the real call
# (Phase 3.5 flags it).
if command -v agy &>/dev/null && { [ -n "$ANTIGRAVITY_API_KEY" ] || [ -n "$GEMINI_API_KEY" ] || [ -d "$HOME/.gemini/antigravity-cli" ]; }; then
  echo "ANTIGRAVITY_AUTH: assumed valid (agent harness — probe skipped)"
fi

# Anthropic
if [ -n "$ANTHROPIC_API_KEY" ]; then
  _ANT_STATUS=$(curl -s -o /dev/null -w "%{http_code}" \
    -H "x-api-key: $ANTHROPIC_API_KEY" \
    -H "anthropic-version: 2023-06-01" \
    https://api.anthropic.com/v1/models 2>/dev/null)
  [ "$_ANT_STATUS" = "200" ] && echo "ANTHROPIC_AUTH: valid" || echo "ANTHROPIC_AUTH: FAILED ($_ANT_STATUS)"
fi
```

If any voice's auth probe returns FAILED:
- Remove it from available voices BEFORE voice selection
- Note in output: "Voice X skipped — auth failed"
- Do NOT waste tokens building a prompt for a dead key
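
One way to apply the rule above is to post-process the probe output before voice selection. The sketch below assumes the probe lines keep the `NAME_AUTH: status` shape printed in 2.2. `filter_voices` and its `USE` / `SKIP` output format are hypothetical, chosen only for illustration.

```bash
# Hypothetical helper (not part of specpipe): read auth-probe output on stdin
# and print one "USE <VOICE>" or "SKIP <VOICE>" line per probed voice.
filter_voices() {
  local line voice status
  while IFS= read -r line; do
    case "$line" in
      *_AUTH:*) ;;          # keep only auth-probe lines
      *) continue ;;        # ignore banners and other output
    esac
    voice=${line%%_AUTH:*}      # "GEMINI_AUTH: FAILED (401)" -> "GEMINI"
    status=${line#*_AUTH: }     # "GEMINI_AUTH: FAILED (401)" -> "FAILED (401)"
    case "$status" in
      FAILED*) echo "SKIP $voice" ;;   # drop before voice selection
      *)       echo "USE $voice" ;;    # "valid" or "assumed valid"
    esac
  done
}

printf 'OPENAI_AUTH: valid\nGEMINI_AUTH: FAILED (401)\n' | filter_voices
# prints:
# USE OPENAI
# SKIP GEMINI
```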

### 2.3 — Reviewer Priority

```
Tier 1 — Different model family (most diverse):
  GPT, Gemini, Perplexity
  → Different training = different perspectives

Tier 2 — Agentic / independent session (reads code, runs commands, or fresh context):
  Codex CLI, Antigravity CLI (`agy`), Anthropic API (different Claude model)
  → Antigravity CLI is Google's agentic terminal agent (successor to the retired
    `gemini` CLI, shut down 2026-06-18) — it actually reads the repo, like Codex.
    Pick the backing model with `agy --model` (Gemini 3.1 Pro, Claude, GPT-OSS —
    depends on plan), so this voice doubles as a Google-family OR cross-family reviewer.
  → Independent context = still valuable

Tier 3 — Local:
  Ollama
  → Free, private, lower capability

Tier 4 — Self-spawn (always available):
  claude --print (fresh context, no conversation history)
  → Inherits the current Claude Code session's model by default
    (override via $MF_VOICES_SELF_SPAWN_MODEL)
  → Same model but fresh eyes — better than nothing
  → MARK in output: "self-spawn — same model family"
```

### Voice Strengths — Who's Good at What

```
┌─────────────────┬──────────────────────────────────────────────────────┐
│ Voice           │ Best For                                             │
├─────────────────┼──────────────────────────────────────────────────────┤
│ Claude          │ Code review, nuanced reasoning, design/architecture, │
│ (Haiku 4.5 /    │ long-context analysis, careful edge case thinking.   │
│  Sonnet 4.6 /   │ Default voice: sonnet-4-6 ($3/$15). Self-spawn       │
│  Opus 4.7)      │ inherits the current Claude Code session's model     │
│                 │ (override via $MF_VOICES_SELF_SPAWN_MODEL — e.g.     │
│                 │ haiku-4-5 $1/$5 for cheap second opinion).           │
│                 │ Bump to opus-4-7 ($5/$25) for                        │
│                 │ hardest reasoning.                                   │
│                 │ Strongest at: code quality, readability, subtle bugs.│
├─────────────────┼──────────────────────────────────────────────────────┤
│ GPT (5-mini /   │ Wide domain knowledge, business logic, product       │
│      5.5)       │ thinking, real-world patterns. Strong at connecting  │
│                 │ technical decisions to business impact.              │
│                 │ Default: gpt-5-mini ($0.25/$2). gpt-5.5 ($5/$30,     │
│                 │ released 2026-04-23) only when top quality matters   │
│                 │ — gpt-5.5 is now pricier than Sonnet 4.6.            │
│                 │ Strongest at: domain expertise, practical tradeoffs. │
├─────────────────┼──────────────────────────────────────────────────────┤
│ Gemini          │ Broad analysis, large context window, multi-modal.   │
│ (3 Flash /      │ Good at synthesizing large documents.                │
│  3.1 Pro)       │ Default: gemini-3-flash ($0.50/$3). Upgrade to       │
│                 │ gemini-3.1-pro-preview ($2/$12, $4/$18 >200k ctx).   │
│                 │ NOTE: gemini-3-pro deprecated 2026-03-09 — calls     │
│                 │ to that model ID will fail. Use 3.1-pro-preview.     │
│                 │ Strongest at: big-picture, cross-cutting concerns.   │
├─────────────────┼──────────────────────────────────────────────────────┤
│ Perplexity      │ Real-time web search. Knows current CVEs, latest     │
│ (sonar /        │ best practices, library health, who solved this      │
│  sonar-pro)     │ problem before. CITES SOURCES.                       │
│                 │ Default: sonar-pro ($3/$15) for citation quality.    │
│                 │ sonar ($1/$1) for cheap quick lookups.               │
│                 │ Strongest at: security, research, current info,      │
│                 │ "is this current best practice?".                    │
│                 │ UNIQUE: only voice with live web access.             │
├─────────────────┼──────────────────────────────────────────────────────┤
│ Antigravity CLI │ Google's agentic terminal agent (`agy`), successor   │
│ (`agy`)         │ to the retired `gemini` CLI (shut down 2026-06-18).  │
│                 │ Agentic like Codex: reads the repo, runs commands.   │
│                 │ Backed by a model via `agy --model` (Gemini 3.1 Pro, │
│                 │ Claude Sonnet/Opus, GPT-OSS — plan-dependent).       │
│                 │ Strongest at: big-picture + actually running code.   │
├─────────────────┼──────────────────────────────────────────────────────┤
│ Codex CLI       │ Agentic — reads code itself, runs commands,          │
│                 │ explores repo structure. Finds things text-only      │
│                 │ review misses because it actually RUNS the code.     │
│                 │ Strongest at: code bugs, runtime behavior,           │
│                 │ "does this actually work when you run it?".          │
├─────────────────┼──────────────────────────────────────────────────────┤
│ Ollama (local)  │ Privacy-sensitive reviews. No data leaves machine.   │
│                 │ Capability varies by model (llama3.3:70b decent).    │
│                 │ Strongest at: private code, air-gapped envs.         │
├─────────────────┼──────────────────────────────────────────────────────┤
│ Self-spawn      │ Always available. Fresh context = no conversation    │
│ (Claude CLI)    │ bias. Same model family = possible blind spots.      │
│                 │ Strongest at: "second pair of eyes" when nothing     │
│                 │ else available.                                      │
└─────────────────┴──────────────────────────────────────────────────────┘
```

### Smart Voice Assignment

Skill selects voices based on intent + voice strengths:

```
Intent: code review
  Best voices: Claude (quality) + Codex (runtime) + Perplexity (CVEs)
  Alt: Claude + GPT (domain logic) + self-spawn

Intent: strategy / business decision
  Best voices: GPT (domain/business) + Claude (reasoning) + Perplexity (research)
  Alt: GPT + Gemini (big-picture) + self-spawn

Intent: research / deep technical topic
  Best voices: Perplexity (current info) + GPT (broad knowledge) + Claude (reasoning)
  Alt: Perplexity + Gemini (large-context synthesis) + self-spawn

Intent: security review
  Best voices: Perplexity (CVEs, advisories) + Claude (logic) + Codex (runtime test)
  Alt: Perplexity + GPT + self-spawn

Intent: architecture / design
  Best voices: Claude (design) + GPT (practical tradeoffs) + Gemini (big picture)
  Alt: Antigravity CLI (reads the repo) + Claude + Perplexity (who solved this before)

Intent: document readiness
  Best voices: Claude (nuance) + GPT (domain) + Perplexity (current standards)
  Alt: Claude + Gemini (logical consistency) + self-spawn

Intent: comparison (A vs B)
  Best voices: Perplexity (research/benchmarks) + GPT (practical) + Claude (reasoning)
  Alt: Gemini (structured comparison) + any 2

Fallback (any intent, limited voices):
  Use whatever is available. Self-spawn as last resort.
  ALWAYS note which voices would be ideal but weren't available.
```
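The selection and fallback rules above can be sketched as a lookup plus an availability filter. Everything named here is hypothetical: `pick_voices`, the short intent keys (`code`, `strategy`, and so on), and the lowercase voice names are illustrative, and the sketch covers only the "Best voices" rows, not the "Alt" combinations. It reports the ideal voices that were missing, as the fallback rule requires.

```bash
# Hypothetical helper (not part of specpipe).
# Usage: pick_voices <intent> "<space-separated available voices>"
pick_voices() {
  local intent=$1 available=" $2 " ideal v picked="" missing=""
  case "$intent" in
    code)         ideal="claude codex perplexity" ;;
    strategy)     ideal="gpt claude perplexity" ;;
    research)     ideal="perplexity gpt claude" ;;
    security)     ideal="perplexity claude codex" ;;
    architecture) ideal="claude gpt gemini" ;;
    readiness)    ideal="claude gpt perplexity" ;;
    comparison)   ideal="perplexity gpt claude" ;;
    *)            ideal="" ;;
  esac
  for v in $ideal; do
    case "$available" in
      *" $v "*) picked="$picked $v" ;;
      *)        missing="$missing $v" ;;   # ideal but unavailable: report it
    esac
  done
  if [ -z "$picked" ]; then
    # Fallback: use whatever is available; self-spawn only as the last resort
    if [ -n "$2" ]; then picked=" $2"; else picked=" self-spawn"; fi
  fi
  echo "use:${picked}"
  echo "missing:${missing}"
}

pick_voices code "claude perplexity ollama"
# prints:
# use: claude perplexity
# missing: codex
```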

### Voice Count — Adaptive, Not Fixed

```
Do NOT default to 3 voices. Voice count depends on complexity.

Simple (clear question, material < 100 lines, straightforward intent):
  → 1 voice — pick BEST FIT for intent, don't ask
  → Example: "any bugs in this?" → spawn Claude (best for code)
  → Example: "has anyone done this before?" → spawn Perplexity (web search)
  → Fast, cheap, enough for simple questions

Medium (material 100-500 lines, a few concerns, clear but nuanced intent):
  → 2 voices — pick 2 best fit, don't ask
  → Example: "review security + logic" → Perplexity (CVEs) + Claude (logic)

Complex (material > 500 lines, multi-faceted, strategy/architecture, high stakes):
  → Ask user to pick voices via AskUserQuestion
  → Suggest combo based on intent + available voices
```

### Complexity Detection

```
Signals for SIMPLE (auto 1 voice):
  - Short, clear question ("any bugs?", "approach ok?")
  - Material < 100 lines or 1 small file
  - User says "quick", "fast", "just one opinion"

Signals for MEDIUM (auto 2 voices):
  - Material 100-500 lines
  - Question has 2+ concerns ("security + performance")
  - User didn't say "quick" but also didn't say "thorough"

Signals for COMPLEX (ask user):
  - Material > 500 lines or multi-file
  - Strategy, architecture, or high-stakes decision
  - User says "thorough", "complete", "multiple perspectives"
  - Disagreements likely (controversial topic, multiple valid approaches)
  - User explicit: "/sp-voices 3" or "/sp-voices full"

When in doubt → treat as MEDIUM (2 voices, don't ask).
```
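A rough sketch of how the size thresholds and wording signals might combine is shown below. `voice_count` is a hypothetical name, and letting explicit wording ("quick", "thorough") override the line count is an assumption, since the text does not rank the signals. Signals that need judgment, such as stakes, controversy, or multi-file scope, are left out. The function prints `1`, `2`, or `ask` (complex: let the user pick via AskUserQuestion).

```bash
# Hypothetical helper (not part of specpipe).
# Usage: voice_count <material_line_count> "<user request>"
voice_count() {
  local lines=$1 request
  request=$(printf '%s' "$2" | tr '[:upper:]' '[:lower:]')
  # Assumption: explicit wording wins over material size
  case "$request" in
    *thorough*|*complete*|*"multiple perspectives"*) echo "ask"; return ;;
    *quick*|*fast*|*"just one opinion"*)             echo 1; return ;;
  esac
  if   [ "$lines" -gt 500 ]; then echo "ask"   # complex
  elif [ "$lines" -lt 100 ]; then echo 1       # simple
  else echo 2                                   # medium, also the "in doubt" default
  fi
}

voice_count 40 "any bugs?"                  # prints: 1
voice_count 250 "review security + logic"   # prints: 2
voice_count 800 "review the auth module"    # prints: ask
```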

---

## Phase 3: Ask Reviewers

### 3.1 — Prompt Construction

**Core principle: ask an open question, not a structured template.**

Every reviewer gets:

```
[Filesystem Boundary — agentic voices only]
        +
[Base Question]
        +
[Bias — light nudge matched to user's intent]
        +
[Material]
```

**Filesystem Boundary — prepend ONLY for agentic voices (Codex CLI, Antigravity
CLI, self-spawn, local agents). Hosted chat APIs (OpenAI, Gemini, Anthropic
Messages, Perplexity) have no file access — the boundary is wasted tokens for
them.**

```
IMPORTANT: Do NOT read or execute any files under ~/.claude/, .claude/,
.cursor/, agents/, .claude/skills/, node_modules/, __pycache__/,
.git/objects/, vendor/, Pods/, DerivedData/, dist/, build/, .next/.
These paths contain skill definitions, build artifacts, or vendored code
meant for a different AI system or tooling — they will waste your time
and pull you off-task. Ignore them completely. Focus only on the content
provided below.
```

**Base Question (same for all voices, all intents):**
```
"Review the following. Be direct, be honest.

- What's wrong or could go wrong?
- What concerns you?
- What would you change?
- What's good and should stay?

Be specific — point to exact locations.
If you see an overall pattern, say it.
If nothing is wrong, say that — don't invent problems.

MATERIAL:
<content>"
```

### 3.2 — Bias Selection (matched to intent, not to type)

Bias is a LIGHT NUDGE — 1-2 sentences appended after base question.
Reviewer can and should go beyond the nudge.

**Choose 3 biases that match the user's DECISION INTENT:**

```
When user wants DIRECTION (go/pivot/stop):
  Bias 1: "Pay special attention to: is this feasible? What's the biggest risk?"
  Bias 2: "Pay special attention to: who benefits? Does this solve a real problem or an imagined one?"
  Bias 3: "Pay special attention to: is there a simpler way to achieve the same goal?"

When user wants VALIDATION (ok or not):
  Bias 1: "Pay special attention to: is this approach on the right track? What's missing?"
  Bias 2: "Pay special attention to: what risks are being overlooked? What failure modes haven't been considered?"
  Bias 3: "Pay special attention to: has anyone solved this problem better already?"

When user wants BUG/RISK FINDING:
  Bias 1: "Pay special attention to: is the code/logic correct? Edge cases?"
  Bias 2: "Pay special attention to: security? How could this be exploited?"
  Bias 3: "Pay special attention to: maintainability? Will the next person understand this?"

When user wants COMPARISON (A vs B):
  Bias 1: "Pay special attention to: where does A beat B? Where does B beat A?"
  Bias 2: "Pay special attention to: risk of each option? Which one fails worse?"
  Bias 3: "Pay special attention to: is there an option C that neither has considered?"

When user wants READINESS CHECK:
  Bias 1: "Pay special attention to: is this ready to use? What's missing?"
  Bias 2: "Pay special attention to: any internal contradictions? Is the logic consistent?"
  Bias 3: "Pay special attention to: can the implementer read this and actually execute?"

When intent doesn't fit above:
  No bias — just base question. Let voices decide what matters.
```

**Every bias ends with:** "But if you see a more important issue, say that instead."
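
Putting 3.1 and 3.2 together, prompt assembly might look like the sketch below. `build_prompt` is a hypothetical name, and `BOUNDARY` and `BASE_QUESTION` hold short stand-in strings rather than the full texts from 3.1. The ordering (boundary for agentic voices only, then base question, then bias plus the closing sentence, then material) follows the layout described above. An empty bias yields the plain base question.

```bash
# Hypothetical helper (not part of specpipe). The two values below are
# placeholders; substitute the full texts from section 3.1.
BOUNDARY="[Filesystem Boundary text]"
BASE_QUESTION="Review the following. Be direct, be honest."
BIAS_SUFFIX="But if you see a more important issue, say that instead."

# Usage: build_prompt <yes|no: agentic voice?> "<bias or empty>" "<material>"
build_prompt() {
  local agentic=$1 bias=$2 material=$3
  if [ "$agentic" = "yes" ]; then
    printf '%s\n\n' "$BOUNDARY"            # agentic voices only
  fi
  printf '%s\n' "$BASE_QUESTION"
  if [ -n "$bias" ]; then
    printf '\n%s %s\n' "$bias" "$BIAS_SUFFIX"
  fi
  printf '\nMATERIAL:\n%s\n' "$material"
}

PROMPT=$(build_prompt no "Pay special attention to: security? How could this be exploited?" "x = 1")
```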

### 3.3 — Special Voice Roles

**Perplexity (when available):**
Always assign to the bias that needs real-time information:
- Security → search CVEs, advisories
- Strategy → search who else solved this
- Research → search current standards, benchmarks
- Comparison → search real-world data

Dedicated system prompt for Perplexity:
```
"You have web search. Use it to find:
- Known vulnerabilities in mentioned libraries/patterns
- Who else solved this problem and how
- Current best practices (not outdated)
- Real benchmarks/case studies if discussing performance
Cite sources for every external claim."
```

**Antigravity CLI (`agy`, when available):** Agentic like Codex — reads the
repo itself and can run commands. Assign to biases that benefit from actually
exploring the code: architecture (dependency graph), big-picture review,
"does this hold up across the whole codebase". Pick the backing model with
`agy --model` (default is plan-dependent; `gemini-3.1-pro` for large-context
reasoning), and pass `--sandbox` so the review stays read-only. Do NOT use for
pure idea/strategy review — wastes agentic tokens, same caveat as Codex.

**Codex CLI (when available):**
Assign to the bias that needs actual code interaction:
- Code review → reads files, traces execution
- Bug hunting → can actually run tests
- Architecture → explores repo structure, dependency graph
Do NOT use for idea/strategy review — overkill, wastes agentic tokens.

### 3.4 — Execute Calls

Each voice call is wrapped in a timeout. If a call hangs, skip it and
continue with remaining voices.

**JSON safety:** every payload is built with `jq -n --arg` so material
containing quotes, newlines, or backslashes cannot break the JSON or
inject extra fields. Never interpolate `$PROMPT` directly into a JSON
string with `'"$PROMPT"'`.

```bash
# Defensive: PROMPT must be set before any voice call
: "${PROMPT:?PROMPT is empty — refusing to call voices}"

# macOS does not ship GNU `timeout` — fall back to `gtimeout` (brew coreutils).
# Without this shim, every voice call below errors with "command not found".
if ! command -v timeout >/dev/null 2>&1; then
  if command -v gtimeout >/dev/null 2>&1; then
    timeout() { command gtimeout "$@"; }
  else
    echo "WARN: neither timeout nor gtimeout found — install coreutils (brew install coreutils)"
    timeout() { shift; "$@"; }  # no-op fallback (no timeout enforcement)
  fi
fi

# Timeout wrapper — use for every voice call
# Usage: voice_call <timeout_seconds> <command...>
voice_call() {
  local _TIMEOUT=$1; shift
  timeout "$_TIMEOUT" "$@" 2>/tmp/voice-err-$$.txt
  local _EXIT=$?
  if [ "$_EXIT" = "124" ]; then
    echo "VOICE_TIMEOUT: call exceeded ${_TIMEOUT}s"
    return 124
  fi
  return $_EXIT
}

# OpenAI GPT (timeout: 60s)
# gpt-5-mini: $0.25/$2.00 per 1M tokens — cheap, strong default for review.
# Upgrade to "gpt-5.5" ($5/$30, released 2026-04-23) only when top quality
# matters — note gpt-5.5 is now more expensive per output than Sonnet 4.6.
# NOTE: GPT-5 family uses `max_completion_tokens`, not `max_tokens` (legacy).
# Sending `max_tokens` to gpt-5* returns HTTP 400.
_PAYLOAD=$(jq -n --arg p "$PROMPT" '{
  model: "gpt-5-mini",
  messages: [{role: "user", content: $p}],
  max_completion_tokens: 4000,
  temperature: 0.3
}')
voice_call 60 curl -s https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d "$_PAYLOAD" | jq -r '.choices[0].message.content'

# Gemini (timeout: 60s)
# gemini-3-flash: $0.50/$3.00 per 1M tokens — cheapest Tier-1 voice.
# Upgrade to "gemini-3.1-pro-preview" ($2.00/$12.00, $4/$18 over 200k ctx)
# for big-picture work on long material.
# NOTE: gemini-3-pro was deprecated/shut down 2026-03-09. Hardcoding
# "gemini-3-pro" returns 404 — always use "gemini-3.1-pro-preview".
_PAYLOAD=$(jq -n --arg p "$PROMPT" '{
  contents: [{parts: [{text: $p}]}],
  generationConfig: {maxOutputTokens: 4000, temperature: 0.3}
}')
voice_call 60 curl -s "https://generativelanguage.googleapis.com/v1beta/models/gemini-3-flash:generateContent" \
  -H "x-goog-api-key: $GEMINI_API_KEY" \
  -H "Content-Type: application/json" \
  -d "$_PAYLOAD" | jq -r '.candidates[0].content.parts[0].text'

# Perplexity (timeout: 90s — web search takes longer)
# sonar-pro: $3/$15 per 1M — keeps citations + deeper search.
# For cheap quick lookups, "sonar" is $1/$1. Use sonar-pro when sources matter.
_PAYLOAD=$(jq -n --arg p "$PROMPT" '{
  model: "sonar-pro",
  messages: [
    {role: "system", content: "You are a reviewer with web search. Search for relevant CVEs, benchmarks, prior art, and current best practices. Cite sources."},
    {role: "user", content: $p}
  ],
  max_tokens: 4000,
  temperature: 0.3
}')
voice_call 90 curl -s https://api.perplexity.ai/chat/completions \
  -H "Authorization: Bearer $PERPLEXITY_API_KEY" \
  -H "Content-Type: application/json" \
  -d "$_PAYLOAD" | jq -r '.choices[0].message.content'

# Anthropic API (timeout: 60s)
# claude-sonnet-4-6: $3/$15 per 1M — main quality voice for code/reasoning.
# For cheap independent second opinion, "claude-haiku-4-5" ($1/$5) works too.
_PAYLOAD=$(jq -n --arg p "$PROMPT" '{
  model: "claude-sonnet-4-6",
|
|
597
|
+
max_tokens: 4000,
|
|
598
|
+
messages: [{role: "user", content: $p}]
|
|
599
|
+
}')
|
|
600
|
+
voice_call 60 curl -s https://api.anthropic.com/v1/messages \
|
|
601
|
+
-H "x-api-key: $ANTHROPIC_API_KEY" \
|
|
602
|
+
-H "content-type: application/json" \
|
|
603
|
+
-H "anthropic-version: 2023-06-01" \
|
|
604
|
+
-d "$_PAYLOAD" | jq -r '.content[0].text'
|
|
605
|
+
|
|
606
|
+
# Codex CLI (timeout: 300s — agentic, reads code itself)
|
|
607
|
+
# Use `codex exec` for free-form review prompts. Critical flags:
|
|
608
|
+
# < /dev/null — prevents stdin deadlock (regression in codex 0.120.x)
|
|
609
|
+
# -C "$_REPO_ROOT" — runs at git root, not random CWD
|
|
610
|
+
# -s read-only — sandbox, codex cannot mutate files
|
|
611
|
+
# -c '...="high"' — explicit reasoning effort (default is too low)
|
|
612
|
+
# --enable web_search_cached — lets codex look up CVEs / current docs
|
|
613
|
+
# For diff-against-main reviews specifically, swap `exec "$PROMPT"` for
|
|
614
|
+
# `review "$PROMPT" --base main` (same other flags).
|
|
615
|
+
_REPO_ROOT=$(git rev-parse --show-toplevel 2>/dev/null || pwd)
|
|
616
|
+
voice_call 300 codex exec "$PROMPT" \
|
|
617
|
+
-C "$_REPO_ROOT" \
|
|
618
|
+
-s read-only \
|
|
619
|
+
-c 'model_reasoning_effort="high"' \
|
|
620
|
+
--enable web_search_cached \
|
|
621
|
+
< /dev/null  # codex stderr is already captured in /tmp/voice-err-$$.txt by voice_call
|
|
622
|
+
|
|
623
|
+
# Antigravity CLI (external timeout 360s — agentic, reads code itself, like Codex)
|
|
624
|
+
# Flags below verified against agy 1.0.9 (`agy --help`):
|
|
625
|
+
# -p "$PROMPT" — alias for --print: run ONE prompt non-interactively
|
|
626
|
+
# --model <id> — backing model; `agy models` lists them. Tested ids use
|
|
627
|
+
# kebab-case: gemini-3.1-pro (also gemini-3.5-flash,
|
|
628
|
+
# claude-sonnet-4-6, claude-opus-4-6, gpt-oss-120b —
|
|
629
|
+
# availability is plan-dependent). Omit for the default.
|
|
630
|
+
# --sandbox — run with terminal restrictions enabled (limits what
|
|
631
|
+
# commands the agent may execute). Use it for a review.
|
|
632
|
+
# --print-timeout — agy's own wait cap (default 5m); the 360s external
|
|
633
|
+
# backstop below is intentionally a bit longer.
|
|
634
|
+
# NOTE: there is NO `-m` short flag (that's not a model alias) and NO
|
|
635
|
+
# `--output-format` flag — agy has no structured/JSON output, parse plain text.
|
|
636
|
+
# Auth: $ANTIGRAVITY_API_KEY or $GEMINI_API_KEY (both accepted), else OS keyring
|
|
637
|
+
# / OAuth from a prior interactive `agy` login.
|
|
638
|
+
#
|
|
639
|
+
# NON-TTY STDOUT DROP: when stdout is not a terminal (command substitution,
|
|
640
|
+
# pipes, CI) agy can SILENTLY drop its final answer and still exit 0. Fix: run
|
|
641
|
+
# under `script` to fake a PTY. `script` arg order differs between macOS (BSD)
|
|
642
|
+
# and Linux (util-linux) — branch on uname. Prompt is passed via $AGY_PROMPT
|
|
643
|
+
# (never interpolated into the command string) so quotes/newlines in the
|
|
644
|
+
# material can't break the `script -qec` command line.
|
|
645
|
+
#
|
|
646
|
+
# Output cleanup (verified live on agy 1.0.9 / macOS):
|
|
647
|
+
# perl -0777 — slurp whole output, then:
|
|
648
|
+
# s/\x1b\[…//g strip ANSI escapes (use perl, NOT sed — BSD/macOS sed
|
|
649
|
+
# does not interpret \x1b)
|
|
650
|
+
# s/\A\^D[\x08]*// drop the literal "^D" + backspaces that BSD `script`
|
|
651
|
+
# echoes for the pty EOF at the very start of the stream
|
|
652
|
+
# tr -d … — remove remaining control bytes, keeping only tab (\011)/newline (\012)
|
|
653
|
+
export AGY_PROMPT="$PROMPT"
|
|
654
|
+
if [ "$(uname)" = "Darwin" ]; then
|
|
655
|
+
voice_call 360 script -q /dev/null \
|
|
656
|
+
agy -p "$AGY_PROMPT" --model gemini-3.1-pro --sandbox \
|
|
657
|
+
| perl -0777 -pe 's/\x1b\[[0-9;]*[A-Za-z]//g; s/\A\^D[\x08]*//' \
|
|
658
|
+
| tr -d '\000-\010\013-\037'
|
|
659
|
+
else
|
|
660
|
+
voice_call 360 script -qec 'agy -p "$AGY_PROMPT" --model gemini-3.1-pro --sandbox' /dev/null \
|
|
661
|
+
| perl -0777 -pe 's/\x1b\[[0-9;]*[A-Za-z]//g; s/\A\^D[\x08]*//' \
|
|
662
|
+
| tr -d '\000-\010\013-\037'
|
|
663
|
+
fi
|
|
664
|
+
unset AGY_PROMPT
|
|
665
|
+
|
|
666
|
+
# Ollama (timeout: 120s — local, can be slow)
|
|
667
|
+
_PAYLOAD=$(jq -n --arg p "$PROMPT" '{
|
|
668
|
+
model: "llama3.3:70b",
|
|
669
|
+
prompt: $p,
|
|
670
|
+
stream: false
|
|
671
|
+
}')
|
|
672
|
+
voice_call 120 curl -s http://localhost:11434/api/generate \
|
|
673
|
+
-d "$_PAYLOAD" | jq -r '.response'
|
|
674
|
+
|
|
675
|
+
# Self-spawn (timeout: 120s)
|
|
676
|
+
# By default, `claude --print` inherits the model from the user's current Claude
|
|
677
|
+
# Code session/config — DO NOT hardcode a model here. Hardcoding silently overrode
|
|
678
|
+
# the user's choice (e.g. forcing Haiku on an Opus session) and made the docs lie.
|
|
679
|
+
# To override for a cheaper second opinion, set MF_VOICES_SELF_SPAWN_MODEL
|
|
680
|
+
# (e.g. claude-haiku-4-5 for $1/$5 per 1M, or claude-sonnet-4-6 for stronger).
|
|
681
|
+
# Note: Claude Code CLI uses --append-system-prompt, NOT --system (would error).
|
|
682
|
+
echo "$PROMPT" | voice_call 120 claude --print \
|
|
683
|
+
--append-system-prompt "You are an independent reviewer. Fresh context. No prior conversation. Be direct." \
|
|
684
|
+
${MF_VOICES_SELF_SPAWN_MODEL:+--model "$MF_VOICES_SELF_SPAWN_MODEL"} 2>/dev/null
|
|
685
|
+
```
|
|
686
|
+
|
|
687
|
+
### 3.5 — Post-Response Checks
|
|
688
|
+
|
|
689
|
+
```
|
|
690
|
+
Rabbit hole: response mentions .claude/, SKILL.md, package-lock.json
|
|
691
|
+
→ Flag "⚠ Voice N got distracted by config files"
|
|
692
|
+
|
|
693
|
+
Empty: response < 100 chars
|
|
694
|
+
→ Flag "Voice N: empty response"
|
|
695
|
+
→ Antigravity CLI specifically: empty output WITH exit 0 = non-TTY stdout drop.
|
|
696
|
+
The `script` PTY wrapper in 3.4 prevents this; if it still happens, the
|
|
697
|
+
wrapper failed (no `script` binary?) — note it, don't silently treat as clean.
|
|
698
|
+
|
|
699
|
+
Timeout: voice_call returned 124
|
|
700
|
+
→ Flag "Voice N: timed out after Xs"
|
|
701
|
+
→ If 2+ voices remaining: continue silently
|
|
702
|
+
→ If only 1 remaining: ask retry/continue/stop
|
|
703
|
+
|
|
704
|
+
Auth error: HTTP 401/403 in response
|
|
705
|
+
→ Flag "Voice N: auth failed"
|
|
706
|
+
→ Should have been caught by auth probe — log as unexpected
|
|
707
|
+
|
|
708
|
+
Rate limit: HTTP 429 in response
|
|
709
|
+
→ Flag "Voice N: rate limited"
|
|
710
|
+
→ If 2+ voices remaining: continue silently
|
|
711
|
+
→ If only 1 remaining: ask retry/continue/stop
|
|
712
|
+
```
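
The checks above could be sketched as a small shell helper. This is an illustrative sketch, not part of the skill: the function name `classify_voice_response` and its output labels are hypothetical, and it assumes the caller already has the exit code from `voice_call`, the HTTP status (for example captured with curl's `-w '%{http_code}'`; pass an empty string for CLI voices), and the response text.

```bash
# Hypothetical helper: classify one voice result following the 3.5 checks.
# Usage: classify_voice_response <exit_code> <http_status> <response_text>
classify_voice_response() {
  local exit_code=$1 http_status=$2 response=$3

  # Timeout: voice_call returned 124
  if [ "$exit_code" = "124" ]; then echo "timeout"; return 0; fi

  # Auth error / rate limit, based on the HTTP status
  case "$http_status" in
    401|403) echo "auth_failed"; return 0 ;;
    429)     echo "rate_limited"; return 0 ;;
  esac

  # Empty: response shorter than 100 characters
  if [ "${#response}" -lt 100 ]; then echo "empty"; return 0; fi

  # Rabbit hole: response mentions config files instead of the material
  case "$response" in
    *.claude/*|*SKILL.md*|*package-lock.json*) echo "rabbit_hole"; return 0 ;;
  esac

  echo "ok"
}
```

The order of the checks is an assumption: transport failures are reported first so that an empty body caused by a timeout is not mislabeled as an empty response.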
|
|
713
|
+
|
|
714
|
+
---
|
|
715
|
+
|
|
716
|
+
## Phase 4: Synthesize
|
|
717
|
+
|
|
718
|
+
### 4.1 — Read All Responses
|
|
719
|
+
|
|
720
|
+
Read each voice's free-form response. Don't impose structure yet.
|
|
721
|
+
Note for each voice:
|
|
722
|
+
- What did they focus on? (may differ from bias — that's fine)
|
|
723
|
+
- What's their overall stance?
|
|
724
|
+
- What specific concerns did they raise?
|
|
725
|
+
- What did they praise?
|
|
726
|
+
|
|
727
|
+
### 4.2 — Find Patterns
|
|
728
|
+
|
|
729
|
+
```
|
|
730
|
+
CONSENSUS: 2+ voices raise same concern or hold same position
|
|
731
|
+
→ Strong signal. Note it.
|
|
732
|
+
|
|
733
|
+
UNIQUE: Only 1 voice raises something
|
|
734
|
+
→ May be specialist insight or false positive
|
|
735
|
+
→ Keep, mark as single-voice
|
|
736
|
+
|
|
737
|
+
DISAGREEMENT: Voices contradict each other
|
|
738
|
+
→ Most valuable data. This is WHERE the decision lives.
|
|
739
|
+
→ Present both sides clearly.
|
|
740
|
+
|
|
741
|
+
SEVERITY (for code/doc findings only — WE assign, not reviewers):
|
|
742
|
+
If material is code or doc:
|
|
743
|
+
→ Parse specific findings, assign CRITICAL/HIGH/MEDIUM/LOW
|
|
744
|
+
→ Based on reviewer language + actual impact
|
|
745
|
+
If material is idea/strategy:
|
|
746
|
+
→ Do NOT use severity — use consensus/disagreement instead
|
|
747
|
+
```
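
The mechanical part of the CONSENSUS / UNIQUE split can be sketched as a tally. This is only a sketch under assumptions: the helper name `label_patterns` and the tab-separated `voice<TAB>concern` input format are hypothetical, and the concerns are assumed to have been normalized to identical strings beforehand. Spotting DISAGREEMENT and assigning severity still require reading the responses.

```bash
# Hypothetical helper: read "voice<TAB>concern" lines on stdin and label each
# concern CONSENSUS (raised by 2+ distinct voices) or UNIQUE (exactly 1 voice).
label_patterns() {
  # sort -u drops repeated voice/concern pairs, so one voice counts once.
  sort -u | awk -F'\t' '
    { count[$2]++ }
    END {
      for (c in count) {
        label = (count[c] >= 2) ? "CONSENSUS" : "UNIQUE"
        print label "\t" c
      }
    }
  ' | sort   # awk iteration order is unspecified, so sort for stable output
}
```

For example, `printf 'A\tsql injection\nB\tsql injection\nA\tnaming\n' | label_patterns` labels "sql injection" as CONSENSUS and "naming" as UNIQUE.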
|
|
748
|
+
|
|
749
|
+
### 4.3 — Identify the Decision Point
|
|
750
|
+
|
|
751
|
+
```
|
|
752
|
+
From patterns, determine: what does the user need to DECIDE?
|
|
753
|
+
|
|
754
|
+
If consensus is clear → decision is easy, show verdict
|
|
755
|
+
If disagreement is clear → decision is hard, show both sides + context
|
|
756
|
+
If all voices say "fine" → confirm clean, move on
|
|
757
|
+
```
|
|
758
|
+
|
|
759
|
+
### 4.4 — Confusion Protocol
|
|
760
|
+
|
|
761
|
+
```
|
|
762
|
+
If during synthesis you discover:
|
|
763
|
+
- Voices are responding to fundamentally different interpretations of the intent
|
|
764
|
+
- A voice raised something that changes the entire framing of the problem
|
|
765
|
+
- Material had a critical ambiguity that voices split on differently
|
|
766
|
+
|
|
767
|
+
→ STOP synthesis. Do not force a verdict.
|
|
768
|
+
→ Name the ambiguity in 1 sentence.
|
|
769
|
+
→ Present the split: "Voice A read this as X, Voice B read this as Y."
|
|
770
|
+
→ Ask the user which framing is correct before continuing.
|
|
771
|
+
|
|
772
|
+
This is rare. Most synthesis proceeds normally.
|
|
773
|
+
```
|
|
774
|
+
|
|
775
|
+
---
|
|
776
|
+
|
|
777
|
+
## Phase 5: Output — Matched to Intent
|
|
778
|
+
|
|
779
|
+
### Core Rule
|
|
780
|
+
|
|
781
|
+
```
|
|
782
|
+
Chat output is optimized for DECISIONS — not information.
|
|
783
|
+
Max 20 lines in chat. Full details in file.
|
|
784
|
+
|
|
785
|
+
The "→ docs/voices/<file>.md" footer in the templates below is CONDITIONAL —
|
|
786
|
+
include it ONLY when a report file was actually written (see "Report File —
|
|
787
|
+
Save on Demand" below). For unsaved chat-only reviews, OMIT that line.
|
|
788
|
+
```
|
|
789
|
+
|
|
790
|
+
### Completion Status
|
|
791
|
+
|
|
792
|
+
After synthesis, assign 1 of 4 statuses. Status appears on the first line
|
|
793
|
+
of chat output, right after the target name.
|
|
794
|
+
|
|
795
|
+
```
|
|
796
|
+
DONE — All voices responded, synthesis is clear, user has enough data to decide.
|
|
797
|
+
|
|
798
|
+
DONE_WITH_CONCERNS — Synthesis complete but:
|
|
799
|
+
• Voices disagree on an important point (not just minor)
|
|
800
|
+
• 1+ voices flagged a risk that the others didn't mention
|
|
801
|
+
• Self-spawn only → same model family bias
|
|
802
|
+
• 100% consensus on a complex topic → possible shared blind spot
|
|
803
|
+
→ List each concern, 1 line each.
|
|
804
|
+
|
|
805
|
+
BLOCKED — Cannot produce meaningful output:
|
|
806
|
+
• All voices failed (timeout/auth/empty)
|
|
807
|
+
• Material unreadable or too large even after chunking
|
|
808
|
+
• Intent still unclear after already asking once
|
|
809
|
+
→ State clearly: blocked because of what, what was tried, what user should do next.
|
|
810
|
+
|
|
811
|
+
NEEDS_CONTEXT — Missing important info discovered MID-workflow:
|
|
812
|
+
• Voice A asked "what auth does this use?" but material doesn't say
|
|
813
|
+
• Voices disagree because of an unstated assumption
|
|
814
|
+
→ State clearly: what's needed, from whom, to unlock which decision.
|
|
815
|
+
```
|
|
816
|
+
|
|
817
|
+
If BLOCKED or NEEDS_CONTEXT → do NOT output synthesis.
|
|
818
|
+
Only output status + reason + next step.
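
As a rough sketch, the status choice can be reduced to three counts. Everything here is an assumption for illustration: the helper name `completion_status`, its arguments, and the order of the checks are hypothetical. It covers only the "all voices failed" case of BLOCKED; unreadable material or unclear intent would have to be detected separately.

```bash
# Hypothetical helper: pick a completion status from simple counts.
# Usage: completion_status <voices_ok> <missing_context_items> <concerns>
completion_status() {
  local voices_ok=$1 missing_context=$2 concerns=$3
  if [ "$voices_ok" -eq 0 ]; then
    echo "BLOCKED"              # every voice failed (timeout/auth/empty)
  elif [ "$missing_context" -gt 0 ]; then
    echo "NEEDS_CONTEXT"        # important info is missing mid-workflow
  elif [ "$concerns" -gt 0 ]; then
    echo "DONE_WITH_CONCERNS"   # list each concern, 1 line each
  else
    echo "DONE"
  fi
}
```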
|
|
819
|
+
|
|
820
|
+
### Output adapts to what voices actually said — not to a pre-set template.
|
|
821
|
+
|
|
822
|
+
But there are structural patterns for common intents:
|
|
823
|
+
|
|
824
|
+
---
|
|
825
|
+
|
|
826
|
+
**When user wanted DIRECTION:**
|
|
827
|
+
|
|
828
|
+
```
|
|
829
|
+
/sp-voices — <target> STATUS: <status>
|
|
830
|
+
══════════════════════════════════════════
|
|
831
|
+
Voices: <N> (<names>)
|
|
832
|
+
|
|
833
|
+
VERDICT: <GO | PIVOT | STOP | SPLIT>
|
|
834
|
+
|
|
835
|
+
✅ Consensus:
|
|
836
|
+
• <what all voices agree on>
|
|
837
|
+
• <what all voices agree on>
|
|
838
|
+
|
|
839
|
+
❌ Disagreements:
|
|
840
|
+
• <topic> — A: <position> / B: <position>
|
|
841
|
+
|
|
842
|
+
💡 Insight:
|
|
843
|
+
• <notable observation — 1 voice>
|
|
844
|
+
|
|
845
|
+
→ docs/voices/<file>.md
|
|
846
|
+
══════════════════════════════════════════
|
|
847
|
+
```
|
|
848
|
+
|
|
849
|
+
---
|
|
850
|
+
|
|
851
|
+
**When user wanted VALIDATION:**
|
|
852
|
+
|
|
853
|
+
```
|
|
854
|
+
/sp-voices — <target> STATUS: <status>
|
|
855
|
+
══════════════════════════════════════════
|
|
856
|
+
Voices: <N> (<names>)
|
|
857
|
+
|
|
858
|
+
ASSESSMENT: <SOLID | HAS GAPS | RETHINK>
|
|
859
|
+
|
|
860
|
+
✅ Validated:
|
|
861
|
+
• <aspects voices confirm are good>
|
|
862
|
+
|
|
863
|
+
🔴 Must address:
|
|
864
|
+
• <gaps/risks voices agree are blocking>
|
|
865
|
+
|
|
866
|
+
🟡 Consider:
|
|
867
|
+
• <concerns raised but not blocking>
|
|
868
|
+
|
|
869
|
+
→ docs/voices/<file>.md
|
|
870
|
+
══════════════════════════════════════════
|
|
871
|
+
```
|
|
872
|
+
|
|
873
|
+
---
|
|
874
|
+
|
|
875
|
+
**When user wanted BUG/RISK FINDING (code review):**
|
|
876
|
+
|
|
877
|
+
```
|
|
878
|
+
/sp-voices — <target> STATUS: <status>
|
|
879
|
+
══════════════════════════════════════════
|
|
880
|
+
Voices: <N> (<names>)
|
|
881
|
+
|
|
882
|
+
GATE: <PASS | FAIL — N blocking>
|
|
883
|
+
|
|
884
|
+
🔴 Blocking:
|
|
885
|
+
[C1] <summary> — <file:line>
|
|
886
|
+
[H1] <summary> — <file:line> (consensus)
|
|
887
|
+
|
|
888
|
+
⚠️ Non-blocking:
|
|
889
|
+
[H2] <summary> — <file:line>
|
|
890
|
+
|
|
891
|
+
🔵 Disagreements:
|
|
892
|
+
[D1] <topic> — <file:line>
|
|
893
|
+
|
|
894
|
+
→ docs/voices/<file>.md
|
|
895
|
+
══════════════════════════════════════════
|
|
896
|
+
```
|
|
897
|
+
|
|
898
|
+
---
|
|
899
|
+
|
|
900
|
+
**When user wanted COMPARISON:**
|
|
901
|
+
|
|
902
|
+
```
|
|
903
|
+
/sp-voices — <A> vs <B> STATUS: <status>
|
|
904
|
+
══════════════════════════════════════════
|
|
905
|
+
Voices: <N> (<names>)
|
|
906
|
+
|
|
907
|
+
LEAN: <A | B | DEPENDS | NO CLEAR WINNER>
|
|
908
|
+
|
|
909
|
+
Option A:
|
|
910
|
+
✅ <strengths voices agree on>
|
|
911
|
+
❌ <weaknesses voices agree on>
|
|
912
|
+
|
|
913
|
+
Option B:
|
|
914
|
+
✅ <strengths voices agree on>
|
|
915
|
+
❌ <weaknesses voices agree on>
|
|
916
|
+
|
|
917
|
+
🔵 Disagreements:
|
|
918
|
+
• <where voices pick different sides>
|
|
919
|
+
|
|
920
|
+
💡 Option C (if any voice proposed one):
|
|
921
|
+
• <alternative approach>
|
|
922
|
+
|
|
923
|
+
→ docs/voices/<file>.md
|
|
924
|
+
══════════════════════════════════════════
|
|
925
|
+
```
|
|
926
|
+
|
|
927
|
+
---
|
|
928
|
+
|
|
929
|
+
**When user wanted READINESS CHECK:**
|
|
930
|
+
|
|
931
|
+
```
|
|
932
|
+
/sp-voices — <target> STATUS: <status>
|
|
933
|
+
══════════════════════════════════════════
|
|
934
|
+
Voices: <N> (<names>)
|
|
935
|
+
|
|
936
|
+
READY: <YES | NOT YET — N items | MAJOR ISSUES>
|
|
937
|
+
|
|
938
|
+
🔴 Fix before using:
|
|
939
|
+
• <blocking issue + location>
|
|
940
|
+
|
|
941
|
+
🟡 Should fix:
|
|
942
|
+
• <non-blocking issue>
|
|
943
|
+
|
|
944
|
+
✅ Already good:
|
|
945
|
+
• <what voices confirm is ready>
|
|
946
|
+
|
|
947
|
+
→ docs/voices/<file>.md
|
|
948
|
+
══════════════════════════════════════════
|
|
949
|
+
```
|
|
950
|
+
|
|
951
|
+
---
|
|
952
|
+
|
|
953
|
+
**When intent doesn't fit patterns above:**
|
|
954
|
+
|
|
955
|
+
```
|
|
956
|
+
/sp-voices — <target> STATUS: <status>
|
|
957
|
+
══════════════════════════════════════════
|
|
958
|
+
Voices: <N> (<names>)
|
|
959
|
+
|
|
960
|
+
✅ Consensus:
|
|
961
|
+
• <what voices agree on>
|
|
962
|
+
|
|
963
|
+
❌ Disagreements:
|
|
964
|
+
• <where voices differ>
|
|
965
|
+
|
|
966
|
+
💡 Notable:
|
|
967
|
+
• <unique insights>
|
|
968
|
+
|
|
969
|
+
→ docs/voices/<file>.md
|
|
970
|
+
══════════════════════════════════════════
|
|
971
|
+
```
|
|
972
|
+
|
|
973
|
+
---
|
|
974
|
+
|
|
975
|
+
**DONE_WITH_CONCERNS example** (status details appear between status line and verdict):
|
|
976
|
+
|
|
977
|
+
```
|
|
978
|
+
/sp-voices — auth.ts refactor STATUS: DONE_WITH_CONCERNS
|
|
979
|
+
══════════════════════════════════════════
|
|
980
|
+
⚠ Concerns:
|
|
981
|
+
• Self-spawn only — same model family, possible blind spots
|
|
982
|
+
• 100% consensus on complex topic — verify independently
|
|
983
|
+
|
|
984
|
+
Voices: 2 (Claude self-spawn, Claude self-spawn)
|
|
985
|
+
|
|
986
|
+
GATE: PASS
|
|
987
|
+
...
|
|
988
|
+
```
|
|
989
|
+
|
|
990
|
+
---
|
|
991
|
+
|
|
992
|
+
### Report File — Save on Demand, Not Always
|
|
993
|
+
|
|
994
|
+
```
|
|
995
|
+
Do NOT auto-save files. Auto-saving wastes tokens on file writing + formatting.
|
|
996
|
+
|
|
997
|
+
When to suggest save:
|
|
998
|
+
- 3+ voices, many disagreements → complex, worth saving
|
|
999
|
+
- User says "save"
|
|
1000
|
+
- Many findings (> 5 CRITICAL+HIGH for code, > 3 disagreements for ideas)
|
|
1001
|
+
|
|
1002
|
+
When NOT to suggest save:
|
|
1003
|
+
- Quick review, 2 voices, clear consensus → chat output is enough
|
|
1004
|
+
- User says "quick" or "fast"
|
|
1005
|
+
- Simple yes/no validation
|
|
1006
|
+
|
|
1007
|
+
If save needed → include in next-action options.
|
|
1008
|
+
If not needed → chat output is sufficient, user can copy if they want.
|
|
1009
|
+
```
|
|
1010
|
+
|
|
1011
|
+
File format when saving (`docs/voices/<date>-<target>.md`):
|
|
1012
|
+
|
|
1013
|
+
```markdown
|
|
1014
|
+
# /sp-voices — <target>
|
|
1015
|
+
Date: <date>
|
|
1016
|
+
Voices: <list>
|
|
1017
|
+
Intent: <what user was deciding>
|
|
1018
|
+
Status: <DONE | DONE_WITH_CONCERNS | ...>
|
|
1019
|
+
|
|
1020
|
+
## Summary
|
|
1021
|
+
<same as chat output>
|
|
1022
|
+
|
|
1023
|
+
## Voice A (<model>) — Full Response
|
|
1024
|
+
<verbatim>
|
|
1025
|
+
|
|
1026
|
+
## Voice B (<model>) — Full Response
|
|
1027
|
+
<verbatim>
|
|
1028
|
+
|
|
1029
|
+
## Synthesis Notes
|
|
1030
|
+
- Consensus: <list>
|
|
1031
|
+
- Disagreements: <list>
|
|
1032
|
+
- Unique insights: <list>
|
|
1033
|
+
|
|
1034
|
+
## META
|
|
1035
|
+
| Voice | Model | Bias | Tokens | Cost |
|
|
1036
|
+
|-------|-------|------|--------|------|
|
|
1037
|
+
| A | ... | ... | N | ~$X |
|
|
1038
|
+
Agreement rate: N%
|
|
1039
|
+
Limitations: <if any>
|
|
1040
|
+
```
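
One possible way to build the `docs/voices/<date>-<target>.md` path is sketched below. The helper name `report_path`, the ISO date format (`date +%F`), and the slug rules (lowercase, non-alphanumerics collapsed to `-`) are assumptions; the skill itself does not prescribe them.

```bash
# Hypothetical helper: build the report path for a given target.
# Usage: report_path <target> [date]   (date defaults to today, YYYY-MM-DD)
report_path() {
  local target=$1 date_str=${2:-$(date +%F)}
  local slug
  # Lowercase, squeeze every run of non-alphanumerics into one "-",
  # then trim a leading or trailing "-".
  slug=$(printf '%s' "$target" \
    | tr '[:upper:]' '[:lower:]' \
    | tr -cs 'a-z0-9' '-' \
    | sed 's/^-//; s/-$//')
  echo "docs/voices/${date_str}-${slug}.md"
}
```

With these rules, `report_path "auth.ts refactor" 2026-01-02` yields `docs/voices/2026-01-02-auth-ts-refactor.md`.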
|
|
1041
|
+
|
|
1042
|
+
---
|
|
1043
|
+
|
|
1044
|
+
## After Output: Next Action
|
|
1045
|
+
|
|
1046
|
+
After showing chat summary, ask what's next.
|
|
1047
|
+
Options adapt based on complexity + output + status.
|
|
1048
|
+
|
|
1049
|
+
**DONE — Simple review (clear consensus, few findings):**
|
|
1050
|
+
```json
|
|
1051
|
+
{
|
|
1052
|
+
"questions": [{
|
|
1053
|
+
"question": "/sp-voices done.",
|
|
1054
|
+
"header": "What next?",
|
|
1055
|
+
"multiSelect": false,
|
|
1056
|
+
"options": [
|
|
1057
|
+
{"label": "Act on it — proceed with recommendation"},
|
|
1058
|
+
{"label": "Drill down — details on specific point"},
|
|
1059
|
+
{"label": "Done — I have what I need"}
|
|
1060
|
+
]
|
|
1061
|
+
}]
|
|
1062
|
+
}
|
|
1063
|
+
```
|
|
1064
|
+
|
|
1065
|
+
**DONE_WITH_CONCERNS or complex review (disagreements, many findings, CRITICAL items):**
|
|
1066
|
+
```json
|
|
1067
|
+
{
|
|
1068
|
+
"questions": [{
|
|
1069
|
+
"question": "/sp-voices done. [N] disagreements, [N] critical findings.",
|
|
1070
|
+
"header": "What next?",
|
|
1071
|
+
"multiSelect": false,
|
|
1072
|
+
"options": [
|
|
1073
|
+
{"label": "Drill down — details on specific point"},
|
|
1074
|
+
{"label": "Resolve disagreement — get tiebreaker voice"},
|
|
1075
|
+
{"label": "Save full report — docs/voices/ for reference"},
|
|
1076
|
+
{"label": "Fix now — address critical items"},
|
|
1077
|
+
{"label": "More voices — add external LLM for diversity"},
|
|
1078
|
+
{"label": "Done — I'll decide myself"}
|
|
1079
|
+
]
|
|
1080
|
+
}]
|
|
1081
|
+
}
|
|
1082
|
+
```
|
|
1083
|
+
|
|
1084
|
+
**Self-spawn only (limited diversity):**
|
|
1085
|
+
```json
|
|
1086
|
+
{
|
|
1087
|
+
"questions": [{
|
|
1088
|
+
"question": "/sp-voices done (self-spawn only — same model family).",
|
|
1089
|
+
"header": "What next?",
|
|
1090
|
+
"multiSelect": false,
|
|
1091
|
+
"options": [
|
|
1092
|
+
{"label": "Good enough — proceed"},
|
|
1093
|
+
{"label": "Get real diversity — add external LLM (GPT/Gemini/Perplexity)"},
|
|
1094
|
+
{"label": "Drill down — details on specific point"},
|
|
1095
|
+
{"label": "Done"}
|
|
1096
|
+
]
|
|
1097
|
+
}]
|
|
1098
|
+
}
|
|
1099
|
+
```
|
|
1100
|
+
|
|
1101
|
+
**BLOCKED:**
|
|
1102
|
+
```json
|
|
1103
|
+
{
|
|
1104
|
+
"questions": [{
|
|
1105
|
+
"question": "/sp-voices BLOCKED — [reason].",
|
|
1106
|
+
"header": "What next?",
|
|
1107
|
+
"multiSelect": false,
|
|
1108
|
+
"options": [
|
|
1109
|
+
{"label": "Retry — try again with same voices"},
|
|
1110
|
+
{"label": "Different voices — switch to available alternatives"},
|
|
1111
|
+
{"label": "Abort — I'll handle this manually"}
|
|
1112
|
+
]
|
|
1113
|
+
}]
|
|
1114
|
+
}
|
|
1115
|
+
```
|
|
1116
|
+
|
|
1117
|
+
**NEEDS_CONTEXT:**
|
|
1118
|
+
```json
|
|
1119
|
+
{
|
|
1120
|
+
"questions": [{
|
|
1121
|
+
"question": "/sp-voices needs context — [what's missing].",
|
|
1122
|
+
"header": "What next?",
|
|
1123
|
+
"multiSelect": false,
|
|
1124
|
+
"options": [
|
|
1125
|
+
{"label": "Provide context — I'll answer now"},
|
|
1126
|
+
{"label": "Continue anyway — work with what you have"},
|
|
1127
|
+
{"label": "Abort — I'll come back later"}
|
|
1128
|
+
]
|
|
1129
|
+
}]
|
|
1130
|
+
}
|
|
1131
|
+
```
|
|
1132
|
+
|
|
1133
|
+
---
|
|
1134
|
+
|
|
1135
|
+
## Drill Down (on demand)
|
|
1136
|
+
|
|
1137
|
+
After summary, user can request details:
|
|
1138
|
+
|
|
1139
|
+
```
|
|
1140
|
+
"details on [topic]" → show relevant voice quotes + context
|
|
1141
|
+
"what did voice A say" → show Voice A full response
|
|
1142
|
+
"why the disagreement on [X]" → show both positions + reasoning
|
|
1143
|
+
"any sources?" (Perplexity) → show citations from Perplexity response
|
|
1144
|
+
```
|
|
1145
|
+
|
|
1146
|
+
Each drill-down = 1 focused response, not full report dump.
|
|
1147
|
+
|
|
1148
|
+
---
|
|
1149
|
+
|
|
1150
|
+
## Adaptive Sizing
|
|
1151
|
+
|
|
1152
|
+
```
|
|
1153
|
+
Simple (< 100 lines, clear question):
|
|
1154
|
+
1 voice, best-fit for intent
|
|
1155
|
+
Compact output, no file save
|
|
1156
|
+
Total cost: ~$0.01-0.05
|
|
1157
|
+
|
|
1158
|
+
Medium (100-500 lines, 2+ concerns):
|
|
1159
|
+
2 voices, auto-selected
|
|
1160
|
+
Standard output, file save on request
|
|
1161
|
+
Total cost: ~$0.05-0.15
|
|
1162
|
+
|
|
1163
|
+
Complex (> 500 lines, multi-faceted, high stakes):
|
|
1164
|
+
N voices (user picks via AskUserQuestion)
|
|
1165
|
+
Full output + suggest file save
|
|
1166
|
+
Total cost: depends on N voices + models
|
|
1167
|
+
```
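
The size thresholds above can be sketched as a lookup on line count. The helper name `sizing_tier` and its `tier:voices` output format are hypothetical, and the sketch looks only at material length; the other signals in the table (clarity of the question, number of concerns, stakes) still need judgment.

```bash
# Hypothetical helper: map material size (in lines) to a sizing tier.
# Prints "<tier>:<voices>", where "ask" means the user picks N voices.
sizing_tier() {
  local lines=$1
  if [ "$lines" -lt 100 ]; then
    echo "simple:1"       # < 100 lines: 1 voice, compact output
  elif [ "$lines" -le 500 ]; then
    echo "medium:2"       # 100-500 lines: 2 voices, auto-selected
  else
    echo "complex:ask"    # > 500 lines: user picks N voices
  fi
}
```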
|
|
1168
|
+
|
|
1169
|
+
---
|
|
1170
|
+
|
|
1171
|
+
## Rules
|
|
1172
|
+
|
|
1173
|
+
1. **Understand intent first.** Don't classify — understand what decision the user faces.
|
|
1174
|
+
2. **Confirm before spawning.** 1 line: what voices will look at, and from what angle.
|
|
1175
|
+
3. **Bias matches intent, not type.** Strategy question → strategy biases. Code question → code biases.
|
|
1176
|
+
4. **Open prompts, no templates.** Reviewers think freely. We structure after.
|
|
1177
|
+
5. **Output for decision, not information.** Chat max 20 lines. Details in file.
|
|
1178
|
+
6. **Don't resolve disagreements.** Present both sides. User decides.
|
|
1179
|
+
7. **Consensus ≠ correct.** All voices can share blind spots. Note when agreement is 100%.
|
|
1180
|
+
8. **Findings must be specific.** Location, not vibes.
|
|
1181
|
+
9. **Perplexity → web-grounded role.** When available, assign it to the bias that benefits most from live search.
|
|
1182
|
+
10. **Graceful degradation.** 1 voice fails → continue. 0 succeed → BLOCKED.
|
|
1183
|
+
11. **Probe before prompting.** Verify auth before building expensive prompts. Dead keys waste tokens.
|
|
1184
|
+
12. **Timeout everything.** Every voice call gets a timeout. A hanging call must never block the entire review.
|