discoclaw 0.5.6 → 0.5.7
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.context/voice.md +2 -0
- package/.env.example +6 -3
- package/.env.example.full +6 -3
- package/dist/config.js +2 -1
- package/dist/discord/actions-config.js +76 -8
- package/dist/discord/actions-config.test.js +125 -0
- package/dist/discord/actions-spawn.js +2 -1
- package/dist/discord/actions-spawn.test.js +4 -0
- package/dist/discord/actions.js +3 -1
- package/dist/discord/actions.test.js +1 -0
- package/dist/discord/models-command.js +3 -2
- package/dist/discord/reaction-prompts.js +2 -0
- package/dist/discord/reaction-prompts.test.js +5 -0
- package/dist/index.js +154 -80
- package/dist/runtime/gemini-cli.test.js +4 -4
- package/dist/runtime/gemini-rest.js +167 -0
- package/dist/runtime/gemini-rest.test.js +201 -0
- package/dist/runtime/model-tiers.js +17 -0
- package/dist/runtime/strategies/gemini-strategy.js +1 -3
- package/dist/runtime-overrides.js +2 -0
- package/dist/runtime-overrides.test.js +16 -0
- package/dist/voice/audio-pipeline.js +20 -1
- package/dist/voice/conversation-buffer.js +71 -0
- package/dist/voice/conversation-buffer.test.js +142 -0
- package/dist/voice/stt-deepgram.js +16 -0
- package/dist/voice/stt-deepgram.test.js +74 -1
- package/dist/voice/voice-action-flags.js +1 -1
- package/dist/voice/voice-action-flags.test.js +13 -2
- package/dist/voice/voice-prompt-builder.js +172 -0
- package/dist/voice/voice-prompt-builder.test.js +311 -0
- package/dist/voice/voice-responder.js +13 -1
- package/dist/voice/voice-responder.test.js +49 -2
- package/package.json +1 -1
package/.context/voice.md
CHANGED

```diff
@@ -28,6 +28,7 @@ Two native npm packages power the Discord voice integration:
 | `src/voice/presence-handler.ts` | Auto-join/leave on `voiceStateUpdate` (allowlisted users only) |
 | `src/voice/transcript-mirror.ts` | Posts user transcriptions and bot responses to a text channel |
 | `src/voice/voice-action-flags.ts` | Restricted action subset for voice invocations (messaging + tasks + memory only) |
+| `src/voice/conversation-buffer.ts` | Per-guild conversation ring buffer (10 turns) — stores user/model exchanges in memory; backfills from voice-log channel on join |
 | `src/discord/actions-voice.ts` | Discord action types: `voiceJoin`, `voiceLeave`, `voiceStatus`, `voiceMute`, `voiceDeafen` |
 
 ## Audio Data Flow
@@ -52,6 +53,7 @@ User speaks in Discord voice channel
 - **Dual-flag voice actions** — Voice action execution requires both `VOICE_ENABLED` and `DISCORD_ACTIONS_VOICE`. The `buildVoiceActionFlags()` function intersects a voice-specific allowlist (messaging, tasks, memory) with env config; all other action categories are hard-disabled.
 - **Generation-based cancellation** — `VoiceResponder` increments a generation counter on each new transcription. If a newer transcription arrives mid-pipeline, the older one is silently abandoned.
 - **Barge-in** — Gated on a non-empty STT transcription result, not the raw VAD `speaking.start` event. Echo from the bot's own TTS leaking through the user's mic produces empty transcriptions and is ignored. Only when `VoiceResponder.handleTranscription()` receives a non-empty transcript while the player is active does it stop playback and advance the generation counter. This eliminates false positives from echo without relying on a static grace-period timeout.
+- **Conversation ring buffer** — `ConversationBuffer` maintains a per-guild 10-turn ring buffer of user/model exchanges that gets injected into the voice prompt as formatted conversation history. Turns are appended live during a session. On voice join, the buffer backfills from recent voice-log channel messages so context carries across disconnects. The buffer is cleared when the bot leaves the voice channel.
 - **Re-entrancy guard** — `AudioPipelineManager.startPipeline` uses a `starting` set because `VoiceConnection.subscribe()` synchronously fires a Ready state change.
 - **Error containment** — `VoiceConnectionManager` catches connection errors and destroys the connection to prevent process crashes (e.g. DAVE handshake failures).
 - **Deepgram TTS 2000-char limit** — Deepgram Aura REST TTS returns HTTP 413 (silent failure) for inputs exceeding ~2000 characters. `tts-deepgram.ts` truncates the input to 2000 chars before sending to prevent silent audio dropouts. If the AI response is unexpectedly long (e.g. from a missing `VOICE_STYLE_INSTRUCTION`), the user will still hear a truncated response rather than silence.
```
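The ring-buffer behaviour described in the notes above can be sketched as follows. This is a minimal illustrative model, not the actual `ConversationBuffer` class; the class name, method names, and turn shape are assumptions. Only the 10-turn cap, live append, and clear-on-leave behaviour come from the notes.

```javascript
// Illustrative sketch of a per-guild N-turn conversation ring buffer.
// Hypothetical class, not discoclaw's ConversationBuffer: appends turns
// live, evicts the oldest once the cap is exceeded, clears on leave.
class TurnRingBuffer {
    constructor(maxTurns = 10) {
        this.maxTurns = maxTurns;
        this.turns = [];
    }
    append(userText, modelText) {
        this.turns.push({ user: userText, model: modelText });
        if (this.turns.length > this.maxTurns) {
            this.turns.shift(); // drop the oldest turn
        }
    }
    // Formats the buffer the way a voice prompt might inject history.
    format() {
        return this.turns.map(t => `User: ${t.user}\nBot: ${t.model}`).join('\n');
    }
    clear() {
        this.turns = [];
    }
}

const buf = new TurnRingBuffer(10);
for (let i = 1; i <= 12; i++) buf.append(`question ${i}`, `answer ${i}`);
console.log(buf.turns.length);  // 10 — the two oldest turns were evicted
console.log(buf.turns[0].user); // "question 3"
```

A backfill step (as described for voice join) would simply call `append` for each recovered exchange before the live session starts.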
package/.env.example
CHANGED

```diff
@@ -74,10 +74,13 @@ DISCORD_GUILD_ID=
 # is the security boundary instead.
 #CLAUDE_DANGEROUSLY_SKIP_PERMISSIONS=1
 
-# Gemini
-#
+# Gemini adapter
+# When GEMINI_API_KEY is set, the REST API adapter is used (zero startup overhead).
+# When unset, falls back to the Gemini CLI binary (requires `gemini` in PATH).
+#GEMINI_API_KEY=
+# Path to the Gemini CLI binary (default: gemini). Only used when GEMINI_API_KEY is unset.
 #GEMINI_BIN=gemini
-# Default model for the Gemini
+# Default model for the Gemini adapter.
 #GEMINI_MODEL=gemini-2.5-pro
 
 # --- OpenAI-compatible HTTP adapter ---
```
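The selection rule in the comments above can be sketched like this. The helper name and return shape are hypothetical; only the rule itself (API key present → REST adapter, otherwise the CLI binary, with `GEMINI_BIN` defaulting to `gemini`) comes from the env comments.

```javascript
// Hypothetical helper modelling the documented Gemini adapter selection.
// Not discoclaw's actual API — an illustration of the env-driven rule.
function selectGeminiAdapter(env) {
    const apiKey = env.GEMINI_API_KEY?.trim();
    if (apiKey) {
        // REST adapter: no CLI process to spawn, zero startup overhead.
        return { kind: 'rest', apiKey };
    }
    // CLI fallback: requires the `gemini` binary (or GEMINI_BIN) in PATH.
    return { kind: 'cli', bin: env.GEMINI_BIN?.trim() || 'gemini' };
}

console.log(selectGeminiAdapter({ GEMINI_API_KEY: 'abc' }).kind); // "rest"
console.log(selectGeminiAdapter({}).kind);                        // "cli"
```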
package/.env.example.full
CHANGED

```diff
@@ -98,10 +98,13 @@ DISCORD_ALLOW_USER_IDS=
 # is the security boundary instead.
 #CLAUDE_DANGEROUSLY_SKIP_PERMISSIONS=1
 
-# --- Gemini
-#
+# --- Gemini adapter ---
+# When GEMINI_API_KEY is set, the REST API adapter is used (zero startup overhead).
+# When unset, falls back to the Gemini CLI binary (requires `gemini` in PATH).
+#GEMINI_API_KEY=
+# Path to the Gemini CLI binary (default: gemini). Only used when GEMINI_API_KEY is unset.
 #GEMINI_BIN=gemini
-# Default model for the Gemini
+# Default model for the Gemini adapter.
 #GEMINI_MODEL=gemini-2.5-pro
 
 # Log level: trace | debug | info | warn | error | fatal
```
package/dist/config.js
CHANGED

```diff
@@ -285,7 +285,7 @@ export function parseConfig(env) {
         warnings.push('DISCOCLAW_VOICE_ENABLED=1 with TTS provider "openai" but OPENAI_API_KEY is not set; voice TTS will fail at runtime.');
     }
     if (voiceEnabled && !voiceHomeChannel) {
-        warnings.push('DISCOCLAW_VOICE_ENABLED=1 but DISCOCLAW_VOICE_HOME_CHANNEL is not set; voice will
+        warnings.push('DISCOCLAW_VOICE_ENABLED=1 but DISCOCLAW_VOICE_HOME_CHANNEL is not set; voice actions will be disabled (no target channel for action execution).');
     }
     const openrouterApiKey = parseTrimmedString(env, 'OPENROUTER_API_KEY');
     const openrouterBaseUrl = parseTrimmedString(env, 'OPENROUTER_BASE_URL');
@@ -423,6 +423,7 @@ export function parseConfig(env) {
         openrouterApiKey,
         openrouterBaseUrl,
         openrouterModel,
+        geminiApiKey: parseTrimmedString(env, 'GEMINI_API_KEY'),
         geminiBin: parseTrimmedString(env, 'GEMINI_BIN') ?? 'gemini',
         geminiModel: parseTrimmedString(env, 'GEMINI_MODEL') ?? 'gemini-2.5-pro',
         codexBin: parseTrimmedString(env, 'CODEX_BIN') ?? 'codex',
```
package/dist/discord/actions-config.js
CHANGED

```diff
@@ -1,4 +1,4 @@
-import { resolveModel } from '../runtime/model-tiers.js';
+import { resolveModel, findRuntimeForModel } from '../runtime/model-tiers.js';
 import { resolveDefaultModel, resolveProvider } from './actions-imagegen.js';
 const CONFIG_TYPE_MAP = {
     modelSet: true,
@@ -137,8 +137,54 @@ export function executeConfigAction(action, configCtx) {
                 break;
             case 'voice':
                 if (bp.voiceModelCtx) {
-
-
+                    // Check if the model string is actually a runtime name.
+                    const voiceNormalized = model.toLowerCase();
+                    const voiceNewRuntime = configCtx.runtimeRegistry?.get(voiceNormalized);
+                    if (voiceNewRuntime) {
+                        skipPersist = true;
+                        const voiceRuntimeModel = voiceNewRuntime.defaultModel ?? '';
+                        bp.voiceModelCtx.runtime = voiceNewRuntime;
+                        bp.voiceModelCtx.runtimeName = voiceNormalized;
+                        bp.voiceModelCtx.model = voiceRuntimeModel;
+                        configCtx.voiceRuntimeName = voiceNormalized;
+                        configCtx.persistVoiceRuntime?.(voiceNormalized);
+                        changes.push(`voice runtime → ${voiceNormalized}`);
+                        if (voiceRuntimeModel)
+                            changes.push(`voice → ${voiceRuntimeModel} (adapter default)`);
+                    }
+                    else {
+                        bp.voiceModelCtx.model = model;
+                        changes.push(`voice → ${model}`);
+                        // Auto-switch voice runtime if the model belongs to a different provider.
+                        if (configCtx.runtimeRegistry) {
+                            const owningRuntimeId = findRuntimeForModel(model);
+                            const currentVoiceRuntimeId = bp.voiceModelCtx.runtime?.id ?? configCtx.runtime.id;
+                            if (owningRuntimeId && owningRuntimeId !== currentVoiceRuntimeId) {
+                                // Tier-map keys (e.g. 'claude_code') may differ from registry keys (e.g. 'claude').
+                                // Scan registry entries by adapter.id to find the matching key.
+                                let matchedKey;
+                                let matchedAdapter;
+                                for (const registryKey of configCtx.runtimeRegistry.list()) {
+                                    const adapter = configCtx.runtimeRegistry.get(registryKey);
+                                    if (adapter && adapter.id === owningRuntimeId) {
+                                        matchedKey = registryKey;
+                                        matchedAdapter = adapter;
+                                        break;
+                                    }
+                                }
+                                if (matchedAdapter && matchedKey) {
+                                    bp.voiceModelCtx.runtime = matchedAdapter;
+                                    bp.voiceModelCtx.runtimeName = matchedKey;
+                                    configCtx.voiceRuntimeName = matchedKey;
+                                    configCtx.persistVoiceRuntime?.(matchedKey);
+                                    changes.push(`voice runtime → ${matchedKey} (auto-switched)`);
+                                }
+                                else {
+                                    return { ok: false, error: `Model "${model}" belongs to runtime "${owningRuntimeId}" which is not configured in the registry` };
+                                }
+                            }
+                        }
+                    }
                 }
                 else {
                     return { ok: false, error: 'Voice subsystem not configured' };
@@ -153,7 +199,10 @@ export function executeConfigAction(action, configCtx) {
     if (configCtx.overrideSources) {
         configCtx.overrideSources[action.role] = true;
     }
-    const
+    const resolveRid = action.role === 'voice' && bp.voiceModelCtx?.runtime
+        ? bp.voiceModelCtx.runtime.id
+        : configCtx.runtime.id;
+    const resolvedDisplay = resolveModel(model, resolveRid);
     const resolvedNote = resolvedDisplay && resolvedDisplay !== model ? ` (resolves to ${resolvedDisplay})` : '';
     return { ok: true, summary: `Model updated: ${changes.join(', ')}${resolvedNote}` };
 }
@@ -215,6 +264,10 @@ export function executeConfigAction(action, configCtx) {
             case 'voice':
                 if (bp.voiceModelCtx) {
                     bp.voiceModelCtx.model = defaultModel;
+                    bp.voiceModelCtx.runtime = undefined;
+                    bp.voiceModelCtx.runtimeName = undefined;
+                    configCtx.voiceRuntimeName = undefined;
+                    configCtx.clearVoiceRuntime?.();
                     resetChanges.push(`voice → ${defaultModel}`);
                 }
                 break;
@@ -261,9 +314,6 @@ export function executeConfigAction(action, configCtx) {
         const igProvider = resolveProvider(igModel);
         rows.push(['imagegen', igModel, `Image generation (${igProvider})`, '']);
     }
-    if (bp.voiceModelCtx) {
-        rows.push(['voice', bp.voiceModelCtx.model || `${bp.runtimeModel} (follows chat)`, ROLE_DESCRIPTIONS.voice, ovr('voice')]);
-    }
     const adapterDefault = configCtx.runtime.defaultModel;
     const lines = rows.map(([role, model, desc, overrideMarker]) => {
         const resolved = resolveModel(model, rid);
@@ -276,6 +326,24 @@ export function executeConfigAction(action, configCtx) {
         }
         return `**${role}**: \`${display}\`${overrideMarker} — ${desc}`;
     });
+    if (bp.voiceModelCtx) {
+        const voiceRid = bp.voiceModelCtx.runtime?.id ?? rid;
+        const voiceModel = bp.voiceModelCtx.model || `${bp.runtimeModel} (follows chat)`;
+        const voiceRtLabel = bp.voiceModelCtx.runtimeName && bp.voiceModelCtx.runtimeName !== (configCtx.runtimeName ?? rid)
+            ? ` [runtime: ${bp.voiceModelCtx.runtimeName}]`
+            : '';
+        // Voice row uses its own runtime ID for tier resolution.
+        const voiceResolved = resolveModel(voiceModel, voiceRid);
+        let voiceDisplay;
+        if (voiceModel) {
+            voiceDisplay = voiceResolved && voiceResolved !== voiceModel ? `${voiceModel} → ${voiceResolved}` : voiceModel;
+        }
+        else {
+            const voiceAdapterDefault = bp.voiceModelCtx.runtime?.defaultModel ?? adapterDefault;
+            voiceDisplay = voiceAdapterDefault || '(adapter default)';
+        }
+        lines.push(`**voice**: \`${voiceDisplay}\`${ovr('voice')}${voiceRtLabel} — ${ROLE_DESCRIPTIONS.voice}`);
+    }
     return { ok: true, summary: lines.join('\n') };
 }
 }
@@ -297,7 +365,7 @@ export function configActionsPromptSection() {
 <discord-action>{"type":"modelSet","role":"fast","model":"haiku"}</discord-action>
 \`\`\`
 - \`role\` (required): One of \`chat\`, \`fast\`, \`forge-drafter\`, \`forge-auditor\`, \`summary\`, \`cron\`, \`cron-exec\`, \`voice\`.
-- \`model\` (required): Model tier (\`fast\`, \`capable\`, \`deep\`), concrete model name (\`haiku\`, \`sonnet\`, \`opus\`), runtime name (\`openrouter\`, \`gemini\` — for \`chat\`
+- \`model\` (required): Model tier (\`fast\`, \`capable\`, \`deep\`), concrete model name (\`haiku\`, \`sonnet\`, \`opus\`), runtime name (\`openrouter\`, \`gemini\` — for \`chat\` and \`voice\` roles, swaps the active runtime adapter independently), or \`default\` (for cron-exec only, to revert to the env-configured default (Sonnet by default)). For the \`voice\` role, setting a model name that belongs to a different provider's tier map (e.g. \`sonnet\` while voice is on Gemini) will auto-switch the voice runtime to match.
 
 **Roles:**
 | Role | What it controls |
```
package/dist/discord/actions-config.test.js
CHANGED

```diff
@@ -15,6 +15,12 @@ const openrouterRuntime = {
     defaultModel: 'anthropic/claude-sonnet-4',
     async *invoke() { },
 };
+const geminiRuntime = {
+    id: 'gemini',
+    capabilities: new Set(),
+    defaultModel: 'gemini-2.5-flash',
+    async *invoke() { },
+};
 function makeRegistry(...entries) {
     const reg = new RuntimeRegistry();
     for (const [name, adapter] of entries) {
@@ -538,6 +544,125 @@ describe('modelShow runtime line', () => {
     });
 });
 // ---------------------------------------------------------------------------
+// modelSet — voice runtime swap
+// ---------------------------------------------------------------------------
+describe('modelSet voice runtime swap', () => {
+    it('swaps voiceModelCtx.runtime and sets adapter default model', () => {
+        const ctx = makeCtx({ voiceModelCtx: { model: 'fast' } });
+        ctx.runtimeRegistry = makeRegistry(['gemini', geminiRuntime]);
+        const result = executeConfigAction({ type: 'modelSet', role: 'voice', model: 'gemini' }, ctx);
+        expect(result.ok).toBe(true);
+        if (!result.ok)
+            return;
+        expect(ctx.botParams.voiceModelCtx.runtime).toBe(geminiRuntime);
+        expect(ctx.botParams.voiceModelCtx.runtimeName).toBe('gemini');
+        expect(ctx.botParams.voiceModelCtx.model).toBe('gemini-2.5-flash');
+        expect(ctx.voiceRuntimeName).toBe('gemini');
+        expect(result.summary).toContain('voice runtime → gemini');
+        expect(result.summary).toContain('adapter default');
+    });
+    it('does not swap runtime for a plain model name', () => {
+        const ctx = makeCtx({ voiceModelCtx: { model: 'fast' } });
+        ctx.runtimeRegistry = makeRegistry(['gemini', geminiRuntime]);
+        const result = executeConfigAction({ type: 'modelSet', role: 'voice', model: 'sonnet' }, ctx);
+        expect(result.ok).toBe(true);
+        expect(ctx.botParams.voiceModelCtx.runtime).toBeUndefined();
+        expect(ctx.botParams.voiceModelCtx.runtimeName).toBeUndefined();
+        expect(ctx.botParams.voiceModelCtx.model).toBe('sonnet');
+    });
+    it('calls persistVoiceRuntime (not persistOverride) for runtime swaps', () => {
+        let persistOverrideCalled = false;
+        let persistVoiceRuntimeName;
+        const ctx = makeCtx({ voiceModelCtx: { model: 'fast' } });
+        ctx.runtimeRegistry = makeRegistry(['gemini', geminiRuntime]);
+        ctx.persistOverride = () => { persistOverrideCalled = true; };
+        ctx.persistVoiceRuntime = (name) => { persistVoiceRuntimeName = name; };
+        const result = executeConfigAction({ type: 'modelSet', role: 'voice', model: 'gemini' }, ctx);
+        expect(result.ok).toBe(true);
+        expect(persistOverrideCalled).toBe(false);
+        expect(persistVoiceRuntimeName).toBe('gemini');
+    });
+    it('chat runtime swap does not affect voiceModelCtx.runtime', () => {
+        const ctx = makeCtx({ voiceModelCtx: { model: 'fast' } });
+        ctx.runtimeRegistry = makeRegistry(['openrouter', openrouterRuntime]);
+        ctx.botParams.planCtx = { model: 'capable', runtime: stubRuntime };
+        ctx.botParams.deferOpts = { runtime: stubRuntime };
+        executeConfigAction({ type: 'modelSet', role: 'chat', model: 'openrouter' }, ctx);
+        // Chat runtime swapped but voice stays untouched
+        expect(ctx.botParams.runtime).toBe(openrouterRuntime);
+        expect(ctx.botParams.voiceModelCtx.runtime).toBeUndefined();
+        expect(ctx.botParams.voiceModelCtx.model).toBe('fast');
+    });
+    it('case-insensitive matching — Gemini matches gemini', () => {
+        const ctx = makeCtx({ voiceModelCtx: { model: 'fast' } });
+        ctx.runtimeRegistry = makeRegistry(['gemini', geminiRuntime]);
+        const result = executeConfigAction({ type: 'modelSet', role: 'voice', model: 'Gemini' }, ctx);
+        expect(result.ok).toBe(true);
+        expect(ctx.botParams.voiceModelCtx.runtime).toBe(geminiRuntime);
+        expect(ctx.botParams.voiceModelCtx.runtimeName).toBe('gemini');
+    });
+});
+// ---------------------------------------------------------------------------
+// modelReset — voice runtime clear
+// ---------------------------------------------------------------------------
+describe('modelReset voice runtime', () => {
+    it('clears voiceModelCtx.runtime and runtimeName back to undefined', () => {
+        const ctx = makeCtx({ voiceModelCtx: { model: 'gemini-2.5-flash', runtime: geminiRuntime, runtimeName: 'gemini' } });
+        ctx.voiceRuntimeName = 'gemini';
+        ctx.envDefaults = { voice: 'fast' };
+        let clearVoiceRuntimeCalled = false;
+        ctx.clearVoiceRuntime = () => { clearVoiceRuntimeCalled = true; };
+        const result = executeConfigAction({ type: 'modelReset', role: 'voice' }, ctx);
+        expect(result.ok).toBe(true);
+        expect(ctx.botParams.voiceModelCtx.model).toBe('fast');
+        expect(ctx.botParams.voiceModelCtx.runtime).toBeUndefined();
+        expect(ctx.botParams.voiceModelCtx.runtimeName).toBeUndefined();
+        expect(ctx.voiceRuntimeName).toBeUndefined();
+        expect(clearVoiceRuntimeCalled).toBe(true);
+    });
+});
+// ---------------------------------------------------------------------------
+// modelShow — voice runtime display
+// ---------------------------------------------------------------------------
+describe('modelShow voice runtime display', () => {
+    it('displays voice runtime name when it differs from chat', () => {
+        const ctx = makeCtx({ voiceModelCtx: { model: 'gemini-2.5-flash', runtime: geminiRuntime, runtimeName: 'gemini' } });
+        ctx.voiceRuntimeName = 'gemini';
+        const result = executeConfigAction({ type: 'modelShow' }, ctx);
+        expect(result.ok).toBe(true);
+        if (!result.ok)
+            return;
+        const lines = result.summary.split('\n');
+        const voiceLine = lines.find(l => l.includes('**voice**'));
+        expect(voiceLine).toContain('[runtime: gemini]');
+        expect(voiceLine).toContain('gemini-2.5-flash');
+    });
+    it('does not annotate voice runtime when it matches chat', () => {
+        const ctx = makeCtx({ voiceModelCtx: { model: 'sonnet' } });
+        const result = executeConfigAction({ type: 'modelShow' }, ctx);
+        expect(result.ok).toBe(true);
+        if (!result.ok)
+            return;
+        const lines = result.summary.split('\n');
+        const voiceLine = lines.find(l => l.includes('**voice**'));
+        expect(voiceLine).toBeDefined();
+        expect(voiceLine).not.toContain('[runtime:');
+    });
+    it('resolves tier names against voice runtime ID, not chat runtime', () => {
+        // Voice on gemini with tier 'capable' should resolve to gemini-2.5-pro, not sonnet
+        const ctx = makeCtx({ voiceModelCtx: { model: 'capable', runtime: geminiRuntime, runtimeName: 'gemini' } });
+        ctx.voiceRuntimeName = 'gemini';
+        const result = executeConfigAction({ type: 'modelShow' }, ctx);
+        expect(result.ok).toBe(true);
+        if (!result.ok)
+            return;
+        const lines = result.summary.split('\n');
+        const voiceLine = lines.find(l => l.includes('**voice**'));
+        expect(voiceLine).toContain('gemini-2.5-pro');
+        expect(voiceLine).not.toContain('sonnet');
+    });
+});
+// ---------------------------------------------------------------------------
 // configActionsPromptSection
 // ---------------------------------------------------------------------------
 describe('configActionsPromptSection', () => {
```
package/dist/discord/actions-spawn.js
CHANGED

```diff
@@ -114,5 +114,6 @@ export function spawnActionsPromptSection() {
 - Multiple spawnAgent actions in a single response are run in parallel for efficiency.
 - Spawned agents run at recursion depth 1 and cannot themselves spawn further agents.
 - The spawned agent runs fire-and-forget: it posts its output directly to the target channel.
-- Keep prompts focused — each agent handles a single well-defined task
+- Keep prompts focused — each agent handles a single well-defined task.
+- **Context isolation:** The spawned agent has **no conversation history** — it receives only the \`prompt\` string. The prompt must be fully self-contained: include all entity IDs, channel names, file paths, and relevant state. Do not reference "the above," "this task," or anything from the current conversation — the spawned agent cannot see it.`;
 }
```
package/dist/discord/actions-spawn.test.js
CHANGED

```diff
@@ -377,6 +377,10 @@ describe('spawnActionsPromptSection', () => {
         const section = spawnActionsPromptSection();
         expect(section).toContain('recursion');
     });
+    it('warns about no conversation history (context isolation)', () => {
+        const section = spawnActionsPromptSection();
+        expect(section).toContain('no conversation history');
+    });
     it('includes a usage example block', () => {
         const section = spawnActionsPromptSection();
         expect(section).toContain('<discord-action>');
```
package/dist/discord/actions.js
CHANGED

```diff
@@ -681,7 +681,9 @@ If an action fails with a "Missing Permissions" or "Missing Access" error, tell
 4. The bot may need to be re-invited with the "moderator" permission profile if the role wasn't granted at invite time.`);
     if (flags.defer) {
         sections.push(`### Deferred self-invocation
-Use a <discord-action>{"type":"defer","channel":"general","delaySeconds":600,"prompt":"Check on the forge run"}</discord-action> block to schedule a follow-up run inside the requested channel without another user prompt. You must specify the channel by name or ID; delaySeconds is how long to wait (capped by DISCOCLAW_DISCORD_ACTIONS_DEFER_MAX_DELAY_SECONDS) and prompt becomes the user message when the deferred invocation runs. The scheduler enforces DISCOCLAW_DISCORD_ACTIONS_DEFER_MAX_CONCURRENT pending jobs, respects the same channel permissions as this response, automatically posts the follow-up output, and allows nested defers up to the configured depth limit (DISCOCLAW_DISCORD_ACTIONS_DEFER_MAX_DEPTH, default 4); once the limit is reached, \`defer\` is disabled for that run. If a guard rail rejects the request (too long, too many active defers, missing permissions, or the channel becomes invalid) the action fails with an explanatory message
+Use a <discord-action>{"type":"defer","channel":"general","delaySeconds":600,"prompt":"Check on the forge run"}</discord-action> block to schedule a follow-up run inside the requested channel without another user prompt. You must specify the channel by name or ID; delaySeconds is how long to wait (capped by DISCOCLAW_DISCORD_ACTIONS_DEFER_MAX_DELAY_SECONDS) and prompt becomes the user message when the deferred invocation runs. The scheduler enforces DISCOCLAW_DISCORD_ACTIONS_DEFER_MAX_CONCURRENT pending jobs, respects the same channel permissions as this response, automatically posts the follow-up output, and allows nested defers up to the configured depth limit (DISCOCLAW_DISCORD_ACTIONS_DEFER_MAX_DEPTH, default 4); once the limit is reached, \`defer\` is disabled for that run. If a guard rail rejects the request (too long, too many active defers, missing permissions, or the channel becomes invalid) the action fails with an explanatory message.
+
+**Context isolation warning:** The deferred invocation runs with no conversation history — the \`prompt\` string is the **only** context the AI receives. It must include all relevant IDs, file paths, channel references, and state needed to act. Vague prompts like "check on that" will fail because the AI has no memory of what "that" refers to. Write every deferred prompt as a fully self-contained instruction.`);
     }
     return sections.join('\n\n');
 }
```
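The self-containment rule in that warning can be illustrated with two hypothetical defer payloads. Both payload bodies are invented for illustration; only the `defer` action shape (`type`, `channel`, `delaySeconds`, `prompt`) comes from the prompt text above.

```javascript
// Hypothetical defer payloads illustrating the context-isolation warning.
// The deferred invocation sees only the `prompt` string, so every referent
// must be spelled out inside it.
const vague = {
    type: 'defer',
    channel: 'general',
    delaySeconds: 600,
    prompt: 'Check on that', // "that" has no referent once history is gone
};
const selfContained = {
    type: 'defer',
    channel: 'general',
    delaySeconds: 600,
    prompt: 'Check whether the forge run posted in #forge-logs has finished; post pass/fail plus the last log line to #general.',
};
console.log(selfContained.prompt.includes('#forge-logs')); // true
```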
package/dist/discord/actions.test.js
CHANGED

```diff
@@ -627,6 +627,7 @@ describe('discordActionsPromptSection', () => {
         expect(prompt).toContain('DISCOCLAW_DISCORD_ACTIONS_DEFER_MAX_DELAY_SECONDS');
         expect(prompt).toContain('DISCOCLAW_DISCORD_ACTIONS_DEFER_MAX_CONCURRENT');
         expect(prompt).toContain('DISCOCLAW_DISCORD_ACTIONS_DEFER_MAX_DEPTH');
+        expect(prompt).toContain('no conversation history');
     });
 });
 // ---------------------------------------------------------------------------
```
package/dist/discord/models-command.js
CHANGED

```diff
@@ -54,8 +54,8 @@ export function handleModelsCommand(cmd, opts) {
         '',
         '**Roles:** `chat`, `fast`, `forge-drafter`, `forge-auditor`, `summary`, `cron`, `cron-exec`, `voice`',
         '',
-        '**Runtime switching (chat
-        'Setting the `chat` role to a runtime name (`openrouter`, `openai`, `gemini`, `codex`, `claude`) switches the active runtime adapter so invocations route through that provider.',
+        '**Runtime switching (chat and voice roles):**',
+        'Setting the `chat` or `voice` role to a runtime name (`openrouter`, `openai`, `gemini`, `codex`, `claude`) switches the active runtime adapter so invocations route through that provider.',
         '',
         '**Examples:**',
         '- `!models set chat sonnet`',
@@ -65,6 +65,7 @@ export function handleModelsCommand(cmd, opts) {
         '- `!models set forge-drafter opus`',
         '- `!models set cron-exec haiku` — run crons on a cheaper model',
         '- `!models set cron-exec default` — revert to env default (Sonnet by default)',
+        '- `!models set voice gemini` — switch voice to the Gemini runtime',
         '- `!models set voice sonnet` — use a specific model for voice responses',
         '- `!models reset` — clear all overrides and revert to env defaults',
         '- `!models reset chat` — revert only the chat model to its env default',
```
package/dist/discord/reaction-prompts.js
CHANGED

```diff
@@ -118,5 +118,7 @@ export function reactionPromptSection() {
 
 The action returns immediately with a confirmation that the prompt was sent. When the user reacts with a valid choice, a follow-up invocation is triggered automatically so you can act on the decision.
 
+**Context warning:** The follow-up AI invocation receives *only* the \`question\` text and the chosen emoji — no conversation history or prior context is included. Write questions that are specific and self-contained so the follow-up AI knows exactly what action to take for each choice. For example, use "Deploy commit abc123 to staging?" instead of "Should I proceed?" — the follow-up invocation won't know what "proceed" refers to.
+
 Use this for binary confirmations (✅/❌) or short option lists — not for open-ended text input.`;
 }
```
package/dist/discord/reaction-prompts.test.js
CHANGED

```diff
@@ -242,6 +242,11 @@ describe('reactionPromptSection', () => {
     it('mentions choice count limits', () => {
        expect(reactionPromptSection()).toContain('2–9');
     });
+    it('warns about no conversation history in follow-up invocation', () => {
+        const section = reactionPromptSection();
+        expect(section).toContain('no conversation history');
+        expect(section).toContain('self-contained');
+    });
 });
 // ---------------------------------------------------------------------------
 // QUERY_ACTION_TYPES regression guard
```