osborn 0.9.53 → 0.9.55
- package/.claude/skills/bug-reporter/SKILL.md +129 -0
- package/dist/index.js +147 -26
- package/package.json +1 -1
package/.claude/skills/bug-reporter/SKILL.md
@@ -0,0 +1,129 @@
+---
+name: bug-reporter
+description: |
+  File a bug report or feature request when the user describes a problem with
+  Osborn itself (voice glitches, agent freezes, audio echo, session crashes,
+  interrupt issues) or asks for a new Osborn feature. Posts to a local agent
+  endpoint that hands the report off to the frontend, which writes it to
+  Supabase. Use whenever the user describes something wrong with Osborn —
+  NOT for questions about their own project code.
+---
+
+# Bug Reporter Skill
+
+File bug reports and feature requests from inside a voice session, without
+breaking the conversation. Reports land in the dev team's Supabase table for
+triage from a separate Claude Code session.
+
+## When to use this
+
+Trigger when the user describes any of these (or similar):
+
+**Bugs:**
+- Voice quality issues — "the audio cut out", "I can't hear you", "you keep echoing", "you interrupted yourself"
+- Agent malfunctions — "the agent froze", "it crashed", "it stopped responding", "you're stuck"
+- Session issues — "session disconnected", "the room keeps closing", "I had to restart"
+- Memory/state issues — "you don't remember", "you lost context"
+- Interrupt problems — "you keep cutting yourself off", "the interrupt isn't working"
+- Direct asks — "this is a bug", "file this", "report this", "let me know when it's fixed"
+
+**Feature requests:**
+- "I wish Osborn could…", "can you add…", "it would be nice if…", "feature request:"
+
+## When NOT to use
+
+- The user has a coding question about THEIR project — that's normal research/coding work
+- The user mentions an error in code they're writing — not an Osborn bug
+- The user is debugging their own logs — they're working, not reporting
+
+## How to file
+
+### Step 1 — confirm with the user
+
+Don't silently file. Say something brief like:
+
+> "Sounds like a real bug — want me to file it so the team can dig in? I'll
+> include the recent logs."
+
+If they say yes, proceed. If unsure, ask whether it's worth filing.
+
+### Step 2 — POST to the local agent endpoint
+
+```bash
+curl -sS -X POST http://localhost:8741/report-bug \
+  -H "Content-Type: application/json" \
+  -d @- <<'JSON'
+{
+  "type": "bug",
+  "severity": "medium",
+  "title": "Voice cuts out mid-sentence in pipeline mode",
+  "description": "User reported that the agent stops speaking mid-sentence and resumes 5 minutes later. Happens repeatedly. Started after migrating to user_state_changed handler (May 21, 0.9.39).",
+  "reproduction_notes": "Speak to the agent, then go silent — audio cuts off at a sentence boundary and won't resume until mic is muted for ~2 seconds.",
+  "tags": ["voice-quality", "interrupt", "echo"]
+}
+JSON
+```
+
+The agent endpoint:
+- Generates a `reportId`
+- Tails `/workspace/osborn.log` (last 500 lines)
+- Pulls the last few turns from the current JSONL session
+- Sends everything to the frontend via the LiveKit data channel
+- Returns `{ reportId, status: "submitted" }` to you
+
+You don't need to attach logs yourself — the agent does that automatically.
+
+### Step 3 — confirm to the user
+
+Briefly:
+
+> "Filed. Bug `f4a2…` — the team will look. Want me to log anything else?"
+
+Use the first 4 chars of the returned `reportId` as a short reference.
+
+## Choosing severity
+
+- `critical` — voice completely unusable, session crashes immediately, data loss
+- `high` — major friction (voice keeps cutting, frequent crashes, can't connect)
+- `medium` — annoying but workable (echo, occasional drops, minor UI glitches)
+- `low` — nice-to-have polish, edge cases, documentation gaps, feature requests
+
+Feature requests default to `low` unless the user describes blocking workflows.
+
+## Title writing
+
+Short, present-tense, specific. 6–10 words.
+
+Good:
+- "Voice cuts out mid-sentence in pipeline mode"
+- "Agent echoes own speech as user interrupt"
+- "Session orphaned after machine OOM auto-restart"
+
+Bad:
+- "voice bug" (too vague)
+- "When I was talking the agent stopped responding and I had to..." (use description)
+
+## What NOT to include in the description
+
+- Don't dump the full transcript — the agent attaches a `transcript_excerpt` automatically
+- Don't paste log lines — the agent attaches the `log_excerpt` automatically
+- Don't speculate about the fix unless the user explicitly suggested one
+- Don't include the user's API keys, OAuth tokens, or PII
+
+## Tags vocabulary
+
+Pick from these rough buckets (one or more):
+`echo, interrupt, crash, freeze, memory, voice-quality, audio, mode-specific,
+direct, pipeline, realtime, ui, sessions, fly, recall, meeting, deepgram, tts, stt`
+
+## Reading existing reports
+
+You don't query, list, or close reports from inside a voice session — that's
+the dev team's job from their own Claude Code session. If the user asks "is
+that bug fixed yet?", say "let me check" and use the same endpoint with `GET`:
+
+```bash
+curl -sS "http://localhost:8741/report-bug?id=${REPORT_ID}"
+```
+
+But typically the user won't ask, and you don't need to volunteer the status.
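A minimal sketch of the filing steps described in SKILL.md, written as JavaScript rather than curl. It is not part of the package: `shortRef`, `hasGoodTitleLength`, and `fileReport` are hypothetical helper names, and the sketch assumes a Node runtime with a global `fetch` (Node 18 or later). Only the endpoint URL, the request fields, the 6 to 10 word title guideline, and the 4-character short reference come from the skill text.

```javascript
// Hypothetical helpers illustrating SKILL.md; none of these names exist in osborn.

// SKILL.md: "Use the first 4 chars of the returned reportId as a short reference."
function shortRef(reportId) {
    return reportId.slice(0, 4);
}

// SKILL.md: titles should be "Short, present-tense, specific. 6–10 words."
// This only checks the word count; tense and specificity need human judgment.
function hasGoodTitleLength(title) {
    const words = title.trim().split(/\s+/).filter(Boolean);
    return words.length >= 6 && words.length <= 10;
}

// POST the report to the local agent endpoint and return the short reference.
// Assumes the endpoint answers with JSON for both success and validation errors.
async function fileReport(report, baseUrl = 'http://localhost:8741') {
    const res = await fetch(`${baseUrl}/report-bug`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify(report),
    });
    const data = await res.json();
    if (!res.ok) {
        throw new Error(`report rejected: ${JSON.stringify(data.details ?? data)}`);
    }
    return { reportId: data.reportId, ref: shortRef(data.reportId) };
}
```

A caller would check the title with `hasGoodTitleLength` before calling `fileReport`, then read the returned `ref` back to the user.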
package/dist/index.js
CHANGED
@@ -14,6 +14,7 @@ import { existsSync, readdirSync, readFileSync, mkdirSync, writeFileSync, mkdtem
 import { dirname, join } from 'node:path';
 import { fileURLToPath } from 'node:url';
 import { spawn } from 'node:child_process';
+import { randomUUID } from 'node:crypto';
 import { homedir, tmpdir } from 'node:os';
 import { PassThrough } from 'node:stream';
 import { createGunzip } from 'node:zlib';
@@ -189,6 +190,14 @@ const livekitState = {
 let intentionalLeave = false;
 let connectRoomHook = null;
 let leaveRoomHook = null;
+// Hook for the bug-reporter skill. The /report-bug HTTP endpoint validates the
+// payload + generates the reportId in the module-level handler, then delegates
+// to this hook which lives in main() (where sendToFrontend, currentVoiceMode,
+// and currentSession are in scope). The frontend listens for the data channel
+// message type 'bug_report' and writes the row to Supabase — same architecture
+// as the existing fetch-log/save-log flow so we don't ship Supabase credentials
+// to the Fly machine.
+let bugReportHook = null;
 function startApiServer(workingDir, port) {
     const server = createServer(async (req, res) => {
         // CORS headers for cloud frontend
@@ -332,6 +341,67 @@ function startApiServer(workingDir, port) {
             }
             return;
         }
+        // POST /report-bug — invoked by the bug-reporter skill (running inside Claude
+        // Code on this same machine) when the user describes an Osborn bug or
+        // requests a feature. We validate the payload, generate a reportId, and emit
+        // a data channel message via bugReportHook → sendToFrontend. The frontend
+        // owns the actual Supabase write (it already has the keys for the log-upload
+        // flow, no need to ship them to the Fly machine).
+        if (req.method === 'POST' && url.pathname === '/report-bug') {
+            let body = '';
+            req.on('data', (chunk) => { body += chunk.toString(); });
+            req.on('end', () => {
+                try {
+                    const parsed = JSON.parse(body || '{}');
+                    const errors = [];
+                    if (parsed.type !== 'bug' && parsed.type !== 'feature')
+                        errors.push('type must be "bug" or "feature"');
+                    if (!parsed.title || typeof parsed.title !== 'string' || parsed.title.length < 3)
+                        errors.push('title required (>= 3 chars)');
+                    if (!parsed.description || typeof parsed.description !== 'string')
+                        errors.push('description required');
+                    const sev = parsed.severity || 'medium';
+                    if (!['low', 'medium', 'high', 'critical'].includes(sev))
+                        errors.push('severity invalid');
+                    if (errors.length) {
+                        res.writeHead(400, { 'Content-Type': 'application/json' });
+                        res.end(JSON.stringify({ error: 'invalid payload', details: errors }));
+                        return;
+                    }
+                    const reportId = randomUUID();
+                    const payload = {
+                        type: parsed.type,
+                        severity: sev,
+                        title: parsed.title.trim().slice(0, 200),
+                        description: parsed.description.trim().slice(0, 8000),
+                        reproduction_notes: typeof parsed.reproduction_notes === 'string'
+                            ? parsed.reproduction_notes.trim().slice(0, 4000)
+                            : undefined,
+                        tags: Array.isArray(parsed.tags)
+                            ? parsed.tags.filter((t) => typeof t === 'string').slice(0, 20)
+                            : undefined,
+                    };
+                    res.writeHead(200, { 'Content-Type': 'application/json' });
+                    res.end(JSON.stringify({ reportId, status: 'submitted' }));
+                    if (bugReportHook) {
+                        try {
+                            bugReportHook(reportId, payload);
+                        }
+                        catch (e) {
+                            console.error('❌ bugReportHook threw:', e);
+                        }
+                    }
+                    else {
+                        console.warn('⚠️ /report-bug fired but no bugReportHook registered (frontend may not receive)');
+                    }
+                }
+                catch (e) {
+                    res.writeHead(400, { 'Content-Type': 'application/json' });
+                    res.end(JSON.stringify({ error: 'invalid JSON', details: e.message }));
+                }
+            });
+            return;
+        }
         // POST /restart — graceful process restart (process manager will restart)
         if (req.method === 'POST' && url.pathname === '/restart') {
             res.writeHead(200, { 'Content-Type': 'application/json' });
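The validation and truncation rules in the `/report-bug` handler above can be read as one pure function. The following is a sketch under that assumption, not code from the package: `validateReport` and `ALLOWED_SEVERITIES` are hypothetical names, and the function returns a result object instead of writing an HTTP response. The rules themselves (allowed types, minimum title length, default severity, and the 200 / 8000 / 4000 / 20 limits) are copied from the diff.

```javascript
// Hypothetical restatement of the /report-bug checks as a pure function.
const ALLOWED_SEVERITIES = ['low', 'medium', 'high', 'critical'];

function validateReport(parsed) {
    const errors = [];
    if (parsed.type !== 'bug' && parsed.type !== 'feature')
        errors.push('type must be "bug" or "feature"');
    if (!parsed.title || typeof parsed.title !== 'string' || parsed.title.length < 3)
        errors.push('title required (>= 3 chars)');
    if (!parsed.description || typeof parsed.description !== 'string')
        errors.push('description required');
    // A missing severity falls back to "medium", as in the handler.
    const severity = parsed.severity || 'medium';
    if (!ALLOWED_SEVERITIES.includes(severity))
        errors.push('severity invalid');
    if (errors.length)
        return { ok: false, errors };
    return {
        ok: true,
        payload: {
            type: parsed.type,
            severity,
            // Free-text fields are trimmed and capped before forwarding.
            title: parsed.title.trim().slice(0, 200),
            description: parsed.description.trim().slice(0, 8000),
            reproduction_notes: typeof parsed.reproduction_notes === 'string'
                ? parsed.reproduction_notes.trim().slice(0, 4000)
                : undefined,
            // Non-string tags are dropped; at most 20 tags are kept.
            tags: Array.isArray(parsed.tags)
                ? parsed.tags.filter((t) => typeof t === 'string').slice(0, 20)
                : undefined,
        },
    };
}
```

Keeping the checks separate from the response writing would make them testable without starting the HTTP server; whether the package is structured that way is not visible in this diff.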
@@ -1205,16 +1275,6 @@ async function main() {
     // Session-level always-allow list: paths the user has approved for this session without prompting
     let sessionAlwaysAllowPaths = new Set();
     let userState = 'listening'; // Track user speech state for queue safety
-    // Leading-edge debounce for the TTS interrupt below — restores the same
-    // anti-flap protection the removed ActiveSpeakersChanged handler had pre-0.9.39
-    // (May 21 / c345c98). Wall-clock timestamp + ms compare; no setTimeout, no
-    // promise, no new API. Suppresses repeat interrupts within the window so a
-    // single user-input transition fires at most one interrupt() call per second.
-    // Without it, TTS echo bleeding through the mic causes user_state to oscillate
-    // speaking ↔ listening across rapid Deepgram frames, each transition firing a
-    // fresh interrupt — and even after 1.4.x's stricter error classification, the
-    // first one survives but the cascade kills the session.
-    let lastInterruptAt = 0;
     // Self-echo guard for the TTS interrupt below. Updated by the
     // ActiveSpeakersChanged listener registered near the other room.on(...) handlers.
     // user_state_changed carries NO speaker identity (verified against the SDK type
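The removed comment describes a leading-edge debounce built from a wall-clock timestamp and a millisecond comparison. The removed logic itself is not visible in this diff, so the following is only a generic sketch of that technique: `createLeadingEdgeGate` is a hypothetical name, and the one-second window is taken from the comment's "at most one interrupt() call per second".

```javascript
// Generic leading-edge debounce: the first call passes, repeats inside the
// window are suppressed. No timers or promises, only a timestamp compare.
function createLeadingEdgeGate(windowMs = 1000) {
    // The removed code initialised lastInterruptAt to 0, which is fine with
    // Date.now(). -Infinity is used here so injected test clocks starting
    // near zero also pass on the first call.
    let lastAcceptedAt = Number.NEGATIVE_INFINITY;
    return function shouldFire(now = Date.now()) {
        if (now - lastAcceptedAt < windowMs) {
            return false; // still inside the cooldown window
        }
        lastAcceptedAt = now; // only accepted calls restart the window
        return true;
    };
}
```

In this variant a suppressed call does not extend the window, so a steady stream of triggers still lets one through per `windowMs`.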
@@ -2072,6 +2132,30 @@ async function main() {
                     minDelay: 500, // Wait 500ms after STT commits before generating reply
                     maxDelay: 2000, // Force end-of-turn after 2s to prevent hangs
                 },
+                // Echo-driven false-interrupt protection at the SDK level (1.2.x has these knobs,
+                // we just never set them — defaults are minDuration:500ms / minWords:0 which let
+                // through every short echo blip). Both knobs gate the SDK's internal
+                // interruptByAudioActivity() (agent_activity.js — runs on Deepgram interim
+                // transcripts AND speechDuration updates), which is the path that was firing
+                // even after our user_state_changed handler skipped the trigger.
+                interruption: {
+                    // SDK auto-interrupt fully DISABLED (0.9.55). Even with minDuration:750
+                    // and minWords:2 in 0.9.54, the SDK's onInterimTranscript path bypasses
+                    // duration gating (it fires on first interim text) and minWords gates
+                    // against accumulated transcript wordcount — so once a real user utters
+                    // ≥2 words, every subsequent echo passes. Worse: double-fires within
+                    // 200ms corrupt SegmentSynchronizerImpl state (pushAudio called after
+                    // close → markPlaybackFinished before input done → playback hangs).
+                    // With enabled:false the SDK won't fire interruptByAudioActivity at all;
+                    // our user_state_changed handler at index.ts:3162 with the self-echo
+                    // guard (lastRemoteSpeakerAt + ActiveSpeakersChanged) becomes the SOLE
+                    // interrupt path. We control timing, deduplication, and identity.
+                    enabled: false,
+                    // The values below have no effect with enabled:false but kept for
+                    // documentation in case enabled is flipped back on for testing.
+                    minDuration: 750,
+                    minWords: 2,
+                },
             },
         });
         return { session, agent };
@@ -2989,33 +3073,36 @@ async function main() {
         console.log(`👤 User state: ${prev} → ${ev.newState} (agent: ${agentState})`);
         if (ev.newState === 'speaking' && agentState === 'speaking' && sessionVoiceMode !== 'realtime') {
             const now = Date.now();
-            // Self-echo guard
+            // Self-echo guard. Reject this trigger entirely if no remote
             // participant has been heard speaking in the last 500ms — at that
             // point user_state=speaking is almost certainly TTS bleeding through
             // the mic (Deepgram correctly identifies it as "speech", we add the
             // identity filter the high-level event lacks). 500ms is wider than
             // the ~50-300ms gap between ActiveSpeakersChanged and user_state_changed
             // firing, so a real user is comfortably inside the window.
+            //
+            // The 1s leading-edge debounce that used to live here was removed in
+            // 0.9.54 — the SDK-side `turnHandling.interruption.minDuration:750` +
+            // `minWords:2` now do the heavy lifting on echo filtering, and stacking
+            // an extra cooldown on top risked masking the SDK's own resume timing.
             if (now - lastRemoteSpeakerAt > 500) {
                 console.log('🔇 Skipping interrupt — no recent remote-speaker activity (self-echo guard)');
                 return;
             }
-
-
-
-
-
-
-
+            try {
+                // force:true bypasses the SpeechHandle's allowInterruptions check
+                // (speech_handle.js:93-99). Required because turnHandling.interruption.enabled=false
+                // sets allowInterruptions=false on every SpeechHandle (agent_activity.js:329-331),
+                // which is what blocks the SDK's auto-interrupt path — but without
+                // force:true, this manual call from our handler would also throw
+                // "This generation handle does not allow interruptions". Combined,
+                // they let US interrupt (with self-echo guard already verified above)
+                // while keeping the SDK's auto-trigger off.
+                console.log('🎤 user_state_changed=speaking + agent speaking + remote-speaker confirmed → interrupting TTS (force)');
+                currentSession?.interrupt({ force: true });
             }
-
-
-                    console.log('🎤 user_state_changed=speaking + agent speaking + remote-speaker confirmed → interrupting TTS');
-                    currentSession?.interrupt();
-                }
-                catch (err) {
-                    console.warn('⚠️ user-state interrupt failed:', err instanceof Error ? err.message : err);
-                }
+            catch (err) {
+                console.warn('⚠️ user-state interrupt failed:', err instanceof Error ? err.message : err);
             }
         }
         // When user stops speaking, retry voice queue — items may be waiting
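The decision made by the handler above (interrupt only when both sides are speaking, the session is not in realtime mode, and a remote participant was heard within the last 500 ms) can be sketched as a pure predicate. `shouldInterruptTts` and `REMOTE_SPEAKER_WINDOW_MS` are hypothetical names used for illustration; the conditions and the 500 ms threshold are taken from the diff, and the actual `interrupt({ force: true })` call is left to the caller.

```javascript
// Hypothetical predicate mirroring the self-echo guard in the handler above.
const REMOTE_SPEAKER_WINDOW_MS = 500;

function shouldInterruptTts({ newState, agentState, voiceMode, now, lastRemoteSpeakerAt }) {
    // Only consider interrupting when user and agent are both "speaking"
    // and the session is not in realtime mode.
    const bothSpeaking = newState === 'speaking' && agentState === 'speaking';
    if (!bothSpeaking || voiceMode === 'realtime') {
        return false;
    }
    // Self-echo guard: with no remote speaker heard inside the window, the
    // detected "speech" is treated as TTS bleeding through the microphone.
    return now - lastRemoteSpeakerAt <= REMOTE_SPEAKER_WINDOW_MS;
}
```

A handler would call this with `now: Date.now()` and the timestamp maintained by its ActiveSpeakersChanged listener, and interrupt only when it returns `true`.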
@@ -4284,6 +4371,40 @@ async function main() {
             console.error('leave-room room.disconnect failed:', e);
         }
     };
+    // bug-reporter skill hook — forwards a validated bug payload to the frontend
+    // via the LiveKit data channel. Frontend (which holds the Supabase keys for
+    // the existing log-upload flow) is responsible for the actual Supabase write.
+    // Enriches with the agent-side facts the frontend doesn't already have on
+    // hand (voice_mode + sandbox_id from FLY_MACHINE_ID — version it can read
+    // from /health, session_id it tracks via preSelectedSessionId).
+    bugReportHook = (reportId, payload) => {
+        const sandboxId = process.env.FLY_MACHINE_ID || null;
+        let osbornVersion;
+        try {
+            for (const rel of ['../package.json', '../../package.json']) {
+                try {
+                    const pkg = JSON.parse(readFileSync(join(__dirname, rel), 'utf8'));
+                    if (pkg.name === 'osborn' && pkg.version) {
+                        osbornVersion = pkg.version;
+                        break;
+                    }
+                }
+                catch { /* try next */ }
+            }
+        }
+        catch { /* version optional */ }
+        console.log(`🪲 Bug report ${reportId.slice(0, 8)} (${payload.type}/${payload.severity}): ${payload.title}`);
+        sendToFrontend({
+            type: 'bug_report',
+            reportId,
+            payload,
+            context: {
+                voice_mode: currentVoiceMode,
+                sandbox_id: sandboxId,
+                osborn_version: osbornVersion,
+            },
+        }).catch((e) => console.error('❌ bugReportHook sendToFrontend failed:', e));
+    };
     // Fire and forget; the retry loop keeps the process alive on its own (so
     // we don't need the explicit `new Promise(() => {})` keepalive anymore).
     // Errors that escape the retry loop should never happen, but if they do,