osborn 0.9.43 → 0.9.44

@@ -1,73 +1,145 @@
1
1
  # Skill: Meetings
2
2
 
3
- Silent note-taking and TODO-tracking when osborn is sitting in a live meeting.
3
+ Silent note-taking and TODO-tracking when osborn is sitting in a live meeting,
4
+ and explicit on-demand transcript pulls from Recall.ai when the user asks.
4
5
 
5
6
  ## When to use
6
7
 
7
- When a user message arrives with the prefix `[MEETING — <botId>]:` (every ~30 seconds while a Recall.ai meeting bot is active). Also use this skill when the orchestrating system injects `[SYSTEM] You are now in a meeting ...`.
8
+ Two trigger patterns:
8
9
 
9
- **Do NOT use this skill** for normal user messages in the voice-native chat — those still get spoken responses as usual.
10
+ **1. Auto-tagged meeting transcript chunks** (every ~30s while a Recall bot is active):
11
+ Any user message that starts with `[MEETING — <botId>]:`. Also a `[SYSTEM] You are now in a meeting ...` injection on bot join.
10
12
 
11
- ## How to behave
13
+ **2. Explicit user request to pull / write notes** (any of these keyphrases in voice-native chat):
14
+ - "grab the meeting transcripts"
15
+ - "pull the meeting transcripts"
16
+ - "fetch the meeting transcripts"
17
+ - "what was said in the meeting"
18
+ - "update the meeting notes"
19
+ - "compile the todos"
20
+ - "write the todos"
21
+ - "summarize the meeting"
22
+
23
+ **Do NOT use this skill** for normal user voice-native messages that don't fit those patterns — those get spoken responses as usual.
24
+
25
+ ## How to behave (auto-tagged chunks)
12
26
 
13
27
  For every `[MEETING — *]:` message:
14
28
 
15
- 1. **Do NOT speak.** No TTS output. No `tts_say`. No conversational reply. This is a silent observer path. If you must acknowledge that you processed the message, do it via a Write/Edit tool call (writing to the workspace), not via spoken or chat output.
16
- 2. **Update `meeting-todos.md`** in the session workspace (`{workspace}/meeting-todos.md`). Append new action items, decisions, and open questions as they emerge in the transcript. Do not rewrite existing entries unless something contradicts.
17
- 3. **Optionally trigger background research silently.** If a topic in the meeting would benefit from a quick web/code lookup, dispatch a researcher sub-agent via the Task tool. Save its output to `{workspace}/library/meeting-research-<topic-slug>.md`. Do NOT speak the result.
18
- 4. **Do not consume voice-native attention.** The user can still talk to you via the voice-native browser. When they do (a normal user message with no `[MEETING*]` prefix), respond normally — speak. Treat the meeting transcript as background context they can ask about ("what did Sarah say about pricing?" → answer normally).
29
+ 1. **Do NOT speak.** No TTS output. No conversational reply.
30
+ 2. **Update `meeting-todos.md`** in the session workspace. Append new action items, decisions, open questions. One file, evolving.
31
+ 3. **Optionally trigger background research silently** via Task tool.
32
+ 4. **Don't consume voice-native attention.** The user can interrupt with a voice-native message at any time; that's the only kind that gets spoken responses.
33
+
34
+ ## How to pull transcripts on demand (Bash + curl)
35
+
36
+ When the user explicitly asks (see triggers above), run these commands. Speak briefly first ("On it"), do the work, then speak the result.
37
+
38
+ ### Step 1: Get the bot ID
39
+
40
+ The bot ID is in `meeting-todos.md` on the `**Bot:**` line. If `meeting-todos.md` doesn't exist (user is asking about a meeting that already ended in a prior session), ask the user for the bot ID or meeting URL.
41
+
42
+ ### Step 2: Fetch the bot record
43
+
44
+ ```bash
45
+ curl -sS \
46
+ -H "Authorization: Token ${RECALL_API_KEY}" \
47
+ "https://us-west-2.recall.ai/api/v1/bot/<BOT_ID>"
48
+ ```
49
+
50
+ **CRITICAL**: The endpoint MUST be `us-west-2.recall.ai`, NOT the default `recall.ai` or `us-east-1.recall.ai`. The osborn account is provisioned in the us-west-2 region. Using the default endpoint returns 401 "OAuth authentication is currently not supported" or region-mismatch errors.
51
+
52
+ `${RECALL_API_KEY}` is preset in the agent's env — pass it through. Do NOT echo or print the raw key value in your response.
53
+
54
+ ### Step 3: Extract the transcript download URL
55
+
56
+ Parse the JSON response. The transcript's pre-signed S3 URL lives at:
57
+
58
+ ```
59
+ recordings[0].media_shortcuts.transcript.data.download_url
60
+ ```
61
+
62
+ Pipe through `jq` if needed:
63
+
64
+ ```bash
65
+ DOWNLOAD_URL=$(curl -sS \
66
+ -H "Authorization: Token ${RECALL_API_KEY}" \
67
+ "https://us-west-2.recall.ai/api/v1/bot/<BOT_ID>" \
68
+ | jq -r '.recordings[0].media_shortcuts.transcript.data.download_url')
69
+ ```
70
+
71
+ If `recordings[0]` doesn't exist yet, the meeting hasn't been processed — return "the recording isn't ready yet, give it a minute" and stop.
72
+
73
+ ### Step 4: Download the transcript JSON
74
+
75
+ ```bash
76
+ curl -sS "$DOWNLOAD_URL" -o /tmp/meeting-transcript.json
77
+ ```
78
+
79
+ The download URL is a pre-signed S3 link that **expires** (typically ~6 hours after issue). If you get a 403 or AccessDenied, re-fetch the bot record (step 2) to get a fresh URL.
80
+
81
+ ### Step 5: Parse and distill into meeting-todos.md
82
+
83
+ The transcript JSON is an array of turns. Each turn has `participant.name` and `words[]` (each word has `text` + `start_timestamp.relative`). Concatenate words per turn to get the utterance.
84
+
85
+ Use `jq` to pull turns into readable lines:
86
+
87
+ ```bash
88
+ jq -r '.[] | "\(.participant.name // "Unknown"): \(.words | map(.text) | join(" "))"' /tmp/meeting-transcript.json
89
+ ```
90
+
91
+ Then update `meeting-todos.md` — distill into TODOs / Decisions / Open Questions sections. Don't paste the whole transcript verbatim into the file; summarize.
19
92
 
20
93
  ## The `meeting-todos.md` file
21
94
 
95
+ Path: `{session_workspace}/meeting-todos.md` — get the workspace path from spec.md or from the `[SYSTEM]` injection.
96
+
22
97
  Keep it scannable. Structure:
23
98
 
24
99
  ```markdown
25
100
  # Meeting Notes
26
101
 
27
- **Bot:** <botId> · **Started:** <ISO timestamp>
102
+ **Bot:** <botId>
103
+ **Started:** <ISO timestamp>
104
+ **URL:** <meeting URL>
28
105
 
29
- ## TODOs
106
+ ## Summary
107
+ <3-5 sentences distilling the meeting after it ends — added LAST>
30
108
 
109
+ ## TODOs
31
110
  - [ ] <person>: <action item> — <context>
32
- - [ ] <person>: <action item>
33
111
 
34
112
  ## Decisions
35
-
36
- - <date/time> — <what was decided> (raised by <person>)
113
+ - <what was decided> (raised by <person>)
37
114
 
38
115
  ## Open Questions
39
-
40
116
  - <question> — raised by <person>, still unresolved
41
- - <question> — answered by <person>: <answer>
42
117
 
43
118
  ## Highlights
44
-
45
119
  - <key moment or quote worth surfacing>
46
120
  ```
47
121
 
48
- Update the same file across multiple poll cycles; don't create `meeting-todos-1.md`, `meeting-todos-2.md`. One file, evolving.
49
-
50
- ## Workspace path
51
-
52
- The session workspace is `~/.claude/projects/<slug>/osb/<session-uuid>/`. Read the env variable or the spec.md header if you need to confirm the exact path. Write absolute paths in tool calls (e.g. `/Users/<user>/.claude/projects/.../osb/<uuid>/meeting-todos.md`).
122
+ Update the same file across all updates: one file, evolving. Don't create `meeting-todos-1.md`, `meeting-todos-2.md`.
53
123
 
54
124
  ## On meeting end
55
125
 
56
- When the user leaves the meeting (the system stops sending `[MEETING — *]:` messages and may inject `[SYSTEM] meeting ended`), do a final pass on `meeting-todos.md` to:
57
- - Mark items the user has clearly committed to
58
- - Move resolved open questions to a `## Resolved` section
59
- - Add a `## Summary` section at the top with 3-5 lines distilling the meeting
60
-
61
- Still silent. The user will ask out loud if they want a recap.
126
+ When `[MEETING — *]:` messages stop OR the system says `[SYSTEM] meeting ended`:
127
+ - Pull the full final transcript (steps 2-4 above)
128
+ - Add a `## Summary` section at the top with 3-5 lines
129
+ - Mark resolved open questions
130
+ - The next user voice-native question may be "what was the meeting about?" — answer normally (speak) from the updated file
62
131
 
63
- ## When the user asks about the meeting
132
+ ## When the user asks about the meeting in voice-native
64
133
 
65
- When a non-meeting-tagged message references the meeting ("what's on the todo list?", "what did we decide about X?", "who's handling Y?"), respond normally — speak. Read `meeting-todos.md` first to ground the response. Don't make up speaker names or decisions; only state what's recorded.
134
+ When a non-meeting-tagged voice message references the meeting ("what's on the todo list?", "what did we decide about X?"), respond normally — **speak** the answer. Read `meeting-todos.md` first to ground the response. If `meeting-todos.md` is empty or missing relevant detail, pull a fresh transcript first (steps 2-4) and update the file, then answer.
66
135
 
67
136
  ## Anti-patterns
68
137
 
69
- - ❌ Speaking in response to a `[MEETING*]:` message
70
- - ❌ Creating a new file per poll cycle instead of updating one
71
- - ❌ Trying to drive the meeting (don't add "we should..." items unless someone in the meeting said them)
72
- - ❌ Asking the user clarifying questions during the meeting; they're not paying attention to chat
73
- - ❌ Re-transcribing what's in the message into the TODO file verbatim. Distill.
138
+ - ❌ Using `recall.ai` or `us-east-1.recall.ai`; always use `us-west-2.recall.ai`
139
+ - ❌ Using `WebFetch` for the S3 download URL; use `curl` via `Bash` (the URL has unusual characters and pre-signed query strings that confuse WebFetch)
140
+ - ❌ Pasting the full raw transcript into `meeting-todos.md`
141
+ - ❌ Speaking in response to `[MEETING*]:` messages
142
+ - ❌ Asking clarifying questions during a live meeting
143
+ - ❌ Creating a new file per pull instead of updating one
144
+ - ❌ Re-pulling the bot record over and over inside one user turn — fetch once, parse once
145
+ - ❌ Echoing or printing `${RECALL_API_KEY}` value in your response
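The per-turn formatting in Step 5 of the skill file can also be sketched in JavaScript, the package's own language, for readers who want to see what the `jq` one-liner does. This is a minimal sketch, not code from the package: `formatTurns` and `sampleTurns` are hypothetical, and the turn shape (`participant.name`, `words[].text`) is taken only from the description in Step 5.

```javascript
// Hypothetical helper (not part of osborn): mirrors the jq one-liner in Step 5.
// Assumes each turn looks like { participant: { name }, words: [{ text }] }.
function formatTurns(turns) {
  if (!Array.isArray(turns)) return [];
  return turns.map((turn) => {
    // Fall back to "Unknown" when the speaker name is missing or null.
    const name = turn?.participant?.name ?? 'Unknown';
    const words = Array.isArray(turn?.words) ? turn.words : [];
    // Join the word texts with single spaces to rebuild the utterance.
    return `${name}: ${words.map((w) => w.text).join(' ')}`;
  });
}

// Made-up sample data for illustration only.
const sampleTurns = [
  { participant: { name: 'Sarah' }, words: [{ text: 'Ship' }, { text: 'it' }, { text: 'Friday.' }] },
  { participant: { name: null }, words: [{ text: 'Agreed.' }] },
];
console.log(formatTurns(sampleTurns).join('\n'));
```

The output is one `Speaker: utterance` line per turn, which is the form the skill then distills into the TODOs, Decisions, and Open Questions sections rather than pasting verbatim.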
@@ -84,20 +84,21 @@ export declare class RecallClient extends EventEmitter {
84
84
  }): Promise<string>;
85
85
  /**
86
86
  * Fetch the bot's current transcript. Returns an array of "transcript turns"
87
- * (each turn = one speaker's utterance) sorted by start time. Use the bot's
88
- * `recordings[0].id` from getBotStatus / bot record to locate the recording,
89
- * then list its transcripts.
87
+ * (each turn = one speaker's utterance) sorted by start time.
90
88
  *
91
- * Per Recall docs:
92
- * GET /api/v1/bot/{bot_id} → bot record incl. `recordings: [...]`
93
- * GET /api/v1/transcript/{transcript_id} → transcript with download_url
94
- * Download the transcript JSON from download_url to get the actual content.
89
+ * Verified 2026-05-22 against the real us-west-2 API: there is NO simple
90
+ * `GET /bot/{id}/transcript` convenience endpoint. The actual chain is:
95
91
  *
96
- * For the polling use case (called every ~30s), we use the simpler combined
97
- * endpoint: `GET /api/v1/bot/{bot_id}/transcript` which Recall exposes as a
98
- * convenience and returns the full transcript so far in one call. The caller
99
- * is responsible for de-duping (keeping a since-cursor) so the LLM only sees
100
- * new turns.
92
+ * 1. GET /api/v1/bot/{bot_id}
93
+ * 2. recordings[0].media_shortcuts.transcript.data.download_url (S3 signed URL)
94
+ * 3. GET that URL → JSON array of TranscriptTurn objects
95
+ *
96
+ * The S3 URL is pre-signed and expires (~6h). Re-fetch step 1 each poll;
97
+ * don't cache the URL.
98
+ *
99
+ * If `recordings[0]` doesn't exist yet (bot still joining or pre-recording),
100
+ * returns []. Caller (MeetingTranscriptPoller) treats that as "no new turns
101
+ * yet" and waits for the next tick.
101
102
  */
102
103
  getTranscript(botId: string): Promise<TranscriptTurn[]>;
103
104
  leaveMeeting(botId: string): Promise<void>;
@@ -66,30 +66,43 @@ export class RecallClient extends EventEmitter {
66
66
  }
67
67
  /**
68
68
  * Fetch the bot's current transcript. Returns an array of "transcript turns"
69
- * (each turn = one speaker's utterance) sorted by start time. Use the bot's
70
- * `recordings[0].id` from getBotStatus / bot record to locate the recording,
71
- * then list its transcripts.
69
+ * (each turn = one speaker's utterance) sorted by start time.
72
70
  *
73
- * Per Recall docs:
74
- * GET /api/v1/bot/{bot_id} → bot record incl. `recordings: [...]`
75
- * GET /api/v1/transcript/{transcript_id} → transcript with download_url
76
- * Download the transcript JSON from download_url to get the actual content.
71
+ * Verified 2026-05-22 against the real us-west-2 API: there is NO simple
72
+ * `GET /bot/{id}/transcript` convenience endpoint. The actual chain is:
77
73
  *
78
- * For the polling use case (called every ~30s), we use the simpler combined
79
- * endpoint: `GET /api/v1/bot/{bot_id}/transcript` which Recall exposes as a
80
- * convenience and returns the full transcript so far in one call. The caller
81
- * is responsible for de-duping (keeping a since-cursor) so the LLM only sees
82
- * new turns.
74
+ * 1. GET /api/v1/bot/{bot_id}
75
+ * 2. recordings[0].media_shortcuts.transcript.data.download_url (S3 signed URL)
76
+ * 3. GET that URL → JSON array of TranscriptTurn objects
77
+ *
78
+ * The S3 URL is pre-signed and expires (~6h). Re-fetch step 1 each poll;
79
+ * don't cache the URL.
80
+ *
81
+ * If `recordings[0]` doesn't exist yet (bot still joining or pre-recording),
82
+ * returns []. Caller (MeetingTranscriptPoller) treats that as "no new turns
83
+ * yet" and waits for the next tick.
83
84
  */
84
85
  async getTranscript(botId) {
85
- const res = await fetch(`${RECALL_BASE_URL}/bot/${botId}/transcript`, {
86
+ const botRes = await fetch(`${RECALL_BASE_URL}/bot/${botId}`, {
86
87
  headers: { 'Authorization': `Token ${this.#apiKey}` },
87
88
  });
88
- if (!res.ok) {
89
- const err = await res.text().catch(() => '');
90
- throw new Error(`Recall.ai transcript fetch failed: ${res.status} ${err.substring(0, 200)}`);
89
+ if (!botRes.ok) {
90
+ const err = await botRes.text().catch(() => '');
91
+ throw new Error(`Recall.ai bot fetch failed: ${botRes.status} ${err.substring(0, 200)}`);
92
+ }
93
+ const bot = await botRes.json();
94
+ const downloadUrl = bot.recordings?.[0]?.media_shortcuts?.transcript?.data?.download_url;
95
+ if (!downloadUrl) {
96
+ // Recording / transcript not ready yet — pre-call, just-joined, or
97
+ // recording_done event hasn't fired. Empty result is expected here.
98
+ return [];
99
+ }
100
+ const txRes = await fetch(downloadUrl);
101
+ if (!txRes.ok) {
102
+ const err = await txRes.text().catch(() => '');
103
+ throw new Error(`Recall.ai transcript download failed: ${txRes.status} ${err.substring(0, 200)}`);
91
104
  }
92
- const turns = await res.json();
105
+ const turns = await txRes.json();
93
106
  return Array.isArray(turns) ? turns : [];
94
107
  }
95
108
  async leaveMeeting(botId) {
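Because the new `getTranscript` downloads the whole transcript on every call, the caller still has to work out which turns are new. `MeetingTranscriptPoller` is not part of this diff, so the following is only a sketch of one possible approach, assuming the first word of each turn carries `start_timestamp.relative` as a number of seconds (the field named in the skill file). `turnStart`, `selectNewTurns`, and the cursor convention are hypothetical.

```javascript
// Hypothetical sketch (not osborn's MeetingTranscriptPoller): keep a cursor
// holding the start time of the latest turn already forwarded, and return
// only turns that begin after it.
// Assumes words[0].start_timestamp.relative is a number (seconds).
function turnStart(turn) {
  const t = turn?.words?.[0]?.start_timestamp?.relative;
  return typeof t === 'number' ? t : null;
}

function selectNewTurns(turns, cursor = -1) {
  const fresh = [];
  let nextCursor = cursor;
  for (const turn of Array.isArray(turns) ? turns : []) {
    const start = turnStart(turn);
    // Skip turns with no usable timestamp and turns already forwarded.
    if (start === null || start <= cursor) continue;
    fresh.push(turn);
    if (start > nextCursor) nextCursor = start;
  }
  return { fresh, cursor: nextCursor };
}

// Made-up data: the second poll returns the full transcript so far.
const poll1 = [{ words: [{ text: 'Hi', start_timestamp: { relative: 1.5 } }] }];
const poll2 = [...poll1, { words: [{ text: 'Next', start_timestamp: { relative: 9.0 } }] }];

const first = selectNewTurns(poll1);
const second = selectNewTurns(poll2, first.cursor);
console.log(first.fresh.length, second.fresh.length, second.cursor);
```

An empty array (the "recording not ready" case above) leaves the cursor unchanged, so the next tick simply tries again. One limitation of this sketch: a turn that keeps growing after it was first forwarded would not be re-sent, so a real poller might need a per-word cursor instead.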
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "osborn",
3
- "version": "0.9.43",
3
+ "version": "0.9.44",
4
4
  "description": "Voice AI coding assistant - local agent that connects to Osborn frontend",
5
5
  "type": "module",
6
6
  "bin": {