npm - osborn - Versions diffs - 0.9.43 → 0.9.44 - Mend

osborn 0.9.43 → 0.9.44

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (4)

package/.claude/skills/meetings/SKILL.md +106 -34
package/dist/recall-client.d.ts +13 -12
package/dist/recall-client.js +30 -17
package/package.json +1 -1

package/.claude/skills/meetings/SKILL.md CHANGED

@@ -1,73 +1,145 @@
 # Skill: Meetings
-Silent note-taking and TODO-tracking when osborn is sitting in a live meeting.
+Silent note-taking and TODO-tracking when osborn is sitting in a live meeting,
+and explicit on-demand transcript pulls from Recall.ai when the user asks.
 ## When to use
-When a user message arrives with the prefix `[MEETING — <botId>]:` (every ~30 seconds while a Recall.ai meeting bot is active). Also use this skill when the orchestrating system injects `[SYSTEM] You are now in a meeting ...`.
+Two trigger patterns:
-**Do NOT use this skill** for normal user messages in the voice-native chat — those still get spoken responses as usual.
+**1. Auto-tagged meeting transcript chunks** (every ~30s while a Recall bot is active):
+   Any user message that starts with `[MEETING — <botId>]:`. Also a `[SYSTEM] You are now in a meeting ...` injection on bot join.
-## How to behave
+**2. Explicit user request to pull / write notes** (any of these keyphrases in voice-native chat):
+   - "grab the meeting transcripts"
+   - "pull the meeting transcripts"
+   - "fetch the meeting transcripts"
+   - "what was said in the meeting"
+   - "update the meeting notes"
+   - "compile the todos"
+   - "write the todos"
+   - "summarize the meeting"
+**Do NOT use this skill** for normal user voice-native messages that don't fit those patterns — those get spoken responses as usual.
+## How to behave (auto-tagged chunks)
 For every `[MEETING — *]:` message:
-1. **Do NOT speak.** No TTS output. No `tts_say`. No conversational reply. This is a silent observer path. If you must acknowledge that you processed the message, do it via a Write/Edit tool call (writing to the workspace), not via spoken or chat output.
-2. **Update `meeting-todos.md`** in the session workspace (`{workspace}/meeting-todos.md`). Append new action items, decisions, and open questions as they emerge in the transcript. Do not rewrite existing entries unless something contradicts.
-3. **Optionally trigger background research silently.** If a topic in the meeting would benefit from a quick web/code lookup, dispatch a researcher sub-agent via the Task tool. Save its output to `{workspace}/library/meeting-research-<topic-slug>.md`. Do NOT speak the result.
-4. **Do not consume voice-native attention.** The user can still talk to you via the voice-native browser. When they do (a normal user message with no `[MEETING — *]` prefix), respond normally — speak. Treat the meeting transcript as background context they can ask about ("what did Sarah say about pricing?" → answer normally).
+1. **Do NOT speak.** No TTS output. No conversational reply.
+2. **Update `meeting-todos.md`** in the session workspace. Append new action items, decisions, open questions. One file, evolving.
+3. **Optionally trigger background research silently** via Task tool.
+4. **Don't consume voice-native attention.** The user can interrupt with a voice-native message at any time — that's the only kind that gets spoken responses.
+## How to pull transcripts on demand (Bash + curl)
+When the user explicitly asks (see triggers above), run these commands. Speak briefly first ("On it"), do the work, then speak the result.
+### Step 1: Get the bot ID
+The bot ID is in `meeting-todos.md` on the `**Bot:**` line. If `meeting-todos.md` doesn't exist (user is asking about a meeting that already ended in a prior session), ask the user for the bot ID or meeting URL.
+### Step 2: Fetch the bot record
+```bash
+curl -sS \
+  -H "Authorization: Token ${RECALL_API_KEY}" \
+  "https://us-west-2.recall.ai/api/v1/bot/<BOT_ID>"
+```
+**CRITICAL**: The endpoint MUST be `us-west-2.recall.ai`, NOT the default `recall.ai` or `us-east-1.recall.ai`. The osborn account is provisioned in the us-west-2 region. Using the default endpoint returns 401 "OAuth authentication is currently not supported" or region-mismatch errors.
+`${RECALL_API_KEY}` is preset in the agent's env — pass it through. Do NOT echo or print the raw key value in your response.
+### Step 3: Extract the transcript download URL
+Parse the JSON response. The transcript's pre-signed S3 URL lives at:
+```
+recordings[0].media_shortcuts.transcript.data.download_url
+```
+Pipe through `jq` if needed:
+```bash
+DOWNLOAD_URL=$(curl -sS \
+  -H "Authorization: Token ${RECALL_API_KEY}" \
+  "https://us-west-2.recall.ai/api/v1/bot/<BOT_ID>" \
+  | jq -r '.recordings[0].media_shortcuts.transcript.data.download_url')
+```
+If `recordings[0]` doesn't exist yet, the meeting hasn't been processed — return "the recording isn't ready yet, give it a minute" and stop.
+### Step 4: Download the transcript JSON
+```bash
+curl -sS "$DOWNLOAD_URL" -o /tmp/meeting-transcript.json
+```
+The download URL is a pre-signed S3 link that **expires** (typically ~6 hours after issue). If you get a 403 or AccessDenied, re-fetch the bot record (step 2) to get a fresh URL.
+### Step 5: Parse and distill into meeting-todos.md
+The transcript JSON is an array of turns. Each turn has `participant.name` and `words[]` (each word has `text` + `start_timestamp.relative`). Concatenate words per turn to get the utterance.
+Use `jq` to pull turns into readable lines:
+```bash
+jq -r '.[] | "\(.participant.name // "Unknown"): \(.words | map(.text) | join(" "))"' /tmp/meeting-transcript.json
+```
+Then update `meeting-todos.md` — distill into TODOs / Decisions / Open Questions sections. Don't paste the whole transcript verbatim into the file; summarize.
 ## The `meeting-todos.md` file
+Path: `{session_workspace}/meeting-todos.md` — get the workspace path from spec.md or from the `[SYSTEM]` injection.
 Keep it scannable. Structure:
 ```markdown
 # Meeting Notes
-**Bot:** <botId> · **Started:** <ISO timestamp>
+**Bot:** <botId>
+**Started:** <ISO timestamp>
+**URL:** <meeting URL>
-## TODOs
+## Summary
+<3-5 sentences distilling the meeting after it ends — added LAST>
+## TODOs
 - [ ] <person>: <action item> — <context>
-- [ ] <person>: <action item>
 ## Decisions
-- <date/time> — <what was decided> (raised by <person>)
+- <what was decided> (raised by <person>)
 ## Open Questions
 - <question> — raised by <person>, still unresolved
-- <question> — answered by <person>: <answer>
 ## Highlights
 - <key moment or quote worth surfacing>
 ```
-Update the same file across multiple poll cycles — don't create `meeting-todos-1.md`, `meeting-todos-2.md`. One file, evolving.
-## Workspace path
-The session workspace is `~/.claude/projects/<slug>/osb/<session-uuid>/`. Read the env variable or the spec.md header if you need to confirm the exact path. Write absolute paths in tool calls (e.g. `/Users/<user>/.claude/projects/.../osb/<uuid>/meeting-todos.md`).
+Update the same file across all updates — one file, evolving. Don't create `meeting-todos-1.md`, `meeting-todos-2.md`.
 ## On meeting end
-When the user leaves the meeting (the system stops sending `[MEETING — *]:` messages and may inject `[SYSTEM] meeting ended`), do a final pass on `meeting-todos.md` to:
-- Mark items the user has clearly committed to
-- Move resolved open questions to a `## Resolved` section
-- Add a `## Summary` section at the top with 3-5 lines distilling the meeting
-Still silent. The user will ask out loud if they want a recap.
+When `[MEETING — *]:` messages stop OR the system says `[SYSTEM] meeting ended`:
+- Pull the full final transcript (step 2-4 above)
+- Add a `## Summary` section at the top with 3-5 lines
+- Mark resolved open questions
+- The next user voice-native question may be "what was the meeting about?" — answer normally (speak) from the updated file
-## When the user asks about the meeting
+## When the user asks about the meeting in voice-native
-When a non-meeting-tagged message references the meeting ("what's on the todo list?", "what did we decide about X?", "who's handling Y?"), respond normally — speak. Read `meeting-todos.md` first to ground the response. Don't make up speaker names or decisions; only state what's recorded.
+When a non-meeting-tagged voice message references the meeting ("what's on the todo list?", "what did we decide about X?"), respond normally — **speak** the answer. Read `meeting-todos.md` first to ground the response. If `meeting-todos.md` is empty or missing relevant detail, pull a fresh transcript first (steps 2-4) and update the file, then answer.
 ## Anti-patterns
-- ❌ Speaking in response to a `[MEETING — *]:` message
-- ❌ Creating a new file per poll cycle instead of updating one
-- ❌ Trying to drive the meeting (don't add "we should..." items unless someone in the meeting said them)
-- ❌ Asking the user clarifying questions during the meeting — they're not paying attention to chat
-- ❌ Re-transcribing what's in the message into the TODO file verbatim. Distill.
+- ❌ Using `recall.ai` or `us-east-1.recall.ai` — always `us-west-2.recall.ai`
+- ❌ Using `WebFetch` for the S3 download URL — use `curl` via `Bash` (the URL has weird chars + pre-signed query strings that confuse WebFetch)
+- ❌ Pasting the full raw transcript into `meeting-todos.md`
+- ❌ Speaking in response to `[MEETING — *]:` messages
+- ❌ Asking clarifying questions during a live meeting
+- ❌ Creating a new file per pull instead of updating one
+- ❌ Re-pulling the bot record over and over inside one user turn — fetch once, parse once
+- ❌ Echoing or printing `${RECALL_API_KEY}` value in your response
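The Step 5 flattening in the skill file above can also be sketched in JavaScript, for readers who want the same result without `jq`. This is a minimal sketch, assuming the turn shape the skill file describes (`participant.name` plus a `words[]` array whose entries carry `text`). `formatTurns` is a hypothetical helper name and is not part of the osborn package; the sample data is made up.

```javascript
// Hypothetical helper (not in the osborn package): mirrors the jq filter
// from Step 5 by turning each transcript turn into "Speaker: utterance".
function formatTurns(turns) {
  // A non-array payload yields no lines instead of throwing.
  if (!Array.isArray(turns)) return [];
  return turns.map((turn) => {
    // Fall back to "Unknown" when the name is null or missing.
    const speaker = turn?.participant?.name ?? "Unknown";
    // Unlike the jq filter, a missing words[] is treated as an empty utterance.
    const words = Array.isArray(turn?.words) ? turn.words : [];
    return `${speaker}: ${words.map((w) => w.text).join(" ")}`;
  });
}

// Made-up sample turns, for illustration only.
const sampleTurns = [
  { participant: { name: "Sarah" }, words: [{ text: "Ship" }, { text: "Friday" }] },
  { participant: {}, words: [{ text: "Agreed" }] },
];
console.log(formatTurns(sampleTurns).join("\n"));
```

The output is one line per turn, which is a convenient input for distilling TODOs, decisions, and open questions rather than pasting the raw transcript.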

package/dist/recall-client.d.ts CHANGED

@@ -84,20 +84,21 @@ export declare class RecallClient extends EventEmitter {
     }): Promise<string>;
     /**
      * Fetch the bot's current transcript. Returns an array of "transcript turns"
-     * (each turn = one speaker's utterance) sorted by start time. Use the bot's
-     * `recordings[0].id` from getBotStatus / bot record to locate the recording,
-     * then list its transcripts.
+     * (each turn = one speaker's utterance) sorted by start time.
      *
-     * Per Recall docs:
-     *   GET /api/v1/bot/{bot_id} → bot record incl. `recordings: [...]`
-     *   GET /api/v1/transcript/{transcript_id} → transcript with download_url
-     *   Download the transcript JSON from download_url to get the actual content.
+     * Verified 2026-05-22 against the real us-west-2 API: there is NO simple
+     * `GET /bot/{id}/transcript` convenience endpoint. The actual chain is:
      *
-     * For the polling use case (called every ~30s), we use the simpler combined
-     * endpoint: `GET /api/v1/bot/{bot_id}/transcript` which Recall exposes as a
-     * convenience and returns the full transcript so far in one call. The caller
-     * is responsible for de-duping (keeping a since-cursor) so the LLM only sees
-     * new turns.
+     *   1. GET /api/v1/bot/{bot_id}
+     *   2. recordings[0].media_shortcuts.transcript.data.download_url   (S3 signed URL)
+     *   3. GET that URL  →  JSON array of TranscriptTurn objects
+     *
+     * The S3 URL is pre-signed and expires (~6h). Re-fetch step 1 each poll;
+     * don't cache the URL.
+     *
+     * If `recordings[0]` doesn't exist yet (bot still joining or pre-recording),
+     * returns []. Caller (MeetingTranscriptPoller) treats that as "no new turns
+     * yet" and waits for the next tick.
      */
     getTranscript(botId: string): Promise<TranscriptTurn[]>;
     leaveMeeting(botId: string): Promise<void>;
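Because `getTranscript` returns the whole transcript on every call, a polling caller needs some way to pick out only the turns it has not seen. The implementation of `MeetingTranscriptPoller` is not part of this diff, so the following is only a sketch of one possible cursor scheme. It assumes each word carries `start_timestamp.relative` (as the skill file describes) and that the value grows over the course of the meeting; `turnStart` and `selectNewTurns` are hypothetical names.

```javascript
// Hypothetical helpers (not in the osborn package).

// Start time of a turn, taken from its first word. Turns without a
// usable timestamp get -Infinity so they are never reported as new.
function turnStart(turn) {
  const rel = turn?.words?.[0]?.start_timestamp?.relative;
  return typeof rel === "number" ? rel : Number.NEGATIVE_INFINITY;
}

// Return the turns that start after `cursor`, plus the cursor to use on
// the next poll. Pass -1 as the initial cursor so a turn at 0 is included.
function selectNewTurns(turns, cursor) {
  const list = Array.isArray(turns) ? turns : [];
  const fresh = list.filter((turn) => turnStart(turn) > cursor);
  const nextCursor = fresh.reduce((max, turn) => Math.max(max, turnStart(turn)), cursor);
  return { fresh, nextCursor };
}
```

An empty array from `getTranscript` leaves the cursor unchanged, which matches the "no new turns yet" behaviour the doc comment describes. One limitation of this scheme: a turn that keeps growing after it was first seen is not reported again.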

package/dist/recall-client.js CHANGED

@@ -66,30 +66,43 @@ export class RecallClient extends EventEmitter {
     }
     /**
      * Fetch the bot's current transcript. Returns an array of "transcript turns"
-     * (each turn = one speaker's utterance) sorted by start time. Use the bot's
-     * `recordings[0].id` from getBotStatus / bot record to locate the recording,
-     * then list its transcripts.
+     * (each turn = one speaker's utterance) sorted by start time.
      *
-     * Per Recall docs:
-     *   GET /api/v1/bot/{bot_id} → bot record incl. `recordings: [...]`
-     *   GET /api/v1/transcript/{transcript_id} → transcript with download_url
-     *   Download the transcript JSON from download_url to get the actual content.
+     * Verified 2026-05-22 against the real us-west-2 API: there is NO simple
+     * `GET /bot/{id}/transcript` convenience endpoint. The actual chain is:
      *
-     * For the polling use case (called every ~30s), we use the simpler combined
-     * endpoint: `GET /api/v1/bot/{bot_id}/transcript` which Recall exposes as a
-     * convenience and returns the full transcript so far in one call. The caller
-     * is responsible for de-duping (keeping a since-cursor) so the LLM only sees
-     * new turns.
+     *   1. GET /api/v1/bot/{bot_id}
+     *   2. recordings[0].media_shortcuts.transcript.data.download_url   (S3 signed URL)
+     *   3. GET that URL  →  JSON array of TranscriptTurn objects
+     *
+     * The S3 URL is pre-signed and expires (~6h). Re-fetch step 1 each poll;
+     * don't cache the URL.
+     *
+     * If `recordings[0]` doesn't exist yet (bot still joining or pre-recording),
+     * returns []. Caller (MeetingTranscriptPoller) treats that as "no new turns
+     * yet" and waits for the next tick.
      */
     async getTranscript(botId) {
-        const res = await fetch(`${RECALL_BASE_URL}/bot/${botId}/transcript`, {
+        const botRes = await fetch(`${RECALL_BASE_URL}/bot/${botId}`, {
             headers: { 'Authorization': `Token ${this.#apiKey}` },
         });
-        if (!res.ok) {
-            const err = await res.text().catch(() => '');
-            throw new Error(`Recall.ai transcript fetch failed: ${res.status} ${err.substring(0, 200)}`);
+        if (!botRes.ok) {
+            const err = await botRes.text().catch(() => '');
+            throw new Error(`Recall.ai bot fetch failed: ${botRes.status} ${err.substring(0, 200)}`);
+        }
+        const bot = await botRes.json();
+        const downloadUrl = bot.recordings?.[0]?.media_shortcuts?.transcript?.data?.download_url;
+        if (!downloadUrl) {
+            // Recording / transcript not ready yet — pre-call, just-joined, or
+            // recording_done event hasn't fired. Empty result is expected here.
+            return [];
+        }
+        const txRes = await fetch(downloadUrl);
+        if (!txRes.ok) {
+            const err = await txRes.text().catch(() => '');
+            throw new Error(`Recall.ai transcript download failed: ${txRes.status} ${err.substring(0, 200)}`);
         }
-        const turns = await res.json();
+        const turns = await txRes.json();
         return Array.isArray(turns) ? turns : [];
     }
     async leaveMeeting(botId) {
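The "not ready yet" branch in the new `getTranscript` depends entirely on the optional-chaining lookup, so it can be checked in isolation without any network call. The sketch below pulls that lookup into a hypothetical helper, `extractTranscriptUrl`, which is not part of the package. The bot records are made-up objects containing only the fields named in the diff, and the URL is a placeholder.

```javascript
// Hypothetical helper (not in the osborn package): same lookup as in
// getTranscript, returning null whenever any link in the chain is missing
// or the URL is empty.
function extractTranscriptUrl(bot) {
  return bot?.recordings?.[0]?.media_shortcuts?.transcript?.data?.download_url || null;
}

// Made-up bot records covering the cases the code comment lists.
const justJoined = { recordings: [] };
const recordingWithoutTranscript = { recordings: [{ media_shortcuts: {} }] };
const ready = {
  recordings: [
    {
      media_shortcuts: {
        // Placeholder URL, not a real pre-signed link.
        transcript: { data: { download_url: "https://example.com/transcript.json" } },
      },
    },
  ],
};

console.log(extractTranscriptUrl(justJoined));
console.log(extractTranscriptUrl(recordingWithoutTranscript));
console.log(extractTranscriptUrl(ready));
```

Only the last record yields a URL; for the first two, the real method would return `[]` and the caller would wait for the next tick.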

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "osborn",
-  "version": "0.9.43",
+  "version": "0.9.44",
   "description": "Voice AI coding assistant - local agent that connects to Osborn frontend",
   "type": "module",
   "bin": {