openbuilder 0.1.0

package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2026 OpenBuilder Contributors
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,309 @@
1
+ # OpenBuilder
2
+
3
+ **Open-source AI meeting assistant** — a Read AI alternative that joins your Google Meet meetings, captures live transcripts, and generates AI-powered meeting reports.
4
+
5
+ Get meeting summaries, action items, key decisions, and speaker analytics — all from your terminal.
6
+
7
+ ## Features
8
+
9
+ - **Audio Capture + Whisper** — Captures browser audio via PulseAudio virtual sink, transcribes with OpenAI Whisper (no captions needed)
10
+ - **Caption Scraping Fallback** — DOM-based caption capture for systems without PulseAudio/ffmpeg
11
+ - **Google Meet Bot** — Headless Chromium bot joins meetings automatically
12
+ - **AI Meeting Reports** — Post-meeting analysis with summaries, chapters, action items, decisions, and questions
13
+ - **Speaker Analytics** — Talk time, word count, and participation percentages per speaker
14
+ - **Multiple AI Providers** — Claude (Anthropic) and OpenAI supported; bring your own API key
15
+ - **Standalone Analysis** — Run `summarize` or `report` on any transcript file, not just live meetings
16
+ - **OpenClaw Skill** — Integrates as a skill for OpenClaw agents
17
+ - **Privacy-First** — Your data stays local; AI processing only happens with your own API keys
18
+
19
+ ## Quick Start
20
+
21
+ ### 1. Install
22
+
23
+ ```bash
24
+ npx openbuilder
25
+ ```
26
+
27
+ This installs the OpenClaw skill and the Chromium browser.
28
+
29
+ ### 2. Configure AI (Optional but Recommended)
30
+
31
+ ```bash
32
+ # Set your AI provider API key for meeting reports
33
+ npx openbuilder config set anthropicApiKey sk-ant-your-key-here
34
+ # Or use OpenAI instead
35
+ npx openbuilder config set openaiApiKey sk-your-key-here
36
+ npx openbuilder config set aiProvider openai
37
+ ```
38
+
39
+ Or use environment variables:
40
+
41
+ ```bash
42
+ export ANTHROPIC_API_KEY=sk-ant-your-key-here
43
+ # or
44
+ export OPENAI_API_KEY=sk-your-key-here
45
+ ```
46
+
47
+ ### 3. Join a Meeting
48
+
49
+ ```bash
50
+ # As a guest (host must admit)
51
+ npx openbuilder join https://meet.google.com/abc-defg-hij --anon --bot-name "Meeting Bot"
52
+
53
+ # With Google authentication (no admission needed)
54
+ npx openbuilder auth # one-time setup
55
+ npx openbuilder join https://meet.google.com/abc-defg-hij --auth
56
+ ```
57
+
58
+ ### 4. Get Results
59
+
60
+ ```bash
61
+ # View transcript
62
+ npx openbuilder transcript
63
+
64
+ # Quick AI summary
65
+ npx openbuilder summarize
66
+
67
+ # Full meeting report (summary + action items + decisions + analytics)
68
+ npx openbuilder report
69
+ ```
70
+
71
+ ## Commands
72
+
73
+ | Command | Description |
74
+ |---------|-------------|
75
+ | `openbuilder` / `openbuilder install` | Install skill + Chromium |
76
+ | `openbuilder join <url> [options]` | Join a Google Meet meeting |
77
+ | `openbuilder auth` | Save Google session for authenticated joins |
78
+ | `openbuilder transcript [--last N]` | Print the latest transcript |
79
+ | `openbuilder screenshot` | Take an on-demand screenshot |
80
+ | `openbuilder summarize [path]` | AI summary of a transcript |
81
+ | `openbuilder report [path]` | Full AI meeting report |
82
+ | `openbuilder config [subcommand]` | Manage configuration |
83
+ | `openbuilder help` | Show help |
84
+
85
+ ### Join Options
86
+
87
+ ```
88
+ --auth Join using saved Google account
89
+ --anon Join as a guest (requires --bot-name)
90
+ --bot-name Guest display name
91
+ --duration Auto-leave after duration (30m, 1h, etc.)
92
+ --audio Force audio capture mode (PulseAudio + Whisper)
93
+ --captions Force caption scraping mode (DOM-based fallback)
94
+ --headed Show browser window (debugging)
95
+ --camera Join with camera on (default: off)
96
+ --mic Join with microphone on (default: off)
97
+ --no-report Skip auto-report after meeting ends
98
+ --verbose Show real-time transcript output
99
+ ```
100
+
101
+ By default, capture mode is `auto`: it uses audio capture if PulseAudio, ffmpeg, and `OPENAI_API_KEY` are all available, and otherwise falls back to caption scraping.
102
+
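The `auto` rule described above can be sketched as a small pure function. This is an illustrative sketch only, not OpenBuilder's actual code: the names `resolveCaptureMode` and `CaptureEnvironment` are hypothetical, and the behavior of a forced mode whose dependencies are missing is an assumption (here the forced mode is simply returned).

```typescript
// Hypothetical types for illustration; not part of the openbuilder package.
type CaptureMode = "audio" | "captions";
type RequestedMode = CaptureMode | "auto";

interface CaptureEnvironment {
  hasPulseAudio: boolean;
  hasFfmpeg: boolean;
  hasOpenAiKey: boolean;
}

// Resolve the effective capture mode.
// Assumption: an explicit --audio or --captions request is honored as-is.
function resolveCaptureMode(
  requested: RequestedMode,
  env: CaptureEnvironment,
): CaptureMode {
  if (requested !== "auto") return requested;
  // Audio capture needs all three prerequisites; otherwise fall back.
  const audioReady = env.hasPulseAudio && env.hasFfmpeg && env.hasOpenAiKey;
  return audioReady ? "audio" : "captions";
}
```

Keeping the decision in one function makes the fallback easy to test without PulseAudio or ffmpeg installed.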
103
+ ## AI Meeting Reports
104
+
105
+ When a meeting ends (or when you run `openbuilder report`), OpenBuilder generates a structured report:
106
+
107
+ ```markdown
108
+ # Meeting Report: abc-defg-hij — 2026-03-12
109
+
110
+ ## Summary
111
+ The team discussed the Q1 roadmap, focusing on three main initiatives...
112
+
113
+ ## Chapters
114
+ 1. [14:30] Sprint Planning — Review of current sprint velocity
115
+ 2. [14:42] Q1 Roadmap — Discussion of three main initiatives
116
+ 3. [14:55] Resource Allocation — Team capacity and hiring plans
117
+
118
+ ## Action Items
119
+ - [ ] Draft Q1 roadmap document (@Alice)
120
+ - [ ] Schedule design review for new dashboard (@Bob)
121
+ - [ ] Research caching solutions and present options (@Carol)
122
+
123
+ ## Key Decisions
124
+ - Decided to prioritize the dashboard redesign over the API refactor
125
+ - Agreed to move to bi-weekly sprints starting next month
126
+
127
+ ## Key Questions
128
+ - Should we invest in automated testing infrastructure? (unanswered)
129
+ - When is the design system migration complete? (answered)
130
+
131
+ ## Speaker Analytics
132
+ | Speaker | Talk Time | % of Meeting | Words |
133
+ |---------|-----------|--------------|-------|
134
+ | Alice | 12:30 | 45% | 1,850 |
135
+ | Bob | 8:15 | 30% | 1,200 |
136
+ | Carol | 6:45 | 25% | 980 |
137
+
138
+ ## Metadata
139
+ - Duration: 27:30
140
+ - Participants: 3
141
+ ```
142
+
143
+ ## Configuration
144
+
145
+ ### Config File
146
+
147
+ Settings are stored in `~/.openbuilder/config.json`.
148
+
149
+ ```bash
150
+ # View all settings
151
+ npx openbuilder config
152
+
153
+ # Set values
154
+ npx openbuilder config set aiProvider claude
155
+ npx openbuilder config set anthropicApiKey sk-ant-...
156
+ npx openbuilder config set openaiApiKey sk-...
157
+ npx openbuilder config set botName "My Meeting Bot"
158
+ npx openbuilder config set defaultDuration 60m
159
+ npx openbuilder config set captureMode audio
160
+ npx openbuilder config set whisperModel whisper-1
161
+
162
+ # Get a value
163
+ npx openbuilder config get aiProvider
164
+
165
+ # Delete a value
166
+ npx openbuilder config delete defaultDuration
167
+ ```
168
+
169
+ ### Environment Variables
170
+
171
+ Environment variables override config file values:
172
+
173
+ | Variable | Config Key | Description |
174
+ |----------|-----------|-------------|
175
+ | `OPENBUILDER_AI_PROVIDER` | `aiProvider` | `claude` or `openai` |
176
+ | `ANTHROPIC_API_KEY` | `anthropicApiKey` | Anthropic API key |
177
+ | `OPENAI_API_KEY` | `openaiApiKey` | OpenAI API key (also used for Whisper) |
178
+ | `OPENBUILDER_BOT_NAME` | `botName` | Default bot name |
179
+ | `OPENBUILDER_DEFAULT_DURATION` | `defaultDuration` | Default meeting duration |
180
+ | `OPENBUILDER_CAPTURE_MODE` | `captureMode` | `audio`, `captions`, or `auto` (default) |
181
+ | `OPENBUILDER_WHISPER_MODEL` | `whisperModel` | Whisper model name (default `whisper-1`) |
182
+
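The precedence rule in the table above (environment variables win over `config.json`) could look roughly like the following. This is a sketch under stated assumptions: `resolveSettings` and `ENV_TO_CONFIG_KEY` are hypothetical names, and treating an empty environment variable as unset is an assumption rather than documented behavior.

```typescript
// Mapping taken from the table above: environment variable -> config key.
const ENV_TO_CONFIG_KEY: Record<string, string> = {
  OPENBUILDER_AI_PROVIDER: "aiProvider",
  ANTHROPIC_API_KEY: "anthropicApiKey",
  OPENAI_API_KEY: "openaiApiKey",
  OPENBUILDER_BOT_NAME: "botName",
  OPENBUILDER_DEFAULT_DURATION: "defaultDuration",
  OPENBUILDER_CAPTURE_MODE: "captureMode",
  OPENBUILDER_WHISPER_MODEL: "whisperModel",
};

// Merge file settings with environment overrides (hypothetical helper).
function resolveSettings(
  fileConfig: Record<string, string>,
  env: Record<string, string | undefined>,
): Record<string, string> {
  const resolved: Record<string, string> = { ...fileConfig };
  for (const [envName, configKey] of Object.entries(ENV_TO_CONFIG_KEY)) {
    const value = env[envName];
    // Assumption: empty strings count as "not set".
    if (value !== undefined && value !== "") resolved[configKey] = value;
  }
  return resolved;
}
```

In a Node.js process the second argument would typically be `process.env`; it is passed in explicitly here so the function stays testable.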
183
+ ## Authentication
184
+
185
+ The bot can join meetings in two modes:
186
+
187
+ ### Guest Mode
188
+ ```bash
189
+ npx openbuilder join <url> --anon --bot-name "Meeting Bot"
190
+ ```
191
+ The bot joins as a guest. The meeting host must admit the bot.
192
+
193
+ ### Authenticated Mode
194
+ ```bash
195
+ # One-time setup: sign into Google in a browser window
196
+ npx openbuilder auth
197
+
198
+ # Join as your Google account (no host admission needed)
199
+ npx openbuilder join <url> --auth
200
+ ```
201
+
202
+ The auth command opens a headed Chromium browser. Sign in to your Google account, then press Enter. The session is saved to `~/.openbuilder/auth.json` and reused for future joins. Re-run `openbuilder auth` if the session expires.
203
+
204
+ ### Automated Auth (Headless Servers)
205
+
206
+ For headless servers or automated bots, use `--auto` mode with credentials in `.env`:
207
+
208
+ ```bash
209
+ echo "GOOGLE_EMAIL=you@gmail.com" >> .env
210
+ echo "GOOGLE_PASSWORD=yourpassword" >> .env
211
+ npx openbuilder auth --auto
212
+ ```
213
+
214
+ This signs in non-interactively and saves the session to `~/.openbuilder/auth.json`.
215
+
216
+ ## Standalone Transcript Analysis
217
+
218
+ OpenBuilder's AI analysis works on any transcript file in `[HH:MM:SS] Speaker: text` format — not just from live meetings:
219
+
220
+ ```bash
221
+ # Summarize any transcript
222
+ npx openbuilder summarize ~/meetings/standup-2026-03-12.txt
223
+
224
+ # Full report on any transcript
225
+ npx openbuilder report ~/meetings/standup-2026-03-12.txt
226
+ ```
227
+
228
+ ## File Locations
229
+
230
+ | Path | Description |
231
+ |------|-------------|
232
+ | `~/.openbuilder/config.json` | Configuration file |
233
+ | `~/.openbuilder/auth.json` | Saved Google session |
234
+ | `~/.openbuilder/auth-meta.json` | Auth metadata (email, timestamp) |
235
+ | `~/.openbuilder/builder.pid` | Running bot PID |
236
+ | `~/.openclaw/workspace/openbuilder/transcripts/` | Live caption transcripts |
237
+ | `~/.openclaw/workspace/openbuilder/reports/` | AI meeting reports |
238
+
239
+ ## How It Works
240
+
241
+ 1. **Join**: Launches headless Chromium with stealth patches (navigator.webdriver, WebGL, plugins), navigates to the Meet URL, enters the bot name, disables camera/mic, and clicks join.
242
+
243
+ 2. **Transcript Capture** (two modes):
244
+ - **Audio mode** (default when available): Creates a PulseAudio virtual sink, routes all browser audio to it, captures audio with ffmpeg into 30-second WAV chunks, and transcribes each chunk with OpenAI Whisper. No captions needed — works like Read AI / Granola.
245
+ - **Caption mode** (fallback): Enables Google Meet's built-in live captions, then injects a MutationObserver that watches for caption DOM mutations. Speaker names are extracted from badge elements. Captions are deduplicated and written to disk with timestamps.
246
+
247
+ 3. **AI Analysis**: When the meeting ends, the transcript is sent to Claude or OpenAI with carefully designed prompts. Long transcripts are chunked and merged. The AI extracts summaries, chapters, action items (with assignee detection), key decisions, and key questions.
248
+
249
+ 4. **Speaker Analytics**: Talk time is estimated from transcript timestamps and speaking rate heuristics. Per-speaker word counts and participation percentages are calculated.
250
+
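Step 3 says long transcripts are chunked before analysis. A minimal sketch of line-based chunking follows; the function name `chunkTranscript` and the character budget are assumptions made for illustration, and the package's real chunking and merging strategy is not shown in this README.

```typescript
// Split transcript lines into chunks of at most maxChars characters,
// never breaking inside a line. A single line longer than maxChars
// becomes its own (oversized) chunk.
function chunkTranscript(lines: string[], maxChars: number): string[] {
  const chunks: string[] = [];
  let current: string[] = [];
  let currentSize = 0;
  for (const line of lines) {
    const lineSize = line.length + 1; // +1 accounts for the newline
    if (current.length > 0 && currentSize + lineSize > maxChars) {
      chunks.push(current.join("\n"));
      current = [];
      currentSize = 0;
    }
    current.push(line);
    currentSize += lineSize;
  }
  if (current.length > 0) chunks.push(current.join("\n"));
  return chunks;
}
```

Each chunk would then be summarized separately and the partial results merged in a final pass; that merge depends on the AI provider and is left out of this sketch.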
251
+ ## Requirements
252
+
253
+ - **Node.js** >= 18
254
+ - **playwright-core** (installed automatically)
255
+ - **Optional**: `@anthropic-ai/sdk` or `openai` npm package for AI features
256
+ ```bash
257
+ npm install @anthropic-ai/sdk # For Claude
258
+ npm install openai # For OpenAI (also required for audio capture mode)
259
+ ```
260
+
261
+ ### Audio Capture Mode (recommended)
262
+
263
+ For audio capture via PulseAudio + Whisper, you also need:
264
+
265
+ - **PulseAudio** — `apt install pulseaudio` (most Linux desktops have this already)
266
+ - **ffmpeg** — `apt install ffmpeg`
267
+ - **OpenAI API key** — for Whisper transcription (`OPENAI_API_KEY` env var or config)
268
+
269
+ If these aren't available, OpenBuilder automatically falls back to caption scraping.
270
+
271
+ ## OpenClaw Bot Setup
272
+
273
+ To set up OpenBuilder as an automated meeting bot (e.g. for OpenClaw agents):
274
+
275
+ 1. **Install**: Clone the repo or `npm install openbuilder`
276
+ 2. **Configure credentials** in `.env`:
277
+ ```bash
278
+ GOOGLE_EMAIL=you@gmail.com
279
+ GOOGLE_PASSWORD=yourpassword
280
+ # For AI reports:
281
+ ANTHROPIC_API_KEY=sk-ant-...
282
+ # Or: OPENAI_API_KEY=sk-...
283
+ ```
284
+ 3. **Save Google session**: `npx openbuilder auth --auto`
285
+ 4. **Join meetings**: `npx openbuilder join <url> --auth --captions`
286
+ 5. **Transcripts** saved to `~/.openclaw/workspace/openbuilder/transcripts/`
287
+ 6. **Reports** saved to `~/.openclaw/workspace/openbuilder/reports/`
288
+
289
+ ### Notes
290
+
291
+ - `.env` is gitignored, so credentials in it are not committed; they are still stored in plaintext on disk, so restrict the file's permissions
292
+ - Auth session saved to `~/.openbuilder/auth.json` — re-run `auth --auto` if expired
293
+ - Caption mode (`--captions`) is the most reliable on headless servers
294
+ - Audio mode (`--audio`) requires PulseAudio + ffmpeg + OpenAI key + Xvfb (experimental on servers)
295
+
296
+ ## OpenClaw Integration
297
+
298
+ OpenBuilder ships as an OpenClaw skill. After running `npx openbuilder`, it's available to your OpenClaw agent. The agent can:
299
+
300
+ - Join meetings on your behalf
301
+ - Capture and summarize transcripts
302
+ - Generate full meeting reports
303
+ - Send screenshots to your chat
304
+
305
+ See [SKILL.md](./SKILL.md) for the full agent integration guide.
306
+
307
+ ## License
308
+
309
+ MIT
package/SKILL.md ADDED
@@ -0,0 +1,276 @@
1
+ ---
2
+ name: open-builder
3
+ description: AI meeting assistant — joins Google Meet, captures live transcripts, generates AI-powered meeting reports with summaries, action items, decisions, and speaker analytics.
4
+ homepage: https://github.com/superliangbot/openbuilder
5
+ metadata: { "openclaw": { "emoji": "📋", "requires": { "bins": ["node"] } } }
6
+ ---
7
+
8
+ # OpenBuilder
9
+
10
+ ## STOP — Ask the user before doing anything
11
+
12
+ **When the user asks to join a meeting, you MUST do these steps IN ORDER. Do NOT skip ahead.**
13
+
14
+ **Step A:** Read the file `~/.openbuilder/auth-meta.json` (it may not exist — that's fine).
15
+
16
+ **Step B:** Ask the user how they want to join. Do NOT launch the bot yet.
17
+
18
+ If auth-meta.json exists and has an `email` field, ask:
19
+
20
+ > "How would you like to join the meeting?
21
+ > 1. Join as **user@gmail.com** (authenticated — no host approval needed)
22
+ > 2. Join as **OpenBuilder Bot** (guest — host must admit)
23
+ > 3. Join with a custom name (guest)"
24
+
25
+ If auth-meta.json does NOT exist, ask:
26
+
27
+ > "What name should the bot use to join the meeting?"
28
+
29
+ Default to "OpenBuilder Bot" if the user doesn't have a preference.
30
+
31
+ **Step C:** WAIT for the user to answer. Only then launch the bot with the right flags.
32
+
33
+ - User chose authenticated → use `--auth`
34
+ - User chose guest → use `--anon --bot-name "Their Name"` (always ask for a name)
35
+
36
+ **The bot will refuse to start without `--auth` or `--anon`.
37
+ When using `--anon`, `--bot-name` is also required.**
38
+
39
+ ---
40
+
41
+ A meeting bot that joins Google Meet meetings via Playwright browser automation,
42
+ captures live captions as a real-time transcript, and generates AI-powered meeting
43
+ reports with summaries, action items, key decisions, and speaker analytics.
44
+
45
+ ## Prerequisites
46
+
47
+ - `playwright-core` (ships with openclaw)
48
+ - Chromium browser: `npx playwright-core install chromium`
49
+ - Optional: `@anthropic-ai/sdk` or `openai` for AI report generation
50
+
51
+ ## Join a Meeting
52
+
53
+ **IMPORTANT: Always run join commands with `background:true`** — the bot is a long-running
54
+ process that stays in the meeting. Do not wait for it to complete; background it immediately
55
+ and poll for status updates.
56
+
57
+ ### Launch command
58
+
59
+ ```bash
60
+ exec background:true command:"npx openbuilder join https://meet.google.com/abc-defg-hij --auth|--anon --channel <current-channel> --target <current-chat-id>"
61
+ ```
62
+
63
+ **IMPORTANT:** Always pass `--channel` and `--target` from the current conversation context.
64
+ The bot uses these to send screenshots and status updates directly to the user's chat.
65
+
66
+ Options (required — bot will error without one):
67
+
68
+ - `--auth` — join using saved Google account (~/.openbuilder/auth.json)
69
+ - `--anon --bot-name "Name"` — join as a guest with this display name (both required together)
70
+
71
+ Other options:
72
+
73
+ - `--headed` — show the browser window (for debugging)
74
+ - `--camera` — join with camera on (default: off)
75
+ - `--mic` — join with microphone on (default: off)
76
+ - `--duration 60m` — auto-leave after duration (supports ms/s/m/h)
77
+ - `--no-report` — skip auto-report generation when meeting ends
78
+ - `--verbose` — show real-time caption output
79
+
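The `--duration` values above (`ms`, `s`, `m`, `h` suffixes) can be converted to milliseconds with a small parser. This is an illustrative sketch, not the CLI's own parser: `parseDuration` is a hypothetical name, and accepting only whole numbers is an assumption.

```typescript
// Milliseconds per supported unit suffix.
const UNIT_TO_MS: Record<string, number> = {
  ms: 1,
  s: 1000,
  m: 60 * 1000,
  h: 60 * 60 * 1000,
};

// Parse strings such as "30m" or "1h" into milliseconds.
// Assumption: only integer amounts with a single unit are accepted.
function parseDuration(input: string): number {
  const match = /^(\d+)(ms|s|m|h)$/.exec(input.trim());
  if (!match) throw new Error(`Unrecognized duration: ${input}`);
  const amount = Number(match[1]);
  const factor = UNIT_TO_MS[match[2] ?? ""];
  if (factor === undefined) throw new Error(`Unknown unit in: ${input}`);
  return amount * factor;
}
```

Rejecting malformed input with an error, rather than silently defaulting, keeps a typo such as `60min` from turning into an unbounded meeting stay.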
80
+ ## Live Caption Transcript
81
+
82
+ Captions are automatically captured whenever the bot is in a meeting. After joining,
83
+ the bot enables Google Meet's built-in live captions and captures the text via a
84
+ MutationObserver. Captions are deduplicated and flushed to a transcript file every 5 seconds.
85
+
86
+ **Transcript location:** `~/.openclaw/workspace/openbuilder/transcripts/<meeting-id>.txt`
87
+
88
+ **Format:**
89
+ ```
90
+ [14:30:05] Alice: Hey everyone, let's get started
91
+ [14:30:12] Bob: Sounds good, I have the updates ready
92
+ [14:30:25] Alice: Great, go ahead
93
+ ```
94
+
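The transcript format shown above is regular enough to parse line by line. The following sketch assumes the speaker name never contains a colon; `TranscriptEntry` and `parseTranscriptLine` are hypothetical names, not exports of the package.

```typescript
interface TranscriptEntry {
  time: string; // original "HH:MM:SS" text
  seconds: number; // seconds since midnight
  speaker: string;
  text: string;
}

const TRANSCRIPT_LINE = /^\[(\d{2}):(\d{2}):(\d{2})\]\s+([^:]+):\s*(.*)$/;

// Parse one "[HH:MM:SS] Speaker: text" line; returns null for other lines.
function parseTranscriptLine(line: string): TranscriptEntry | null {
  const match = TRANSCRIPT_LINE.exec(line);
  if (!match) return null;
  const [, hh = "0", mm = "0", ss = "0", speaker = "", text = ""] = match;
  return {
    time: `${hh}:${mm}:${ss}`,
    seconds: Number(hh) * 3600 + Number(mm) * 60 + Number(ss),
    speaker: speaker.trim(),
    text,
  };
}
```

Returning `null` for non-matching lines lets a caller skip blank lines or stray log output without aborting the whole file.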
95
+ ## Get Transcript (what are they saying?)
96
+
97
+ **When the user asks "what are they saying?", "what's happening?", "summarize the meeting",
98
+ or anything about meeting content, run the `transcript` command below. Do NOT use `openbuilder screenshot` for this.**
99
+
100
+ ```bash
101
+ exec command:"npx openbuilder transcript"
102
+ ```
103
+
104
+ Use `--last 20` to get only the last 20 lines (for long meetings).
105
+
106
+ Read the output and summarize it for the user in natural language.
107
+
108
+ ## Take a Screenshot (visual context only)
109
+
110
+ If the user asks to **see** the meeting (e.g. "send me a screenshot", "what does it look like"):
111
+
112
+ ```bash
113
+ exec command:"npx openbuilder screenshot"
114
+ ```
115
+
116
+ Send the screenshot image to the user via `message`. Do NOT read the screenshot yourself.
117
+
118
+ ## AI Summary (quick summary)
119
+
120
+ When the user asks for a summary and the meeting is over (or for a standalone transcript):
121
+
122
+ ```bash
123
+ exec command:"npx openbuilder summarize"
124
+ exec command:"npx openbuilder summarize /path/to/transcript.txt"
125
+ ```
126
+
127
+ Returns a 3-5 paragraph summary of the meeting.
128
+
129
+ ## Full Meeting Report (summary + actions + decisions + analytics)
130
+
131
+ For a comprehensive meeting report with all intelligence:
132
+
133
+ ```bash
134
+ exec command:"npx openbuilder report"
135
+ exec command:"npx openbuilder report /path/to/transcript.txt"
136
+ ```
137
+
138
+ Generates and saves a markdown report with:
139
+ - Meeting summary with chapters
140
+ - Action items with assignee detection
141
+ - Key decisions
142
+ - Key questions (answered/unanswered)
143
+ - Speaker talk-time analytics
144
+
145
+ Report saved to: `~/.openclaw/workspace/openbuilder/reports/<meeting-id>-report.md`
146
+
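Part of the speaker analytics listed above can be approximated directly from a transcript. The sketch below computes per-speaker word counts and each speaker's share of all words; talk-time estimation is left out because the documentation only says it relies on timestamps and heuristics. `speakerWordStats` is a hypothetical name, and the line format is assumed to be `[HH:MM:SS] Speaker: text`.

```typescript
interface SpeakerStats {
  speaker: string;
  words: number;
  share: number; // fraction of all words, between 0 and 1
}

// Count words per speaker and sort by word count, highest first.
function speakerWordStats(lines: string[]): SpeakerStats[] {
  const counts = new Map<string, number>();
  for (const line of lines) {
    const match = /^\[\d{2}:\d{2}:\d{2}\]\s+([^:]+):\s*(.*)$/.exec(line);
    if (!match) continue; // skip lines that are not transcript entries
    const speaker = (match[1] ?? "").trim();
    const words = (match[2] ?? "").split(/\s+/).filter((w) => w.length > 0).length;
    counts.set(speaker, (counts.get(speaker) ?? 0) + words);
  }
  let total = 0;
  counts.forEach((n) => {
    total += n;
  });
  const stats: SpeakerStats[] = [];
  counts.forEach((words, speaker) => {
    stats.push({ speaker, words, share: total === 0 ? 0 : words / total });
  });
  return stats.sort((a, b) => b.words - a.words);
}
```

Multiplying `share` by 100 gives the percentage column of the report's analytics table.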
147
+ ## Configuration
148
+
149
+ ```bash
150
+ exec command:"npx openbuilder config"
151
+ exec command:"npx openbuilder config set anthropicApiKey sk-ant-..."
152
+ exec command:"npx openbuilder config set aiProvider claude"
153
+ ```
154
+
155
+ Keys: `aiProvider`, `anthropicApiKey`, `openaiApiKey`, `botName`, `defaultDuration`, `captureMode`, `whisperModel`
156
+
157
+ ## How It Works
158
+
159
+ 1. **Join**: Launches headless Chromium, navigates to the Meet URL, enters the bot name, clicks "Ask to join", and waits for host admission.
160
+
161
+ 2. **Caption capture**: After joining, the bot clicks the CC button to enable live captions, then injects a MutationObserver to capture caption text from the DOM. Captions are deduplicated and written to a transcript file.
162
+
163
+ 3. **AI Report** (automatic): When the meeting ends, if an AI API key is configured, the bot automatically processes the transcript through Claude or OpenAI to generate a structured meeting report.
164
+
165
+ ## Authentication (Optional)
166
+
167
+ With `--anon` the bot joins as a guest and needs host admission. To join as an authenticated
168
+ Google user (no admission needed), run the auth script once:
169
+
170
+ ```bash
171
+ npx openbuilder auth
172
+ ```
173
+
174
+ This opens a headed browser — sign into Google, then press Enter. The session is saved to
175
+ `~/.openbuilder/auth.json` and automatically loaded on future joins. Re-run if the session expires.
176
+
177
+ ## Files
178
+
179
+ - `~/.openbuilder/auth.json` — saved Google session (cookies + localStorage)
180
+ - `~/.openbuilder/auth-meta.json` — email + timestamp
181
+ - `~/.openbuilder/config.json` — bot configuration
182
+ - `~/.openbuilder/chrome-profile/` — persistent Chromium profile
183
+ - `~/.openbuilder/builder.pid` — running bot PID
184
+ - `~/.openclaw/workspace/openbuilder/transcripts/` — live caption transcripts
185
+ - `~/.openclaw/workspace/openbuilder/reports/` — AI meeting reports
186
+ - `~/.openclaw/workspace/openbuilder/on-demand-screenshot.png` — on-demand screenshot
187
+ - `~/.openclaw/workspace/openbuilder/joined-meeting.png` — confirmation screenshot
188
+ - `~/.openclaw/workspace/openbuilder/debug-*.png` — failure screenshots
189
+
190
+ ## Agent Behavior — MANDATORY
191
+
192
+ After launching the bot with `exec background:true`, you MUST poll the process
193
+ to check for success/failure and send screenshots back to the user.
194
+
195
+ ### Step 1: Poll for output
196
+
197
+ After starting the background exec, poll the process every 10-15 seconds:
198
+
199
+ ```
200
+ process action:poll
201
+ ```
202
+
203
+ ### Step 2: Parse markers and send images using the message tool
204
+
205
+ The bot prints machine-readable markers. When you see them, you MUST use the
206
+ `message` tool to send the screenshot image to the user.
207
+
208
+ **On success** — bot prints `[OPENBUILDER_SUCCESS_IMAGE] <path>`:
209
+
210
+ ```
211
+ message action:"send" media:"./openbuilder/joined-meeting.png" content:"Successfully joined the meeting!"
212
+ ```
213
+
214
+ **On screenshot request** — bot prints `[OPENBUILDER_SCREENSHOT] <path>`:
215
+
216
+ ```
217
+ message action:"send" media:"./openbuilder/on-demand-screenshot.png" content:"Here's the current meeting view"
218
+ ```
219
+
220
+ **On failure** — bot prints `[OPENBUILDER_DEBUG_IMAGE] <path>`:
221
+
222
+ ```
223
+ message action:"send" media:"./openbuilder/debug-join-failed.png" content:"Could not join the meeting. Here is what the bot saw"
224
+ ```
225
+
226
+ **On report generated** — bot prints `[OPENBUILDER_REPORT] <path>`:
227
+
228
+ Read the report file and share a formatted summary with the user.
229
+
230
+ **CRITICAL: ALWAYS use the `message` tool with `media:"./openbuilder/<filename>.png"` to send screenshots.**
231
+ Use relative paths only (starting with `./`). Never use absolute paths or ~ paths.
232
+
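The markers above can be pulled out of polled process output with a regular expression. This sketch assumes each marker starts its own output line and that the path uses `/` separators; `parseMarkers` and `toRelativeMediaPath` are hypothetical helper names, not part of the bot.

```typescript
type MarkerKind = "SUCCESS_IMAGE" | "SCREENSHOT" | "DEBUG_IMAGE" | "REPORT";

interface BotMarker {
  kind: MarkerKind;
  path: string;
}

const MARKER_PATTERN =
  /^\[OPENBUILDER_(SUCCESS_IMAGE|SCREENSHOT|DEBUG_IMAGE|REPORT)\]\s+(.+)$/;

// Extract every marker line from a block of bot output.
function parseMarkers(output: string): BotMarker[] {
  const markers: BotMarker[] = [];
  for (const rawLine of output.split(/\r?\n/)) {
    const match = MARKER_PATTERN.exec(rawLine.trim());
    if (!match) continue;
    markers.push({ kind: match[1] as MarkerKind, path: (match[2] ?? "").trim() });
  }
  return markers;
}

// Rewrite a reported path into the relative form required for `media:`.
function toRelativeMediaPath(reportedPath: string): string {
  const segments = reportedPath.split("/");
  const fileName = segments[segments.length - 1] ?? "";
  return `./openbuilder/${fileName}`;
}
```

An agent loop would call `parseMarkers` on each poll result and, for image markers, pass `toRelativeMediaPath(marker.path)` to the `message` tool.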
233
+ ### Step 3: When user asks about meeting content
234
+
235
+ **CRITICAL: When the user asks what's happening, what someone said, or anything about
236
+ meeting content, run `npx openbuilder transcript`. NEVER use `openbuilder screenshot` for this.**
237
+
238
+ ```bash
239
+ exec command:"npx openbuilder transcript"
240
+ ```
241
+
242
+ ### Step 4: When meeting ends
243
+
244
+ When the bot reports the meeting has ended:
245
+ 1. Run `npx openbuilder transcript` to get the full transcript
246
+ 2. If a report was auto-generated (`[OPENBUILDER_REPORT]` marker), read and share it
247
+ 3. If no report was generated, offer to run `npx openbuilder report` for the user
248
+
249
+ ### When to use which command
250
+
251
+ | User asks... | Use this command |
252
+ |-------------------------------------------|-------------------------------|
253
+ | "what are they saying?" | `openbuilder transcript` |
254
+ | "what's happening in the meeting?" | `openbuilder transcript` |
255
+ | "summarize the meeting" | `openbuilder summarize` |
256
+ | "give me a full report" | `openbuilder report` |
257
+ | "what are the action items?" | `openbuilder report` |
258
+ | "send me a screenshot" | `openbuilder screenshot` |
259
+ | "what does the meeting look like?" | `openbuilder screenshot` |
260
+ | "what did they talk about?" | `openbuilder transcript` |
261
+
262
+ **NEVER read or analyze screenshot images to understand meeting content.**
263
+
264
+ ## Headless VM Tips
265
+
266
+ - Chrome flags `--use-fake-ui-for-media-stream` and `--use-fake-device-for-media-stream` are set automatically.
267
+ - Caption mode needs no X11/Wayland display and runs fully headless; the README notes that audio mode (`--audio`) also requires Xvfb.
268
+ - Use `--duration` to auto-leave after a set time.
269
+
270
+ ## Troubleshooting
271
+
272
+ - **Join button not found**: Google Meet UI changes occasionally. The debug screenshot shows what the bot saw — send it to the user.
273
+ - **Not admitted**: The bot joins as a guest and needs host approval. Ask the host to admit the bot. If timed out, the debug screenshot is sent automatically.
274
+ - **No captions captured**: The CC button selector may change with Meet updates. If the transcript is empty, captions may not have been enabled. Try `--headed` to verify.
275
+ - **Headless blocked**: The bot uses stealth patches to bypass headless detection. If Google Meet blocks it, try `--headed` for debugging.
276
+ - **AI report failed**: Ensure an API key is configured via `openbuilder config set anthropicApiKey <key>` or `ANTHROPIC_API_KEY` env var.