pagebolt-mcp 1.5.2 → 1.6.1

Files changed (3)
  1. package/README.md +2 -0
  2. package/package.json +2 -2
  3. package/src/index.mjs +32 -12
package/README.md CHANGED
@@ -8,6 +8,8 @@ Take screenshots, generate PDFs, create OG images, inspect pages, and record dem
 
  **Works with Claude Desktop, Cursor, Windsurf, Cline, and any MCP-compatible client.**
 
+ <img width="1280" height="1279" alt="pagebolt-screenshot_1" src="https://github.com/user-attachments/assets/fd21a372-df4d-41cd-baf4-5b6dd6a9a685" />
+
  ---
 
  ## What It Does
package/package.json CHANGED
@@ -1,7 +1,7 @@
  {
  "name": "pagebolt-mcp",
- "version": "1.5.2",
- "description": "MCP server for PageBolt — take screenshots, generate PDFs, create OG images, inspect pages, and record demo videos from AI coding assistants like Claude, Cursor, and Windsurf.",
+ "version": "1.6.1",
+ "description": "MCP server for PageBolt — take screenshots, generate PDFs, create OG images, inspect pages, record demo videos with Audio Guide narration, from AI coding assistants like Claude, Cursor, and Windsurf.",
  "main": "src/index.mjs",
  "module": "src/index.mjs",
  "bin": {
package/src/index.mjs CHANGED
@@ -168,11 +168,17 @@ record_video supports polished video output:
  - frame: { enabled: true, style: "macos" } — browser chrome around the video
  - background: { enabled: true, type: "gradient", gradient: "ocean" } — gradient/glass background with padding
  - cursor: { style: "classic", persist: true } — always-visible cursor
- - Per-step zoom: add zoom: { enabled: true } on click steps
  - **Step notes (IMPORTANT)**: Add a "note" field to EVERY action step for guided-tour-style tooltip annotations. Notes appear as beautiful styled tooltips near the element being interacted with. Example: { action: "click", selector: "#btn", note: "Click here to open settings" }. The only steps that should NOT have notes are wait/wait_for pauses.
- - **Live wait steps**: Add live: true to wait steps to capture animated content (transitions, loading spinners) instead of freezing the last frame.
+ - **Audio Guide**: Add audioGuide: { enabled: true, script: "Welcome. {{1}} Click here. {{2}} Done." } for AI voice narration. Two modes: (1) Per-step — add "narration" text to individual steps. (2) Script — provide a single "script" with {{N}} markers for continuous narration synchronized to steps.
+ - Audio Guide voices: ava, andrew, emma, brian, aria, guy, jenny, davis, christopher, michelle (Azure) or alloy, echo, fable, nova, onyx, shimmer (OpenAI).
  - **Variables**: Pass variables: { "base_url": "https://example.com" } and use {{base_url}} in step URLs/values for reusable recordings.
 
+ ## IMPORTANT: Video Step Best Practices
+
+ - **Do NOT add wait steps between every action.** The "pace" parameter already adds natural pauses between steps. Only use wait when: (1) the page needs time to load after navigation, or (2) you want to hold on a view for narration. A typical video should have very few wait steps.
+ - **Do NOT use zoom unless the user explicitly asks for it.** Zoom adds visual complexity and encoding time. Omit zoom entirely by default.
+ - **Keep videos concise.** A good demo has 5-15 action steps (navigate, click, fill, hover, scroll). More steps = longer encoding time and larger files.
+
  ## Common Parameters (available on most tools)
 
  - blockBanners: true — hides cookie consent banners (GDPR popups, OneTrust, CookieBot, etc.)
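
As a rough illustration of the guidance added in this hunk, the sketch below builds a record_video argument object that uses script-mode Audio Guide, carries a note on every action step, and omits wait padding and zoom. Field names (steps, action, selector, value, note, audioGuide, script) come from the diff; the URL, selectors, narration text, and the scriptMarkers helper are assumptions made up for the example and are not part of pagebolt-mcp. The marker numbering simply copies the {{1}}, {{2}} pattern from the inline example; elsewhere in this diff the schema describes markers as 0-indexed, and the sketch does not try to settle which is right.

```javascript
// Hypothetical record_video arguments following the best practices above.
// Only the field names are taken from the diff; all values are invented.
const recordVideoArgs = {
  steps: [
    { action: "navigate", url: "https://example.com", note: "Opening the home page" },
    { action: "click", selector: "#settings", note: "Open settings" },
    { action: "fill", selector: "#email", value: "user@example.com", note: "Enter an email address" },
  ],
  audioGuide: {
    enabled: true,
    provider: "azure",
    voice: "ava",
    script: "Welcome to the tour. {{1}} Open settings here. {{2}} Then enter your email.",
  },
};

// Hypothetical helper: list the numeric {{N}} markers found in a script.
// Named placeholders such as {{base_url}} are ignored because they are not digits.
function scriptMarkers(script) {
  return [...script.matchAll(/\{\{(\d+)\}\}/g)].map((match) => Number(match[1]));
}

console.log(scriptMarkers(recordVideoArgs.audioGuide.script)); // [ 1, 2 ]
```
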
@@ -563,7 +569,7 @@ server.tool(
  // ═══════════════════════════════════════════════════════════════════
  server.tool(
  'record_video',
- 'Record a professional demo video of a multi-step browser automation sequence. Produces MP4/WebM/GIF with cursor highlighting, click effects, smooth movement, per-step zoom, step notes, browser frame (macOS/Windows), gradient/glass backgrounds, and more. Costs 3 API requests. Saves to disk.',
+ 'Record a professional demo video of a multi-step browser automation sequence. Produces MP4/WebM/GIF with cursor highlighting, click effects, smooth movement, step notes, browser frame (macOS/Windows), gradient/glass backgrounds, and more. Costs 3 API requests. Saves to disk. BEST PRACTICE: Keep videos concise (5-15 action steps). Do NOT add wait steps between every action — the pace parameter handles timing. Only use wait for page loads or narration holds. Do NOT use zoom unless the user explicitly asks for it.',
  {
  steps: z.array(
  z.object({
@@ -574,19 +580,20 @@ server.tool(
  url: z.string().url().optional().describe('URL to navigate to (for navigate action)'),
  selector: z.string().optional().describe('CSS selector for the target element'),
  value: z.string().optional().describe('Value to type or select'),
- ms: z.number().int().min(0).max(10000).optional().describe('Milliseconds to wait (for wait action)'),
+ ms: z.number().int().min(0).max(10000).optional().describe('Milliseconds to wait (for wait action). Only use wait steps when the page needs loading time or to hold for narration — the pace parameter handles inter-step timing automatically.'),
  timeout: z.number().int().min(0).max(15000).optional().describe('Timeout in ms for wait_for (default: 10000)'),
  x: z.number().optional().describe('Horizontal scroll position'),
  y: z.number().optional().describe('Vertical scroll position'),
  script: z.string().max(5000).optional().describe('JavaScript to execute in page context (for evaluate action)'),
  note: z.string().max(200).optional().describe('Tooltip annotation text shown during this step (max 200 chars)'),
+ narration: z.string().max(500).optional().describe('Text to speak at this step (max 500 chars, requires audioGuide.enabled). Used in per-step mode.'),
  live: z.boolean().optional().describe('For wait steps: true captures animated content in real-time, false freezes a single frame (default: false)'),
  zoom: z.object({
- enabled: z.boolean().optional().describe('Enable zoom on this step (default: false)'),
+ enabled: z.boolean().optional().describe('Enable zoom on this step (default: false). Only use when user explicitly requests zoom.'),
  level: z.number().min(1.2).max(4).optional().describe('Zoom magnification (inherits from global zoom.level if not set)'),
- }).optional().describe('Per-step zoom override (for click/dblclick steps). Overrides global zoom settings.'),
+ }).optional().describe('Per-step zoom override. Do NOT add zoom unless the user specifically requests it — it adds encoding time and visual complexity.'),
  })
- ).min(1).max(50).describe('Array of steps to execute and record. Max steps depends on plan (10-50).'),
+ ).min(1).max(50).describe('Array of action steps to record. Keep concise: 5-15 steps is ideal. Do NOT pad with wait steps — pace handles timing.'),
  viewport: z.object({
  width: z.number().int().min(320).max(3840).optional().describe('Viewport width (default: 1280)'),
  height: z.number().int().min(200).max(2160).optional().describe('Viewport height (default: 720)'),
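
The step schema in this hunk caps note at 200 characters and the new narration field at 500. A minimal, dependency-free sketch of checking those two limits before sending steps in per-step mode might look like the following. The overLimit helper, the URL, and the selector are hypothetical; the real server validates with zod, as the diff shows, so this only mirrors the two length limits for illustration.

```javascript
// Length limits copied from the step schema in this hunk.
const LIMITS = { note: 200, narration: 500 };

// Hypothetical helper: report step fields that exceed the schema's max length.
function overLimit(steps) {
  const found = [];
  steps.forEach((step, index) => {
    for (const [field, max] of Object.entries(LIMITS)) {
      const text = step[field];
      if (typeof text === "string" && text.length > max) {
        found.push({ index, field, length: text.length, max });
      }
    }
  });
  return found;
}

// Per-step mode: each action step carries its own narration text.
const perStepSteps = [
  { action: "navigate", url: "https://example.com/pricing", note: "Opening the pricing page", narration: "Here is the pricing page." },
  { action: "click", selector: "#plan-pro", note: "Choose the Pro plan", narration: "x".repeat(501) },
];

console.log(overLimit(perStepSteps)); // one entry: step index 1, field "narration", length 501
```
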
@@ -609,8 +616,8 @@ server.tool(
  level: z.number().min(1.2).max(4).optional().describe('Default zoom magnification (default: 1.5)'),
  duration: z.number().int().min(400).max(3000).optional().describe('Zoom animation duration in ms (default: 1200)'),
  easing: z.enum(['ease-in-out', 'linear', 'ease']).optional().describe('Zoom animation easing (default: ease-in-out)'),
- }).optional().describe('Global zoom settings. Per-step zoom on click/dblclick steps overrides these.'),
- autoZoom: z.boolean().optional().describe('Shorthand: set to true to enable auto-zoom with defaults (same as zoom.enabled=true)'),
+ }).optional().describe('Global zoom settings. Only use when the user explicitly requests zoom. Do NOT enable by default.'),
+ autoZoom: z.boolean().optional().describe('Enable auto-zoom on all clicks (default: false). Only use when user explicitly requests zoom.'),
  // ── Click effects ──
  clickEffect: z.object({
  enabled: z.boolean().optional().describe('Show click ripple effects (default: true)'),
@@ -649,6 +656,19 @@ server.tool(
  blockChats: z.boolean().optional().describe('Block live chat widgets'),
  blockTrackers: z.boolean().optional().describe('Block tracking scripts'),
  deviceScaleFactor: z.number().min(1).max(3).optional().describe('Device pixel ratio (default: 1)'),
+ // ── Audio Guide ──
+ audioGuide: z.object({
+ enabled: z.boolean().optional().describe('Enable Audio Guide narration'),
+ provider: z.enum(['azure', 'openai']).optional().describe('TTS provider (default: azure)'),
+ voice: z.string().optional().describe('Voice preset: ava, andrew, emma, brian, aria, guy, jenny, davis, christopher, michelle (azure) or alloy, echo, fable, nova, onyx, shimmer (openai)'),
+ speed: z.number().min(0.5).max(2.0).optional().describe('Speech rate (default: 1.0)'),
+ pitch: z.string().optional().describe('Voice pitch: default, x-low, low, medium, high, x-high (Azure only)'),
+ volume: z.string().optional().describe('Audio volume: default, silent, x-soft, soft, medium, loud, x-loud (Azure only)'),
+ style: z.string().optional().describe('Speaking style: narration-professional, cheerful, excited, friendly, etc. (Azure only)'),
+ styleDegree: z.number().min(0.01).max(2.0).optional().describe('Style intensity 0.01-2.0 (Azure only)'),
+ model: z.enum(['tts-1', 'tts-1-hd']).optional().describe('OpenAI model (OpenAI only, default: tts-1)'),
+ script: z.string().max(5000).optional().describe('Script mode: a single narration script with {{N}} step markers (0-indexed) for synchronized narration. Steps execute when narration reaches each marker. When provided, per-step "narration" fields are ignored.'),
+ }).optional().describe('Audio Guide TTS settings. Two modes: (1) Per-step — add "narration" to individual steps. (2) Script — provide "script" with {{N}} markers for continuous narration synchronized to steps.'),
  variables: z.record(z.string()).optional().describe('Key-value map for variable substitution in step URLs/values. E.g. { "base_url": "https://example.com" } replaces {{base_url}} in steps.'),
  saveTo: z.string().optional().describe('Output file path (default: ./recording.mp4)'),
  },
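
The audioGuide schema added in this hunk mixes options shared by both providers with several described as Azure only or OpenAI only. The sketch below restates the numeric ranges and enums from the schema in plain JavaScript and adds an advisory check for Azure-only options used with the openai provider. checkAudioGuide is a hypothetical helper, not part of pagebolt-mcp, and the Azure-only warning is an assumption drawn from the field descriptions; the schema itself does not reject such combinations.

```javascript
// Options the schema descriptions label "Azure only".
const AZURE_ONLY = ["pitch", "volume", "style", "styleDegree"];

// Hypothetical helper: return a list of problems found in an audioGuide config.
function checkAudioGuide(config) {
  const problems = [];
  const provider = config.provider ?? "azure"; // described default: azure

  if (!["azure", "openai"].includes(provider)) {
    problems.push(`unknown provider: ${provider}`);
  }
  if (config.speed !== undefined && (config.speed < 0.5 || config.speed > 2.0)) {
    problems.push("speed must be between 0.5 and 2.0");
  }
  if (config.styleDegree !== undefined && (config.styleDegree < 0.01 || config.styleDegree > 2.0)) {
    problems.push("styleDegree must be between 0.01 and 2.0");
  }
  if (config.model !== undefined && !["tts-1", "tts-1-hd"].includes(config.model)) {
    problems.push(`unknown model: ${config.model}`);
  }
  if (config.script !== undefined && config.script.length > 5000) {
    problems.push("script exceeds 5000 characters");
  }
  if (provider === "openai") {
    // Advisory only (assumption): these options are documented as Azure only.
    for (const key of AZURE_ONLY) {
      if (config[key] !== undefined) problems.push(`${key} is described as Azure only`);
    }
  }
  return problems;
}

console.log(checkAudioGuide({ enabled: true, provider: "openai", voice: "nova", speed: 2.5, pitch: "high" }));
```

With the sample input, the helper reports the out-of-range speed and the Azure-only pitch option.
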
@@ -1023,15 +1043,15 @@ Please follow this workflow:
 
  Important tips:
  - Use selectors from the inspect_page results — never guess selectors
- - Add wait steps (ms: 800-1200) between interactions for visual clarity
- - Use wait_for after navigation to ensure the page loads
+ - Do NOT add wait steps between every action — the pace parameter already handles timing between steps. Only use wait when: (1) the page needs time to load new content after navigation, or (2) you need to hold on a view for narration.
+ - Do NOT use zoom unless I specifically ask for it
  - **ALWAYS add a "note" field on every meaningful step** — notes render as styled tooltip annotations that explain what's happening, creating a guided tour experience. Examples:
  - navigate: note: "Opening the dashboard"
  - click: note: "This button creates a new project"
  - fill: note: "Enter your email to get started"
  - hover: note: "Hover to reveal the dropdown menu"
  - The ONLY steps without notes should be wait/wait_for (pauses)
- - Keep to 15 steps or fewer for best results
+ - Keep to 5-15 action steps for best results. Fewer steps = faster encoding and smaller files.
  - Each video costs 3 API requests`,
  },
  },
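
The revised tips in this prompt (avoid wait padding, avoid zoom unless requested, put a note on every action step, keep to 5-15 action steps) can be expressed as a small lint over a steps array. The sketch below is one possible reading of those tips. lintSteps is a hypothetical helper, and the padding threshold (at least as many wait steps as action steps) is an assumption chosen for the example; the diff does not define one.

```javascript
// Hypothetical lint for a record_video steps array, based on the tips above.
function lintSteps(steps) {
  const warnings = [];
  const isPause = (step) => step.action === "wait" || step.action === "wait_for";
  const actionSteps = steps.filter((step) => !isPause(step));
  const waitCount = steps.filter((step) => step.action === "wait").length;

  if (actionSteps.length > 15) {
    warnings.push(`${actionSteps.length} action steps; 5-15 is the suggested range`);
  }
  // Assumed threshold: as many waits as actions suggests padding between every step.
  if (waitCount > 0 && waitCount >= actionSteps.length) {
    warnings.push("wait steps look like padding; the pace parameter already spaces out steps");
  }
  steps.forEach((step, index) => {
    if (step.zoom && step.zoom.enabled) {
      warnings.push(`step ${index}: zoom enabled; only use it when explicitly requested`);
    }
    if (!isPause(step) && !step.note) {
      warnings.push(`step ${index}: missing note`);
    }
  });
  return warnings;
}

// A padded recording in the old style: a wait after every action, plus zoom.
const paddedSteps = [
  { action: "navigate", url: "https://example.com", note: "Opening the dashboard" },
  { action: "wait", ms: 1000 },
  { action: "click", selector: "#new-project", zoom: { enabled: true } },
  { action: "wait", ms: 1000 },
];

console.log(lintSteps(paddedSteps)); // three warnings: padding, zoom on step 2, missing note on step 2
```
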