pagebolt-mcp 1.9.0 → 1.10.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (3)

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "pagebolt-mcp",
-  "version": "1.9.0",
+  "version": "1.10.0",
   "description": "MCP server for PageBolt — take screenshots, generate PDFs, create OG images, inspect pages, record demo videos with Audio Guide narration, from AI coding assistants like Claude, Cursor, and Windsurf.",
   "main": "src/index.mjs",
   "module": "src/index.mjs",

package/server.json CHANGED

@@ -6,12 +6,12 @@
     "url": "https://github.com/Custodia-Admin/pagebolt-mcp",
     "source": "github"
   },
-  "version": "1.9.0",
+  "version": "1.10.0",
   "packages": [
     {
       "registryType": "npm",
       "identifier": "pagebolt-mcp",
-      "version": "1.9.0",
+      "version": "1.10.0",
       "transport": {
         "type": "stdio"
       },

package/src/index.mjs CHANGED

@@ -61,7 +61,7 @@ async function callApi(endpoint, options = {}) {
   const method = options.method || 'GET';
   const headers = {
     'x-api-key': API_KEY,
-    'user-agent': 'pagebolt-mcp/1.9.0',
+    'user-agent': 'pagebolt-mcp/1.10.0',
     ...(options.body ? { 'Content-Type': 'application/json' } : {}),
   };
   const body = options.body ? JSON.stringify(options.body) : undefined;
@@ -191,6 +191,29 @@ When building sequences or videos, ALWAYS use inspect_page first to discover rel
 This avoids guessing selectors like "#submit" when the actual element is "#submitBtn".
+## Handling Dynamic UI: Dropdowns, Popovers, and Modals
+Clicking menus, avatars, profile icons, "⋯" buttons, hamburger toggles, or anything that opens a dropdown/popover/modal creates an overlay that floats ABOVE the page. This is the #1 cause of broken multi-step automations:
+- Subsequent steps get visually obscured by the still-open overlay.
+- A click intended for the underlying page lands on the overlay (or its backdrop) and navigates somewhere unexpected.
+Rules:
+1. **Don't open menus you don't need.** For a high-level tour, navigate directly to the destination URL (from inspect_page / observe_page) instead of clicking through a dropdown.
+2. **If you open an overlay, the very next step must commit to it** — either interact with an element INSIDE the overlay, or explicitly close it before continuing. There is no "press_key" action, so close an overlay with an evaluate step (note: max 2 evaluate steps per sequence):
+   { "action": "evaluate", "script": "document.activeElement&&document.activeElement.blur&&document.activeElement.blur();document.dispatchEvent(new KeyboardEvent('keydown',{key:'Escape',bubbles:true}));" }
+   (Clicking a blank area can also work, but may hit the overlay backdrop and navigate — prefer the evaluate approach or click a known-safe element.)
+3. **Never chain clicks across a state change you haven't re-perceived.** Selectors gathered before a menu opened or a route changed may now point at the wrong (or covered) element.
+## Re-perceive Between Actions (avoid getting lost)
+run_sequence and record_video execute a FIXED, pre-planned list of steps — they do NOT re-check the page between steps. For anything beyond a short, predictable flow, work iteratively instead of blind-batching:
+1. observe_page (or take_screenshot) to see the CURRENT state.
+2. Perform ONE meaningful action (a short run_sequence, or a single click/fill).
+3. observe_page / take_screenshot AGAIN, then choose the next action from the fresh result.
+Repeat. This is how an agent recovers from unexpected popovers, redirects, or layout shifts. Use session_id (create_session, Starter+) on run_sequence to keep cookies/auth/scroll state across these iterations.
+For record_video specifically (one continuous capture, no mid-recording re-perception): keep the flow short and predictable, use ONLY selectors verified via inspect_page/observe_page, and add a dismiss step after anything that could open an overlay.
 ## Visual Diff
 Use visual_diff to compare two pages pixel-by-pixel. Returns a diff image with changed pixels highlighted in red.
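
Reviewer sketch (not part of the package): the overlay rules added in this hunk can be applied mechanically to a planned step list. The helper below is a minimal illustration under stated assumptions. `addDismissSteps`, the `opensOverlay` and `insideOverlay` planning flags, and the `click`/`selector` step shape are hypothetical names chosen for this example; only the `evaluate` step, its script, and the limit of 2 evaluate steps come from the instructions above.

```javascript
// Escape-dismiss script copied verbatim from the instructions added in 1.10.0.
const DISMISS_SCRIPT =
  "document.activeElement&&document.activeElement.blur&&document.activeElement.blur();" +
  "document.dispatchEvent(new KeyboardEvent('keydown',{key:'Escape',bubbles:true}));";

// Limit stated in the instructions ("max 2 evaluate steps per sequence").
const MAX_EVALUATE_STEPS = 2;

// Hypothetical helper. `opensOverlay` and `insideOverlay` are planning-only
// flags invented for this sketch; they are stripped before the steps are returned.
function addDismissSteps(plannedSteps) {
  const out = [];
  plannedSteps.forEach((step, i) => {
    const { opensOverlay, insideOverlay, ...apiStep } = step;
    out.push(apiStep);

    // Rule 2: after opening an overlay, the next step must act inside it,
    // otherwise the overlay is closed explicitly.
    const next = plannedSteps[i + 1];
    const nextActsInside = Boolean(next && next.insideOverlay);
    if (opensOverlay && !nextActsInside) {
      out.push({ action: 'evaluate', script: DISMISS_SCRIPT });
    }
  });

  // Fail early instead of sending a plan that exceeds the evaluate limit.
  const evaluateCount = out.filter((s) => s.action === 'evaluate').length;
  if (evaluateCount > MAX_EVALUATE_STEPS) {
    throw new Error(
      `Plan needs ${evaluateCount} evaluate steps; the limit is ${MAX_EVALUATE_STEPS}.`
    );
  }
  return out;
}
```

A plan that opens one menu and uses it, then opens a second menu and moves on, gains a single dismiss step after the second menu. A plan that would need three dismiss steps throws, which is a cue to follow rule 1 and navigate by URL instead of opening menus.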
@@ -261,7 +284,7 @@ Use blockBanners on almost every request to get clean captures. Combine blockAds
 function createConfiguredServer() {
   const srv = new McpServer({
     name: 'pagebolt',
-    version: '1.9.0',
+    version: '1.10.0',
   }, {
     instructions: SERVER_INSTRUCTIONS,
   });
@@ -873,10 +896,12 @@ server.tool(
     blockTrackers: z.boolean().optional().describe('Block tracking scripts'),
     blockRequests: z.array(z.string()).optional().describe('URL patterns to block'),
     blockResources: z.array(z.string()).optional().describe('Resource types to block'),
+    // ── Session ──
+    session_id: z.string().optional().describe('Inspect the LIVE state of a persistent session (Starter+; create with create_session) instead of a fresh page load. Omit url to inspect the page exactly as the last run_sequence/take_screenshot left it; pass url to navigate within the session first. Ideal for re-perceiving between agent actions.'),
   },
   async (params) => {
-    if (!params.url && !params.html) {
-      return { content: [{ type: 'text', text: 'Error: Either "url" or "html" is required.' }], isError: true };
+    if (!params.url && !params.html && !params.session_id) {
+      return { content: [{ type: 'text', text: 'Error: Either "url", "html", or "session_id" is required.' }], isError: true };
     }
     try {
@@ -1003,10 +1028,12 @@ server.tool(
     blockAds: z.boolean().optional().describe('Block advertisements on the page'),
     blockChats: z.boolean().optional().describe('Block live chat widgets'),
     blockTrackers: z.boolean().optional().describe('Block tracking scripts'),
+    // ── Session ──
+    session_id: z.string().optional().describe('Observe the LIVE state of a persistent session (Starter+; create with create_session) instead of a fresh page load. Omit url to observe the page exactly as the last run_sequence/take_screenshot left it; pass url to navigate within the session first. This is the recommended way to re-perceive between agent actions and recover from popovers/redirects.'),
   },
   async (params) => {
-    if (!params.url && !params.html) {
-      return { content: [{ type: 'text', text: 'Error: Either "url" or "html" is required.' }], isError: true };
+    if (!params.url && !params.html && !params.session_id) {
+      return { content: [{ type: 'text', text: 'Error: Either "url", "html", or "session_id" is required.' }], isError: true };
     }
     try {
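
Reviewer sketch (not part of the package): the two hunks above relax the same guard so that `session_id` alone is an acceptable target. The function below restates that guard and labels the call shapes the new parameter description mentions. `resolveTarget` and the mode names are hypothetical, and the `html` plus `session_id` combination is marked as undocumented because the diff does not describe it.

```javascript
// Hypothetical helper mirroring the guard added in 1.10.0; the mode labels
// are this sketch's reading of the session_id description, not package API.
function resolveTarget(params) {
  // Same condition as the updated guard in the diff.
  if (!params.url && !params.html && !params.session_id) {
    return {
      isError: true,
      text: 'Error: Either "url", "html", or "session_id" is required.',
    };
  }

  // No session: behaves as before, a fresh page load from url or html.
  if (!params.session_id) return { isError: false, mode: 'fresh-load' };

  // Session plus url: navigate within the session first.
  if (params.url) return { isError: false, mode: 'session-navigate' };

  // Session plus html: not described in the diff, so not modeled here.
  if (params.html) return { isError: false, mode: 'undocumented' };

  // Session only: the page as the last run_sequence/take_screenshot left it.
  return { isError: false, mode: 'session-live' };
}
```

For example, `resolveTarget({ session_id: 's1' })` yields the `session-live` mode, which is the call shape the description recommends for re-perceiving between agent actions, while `resolveTarget({})` still returns the error.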
@@ -1454,6 +1481,9 @@ Based on the inspection and the description, plan 5–12 action steps. Rules:
   { "action": "wait", "ms": 1500, "live": true }
 - Do NOT pad with wait steps between steps that don't need load time — pace handles inter-step timing automatically.
 - Do NOT use zoom unless the user explicitly asked for it.
+- **Avoid opening dropdowns/menus/popovers** unless the demo is specifically about their contents — they stay open and obscure or misdirect later steps. Prefer navigating directly to the target URL (from the inspection) over clicking through a menu. The recording cannot re-check the page between steps, so a stuck-open overlay will break everything after it.
+- If a step DOES open an overlay, the next step must either act on an element inside it or close it. There is no key-press action; close with an evaluate step (max 2 per video):
+  { "action": "evaluate", "script": "document.activeElement&&document.activeElement.blur&&document.activeElement.blur();document.dispatchEvent(new KeyboardEvent('keydown',{key:'Escape',bubbles:true}));" }
 **Step 3 — Write the narration script**
 Write an audioGuide.script that matches the step count. Format: