illuma-agents 1.0.20 → 1.0.21

@@ -476,24 +476,29 @@ export function createBrowserGetPageStateTool(): DynamicStructuredTool<typeof Br
  requiresBrowserExecution: true,
  // Special flag: extension should inject fresh context into the conversation
  requiresContextRefresh: true,
+ // IMPORTANT: Tell the agent to wait
+ message: 'Page state is being captured by the browser extension. The element list will be provided in the next message. DO NOT proceed with click or type actions until you receive the actual element list.',
  });
  },
  {
  name: EBrowserTools.GET_PAGE_STATE,
  description: `Get fresh page state showing current interactive elements.
 
- **CRITICAL**: You MUST call this tool after:
- - browser_navigate (to see elements on the new page)
- - browser_click (if it caused navigation or page changes)
- - Any action that might have changed the visible elements
+ **CRITICAL WORKFLOW**: After calling this tool, you MUST STOP and WAIT. The browser extension will capture the page state and return the element list. DO NOT plan any browser_click or browser_type actions in the same response - you don't have the element indices yet!
 
- This tool returns the updated list of interactive elements with their [index] numbers.
- Without calling this after navigation, you will NOT know what elements exist on the new page.
+ **When to use**:
+ - After browser_navigate (to see elements on the new page)
+ - After browser_click (if it caused navigation or page changes)
+ - Any time you need to see what elements are currently on the page
 
- **Workflow example**:
- 1. browser_navigate to amazon.com
- 2. browser_get_page_state to see Amazon's elements
- 3. Now you can see the search input's [index] and use browser_type
+ **IMPORTANT**: This tool captures the page state asynchronously. The actual element list will be provided AFTER this tool completes. You should:
+ 1. Call this tool
+ 2. STOP and wait for the response with the element list
+ 3. In your NEXT response, use the element indices for click/type actions
+
+ Example workflow:
+ - Turn 1: browser_navigate to amazon.com, then browser_get_page_state
+ - Turn 2: (After receiving element list) browser_type with the correct search input index
 
  Example: browser_get_page_state({ reason: "to see elements after navigation" })`,
  schema: BrowserGetPageStateSchema,
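
Reviewer note: the hunk above changes the tool from "returns the element list" to "returns a placeholder and asks the agent to wait". A minimal sketch of how a caller might detect that placeholder is given below. Only the three field names come from the diff; the `PendingPageState` type, both function names, and the assumption that the payload is serialized with `JSON.stringify` are hypothetical and not confirmed by the package source.

```typescript
// Hypothetical shape of the placeholder payload. Only the field names
// (requiresBrowserExecution, requiresContextRefresh, message) appear in the diff.
interface PendingPageState {
  requiresBrowserExecution: boolean;
  requiresContextRefresh: boolean;
  message: string;
}

// Assumption: the tool serializes its result as JSON.
function buildPendingPageState(): string {
  const payload: PendingPageState = {
    requiresBrowserExecution: true,
    requiresContextRefresh: true,
    // Shortened from the message text in the diff.
    message: "Page state is being captured by the browser extension.",
  };
  return JSON.stringify(payload);
}

// Hypothetical guard: true when the tool result is only a placeholder,
// so click/type actions should wait for the real element list.
function mustWaitForElements(toolResult: string): boolean {
  try {
    const parsed = JSON.parse(toolResult) as Partial<PendingPageState>;
    return parsed.requiresContextRefresh === true;
  } catch {
    // Not JSON (for example, an already-rendered element list).
    return false;
  }
}

const pending = buildPendingPageState();
const shouldWait = mustWaitForElements(pending);
const plainText = mustWaitForElements("[0] <input> search");
```

Here `shouldWait` corresponds to Turn 1 of the example workflow (stop after `browser_get_page_state`), and `plainText` to Turn 2, once an element list has arrived.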
package/src/types/run.ts CHANGED
@@ -115,6 +115,14 @@ export type RunConfig = {
  returnContent?: boolean;
  tokenCounter?: TokenCounter;
  indexTokenCountMap?: Record<string, number>;
+ /**
+ * Enable browser extension mode with interrupt-based tool execution.
+ * When true:
+ * - Uses MemorySaver checkpointer for pause/resume
+ * - Browser tools will interrupt execution and wait for extension results
+ * - Extension must call resume endpoint with Command to continue
+ */
+ browserMode?: boolean;
  };
 
  export type ProvidedCallbacks =
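
Reviewer note: the `browserMode` doc comment describes a pause/resume cycle (a browser tool interrupts, the extension supplies a result, execution continues). The sketch below illustrates that control flow only. It is a library-free stand-in that uses a generator in place of the MemorySaver checkpointer and the resume Command; `BrowserRequest`, `BrowserResult`, and `runBrowserStep` are hypothetical names, not part of illuma-agents or LangGraph.

```typescript
// Hypothetical request/result shapes exchanged with the extension.
type BrowserRequest = { tool: string; args: Record<string, unknown> };
type BrowserResult = { elements: string[] };

// A generator stands in for interrupt-based execution:
// `yield` pauses the run, and `next(result)` resumes it with the extension's result.
function* runBrowserStep(
  request: BrowserRequest
): Generator<BrowserRequest, string, BrowserResult> {
  // Pause: hand the request to the extension and wait.
  const result = yield request;
  // Resumed: the element list is now available.
  return `Received ${result.elements.length} elements from ${request.tool}`;
}

const run = runBrowserStep({
  tool: "browser_get_page_state",
  args: { reason: "to see elements after navigation" },
});

// Runs until the yield; the run is now paused waiting for the extension.
const paused = run.next();

// The extension "resumes" the run by supplying its result.
const resumed = run.next({
  elements: ["[0] <input> search", "[1] <button> Go"],
});
```

In the real package, state between the pause and the resume would presumably be held by the checkpointer rather than by an in-memory generator, which is why the doc comment calls out MemorySaver.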