npm - @eko-ai/eko - Versions diffs - 1.0.7 → 1.0.8 - Mend

@eko-ai/eko 1.0.7 → 1.0.8

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (21) hide show

package/README.md +66 -23
package/dist/extension/content/index.d.ts +1 -0
package/dist/extension/tools/browser.d.ts +2 -1
package/dist/extension/tools/index.d.ts +2 -1
package/dist/extension/tools/request_login.d.ts +10 -0
package/dist/extension/utils.d.ts +1 -0
package/dist/extension.cjs.js +137 -20
package/dist/extension.esm.js +137 -20
package/dist/extension_content_script.js +129 -2
package/dist/index.cjs.js +85 -5
package/dist/index.esm.js +85 -5
package/dist/models/action.d.ts +1 -0
package/dist/nodejs/script/build_dom_tree.d.ts +1 -0
package/dist/nodejs/tools/browser_use.d.ts +28 -0
package/dist/nodejs/tools/index.d.ts +1 -0
package/dist/nodejs.cjs.js +71428 -11
package/dist/nodejs.esm.js +71422 -5
package/dist/web/tools/browser.d.ts +2 -1
package/dist/web.cjs.js +29 -17
package/dist/web.esm.js +29 -17
package/package.json +5 -7

package/README.md CHANGED Viewed

@@ -1,17 +1,31 @@
-# Eko
-[![License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE) [![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg)](https://example.com/build-status) [![Version](https://img.shields.io/badge/version-1.0.5-yellow.svg)](https://eko.fellou.ai/docs/release/versions/)
-**Eko** is a revolutionary framework designed to empower developers and users alike to program their browser and operating system using natural language. With seamless integration of browser APIs, OS-level capabilities, and cutting-edge AI tools like Claude 3.5, Eko redefines how we interact with technology, making it intuitive, powerful, and accessible.
-## Key Features
+<h1 align="center">
+  <a href="https://github.com/FellouAI/eko" target="_blank">
+    <img src="https://github.com/user-attachments/assets/55dbdd6c-2b08-4e5f-a841-8fea7c2a0b92" alt="eko-logo" width="200" height="200">
+  </a>
+  <br>
+  <small>Eko - Build Production-ready Agentic Workflow with Natural Language</small>
+</h1>
-- **Natural Language Programming**: Transform human instructions into e- **Natural Language Programming**: Convert natural language task descriptions into executable workflows, making agent development more intuitive
-- **Two-Layer Execution Model**: Separate offline planning from online execution, making agent decisions more structured and explainable
-- **Comprehensive Tooling**: Rich built-in tools for browser automation, computer control, file operations, and web interactions
-- **Hybrid Drive System:** Combine LLM capabilities with developer control, enabling "human in the loop" and allowing interference at multiple levels of granularity
-- **Event-Driven Automation:** Trigger workflows based on browser or system events
-- **Environment Flexibility**: Work across different environments ( [Browser Extension](/docs/browseruse/browser-extension), [Web](/docs/browseruse/browser-web), [Node.js](/docs/computeruse/computer-node), [Next-Gen AI Browser Fellou](/docs/computeruse/computer-fellou) ) with consistent APIs
+[![License](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE) [![Build Status](https://img.shields.io/badge/build-passing-brightgreen.svg)](https://example.com/build-status) [![Version](https://img.shields.io/github/package-json/v/FellouAI/eko?color=yellow)](https://eko.fellou.ai/docs/release/versions/)
+Eko (pronounced like ‘echo’) is a production-ready JavaScript framework that enables developers to create reliable agents, **from simple commands to complex workflows**. It provides a unified interface for running agents in both **computer and browser environments**.
+# Framework Comparison
+| Feature                              | Eko   | Langchain  | Browser-use  | Dify.ai  | Coze   |
+|--------------------------------------|-------|------------|--------------|----------|--------|
+| **Supported Platform**               | **All platform**  | Server side  | Browser  | Web  | Web  |
+| **One sentence to multi-step workflow** | ✅    | ❌          | ✅            | ❌        | ❌      |
+| **Intervenability**                  | ✅    | ✅          | ❌            | ❌        | ❌      |
+| **Development Efficiency**           | **High**  | Low      | Middle        | Middle    | Low    |
+| **Task Complexity**           | High  | High      | Low        | Middle    | Middle    | Middle       |
+| **Open-source**                      | ✅    | ✅          | ✅            | ✅        | ❌      |
+| **Access to private web resources** | ✅ **(Coming soon)** | ❌          | ❌            | ❌        | ❌      |
 ## Quickstart
@@ -19,7 +33,7 @@
 npm install @eko-ai/eko
 ```
-> The following code is for reference only. For detailed usage, please refer to the [Eko Quickstart guide](https://eko.fellou.ai/docs/getting-started/quickstart/).
+> For detailed usage, please refer to the [Eko Quickstart guide](https://eko.fellou.ai/docs/getting-started/quickstart/).
 ```typescript
 import { Eko } from '@eko-ai/eko';
@@ -38,6 +52,46 @@ await eko.execute(sysWorkflow);
 ```
+## Demos
+**Propmt:** `Collect the latest NASDAQ data on Yahoo Finance, including price changes, market capitalization, trading volume of major stocks, analyze the data and generate visualization reports`.
+https://github.com/user-attachments/assets/4087b370-8eb8-4346-a549-c4ce4d1efec3
+Click [here](https://github.com/FellouAI/eko-demos/tree/main/browser-extension-stock) to get the source code.
+---
+**Propmt:** `Based on the README of FellouAI/eko on github, search for competitors, highlight the key contributions of Eko, write a blog post advertising Eko, and post it on Write.as.`
+https://github.com/user-attachments/assets/6feaea86-2fb9-4e5c-b510-479c2473d810
+Click [here](https://github.com/FellouAI/eko-demos/tree/main/browser-extension-blog) to get the source code.
+---
+**Propmt:** `Clean up all files in the current directory larger than 1MB`
+https://github.com/user-attachments/assets/ef7feb58-3ddd-4296-a1de-bb8b6c66e48b
+Click [here](https://eko.fellou.ai/docs/computeruse/computer-node/#example-file-cleanup-workflow) to Learn more.
+---
+**Propmt:** Automatic software testing
+```
+    Current login page automation test:
+    1. Correct account and password are: admin / 666666
+    2. Please randomly combine usernames and passwords for testing to verify if login validation works properly, such as: username cannot be empty, password cannot be empty, incorrect username, incorrect password
+    3. Finally, try to login with the correct account and password to verify if login is successful
+    4. Generate test report and export
+```
+https://github.com/user-attachments/assets/7716300a-c51d-41f1-8d4f-e3f593c1b6d5
+Click [here](https://eko.fellou.ai/docs/browseruse/browser-web#example-login-automation-testing) to Learn more.
 ## Use Cases
 - Browser automation and web scraping
@@ -64,24 +118,13 @@ Eko can be used in multiple environments:
 - Browser Extension
 - Web Applications
 - Node.js Applications
-- [Fellou AI Browser](https://fellou.ai)
 ## Community and Support
 - Report issues on [GitHub Issues](https://github.com/FellouAI/eko/issues)
+- Join our [slack community discussions](https://join.slack.com/t/eko-ai/shared_invite/zt-2xhvkudv9-nHvD1g8Smp227sM51x_Meg)
 - Contribute tools and improvements
 - Share your use cases and feedback
-- Join our community discussions
-## Contributing
-We welcome contributions! See our [Contributing Guide](CONTRIBUTING.md) for details on:
-- Setting up the development environment
-- Code style guidelines
-- Submission process
-- Tool development
-- Use case optimization
 ## License

package/dist/extension/content/index.d.ts CHANGED Viewed

@@ -13,3 +13,4 @@ declare function get_dropdown_options(request: any): {
     name: any;
 } | null;
 declare function select_dropdown_option(request: any): any;
+declare function request_user_help(task_id: string, failure_type: string, failure_message: string): void;

package/dist/extension/tools/browser.d.ts CHANGED Viewed

@@ -10,7 +10,8 @@ export declare function right_click(tabId: number, coordinate?: [number, number]
 export declare function right_click_by(tabId: number, xpath?: string, highlightIndex?: number): Promise<any>;
 export declare function double_click(tabId: number, coordinate?: [number, number]): Promise<any>;
 export declare function double_click_by(tabId: number, xpath?: string, highlightIndex?: number): Promise<any>;
-export declare function screenshot(windowId: number): Promise<ScreenshotResult>;
+export declare function screenshot(windowId: number, compress?: boolean): Promise<ScreenshotResult>;
+export declare function compress_image(dataUrl: string, scale?: number, quality?: number): Promise<string>;
 export declare function scroll_to(tabId: number, coordinate: [number, number]): Promise<any>;
 export declare function scroll_to_by(tabId: number, xpath?: string, highlightIndex?: number): Promise<any>;
 export declare function get_dropdown_options(tabId: number, xpath?: string, highlightIndex?: number): Promise<any>;

package/dist/extension/tools/index.d.ts CHANGED Viewed

@@ -7,4 +7,5 @@ import { OpenUrl } from './open_url';
 import { Screenshot } from './screenshot';
 import { TabManagement } from './tab_management';
 import { WebSearch } from './web_search';
-export { BrowserUse, ElementClick, ExportFile, ExtractContent, FindElementPosition, OpenUrl, Screenshot, TabManagement, WebSearch, };
+import { RequestLogin } from './request_login';
+export { BrowserUse, ElementClick, ExportFile, ExtractContent, FindElementPosition, OpenUrl, Screenshot, TabManagement, WebSearch, RequestLogin, };

package/dist/extension/tools/request_login.d.ts ADDED Viewed

@@ -0,0 +1,10 @@
+import { Tool, InputSchema, ExecutionContext } from '../../types/action.types';
+export declare class RequestLogin implements Tool<any, any> {
+    name: string;
+    description: string;
+    input_schema: InputSchema;
+    constructor();
+    execute(context: ExecutionContext, params: any): Promise<any>;
+    awaitLogin(tabId: number, task_id: string): Promise<boolean>;
+    isLoginIn(context: ExecutionContext): Promise<boolean>;
+}

package/dist/extension/utils.d.ts CHANGED Viewed

@@ -5,6 +5,7 @@ export declare function getCurrentTabId(windowId?: number | undefined): Promise<
 export declare function open_new_tab(url: string, newWindow: boolean, windowId?: number): Promise<chrome.tabs.Tab>;
 export declare function executeScript(tabId: number, func: any, args: any[]): Promise<any>;
 export declare function waitForTabComplete(tabId: number, timeout?: number): Promise<chrome.tabs.Tab>;
+export declare function doesTabExists(tabId: number): Promise<unknown>;
 export declare function getPageSize(tabId?: number): Promise<[number, number]>;
 export declare function sleep(time: number): Promise<void>;
 export declare function injectScript(tabId: number, filename?: string): Promise<void>;

package/dist/extension.cjs.js CHANGED Viewed

@@ -138,6 +138,19 @@ async function waitForTabComplete(tabId, timeout = 15000) {
         chrome.tabs.onUpdated.addListener(listener);
     });
 }
+async function doesTabExists(tabId) {
+    const tabExists = await new Promise((resolve) => {
+        chrome.tabs.get(tabId, (tab) => {
+            if (chrome.runtime.lastError) {
+                resolve(false);
+            }
+            else {
+                resolve(true);
+            }
+        });
+    });
+    return tabExists;
+}
 async function getPageSize(tabId) {
     if (!tabId) {
         tabId = await getCurrentTabId();
@@ -235,6 +248,7 @@ var utils = /*#__PURE__*/Object.freeze({
     __proto__: null,
     CountDownLatch: CountDownLatch,
     MsgEvent: MsgEvent,
+    doesTabExists: doesTabExists,
     executeScript: executeScript,
     getCurrentTabId: getCurrentTabId,
     getPageSize: getPageSize,
@@ -339,11 +353,21 @@ async function double_click_by(tabId, xpath, highlightIndex) {
         highlightIndex,
     });
 }
-async function screenshot(windowId) {
-    let dataUrl = await chrome.tabs.captureVisibleTab(windowId, {
-        format: 'jpeg', // jpeg / png
-        quality: 50, // 0-100
-    });
+async function screenshot(windowId, compress) {
+    let dataUrl;
+    if (compress) {
+        dataUrl = await chrome.tabs.captureVisibleTab(windowId, {
+            format: 'jpeg',
+            quality: 60, // 0-100
+        });
+        dataUrl = await compress_image(dataUrl, 0.7, 1);
+    }
+    else {
+        dataUrl = await chrome.tabs.captureVisibleTab(windowId, {
+            format: 'jpeg',
+            quality: 50,
+        });
+    }
     let data = dataUrl.substring(dataUrl.indexOf('base64,') + 7);
     return {
         image: {
@@ -353,6 +377,23 @@ async function screenshot(windowId) {
         },
     };
 }
+async function compress_image(dataUrl, scale = 0.8, quality = 0.8) {
+    const bitmap = await createImageBitmap(await (await fetch(dataUrl)).blob());
+    let width = bitmap.width * scale;
+    let height = bitmap.height * scale;
+    const canvas = new OffscreenCanvas(width, height);
+    const ctx = canvas.getContext('2d');
+    ctx.drawImage(bitmap, 0, 0, width, height);
+    const blob = await canvas.convertToBlob({
+        type: 'image/jpeg',
+        quality: quality,
+    });
+    return new Promise((resolve) => {
+        const reader = new FileReader();
+        reader.onloadend = () => resolve(reader.result);
+        reader.readAsDataURL(blob);
+    });
+}
 async function scroll_to(tabId, coordinate) {
     let from_coordinate = (await cursor_position(tabId)).coordinate;
     return await chrome.tabs.sendMessage(tabId, {
@@ -397,6 +438,7 @@ var browser = /*#__PURE__*/Object.freeze({
     __proto__: null,
     clear_input: clear_input,
     clear_input_by: clear_input_by,
+    compress_image: compress_image,
     cursor_position: cursor_position,
     double_click: double_click,
     double_click_by: double_click_by,
@@ -423,7 +465,6 @@ class BrowserUse {
         this.name = 'browser_use';
         this.description = `Use structured commands to interact with the browser, manipulating page elements through screenshots and webpage element extraction.
 * This is a browser GUI interface where you need to analyze webpages by taking screenshots and extracting page element structures, and specify action sequences to complete designated tasks.
-* Some operations may need time to process, so you might need to wait and continuously take screenshots and extract element structures to check the operation results.
 * Before any operation, you must first call the \`screenshot_extract_element\` command, which will return the browser page screenshot and structured element information, both specially processed.
 * ELEMENT INTERACTION:
    - Only use indexes that exist in the provided element list
@@ -433,17 +474,7 @@ class BrowserUse {
    - If no suitable elements exist, use other functions to complete the task
    - If stuck, try alternative approaches
    - Handle popups/cookies by accepting or closing them
-   - Use scroll to find elements you are looking for
-* Form filling:
-   - If you fill a input field and your action sequence is interrupted, most often a list with suggestions poped up under the field and you need to first select the right element from the suggestion list.
-* ACTION SEQUENCING:
-   - Actions are executed in the order they appear in the list
-   - Each action should logically follow from the previous one
-   - If the page changes after an action, the sequence is interrupted and you get the new state.
-   - If content only disappears the sequence continues.
-   - Only provide the action sequence until you think the page will change.
-   - Try to be efficient, e.g. fill forms at once, or chain actions where nothing changes on the page like saving, extracting, checkboxes...
-   - only use multiple actions if it makes sense.`;
+   - Use scroll to find elements you are looking for`;
         this.input_schema = {
             type: 'object',
             properties: {
@@ -585,7 +616,7 @@ class BrowserUse {
                         return window.get_clickable_elements(true);
                     }, []);
                     context.selector_map = element_result.selector_map;
-                    let screenshot$1 = await screenshot(windowId);
+                    let screenshot$1 = await screenshot(windowId, true);
                     await executeScript(tabId, () => {
                         return window.remove_highlight();
                     }, []);
@@ -817,7 +848,7 @@ ${pseudoHtml}
 async function executeWithBrowserUse$1(context, task_prompt) {
     let tabId = await getTabId(context);
     let windowId = await getWindowId(context);
-    let screenshot_result = await screenshot(windowId);
+    let screenshot_result = await screenshot(windowId, false);
     let messages = [
         {
             role: 'user',
@@ -1065,7 +1096,7 @@ ${pseudoHtml}
 async function executeWithBrowserUse(context, task_prompt) {
     await getTabId(context);
     let windowId = await getWindowId(context);
-    let screenshot_result = await screenshot(windowId);
+    let screenshot_result = await screenshot(windowId, false);
     let messages = [
         {
             role: 'user',
@@ -1643,6 +1674,91 @@ async function doPageContent(taskId, detailLinkGroups, window) {
     return searchInfo;
 }
+class RequestLogin {
+    constructor() {
+        this.name = 'request_login';
+        this.description =
+            'Login to this website, assist with identity verification when manual intervention is needed, guide users through the login process, and wait for their confirmation of successful login.';
+        this.input_schema = {
+            type: 'object',
+            properties: {},
+        };
+    }
+    async execute(context, params) {
+        if (!params.force && await this.isLoginIn(context)) {
+            return true;
+        }
+        let tabId = await getTabId(context);
+        let task_id = 'login_required_' + tabId;
+        const request_user_help = async () => {
+            await chrome.tabs.sendMessage(tabId, {
+                type: 'request_user_help',
+                task_id,
+                failure_type: 'login_required',
+                failure_message: 'Access page require user authentication.',
+            });
+        };
+        const login_interval = setInterval(async () => {
+            try {
+                request_user_help();
+            }
+            catch (e) {
+                clearInterval(login_interval);
+            }
+        }, 2000);
+        try {
+            return await this.awaitLogin(tabId, task_id);
+        }
+        finally {
+            clearInterval(login_interval);
+        }
+    }
+    async awaitLogin(tabId, task_id) {
+        return new Promise((resolve) => {
+            const checkTabClosedInterval = setInterval(async () => {
+                const tabExists = await doesTabExists(tabId);
+                if (!tabExists) {
+                    clearInterval(checkTabClosedInterval);
+                    resolve(false);
+                    chrome.runtime.onMessage.removeListener(listener);
+                }
+            }, 1000);
+            const listener = (message) => {
+                if (message.type === 'issue_resolved' && message.task_id === task_id) {
+                    resolve(true);
+                    clearInterval(checkTabClosedInterval);
+                }
+            };
+            chrome.runtime.onMessage.addListener(listener);
+        });
+    }
+    async isLoginIn(context) {
+        let windowId = await getWindowId(context);
+        let screenshot_result = await screenshot(windowId, true);
+        let messages = [
+            {
+                role: 'user',
+                content: [
+                    {
+                        type: 'image',
+                        source: screenshot_result.image,
+                    },
+                    {
+                        type: 'text',
+                        text: 'Check if the current website is logged in. If not logged in, output `NOT_LOGIN`. If logged in, output `LOGGED_IN`. Output directly without explanation.',
+                    },
+                ],
+            },
+        ];
+        let response = await context.llmProvider.generateText(messages, { maxTokens: 256 });
+        let text = response.textContent;
+        if (!text) {
+            text = JSON.stringify(response.content);
+        }
+        return text.indexOf('LOGGED_IN') > -1;
+    }
+}
 var tools = /*#__PURE__*/Object.freeze({
     __proto__: null,
     BrowserUse: BrowserUse,
@@ -1651,6 +1767,7 @@ var tools = /*#__PURE__*/Object.freeze({
     ExtractContent: ExtractContent,
     FindElementPosition: FindElementPosition,
     OpenUrl: OpenUrl,
+    RequestLogin: RequestLogin,
     Screenshot: Screenshot,
     TabManagement: TabManagement,
     WebSearch: WebSearch

package/dist/extension.esm.js CHANGED Viewed

@@ -136,6 +136,19 @@ async function waitForTabComplete(tabId, timeout = 15000) {
         chrome.tabs.onUpdated.addListener(listener);
     });
 }
+async function doesTabExists(tabId) {
+    const tabExists = await new Promise((resolve) => {
+        chrome.tabs.get(tabId, (tab) => {
+            if (chrome.runtime.lastError) {
+                resolve(false);
+            }
+            else {
+                resolve(true);
+            }
+        });
+    });
+    return tabExists;
+}
 async function getPageSize(tabId) {
     if (!tabId) {
         tabId = await getCurrentTabId();
@@ -233,6 +246,7 @@ var utils = /*#__PURE__*/Object.freeze({
     __proto__: null,
     CountDownLatch: CountDownLatch,
     MsgEvent: MsgEvent,
+    doesTabExists: doesTabExists,
     executeScript: executeScript,
     getCurrentTabId: getCurrentTabId,
     getPageSize: getPageSize,
@@ -337,11 +351,21 @@ async function double_click_by(tabId, xpath, highlightIndex) {
         highlightIndex,
     });
 }
-async function screenshot(windowId) {
-    let dataUrl = await chrome.tabs.captureVisibleTab(windowId, {
-        format: 'jpeg', // jpeg / png
-        quality: 50, // 0-100
-    });
+async function screenshot(windowId, compress) {
+    let dataUrl;
+    if (compress) {
+        dataUrl = await chrome.tabs.captureVisibleTab(windowId, {
+            format: 'jpeg',
+            quality: 60, // 0-100
+        });
+        dataUrl = await compress_image(dataUrl, 0.7, 1);
+    }
+    else {
+        dataUrl = await chrome.tabs.captureVisibleTab(windowId, {
+            format: 'jpeg',
+            quality: 50,
+        });
+    }
     let data = dataUrl.substring(dataUrl.indexOf('base64,') + 7);
     return {
         image: {
@@ -351,6 +375,23 @@ async function screenshot(windowId) {
         },
     };
 }
+async function compress_image(dataUrl, scale = 0.8, quality = 0.8) {
+    const bitmap = await createImageBitmap(await (await fetch(dataUrl)).blob());
+    let width = bitmap.width * scale;
+    let height = bitmap.height * scale;
+    const canvas = new OffscreenCanvas(width, height);
+    const ctx = canvas.getContext('2d');
+    ctx.drawImage(bitmap, 0, 0, width, height);
+    const blob = await canvas.convertToBlob({
+        type: 'image/jpeg',
+        quality: quality,
+    });
+    return new Promise((resolve) => {
+        const reader = new FileReader();
+        reader.onloadend = () => resolve(reader.result);
+        reader.readAsDataURL(blob);
+    });
+}
 async function scroll_to(tabId, coordinate) {
     let from_coordinate = (await cursor_position(tabId)).coordinate;
     return await chrome.tabs.sendMessage(tabId, {
@@ -395,6 +436,7 @@ var browser = /*#__PURE__*/Object.freeze({
     __proto__: null,
     clear_input: clear_input,
     clear_input_by: clear_input_by,
+    compress_image: compress_image,
     cursor_position: cursor_position,
     double_click: double_click,
     double_click_by: double_click_by,
@@ -421,7 +463,6 @@ class BrowserUse {
         this.name = 'browser_use';
         this.description = `Use structured commands to interact with the browser, manipulating page elements through screenshots and webpage element extraction.
 * This is a browser GUI interface where you need to analyze webpages by taking screenshots and extracting page element structures, and specify action sequences to complete designated tasks.
-* Some operations may need time to process, so you might need to wait and continuously take screenshots and extract element structures to check the operation results.
 * Before any operation, you must first call the \`screenshot_extract_element\` command, which will return the browser page screenshot and structured element information, both specially processed.
 * ELEMENT INTERACTION:
    - Only use indexes that exist in the provided element list
@@ -431,17 +472,7 @@ class BrowserUse {
    - If no suitable elements exist, use other functions to complete the task
    - If stuck, try alternative approaches
    - Handle popups/cookies by accepting or closing them
-   - Use scroll to find elements you are looking for
-* Form filling:
-   - If you fill a input field and your action sequence is interrupted, most often a list with suggestions poped up under the field and you need to first select the right element from the suggestion list.
-* ACTION SEQUENCING:
-   - Actions are executed in the order they appear in the list
-   - Each action should logically follow from the previous one
-   - If the page changes after an action, the sequence is interrupted and you get the new state.
-   - If content only disappears the sequence continues.
-   - Only provide the action sequence until you think the page will change.
-   - Try to be efficient, e.g. fill forms at once, or chain actions where nothing changes on the page like saving, extracting, checkboxes...
-   - only use multiple actions if it makes sense.`;
+   - Use scroll to find elements you are looking for`;
         this.input_schema = {
             type: 'object',
             properties: {
@@ -583,7 +614,7 @@ class BrowserUse {
                         return window.get_clickable_elements(true);
                     }, []);
                     context.selector_map = element_result.selector_map;
-                    let screenshot$1 = await screenshot(windowId);
+                    let screenshot$1 = await screenshot(windowId, true);
                     await executeScript(tabId, () => {
                         return window.remove_highlight();
                     }, []);
@@ -815,7 +846,7 @@ ${pseudoHtml}
 async function executeWithBrowserUse$1(context, task_prompt) {
     let tabId = await getTabId(context);
     let windowId = await getWindowId(context);
-    let screenshot_result = await screenshot(windowId);
+    let screenshot_result = await screenshot(windowId, false);
     let messages = [
         {
             role: 'user',
@@ -1063,7 +1094,7 @@ ${pseudoHtml}
 async function executeWithBrowserUse(context, task_prompt) {
     await getTabId(context);
     let windowId = await getWindowId(context);
-    let screenshot_result = await screenshot(windowId);
+    let screenshot_result = await screenshot(windowId, false);
     let messages = [
         {
             role: 'user',
@@ -1641,6 +1672,91 @@ async function doPageContent(taskId, detailLinkGroups, window) {
     return searchInfo;
 }
+class RequestLogin {
+    constructor() {
+        this.name = 'request_login';
+        this.description =
+            'Login to this website, assist with identity verification when manual intervention is needed, guide users through the login process, and wait for their confirmation of successful login.';
+        this.input_schema = {
+            type: 'object',
+            properties: {},
+        };
+    }
+    async execute(context, params) {
+        if (!params.force && await this.isLoginIn(context)) {
+            return true;
+        }
+        let tabId = await getTabId(context);
+        let task_id = 'login_required_' + tabId;
+        const request_user_help = async () => {
+            await chrome.tabs.sendMessage(tabId, {
+                type: 'request_user_help',
+                task_id,
+                failure_type: 'login_required',
+                failure_message: 'Access page require user authentication.',
+            });
+        };
+        const login_interval = setInterval(async () => {
+            try {
+                request_user_help();
+            }
+            catch (e) {
+                clearInterval(login_interval);
+            }
+        }, 2000);
+        try {
+            return await this.awaitLogin(tabId, task_id);
+        }
+        finally {
+            clearInterval(login_interval);
+        }
+    }
+    async awaitLogin(tabId, task_id) {
+        return new Promise((resolve) => {
+            const checkTabClosedInterval = setInterval(async () => {
+                const tabExists = await doesTabExists(tabId);
+                if (!tabExists) {
+                    clearInterval(checkTabClosedInterval);
+                    resolve(false);
+                    chrome.runtime.onMessage.removeListener(listener);
+                }
+            }, 1000);
+            const listener = (message) => {
+                if (message.type === 'issue_resolved' && message.task_id === task_id) {
+                    resolve(true);
+                    clearInterval(checkTabClosedInterval);
+                }
+            };
+            chrome.runtime.onMessage.addListener(listener);
+        });
+    }
+    async isLoginIn(context) {
+        let windowId = await getWindowId(context);
+        let screenshot_result = await screenshot(windowId, true);
+        let messages = [
+            {
+                role: 'user',
+                content: [
+                    {
+                        type: 'image',
+                        source: screenshot_result.image,
+                    },
+                    {
+                        type: 'text',
+                        text: 'Check if the current website is logged in. If not logged in, output `NOT_LOGIN`. If logged in, output `LOGGED_IN`. Output directly without explanation.',
+                    },
+                ],
+            },
+        ];
+        let response = await context.llmProvider.generateText(messages, { maxTokens: 256 });
+        let text = response.textContent;
+        if (!text) {
+            text = JSON.stringify(response.content);
+        }
+        return text.indexOf('LOGGED_IN') > -1;
+    }
+}
 var tools = /*#__PURE__*/Object.freeze({
     __proto__: null,
     BrowserUse: BrowserUse,
@@ -1649,6 +1765,7 @@ var tools = /*#__PURE__*/Object.freeze({
     ExtractContent: ExtractContent,
     FindElementPosition: FindElementPosition,
     OpenUrl: OpenUrl,
+    RequestLogin: RequestLogin,
     Screenshot: Screenshot,
     TabManagement: TabManagement,
     WebSearch: WebSearch