@isdk/web-fetcher 0.2.4 → 0.2.6

Files changed (64)
  1. package/README.action.cn.md +6 -4
  2. package/README.action.md +6 -4
  3. package/README.cn.md +13 -1
  4. package/README.engine.cn.md +1 -1
  5. package/README.engine.md +1 -1
  6. package/README.md +13 -1
  7. package/dist/index.d.mts +15 -3
  8. package/dist/index.d.ts +15 -3
  9. package/dist/index.js +1 -1
  10. package/dist/index.mjs +1 -1
  11. package/docs/README.md +13 -1
  12. package/docs/_media/README.action.md +6 -4
  13. package/docs/_media/README.cn.md +13 -1
  14. package/docs/_media/README.engine.md +1 -1
  15. package/docs/classes/CheerioFetchEngine.md +76 -57
  16. package/docs/classes/ClickAction.md +23 -23
  17. package/docs/classes/ExtractAction.md +23 -23
  18. package/docs/classes/FetchAction.md +23 -23
  19. package/docs/classes/FetchEngine.md +72 -57
  20. package/docs/classes/FetchSession.md +21 -9
  21. package/docs/classes/FillAction.md +23 -23
  22. package/docs/classes/GetContentAction.md +23 -23
  23. package/docs/classes/GotoAction.md +23 -23
  24. package/docs/classes/PauseAction.md +23 -23
  25. package/docs/classes/PlaywrightFetchEngine.md +76 -57
  26. package/docs/classes/SubmitAction.md +23 -23
  27. package/docs/classes/WaitForAction.md +23 -23
  28. package/docs/classes/WebFetcher.md +5 -5
  29. package/docs/enumerations/FetchActionResultStatus.md +4 -4
  30. package/docs/functions/fetchWeb.md +2 -2
  31. package/docs/interfaces/BaseFetchActionProperties.md +25 -9
  32. package/docs/interfaces/BaseFetchCollectorActionProperties.md +37 -13
  33. package/docs/interfaces/BaseFetcherProperties.md +22 -22
  34. package/docs/interfaces/DispatchedEngineAction.md +4 -4
  35. package/docs/interfaces/ExtractActionProperties.md +33 -9
  36. package/docs/interfaces/FetchActionInContext.md +38 -14
  37. package/docs/interfaces/FetchActionProperties.md +35 -11
  38. package/docs/interfaces/FetchActionResult.md +6 -6
  39. package/docs/interfaces/FetchContext.md +33 -33
  40. package/docs/interfaces/FetchEngineContext.md +27 -27
  41. package/docs/interfaces/FetchMetadata.md +5 -5
  42. package/docs/interfaces/FetchResponse.md +13 -13
  43. package/docs/interfaces/FetchReturnTypeRegistry.md +7 -7
  44. package/docs/interfaces/FetchSite.md +25 -25
  45. package/docs/interfaces/FetcherOptions.md +25 -25
  46. package/docs/interfaces/GotoActionOptions.md +6 -6
  47. package/docs/interfaces/PendingEngineRequest.md +3 -3
  48. package/docs/interfaces/SubmitActionOptions.md +2 -2
  49. package/docs/interfaces/WaitForActionOptions.md +5 -5
  50. package/docs/type-aliases/BaseFetchActionOptions.md +2 -2
  51. package/docs/type-aliases/BaseFetchCollectorOptions.md +2 -2
  52. package/docs/type-aliases/BrowserEngine.md +1 -1
  53. package/docs/type-aliases/FetchActionCapabilities.md +1 -1
  54. package/docs/type-aliases/FetchActionCapabilityMode.md +1 -1
  55. package/docs/type-aliases/FetchActionOptions.md +2 -2
  56. package/docs/type-aliases/FetchEngineAction.md +1 -1
  57. package/docs/type-aliases/FetchEngineType.md +1 -1
  58. package/docs/type-aliases/FetchReturnType.md +1 -1
  59. package/docs/type-aliases/FetchReturnTypeFor.md +1 -1
  60. package/docs/type-aliases/OnFetchPauseCallback.md +1 -1
  61. package/docs/type-aliases/ResourceType.md +1 -1
  62. package/docs/variables/DefaultFetcherProperties.md +1 -1
  63. package/docs/variables/FetcherOptionKeys.md +1 -1
  64. package/package.json +1 -1
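package/README.action.cn.md CHANGED
@@ -23,7 +23,7 @@ The core goal of the Action Script system is to provide a **declarative, engine-agnostic**
 
  `FetchAction` is the abstract base class for all Actions. It defines the core elements of an Action:
 
- * `static id`: The unique identifier for the Action, e.g., `'click'`.
+ * `static id`: The unique identifier for the Action, e.g., `'click'`. In an Action script, you can use `id`, `name`, or `action` to specify this identifier.
  * `static returnType`: The type of the result returned after the Action executes, e.g., `'none'`, `'response'`.
  * `static capabilities`: Declares the capability level (`native`, `simulate`, `noop`) of this Action under different engines (`http`, `browser`).
  * `static register()`: A static method that registers the Action class in the global registry so that it can be created dynamically by its `id`.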
@@ -62,14 +62,16 @@ export class FillAction extends FetchAction {
 
  For simple, linear workflows, you can use a list of the library's built-in atomic Actions directly.
 
+ > **💡 Tip**: You can use `action` or `name` as an alias for `id`, and `args` as an alias for `params`.
+
  **Example: Searching for "gemini" on Google**
 
  ```json
  {
  "actions": [
- { "id": "goto", "params": { "url": "https://www.google.com" } },
- { "id": "fill", "params": { "selector": "textarea[name=q]", "value": "gemini" } },
- { "id": "submit", "params": { "selector": "form" } }
+ { "action": "goto", "args": { "url": "https://www.google.com" } },
+ { "action": "fill", "args": { "selector": "textarea[name=q]", "value": "gemini" } },
+ { "action": "submit", "args": { "selector": "form" } }
  ]
  }
  ```
package/README.action.md CHANGED
@@ -23,7 +23,7 @@ This approach allows users to describe a complete business process with intuitiv
 
  `FetchAction` is the abstract base class for all Actions. It defines the core elements of an Action:
 
- * `static id`: The unique identifier for the Action, e.g., `'click'`.
+ * `static id`: The unique identifier for the Action, e.g., `'click'`. In Action Script, you can use `id`, `name`, or `action` to specify this identifier.
  * `static returnType`: The type of the result returned after the Action executes, e.g., `'none'`, `'response'`.
  * `static capabilities`: Declares the capability level of this Action in different engines (`http`, `browser`), such as `native`, `simulate`, or `noop`.
  * `static register()`: A static method to register the Action class in a global registry, allowing it to be dynamically created by its `id`.
@@ -62,14 +62,16 @@ Users define a complete automation workflow via a JSON-formatted `actions` array
 
  For simple, linear workflows, you can use a list of the library's built-in atomic actions directly.
 
+ > **💡 Tip**: You can use `action` or `name` as an alias for `id`, and `args` as an alias for `params`.
+
  **Example: Searching for "gemini" on Google**
 
  ```json
  {
  "actions": [
- { "id": "goto", "params": { "url": "https://www.google.com" } },
- { "id": "fill", "params": { "selector": "textarea[name=q]", "value": "gemini" } },
- { "id": "submit", "params": { "selector": "form" } }
+ { "action": "goto", "args": { "url": "https://www.google.com" } },
+ { "action": "fill", "args": { "selector": "textarea[name=q]", "value": "gemini" } },
+ { "action": "submit", "args": { "selector": "form" } }
  ]
  }
  ```
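The alias rule described in the tip can be pictured as a small normalization step. The TypeScript sketch below is not the library's implementation: `ActionInput`, `NormalizedAction`, and `normalizeAction` are hypothetical names, and the precedence used (`id`, then `name`, then `action`; `params` before `args`) is an assumption, since this diff does not show how conflicting keys are resolved. The published typings also allow `action` to be a `FetchAction` instance, which this sketch leaves out.

```typescript
// Hypothetical shape of one entry in the `actions` array (illustration only).
interface ActionInput {
  id?: string;
  name?: string;
  action?: string;
  params?: Record<string, unknown>;
  args?: Record<string, unknown>;
}

// Canonical form used by this sketch: a resolved `id` plus `params`.
interface NormalizedAction {
  id: string;
  params: Record<string, unknown>;
}

// Resolve the aliases. The precedence order is an assumption, not documented behavior.
function normalizeAction(input: ActionInput): NormalizedAction {
  const id = input.id ?? input.name ?? input.action;
  if (!id) {
    throw new Error("Action must have id, name, or action");
  }
  return { id, params: input.params ?? input.args ?? {} };
}

// Both spellings of the same step normalize to the same canonical object.
const oldStyle = normalizeAction({ id: "goto", params: { url: "https://www.google.com" } });
const newStyle = normalizeAction({ action: "goto", args: { url: "https://www.google.com" } });
console.log(JSON.stringify(oldStyle) === JSON.stringify(newStyle));
```

Under these assumptions, an entry written with `id`/`params` and one written with `action`/`args` describe the same step, which is what makes the two JSON examples in this hunk interchangeable.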
package/README.cn.md CHANGED
@@ -131,7 +131,7 @@ searchGoogle('gemini');
 
  * `url` (string): The initial URL to navigate to.
  * `engine` ('http' | 'browser' | 'auto'): The engine to use. Defaults to `auto`.
- * `actions` (FetchActionOptions[]): An array of action objects to execute.
+ * `actions` (FetchActionOptions[]): An array of action objects to execute. (Supports `action`/`name` as aliases for `id`, and `args` as an alias for `params`)
  * `headers` (Record<string, string>): Headers to use for all requests.
  * ...and many other options for proxies, cookies, retries, etc.
 
@@ -148,6 +148,18 @@ searchGoogle('gemini');
  * `getContent`: Retrieves the full content (HTML, text, etc.) of the current page state.
  * `extract`: Easily extracts any structured data from the page using an expressive, declarative schema.
 
+ ### Response Structure
+
+ The `fetchWeb` function returns an object containing:
+
+ * `result` (FetchResponse):
+   * `url`: The final URL.
+   * `statusCode`: HTTP status code.
+   * `headers`: HTTP headers.
+   * `cookies`: Array of cookies.
+   * `text`, `html`: Page content.
+ * `outputs` (Record<string, any>): Data extracted and stored via `storeAs`.
+
  ---
 
  ## 📜 License
package/README.engine.cn.md CHANGED
@@ -24,7 +24,7 @@
  * **Key Abstractions**:
  * **Lifecycle**: `initialize()` and `cleanup()` methods.
  * **Core Actions**: `goto()`, `getContent()`, `click()`, `fill()`, `submit()`, `waitFor()`, `extract()`.
- * **Configuration**: `headers()`, `cookies()`, `blockResources()`.
+ * **Configuration & State**: `headers()`, `cookies()`, `blockResources()`, `getState()`.
  * **Static Registry**: It maintains a static registry of all available engine implementations (`FetchEngine.register`), allowing engines to be selected dynamically by `id` or `mode`.
 
  ### `FetchEngine.create(context, options)`
package/README.engine.md CHANGED
@@ -24,7 +24,7 @@ This is the abstract base class that defines the contract for all fetch engines.
  * **Key Abstractions**:
  * **Lifecycle**: `initialize()` and `cleanup()` methods.
  * **Core Actions**: `goto()`, `getContent()`, `click()`, `fill()`, `submit()`, `waitFor()`, `extract()`.
- * **Configuration**: `headers()`, `cookies()`, `blockResources()`.
+ * **Configuration & State**: `headers()`, `cookies()`, `blockResources()`, `getState()`.
  * **Static Registry**: It maintains a static registry of all available engine implementations (`FetchEngine.register`), allowing for dynamic selection by `id` or `mode`.
 
  ### `FetchEngine.create(context, options)`
package/README.md CHANGED
@@ -131,7 +131,7 @@ This is the main entry point for the library.
 
  * `url` (string): The initial URL to navigate to.
  * `engine` ('http' | 'browser' | 'auto'): The engine to use. Defaults to `auto`.
- * `actions` (FetchActionOptions[]): An array of action objects to execute.
+ * `actions` (FetchActionOptions[]): An array of action objects to execute. (Supports `action`/`name` as alias for `id`, and `args` as alias for `params`)
  * `headers` (Record<string, string>): Headers to use for all requests.
  * ...and many other options for proxy, cookies, retries, etc.
 
@@ -148,6 +148,18 @@ Here are the essential built-in actions:
  * `getContent`: Retrieves the full content (HTML, text, etc.) of the current page state.
  * `extract`: Extracts any structured data from the page with ease using an expressive, declarative schema.
 
+ ### Response Structure
+
+ The `fetchWeb` function returns an object containing:
+
+ * `result` (FetchResponse):
+   * `url`: The final URL.
+   * `statusCode`: HTTP status code.
+   * `headers`: HTTP headers.
+   * `cookies`: Array of cookies.
+   * `text`, `html`: Page content.
+ * `outputs` (Record<string, any>): Data extracted and stored via `storeAs`.
+
  ---
 
  ## 📜 License
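To make the documented response structure concrete, here is a minimal TypeScript sketch of consuming an object with that shape. It does not import the package: `CookieLike`, `FetchResponseLike`, `FetchWebResultLike`, and `describeResult` are hypothetical local stand-ins, and the field types (for example, `headers` as a string map) are assumptions based only on the bullet list in the README.

```typescript
// Local stand-ins for the documented shape; the field types are assumptions.
interface CookieLike {
  name: string;
  value: string;
}

interface FetchResponseLike {
  url: string;
  statusCode?: number;
  headers: Record<string, string>;
  cookies?: CookieLike[];
  text?: string;
  html?: string;
}

interface FetchWebResultLike {
  result?: FetchResponseLike;
  outputs: Record<string, unknown>;
}

// Summarize a result: status, final URL, and the keys stored via `storeAs`.
function describeResult(res: FetchWebResultLike): string {
  const status = res.result?.statusCode ?? "unknown";
  const url = res.result?.url ?? "(no page)";
  const keys = Object.keys(res.outputs).join(", ") || "(none)";
  return `${status} ${url} outputs: ${keys}`;
}

// Hand-written sample data, not output captured from the library.
const sample: FetchWebResultLike = {
  result: {
    url: "https://example.com/",
    statusCode: 200,
    headers: { "content-type": "text/html" },
    cookies: [],
    html: "<h1>Example</h1>",
    text: "Example",
  },
  outputs: { title: "Example" },
};

console.log(describeResult(sample));
```

The split matters in practice: `result` describes the final page, while `outputs` collects whatever earlier actions saved under their `storeAs` keys.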
package/dist/index.d.mts CHANGED
@@ -1066,6 +1066,13 @@ declare abstract class FetchEngine<TContext extends CrawlingContext = any, TCraw
  * Gets the unique identifier of this engine implementation.
  */
  get id(): string;
+ /**
+ * Returns the current state of the engine (cookies)
+ * that can be used to restore the session later.
+ */
+ getState(): Promise<{
+ cookies: Cookie[];
+ }>;
  /**
  * Gets the execution mode of this engine (`'http'` or `'browser'`).
  */
@@ -1316,7 +1323,9 @@ interface FetchActionResult<R extends FetchReturnType = FetchReturnType> {
  interface BaseFetchActionProperties {
  id?: string;
  name?: string;
+ action?: string | FetchAction;
  params?: any;
+ args?: any;
  storeAs?: string;
  failOnError?: boolean;
  failOnTimeout?: boolean;
@@ -1324,18 +1333,18 @@ interface BaseFetchActionProperties {
  maxRetries?: number;
  [key: string]: any;
  }
- type BaseFetchActionOptions = RequireAtLeastOne<BaseFetchActionProperties, 'id' | 'name'>;
+ type BaseFetchActionOptions = RequireAtLeastOne<BaseFetchActionProperties, 'id' | 'name' | 'action'>;
  interface BaseFetchCollectorActionProperties extends BaseFetchActionProperties {
  activateOn?: string | RegExp | Array<string | RegExp>;
  deactivateOn?: string | RegExp | Array<string | RegExp>;
  collectOn?: string | RegExp | Array<string | RegExp>;
  background?: boolean;
  }
- type BaseFetchCollectorOptions = RequireAtLeastOne<BaseFetchCollectorActionProperties, 'id' | 'name'>;
+ type BaseFetchCollectorOptions = RequireAtLeastOne<BaseFetchCollectorActionProperties, 'id' | 'name' | 'action'>;
  interface FetchActionProperties extends BaseFetchActionProperties {
  collectors?: BaseFetchCollectorOptions[];
  }
- type FetchActionOptions = RequireAtLeastOne<FetchActionProperties, 'id' | 'name'>;
+ type FetchActionOptions = RequireAtLeastOne<FetchActionProperties, 'id' | 'name' | 'action'>;
  type FetchActionCapabilities = {
  [mode in FetchEngineType]?: FetchActionCapabilityMode;
  };
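The effect of widening the key list passed to `RequireAtLeastOne` can be illustrated at the type level. The sketch below uses a commonly seen definition of `RequireAtLeastOne`; whether the package defines the helper this way or imports it is not visible in this diff, so treat the definition as an assumption. `ActionProps` and `ActionOptions` are hypothetical, simplified stand-ins that omit the index signature and most fields of the real interfaces.

```typescript
// A widely used formulation of RequireAtLeastOne (assumed, not taken from the package).
type RequireAtLeastOne<T, K extends keyof T = keyof T> = Omit<T, K> & {
  [P in K]-?: Required<Pick<T, P>> & Partial<Pick<T, Exclude<K, P>>>;
}[K];

// Simplified stand-in for BaseFetchActionProperties (illustration only).
interface ActionProps {
  id?: string;
  name?: string;
  action?: string;
  params?: unknown;
  args?: unknown;
  storeAs?: string;
}

// Mirrors the 0.2.6 typing: any one of the three identifier keys is enough.
type ActionOptions = RequireAtLeastOne<ActionProps, "id" | "name" | "action">;

// Accepted: identified by `id`.
const byId: ActionOptions = { id: "goto", params: { url: "https://example.com" } };

// Accepted under the new key list: identified only by `action`.
const byAction: ActionOptions = { action: "goto", args: { url: "https://example.com" } };

// Rejected by the compiler if uncommented, because no identifier key is present:
// const invalid: ActionOptions = { args: {} };

console.log(byId.id, byAction.action);
```

With the previous key list (`'id' | 'name'`), the `byAction` object would not have satisfied the constraint, which is why the alias needed a typings change as well as a runtime one.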
@@ -1493,6 +1502,9 @@ declare class FetchSession {
  outputs: Record<string, any>;
  }>;
  getOutputs(): Record<string, any>;
+ getState(): Promise<{
+ cookies: Cookie[];
+ } | undefined>;
  dispose(): Promise<void>;
  private ensureEngine;
  private createContext;
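Because `FetchSession.getState()` is typed to resolve to either a cookie snapshot or `undefined`, callers need to handle both cases. The following TypeScript sketch shows one way to do that with stub objects instead of the real class: `CookieLike`, `StatefulSession`, `snapshotCookies`, and both stub sessions are hypothetical, and the idea that `undefined` corresponds to a session whose engine has not been created yet is an assumption, not something the diff states.

```typescript
// Minimal stand-ins for the declared types (illustration only).
interface CookieLike {
  name: string;
  value: string;
  domain?: string;
}

interface StatefulSession {
  getState(): Promise<{ cookies: CookieLike[] } | undefined>;
}

// Return a copy of the session's cookies, or an empty list when no state exists.
async function snapshotCookies(session: StatefulSession): Promise<CookieLike[]> {
  const state = await session.getState();
  return state ? [...state.cookies] : [];
}

// Stub: a session that already holds one cookie.
const activeSession: StatefulSession = {
  getState: async () => ({ cookies: [{ name: "sid", value: "abc123" }] }),
};

// Stub: a session with nothing to report (assumed: no engine started yet).
const idleSession: StatefulSession = {
  getState: async () => undefined,
};

async function main(): Promise<void> {
  const saved = await snapshotCookies(activeSession);
  // The saved cookies could then seed a later run through a `cookies` option.
  const nextOptions = { url: "https://example.com", cookies: saved };
  const idle = await snapshotCookies(idleSession);
  console.log(nextOptions.cookies.length, idle.length);
}

void main();
```

Copying the array before storing it keeps the snapshot independent of any later changes the session makes to its own cookie jar.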
package/dist/index.d.ts CHANGED
@@ -1066,6 +1066,13 @@ declare abstract class FetchEngine<TContext extends CrawlingContext = any, TCraw
  * Gets the unique identifier of this engine implementation.
  */
  get id(): string;
+ /**
+ * Returns the current state of the engine (cookies)
+ * that can be used to restore the session later.
+ */
+ getState(): Promise<{
+ cookies: Cookie[];
+ }>;
  /**
  * Gets the execution mode of this engine (`'http'` or `'browser'`).
  */
@@ -1316,7 +1323,9 @@ interface FetchActionResult<R extends FetchReturnType = FetchReturnType> {
  interface BaseFetchActionProperties {
  id?: string;
  name?: string;
+ action?: string | FetchAction;
  params?: any;
+ args?: any;
  storeAs?: string;
  failOnError?: boolean;
  failOnTimeout?: boolean;
@@ -1324,18 +1333,18 @@ interface BaseFetchActionProperties {
  maxRetries?: number;
  [key: string]: any;
  }
- type BaseFetchActionOptions = RequireAtLeastOne<BaseFetchActionProperties, 'id' | 'name'>;
+ type BaseFetchActionOptions = RequireAtLeastOne<BaseFetchActionProperties, 'id' | 'name' | 'action'>;
  interface BaseFetchCollectorActionProperties extends BaseFetchActionProperties {
  activateOn?: string | RegExp | Array<string | RegExp>;
  deactivateOn?: string | RegExp | Array<string | RegExp>;
  collectOn?: string | RegExp | Array<string | RegExp>;
  background?: boolean;
  }
- type BaseFetchCollectorOptions = RequireAtLeastOne<BaseFetchCollectorActionProperties, 'id' | 'name'>;
+ type BaseFetchCollectorOptions = RequireAtLeastOne<BaseFetchCollectorActionProperties, 'id' | 'name' | 'action'>;
  interface FetchActionProperties extends BaseFetchActionProperties {
  collectors?: BaseFetchCollectorOptions[];
  }
- type FetchActionOptions = RequireAtLeastOne<FetchActionProperties, 'id' | 'name'>;
+ type FetchActionOptions = RequireAtLeastOne<FetchActionProperties, 'id' | 'name' | 'action'>;
  type FetchActionCapabilities = {
  [mode in FetchEngineType]?: FetchActionCapabilityMode;
  };
@@ -1493,6 +1502,9 @@ declare class FetchSession {
  outputs: Record<string, any>;
  }>;
  getOutputs(): Record<string, any>;
+ getState(): Promise<{
+ cookies: Cookie[];
+ } | undefined>;
  dispose(): Promise<void>;
  private ensureEngine;
  private createContext;
package/dist/index.js CHANGED
@@ -1 +1 @@
- "use strict";var t,e=Object.create,i=Object.defineProperty,s=Object.getOwnPropertyDescriptor,r=Object.getOwnPropertyNames,n=Object.getPrototypeOf,o=Object.prototype.hasOwnProperty,a=(t,e,n,a)=>{if(e&&"object"==typeof e||"function"==typeof e)for(let c of r(e))o.call(t,c)||c===n||i(t,c,{get:()=>e[c],enumerable:!(a=s(e,c))||a.enumerable});return t},c={};((t,e)=>{for(var s in e)i(t,s,{get:e[s],enumerable:!0})})(c,{CheerioFetchEngine:()=>F,ClickAction:()=>H,DefaultFetcherProperties:()=>l,ExtractAction:()=>W,FetchAction:()=>f,FetchActionResultStatus:()=>h,FetchEngine:()=>k,FetchSession:()=>$,FetcherOptionKeys:()=>u,FillAction:()=>L,GetContentAction:()=>M,GotoAction:()=>G,PauseAction:()=>B,PlaywrightFetchEngine:()=>N,SubmitAction:()=>z,WaitForAction:()=>D,WebFetcher:()=>R,fetchWeb:()=>I}),module.exports=(t=c,a(i({},"__esModule",{value:!0}),t));var l={engine:"auto",enableSmart:!0,useSiteRegistry:!0,antibot:!1,headers:{},cookies:[],reuseCookies:!0,throwHttpErrors:void 0,proxy:[],blockResources:[],ignoreSslErrors:!0,browser:{engine:"playwright",headless:!0,waitUntil:"domcontentloaded"},http:{method:"GET"},timeoutMs:6e4,requestHandlerTimeoutSecs:void 0,maxConcurrency:1,maxRequestsPerMinute:1e3,delayBetweenRequestsMs:0,retries:0,sites:[]},u=Object.keys(l).concat(["actions","onPause"]),h=(t=>(t[t.Failed=0]="Failed",t[t.Success=1]="Success",t[t.Skipped=2]="Skipped",t))(h||{}),w=class t{static register(t){const e=t.id;if(!e)throw new Error("FetchAction.register: actionClass.id is required");this.registry.set(e,t)}static get(t){return this.registry.get(t)}static create(t){const e="string"==typeof t?t:t.id||t.name;if(!e)throw new Error("Action must have id or name");const i=this.registry.get(e);return i?new i:void 0}static has(t){return this.registry.has(t)}static list(){return Array.from(this.registry.keys())}static getCapability(t){return this.capabilities[t]??"noop"}getCapability(t){return this.constructor.getCapability(t)}get id(){return this.constructor.id}get 
returnType(){return this.constructor.returnType}get capabilities(){return this.constructor.capabilities}async delegateToEngine(t,e,...i){const s=t.internal.engine;if(!s)throw new Error("No engine available");if("function"!=typeof s[e])throw new Error(`Engine does not have a method named '${String(e)}'`);return await s[e](...i)}installCollectors(e,i){const s=i?.collectors;if(!s?.length)return;const r=[],n=new Set;for(const i of s){const s=d(i.activateOn),o=d(i.collectOn),a=d(i.deactivateOn),c=!(i.background??!0),l=t.create(i);if(!l)continue;let u=!1,h=!1,w=0;const f=async t=>{if(!u&&!h){u=!0;try{await(l.onBeforeExec?.(e,i))}catch(t){e.eventBus.emit("collector:error",{action:this.id,collector:l.id,phase:"before",error:t})}}},m=async(t,s)=>{if(!h){u||await f(s);try{const r=Promise.resolve(l.onExecute?.(e,i,s)).then(s=>{var r,n;if(i.storeAs){((r=e.outputs)[n=i.storeAs]||(r[n]=[])).push(s)}return e.eventBus.emit("collector:result",{action:this.id,collector:i.id||i.name,event:t,result:s}),s}).catch(s=>{e.eventBus.emit("collector:error",{action:this.id,collector:i.id||i.name,event:t,phase:"exec",error:s})}).finally(()=>{w++});c&&(n.add(r),r.finally(()=>n.delete(r)))}catch(i){e.eventBus.emit("collector:error",{action:this.id,collector:l.id,event:t,phase:"exec",error:i})}}},g=async()=>{if(!h){0===w&&m("collector:after"),h=!0;try{await(l.onAfterExec?.(e,i))}catch(t){e.eventBus.emit("collector:error",{action:this.id,collector:i.id||i.name,phase:"after",error:t})}finally{e.eventBus.emit("collector:end",{action:this.id,collector:i.id||i.name}),v.forEach(t=>t())}}},b=p(e,s,f),v=y(e,o,m),x=p(e,a,g);if(r.push(...b,...v,...x),!s.length&&!o.length&&!a.length){const t=()=>{g()};e.eventBus.once(`action:${this.id}.end`,t),r.push(()=>e.eventBus.off("fetcher:action:end",t))}}return r.length||n.size>0?{cleanup:()=>r.forEach(t=>t()),awaitExecPendings:async()=>{n.size>0&&await Promise.allSettled(Array.from(n))}}:void 0}async 
beforeExec(t,e){t.internal.actionStack||(t.internal.actionStack=[]);const i=t.internal.actionStack,s=i.length,r=i.length>0?i[i.length-1].id:void 0,n={...e,id:this.id,depth:s,parent:r};i.push(n),t.currentAction=n;const o={action:this,context:t,options:e,depth:s,stack:[...i]};t.eventBus.emit(`action:${this.id}.start`,o),t.eventBus.emit("action:start",o),await(this.onBeforeExec?.(t,e));return{entry:o,collectors:this.installCollectors(t,e)}}async afterExec(t,e,i,s){const r=t.internal.actionStack,n=r.length-1,o=s?.collectors;try{await(o?.awaitExecPendings()),t.lastResult=i,"response"!==i?.returnType||i.error||(t.lastResponse=i.result),e?.storeAs&&(t.outputs[e.storeAs]=i?.result),i?.error&&(t.currentAction.error=i.error),await(this.onAfterExec?.(t,e));const s={action:this,context:t,options:e,result:i,depth:n,stack:[...r]};i?.error&&(s.error=i.error);try{t.eventBus.emit(`action:${this.id}.end`,s)}catch(t){}try{t.eventBus.emit("action:end",s)}catch(t){}}finally{try{o?.cleanup()}finally{r.pop();const e=r.length;t.currentAction=e>0?r[e-1]:void 0}}}async execute(t,e){const i=await this.beforeExec(t,e);let s;try{const i=e?.failOnError??!0;return t.throwHttpErrors=i,s=await this.onExecute(t,e),s&&s.returnType||(s={status:1,returnType:this.returnType??"any",result:s}),s}catch(i){if(s={status:0,error:i,meta:{id:this.id,engineType:t.engine,capability:this.getCapability(t.engine)}},e?.failOnError)throw i;return s}finally{await this.afterExec(t,e,s,i)}}};w.registry=new Map,w.returnType="any",w.capabilities={http:"noop",browser:"noop"};var f=w;function d(t){return t?Array.isArray(t)?t:[t]:[]}function p(t,e,i){const s=[];for(const r of e)if("string"==typeof r||r instanceof RegExp){const e=(...t)=>{i(t[0])};t.eventBus.once(r,e),s.push(()=>t.eventBus.off(r,e))}return s}function y(t,e,i){const s=[];for(const r of e)if("string"==typeof r||r instanceof RegExp){const e=t=>i(r,t);t.eventBus.on(r,e),s.push(()=>t.eventBus.off(r,e))}return s}var m=require("events-ex");var 
g=require("lodash-es"),b=(0,require("nanoid").customAlphabet)("0123456789abcdefghijklmnopqrstuvwxyz",12);var v=require("lodash-es"),x=require("events-ex"),q=require("@isdk/common-error"),E=require("crawlee");function S(){let t=()=>{};const e=new Promise(e=>{t=e});return e.release=t,e}E.Configuration.getGlobalConfig().set("persistStorage",!1);var k=class{constructor(){this.hdrs={},this.jar=[],this.pendingRequests=new Map,this.requestCounter=0,this.actionEmitter=new x.EventEmitter,this.isPageActive=!1,this.navigationLock=function(){const t=S();return t.release(),t}(),this.blockedTypes=new Set}static register(t){const e=t.id;if(!e)throw new Error("Engine must define static id");if(this.registry.has(e))throw new Error(`Engine id duplicated: ${e}`);this.registry.set(e,t)}static get(t){return this.registry.get(t)}static getByMode(t){for(const[e,i]of this.registry.entries())if(i.mode===t)return i}static async create(t,e){const i=(0,v.defaultsDeep)(e,t,l),s=i.engine??t.engine,r=s?this.get(s)??this.getByMode(s):null;if(r){const e=new r;return await e.initialize(t,i),e}}async _extract(t,e){const i=t.type;if(!e)return"array"===i?[]:null;if("object"===i){const{selector:i,properties:s}=t;let r=e;if(i){const t=await this._querySelectorAll(e,i);r=t.length>0?t[0]:null}if(!r)return null;const n={};for(const t in s)n[t]=await this._extract(s[t],r);return n}if("array"===i){const{selector:i,items:s}=t,r=i?await this._querySelectorAll(e,i):[e],n=[];for(const t of r)n.push(await this._extract(s,t));return n}const{selector:s}=t;let r=e;if(s){const t=await this._querySelectorAll(e,s);r=t.length>0?t[0]:null}return r?this._extractValue(t,r):null}async buildResponse(t){const e=await this._buildResponse(t),i=e.headers["content-type"]||"";return e.contentType=i.split(";")[0].trim(),e}waitFor(t){return this.dispatchAction({type:"waitFor",options:t})}click(t){return this.dispatchAction({type:"click",selector:t})}fill(t,e){return 
this.dispatchAction({type:"fill",selector:t,value:e})}submit(t,e){return this.dispatchAction({type:"submit",selector:t,options:e})}pause(t){return this.dispatchAction({type:"pause",message:t})}extract(t){const e=this._normalizeSchema(t);return this.dispatchAction({type:"extract",schema:e})}_normalizeSchema(t){const e=JSON.parse(JSON.stringify(t));if(e.properties)for(const t in e.properties)e.properties[t]=this._normalizeSchema(e.properties[t]);if(e.items&&(e.items=this._normalizeSchema(e.items)),"array"===e.type&&(e.attribute&&!e.items&&(e.items={attribute:e.attribute},delete e.attribute),e.items||(e.items={type:"string"})),e.selector&&(e.has||e.exclude)){const{selector:t,has:i,exclude:s}=e,r=t.split(",").map(t=>{let e=t.trim();return i&&(e=`${e}:has(${i})`),s&&(e=`${e}:not(${s})`),e}).join(", ");e.selector=r,delete e.has,delete e.exclude}return e}get id(){return this.constructor.id}get mode(){return this.constructor.mode}get context(){return this.ctx}async initialize(t,e){if(this.ctx)return;(0,v.merge)(t,e),this.ctx=t,this.opts=t,this.hdrs=function(t){const e={};if(t&&"object"==typeof t)for(const[i,s]of Object.entries(t))e[i.toLowerCase()]=s;return e}(t.headers),this.jar=[...t.cookies??[]],t.internal||(t.internal={}),t.internal.engine=this,t.engine=this.mode,this.actionEmitter.setMaxListeners(100),this.requestQueue=await E.RequestQueue.open();const i=await this._getSpecificCrawlerOptions(t),s={...(0,v.defaultsDeep)(i,{requestQueue:this.requestQueue,maxConcurrency:1,minConcurrency:1,useSessionPool:!0,persistCookiesPerSession:!0,sessionPoolOptions:{maxPoolSize:1,persistenceOptions:{enable:!1},sessionOptions:{maxUsageCount:1e3,maxErrorScore:3}}}),requestHandler:this._requestHandler.bind(this),errorHandler:this._failedRequestHandler.bind(this),failedRequestHandler:this._failedRequestHandler.bind(this)};this.crawler=this._createCrawler(s),this.crawler.run().then(()=>{this.isCrawlerReady=!0}).catch(t=>{this.isCrawlerReady=!1,console.error("Crawler background 
error:",t)})}async cleanup(){await(this._cleanup?.()),await this._commonCleanup();const t=this.ctx;t&&t.internal?.engine===this&&(t.internal.engine=void 0),this.ctx=void 0,this.opts=void 0}async _executePendingActions(t){await new Promise(e=>{const i=async({action:e,resolve:i,reject:s})=>{try{if("dispose"===e.type)return this.actionEmitter.emit("dispose"),void i();i(await this.executeAction(t,e))}catch(t){s(t)}};this.actionEmitter.on("dispatch",i),this.actionEmitter.once("dispose",()=>{this.actionEmitter.removeListener("dispatch",i),e()})})}async _sharedRequestHandler(t){try{const{request:e}=t;this.isPageActive=!0;const i=this.pendingRequests.get(e.userData.requestId);if(i){const s=await this._buildResponse(t),r=!s.statusCode||s.statusCode>=400;if(this.ctx?.throwHttpErrors&&r){const t=new q.CommonError(`Request for ${s.finalUrl} failed with status ${s.statusCode||"N/A"}`,"request",s.statusCode);i.reject(t)}else this.lastResponse=s,i.resolve(s);this.pendingRequests.delete(e.userData.requestId)}await this._executePendingActions(t)}finally{this.isPageActive=!1,this.navigationLock.release()}}async _sharedFailedRequestHandler(t,e){const{request:i}=t,s=this.pendingRequests.get(i.userData.requestId);if(s&&e&&this.ctx?.throwHttpErrors){this.pendingRequests.delete(i.userData.requestId);const t=e.response,r=t?.statusCode||500,n=t?.url?t.url:i.url,o=new q.CommonError(`Request${n?" for "+n:""} failed: ${e.message}`,"request",r);s.reject(o)}return this._sharedRequestHandler(t)}async dispatchAction(t){if(!this.isPageActive)throw new Error("No active page. 
Call goto() before performing actions.");return new Promise((e,i)=>{this.actionEmitter.emit("dispatch",{action:t,resolve:e,reject:i})})}async _requestHandler(t){await this._sharedRequestHandler(t)}async _failedRequestHandler(t,e){await this._sharedFailedRequestHandler(t,e)}async _commonCleanup(){if(this.isPageActive&&await this.dispatchAction({type:"dispose"}).catch(()=>{}),this.pendingRequests.size>0)for(const[,t]of this.pendingRequests)t.reject(new Error("Cleanup:Request cancelled"));if(this.actionEmitter.removeAllListeners(),this.crawler){try{await(this.crawler.teardown?.())}catch(t){console.error("ccrawler teardown error:",t)}this.crawler=void 0}this.isCrawlerReady=void 0,this.requestQueue&&(await this.requestQueue.drop(),this.requestQueue=void 0),this.pendingRequests.clear()}async blockResources(t,e){return e&&this.blockedTypes.clear(),t.forEach(t=>this.blockedTypes.add(t)),t.length}getContent(){return this.lastResponse?Promise.resolve(this.lastResponse):Promise.reject(new Error("No content fetched yet. 
Call goto() first."))}async headers(t,e){if(void 0===t)return{...this.hdrs};if("string"==typeof t&&void 0===e)return this.hdrs[t.toLowerCase()]||"";if(null!==t&&"object"==typeof t){const i={};for(const[e,s]of Object.entries(t))i[e.toLowerCase()]=String(s);return this.hdrs=!0===e?i:{...this.hdrs,...i},!0}return"string"==typeof t&&("string"==typeof e?this.hdrs[t.toLowerCase()]=e:null===e&&delete this.hdrs[t.toLowerCase()],!0)}async cookies(t){return Array.isArray(t)?(this.jar=[...t],!0):null===t?(this.jar=[],!0):[...this.jar]}async dispose(){await this.cleanup()}};async function C(t,e){const i=function(t,e){if(!t||!e?.length)return null;const i=new URL(t);let s=e.find(t=>t.domain===i.hostname);s||(s=e.find(t=>i.hostname.endsWith(t.domain)));if(!s)return null;if(s.pathScope?.length){if(!s.pathScope.some(t=>i.pathname.startsWith(t)))return null}return s}(e?.url||t.url,t.sites),s=t.engine||i?.engine||"auto";let r=await k.create(t,{engine:s});return r||(r=await k.create(t,{engine:"http"})),r}k.registry=new Map;var $=class{constructor(t={}){this.options=t,this.closed=!1,this.id=b(),this.context=this.createContext(t)}async execute(t){await this.ensureEngine(t);const e=f.create(t);if(!e)throw new Error(`Unknown action: ${t.id||t.name}`);let i,s;this.context.internal.actionIndex=(this.context.internal.actionIndex||0)+1,this.context.currentAction={...t,index:this.context.internal.actionIndex,startedAt:Date.now()};try{return i=await e.execute(this.context,t),i}catch(t){throw s=t,s}finally{this.context.currentAction=void 0}}async executeAll(t){try{for(let e=0;e<t.length;e++){const i=t[e];await this.execute(i)}const e=await this.execute({id:"getContent"});return{result:e?.result,outputs:this.getOutputs()}}catch(t){throw t}}getOutputs(){return this.context.outputs}async dispose(){if(this.closed)return;const 
t=this.context.eventBus;t.emit("session:closing",{sessionId:this.id});try{await(this.context.internal.engine?.dispose())}finally{this.closed=!0}t.emit("session:closed",{sessionId:this.id})}async ensureEngine(t){if(this.closed)throw new Error("Session is closed");if(!this.context.internal.engine){const e=t?.params?.url??this.context.url;if(!await C(this.context,{url:e}))throw new Error("No engine found")}}createContext(t=this.options){const e=new m.EventEmitter;return(0,g.defaultsDeep)({...t,id:this.id,eventBus:e,outputs:{},internal:{},execute:async t=>this.execute(t),action:async function(t,e,i){return this.execute({name:t,params:e,...i})}},l)}},R=class{constructor(t={}){this.defaults=t}async createSession(t){const e={...this.defaults,...t||{}};return new $(e)}async fetch(t,e){"string"!=typeof t&&(t=(e=t).url);const i=await this.createSession(e);try{const s=e?.actions||[];t&&0!==s.findIndex(e=>"goto"===e.id&&e.params?.url===t)&&s.unshift({id:"goto",params:{url:t}});return await i.executeAll(s)}finally{await i.dispose()}}},A=require("crawlee"),P=((t,s,r)=>(r=null!=t?e(n(t)):{},a(!s&&t&&t.__esModule?r:i(r,"default",{value:t,enumerable:!0}),t)))(require("cheerio")),j=require("@isdk/common-error"),F=class extends k{async _buildResponse(t){const{request:e,response:i,body:s,$:r}=t,n=r?.html();let o="string"==typeof s?s:Buffer.isBuffer(s)?s.toString("utf-8"):String(s??"");n&&n!==o&&(o=n);let a=i?.headers;if(!a&&i?.rawHeaders){a={};const t=i.rawHeaders;for(let e=0;e<t.length;e+=2)a[t[e].toLowerCase()]=t[e+1]}return{url:e.url,finalUrl:e.loadedUrl||e.url,statusCode:i?.statusCode??200,statusText:i?.statusMessage,headers:a||{},body:s,html:o,text:o}}async _querySelectorAll(t,e){const{$:i,el:s}=t;return s.find(e).toArray().map(t=>({$:i,el:i(t)}))}async _extractValue(t,e){const{el:i}=e,{attribute:s,type:r="string"}=t;if(0===i.length)return null;let n="";if(n=s?i.attr(s)??null:"html"===r?i.html():i.text().trim(),null===n)return null;switch(r){case"number":return 
parseFloat(n.replace(/[^0-9.-]+/g,""))||null;case"boolean":const t=n.toLowerCase();return"true"===t||"1"===t;default:return n}}async executeAction(t,e){const{$:i}=t;switch(e.type){case"dispose":return;case"extract":if(!i)throw new j.CommonError(`Cheerio context not available for action: ${e.type}`,"extract");return this._extract(e.schema,{$:i,el:i.root()});case"click":{if(!i)throw new j.CommonError(`Cheerio context not available for action: ${e.type}`,"click");const s=e.selector,r=i(s).first();let n;if(0===r.length)try{n=new URL(s,t.request.loadedUrl||t.request.url).href}catch{throw new j.CommonError(`click: selector not found or invalid URL: ${s}`,"click")}else{if(!r.is("a")||!r.attr("href")){if(r.is('input[type="submit"], button[type="submit"], button, input')){const e=r.closest("form");if(e.length)return this.executeAction(t,{type:"submit",selector:e});throw new j.CommonError("click: submit-like element without form","click")}throw new j.CommonError(`click: unsupported element for http simulate. Selector: ${s}`,"click")}{const e=r.attr("href");n=new URL(e,t.request.loadedUrl||t.request.url).href}}const o=await t.sendRequest({url:n});return void this._updateStateAfterNavigation(t,o)}case"fill":{if(!i)throw new j.CommonError(`Cheerio context not available for action: ${e.type}`),"fill";const s=i(e.selector).first();if(0===s.length)throw new j.CommonError(`fill: selector not found: ${e.selector}`);if(!s.is("input, textarea, select"))throw new j.CommonError(`fill: not a form field: ${e.selector}`);return s.val(e.value),void(this.lastResponse=await this.buildResponse(t))}case"waitFor":return void(e.options?.ms&&await new Promise(t=>setTimeout(t,e.options.ms)));case"pause":const s=this.ctx?.onPause;return void(s?(console.info(e.message||"Execution paused for manual intervention."),await s({message:e.message}),console.info("Resuming execution...")):console.warn("[PauseAction] was called, but no `onPause` handler was provided in fetchWeb options. 
Skipped."));case"submit":{if(!i)throw new j.CommonError(`Cheerio context not available for action: ${e.type}`,"submit");const s="string"==typeof e.selector?i(e.selector).first():null!=e.selector?e.selector:i("form").first();if(0===s.length)throw new j.NotFoundError(e.selector,"submit");const r=s.attr("action")||t.request.loadedUrl||t.request.url,n=(s.attr("method")||"GET").toUpperCase(),o=new URL(r,t.request.loadedUrl||t.request.url).href,a={};let c;if(s.find("input, select, textarea").each((t,e)=>{const s=i(e),r=s.attr("name");if(!r)return;const n=s.val();null!=n&&(a[r]=String(n))}),"GET"===n){const e=new URL(o);Object.entries(a).forEach(([t,i])=>e.searchParams.set(t,i)),c=await t.sendRequest({url:e.href,method:"GET"})}else{let i;const r={};"application/json"===(e.options?.enctype||s.attr("enctype")||"application/x-www-form-urlencoded")?(i=JSON.stringify(a),r["Content-Type"]="application/json"):(i=new URLSearchParams(a).toString(),r["Content-Type"]="application/x-www-form-urlencoded"),c=await t.sendRequest({url:o,method:"POST",body:i,headers:r})}return void this._updateStateAfterNavigation(t,c)}case"getContent":return this.buildResponse(t);default:throw new j.CommonError(`Unknown action type: ${e.type}`,"CheerioFetchEngine.executeAction",j.ErrorCode.NotSupported)}}_updateStateAfterNavigation(t,e){const i=e;let s=i.headers;if(!s&&i.rawHeaders){s={};for(let t=0;t<i.rawHeaders.length;t+=2)s[i.rawHeaders[t].toLowerCase()]=i.rawHeaders[t+1]}s=s||{};const r=i.body,n=P.load(r??"");t.$=n,t.response=i,t.body=r;const o=n.html(),a=n.text(),c=(s["content-type"]||"").split(";")[0].trim();this.lastResponse={url:t.request.url,finalUrl:i.url,statusCode:i.statusCode,statusText:i.statusMessage,headers:s,contentType:c,body:r,html:o,text:a}}_createCrawler(t){return new A.CheerioCrawler(t)}_getSpecificCrawlerOptions(t){const e=this.opts?.proxy?"string"==typeof this.opts.proxy?[this.opts.proxy]:this.opts.proxy:void 0,i=e?.length?new A.ProxyConfiguration({proxyUrls:e}):void 
0;return{additionalMimeTypes:["text/plain"],maxRequestRetries:1,requestHandlerTimeoutSecs:t.requestHandlerTimeoutSecs,proxyConfiguration:i,preNavigationHooks:[(e,i)=>{i.throwHttpErrors=t.throwHttpErrors,this.opts?.timeoutMs&&(i.timeout={request:this.opts.timeoutMs})}]}}async goto(t,e){this.isPageActive&&this.dispatchAction({type:"dispose"}).catch(()=>{});const i="req-"+ ++this.requestCounter,s=new Promise((t,s)=>{const r=e?.timeoutMs||this.opts?.timeoutMs||3e4,n=setTimeout(()=>{this.pendingRequests.delete(i),this.navigationLock.release(),s(new j.CommonError(`goto timed out after ${r}ms.`,"gotoTimeout",j.ErrorCode.RequestTimeout))},r);this.pendingRequests.set(i,{resolve:e=>{clearTimeout(n),t(e)},reject:t=>{clearTimeout(n),s(t)}})});return this.requestQueue.addRequest({...e,url:t,headers:{...this.hdrs,...e?.headers},userData:{requestId:i},uniqueKey:`${t}-${i}`}).catch(t=>{const e=this.pendingRequests.get(i);e&&(this.pendingRequests.delete(i),this.navigationLock.release(),e.reject(t))}),await this.navigationLock,this.navigationLock=S(),s}};F.id="cheerio",F.mode="http",k.register(F);var T=require("crawlee"),O=require("playwright"),U=require("camoufox-js"),_=require("@isdk/common-error"),N=class extends k{async _buildResponse(t){const{page:e,response:i,request:s}=t;if(!e||e.isClosed())return{url:s.url,finalUrl:s.loadedUrl||s.url,statusCode:i?.status(),statusText:i?.statusText(),headers:await(i?.allHeaders())||{},body:"",html:"",text:""};const r=await e.content(),n=await e.textContent("body");return{url:e.url(),finalUrl:e.url(),statusCode:i?.status(),statusText:i?.statusText(),headers:await(i?.allHeaders())||{},body:r,html:r,text:n||""}}async _querySelectorAll(t,e){return t.locator(e).all()}async _extractValue(t,e){const{attribute:i,type:s="string"}=t;if(0===await e.count())return null;let r="";if(r=i?await e.getAttribute(i):"html"===s?await e.innerHTML():await e.textContent(),null===r)return null;switch(r=r.trim(),s){case"number":return 
parseFloat(r.replace(/[^0-9.-]+/g,""))||null;case"boolean":const t=r.toLowerCase();return"true"===t||"1"===t;default:return r}}async executeAction(t,e){const{page:i}=t,s=this.opts?.timeoutMs||3e4;switch(e.type){case"navigate":{const s=await i.goto(e.url,{waitUntil:e.opts?.waitUntil||"domcontentloaded",timeout:this.opts?.timeoutMs||3e4});s&&(t={...t,response:s});const r=await this.buildResponse(t);return this.lastResponse=r,r}case"extract":{const s=await this._extract(e.schema,i.locator("body"));return this.lastResponse=await this.buildResponse(t),s}case"click":{await i.click(e.selector,{timeout:s}),await i.waitForLoadState("networkidle",{timeout:s});const r=await this.buildResponse(t);return void(this.lastResponse=r)}case"fill":await i.fill(e.selector,e.value,{timeout:s});const r=await this.buildResponse(t);return void(this.lastResponse=r);case"waitFor":try{e.options?.selector&&await i.waitForSelector(e.options.selector,{timeout:s}),e.options?.networkIdle&&await i.waitForLoadState("networkidle",{timeout:s})}catch(t){if(!1!==e.options?.failOnTimeout)throw t}return void(e.options?.ms&&await i.waitForTimeout(e.options.ms));case"submit":{const r=e.selector||"form",n=i.locator(r).first();if(0===await n.count())throw new _.NotFoundError(r,"submit");if("application/json"===(e.options?.enctype||"application/x-www-form-urlencoded")){const t=await n.elementHandle();if(!t)throw new _.CommonError(`submit: could not get form handle for ${r}`,"submit");const e=await t.evaluate(async t=>{const e=new FormData(t),i={};e.forEach((t,e)=>{i[e]=t.toString()});const s=await fetch(t.action,{method:t.method,headers:{"Content-Type":"application/json"},body:JSON.stringify(i)}),r=await s.text();return{status:s.status,statusText:s.statusText,headers:Object.fromEntries(s.headers.entries()),body:r,html:r,text:r,url:t.action,finalUrl:s.url}});return await t.dispose(),await i.setContent(e.html),void(this.lastResponse=e)}return await n.evaluate(t=>t.submit()),await 
i.waitForLoadState("networkidle",{timeout:s}),void(this.lastResponse=await this.buildResponse(t))}case"pause":{const t=this.ctx?.onPause;return void(t?(console.info(e.message||"Execution paused for manual intervention."),await t({message:e.message}),console.info("Resuming execution...")):console.warn("[PauseAction] was called, but no `onPause` handler was provided in fetchWeb options. Skipped."))}case"getContent":return this.buildResponse(t);default:throw new _.CommonError(`Unknown action type: ${e.type}`,"PlaywrightFetchEngine.executeAction",_.ErrorCode.NotSupported)}}_createCrawler(t){return new T.PlaywrightCrawler(t)}async _getSpecificCrawlerOptions(t){const e=t.browser?.headless??!0,i={maxRequestRetries:t.retries||3,headless:e,requestHandlerTimeoutSecs:t.requestHandlerTimeoutSecs,preNavigationHooks:[async({page:e,request:i},s)=>{s.throwHttpErrors=t.throwHttpErrors,this.jar.length>0&&await e.context().addCookies(this.jar.map(t=>({...t,url:i.url,domain:t.domain||new URL(i.url).hostname})));const r=this.blockedTypes;r.size>0&&await e.route("**/*",t=>{r.has(t.request().resourceType())?t.abort():t.continue()})}]};if(this.opts?.antibot){i.browserPoolOptions={useFingerprints:!1};const t=await(0,U.launchOptions)({headless:e});i.launchContext={launcher:O.firefox,launchOptions:t},i.postNavigationHooks=[async({page:t,handleCloudflareChallenge:e})=>{await e()}]}return i}async goto(t,e){if(this.isPageActive)return this.dispatchAction({type:"navigate",url:t,opts:e});if(!this.requestQueue)throw new _.CommonError("RequestQueue not initialized","goto");const i="req-"+ ++this.requestCounter,s=new Promise((t,e)=>{this.pendingRequests.set(i,{resolve:t,reject:e})});return await this.requestQueue.addRequest({url:t,headers:this.hdrs,userData:{requestId:i,waitUntil:e?.waitUntil||"domcontentloaded"},uniqueKey:`${t}-${i}`}),s}};N.id="playwright",N.mode="browser",k.register(N);var H=class extends f{async onExecute(t,e){const{selector:i,...s}=e?.params||{};if(!i)throw new Error("Selector 
is required for click action");await this.delegateToEngine(t,"click",i,s)}};H.id="click",H.returnType="none",H.capabilities={http:"simulate",browser:"native"},f.register(H);var L=class extends f{async onExecute(t,e){const{selector:i,value:s,...r}=e?.params||{};if(!i)throw new Error("Selector is required for fill action");if(void 0===s)throw new Error("Value is required for fill action");await this.delegateToEngine(t,"fill",i,s,r)}};L.id="fill",L.returnType="none",L.capabilities={http:"simulate",browser:"native"},f.register(L);var M=class extends f{async onExecute(t,e){return await this.delegateToEngine(t,"getContent",e?.params)}};M.id="getContent",M.returnType="response",M.capabilities={http:"native",browser:"native"},f.register(M);var G=class extends f{async onExecute(t,e,i){const s=e?.params,r=s?.url||t.url;if(!r)throw new Error("URL is required for goto action");const n=t.internal.engine;if(!n)throw new Error("No engine available");t.url=r;return await n.goto(r,s)}};G.id="goto",G.returnType="response",G.capabilities={http:"native",browser:"native"},f.register(G);var z=class extends f{async onExecute(t,e){const{selector:i,...s}=e?.params||{};await this.delegateToEngine(t,"submit",i,s)}};z.id="submit",z.returnType="none",z.capabilities={http:"simulate",browser:"native"},f.register(z);var D=class extends f{async onExecute(t,e){const i=t.internal.engine;if(!i)throw new Error("No engine available");await i.waitFor(e?.params)}};D.id="waitFor",D.returnType="none",D.capabilities={http:"native",browser:"native"},f.register(D);var W=class extends f{async onExecute(t,e){const i=e?.params;if(!i)throw new Error("Schema is required for extract action");return this.delegateToEngine(t,"extract",i)}};W.id="extract",W.returnType="any",W.capabilities={http:"native",browser:"native"},f.register(W);var B=class extends f{async 
onExecute(t,e){const{selector:i,message:s,attribute:r}=e?.params||{},n=t.internal.engine;if("browser"===n?.mode){if(i){if(!await(n?.extract({selector:i,attribute:r})))return}n&&"pause"in n?await n.pause(s):console.warn("[PauseAction] was called, but the current engine does not support `pause`. Skipped.")}else console.warn("[PauseAction] can only run in browser engine. Skipped.")}};async function I(t,e){return(new R).fetch(t,e)}B.id="pause",B.capabilities={http:"native",browser:"native"},B.returnType="none",f.register(B);
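For readers tracing engine selection through the minified bundle, the site-registry lookup inlined in the helper `C` (present in both versions) can be read roughly as the de-minified sketch below. The function name `findSiteConfig` and the sample `sites` entries are illustrative assumptions; only the matching order (exact hostname, then hostname suffix, then optional `pathScope` prefix check) is taken from the bundled code.

```javascript
// Readable sketch of the site lookup performed before an engine is created.
// `findSiteConfig` is a hypothetical name; the bundle inlines this logic.
function findSiteConfig(url, sites) {
  if (!url || !sites?.length) return null;
  const target = new URL(url);

  // Prefer an exact hostname match, then fall back to a suffix match.
  let site = sites.find((s) => s.domain === target.hostname);
  if (!site) site = sites.find((s) => target.hostname.endsWith(s.domain));
  if (!site) return null;

  // When pathScope is set, at least one prefix must match the URL path.
  if (site.pathScope?.length &&
      !site.pathScope.some((p) => target.pathname.startsWith(p))) {
    return null;
  }
  return site;
}

// Hypothetical registry entries, for illustration only.
const sites = [
  { domain: "example.com", engine: "http" },
  { domain: "shop.example.com", engine: "browser", pathScope: ["/cart"] },
];

const cartSite = findSiteConfig("https://shop.example.com/cart/1", sites);
const aboutSite = findSiteConfig("https://shop.example.com/about", sites);
const wwwSite = findSiteConfig("https://www.example.com/", sites);
```

Two behaviours follow from this order. A host that matches an entry exactly but fails its `pathScope` yields no site at all rather than falling back to a broader entry, and the plain `endsWith` suffix test also accepts unrelated hosts such as `notexample.com`. The resolved engine is then `context.engine`, else the site's `engine`, else `"auto"`, with a retry using `"http"` if no engine class is found.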
+ "use strict";var t,e=Object.create,i=Object.defineProperty,s=Object.getOwnPropertyDescriptor,n=Object.getOwnPropertyNames,r=Object.getPrototypeOf,o=Object.prototype.hasOwnProperty,a=(t,e,r,a)=>{if(e&&"object"==typeof e||"function"==typeof e)for(let c of n(e))o.call(t,c)||c===r||i(t,c,{get:()=>e[c],enumerable:!(a=s(e,c))||a.enumerable});return t},c={};((t,e)=>{for(var s in e)i(t,s,{get:e[s],enumerable:!0})})(c,{CheerioFetchEngine:()=>j,ClickAction:()=>H,DefaultFetcherProperties:()=>l,ExtractAction:()=>W,FetchAction:()=>f,FetchActionResultStatus:()=>h,FetchEngine:()=>S,FetchSession:()=>$,FetcherOptionKeys:()=>u,FillAction:()=>M,GetContentAction:()=>L,GotoAction:()=>G,PauseAction:()=>B,PlaywrightFetchEngine:()=>N,SubmitAction:()=>z,WaitForAction:()=>D,WebFetcher:()=>R,fetchWeb:()=>I}),module.exports=(t=c,a(i({},"__esModule",{value:!0}),t));var l={engine:"auto",enableSmart:!0,useSiteRegistry:!0,antibot:!1,headers:{},cookies:[],reuseCookies:!0,throwHttpErrors:void 0,proxy:[],blockResources:[],ignoreSslErrors:!0,browser:{engine:"playwright",headless:!0,waitUntil:"domcontentloaded"},http:{method:"GET"},timeoutMs:6e4,requestHandlerTimeoutSecs:void 0,maxConcurrency:1,maxRequestsPerMinute:1e3,delayBetweenRequestsMs:0,retries:0,sites:[]},u=Object.keys(l).concat(["actions","onPause"]),h=(t=>(t[t.Failed=0]="Failed",t[t.Success=1]="Success",t[t.Skipped=2]="Skipped",t))(h||{}),w=class t{static register(t){const e=t.id;if(!e)throw new Error("FetchAction.register: actionClass.id is required");this.registry.set(e,t)}static get(t){return this.registry.get(t)}static create(e){const i="string"==typeof e?e:e.id||e.name||e.action;if(!i)throw new Error("Action must have id, name or action");const s=i instanceof t?i.constructor:this.registry.get(i);return s?new s:void 0}static has(t){return this.registry.has(t)}static list(){return Array.from(this.registry.keys())}static getCapability(t){return this.capabilities[t]??"noop"}getCapability(t){return this.constructor.getCapability(t)}get 
id(){return this.constructor.id}get returnType(){return this.constructor.returnType}get capabilities(){return this.constructor.capabilities}async delegateToEngine(t,e,...i){const s=t.internal.engine;if(!s)throw new Error("No engine available");if("function"!=typeof s[e])throw new Error(`Engine does not have a method named '${String(e)}'`);return await s[e](...i)}installCollectors(e,i){const s=i?.collectors;if(!s?.length)return;const n=[],r=new Set;for(const i of s){const s=d(i.activateOn),o=d(i.collectOn),a=d(i.deactivateOn),c=!(i.background??!0),l=t.create(i);if(!l)continue;let u=!1,h=!1,w=0;const f=async t=>{if(!u&&!h){u=!0;try{await(l.onBeforeExec?.(e,i))}catch(t){e.eventBus.emit("collector:error",{action:this.id,collector:l.id,phase:"before",error:t})}}},m=async(t,s)=>{if(!h){u||await f(s);try{const n=Promise.resolve(l.onExecute?.(e,i,s)).then(s=>{var n,r;if(i.storeAs){((n=e.outputs)[r=i.storeAs]||(n[r]=[])).push(s)}return e.eventBus.emit("collector:result",{action:this.id,collector:i.id||i.name,event:t,result:s}),s}).catch(s=>{e.eventBus.emit("collector:error",{action:this.id,collector:i.id||i.name,event:t,phase:"exec",error:s})}).finally(()=>{w++});c&&(r.add(n),n.finally(()=>r.delete(n)))}catch(i){e.eventBus.emit("collector:error",{action:this.id,collector:l.id,event:t,phase:"exec",error:i})}}},g=async()=>{if(!h){0===w&&m("collector:after"),h=!0;try{await(l.onAfterExec?.(e,i))}catch(t){e.eventBus.emit("collector:error",{action:this.id,collector:i.id||i.name,phase:"after",error:t})}finally{e.eventBus.emit("collector:end",{action:this.id,collector:i.id||i.name}),v.forEach(t=>t())}}},b=p(e,s,f),v=y(e,o,m),x=p(e,a,g);if(n.push(...b,...v,...x),!s.length&&!o.length&&!a.length){const t=()=>{g()};e.eventBus.once(`action:${this.id}.end`,t),n.push(()=>e.eventBus.off("fetcher:action:end",t))}}return n.length||r.size>0?{cleanup:()=>n.forEach(t=>t()),awaitExecPendings:async()=>{r.size>0&&await Promise.allSettled(Array.from(r))}}:void 0}async 
beforeExec(t,e){t.internal.actionStack||(t.internal.actionStack=[]);const i=t.internal.actionStack,s=i.length,n=i.length>0?i[i.length-1].id:void 0,r={...e,id:this.id,depth:s,parent:n};i.push(r),t.currentAction=r;const o={action:this,context:t,options:e,depth:s,stack:[...i]};t.eventBus.emit(`action:${this.id}.start`,o),t.eventBus.emit("action:start",o),await(this.onBeforeExec?.(t,e));return{entry:o,collectors:this.installCollectors(t,e)}}async afterExec(t,e,i,s){const n=t.internal.actionStack,r=n.length-1,o=s?.collectors;try{await(o?.awaitExecPendings()),t.lastResult=i,"response"!==i?.returnType||i.error||(t.lastResponse=i.result),e?.storeAs&&(t.outputs[e.storeAs]=i?.result),i?.error&&(t.currentAction.error=i.error),await(this.onAfterExec?.(t,e));const s={action:this,context:t,options:e,result:i,depth:r,stack:[...n]};i?.error&&(s.error=i.error);try{t.eventBus.emit(`action:${this.id}.end`,s)}catch(t){}try{t.eventBus.emit("action:end",s)}catch(t){}}finally{try{o?.cleanup()}finally{n.pop();const e=n.length;t.currentAction=e>0?n[e-1]:void 0}}}async execute(t,e){e?.args&&!e.params&&(e.params=e.args);const i=await this.beforeExec(t,e);let s;try{const i=e?.failOnError??!0;return t.throwHttpErrors=i,s=await this.onExecute(t,e),s&&s.returnType||(s={status:1,returnType:this.returnType??"any",result:s}),s}catch(i){if(s={status:0,error:i,meta:{id:this.id,engineType:t.engine,capability:this.getCapability(t.engine)}},e?.failOnError)throw i;return s}finally{await this.afterExec(t,e,s,i)}}};w.registry=new Map,w.returnType="any",w.capabilities={http:"noop",browser:"noop"};var f=w;function d(t){return t?Array.isArray(t)?t:[t]:[]}function p(t,e,i){const s=[];for(const n of e)if("string"==typeof n||n instanceof RegExp){const e=(...t)=>{i(t[0])};t.eventBus.once(n,e),s.push(()=>t.eventBus.off(n,e))}return s}function y(t,e,i){const s=[];for(const n of e)if("string"==typeof n||n instanceof RegExp){const e=t=>i(n,t);t.eventBus.on(n,e),s.push(()=>t.eventBus.off(n,e))}return s}var 
m=require("events-ex");var g=require("lodash-es"),b=(0,require("nanoid").customAlphabet)("0123456789abcdefghijklmnopqrstuvwxyz",12);var v=require("lodash-es"),x=require("events-ex"),q=require("@isdk/common-error"),E=require("crawlee");function k(){let t=()=>{};const e=new Promise(e=>{t=e});return e.release=t,e}E.Configuration.getGlobalConfig().set("persistStorage",!1);var S=class{constructor(){this.hdrs={},this.jar=[],this.pendingRequests=new Map,this.requestCounter=0,this.actionEmitter=new x.EventEmitter,this.isPageActive=!1,this.navigationLock=function(){const t=k();return t.release(),t}(),this.blockedTypes=new Set}static register(t){const e=t.id;if(!e)throw new Error("Engine must define static id");if(this.registry.has(e))throw new Error(`Engine id duplicated: ${e}`);this.registry.set(e,t)}static get(t){return this.registry.get(t)}static getByMode(t){for(const[e,i]of this.registry.entries())if(i.mode===t)return i}static async create(t,e){const i=(0,v.defaultsDeep)(e,t,l),s=i.engine??t.engine,n=s?this.get(s)??this.getByMode(s):null;if(n){const e=new n;return await e.initialize(t,i),e}}async _extract(t,e){const i=t.type;if(!e)return"array"===i?[]:null;if("object"===i){const{selector:i,properties:s}=t;let n=e;if(i){const t=await this._querySelectorAll(e,i);n=t.length>0?t[0]:null}if(!n)return null;const r={};for(const t in s)r[t]=await this._extract(s[t],n);return r}if("array"===i){const{selector:i,items:s}=t,n=i?await this._querySelectorAll(e,i):[e],r=[];for(const t of n)r.push(await this._extract(s,t));return r}const{selector:s}=t;let n=e;if(s){const t=await this._querySelectorAll(e,s);n=t.length>0?t[0]:null}return n?this._extractValue(t,n):null}async buildResponse(t){const e=await this._buildResponse(t),i=e.headers["content-type"]||"";return e.contentType=i.split(";")[0].trim(),e}waitFor(t){return this.dispatchAction({type:"waitFor",options:t})}click(t){return this.dispatchAction({type:"click",selector:t})}fill(t,e){return 
this.dispatchAction({type:"fill",selector:t,value:e})}submit(t,e){return this.dispatchAction({type:"submit",selector:t,options:e})}pause(t){return this.dispatchAction({type:"pause",message:t})}extract(t){const e=this._normalizeSchema(t);return this.dispatchAction({type:"extract",schema:e})}_normalizeSchema(t){const e=JSON.parse(JSON.stringify(t));if(e.properties)for(const t in e.properties)e.properties[t]=this._normalizeSchema(e.properties[t]);if(e.items&&(e.items=this._normalizeSchema(e.items)),"array"===e.type&&(e.attribute&&!e.items&&(e.items={attribute:e.attribute},delete e.attribute),e.items||(e.items={type:"string"})),e.selector&&(e.has||e.exclude)){const{selector:t,has:i,exclude:s}=e,n=t.split(",").map(t=>{let e=t.trim();return i&&(e=`${e}:has(${i})`),s&&(e=`${e}:not(${s})`),e}).join(", ");e.selector=n,delete e.has,delete e.exclude}return e}get id(){return this.constructor.id}async getState(){return{cookies:await this.cookies()}}get mode(){return this.constructor.mode}get context(){return this.ctx}async initialize(t,e){if(this.ctx)return;(0,v.merge)(t,e),this.ctx=t,this.opts=t,this.hdrs=function(t){const e={};if(t&&"object"==typeof t)for(const[i,s]of Object.entries(t))e[i.toLowerCase()]=s;return e}(t.headers),this.jar=[...t.cookies??[]],t.internal||(t.internal={}),t.internal.engine=this,t.engine=this.mode,this.actionEmitter.setMaxListeners(100),this.requestQueue=await E.RequestQueue.open();const i=await 
this._getSpecificCrawlerOptions(t),s={...(0,v.defaultsDeep)(i,{requestQueue:this.requestQueue,maxConcurrency:1,minConcurrency:1,useSessionPool:!0,persistCookiesPerSession:!0,sessionPoolOptions:{maxPoolSize:1,persistenceOptions:{enable:!1},sessionOptions:{maxUsageCount:1e3,maxErrorScore:3}}}),requestHandler:this._requestHandler.bind(this),errorHandler:this._failedRequestHandler.bind(this),failedRequestHandler:this._failedRequestHandler.bind(this)};this.crawler=this._createCrawler(s),this.crawler.run().then(()=>{this.isCrawlerReady=!0}).catch(t=>{this.isCrawlerReady=!1,console.error("Crawler background error:",t)})}async cleanup(){await(this._cleanup?.()),await this._commonCleanup();const t=this.ctx;t&&t.internal?.engine===this&&(t.internal.engine=void 0),this.ctx=void 0,this.opts=void 0}async _executePendingActions(t){await new Promise(e=>{const i=async({action:e,resolve:i,reject:s})=>{try{if("dispose"===e.type)return this.actionEmitter.emit("dispose"),void i();i(await this.executeAction(t,e))}catch(t){s(t)}};this.actionEmitter.on("dispatch",i),this.actionEmitter.once("dispose",()=>{this.actionEmitter.removeListener("dispatch",i),e()})})}async _sharedRequestHandler(t){try{const{request:e}=t;this.isPageActive=!0;const i=this.pendingRequests.get(e.userData.requestId);if(i){const s=await this._buildResponse(t),n=!s.statusCode||s.statusCode>=400;if(this.ctx?.throwHttpErrors&&n){const t=new q.CommonError(`Request for ${s.finalUrl} failed with status ${s.statusCode||"N/A"}`,"request",s.statusCode);i.reject(t)}else this.lastResponse=s,i.resolve(s);this.pendingRequests.delete(e.userData.requestId)}await this._executePendingActions(t)}finally{this.isPageActive=!1,this.navigationLock.release()}}async _sharedFailedRequestHandler(t,e){const{request:i}=t,s=this.pendingRequests.get(i.userData.requestId);if(s&&e&&this.ctx?.throwHttpErrors){this.pendingRequests.delete(i.userData.requestId);const t=e.response,n=t?.statusCode||500,r=t?.url?t.url:i.url,o=new 
q.CommonError(`Request${r?" for "+r:""} failed: ${e.message}`,"request",n);s.reject(o)}return this._sharedRequestHandler(t)}async dispatchAction(t){if(!this.isPageActive)throw new Error("No active page. Call goto() before performing actions.");return new Promise((e,i)=>{this.actionEmitter.emit("dispatch",{action:t,resolve:e,reject:i})})}async _requestHandler(t){await this._sharedRequestHandler(t)}async _failedRequestHandler(t,e){await this._sharedFailedRequestHandler(t,e)}async _commonCleanup(){if(this.isPageActive&&await this.dispatchAction({type:"dispose"}).catch(()=>{}),this.pendingRequests.size>0)for(const[,t]of this.pendingRequests)t.reject(new Error("Cleanup:Request cancelled"));if(this.actionEmitter.removeAllListeners(),this.crawler){try{await(this.crawler.teardown?.())}catch(t){console.error("ccrawler teardown error:",t)}this.crawler=void 0}this.isCrawlerReady=void 0,this.requestQueue&&(await this.requestQueue.drop(),this.requestQueue=void 0),this.pendingRequests.clear()}async blockResources(t,e){return e&&this.blockedTypes.clear(),t.forEach(t=>this.blockedTypes.add(t)),t.length}getContent(){return this.lastResponse?Promise.resolve(this.lastResponse):Promise.reject(new Error("No content fetched yet. 
Call goto() first."))}async headers(t,e){if(void 0===t)return{...this.hdrs};if("string"==typeof t&&void 0===e)return this.hdrs[t.toLowerCase()]||"";if(null!==t&&"object"==typeof t){const i={};for(const[e,s]of Object.entries(t))i[e.toLowerCase()]=String(s);return this.hdrs=!0===e?i:{...this.hdrs,...i},!0}return"string"==typeof t&&("string"==typeof e?this.hdrs[t.toLowerCase()]=e:null===e&&delete this.hdrs[t.toLowerCase()],!0)}async cookies(t){return Array.isArray(t)?(this.jar=[...t],!0):null===t?(this.jar=[],!0):[...this.jar]}async dispose(){await this.cleanup()}};async function C(t,e){const i=function(t,e){if(!t||!e?.length)return null;const i=new URL(t);let s=e.find(t=>t.domain===i.hostname);s||(s=e.find(t=>i.hostname.endsWith(t.domain)));if(!s)return null;if(s.pathScope?.length){if(!s.pathScope.some(t=>i.pathname.startsWith(t)))return null}return s}(e?.url||t.url,t.sites),s=t.engine||i?.engine||"auto";let n=await S.create(t,{engine:s});return n||(n=await S.create(t,{engine:"http"})),n}S.registry=new Map;var $=class{constructor(t={}){this.options=t,this.closed=!1,this.id=b(),this.context=this.createContext(t)}async execute(t){await this.ensureEngine(t);const e=f.create(t);if(!e)throw new Error(`Unknown action: ${t.id||t.name}`);let i,s;this.context.internal.actionIndex=(this.context.internal.actionIndex||0)+1,this.context.currentAction={...t,index:this.context.internal.actionIndex,startedAt:Date.now()};try{return i=await e.execute(this.context,t),i}catch(t){throw s=t,s}finally{this.context.currentAction=void 0}}async executeAll(t){let e=0;try{for(;e<t.length;){const i=t[e];await this.execute(i),e++}const i=await this.execute({id:"getContent"});return{result:i?.result,outputs:this.getOutputs()}}catch(t){throw t.actionIndex=e,t}}getOutputs(){return this.context.outputs}async getState(){return this.context.internal.engine?.getState()}async dispose(){if(this.closed)return;const 
t=this.context.eventBus;t.emit("session:closing",{sessionId:this.id});try{await(this.context.internal.engine?.dispose())}finally{this.closed=!0}t.emit("session:closed",{sessionId:this.id})}async ensureEngine(t){if(this.closed)throw new Error("Session is closed");if(!this.context.internal.engine){const e=t?.params?.url??this.context.url;if(!await C(this.context,{url:e}))throw new Error("No engine found")}}createContext(t=this.options){const e=new m.EventEmitter;return(0,g.defaultsDeep)({...t,id:this.id,eventBus:e,outputs:{},internal:{},execute:async t=>this.execute(t),action:async function(t,e,i){return this.execute({name:t,params:e,...i})}},l)}},R=class{constructor(t={}){this.defaults=t}async createSession(t){const e={...this.defaults,...t||{}};return new $(e)}async fetch(t,e){"string"!=typeof t&&(t=(e=t).url);const i=await this.createSession(e);try{const s=e?.actions||[];t&&0!==s.findIndex(e=>"goto"===e.id&&e.params?.url===t)&&s.unshift({id:"goto",params:{url:t}});return await i.executeAll(s)}finally{await i.dispose()}}},A=require("crawlee"),P=((t,s,n)=>(n=null!=t?e(r(t)):{},a(!s&&t&&t.__esModule?n:i(n,"default",{value:t,enumerable:!0}),t)))(require("cheerio")),F=require("@isdk/common-error"),j=class extends S{async _buildResponse(t){const{request:e,response:i,body:s,$:n}=t,r=n?.html();let o="string"==typeof s?s:Buffer.isBuffer(s)?s.toString("utf-8"):String(s??"");r&&r!==o&&(o=r);let a=i?.headers;if(!a&&i?.rawHeaders){a={};const t=i.rawHeaders;for(let e=0;e<t.length;e+=2)a[t[e].toLowerCase()]=t[e+1]}return{url:e.url,finalUrl:e.loadedUrl||e.url,statusCode:i?.statusCode??200,statusText:i?.statusMessage,headers:a||{},cookies:t.session?.getCookies(e.url),body:s,html:o,text:o}}async _querySelectorAll(t,e){const{$:i,el:s}=t;return s.find(e).toArray().map(t=>({$:i,el:i(t)}))}async _extractValue(t,e){const{el:i}=e,{attribute:s,type:n="string"}=t;if(0===i.length)return null;let r="";if(r=s?i.attr(s)??null:"html"===n?i.html():i.text().trim(),null===r)return 
null;switch(n){case"number":return parseFloat(r.replace(/[^0-9.-]+/g,""))||null;case"boolean":const t=r.toLowerCase();return"true"===t||"1"===t;default:return r}}async executeAction(t,e){const{$:i}=t;switch(e.type){case"dispose":return;case"extract":if(!i)throw new F.CommonError(`Cheerio context not available for action: ${e.type}`,"extract");return this._extract(e.schema,{$:i,el:i.root()});case"click":{if(!i)throw new F.CommonError(`Cheerio context not available for action: ${e.type}`,"click");const s=e.selector,n=i(s).first();let r;if(0===n.length)try{r=new URL(s,t.request.loadedUrl||t.request.url).href}catch{throw new F.CommonError(`click: selector not found or invalid URL: ${s}`,"click")}else{if(!n.is("a")||!n.attr("href")){if(n.is('input[type="submit"], button[type="submit"], button, input')){const e=n.closest("form");if(e.length)return this.executeAction(t,{type:"submit",selector:e});throw new F.CommonError("click: submit-like element without form","click")}throw new F.CommonError(`click: unsupported element for http simulate. 
Selector: ${s}`,"click")}{const e=n.attr("href");r=new URL(e,t.request.loadedUrl||t.request.url).href}}const o=await t.sendRequest({url:r});return void this._updateStateAfterNavigation(t,o)}case"fill":{if(!i)throw new F.CommonError(`Cheerio context not available for action: ${e.type}`),"fill";const s=i(e.selector).first();if(0===s.length)throw new F.CommonError(`fill: selector not found: ${e.selector}`);if(!s.is("input, textarea, select"))throw new F.CommonError(`fill: not a form field: ${e.selector}`);return s.val(e.value),void(this.lastResponse=await this.buildResponse(t))}case"waitFor":return void(e.options?.ms&&await new Promise(t=>setTimeout(t,e.options.ms)));case"pause":const s=this.ctx?.onPause;return void(s?(console.info(e.message||"Execution paused for manual intervention."),await s({message:e.message}),console.info("Resuming execution...")):console.warn("[PauseAction] was called, but no `onPause` handler was provided in fetchWeb options. Skipped."));case"submit":{if(!i)throw new F.CommonError(`Cheerio context not available for action: ${e.type}`,"submit");const s="string"==typeof e.selector?i(e.selector).first():null!=e.selector?e.selector:i("form").first();if(0===s.length)throw new F.NotFoundError(e.selector,"submit");const n=s.attr("action")||t.request.loadedUrl||t.request.url,r=(s.attr("method")||"GET").toUpperCase(),o=new URL(n,t.request.loadedUrl||t.request.url).href,a={};let c;if(s.find("input, select, textarea").each((t,e)=>{const s=i(e),n=s.attr("name");if(!n)return;const r=s.val();null!=r&&(a[n]=String(r))}),"GET"===r){const e=new URL(o);Object.entries(a).forEach(([t,i])=>e.searchParams.set(t,i)),c=await t.sendRequest({url:e.href,method:"GET"})}else{let i;const n={};"application/json"===(e.options?.enctype||s.attr("enctype")||"application/x-www-form-urlencoded")?(i=JSON.stringify(a),n["Content-Type"]="application/json"):(i=new URLSearchParams(a).toString(),n["Content-Type"]="application/x-www-form-urlencoded"),c=await 
t.sendRequest({url:o,method:"POST",body:i,headers:n})}return void this._updateStateAfterNavigation(t,c)}case"getContent":return this.buildResponse(t);default:throw new F.CommonError(`Unknown action type: ${e.type}`,"CheerioFetchEngine.executeAction",F.ErrorCode.NotSupported)}}_updateStateAfterNavigation(t,e){const i=e;let s=i.headers;if(!s&&i.rawHeaders){s={};for(let t=0;t<i.rawHeaders.length;t+=2)s[i.rawHeaders[t].toLowerCase()]=i.rawHeaders[t+1]}s=s||{};const n=i.body,r=P.load(n??"");t.$=r,t.response=i,t.body=n;const o=r.html(),a=r.text(),c=(s["content-type"]||"").split(";")[0].trim();this.lastResponse={url:t.request.url,finalUrl:i.url,statusCode:i.statusCode,statusText:i.statusMessage,headers:s,contentType:c,body:n,html:o,text:a}}_createCrawler(t){return new A.CheerioCrawler(t)}_getSpecificCrawlerOptions(t){const e=this.opts?.proxy?"string"==typeof this.opts.proxy?[this.opts.proxy]:this.opts.proxy:void 0,i=e?.length?new A.ProxyConfiguration({proxyUrls:e}):void 0;return{additionalMimeTypes:["text/plain"],maxRequestRetries:1,requestHandlerTimeoutSecs:t.requestHandlerTimeoutSecs,proxyConfiguration:i,preNavigationHooks:[({session:e,request:i},s)=>{if(s.throwHttpErrors=t.throwHttpErrors,this.opts?.timeoutMs&&(s.timeout={request:this.opts.timeoutMs}),this.jar.length>0&&e)for(const t of this.jar){const s=`${t.name}=${t.value}`;e.setCookie(s,i.url)}}]}}async goto(t,e){this.isPageActive&&this.dispatchAction({type:"dispose"}).catch(()=>{});const i="req-"+ ++this.requestCounter,s=new Promise((t,s)=>{const n=e?.timeoutMs||this.opts?.timeoutMs||3e4,r=setTimeout(()=>{this.pendingRequests.delete(i),this.navigationLock.release(),s(new F.CommonError(`goto timed out after ${n}ms.`,"gotoTimeout",F.ErrorCode.RequestTimeout))},n);this.pendingRequests.set(i,{resolve:e=>{clearTimeout(r),t(e)},reject:t=>{clearTimeout(r),s(t)}})});return this.requestQueue.addRequest({...e,url:t,headers:{...this.hdrs,...e?.headers},userData:{requestId:i},uniqueKey:`${t}-${i}`}).catch(t=>{const 
e=this.pendingRequests.get(i);e&&(this.pendingRequests.delete(i),this.navigationLock.release(),e.reject(t))}),await this.navigationLock,this.navigationLock=k(),s}};j.id="cheerio",j.mode="http",S.register(j);var T=require("crawlee"),O=require("playwright"),_=require("camoufox-js"),U=require("@isdk/common-error"),N=class extends S{async _buildResponse(t){const{page:e,response:i,request:s}=t;if(!e||e.isClosed())return{url:s.url,finalUrl:s.loadedUrl||s.url,statusCode:i?.status(),statusText:i?.statusText(),headers:await(i?.allHeaders())||{},cookies:[],body:"",html:"",text:""};const n=await e.content(),r=await e.textContent("body");return{url:e.url(),finalUrl:e.url(),statusCode:i?.status(),statusText:i?.statusText(),headers:await(i?.allHeaders())||{},cookies:await e.context().cookies(),body:n,html:n,text:r||""}}async _querySelectorAll(t,e){return t.locator(e).all()}async _extractValue(t,e){const{attribute:i,type:s="string"}=t;if(0===await e.count())return null;let n="";if(n=i?await e.getAttribute(i):"html"===s?await e.innerHTML():await e.textContent(),null===n)return null;switch(n=n.trim(),s){case"number":return parseFloat(n.replace(/[^0-9.-]+/g,""))||null;case"boolean":const t=n.toLowerCase();return"true"===t||"1"===t;default:return n}}async executeAction(t,e){const{page:i}=t,s=this.opts?.timeoutMs||3e4;switch(e.type){case"navigate":{const s=await i.goto(e.url,{waitUntil:e.opts?.waitUntil||"domcontentloaded",timeout:this.opts?.timeoutMs||3e4});s&&(t={...t,response:s});const n=await this.buildResponse(t);return this.lastResponse=n,n}case"extract":{const s=await this._extract(e.schema,i.locator("body"));return this.lastResponse=await this.buildResponse(t),s}case"click":{await i.click(e.selector,{timeout:s}),await i.waitForLoadState("networkidle",{timeout:s});const n=await this.buildResponse(t);return void(this.lastResponse=n)}case"fill":await i.fill(e.selector,e.value,{timeout:s});const n=await this.buildResponse(t);return 
void(this.lastResponse=n);case"waitFor":try{e.options?.selector&&await i.waitForSelector(e.options.selector,{timeout:s}),e.options?.networkIdle&&await i.waitForLoadState("networkidle",{timeout:s})}catch(t){if(!1!==e.options?.failOnTimeout)throw t}return void(e.options?.ms&&await i.waitForTimeout(e.options.ms));case"submit":{const n=e.selector||"form",r=i.locator(n).first();if(0===await r.count())throw new U.NotFoundError(n,"submit");if("application/json"===(e.options?.enctype||"application/x-www-form-urlencoded")){const t=await r.elementHandle();if(!t)throw new U.CommonError(`submit: could not get form handle for ${n}`,"submit");const e=await t.evaluate(async t=>{const e=new FormData(t),i={};e.forEach((t,e)=>{i[e]=t.toString()});const s=await fetch(t.action,{method:t.method,headers:{"Content-Type":"application/json"},body:JSON.stringify(i)}),n=await s.text();return{status:s.status,statusText:s.statusText,headers:Object.fromEntries(s.headers.entries()),body:n,html:n,text:n,url:t.action,finalUrl:s.url}});return await t.dispose(),await i.setContent(e.html),void(this.lastResponse=e)}return await r.evaluate(t=>t.submit()),await i.waitForLoadState("networkidle",{timeout:s}),void(this.lastResponse=await this.buildResponse(t))}case"pause":{const t=this.ctx?.onPause;return void(t?(console.info(e.message||"Execution paused for manual intervention."),await t({message:e.message}),console.info("Resuming execution...")):console.warn("[PauseAction] was called, but no `onPause` handler was provided in fetchWeb options. 
Skipped."))}case"getContent":return this.buildResponse(t);default:throw new U.CommonError(`Unknown action type: ${e.type}`,"PlaywrightFetchEngine.executeAction",U.ErrorCode.NotSupported)}}_createCrawler(t){return new T.PlaywrightCrawler(t)}async _getSpecificCrawlerOptions(t){const e=t.browser?.headless??!0,i={maxRequestRetries:t.retries||3,headless:e,requestHandlerTimeoutSecs:t.requestHandlerTimeoutSecs,preNavigationHooks:[async({page:e,request:i},s)=>{if(s.throwHttpErrors=t.throwHttpErrors,this.jar.length>0)try{const t=this.jar.map(t=>{const e={...t};return e.domain||e.url||(e.url=i.url),"no_restriction"===e.sameSite&&(e.sameSite="None"),e});await e.context().addCookies(t)}catch(t){console.error("[Playwright] Failed to restore cookies:",t)}const n=this.blockedTypes;n.size>0&&await e.route("**/*",t=>{n.has(t.request().resourceType())?t.abort():t.continue()})}]};if(this.opts?.antibot){i.browserPoolOptions={useFingerprints:!1};const t=await(0,_.launchOptions)({headless:e});i.launchContext={launcher:O.firefox,launchOptions:t},i.postNavigationHooks=[async({page:t,handleCloudflareChallenge:e})=>{await e()}]}return i}async goto(t,e){if(this.isPageActive)return this.dispatchAction({type:"navigate",url:t,opts:e});if(!this.requestQueue)throw new U.CommonError("RequestQueue not initialized","goto");const i="req-"+ ++this.requestCounter,s=new Promise((t,e)=>{this.pendingRequests.set(i,{resolve:t,reject:e})});return await this.requestQueue.addRequest({url:t,headers:this.hdrs,userData:{requestId:i,waitUntil:e?.waitUntil||"domcontentloaded"},uniqueKey:`${t}-${i}`}),s}};N.id="playwright",N.mode="browser",S.register(N);var H=class extends f{async onExecute(t,e){const{selector:i,...s}=e?.params||{};if(!i)throw new Error("Selector is required for click action");await this.delegateToEngine(t,"click",i,s)}};H.id="click",H.returnType="none",H.capabilities={http:"simulate",browser:"native"},f.register(H);var M=class extends f{async 
onExecute(t,e){const{selector:i,value:s,...n}=e?.params||{};if(!i)throw new Error("Selector is required for fill action");if(void 0===s)throw new Error("Value is required for fill action");await this.delegateToEngine(t,"fill",i,s,n)}};M.id="fill",M.returnType="none",M.capabilities={http:"simulate",browser:"native"},f.register(M);var L=class extends f{async onExecute(t,e){return await this.delegateToEngine(t,"getContent",e?.params)}};L.id="getContent",L.returnType="response",L.capabilities={http:"native",browser:"native"},f.register(L);var G=class extends f{async onExecute(t,e,i){const s=e?.params,n=s?.url||t.url;if(!n)throw new Error("URL is required for goto action");const r=t.internal.engine;if(!r)throw new Error("No engine available");t.url=n;return await r.goto(n,s)}};G.id="goto",G.returnType="response",G.capabilities={http:"native",browser:"native"},f.register(G);var z=class extends f{async onExecute(t,e){const{selector:i,...s}=e?.params||{};await this.delegateToEngine(t,"submit",i,s)}};z.id="submit",z.returnType="none",z.capabilities={http:"simulate",browser:"native"},f.register(z);var D=class extends f{async onExecute(t,e){const i=t.internal.engine;if(!i)throw new Error("No engine available");await i.waitFor(e?.params)}};D.id="waitFor",D.returnType="none",D.capabilities={http:"native",browser:"native"},f.register(D);var W=class extends f{async onExecute(t,e){const i=e?.params;if(!i)throw new Error("Schema is required for extract action");return this.delegateToEngine(t,"extract",i)}};W.id="extract",W.returnType="any",W.capabilities={http:"native",browser:"native"},f.register(W);var B=class extends f{async onExecute(t,e){const{selector:i,message:s,attribute:n}=e?.params||{},r=t.internal.engine;if("browser"===r?.mode){if(i){if(!await(r?.extract({selector:i,attribute:n})))return}r&&"pause"in r?await r.pause(s):console.warn("[PauseAction] was called, but the current engine does not support `pause`. 
Skipped.")}else console.warn("[PauseAction] can only run in browser engine. Skipped.")}};async function I(t,e){return(new R).fetch(t,e)}B.id="pause",B.capabilities={http:"native",browser:"native"},B.returnType="none",f.register(B);
package/dist/index.mjs CHANGED
@@ -1 +1 @@
- var t={engine:"auto",enableSmart:!0,useSiteRegistry:!0,antibot:!1,headers:{},cookies:[],reuseCookies:!0,throwHttpErrors:void 0,proxy:[],blockResources:[],ignoreSslErrors:!0,browser:{engine:"playwright",headless:!0,waitUntil:"domcontentloaded"},http:{method:"GET"},timeoutMs:6e4,requestHandlerTimeoutSecs:void 0,maxConcurrency:1,maxRequestsPerMinute:1e3,delayBetweenRequestsMs:0,retries:0,sites:[]},e=Object.keys(t).concat(["actions","onPause"]),i=(t=>(t[t.Failed=0]="Failed",t[t.Success=1]="Success",t[t.Skipped=2]="Skipped",t))(i||{}),s=class t{static register(t){const e=t.id;if(!e)throw new Error("FetchAction.register: actionClass.id is required");this.registry.set(e,t)}static get(t){return this.registry.get(t)}static create(t){const e="string"==typeof t?t:t.id||t.name;if(!e)throw new Error("Action must have id or name");const i=this.registry.get(e);return i?new i:void 0}static has(t){return this.registry.has(t)}static list(){return Array.from(this.registry.keys())}static getCapability(t){return this.capabilities[t]??"noop"}getCapability(t){return this.constructor.getCapability(t)}get id(){return this.constructor.id}get returnType(){return this.constructor.returnType}get capabilities(){return this.constructor.capabilities}async delegateToEngine(t,e,...i){const s=t.internal.engine;if(!s)throw new Error("No engine available");if("function"!=typeof s[e])throw new Error(`Engine does not have a method named '${String(e)}'`);return await s[e](...i)}installCollectors(e,i){const s=i?.collectors;if(!s?.length)return;const r=[],c=new Set;for(const i of s){const s=n(i.activateOn),l=n(i.collectOn),u=n(i.deactivateOn),h=!(i.background??!0),w=t.create(i);if(!w)continue;let f=!1,d=!1,p=0;const y=async t=>{if(!f&&!d){f=!0;try{await(w.onBeforeExec?.(e,i))}catch(t){e.eventBus.emit("collector:error",{action:this.id,collector:w.id,phase:"before",error:t})}}},m=async(t,s)=>{if(!d){f||await y(s);try{const r=Promise.resolve(w.onExecute?.(e,i,s)).then(s=>{var 
r,n;if(i.storeAs){((r=e.outputs)[n=i.storeAs]||(r[n]=[])).push(s)}return e.eventBus.emit("collector:result",{action:this.id,collector:i.id||i.name,event:t,result:s}),s}).catch(s=>{e.eventBus.emit("collector:error",{action:this.id,collector:i.id||i.name,event:t,phase:"exec",error:s})}).finally(()=>{p++});h&&(c.add(r),r.finally(()=>c.delete(r)))}catch(i){e.eventBus.emit("collector:error",{action:this.id,collector:w.id,event:t,phase:"exec",error:i})}}},g=async()=>{if(!d){0===p&&m("collector:after"),d=!0;try{await(w.onAfterExec?.(e,i))}catch(t){e.eventBus.emit("collector:error",{action:this.id,collector:i.id||i.name,phase:"after",error:t})}finally{e.eventBus.emit("collector:end",{action:this.id,collector:i.id||i.name}),x.forEach(t=>t())}}},v=o(e,s,y),x=a(e,l,m),b=o(e,u,g);if(r.push(...v,...x,...b),!s.length&&!l.length&&!u.length){const t=()=>{g()};e.eventBus.once(`action:${this.id}.end`,t),r.push(()=>e.eventBus.off("fetcher:action:end",t))}}return r.length||c.size>0?{cleanup:()=>r.forEach(t=>t()),awaitExecPendings:async()=>{c.size>0&&await Promise.allSettled(Array.from(c))}}:void 0}async beforeExec(t,e){t.internal.actionStack||(t.internal.actionStack=[]);const i=t.internal.actionStack,s=i.length,r=i.length>0?i[i.length-1].id:void 0,n={...e,id:this.id,depth:s,parent:r};i.push(n),t.currentAction=n;const o={action:this,context:t,options:e,depth:s,stack:[...i]};t.eventBus.emit(`action:${this.id}.start`,o),t.eventBus.emit("action:start",o),await(this.onBeforeExec?.(t,e));return{entry:o,collectors:this.installCollectors(t,e)}}async afterExec(t,e,i,s){const r=t.internal.actionStack,n=r.length-1,o=s?.collectors;try{await(o?.awaitExecPendings()),t.lastResult=i,"response"!==i?.returnType||i.error||(t.lastResponse=i.result),e?.storeAs&&(t.outputs[e.storeAs]=i?.result),i?.error&&(t.currentAction.error=i.error),await(this.onAfterExec?.(t,e));const 
s={action:this,context:t,options:e,result:i,depth:n,stack:[...r]};i?.error&&(s.error=i.error);try{t.eventBus.emit(`action:${this.id}.end`,s)}catch(t){}try{t.eventBus.emit("action:end",s)}catch(t){}}finally{try{o?.cleanup()}finally{r.pop();const e=r.length;t.currentAction=e>0?r[e-1]:void 0}}}async execute(t,e){const i=await this.beforeExec(t,e);let s;try{const i=e?.failOnError??!0;return t.throwHttpErrors=i,s=await this.onExecute(t,e),s&&s.returnType||(s={status:1,returnType:this.returnType??"any",result:s}),s}catch(i){if(s={status:0,error:i,meta:{id:this.id,engineType:t.engine,capability:this.getCapability(t.engine)}},e?.failOnError)throw i;return s}finally{await this.afterExec(t,e,s,i)}}};s.registry=new Map,s.returnType="any",s.capabilities={http:"noop",browser:"noop"};var r=s;function n(t){return t?Array.isArray(t)?t:[t]:[]}function o(t,e,i){const s=[];for(const r of e)if("string"==typeof r||r instanceof RegExp){const e=(...t)=>{i(t[0])};t.eventBus.once(r,e),s.push(()=>t.eventBus.off(r,e))}return s}function a(t,e,i){const s=[];for(const r of e)if("string"==typeof r||r instanceof RegExp){const e=t=>i(r,t);t.eventBus.on(r,e),s.push(()=>t.eventBus.off(r,e))}return s}import{EventEmitter as c}from"events-ex";import{defaultsDeep as l}from"lodash-es";import{customAlphabet as u}from"nanoid";var h=u("0123456789abcdefghijklmnopqrstuvwxyz",12);import{defaultsDeep as w,merge as f}from"lodash-es";import{EventEmitter as d}from"events-ex";import{CommonError as p}from"@isdk/common-error";import{Configuration as y,RequestQueue as m}from"crawlee";function g(){let t=()=>{};const e=new Promise(e=>{t=e});return e.release=t,e}y.getGlobalConfig().set("persistStorage",!1);var v=class{constructor(){this.hdrs={},this.jar=[],this.pendingRequests=new Map,this.requestCounter=0,this.actionEmitter=new d,this.isPageActive=!1,this.navigationLock=function(){const t=g();return t.release(),t}(),this.blockedTypes=new Set}static register(t){const e=t.id;if(!e)throw new Error("Engine must define 
static id");if(this.registry.has(e))throw new Error(`Engine id duplicated: ${e}`);this.registry.set(e,t)}static get(t){return this.registry.get(t)}static getByMode(t){for(const[e,i]of this.registry.entries())if(i.mode===t)return i}static async create(e,i){const s=w(i,e,t),r=s.engine??e.engine,n=r?this.get(r)??this.getByMode(r):null;if(n){const t=new n;return await t.initialize(e,s),t}}async _extract(t,e){const i=t.type;if(!e)return"array"===i?[]:null;if("object"===i){const{selector:i,properties:s}=t;let r=e;if(i){const t=await this._querySelectorAll(e,i);r=t.length>0?t[0]:null}if(!r)return null;const n={};for(const t in s)n[t]=await this._extract(s[t],r);return n}if("array"===i){const{selector:i,items:s}=t,r=i?await this._querySelectorAll(e,i):[e],n=[];for(const t of r)n.push(await this._extract(s,t));return n}const{selector:s}=t;let r=e;if(s){const t=await this._querySelectorAll(e,s);r=t.length>0?t[0]:null}return r?this._extractValue(t,r):null}async buildResponse(t){const e=await this._buildResponse(t),i=e.headers["content-type"]||"";return e.contentType=i.split(";")[0].trim(),e}waitFor(t){return this.dispatchAction({type:"waitFor",options:t})}click(t){return this.dispatchAction({type:"click",selector:t})}fill(t,e){return this.dispatchAction({type:"fill",selector:t,value:e})}submit(t,e){return this.dispatchAction({type:"submit",selector:t,options:e})}pause(t){return this.dispatchAction({type:"pause",message:t})}extract(t){const e=this._normalizeSchema(t);return this.dispatchAction({type:"extract",schema:e})}_normalizeSchema(t){const e=JSON.parse(JSON.stringify(t));if(e.properties)for(const t in e.properties)e.properties[t]=this._normalizeSchema(e.properties[t]);if(e.items&&(e.items=this._normalizeSchema(e.items)),"array"===e.type&&(e.attribute&&!e.items&&(e.items={attribute:e.attribute},delete e.attribute),e.items||(e.items={type:"string"})),e.selector&&(e.has||e.exclude)){const{selector:t,has:i,exclude:s}=e,r=t.split(",").map(t=>{let e=t.trim();return 
i&&(e=`${e}:has(${i})`),s&&(e=`${e}:not(${s})`),e}).join(", ");e.selector=r,delete e.has,delete e.exclude}return e}get id(){return this.constructor.id}get mode(){return this.constructor.mode}get context(){return this.ctx}async initialize(t,e){if(this.ctx)return;f(t,e),this.ctx=t,this.opts=t,this.hdrs=function(t){const e={};if(t&&"object"==typeof t)for(const[i,s]of Object.entries(t))e[i.toLowerCase()]=s;return e}(t.headers),this.jar=[...t.cookies??[]],t.internal||(t.internal={}),t.internal.engine=this,t.engine=this.mode,this.actionEmitter.setMaxListeners(100),this.requestQueue=await m.open();const i=await this._getSpecificCrawlerOptions(t),s={...w(i,{requestQueue:this.requestQueue,maxConcurrency:1,minConcurrency:1,useSessionPool:!0,persistCookiesPerSession:!0,sessionPoolOptions:{maxPoolSize:1,persistenceOptions:{enable:!1},sessionOptions:{maxUsageCount:1e3,maxErrorScore:3}}}),requestHandler:this._requestHandler.bind(this),errorHandler:this._failedRequestHandler.bind(this),failedRequestHandler:this._failedRequestHandler.bind(this)};this.crawler=this._createCrawler(s),this.crawler.run().then(()=>{this.isCrawlerReady=!0}).catch(t=>{this.isCrawlerReady=!1,console.error("Crawler background error:",t)})}async cleanup(){await(this._cleanup?.()),await this._commonCleanup();const t=this.ctx;t&&t.internal?.engine===this&&(t.internal.engine=void 0),this.ctx=void 0,this.opts=void 0}async _executePendingActions(t){await new Promise(e=>{const i=async({action:e,resolve:i,reject:s})=>{try{if("dispose"===e.type)return this.actionEmitter.emit("dispose"),void i();i(await this.executeAction(t,e))}catch(t){s(t)}};this.actionEmitter.on("dispatch",i),this.actionEmitter.once("dispose",()=>{this.actionEmitter.removeListener("dispatch",i),e()})})}async _sharedRequestHandler(t){try{const{request:e}=t;this.isPageActive=!0;const i=this.pendingRequests.get(e.userData.requestId);if(i){const s=await this._buildResponse(t),r=!s.statusCode||s.statusCode>=400;if(this.ctx?.throwHttpErrors&&r){const 
t=new p(`Request for ${s.finalUrl} failed with status ${s.statusCode||"N/A"}`,"request",s.statusCode);i.reject(t)}else this.lastResponse=s,i.resolve(s);this.pendingRequests.delete(e.userData.requestId)}await this._executePendingActions(t)}finally{this.isPageActive=!1,this.navigationLock.release()}}async _sharedFailedRequestHandler(t,e){const{request:i}=t,s=this.pendingRequests.get(i.userData.requestId);if(s&&e&&this.ctx?.throwHttpErrors){this.pendingRequests.delete(i.userData.requestId);const t=e.response,r=t?.statusCode||500,n=t?.url?t.url:i.url,o=new p(`Request${n?" for "+n:""} failed: ${e.message}`,"request",r);s.reject(o)}return this._sharedRequestHandler(t)}async dispatchAction(t){if(!this.isPageActive)throw new Error("No active page. Call goto() before performing actions.");return new Promise((e,i)=>{this.actionEmitter.emit("dispatch",{action:t,resolve:e,reject:i})})}async _requestHandler(t){await this._sharedRequestHandler(t)}async _failedRequestHandler(t,e){await this._sharedFailedRequestHandler(t,e)}async _commonCleanup(){if(this.isPageActive&&await this.dispatchAction({type:"dispose"}).catch(()=>{}),this.pendingRequests.size>0)for(const[,t]of this.pendingRequests)t.reject(new Error("Cleanup:Request cancelled"));if(this.actionEmitter.removeAllListeners(),this.crawler){try{await(this.crawler.teardown?.())}catch(t){console.error("ccrawler teardown error:",t)}this.crawler=void 0}this.isCrawlerReady=void 0,this.requestQueue&&(await this.requestQueue.drop(),this.requestQueue=void 0),this.pendingRequests.clear()}async blockResources(t,e){return e&&this.blockedTypes.clear(),t.forEach(t=>this.blockedTypes.add(t)),t.length}getContent(){return this.lastResponse?Promise.resolve(this.lastResponse):Promise.reject(new Error("No content fetched yet. 
Call goto() first."))}async headers(t,e){if(void 0===t)return{...this.hdrs};if("string"==typeof t&&void 0===e)return this.hdrs[t.toLowerCase()]||"";if(null!==t&&"object"==typeof t){const i={};for(const[e,s]of Object.entries(t))i[e.toLowerCase()]=String(s);return this.hdrs=!0===e?i:{...this.hdrs,...i},!0}return"string"==typeof t&&("string"==typeof e?this.hdrs[t.toLowerCase()]=e:null===e&&delete this.hdrs[t.toLowerCase()],!0)}async cookies(t){return Array.isArray(t)?(this.jar=[...t],!0):null===t?(this.jar=[],!0):[...this.jar]}async dispose(){await this.cleanup()}};async function x(t,e){const i=function(t,e){if(!t||!e?.length)return null;const i=new URL(t);let s=e.find(t=>t.domain===i.hostname);s||(s=e.find(t=>i.hostname.endsWith(t.domain)));if(!s)return null;if(s.pathScope?.length){if(!s.pathScope.some(t=>i.pathname.startsWith(t)))return null}return s}(e?.url||t.url,t.sites),s=t.engine||i?.engine||"auto";let r=await v.create(t,{engine:s});return r||(r=await v.create(t,{engine:"http"})),r}v.registry=new Map;var b=class{constructor(t={}){this.options=t,this.closed=!1,this.id=h(),this.context=this.createContext(t)}async execute(t){await this.ensureEngine(t);const e=r.create(t);if(!e)throw new Error(`Unknown action: ${t.id||t.name}`);let i,s;this.context.internal.actionIndex=(this.context.internal.actionIndex||0)+1,this.context.currentAction={...t,index:this.context.internal.actionIndex,startedAt:Date.now()};try{return i=await e.execute(this.context,t),i}catch(t){throw s=t,s}finally{this.context.currentAction=void 0}}async executeAll(t){try{for(let e=0;e<t.length;e++){const i=t[e];await this.execute(i)}const e=await this.execute({id:"getContent"});return{result:e?.result,outputs:this.getOutputs()}}catch(t){throw t}}getOutputs(){return this.context.outputs}async dispose(){if(this.closed)return;const 
t=this.context.eventBus;t.emit("session:closing",{sessionId:this.id});try{await(this.context.internal.engine?.dispose())}finally{this.closed=!0}t.emit("session:closed",{sessionId:this.id})}async ensureEngine(t){if(this.closed)throw new Error("Session is closed");if(!this.context.internal.engine){const e=t?.params?.url??this.context.url;if(!await x(this.context,{url:e}))throw new Error("No engine found")}}createContext(e=this.options){const i=new c;return l({...e,id:this.id,eventBus:i,outputs:{},internal:{},execute:async t=>this.execute(t),action:async function(t,e,i){return this.execute({name:t,params:e,...i})}},t)}},E=class{constructor(t={}){this.defaults=t}async createSession(t){const e={...this.defaults,...t||{}};return new b(e)}async fetch(t,e){"string"!=typeof t&&(t=(e=t).url);const i=await this.createSession(e);try{const s=e?.actions||[];t&&0!==s.findIndex(e=>"goto"===e.id&&e.params?.url===t)&&s.unshift({id:"goto",params:{url:t}});return await i.executeAll(s)}finally{await i.dispose()}}};import{CheerioCrawler as C,ProxyConfiguration as k}from"crawlee";import*as q from"cheerio";import{CommonError as S,ErrorCode as $,NotFoundError as R}from"@isdk/common-error";var P=class extends v{async _buildResponse(t){const{request:e,response:i,body:s,$:r}=t,n=r?.html();let o="string"==typeof s?s:Buffer.isBuffer(s)?s.toString("utf-8"):String(s??"");n&&n!==o&&(o=n);let a=i?.headers;if(!a&&i?.rawHeaders){a={};const t=i.rawHeaders;for(let e=0;e<t.length;e+=2)a[t[e].toLowerCase()]=t[e+1]}return{url:e.url,finalUrl:e.loadedUrl||e.url,statusCode:i?.statusCode??200,statusText:i?.statusMessage,headers:a||{},body:s,html:o,text:o}}async _querySelectorAll(t,e){const{$:i,el:s}=t;return s.find(e).toArray().map(t=>({$:i,el:i(t)}))}async _extractValue(t,e){const{el:i}=e,{attribute:s,type:r="string"}=t;if(0===i.length)return null;let n="";if(n=s?i.attr(s)??null:"html"===r?i.html():i.text().trim(),null===n)return null;switch(r){case"number":return 
parseFloat(n.replace(/[^0-9.-]+/g,""))||null;case"boolean":const t=n.toLowerCase();return"true"===t||"1"===t;default:return n}}async executeAction(t,e){const{$:i}=t;switch(e.type){case"dispose":return;case"extract":if(!i)throw new S(`Cheerio context not available for action: ${e.type}`,"extract");return this._extract(e.schema,{$:i,el:i.root()});case"click":{if(!i)throw new S(`Cheerio context not available for action: ${e.type}`,"click");const s=e.selector,r=i(s).first();let n;if(0===r.length)try{n=new URL(s,t.request.loadedUrl||t.request.url).href}catch{throw new S(`click: selector not found or invalid URL: ${s}`,"click")}else{if(!r.is("a")||!r.attr("href")){if(r.is('input[type="submit"], button[type="submit"], button, input')){const e=r.closest("form");if(e.length)return this.executeAction(t,{type:"submit",selector:e});throw new S("click: submit-like element without form","click")}throw new S(`click: unsupported element for http simulate. Selector: ${s}`,"click")}{const e=r.attr("href");n=new URL(e,t.request.loadedUrl||t.request.url).href}}const o=await t.sendRequest({url:n});return void this._updateStateAfterNavigation(t,o)}case"fill":{if(!i)throw new S(`Cheerio context not available for action: ${e.type}`),"fill";const s=i(e.selector).first();if(0===s.length)throw new S(`fill: selector not found: ${e.selector}`);if(!s.is("input, textarea, select"))throw new S(`fill: not a form field: ${e.selector}`);return s.val(e.value),void(this.lastResponse=await this.buildResponse(t))}case"waitFor":return void(e.options?.ms&&await new Promise(t=>setTimeout(t,e.options.ms)));case"pause":const s=this.ctx?.onPause;return void(s?(console.info(e.message||"Execution paused for manual intervention."),await s({message:e.message}),console.info("Resuming execution...")):console.warn("[PauseAction] was called, but no `onPause` handler was provided in fetchWeb options. 
Skipped."));case"submit":{if(!i)throw new S(`Cheerio context not available for action: ${e.type}`,"submit");const s="string"==typeof e.selector?i(e.selector).first():null!=e.selector?e.selector:i("form").first();if(0===s.length)throw new R(e.selector,"submit");const r=s.attr("action")||t.request.loadedUrl||t.request.url,n=(s.attr("method")||"GET").toUpperCase(),o=new URL(r,t.request.loadedUrl||t.request.url).href,a={};let c;if(s.find("input, select, textarea").each((t,e)=>{const s=i(e),r=s.attr("name");if(!r)return;const n=s.val();null!=n&&(a[r]=String(n))}),"GET"===n){const e=new URL(o);Object.entries(a).forEach(([t,i])=>e.searchParams.set(t,i)),c=await t.sendRequest({url:e.href,method:"GET"})}else{let i;const r={};"application/json"===(e.options?.enctype||s.attr("enctype")||"application/x-www-form-urlencoded")?(i=JSON.stringify(a),r["Content-Type"]="application/json"):(i=new URLSearchParams(a).toString(),r["Content-Type"]="application/x-www-form-urlencoded"),c=await t.sendRequest({url:o,method:"POST",body:i,headers:r})}return void this._updateStateAfterNavigation(t,c)}case"getContent":return this.buildResponse(t);default:throw new S(`Unknown action type: ${e.type}`,"CheerioFetchEngine.executeAction",$.NotSupported)}}_updateStateAfterNavigation(t,e){const i=e;let s=i.headers;if(!s&&i.rawHeaders){s={};for(let t=0;t<i.rawHeaders.length;t+=2)s[i.rawHeaders[t].toLowerCase()]=i.rawHeaders[t+1]}s=s||{};const r=i.body,n=q.load(r??"");t.$=n,t.response=i,t.body=r;const o=n.html(),a=n.text(),c=(s["content-type"]||"").split(";")[0].trim();this.lastResponse={url:t.request.url,finalUrl:i.url,statusCode:i.statusCode,statusText:i.statusMessage,headers:s,contentType:c,body:r,html:o,text:a}}_createCrawler(t){return new C(t)}_getSpecificCrawlerOptions(t){const e=this.opts?.proxy?"string"==typeof this.opts.proxy?[this.opts.proxy]:this.opts.proxy:void 0,i=e?.length?new k({proxyUrls:e}):void 
0;return{additionalMimeTypes:["text/plain"],maxRequestRetries:1,requestHandlerTimeoutSecs:t.requestHandlerTimeoutSecs,proxyConfiguration:i,preNavigationHooks:[(e,i)=>{i.throwHttpErrors=t.throwHttpErrors,this.opts?.timeoutMs&&(i.timeout={request:this.opts.timeoutMs})}]}}async goto(t,e){this.isPageActive&&this.dispatchAction({type:"dispose"}).catch(()=>{});const i="req-"+ ++this.requestCounter,s=new Promise((t,s)=>{const r=e?.timeoutMs||this.opts?.timeoutMs||3e4,n=setTimeout(()=>{this.pendingRequests.delete(i),this.navigationLock.release(),s(new S(`goto timed out after ${r}ms.`,"gotoTimeout",$.RequestTimeout))},r);this.pendingRequests.set(i,{resolve:e=>{clearTimeout(n),t(e)},reject:t=>{clearTimeout(n),s(t)}})});return this.requestQueue.addRequest({...e,url:t,headers:{...this.hdrs,...e?.headers},userData:{requestId:i},uniqueKey:`${t}-${i}`}).catch(t=>{const e=this.pendingRequests.get(i);e&&(this.pendingRequests.delete(i),this.navigationLock.release(),e.reject(t))}),await this.navigationLock,this.navigationLock=g(),s}};P.id="cheerio",P.mode="http",v.register(P);import{PlaywrightCrawler as T}from"crawlee";import{firefox as A}from"playwright";import{launchOptions as U}from"camoufox-js";import{CommonError as _,ErrorCode as j,NotFoundError as O}from"@isdk/common-error";var F=class extends v{async _buildResponse(t){const{page:e,response:i,request:s}=t;if(!e||e.isClosed())return{url:s.url,finalUrl:s.loadedUrl||s.url,statusCode:i?.status(),statusText:i?.statusText(),headers:await(i?.allHeaders())||{},body:"",html:"",text:""};const r=await e.content(),n=await e.textContent("body");return{url:e.url(),finalUrl:e.url(),statusCode:i?.status(),statusText:i?.statusText(),headers:await(i?.allHeaders())||{},body:r,html:r,text:n||""}}async _querySelectorAll(t,e){return t.locator(e).all()}async _extractValue(t,e){const{attribute:i,type:s="string"}=t;if(0===await e.count())return null;let r="";if(r=i?await e.getAttribute(i):"html"===s?await e.innerHTML():await 
e.textContent(),null===r)return null;switch(r=r.trim(),s){case"number":return parseFloat(r.replace(/[^0-9.-]+/g,""))||null;case"boolean":const t=r.toLowerCase();return"true"===t||"1"===t;default:return r}}async executeAction(t,e){const{page:i}=t,s=this.opts?.timeoutMs||3e4;switch(e.type){case"navigate":{const s=await i.goto(e.url,{waitUntil:e.opts?.waitUntil||"domcontentloaded",timeout:this.opts?.timeoutMs||3e4});s&&(t={...t,response:s});const r=await this.buildResponse(t);return this.lastResponse=r,r}case"extract":{const s=await this._extract(e.schema,i.locator("body"));return this.lastResponse=await this.buildResponse(t),s}case"click":{await i.click(e.selector,{timeout:s}),await i.waitForLoadState("networkidle",{timeout:s});const r=await this.buildResponse(t);return void(this.lastResponse=r)}case"fill":await i.fill(e.selector,e.value,{timeout:s});const r=await this.buildResponse(t);return void(this.lastResponse=r);case"waitFor":try{e.options?.selector&&await i.waitForSelector(e.options.selector,{timeout:s}),e.options?.networkIdle&&await i.waitForLoadState("networkidle",{timeout:s})}catch(t){if(!1!==e.options?.failOnTimeout)throw t}return void(e.options?.ms&&await i.waitForTimeout(e.options.ms));case"submit":{const r=e.selector||"form",n=i.locator(r).first();if(0===await n.count())throw new O(r,"submit");if("application/json"===(e.options?.enctype||"application/x-www-form-urlencoded")){const t=await n.elementHandle();if(!t)throw new _(`submit: could not get form handle for ${r}`,"submit");const e=await t.evaluate(async t=>{const e=new FormData(t),i={};e.forEach((t,e)=>{i[e]=t.toString()});const s=await fetch(t.action,{method:t.method,headers:{"Content-Type":"application/json"},body:JSON.stringify(i)}),r=await s.text();return{status:s.status,statusText:s.statusText,headers:Object.fromEntries(s.headers.entries()),body:r,html:r,text:r,url:t.action,finalUrl:s.url}});return await t.dispose(),await i.setContent(e.html),void(this.lastResponse=e)}return await 
n.evaluate(t=>t.submit()),await i.waitForLoadState("networkidle",{timeout:s}),void(this.lastResponse=await this.buildResponse(t))}case"pause":{const t=this.ctx?.onPause;return void(t?(console.info(e.message||"Execution paused for manual intervention."),await t({message:e.message}),console.info("Resuming execution...")):console.warn("[PauseAction] was called, but no `onPause` handler was provided in fetchWeb options. Skipped."))}case"getContent":return this.buildResponse(t);default:throw new _(`Unknown action type: ${e.type}`,"PlaywrightFetchEngine.executeAction",j.NotSupported)}}_createCrawler(t){return new T(t)}async _getSpecificCrawlerOptions(t){const e=t.browser?.headless??!0,i={maxRequestRetries:t.retries||3,headless:e,requestHandlerTimeoutSecs:t.requestHandlerTimeoutSecs,preNavigationHooks:[async({page:e,request:i},s)=>{s.throwHttpErrors=t.throwHttpErrors,this.jar.length>0&&await e.context().addCookies(this.jar.map(t=>({...t,url:i.url,domain:t.domain||new URL(i.url).hostname})));const r=this.blockedTypes;r.size>0&&await e.route("**/*",t=>{r.has(t.request().resourceType())?t.abort():t.continue()})}]};if(this.opts?.antibot){i.browserPoolOptions={useFingerprints:!1};const t=await U({headless:e});i.launchContext={launcher:A,launchOptions:t},i.postNavigationHooks=[async({page:t,handleCloudflareChallenge:e})=>{await e()}]}return i}async goto(t,e){if(this.isPageActive)return this.dispatchAction({type:"navigate",url:t,opts:e});if(!this.requestQueue)throw new _("RequestQueue not initialized","goto");const i="req-"+ ++this.requestCounter,s=new Promise((t,e)=>{this.pendingRequests.set(i,{resolve:t,reject:e})});return await this.requestQueue.addRequest({url:t,headers:this.hdrs,userData:{requestId:i,waitUntil:e?.waitUntil||"domcontentloaded"},uniqueKey:`${t}-${i}`}),s}};F.id="playwright",F.mode="browser",v.register(F);var N=class extends r{async onExecute(t,e){const{selector:i,...s}=e?.params||{};if(!i)throw new Error("Selector is required for click action");await 
this.delegateToEngine(t,"click",i,s)}};N.id="click",N.returnType="none",N.capabilities={http:"simulate",browser:"native"},r.register(N);var H=class extends r{async onExecute(t,e){const{selector:i,value:s,...r}=e?.params||{};if(!i)throw new Error("Selector is required for fill action");if(void 0===s)throw new Error("Value is required for fill action");await this.delegateToEngine(t,"fill",i,s,r)}};H.id="fill",H.returnType="none",H.capabilities={http:"simulate",browser:"native"},r.register(H);var L=class extends r{async onExecute(t,e){return await this.delegateToEngine(t,"getContent",e?.params)}};L.id="getContent",L.returnType="response",L.capabilities={http:"native",browser:"native"},r.register(L);var M=class extends r{async onExecute(t,e,i){const s=e?.params,r=s?.url||t.url;if(!r)throw new Error("URL is required for goto action");const n=t.internal.engine;if(!n)throw new Error("No engine available");t.url=r;return await n.goto(r,s)}};M.id="goto",M.returnType="response",M.capabilities={http:"native",browser:"native"},r.register(M);var z=class extends r{async onExecute(t,e){const{selector:i,...s}=e?.params||{};await this.delegateToEngine(t,"submit",i,s)}};z.id="submit",z.returnType="none",z.capabilities={http:"simulate",browser:"native"},r.register(z);var D=class extends r{async onExecute(t,e){const i=t.internal.engine;if(!i)throw new Error("No engine available");await i.waitFor(e?.params)}};D.id="waitFor",D.returnType="none",D.capabilities={http:"native",browser:"native"},r.register(D);var B=class extends r{async onExecute(t,e){const i=e?.params;if(!i)throw new Error("Schema is required for extract action");return this.delegateToEngine(t,"extract",i)}};B.id="extract",B.returnType="any",B.capabilities={http:"native",browser:"native"},r.register(B);var G=class extends r{async onExecute(t,e){const{selector:i,message:s,attribute:r}=e?.params||{},n=t.internal.engine;if("browser"===n?.mode){if(i){if(!await(n?.extract({selector:i,attribute:r})))return}n&&"pause"in n?await 
n.pause(s):console.warn("[PauseAction] was called, but the current engine does not support `pause`. Skipped.")}else console.warn("[PauseAction] can only run in browser engine. Skipped.")}};async function I(t,e){return(new E).fetch(t,e)}G.id="pause",G.capabilities={http:"native",browser:"native"},G.returnType="none",r.register(G);export{P as CheerioFetchEngine,N as ClickAction,t as DefaultFetcherProperties,B as ExtractAction,r as FetchAction,i as FetchActionResultStatus,v as FetchEngine,b as FetchSession,e as FetcherOptionKeys,H as FillAction,L as GetContentAction,M as GotoAction,G as PauseAction,F as PlaywrightFetchEngine,z as SubmitAction,D as WaitForAction,E as WebFetcher,I as fetchWeb};
+ var t={engine:"auto",enableSmart:!0,useSiteRegistry:!0,antibot:!1,headers:{},cookies:[],reuseCookies:!0,throwHttpErrors:void 0,proxy:[],blockResources:[],ignoreSslErrors:!0,browser:{engine:"playwright",headless:!0,waitUntil:"domcontentloaded"},http:{method:"GET"},timeoutMs:6e4,requestHandlerTimeoutSecs:void 0,maxConcurrency:1,maxRequestsPerMinute:1e3,delayBetweenRequestsMs:0,retries:0,sites:[]},e=Object.keys(t).concat(["actions","onPause"]),s=(t=>(t[t.Failed=0]="Failed",t[t.Success=1]="Success",t[t.Skipped=2]="Skipped",t))(s||{}),i=class t{static register(t){const e=t.id;if(!e)throw new Error("FetchAction.register: actionClass.id is required");this.registry.set(e,t)}static get(t){return this.registry.get(t)}static create(e){const s="string"==typeof e?e:e.id||e.name||e.action;if(!s)throw new Error("Action must have id, name or action");const i=s instanceof t?s.constructor:this.registry.get(s);return i?new i:void 0}static has(t){return this.registry.has(t)}static list(){return Array.from(this.registry.keys())}static getCapability(t){return this.capabilities[t]??"noop"}getCapability(t){return this.constructor.getCapability(t)}get id(){return this.constructor.id}get returnType(){return this.constructor.returnType}get capabilities(){return this.constructor.capabilities}async delegateToEngine(t,e,...s){const i=t.internal.engine;if(!i)throw new Error("No engine available");if("function"!=typeof i[e])throw new Error(`Engine does not have a method named '${String(e)}'`);return await i[e](...s)}installCollectors(e,s){const i=s?.collectors;if(!i?.length)return;const r=[],c=new Set;for(const s of i){const i=n(s.activateOn),l=n(s.collectOn),u=n(s.deactivateOn),h=!(s.background??!0),w=t.create(s);if(!w)continue;let f=!1,d=!1,p=0;const y=async t=>{if(!f&&!d){f=!0;try{await(w.onBeforeExec?.(e,s))}catch(t){e.eventBus.emit("collector:error",{action:this.id,collector:w.id,phase:"before",error:t})}}},m=async(t,i)=>{if(!d){f||await y(i);try{const 
r=Promise.resolve(w.onExecute?.(e,s,i)).then(i=>{var r,n;if(s.storeAs){((r=e.outputs)[n=s.storeAs]||(r[n]=[])).push(i)}return e.eventBus.emit("collector:result",{action:this.id,collector:s.id||s.name,event:t,result:i}),i}).catch(i=>{e.eventBus.emit("collector:error",{action:this.id,collector:s.id||s.name,event:t,phase:"exec",error:i})}).finally(()=>{p++});h&&(c.add(r),r.finally(()=>c.delete(r)))}catch(s){e.eventBus.emit("collector:error",{action:this.id,collector:w.id,event:t,phase:"exec",error:s})}}},g=async()=>{if(!d){0===p&&m("collector:after"),d=!0;try{await(w.onAfterExec?.(e,s))}catch(t){e.eventBus.emit("collector:error",{action:this.id,collector:s.id||s.name,phase:"after",error:t})}finally{e.eventBus.emit("collector:end",{action:this.id,collector:s.id||s.name}),x.forEach(t=>t())}}},v=o(e,i,y),x=a(e,l,m),b=o(e,u,g);if(r.push(...v,...x,...b),!i.length&&!l.length&&!u.length){const t=()=>{g()};e.eventBus.once(`action:${this.id}.end`,t),r.push(()=>e.eventBus.off("fetcher:action:end",t))}}return r.length||c.size>0?{cleanup:()=>r.forEach(t=>t()),awaitExecPendings:async()=>{c.size>0&&await Promise.allSettled(Array.from(c))}}:void 0}async beforeExec(t,e){t.internal.actionStack||(t.internal.actionStack=[]);const s=t.internal.actionStack,i=s.length,r=s.length>0?s[s.length-1].id:void 0,n={...e,id:this.id,depth:i,parent:r};s.push(n),t.currentAction=n;const o={action:this,context:t,options:e,depth:i,stack:[...s]};t.eventBus.emit(`action:${this.id}.start`,o),t.eventBus.emit("action:start",o),await(this.onBeforeExec?.(t,e));return{entry:o,collectors:this.installCollectors(t,e)}}async afterExec(t,e,s,i){const r=t.internal.actionStack,n=r.length-1,o=i?.collectors;try{await(o?.awaitExecPendings()),t.lastResult=s,"response"!==s?.returnType||s.error||(t.lastResponse=s.result),e?.storeAs&&(t.outputs[e.storeAs]=s?.result),s?.error&&(t.currentAction.error=s.error),await(this.onAfterExec?.(t,e));const 
i={action:this,context:t,options:e,result:s,depth:n,stack:[...r]};s?.error&&(i.error=s.error);try{t.eventBus.emit(`action:${this.id}.end`,i)}catch(t){}try{t.eventBus.emit("action:end",i)}catch(t){}}finally{try{o?.cleanup()}finally{r.pop();const e=r.length;t.currentAction=e>0?r[e-1]:void 0}}}async execute(t,e){e?.args&&!e.params&&(e.params=e.args);const s=await this.beforeExec(t,e);let i;try{const s=e?.failOnError??!0;return t.throwHttpErrors=s,i=await this.onExecute(t,e),i&&i.returnType||(i={status:1,returnType:this.returnType??"any",result:i}),i}catch(s){if(i={status:0,error:s,meta:{id:this.id,engineType:t.engine,capability:this.getCapability(t.engine)}},e?.failOnError)throw s;return i}finally{await this.afterExec(t,e,i,s)}}};i.registry=new Map,i.returnType="any",i.capabilities={http:"noop",browser:"noop"};var r=i;function n(t){return t?Array.isArray(t)?t:[t]:[]}function o(t,e,s){const i=[];for(const r of e)if("string"==typeof r||r instanceof RegExp){const e=(...t)=>{s(t[0])};t.eventBus.once(r,e),i.push(()=>t.eventBus.off(r,e))}return i}function a(t,e,s){const i=[];for(const r of e)if("string"==typeof r||r instanceof RegExp){const e=t=>s(r,t);t.eventBus.on(r,e),i.push(()=>t.eventBus.off(r,e))}return i}import{EventEmitter as c}from"events-ex";import{defaultsDeep as l}from"lodash-es";import{customAlphabet as u}from"nanoid";var h=u("0123456789abcdefghijklmnopqrstuvwxyz",12);import{defaultsDeep as w,merge as f}from"lodash-es";import{EventEmitter as d}from"events-ex";import{CommonError as p}from"@isdk/common-error";import{Configuration as y,RequestQueue as m}from"crawlee";function g(){let t=()=>{};const e=new Promise(e=>{t=e});return e.release=t,e}y.getGlobalConfig().set("persistStorage",!1);var v=class{constructor(){this.hdrs={},this.jar=[],this.pendingRequests=new Map,this.requestCounter=0,this.actionEmitter=new d,this.isPageActive=!1,this.navigationLock=function(){const t=g();return t.release(),t}(),this.blockedTypes=new Set}static register(t){const 
e=t.id;if(!e)throw new Error("Engine must define static id");if(this.registry.has(e))throw new Error(`Engine id duplicated: ${e}`);this.registry.set(e,t)}static get(t){return this.registry.get(t)}static getByMode(t){for(const[e,s]of this.registry.entries())if(s.mode===t)return s}static async create(e,s){const i=w(s,e,t),r=i.engine??e.engine,n=r?this.get(r)??this.getByMode(r):null;if(n){const t=new n;return await t.initialize(e,i),t}}async _extract(t,e){const s=t.type;if(!e)return"array"===s?[]:null;if("object"===s){const{selector:s,properties:i}=t;let r=e;if(s){const t=await this._querySelectorAll(e,s);r=t.length>0?t[0]:null}if(!r)return null;const n={};for(const t in i)n[t]=await this._extract(i[t],r);return n}if("array"===s){const{selector:s,items:i}=t,r=s?await this._querySelectorAll(e,s):[e],n=[];for(const t of r)n.push(await this._extract(i,t));return n}const{selector:i}=t;let r=e;if(i){const t=await this._querySelectorAll(e,i);r=t.length>0?t[0]:null}return r?this._extractValue(t,r):null}async buildResponse(t){const e=await this._buildResponse(t),s=e.headers["content-type"]||"";return e.contentType=s.split(";")[0].trim(),e}waitFor(t){return this.dispatchAction({type:"waitFor",options:t})}click(t){return this.dispatchAction({type:"click",selector:t})}fill(t,e){return this.dispatchAction({type:"fill",selector:t,value:e})}submit(t,e){return this.dispatchAction({type:"submit",selector:t,options:e})}pause(t){return this.dispatchAction({type:"pause",message:t})}extract(t){const e=this._normalizeSchema(t);return this.dispatchAction({type:"extract",schema:e})}_normalizeSchema(t){const e=JSON.parse(JSON.stringify(t));if(e.properties)for(const t in e.properties)e.properties[t]=this._normalizeSchema(e.properties[t]);if(e.items&&(e.items=this._normalizeSchema(e.items)),"array"===e.type&&(e.attribute&&!e.items&&(e.items={attribute:e.attribute},delete 
e.attribute),e.items||(e.items={type:"string"})),e.selector&&(e.has||e.exclude)){const{selector:t,has:s,exclude:i}=e,r=t.split(",").map(t=>{let e=t.trim();return s&&(e=`${e}:has(${s})`),i&&(e=`${e}:not(${i})`),e}).join(", ");e.selector=r,delete e.has,delete e.exclude}return e}get id(){return this.constructor.id}async getState(){return{cookies:await this.cookies()}}get mode(){return this.constructor.mode}get context(){return this.ctx}async initialize(t,e){if(this.ctx)return;f(t,e),this.ctx=t,this.opts=t,this.hdrs=function(t){const e={};if(t&&"object"==typeof t)for(const[s,i]of Object.entries(t))e[s.toLowerCase()]=i;return e}(t.headers),this.jar=[...t.cookies??[]],t.internal||(t.internal={}),t.internal.engine=this,t.engine=this.mode,this.actionEmitter.setMaxListeners(100),this.requestQueue=await m.open();const s=await this._getSpecificCrawlerOptions(t),i={...w(s,{requestQueue:this.requestQueue,maxConcurrency:1,minConcurrency:1,useSessionPool:!0,persistCookiesPerSession:!0,sessionPoolOptions:{maxPoolSize:1,persistenceOptions:{enable:!1},sessionOptions:{maxUsageCount:1e3,maxErrorScore:3}}}),requestHandler:this._requestHandler.bind(this),errorHandler:this._failedRequestHandler.bind(this),failedRequestHandler:this._failedRequestHandler.bind(this)};this.crawler=this._createCrawler(i),this.crawler.run().then(()=>{this.isCrawlerReady=!0}).catch(t=>{this.isCrawlerReady=!1,console.error("Crawler background error:",t)})}async cleanup(){await(this._cleanup?.()),await this._commonCleanup();const t=this.ctx;t&&t.internal?.engine===this&&(t.internal.engine=void 0),this.ctx=void 0,this.opts=void 0}async _executePendingActions(t){await new Promise(e=>{const s=async({action:e,resolve:s,reject:i})=>{try{if("dispose"===e.type)return this.actionEmitter.emit("dispose"),void s();s(await this.executeAction(t,e))}catch(t){i(t)}};this.actionEmitter.on("dispatch",s),this.actionEmitter.once("dispose",()=>{this.actionEmitter.removeListener("dispatch",s),e()})})}async 
_sharedRequestHandler(t){try{const{request:e}=t;this.isPageActive=!0;const s=this.pendingRequests.get(e.userData.requestId);if(s){const i=await this._buildResponse(t),r=!i.statusCode||i.statusCode>=400;if(this.ctx?.throwHttpErrors&&r){const t=new p(`Request for ${i.finalUrl} failed with status ${i.statusCode||"N/A"}`,"request",i.statusCode);s.reject(t)}else this.lastResponse=i,s.resolve(i);this.pendingRequests.delete(e.userData.requestId)}await this._executePendingActions(t)}finally{this.isPageActive=!1,this.navigationLock.release()}}async _sharedFailedRequestHandler(t,e){const{request:s}=t,i=this.pendingRequests.get(s.userData.requestId);if(i&&e&&this.ctx?.throwHttpErrors){this.pendingRequests.delete(s.userData.requestId);const t=e.response,r=t?.statusCode||500,n=t?.url?t.url:s.url,o=new p(`Request${n?" for "+n:""} failed: ${e.message}`,"request",r);i.reject(o)}return this._sharedRequestHandler(t)}async dispatchAction(t){if(!this.isPageActive)throw new Error("No active page. Call goto() before performing actions.");return new Promise((e,s)=>{this.actionEmitter.emit("dispatch",{action:t,resolve:e,reject:s})})}async _requestHandler(t){await this._sharedRequestHandler(t)}async _failedRequestHandler(t,e){await this._sharedFailedRequestHandler(t,e)}async _commonCleanup(){if(this.isPageActive&&await this.dispatchAction({type:"dispose"}).catch(()=>{}),this.pendingRequests.size>0)for(const[,t]of this.pendingRequests)t.reject(new Error("Cleanup:Request cancelled"));if(this.actionEmitter.removeAllListeners(),this.crawler){try{await(this.crawler.teardown?.())}catch(t){console.error("ccrawler teardown error:",t)}this.crawler=void 0}this.isCrawlerReady=void 0,this.requestQueue&&(await this.requestQueue.drop(),this.requestQueue=void 0),this.pendingRequests.clear()}async blockResources(t,e){return e&&this.blockedTypes.clear(),t.forEach(t=>this.blockedTypes.add(t)),t.length}getContent(){return this.lastResponse?Promise.resolve(this.lastResponse):Promise.reject(new Error("No 
content fetched yet. Call goto() first."))}async headers(t,e){if(void 0===t)return{...this.hdrs};if("string"==typeof t&&void 0===e)return this.hdrs[t.toLowerCase()]||"";if(null!==t&&"object"==typeof t){const s={};for(const[e,i]of Object.entries(t))s[e.toLowerCase()]=String(i);return this.hdrs=!0===e?s:{...this.hdrs,...s},!0}return"string"==typeof t&&("string"==typeof e?this.hdrs[t.toLowerCase()]=e:null===e&&delete this.hdrs[t.toLowerCase()],!0)}async cookies(t){return Array.isArray(t)?(this.jar=[...t],!0):null===t?(this.jar=[],!0):[...this.jar]}async dispose(){await this.cleanup()}};async function x(t,e){const s=function(t,e){if(!t||!e?.length)return null;const s=new URL(t);let i=e.find(t=>t.domain===s.hostname);i||(i=e.find(t=>s.hostname.endsWith(t.domain)));if(!i)return null;if(i.pathScope?.length){if(!i.pathScope.some(t=>s.pathname.startsWith(t)))return null}return i}(e?.url||t.url,t.sites),i=t.engine||s?.engine||"auto";let r=await v.create(t,{engine:i});return r||(r=await v.create(t,{engine:"http"})),r}v.registry=new Map;var b=class{constructor(t={}){this.options=t,this.closed=!1,this.id=h(),this.context=this.createContext(t)}async execute(t){await this.ensureEngine(t);const e=r.create(t);if(!e)throw new Error(`Unknown action: ${t.id||t.name}`);let s,i;this.context.internal.actionIndex=(this.context.internal.actionIndex||0)+1,this.context.currentAction={...t,index:this.context.internal.actionIndex,startedAt:Date.now()};try{return s=await e.execute(this.context,t),s}catch(t){throw i=t,i}finally{this.context.currentAction=void 0}}async executeAll(t){let e=0;try{for(;e<t.length;){const s=t[e];await this.execute(s),e++}const s=await this.execute({id:"getContent"});return{result:s?.result,outputs:this.getOutputs()}}catch(t){throw t.actionIndex=e,t}}getOutputs(){return this.context.outputs}async getState(){return this.context.internal.engine?.getState()}async dispose(){if(this.closed)return;const 
t=this.context.eventBus;t.emit("session:closing",{sessionId:this.id});try{await(this.context.internal.engine?.dispose())}finally{this.closed=!0}t.emit("session:closed",{sessionId:this.id})}async ensureEngine(t){if(this.closed)throw new Error("Session is closed");if(!this.context.internal.engine){const e=t?.params?.url??this.context.url;if(!await x(this.context,{url:e}))throw new Error("No engine found")}}createContext(e=this.options){const s=new c;return l({...e,id:this.id,eventBus:s,outputs:{},internal:{},execute:async t=>this.execute(t),action:async function(t,e,s){return this.execute({name:t,params:e,...s})}},t)}},E=class{constructor(t={}){this.defaults=t}async createSession(t){const e={...this.defaults,...t||{}};return new b(e)}async fetch(t,e){"string"!=typeof t&&(t=(e=t).url);const s=await this.createSession(e);try{const i=e?.actions||[];t&&0!==i.findIndex(e=>"goto"===e.id&&e.params?.url===t)&&i.unshift({id:"goto",params:{url:t}});return await s.executeAll(i)}finally{await s.dispose()}}};import{CheerioCrawler as k,ProxyConfiguration as S}from"crawlee";import*as q from"cheerio";import{CommonError as C,ErrorCode as $,NotFoundError as R}from"@isdk/common-error";var P=class extends v{async _buildResponse(t){const{request:e,response:s,body:i,$:r}=t,n=r?.html();let o="string"==typeof i?i:Buffer.isBuffer(i)?i.toString("utf-8"):String(i??"");n&&n!==o&&(o=n);let a=s?.headers;if(!a&&s?.rawHeaders){a={};const t=s.rawHeaders;for(let e=0;e<t.length;e+=2)a[t[e].toLowerCase()]=t[e+1]}return{url:e.url,finalUrl:e.loadedUrl||e.url,statusCode:s?.statusCode??200,statusText:s?.statusMessage,headers:a||{},cookies:t.session?.getCookies(e.url),body:i,html:o,text:o}}async _querySelectorAll(t,e){const{$:s,el:i}=t;return i.find(e).toArray().map(t=>({$:s,el:s(t)}))}async _extractValue(t,e){const{el:s}=e,{attribute:i,type:r="string"}=t;if(0===s.length)return null;let n="";if(n=i?s.attr(i)??null:"html"===r?s.html():s.text().trim(),null===n)return null;switch(r){case"number":return 
parseFloat(n.replace(/[^0-9.-]+/g,""))||null;case"boolean":const t=n.toLowerCase();return"true"===t||"1"===t;default:return n}}async executeAction(t,e){const{$:s}=t;switch(e.type){case"dispose":return;case"extract":if(!s)throw new C(`Cheerio context not available for action: ${e.type}`,"extract");return this._extract(e.schema,{$:s,el:s.root()});case"click":{if(!s)throw new C(`Cheerio context not available for action: ${e.type}`,"click");const i=e.selector,r=s(i).first();let n;if(0===r.length)try{n=new URL(i,t.request.loadedUrl||t.request.url).href}catch{throw new C(`click: selector not found or invalid URL: ${i}`,"click")}else{if(!r.is("a")||!r.attr("href")){if(r.is('input[type="submit"], button[type="submit"], button, input')){const e=r.closest("form");if(e.length)return this.executeAction(t,{type:"submit",selector:e});throw new C("click: submit-like element without form","click")}throw new C(`click: unsupported element for http simulate. Selector: ${i}`,"click")}{const e=r.attr("href");n=new URL(e,t.request.loadedUrl||t.request.url).href}}const o=await t.sendRequest({url:n});return void this._updateStateAfterNavigation(t,o)}case"fill":{if(!s)throw new C(`Cheerio context not available for action: ${e.type}`),"fill";const i=s(e.selector).first();if(0===i.length)throw new C(`fill: selector not found: ${e.selector}`);if(!i.is("input, textarea, select"))throw new C(`fill: not a form field: ${e.selector}`);return i.val(e.value),void(this.lastResponse=await this.buildResponse(t))}case"waitFor":return void(e.options?.ms&&await new Promise(t=>setTimeout(t,e.options.ms)));case"pause":const i=this.ctx?.onPause;return void(i?(console.info(e.message||"Execution paused for manual intervention."),await i({message:e.message}),console.info("Resuming execution...")):console.warn("[PauseAction] was called, but no `onPause` handler was provided in fetchWeb options. 
Skipped."));case"submit":{if(!s)throw new C(`Cheerio context not available for action: ${e.type}`,"submit");const i="string"==typeof e.selector?s(e.selector).first():null!=e.selector?e.selector:s("form").first();if(0===i.length)throw new R(e.selector,"submit");const r=i.attr("action")||t.request.loadedUrl||t.request.url,n=(i.attr("method")||"GET").toUpperCase(),o=new URL(r,t.request.loadedUrl||t.request.url).href,a={};let c;if(i.find("input, select, textarea").each((t,e)=>{const i=s(e),r=i.attr("name");if(!r)return;const n=i.val();null!=n&&(a[r]=String(n))}),"GET"===n){const e=new URL(o);Object.entries(a).forEach(([t,s])=>e.searchParams.set(t,s)),c=await t.sendRequest({url:e.href,method:"GET"})}else{let s;const r={};"application/json"===(e.options?.enctype||i.attr("enctype")||"application/x-www-form-urlencoded")?(s=JSON.stringify(a),r["Content-Type"]="application/json"):(s=new URLSearchParams(a).toString(),r["Content-Type"]="application/x-www-form-urlencoded"),c=await t.sendRequest({url:o,method:"POST",body:s,headers:r})}return void this._updateStateAfterNavigation(t,c)}case"getContent":return this.buildResponse(t);default:throw new C(`Unknown action type: ${e.type}`,"CheerioFetchEngine.executeAction",$.NotSupported)}}_updateStateAfterNavigation(t,e){const s=e;let i=s.headers;if(!i&&s.rawHeaders){i={};for(let t=0;t<s.rawHeaders.length;t+=2)i[s.rawHeaders[t].toLowerCase()]=s.rawHeaders[t+1]}i=i||{};const r=s.body,n=q.load(r??"");t.$=n,t.response=s,t.body=r;const o=n.html(),a=n.text(),c=(i["content-type"]||"").split(";")[0].trim();this.lastResponse={url:t.request.url,finalUrl:s.url,statusCode:s.statusCode,statusText:s.statusMessage,headers:i,contentType:c,body:r,html:o,text:a}}_createCrawler(t){return new k(t)}_getSpecificCrawlerOptions(t){const e=this.opts?.proxy?"string"==typeof this.opts.proxy?[this.opts.proxy]:this.opts.proxy:void 0,s=e?.length?new S({proxyUrls:e}):void 
0;return{additionalMimeTypes:["text/plain"],maxRequestRetries:1,requestHandlerTimeoutSecs:t.requestHandlerTimeoutSecs,proxyConfiguration:s,preNavigationHooks:[({session:e,request:s},i)=>{if(i.throwHttpErrors=t.throwHttpErrors,this.opts?.timeoutMs&&(i.timeout={request:this.opts.timeoutMs}),this.jar.length>0&&e)for(const t of this.jar){const i=`${t.name}=${t.value}`;e.setCookie(i,s.url)}}]}}async goto(t,e){this.isPageActive&&this.dispatchAction({type:"dispose"}).catch(()=>{});const s="req-"+ ++this.requestCounter,i=new Promise((t,i)=>{const r=e?.timeoutMs||this.opts?.timeoutMs||3e4,n=setTimeout(()=>{this.pendingRequests.delete(s),this.navigationLock.release(),i(new C(`goto timed out after ${r}ms.`,"gotoTimeout",$.RequestTimeout))},r);this.pendingRequests.set(s,{resolve:e=>{clearTimeout(n),t(e)},reject:t=>{clearTimeout(n),i(t)}})});return this.requestQueue.addRequest({...e,url:t,headers:{...this.hdrs,...e?.headers},userData:{requestId:s},uniqueKey:`${t}-${s}`}).catch(t=>{const e=this.pendingRequests.get(s);e&&(this.pendingRequests.delete(s),this.navigationLock.release(),e.reject(t))}),await this.navigationLock,this.navigationLock=g(),i}};P.id="cheerio",P.mode="http",v.register(P);import{PlaywrightCrawler as T}from"crawlee";import{firefox as A}from"playwright";import{launchOptions as U}from"camoufox-js";import{CommonError as _,ErrorCode as j,NotFoundError as O}from"@isdk/common-error";var F=class extends v{async _buildResponse(t){const{page:e,response:s,request:i}=t;if(!e||e.isClosed())return{url:i.url,finalUrl:i.loadedUrl||i.url,statusCode:s?.status(),statusText:s?.statusText(),headers:await(s?.allHeaders())||{},cookies:[],body:"",html:"",text:""};const r=await e.content(),n=await e.textContent("body");return{url:e.url(),finalUrl:e.url(),statusCode:s?.status(),statusText:s?.statusText(),headers:await(s?.allHeaders())||{},cookies:await e.context().cookies(),body:r,html:r,text:n||""}}async _querySelectorAll(t,e){return t.locator(e).all()}async 
_extractValue(t,e){const{attribute:s,type:i="string"}=t;if(0===await e.count())return null;let r="";if(r=s?await e.getAttribute(s):"html"===i?await e.innerHTML():await e.textContent(),null===r)return null;switch(r=r.trim(),i){case"number":return parseFloat(r.replace(/[^0-9.-]+/g,""))||null;case"boolean":const t=r.toLowerCase();return"true"===t||"1"===t;default:return r}}async executeAction(t,e){const{page:s}=t,i=this.opts?.timeoutMs||3e4;switch(e.type){case"navigate":{const i=await s.goto(e.url,{waitUntil:e.opts?.waitUntil||"domcontentloaded",timeout:this.opts?.timeoutMs||3e4});i&&(t={...t,response:i});const r=await this.buildResponse(t);return this.lastResponse=r,r}case"extract":{const i=await this._extract(e.schema,s.locator("body"));return this.lastResponse=await this.buildResponse(t),i}case"click":{await s.click(e.selector,{timeout:i}),await s.waitForLoadState("networkidle",{timeout:i});const r=await this.buildResponse(t);return void(this.lastResponse=r)}case"fill":await s.fill(e.selector,e.value,{timeout:i});const r=await this.buildResponse(t);return void(this.lastResponse=r);case"waitFor":try{e.options?.selector&&await s.waitForSelector(e.options.selector,{timeout:i}),e.options?.networkIdle&&await s.waitForLoadState("networkidle",{timeout:i})}catch(t){if(!1!==e.options?.failOnTimeout)throw t}return void(e.options?.ms&&await s.waitForTimeout(e.options.ms));case"submit":{const r=e.selector||"form",n=s.locator(r).first();if(0===await n.count())throw new O(r,"submit");if("application/json"===(e.options?.enctype||"application/x-www-form-urlencoded")){const t=await n.elementHandle();if(!t)throw new _(`submit: could not get form handle for ${r}`,"submit");const e=await t.evaluate(async t=>{const e=new FormData(t),s={};e.forEach((t,e)=>{s[e]=t.toString()});const i=await fetch(t.action,{method:t.method,headers:{"Content-Type":"application/json"},body:JSON.stringify(s)}),r=await 
i.text();return{status:i.status,statusText:i.statusText,headers:Object.fromEntries(i.headers.entries()),body:r,html:r,text:r,url:t.action,finalUrl:i.url}});return await t.dispose(),await s.setContent(e.html),void(this.lastResponse=e)}return await n.evaluate(t=>t.submit()),await s.waitForLoadState("networkidle",{timeout:i}),void(this.lastResponse=await this.buildResponse(t))}case"pause":{const t=this.ctx?.onPause;return void(t?(console.info(e.message||"Execution paused for manual intervention."),await t({message:e.message}),console.info("Resuming execution...")):console.warn("[PauseAction] was called, but no `onPause` handler was provided in fetchWeb options. Skipped."))}case"getContent":return this.buildResponse(t);default:throw new _(`Unknown action type: ${e.type}`,"PlaywrightFetchEngine.executeAction",j.NotSupported)}}_createCrawler(t){return new T(t)}async _getSpecificCrawlerOptions(t){const e=t.browser?.headless??!0,s={maxRequestRetries:t.retries||3,headless:e,requestHandlerTimeoutSecs:t.requestHandlerTimeoutSecs,preNavigationHooks:[async({page:e,request:s},i)=>{if(i.throwHttpErrors=t.throwHttpErrors,this.jar.length>0)try{const t=this.jar.map(t=>{const e={...t};return e.domain||e.url||(e.url=s.url),"no_restriction"===e.sameSite&&(e.sameSite="None"),e});await e.context().addCookies(t)}catch(t){console.error("[Playwright] Failed to restore cookies:",t)}const r=this.blockedTypes;r.size>0&&await e.route("**/*",t=>{r.has(t.request().resourceType())?t.abort():t.continue()})}]};if(this.opts?.antibot){s.browserPoolOptions={useFingerprints:!1};const t=await U({headless:e});s.launchContext={launcher:A,launchOptions:t},s.postNavigationHooks=[async({page:t,handleCloudflareChallenge:e})=>{await e()}]}return s}async goto(t,e){if(this.isPageActive)return this.dispatchAction({type:"navigate",url:t,opts:e});if(!this.requestQueue)throw new _("RequestQueue not initialized","goto");const s="req-"+ ++this.requestCounter,i=new 
Promise((t,e)=>{this.pendingRequests.set(s,{resolve:t,reject:e})});return await this.requestQueue.addRequest({url:t,headers:this.hdrs,userData:{requestId:s,waitUntil:e?.waitUntil||"domcontentloaded"},uniqueKey:`${t}-${s}`}),i}};F.id="playwright",F.mode="browser",v.register(F);var N=class extends r{async onExecute(t,e){const{selector:s,...i}=e?.params||{};if(!s)throw new Error("Selector is required for click action");await this.delegateToEngine(t,"click",s,i)}};N.id="click",N.returnType="none",N.capabilities={http:"simulate",browser:"native"},r.register(N);var H=class extends r{async onExecute(t,e){const{selector:s,value:i,...r}=e?.params||{};if(!s)throw new Error("Selector is required for fill action");if(void 0===i)throw new Error("Value is required for fill action");await this.delegateToEngine(t,"fill",s,i,r)}};H.id="fill",H.returnType="none",H.capabilities={http:"simulate",browser:"native"},r.register(H);var L=class extends r{async onExecute(t,e){return await this.delegateToEngine(t,"getContent",e?.params)}};L.id="getContent",L.returnType="response",L.capabilities={http:"native",browser:"native"},r.register(L);var M=class extends r{async onExecute(t,e,s){const i=e?.params,r=i?.url||t.url;if(!r)throw new Error("URL is required for goto action");const n=t.internal.engine;if(!n)throw new Error("No engine available");t.url=r;return await n.goto(r,i)}};M.id="goto",M.returnType="response",M.capabilities={http:"native",browser:"native"},r.register(M);var z=class extends r{async onExecute(t,e){const{selector:s,...i}=e?.params||{};await this.delegateToEngine(t,"submit",s,i)}};z.id="submit",z.returnType="none",z.capabilities={http:"simulate",browser:"native"},r.register(z);var D=class extends r{async onExecute(t,e){const s=t.internal.engine;if(!s)throw new Error("No engine available");await s.waitFor(e?.params)}};D.id="waitFor",D.returnType="none",D.capabilities={http:"native",browser:"native"},r.register(D);var B=class extends r{async onExecute(t,e){const 
s=e?.params;if(!s)throw new Error("Schema is required for extract action");return this.delegateToEngine(t,"extract",s)}};B.id="extract",B.returnType="any",B.capabilities={http:"native",browser:"native"},r.register(B);var G=class extends r{async onExecute(t,e){const{selector:s,message:i,attribute:r}=e?.params||{},n=t.internal.engine;if("browser"===n?.mode){if(s){if(!await(n?.extract({selector:s,attribute:r})))return}n&&"pause"in n?await n.pause(i):console.warn("[PauseAction] was called, but the current engine does not support `pause`. Skipped.")}else console.warn("[PauseAction] can only run in browser engine. Skipped.")}};async function I(t,e){return(new E).fetch(t,e)}G.id="pause",G.capabilities={http:"native",browser:"native"},G.returnType="none",r.register(G);export{P as CheerioFetchEngine,N as ClickAction,t as DefaultFetcherProperties,B as ExtractAction,r as FetchAction,s as FetchActionResultStatus,v as FetchEngine,b as FetchSession,e as FetcherOptionKeys,H as FillAction,L as GetContentAction,M as GotoAction,G as PauseAction,F as PlaywrightFetchEngine,z as SubmitAction,D as WaitForAction,E as WebFetcher,I as fetchWeb};
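One behavioral change visible in the minified `index.mjs` above is how the Playwright engine restores cookies before navigation. In 0.2.4 every cookie received both `url` and `domain`. In 0.2.6 `url` is filled in only when a cookie has neither `domain` nor `url`, `sameSite: "no_restriction"` is mapped to `"None"`, and a failing `addCookies` call is logged rather than thrown. A readable sketch of the mapping step follows; the function name `normalizeCookiesForPlaywright` and the sample cookies are hypothetical, since the bundle inlines this logic inside a `preNavigationHooks` callback.

```javascript
// Sketch of the 0.2.6 cookie mapping, de-minified for readability.
// `normalizeCookiesForPlaywright` is a hypothetical name, not a package export.
function normalizeCookiesForPlaywright(jar, requestUrl) {
  return jar.map((cookie) => {
    const copy = { ...cookie };
    // Only fall back to the request URL when the cookie carries no scope of its own.
    if (!copy.domain && !copy.url) copy.url = requestUrl;
    // Translate the browser-extension style value into the spelling used by 0.2.6.
    if (copy.sameSite === "no_restriction") copy.sameSite = "None";
    return copy;
  });
}

// Made-up sample data for illustration only.
const restored = normalizeCookiesForPlaywright(
  [
    { name: "sid", value: "abc", sameSite: "no_restriction" },
    { name: "theme", value: "dark", domain: "example.com", path: "/" },
  ],
  "https://example.com/login"
);
console.log(restored);
```

The second sample cookie keeps its `domain` and gains no `url`, which is the case the 0.2.4 mapping handled differently.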
package/docs/README.md CHANGED
@@ -135,7 +135,7 @@ This is the main entry point for the library.
 
  * `url` (string): The initial URL to navigate to.
  * `engine` ('http' | 'browser' | 'auto'): The engine to use. Defaults to `auto`.
- * `actions` (FetchActionOptions[]): An array of action objects to execute.
+ * `actions` (FetchActionOptions[]): An array of action objects to execute. (Supports `action`/`name` as alias for `id`, and `args` as alias for `params`)
  * `headers` (Record<string, string>): Headers to use for all requests.
  * ...and many other options for proxy, cookies, retries, etc.
 
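The alias support documented in this hunk corresponds to two small checks in the 0.2.6 bundle: the action identifier is read from `id`, then `name`, then `action`, and `args` is used as `params` only when `params` is not set. A minimal sketch of that resolution order is given here; the helper names `resolveActionId` and `resolveActionParams` are hypothetical, and the sample actions are made up.

```javascript
// Sketch of the alias resolution order seen in the 0.2.6 bundle.
// Both helper names are hypothetical; the package does not export them.
function resolveActionId(action) {
  // A bare string is treated as the identifier itself.
  if (typeof action === "string") return action;
  return action.id || action.name || action.action;
}

function resolveActionParams(action) {
  // `args` is only a fallback: an existing `params` value wins.
  if (action.args && !action.params) return action.args;
  return action.params;
}

// Three equivalent spellings of an action entry (made-up data).
const actions = [
  { id: "goto", params: { url: "https://example.com" } },
  { action: "fill", args: { selector: "input[name=q]", value: "gemini" } },
  { name: "submit", args: { selector: "form" } },
];

const normalized = actions.map((a) => ({
  id: resolveActionId(a),
  params: resolveActionParams(a),
}));
console.log(normalized);
```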
@@ -152,6 +152,18 @@ Here are the essential built-in actions:
152
152
  * `getContent`: Retrieves the full content (HTML, text, etc.) of the current page state.
153
153
  * `extract`: Extracts any structured data from the page with ease using an expressive, declarative schema.
154
154
 
155
+ ### Response Structure
156
+
157
+ The `fetchWeb` function returns an object containing:
158
+
159
+ * `result` (FetchResponse):
160
+   * `url`: The final URL.
161
+   * `statusCode`: HTTP status code.
162
+   * `headers`: HTTP headers.
163
+   * `cookies`: Array of cookies.
164
+   * `text`, `html`: Page content.
165
+ * `outputs` (Record<string, any>): Data extracted and stored via `storeAs`.
166
+
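The response structure added in this hunk can be sketched as TypeScript types. This is only an illustration under stated assumptions: the names `SketchFetchResponse`, `SketchFetchReturn`, and `summarize` are hypothetical, and the field types are guesses based on the bullet descriptions, not the package's published typings.

```typescript
// Hypothetical shapes inferred from the documented field list.
// Field types are assumptions, not the library's actual declarations.
interface SketchFetchResponse {
  url?: string;
  statusCode?: number;
  headers?: Record<string, string>;
  cookies?: unknown[];
  text?: string;
  html?: string;
}

interface SketchFetchReturn {
  result?: SketchFetchResponse;
  // Values stored by actions via `storeAs`, keyed by the chosen name.
  outputs: Record<string, any>;
}

// Builds a one-line summary from the two top-level fields.
function summarize(ret: SketchFetchReturn): string {
  const status = ret.result?.statusCode ?? "unknown";
  const keys = Object.keys(ret.outputs).join(", ") || "none";
  return `status=${status}; outputs=${keys}`;
}

// Made-up sample data standing in for a real return value.
const sample: SketchFetchReturn = {
  result: { url: "https://example.com", statusCode: 200, html: "<h1>Hi</h1>" },
  outputs: { title: "Hi" },
};

console.log(summarize(sample));
```

The point of the split is that `result` describes the final page state, while `outputs` collects whatever earlier actions chose to store.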
155
167
  ---
156
168
 
157
169
  ## 📜 License
package/docs/_media/README.action.md CHANGED
@@ -23,7 +23,7 @@ This approach allows users to describe a complete business process with intuitiv
23
23
 
24
24
  `FetchAction` is the abstract base class for all Actions. It defines the core elements of an Action:
25
25
 
26
- * `static id`: The unique identifier for the Action, e.g., `'click'`.
26
+ * `static id`: The unique identifier for the Action, e.g., `'click'`. In Action Script, you can use `id`, `name`, or `action` to specify this identifier.
27
27
  * `static returnType`: The type of the result returned after the Action executes, e.g., `'none'`, `'response'`.
28
28
  * `static capabilities`: Declares the capability level of this Action in different engines (`http`, `browser`), such as `native`, `simulate`, or `noop`.
29
29
  * `static register()`: A static method to register the Action class in a global registry, allowing it to be dynamically created by its `id`.
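The static members listed in this hunk suggest a registry pattern, which might look roughly like the sketch below. All names here (`SketchAction`, `SketchEchoAction`, `create`, `run`) are hypothetical, and the `echo` action with its capability values is invented for illustration; this is not the library's implementation.

```typescript
// Hypothetical sketch of a static-id registry; not the library's code.
type Capability = "native" | "simulate" | "noop";
type ActionCtor = (new () => SketchAction) & { id: string };

abstract class SketchAction {
  static id = "";
  static returnType = "none";
  static capabilities: Record<"http" | "browser", Capability> = {
    http: "noop",
    browser: "noop",
  };

  // Global registry keyed by each action class's static id.
  private static registry = new Map<string, ActionCtor>();

  static register(actionClass: ActionCtor): void {
    SketchAction.registry.set(actionClass.id, actionClass);
  }

  // Creates an instance by id, or returns undefined if none is registered.
  static create(id: string): SketchAction | undefined {
    const ActionClass = SketchAction.registry.get(id);
    return ActionClass ? new ActionClass() : undefined;
  }

  abstract run(): string;
}

// An invented action used only to exercise the registry.
class SketchEchoAction extends SketchAction {
  static id = "echo";
  static returnType = "none";
  static capabilities: Record<"http" | "browser", Capability> = {
    http: "native",
    browser: "native",
  };

  run(): string {
    return "echo ran";
  }
}

SketchAction.register(SketchEchoAction);
```

Keeping `id` static lets the registry look up a class without instantiating it first.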
@@ -62,14 +62,16 @@ Users define a complete automation workflow via a JSON-formatted `actions` array
62
62
 
63
63
  For simple, linear workflows, you can use a list of the library's built-in atomic actions directly.
64
64
 
65
+ > **💡 Tip**: You can use `action` or `name` as an alias for `id`, and `args` as an alias for `params`.
66
+
65
67
  **Example: Searching for "gemini" on Google**
66
68
 
67
69
  ```json
68
70
  {
69
71
  "actions": [
70
- { "id": "goto", "params": { "url": "https://www.google.com" } },
71
- { "id": "fill", "params": { "selector": "textarea[name=q]", "value": "gemini" } },
72
- { "id": "submit", "params": { "selector": "form" } }
72
+ { "action": "goto", "args": { "url": "https://www.google.com" } },
73
+ { "action": "fill", "args": { "selector": "textarea[name=q]", "value": "gemini" } },
74
+ { "action": "submit", "args": { "selector": "form" } }
73
75
  ]
74
76
  }
75
77
  ```
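The alias rule in the tip above (`action` or `name` for `id`, `args` for `params`) could be handled by a small normalization step, sketched here. The function `normalizeStep` and both interfaces are hypothetical, and the precedence order when several aliases appear together is an assumption that the documentation does not specify.

```typescript
// Hypothetical input shape accepting the documented aliases.
interface ActionStep {
  id?: string;
  name?: string;
  action?: string;
  params?: Record<string, unknown>;
  args?: Record<string, unknown>;
}

interface NormalizedStep {
  id: string;
  params: Record<string, unknown>;
}

// Maps alias keys onto the canonical `id` / `params` pair.
// Assumed precedence: id, then name, then action; params before args.
function normalizeStep(step: ActionStep): NormalizedStep {
  const id = step.id ?? step.name ?? step.action;
  if (!id) {
    throw new Error("Action step needs `id`, `name`, or `action`");
  }
  return { id, params: step.params ?? step.args ?? {} };
}

const normalized = normalizeStep({
  action: "goto",
  args: { url: "https://www.google.com" },
});
console.log(JSON.stringify(normalized));
```

Under this sketch, the old `id`/`params` form and the new `action`/`args` form of the example normalize to the same object.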