@isdk/web-fetcher 0.2.0 → 0.2.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (53)
  1. package/README.reddit.md +248 -0
  2. package/README.v2ex.md +109 -0
  3. package/dist/index.js +1 -1
  4. package/dist/index.mjs +1 -1
  5. package/docs/classes/CheerioFetchEngine.md +56 -56
  6. package/docs/classes/ClickAction.md +23 -23
  7. package/docs/classes/ExtractAction.md +23 -23
  8. package/docs/classes/FetchAction.md +23 -23
  9. package/docs/classes/FetchEngine.md +56 -56
  10. package/docs/classes/FetchSession.md +8 -8
  11. package/docs/classes/FillAction.md +23 -23
  12. package/docs/classes/GetContentAction.md +23 -23
  13. package/docs/classes/GotoAction.md +23 -23
  14. package/docs/classes/PauseAction.md +23 -23
  15. package/docs/classes/PlaywrightFetchEngine.md +56 -56
  16. package/docs/classes/SubmitAction.md +23 -23
  17. package/docs/classes/WaitForAction.md +23 -23
  18. package/docs/classes/WebFetcher.md +5 -5
  19. package/docs/enumerations/FetchActionResultStatus.md +4 -4
  20. package/docs/functions/fetchWeb.md +2 -2
  21. package/docs/interfaces/BaseFetchActionProperties.md +9 -9
  22. package/docs/interfaces/BaseFetchCollectorActionProperties.md +13 -13
  23. package/docs/interfaces/BaseFetcherProperties.md +21 -21
  24. package/docs/interfaces/DispatchedEngineAction.md +4 -4
  25. package/docs/interfaces/ExtractActionProperties.md +9 -9
  26. package/docs/interfaces/FetchActionInContext.md +13 -13
  27. package/docs/interfaces/FetchActionProperties.md +10 -10
  28. package/docs/interfaces/FetchActionResult.md +6 -6
  29. package/docs/interfaces/FetchContext.md +31 -31
  30. package/docs/interfaces/FetchEngineContext.md +26 -26
  31. package/docs/interfaces/FetchMetadata.md +5 -5
  32. package/docs/interfaces/FetchResponse.md +13 -13
  33. package/docs/interfaces/FetchReturnTypeRegistry.md +7 -7
  34. package/docs/interfaces/FetchSite.md +24 -24
  35. package/docs/interfaces/FetcherOptions.md +23 -23
  36. package/docs/interfaces/GotoActionOptions.md +6 -6
  37. package/docs/interfaces/PendingEngineRequest.md +3 -3
  38. package/docs/interfaces/SubmitActionOptions.md +2 -2
  39. package/docs/interfaces/WaitForActionOptions.md +4 -4
  40. package/docs/type-aliases/BaseFetchActionOptions.md +1 -1
  41. package/docs/type-aliases/BaseFetchCollectorOptions.md +1 -1
  42. package/docs/type-aliases/BrowserEngine.md +1 -1
  43. package/docs/type-aliases/FetchActionCapabilities.md +1 -1
  44. package/docs/type-aliases/FetchActionCapabilityMode.md +1 -1
  45. package/docs/type-aliases/FetchActionOptions.md +1 -1
  46. package/docs/type-aliases/FetchEngineAction.md +1 -1
  47. package/docs/type-aliases/FetchEngineType.md +1 -1
  48. package/docs/type-aliases/FetchReturnType.md +1 -1
  49. package/docs/type-aliases/FetchReturnTypeFor.md +1 -1
  50. package/docs/type-aliases/OnFetchPauseCallback.md +1 -1
  51. package/docs/type-aliases/ResourceType.md +1 -1
  52. package/docs/variables/DefaultFetcherProperties.md +1 -1
  53. package/package.json +1 -1
@@ -0,0 +1,248 @@
1
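+ ### Recommended Reddit Communities (Subreddits)
2
+
3
+ Given the project's characteristics, the communities best suited to this post are:
4
+
5
+ 1. **r/webscraping**: The most directly relevant community. Users here will appreciate the dual-engine architecture and the anti-bot detection feature.
6
+ 2. **r/javascript**: A very large community of JavaScript developers. Because this is an npm library, posting here can reach a broad audience.
7
+ 3. **r/ArtificialIntelligence**: Because the library is explicitly designed for AI agents, members of this community will be especially interested in features such as "declarative JSON workflows", which let an AI generate and run its own automation scripts.
8
+ 4. **r/programming**: A broader programming community, suitable for sharing new, technically substantial projects.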
9
+
10
+ ---
11
+
12
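+ ### Reddit Post Draft
13
+
14
+ **Title:** I built a web automation library for AI agents, with dual engines (fast HTTP + full browser) and declarative actions!
15
+
16
+ **Body:**
17
+
18
+ Hi everyone,
19
+
20
+ I'm excited to share a project I've been working on recently: **`@isdk/web-fetcher`**.
21
+
22
+ When building AI agents that interact with the web, or complex web scrapers, we often end up writing brittle, imperative code for navigation, clicks, and data extraction. That work is tedious and hard to maintain.
23
+
24
+ `@isdk/web-fetcher` was built to solve this problem. It is a powerful, flexible web scraping and browser automation library, designed at its core for AI applications and advanced data-extraction tasks.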
25
+
26
+ ---
27
+
28
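+ #### 🤔 Why reinvent the wheel?
29
+
30
+ You might ask: "Why not just use `fetch` or Playwright?"
31
+
32
+ * **The limits of `fetch`**: The `fetch` API is great for calling APIs or fetching static HTML, but it cannot handle modern sites whose content is rendered dynamically by JavaScript (single-page applications, for example). All you get back is template markup without the actual content.
33
+ * **Playwright is not "batteries included"**: Playwright is powerful, but using it directly means handling a lot of extra work yourself. For example:
34
+     * **Anti-scraping measures**: Many sites use anti-bot mechanisms such as Cloudflare, which take elaborate strategies to get past.
35
+     * **Login and sessions**: Some content requires logging in first, so you have to manage the login flow and session state by hand.
36
+     * **Code bloat**: Configuration and repetitive interaction logic pile up quickly.
37
+
38
+ `@isdk/web-fetcher` builds a higher-level abstraction on top of these tools, aiming to provide a unified interface that is both powerful and easy to use, so you can focus on business logic rather than low-level implementation details.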
39
+
40
+ ---
41
+
42
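+ #### ✨ Core features:
43
+
44
+ * **⚙️ Dual-engine architecture**: Pick the engine that fits the task. Use **`http` mode** (based on Cheerio) for lightning-fast fetching of static sites, or switch to **`browser` mode** (based on Playwright) for dynamic sites that need full JavaScript execution.
45
+ * **📜 Declarative action scripts**: Define complex multi-step workflows (logging in, filling forms, clicking buttons) in a simple, readable JSON format. This lets an AI agent generate its own automation scripts dynamically.
46
+ * **📊 Powerful data extraction**: An intuitive, expressive declarative schema makes it easy to extract structured data, from simple text to complex nested objects.
47
+ * **🛡️ Anti-bot detection**: In `browser` mode, an optional `antibot` flag helps you get past common anti-bot measures such as Cloudflare.
48
+ * **🧩 Extensible**: You can easily create custom, high-level "composite" actions that encapsulate reusable business logic (a `login` action, for example).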
49
+
50
+ ---
51
+
52
+ #### 🚀 Quick start
53
+
54
+ Here is a simple example that fetches a web page and extracts its title:
55
+
56
+ ```typescript
57
+ import { fetchWeb } from '@isdk/web-fetcher';
58
+
59
+ async function getTitle(url: string) {
60
+   const { outputs } = await fetchWeb({
61
+     url,
62
+     actions: [
63
+       {
64
+         id: 'extract',
65
+         params: {
66
+           // Extract the text content of the <title> tag
67
+           selector: 'title',
68
+         },
69
+         // Store the result under the 'pageTitle' key of the outputs object
70
+         storeAs: 'pageTitle',
71
+       },
72
+     ],
73
+   });
74
+
75
+   console.log('Page Title:', outputs.pageTitle);
76
+ }
77
+
78
+ getTitle('https://www.google.com');
79
+ ```
80
+
81
+ ---
82
+
83
+ #### 🤖 Advanced usage: multi-step form submission
84
+
85
+ This example shows how to run a search on Google with the `browser` engine.
86
+
87
+ ```typescript
88
+ import { fetchWeb } from '@isdk/web-fetcher';
89
+
90
+ async function searchGoogle(query: string) {
91
+   const { result } = await fetchWeb({
92
+     url: 'https://www.google.com',
93
+     engine: 'browser', // Use the full browser engine for interaction
94
+     actions: [
95
+       // The initial navigation is handled by the url option
96
+       { id: 'fill', params: { selector: 'textarea[name=q]', value: query } },
97
+       { id: 'submit', params: { selector: 'form' } },
98
+       { id: 'waitFor', params: { selector: '#search' } }, // Wait for the search results container to appear
99
+     ]
100
+   });
101
+
102
+   console.log('Search results URL:', result?.finalUrl);
103
+ }
104
+
105
+ searchGoogle('gemini');
106
+ ```
107
+
108
+ ---
109
+
110
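+ #### 🌱 Project status and the road ahead
111
+
112
+ This project is fresh out of the oven. We have built a solid, highly extensible core architecture and delivered all of the basic features described above.
113
+
114
+ There is still room for improvement, of course. An important next goal is a smarter fetching strategy, such as **automatically choosing the best fetching approach based on the characteristics of the target site** (for example, upgrading automatically from `http` mode to `browser` mode).
115
+
116
+ ---
117
+
118
+ The project is open source; you can find it on NPM and GitHub.
119
+
120
+ * **NPM:** `npm install @isdk/web-fetcher`
121
+ * **GitHub:** [link to your GitHub repository]
122
+
123
+ Please try it out and share your feedback! I'd love to hear what you think.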
124
+
125
+
126
+ ------------------------------
127
+
128
+
129
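+ OK, this is very important information! Disclosing that you use a mature library like `crawlee` underneath can greatly strengthen your project's credibility. It shows that you are not "reinventing the wheel" but building a higher-level abstraction on a solid foundation. That is a very positive signal in a technical community.
130
+
131
+ I will work this point into the post so that it counts in the project's favor.
132
+
133
+ ---
134
+
135
+ Here is the updated version, which emphasizes how your library stands on the shoulders of giants: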
136
+
137
+ ### **Title: I built a web automation library for AI agents so they can browse the web like a human, not a bot.**
138
+
139
+ Hey everyone,
140
+
141
+ Ever tried to make an AI agent *actually use* a website? You quickly run into a wall of pain.
142
+
143
+ You're not trying to crawl an entire domain like a traditional scraper. You want your agent to perform a specific task: log in, find a price, fill out a form, and get the result. But this means writing brittle, imperative code (`page.waitForSelector()`, `page.click()`, `page.evaluate()`, repeat) that breaks the moment a UI element changes.
144
+
145
+ I've been building AI agents and got deeply frustrated by this. So, I created a solution: **`@isdk/web-fetcher`**.
146
+
147
+ It’s a library designed to give agents a "browser on a leash"—a way to perform targeted, human-like actions on the web without the messy implementation details.
148
+
149
+ ---
150
+
151
+ ### 🤔 "Why not just use Playwright or Crawlee?"
152
+
153
+ Great question, and the answer gets to the heart of this project. I'm a huge fan of not reinventing the wheel, which is why **this library uses the incredible `crawlee` library under the hood.**
154
+
155
+ * **The Low-Level Tools (`fetch`, Playwright):** `fetch` is for static content, and Playwright is a fantastic browser control tool. But using it directly is like being given a box of engine parts and told to build a car.
156
+ * **The Powerful Framework (`crawlee`):** `crawlee` is a massive step up. It solves huge problems like request queuing, proxy management, and browser pooling. It's the robust engine and chassis for our car.
157
+ * **The Missing Piece (My Library):** Even with `crawlee`, you often still need to write *imperative, procedural code* to define *what* happens on the page. Your agent's logic gets mixed up with `page.click()` and `page.fill()`.
158
+
159
+ `@isdk/web-fetcher` is the final layer: **the simple, declarative dashboard for the car.** It sits on top of `crawlee`'s power and provides a JSON-based instruction set. This allows an AI to easily generate a "plan" of what to do, without worrying about the implementation.
160
+
161
+ So, it's not a replacement; it's an abstraction layer specifically for **agent-driven automation**.
162
+
163
+ ---
164
+
165
+ ### ✨ Core Features: What Makes It Different?
166
+
167
+ * **⚙️ Dual-Engine Architecture (via Crawlee):** Choose your weapon. Use the blazing-fast **`http` mode** for simple sites, or the full-featured **`browser` mode** for complex, interactive web apps.
168
+ * **📜 Declarative Action Scripts:** This is the key for AI. Instead of code, you define multi-step tasks (log in, search, extract) in simple JSON. **This means an AI agent can dynamically generate its own automation plans.**
169
+ * **📊 Clean, Declarative Data Extraction:** Define the data you want with a simple schema. No more wrestling with DOM traversal in your application code.
170
+ * **🛡️ Built-in Anti-Bot Evasion:** By leveraging `crawlee`'s capabilities, a simple `antibot: true` flag helps navigate common bot detection hurdles like Cloudflare.
171
+ * **🧩 Extensible by Design:** Bundle complex sequences into your own high-level actions. For example, create a single, reusable `loginToGitHub` action that encapsulates the entire login flow.
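
As a rough illustration of the declarative extraction schema mentioned in the list above, the sketch below re-implements in plain TypeScript what the `_normalizeSchema` step in this diff's minified `dist/index.js` appears to do: fold `has`/`exclude` filters into the selector and expand the array shorthand. The `ExtractSchema` type and the `normalizeSchema` name are assumptions made for this example, not the library's published typings.

```typescript
// Hypothetical schema shape; field names are taken from the minified bundle in this diff.
interface ExtractSchema {
  type?: 'string' | 'number' | 'boolean' | 'html' | 'object' | 'array';
  selector?: string;
  attribute?: string;
  has?: string;
  exclude?: string;
  properties?: Record<string, ExtractSchema>;
  items?: ExtractSchema;
}

function normalizeSchema(schema: ExtractSchema): ExtractSchema {
  const s: ExtractSchema = { ...schema }; // shallow copy; nested schemas are rebuilt below

  if (s.properties) {
    const normalized: Record<string, ExtractSchema> = {};
    for (const key of Object.keys(s.properties)) {
      normalized[key] = normalizeSchema(s.properties[key]);
    }
    s.properties = normalized;
  }
  if (s.items) s.items = normalizeSchema(s.items);

  if (s.type === 'array') {
    // Shorthand: an array with only `attribute` means "collect that attribute from each match".
    if (s.attribute && !s.items) {
      s.items = { attribute: s.attribute };
      delete s.attribute;
    }
    if (!s.items) s.items = { type: 'string' };
  }

  if (s.selector && (s.has || s.exclude)) {
    const { has, exclude } = s;
    // Apply the filters to every comma-separated part of the selector.
    s.selector = s.selector
      .split(',')
      .map((part) => {
        let sel = part.trim();
        if (has) sel = `${sel}:has(${has})`;
        if (exclude) sel = `${sel}:not(${exclude})`;
        return sel;
      })
      .join(', ');
    delete s.has;
    delete s.exclude;
  }
  return s;
}

const links = normalizeSchema({
  type: 'array',
  selector: 'a.story, a.more',
  attribute: 'href',
  exclude: '.ad',
});
console.log(links);
```

Keeping the filters as separate `has`/`exclude` fields in the authored schema, and only compiling them into CSS at run time, is one way to keep the JSON easy for an agent to generate.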
172
+
173
+ ---
174
+
175
+ ### 🚀 Quick Start: Grab a Page Title
176
+
177
+ Here’s how simple it is. The library handles the engine choice and execution.
178
+
179
+ ```typescript
180
+ import { fetchWeb } from '@isdk/web-fetcher';
181
+
182
+ async function getTitle(url: string) {
183
+   const { outputs } = await fetchWeb({
184
+     url,
185
+     actions: [
186
+       {
187
+         id: 'extract',
188
+         params: {
189
+           // Tell it to grab the text from the <title> tag
190
+           selector: 'title',
191
+         },
192
+         // Store the result under the 'pageTitle' key
193
+         storeAs: 'pageTitle',
194
+       },
195
+     ],
196
+   });
197
+
198
+   console.log('Page Title:', outputs.pageTitle);
199
+ }
200
+
201
+ getTitle('https://news.ycombinator.com');
202
+ ```
203
+
204
+ ---
205
+
206
+ ### 🤖 Advanced Example: A Human-like Task (Google Search)
207
+
208
+ This shows how an agent could perform a search. Notice we're just *describing* the steps.
209
+
210
+ ```typescript
211
+ import { fetchWeb } from '@isdk/web-fetcher';
212
+
213
+ async function searchGoogle(query: string) {
214
+   const { result } = await fetchWeb({
215
+     url: 'https://www.google.com',
216
+     engine: 'browser', // We need a real browser for this
217
+     actions: [
218
+       // Step 1: Fill the search bar
219
+       { id: 'fill', params: { selector: 'textarea[name=q]', value: query } },
220
+       // Step 2: Submit the form (like pressing Enter)
221
+       { id: 'submit', params: { selector: 'form' } },
222
+       // Step 3: Wait for search results to appear
223
+       { id: 'waitFor', params: { selector: '#search' } },
224
+     ]
225
+   });
226
+
227
+   console.log('Search Results URL:', result?.finalUrl);
228
+ }
229
+
230
+ searchGoogle('Gemini vs. GPT-4');
231
+ ```
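
The example above never issues an explicit `goto`. Judging by the minified `WebFetcher.fetch` and `FetchSession.executeAll` code in this diff, the top-level `url` seems to be turned into a leading `goto` action, the actions then run in order, and each `storeAs` result lands in `outputs`. The sketch below imitates that flow with toy, synchronous handlers; `ActionSpec`, `RunState`, `withInitialGoto`, and `runPlan` are hypothetical names for this illustration only.

```typescript
// Hypothetical shapes for illustration; the real action classes live inside the library.
interface ActionSpec {
  id: string;
  params?: Record<string, unknown>;
  storeAs?: string;
}

interface RunState {
  outputs: Record<string, unknown>;
  log: string[];
}

type Handler = (params: Record<string, unknown>, state: RunState) => unknown;

// Toy handlers that only record what they were asked to do.
const handlers = new Map<string, Handler>();
handlers.set('goto', (p, s) => { s.log.push(`goto ${String(p.url)}`); return p.url; });
handlers.set('fill', (p, s) => { s.log.push(`fill ${String(p.selector)}`); return p.value; });
handlers.set('submit', (p, s) => { s.log.push(`submit ${String(p.selector)}`); return undefined; });

// Prepend a goto for the top-level url unless the plan already starts with one.
function withInitialGoto(url: string | undefined, actions: ActionSpec[]): ActionSpec[] {
  if (!url) return actions;
  const first = actions[0];
  const startsWithGoto =
    first !== undefined && first.id === 'goto' && first.params?.url === url;
  return startsWithGoto ? actions : [{ id: 'goto', params: { url } }, ...actions];
}

// Run the plan in order, storing results under `storeAs` when requested.
function runPlan(url: string | undefined, actions: ActionSpec[]): RunState {
  const state: RunState = { outputs: {}, log: [] };
  for (const spec of withInitialGoto(url, actions)) {
    const handler = handlers.get(spec.id);
    if (!handler) throw new Error(`Unknown action: ${spec.id}`);
    const result = handler(spec.params ?? {}, state);
    if (spec.storeAs) state.outputs[spec.storeAs] = result;
  }
  return state;
}

const demo = runPlan('https://www.google.com', [
  { id: 'fill', params: { selector: 'textarea[name=q]', value: 'hello' }, storeAs: 'typed' },
  { id: 'submit', params: { selector: 'form' } },
]);
console.log(demo.log, demo.outputs);
```

The real library runs actions asynchronously against a crawler-backed engine; the synchronous version here only shows the ordering and the `storeAs` bookkeeping.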
232
+
233
+ ---
234
+
235
+ ### 🌱 Project Status & The Road Ahead
236
+
237
+ This project is fresh out of the oven. The core architecture is solid, and the features above are ready to use.
238
+
239
+ My next big goal is to make it even smarter. I want to implement a strategy where it can **automatically upgrade from `http` to `browser` mode** if it detects that a simple request isn't enough to get the job done.
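
The automatic upgrade is described here as future work. What the 0.2.0 bundle shown in this diff already seems to contain is a per-site lookup: a `sites` list is matched by hostname (exact match first, then suffix), optionally narrowed by `pathScope`, and the matched entry's `engine` is used unless the caller set one explicitly. The sketch below mirrors that logic; `SiteRule`, `matchSite`, and `pickEngine` are hypothetical names, and the field names are read off the minified code rather than the documented `FetchSite` interface.

```typescript
type EngineChoice = 'http' | 'browser' | 'auto';

// Hypothetical rule shape based on the fields the minified code reads.
interface SiteRule {
  domain: string;
  pathScope?: string[];
  engine?: EngineChoice;
}

function matchSite(url: string, sites: SiteRule[]): SiteRule | null {
  if (!url || sites.length === 0) return null;
  const { hostname, pathname } = new URL(url);
  // Exact hostname match wins; otherwise fall back to a suffix match (subdomains).
  const rule =
    sites.find((s) => s.domain === hostname) ??
    sites.find((s) => hostname.endsWith(s.domain));
  if (!rule) return null;
  // When a path scope is given, the rule only applies under one of those prefixes.
  if (rule.pathScope && rule.pathScope.length > 0) {
    if (!rule.pathScope.some((prefix) => pathname.startsWith(prefix))) return null;
  }
  return rule;
}

// An explicit engine beats the site rule, which beats the 'auto' default.
function pickEngine(url: string, sites: SiteRule[], explicit?: EngineChoice): EngineChoice {
  return explicit || matchSite(url, sites)?.engine || 'auto';
}

const siteRules: SiteRule[] = [
  { domain: 'example.com', pathScope: ['/app'], engine: 'browser' },
];
console.log(pickEngine('https://www.example.com/app/home', siteRules));
console.log(pickEngine('https://example.com/blog', siteRules));
```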
240
+
241
+ ---
242
+
243
+ The project is open source and I'd be thrilled for you to check it out, give it a spin, and share your feedback.
244
+
245
+ * **NPM:** `npm install @isdk/web-fetcher`
246
+ * **GitHub:** [https://github.com/isdk/web-fetcher.js](https://github.com/isdk/web-fetcher.js)
247
+
248
+ I’m really excited to hear what you think and what you might build with it. Thanks for reading!
package/README.v2ex.md ADDED
@@ -0,0 +1,109 @@
1
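+ Excellent! This addition is one of the project's core highlights and must be stressed clearly in the post. It is exactly what separates your library from a thin wrapper: **it provides a unified, cross-engine abstraction layer**.
2
+
3
+ I will work this point into the V2EX post so that technical readers can see the merit of the design at a glance.
4
+
5
+ ---
6
+
7
+ Here is the updated, final version of the V2EX post:
8
+
9
+ #### **Title (recommended):**
10
+
11
+ `[Self-promotion] I built a web automation library designed for AI agents: a unified API driving dual engines, so AI can browse the web like a human.`
12
+
13
+ *(This title points directly at the two core technical ideas: "unified API" and "dual engines".)*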
14
+
15
+ ---
16
+
17
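+ #### **Body:**
18
+
19
+ Hi everyone,
20
+
21
+ I've been tinkering with AI agents lately and found that getting an agent to interact with the web reliably is a hard problem: existing tools are either too low-level or not flexible enough. So I rolled my own: **`@isdk/web-fetcher`**. I'd like to share it here and hopefully get some feedback.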
22
+
23
+ * **GitHub:** `https://github.com/isdk/web-fetcher.js`
24
+ * **NPM:** `npm install @isdk/web-fetcher`
25
+
26
+
27
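+ #### What pain points does it solve?
28
+
29
+ You might ask: why not use `fetch` or Playwright/Crawlee?
30
+
31
+ * `fetch` cannot see content rendered dynamically by JavaScript, so it is of little use on modern pages.
32
+ * Playwright is powerful, but it requires a lot of imperative, procedural code (`await page.click(...)` and so on). That is tedious, and it is hard for an AI (an LLM, for example) to generate such logic directly.
33
+
34
+ I didn't want to reinvent the wheel, so **the library uses `Crawlee` under the hood**.
35
+
36
+ My goal is to build, on top of `Crawlee`, a cross-engine-consistent, **declarative "intent layer"** that abstracts (or simulates) the behavior the HTTP and browser engines have in common, so that an AI can "direct" the browser by generating simple JSON instead of writing the execution code itself.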
37
+
38
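+ #### Core features
39
+
40
+ * **⚙️ Dual-engine architecture**: Choose `http` mode (based on Cheerio) to fetch static content at high speed, or `browser` mode (based on Playwright) to handle complex dynamic pages.
41
+ * **✨ Unified action model (the core design)**: This is the key point. **I abstracted the behavior that `http` and `browser` modes have in common.** Whichever engine runs underneath, you use the same `actions` API. Take the `extract` (data extraction) action: in `http` mode it parses static HTML through Cheerio, and in `browser` mode it works on the DOM rendered by the browser. **You learn one API, and the library handles the adaptation and translation internally.**
42
+ * **📜 Declarative action scripts**: Building on the unified model, you can define a multi-step task flow (log in, fill a form, click) in JSON. Generating that JSON costs an AI far less than generating JS code.
43
+ * **📊 Powerful data extraction**: The same declarative approach applies to the schema, making it easy to extract structured data from a page.
44
+ * **🛡️ Built-in anti-bot handling**: Enabling `antibot: true` in `browser` mode can handle some common Cloudflare challenges.
45
+ * **🧩 Easy to extend**: You can wrap frequently used operations yourself, for example packaging "log in to Zhihu" as a custom `loginToZhihu` action.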
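
To make the "unified action model" more concrete: in `http` mode there is no real form to submit, so the Cheerio engine in this diff's `dist/index.js` appears to simulate `submit` by collecting the form's fields and building an equivalent HTTP request. The sketch below reproduces that idea without Cheerio. `FormSnapshot`, `PlannedRequest`, and `planSubmit` are hypothetical names for this illustration, and the form is passed in as plain data rather than parsed from HTML.

```typescript
// Hypothetical plain-data view of a <form>; the library reads these from the parsed HTML.
interface FormSnapshot {
  action?: string;
  method?: string;
  enctype?: string;
  fields: Record<string, string>;
}

interface PlannedRequest {
  url: string;
  method: 'GET' | 'POST';
  body?: string;
  headers?: Record<string, string>;
}

function planSubmit(form: FormSnapshot, pageUrl: string): PlannedRequest {
  // A missing or relative action resolves against the current page URL.
  const target = new URL(form.action || pageUrl, pageUrl);
  const method = (form.method || 'GET').toUpperCase();

  if (method === 'GET') {
    // GET forms carry their fields in the query string.
    for (const name of Object.keys(form.fields)) {
      target.searchParams.set(name, form.fields[name]);
    }
    return { url: target.href, method: 'GET' };
  }

  // Anything else is sent as POST, either as JSON or URL-encoded.
  const enctype = form.enctype || 'application/x-www-form-urlencoded';
  if (enctype === 'application/json') {
    return {
      url: target.href,
      method: 'POST',
      body: JSON.stringify(form.fields),
      headers: { 'Content-Type': 'application/json' },
    };
  }
  return {
    url: target.href,
    method: 'POST',
    body: new URLSearchParams(form.fields).toString(),
    headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
  };
}

const searchRequest = planSubmit(
  { action: '/search', fields: { q: 'V2EX' } },
  'https://www.google.com/',
);
console.log(searchRequest.url);
```

In `browser` mode the same `submit` action would instead act on the live page, which is why one JSON plan can target either engine.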
46
+
47
+ ---
48
+
49
+ #### Quick start: extract a title
50
+
51
+ Note that the code below does not care whether the target URL is static or dynamic; the `extract` action works in both modes.
52
+
53
+ ```typescript
54
+ import { fetchWeb } from '@isdk/web-fetcher';
55
+
56
+ async function getTitle(url: string) {
57
+   const { outputs } = await fetchWeb({
58
+     url,
59
+     actions: [
60
+       {
61
+         id: 'extract',
62
+         params: {
63
+           selector: 'title', // Extract the content of the <title> tag
64
+         },
65
+         storeAs: 'pageTitle', // Store the result in outputs.pageTitle
66
+       },
67
+     ],
68
+   });
69
+
70
+   console.log('Page title:', outputs.pageTitle);
71
+ }
72
+
73
+ getTitle('https://www.v2ex.com');
74
+ ```
75
+
76
+ #### Going further: multi-step form submission (Google search)
77
+
78
+ This example shows how to use JSON to direct the browser through a series of actions.
79
+
80
+ ```typescript
81
+ import { fetchWeb } from '@isdk/web-fetcher';
82
+
83
+ async function searchGoogle(query: string) {
84
+   const { result } = await fetchWeb({
85
+     url: 'https://www.google.com',
86
+     engine: 'browser', // Explicitly request a browser environment
87
+     actions: [
88
+       // Step 1: Find the input box and fill it in
89
+       { id: 'fill', params: { selector: 'textarea[name=q]', value: query } },
90
+       // Step 2: Submit the form
91
+       { id: 'submit', params: { selector: 'form' } },
92
+       // Step 3: Wait for the search results container to load
93
+       { id: 'waitFor', params: { selector: '#search' } },
94
+     ]
95
+   });
96
+
97
+   console.log('Search results page URL:', result?.finalUrl);
98
+ }
99
+
100
+ searchGoogle('V2EX');
101
+ ```
102
+
103
+ ---
104
+
105
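+ #### Project status
106
+
107
+ The project is just getting started, and the core architecture is in place. The next step is a smarter fetching strategy (for example, upgrading automatically to browser mode when http mode fails to get the content).
108
+
109
+ The project is open source. You are welcome to try it, star it, open issues, or criticize it as harshly as you like! Thank you.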
package/dist/index.js CHANGED
@@ -1 +1 @@
1
- "use strict";var t,e=Object.create,i=Object.defineProperty,s=Object.getOwnPropertyDescriptor,n=Object.getOwnPropertyNames,r=Object.getPrototypeOf,o=Object.prototype.hasOwnProperty,a=(t,e,r,a)=>{if(e&&"object"==typeof e||"function"==typeof e)for(let c of n(e))o.call(t,c)||c===r||i(t,c,{get:()=>e[c],enumerable:!(a=s(e,c))||a.enumerable});return t},c=(t,s,n)=>(n=null!=t?e(r(t)):{},a(!s&&t&&t.__esModule?n:i(n,"default",{value:t,enumerable:!0}),t)),l={};((t,e)=>{for(var s in e)i(t,s,{get:e[s],enumerable:!0})})(l,{CheerioFetchEngine:()=>_,ClickAction:()=>H,DefaultFetcherProperties:()=>u,ExtractAction:()=>V,FetchAction:()=>f,FetchActionResultStatus:()=>h,FetchEngine:()=>A,FetchSession:()=>U,FillAction:()=>L,GetContentAction:()=>z,GotoAction:()=>I,PauseAction:()=>K,PlaywrightFetchEngine:()=>D,SubmitAction:()=>W,WaitForAction:()=>J,WebFetcher:()=>j,fetchWeb:()=>Q}),module.exports=(t=l,a(i({},"__esModule",{value:!0}),t));var u={enableSmart:!0,useSiteRegistry:!0,antibot:!1,headers:{},cookies:[],reuseCookies:!0,proxy:[],blockResources:[],ignoreSslErrors:!0,browser:{engine:"playwright",headless:!0,waitUntil:"domcontentloaded"},http:{method:"GET"},timeoutMs:6e4,maxConcurrency:1,maxRequestsPerMinute:1e3,delayBetweenRequestsMs:0,retries:0,sites:[]},h=(t=>(t[t.Failed=0]="Failed",t[t.Success=1]="Success",t[t.Skipped=2]="Skipped",t))(h||{}),w=class t{static register(t){const e=t.id;if(!e)throw new Error("FetchAction.register: actionClass.id is required");this.registry.set(e,t)}static get(t){return this.registry.get(t)}static create(t){const e="string"==typeof t?t:t.id||t.name;if(!e)throw new Error("Action must have id or name");const i=this.registry.get(e);return i?new i:void 0}static has(t){return this.registry.has(t)}static list(){return Array.from(this.registry.keys())}static getCapability(t){return this.capabilities[t]??"noop"}getCapability(t){return this.constructor.getCapability(t)}get id(){return this.constructor.id}get returnType(){return this.constructor.returnType}get 
capabilities(){return this.constructor.capabilities}async delegateToEngine(t,e,...i){const s=t.internal.engine;if(!s)throw new Error("No engine available");if("function"!=typeof s[e])throw new Error(`Engine does not have a method named '${String(e)}'`);return await s[e](...i)}installCollectors(e,i){const s=i?.collectors;if(!s?.length)return;const n=[],r=new Set;for(const i of s){const s=d(i.activateOn),o=d(i.collectOn),a=d(i.deactivateOn),c=!(i.background??!0),l=t.create(i);if(!l)continue;let u=!1,h=!1,w=0;const f=async t=>{if(!u&&!h){u=!0;try{await(l.onBeforeExec?.(e,i))}catch(t){e.eventBus.emit("collector:error",{action:this.id,collector:l.id,phase:"before",error:t})}}},m=async(t,s)=>{if(!h){u||await f(s);try{const n=Promise.resolve(l.onExecute?.(e,i,s)).then(s=>{var n,r;if(i.storeAs){((n=e.outputs)[r=i.storeAs]||(n[r]=[])).push(s)}return e.eventBus.emit("collector:result",{action:this.id,collector:i.id||i.name,event:t,result:s}),s}).catch(s=>{e.eventBus.emit("collector:error",{action:this.id,collector:i.id||i.name,event:t,phase:"exec",error:s})}).finally(()=>{w++});c&&(r.add(n),n.finally(()=>r.delete(n)))}catch(i){e.eventBus.emit("collector:error",{action:this.id,collector:l.id,event:t,phase:"exec",error:i})}}},g=async()=>{if(!h){0===w&&m("collector:after"),h=!0;try{await(l.onAfterExec?.(e,i))}catch(t){e.eventBus.emit("collector:error",{action:this.id,collector:i.id||i.name,phase:"after",error:t})}finally{e.eventBus.emit("collector:end",{action:this.id,collector:i.id||i.name}),v.forEach(t=>t())}}},b=p(e,s,f),v=y(e,o,m),x=p(e,a,g);if(n.push(...b,...v,...x),!s.length&&!o.length&&!a.length){const t=()=>{g()};e.eventBus.once(`action:${this.id}.end`,t),n.push(()=>e.eventBus.off("fetcher:action:end",t))}}return n.length||r.size>0?{cleanup:()=>n.forEach(t=>t()),awaitExecPendings:async()=>{r.size>0&&await Promise.allSettled(Array.from(r))}}:void 0}async beforeExec(t,e){t.internal.actionStack||(t.internal.actionStack=[]);const 
i=t.internal.actionStack,s=i.length,n=i.length>0?i[i.length-1].id:void 0,r={...e,id:this.id,depth:s,parent:n};i.push(r),t.currentAction=r;const o={action:this,context:t,options:e,depth:s,stack:[...i]};t.eventBus.emit(`action:${this.id}.start`,o),t.eventBus.emit("action:start",o),await(this.onBeforeExec?.(t,e));return{entry:o,collectors:this.installCollectors(t,e)}}async afterExec(t,e,i,s){const n=t.internal.actionStack,r=n.length-1,o=s?.collectors;try{await(o?.awaitExecPendings()),t.lastResult=i,"response"!==i?.returnType||i.error||(t.lastResponse=i.result),e?.storeAs&&(t.outputs[e.storeAs]=i?.result),i?.error&&(t.currentAction.error=i.error),await(this.onAfterExec?.(t,e));const s={action:this,context:t,options:e,result:i,depth:r,stack:[...n]};i?.error&&(s.error=i.error);try{t.eventBus.emit(`action:${this.id}.end`,s)}catch(t){}try{t.eventBus.emit("action:end",s)}catch(t){}}finally{try{o?.cleanup()}finally{n.pop();const e=n.length;t.currentAction=e>0?n[e-1]:void 0}}}async execute(t,e){const i=await this.beforeExec(t,e);let s;try{const i=e?.failOnError??!0;return t.throwHttpErrors=i,s=await this.onExecute(t,e),s&&s.returnType||(s={status:1,returnType:this.returnType??"any",result:s}),s}catch(i){if(s={status:0,error:i,meta:{id:this.id,engineType:t.engine,capability:this.getCapability(t.engine)}},e?.failOnError)throw i;return s}finally{await this.afterExec(t,e,s,i)}}};w.registry=new Map,w.returnType="any",w.capabilities={http:"noop",browser:"noop"};var f=w;function d(t){return t?Array.isArray(t)?t:[t]:[]}function p(t,e,i){const s=[];for(const n of e)if("string"==typeof n||n instanceof RegExp){const e=(...t)=>{i(t[0])};t.eventBus.once(n,e),s.push(()=>t.eventBus.off(n,e))}return s}function y(t,e,i){const s=[];for(const n of e)if("string"==typeof n||n instanceof RegExp){const e=t=>i(n,t);t.eventBus.on(n,e),s.push(()=>t.eventBus.off(n,e))}return s}var m=require("events-ex");var 
g,b,v=require("lodash-es"),x=c(require("crypto"),1),E=t=>((t=>{!g||g.length<t?(g=Buffer.allocUnsafe(128*t),x.default.randomFillSync(g),b=0):b+t>g.length&&(x.default.randomFillSync(g),b=0),b+=t})(t|=0),g.subarray(b-t,b)),q=((t,e=21)=>((t,e,i)=>{let s=(2<<31-Math.clz32(t.length-1|1))-1,n=Math.ceil(1.6*s*e/t.length);return(r=e)=>{let o="";for(;;){let e=i(n),a=n;for(;a--;)if(o+=t[e[a]&s]||"",o.length===r)return o}}})(t,e,E))("0123456789abcdefghijklmnopqrstuvwxyz",12);var C=require("lodash-es"),k=require("events-ex"),S=require("@isdk/common-error"),$=require("crawlee");function R(){let t=()=>{};const e=new Promise(e=>{t=e});return e.release=t,e}$.Configuration.getGlobalConfig().set("persistStorage",!1);var A=class{constructor(){this.hdrs={},this.jar=[],this.pendingRequests=new Map,this.requestCounter=0,this.actionEmitter=new k.EventEmitter,this.isPageActive=!1,this.navigationLock=function(){const t=R();return t.release(),t}(),this.blockedTypes=new Set}static register(t){const e=t.id;if(!e)throw new Error("Engine must define static id");if(this.registry.has(e))throw new Error(`Engine id duplicated: ${e}`);this.registry.set(e,t)}static get(t){return this.registry.get(t)}static getByMode(t){for(const[e,i]of this.registry.entries())if(i.mode===t)return i}static async create(t,e){const i=(0,C.defaultsDeep)(e,t,u),s=i.engine??t.engine,n=s?this.get(s)??this.getByMode(s):null;if(n){const e=new n;return await e.initialize(t,i),e}}async _extract(t,e){const i=t.type;if(!e)return"array"===i?[]:null;if("object"===i){const{selector:i,properties:s}=t;let n=e;if(i){const t=await this._querySelectorAll(e,i);n=t.length>0?t[0]:null}if(!n)return null;const r={};for(const t in s)r[t]=await this._extract(s[t],n);return r}if("array"===i){const{selector:i,items:s}=t,n=i?await this._querySelectorAll(e,i):[e],r=[];for(const t of n)r.push(await this._extract(s,t));return r}const{selector:s}=t;let n=e;if(s){const t=await this._querySelectorAll(e,s);n=t.length>0?t[0]:null}return 
n?this._extractValue(t,n):null}waitFor(t){return this.dispatchAction({type:"waitFor",options:t})}click(t){return this.dispatchAction({type:"click",selector:t})}fill(t,e){return this.dispatchAction({type:"fill",selector:t,value:e})}submit(t,e){return this.dispatchAction({type:"submit",selector:t,options:e})}pause(t){return this.dispatchAction({type:"pause",message:t})}extract(t){const e=this._normalizeSchema(t);return this.dispatchAction({type:"extract",schema:e})}_normalizeSchema(t){const e=JSON.parse(JSON.stringify(t));if(e.properties)for(const t in e.properties)e.properties[t]=this._normalizeSchema(e.properties[t]);if(e.items&&(e.items=this._normalizeSchema(e.items)),"array"===e.type&&(e.attribute&&!e.items&&(e.items={attribute:e.attribute},delete e.attribute),e.items||(e.items={type:"string"})),e.selector&&(e.has||e.exclude)){const{selector:t,has:i,exclude:s}=e,n=t.split(",").map(t=>{let e=t.trim();return i&&(e=`${e}:has(${i})`),s&&(e=`${e}:not(${s})`),e}).join(", ");e.selector=n,delete e.has,delete e.exclude}return e}get id(){return this.constructor.id}get mode(){return this.constructor.mode}get context(){return this.ctx}async initialize(t,e){if(this.ctx)return;(0,C.merge)(t,e),this.ctx=t,this.opts=t,this.hdrs=function(t){const e={};if(t&&"object"==typeof t)for(const[i,s]of Object.entries(t))e[i.toLowerCase()]=s;return e}(t.headers),this.jar=[...t.cookies??[]],t.internal||(t.internal={}),t.internal.engine=this,t.engine=this.mode,this.requestQueue=await $.RequestQueue.open();const i=await 
this._getSpecificCrawlerOptions(t),s={...(0,C.defaultsDeep)(i,{requestQueue:this.requestQueue,maxConcurrency:1,minConcurrency:1,useSessionPool:!0,persistCookiesPerSession:!0,sessionPoolOptions:{maxPoolSize:1,persistenceOptions:{enable:!1},sessionOptions:{maxUsageCount:1e3,maxErrorScore:3}}}),requestHandler:this._requestHandler.bind(this),errorHandler:this._failedRequestHandler.bind(this),failedRequestHandler:this._failedRequestHandler.bind(this)};this.crawler=this._createCrawler(s),this.crawler.run().then(()=>{this.isCrawlerReady=!0}).catch(t=>{this.isCrawlerReady=!1,console.error("Crawler background error:",t)})}async cleanup(){await(this._cleanup?.()),await this._commonCleanup();const t=this.ctx;t&&t.internal?.engine===this&&(t.internal.engine=void 0),this.ctx=void 0,this.opts=void 0}async _executePendingActions(t){await new Promise(e=>{const i=async({action:e,resolve:i,reject:s})=>{try{if("dispose"===e.type)return this.actionEmitter.emit("dispose"),void i();i(await this.executeAction(t,e))}catch(t){s(t)}};this.actionEmitter.on("dispatch",i),this.actionEmitter.once("dispose",()=>{this.actionEmitter.removeListener("dispatch",i),e()})})}async _sharedRequestHandler(t){try{const{request:e}=t;this.isPageActive=!0;const i=this.pendingRequests.get(e.userData.requestId);if(i){const s=await this.buildResponse(t),n=!s.statusCode||s.statusCode>=400;if(this.ctx?.throwHttpErrors&&n){const t=new S.CommonError(`Request for ${s.finalUrl} failed with status ${s.statusCode||"N/A"}`,"request",s.statusCode);i.reject(t)}else this.lastResponse=s,i.resolve(s);this.pendingRequests.delete(e.userData.requestId)}await this._executePendingActions(t)}finally{this.isPageActive=!1,this.navigationLock.release()}}async _sharedFailedRequestHandler(t,e){const{request:i}=t,s=this.pendingRequests.get(i.userData.requestId);if(s&&e&&this.ctx?.throwHttpErrors){this.pendingRequests.delete(i.userData.requestId);const t=e.response,n=t?.statusCode||500,r=t?.url?t.url:i.url,o=new S.CommonError(`Request${r?" 
for "+r:""} failed: ${e.message}`,"request",n);s.reject(o)}return this._sharedRequestHandler(t)}async dispatchAction(t){if(!this.isPageActive)throw new Error("No active page. Call goto() before performing actions.");return new Promise((e,i)=>{this.actionEmitter.emit("dispatch",{action:t,resolve:e,reject:i})})}async _requestHandler(t){await this._sharedRequestHandler(t)}async _failedRequestHandler(t,e){await this._sharedFailedRequestHandler(t,e)}async _commonCleanup(){if(this.isPageActive&&await this.dispatchAction({type:"dispose"}).catch(()=>{}),this.pendingRequests.size>0)for(const[,t]of this.pendingRequests)t.reject(new Error("Cleanup:Request cancelled"));if(this.actionEmitter.removeAllListeners(),this.crawler){try{await(this.crawler.teardown?.())}catch(t){console.error("ccrawler teardown error:",t)}this.crawler=void 0}this.isCrawlerReady=void 0,this.requestQueue&&(await this.requestQueue.drop(),this.requestQueue=void 0),this.pendingRequests.clear()}async blockResources(t,e){return e&&this.blockedTypes.clear(),t.forEach(t=>this.blockedTypes.add(t)),t.length}getContent(){return this.lastResponse?Promise.resolve(this.lastResponse):Promise.reject(new Error("No content fetched yet. 
Call goto() first."))}async headers(t,e){if(void 0===t)return{...this.hdrs};if("string"==typeof t&&void 0===e)return this.hdrs[t.toLowerCase()]||"";if(null!==t&&"object"==typeof t){const i={};for(const[e,s]of Object.entries(t))i[e.toLowerCase()]=String(s);return this.hdrs=!0===e?i:{...this.hdrs,...i},!0}return"string"==typeof t&&("string"==typeof e?this.hdrs[t.toLowerCase()]=e:null===e&&delete this.hdrs[t.toLowerCase()],!0)}async cookies(t){return Array.isArray(t)?(this.jar=[...t],!0):null===t?(this.jar=[],!0):[...this.jar]}async dispose(){await this.cleanup()}};async function P(t,e){const i=function(t,e){if(!t||!e?.length)return null;const i=new URL(t);let s=e.find(t=>t.domain===i.hostname);s||(s=e.find(t=>i.hostname.endsWith(t.domain)));if(!s)return null;if(s.pathScope?.length){if(!s.pathScope.some(t=>i.pathname.startsWith(t)))return null}return s}(e?.url||t.url,t.sites),s=t.engine||i?.engine||"auto";let n=await A.create(t,{engine:s});return n||(n=await A.create(t,{engine:"http"})),n}A.registry=new Map;var U=class{constructor(t={}){this.options=t,this.closed=!1,this.id=q(),this.context=this.createContext(t)}async execute(t){await this.ensureEngine(t);const e=f.create(t);if(!e)throw new Error(`Unknown action: ${t.id||t.name}`);let i,s;this.context.internal.actionIndex=(this.context.internal.actionIndex||0)+1,this.context.currentAction={...t,index:this.context.internal.actionIndex,startedAt:Date.now()};try{return i=await e.execute(this.context,t),i}catch(t){throw s=t,s}finally{this.context.currentAction=void 0}}async executeAll(t){try{for(let e=0;e<t.length;e++){const i=t[e];await this.execute(i)}const e=await this.execute({id:"getContent"});return{result:e?.result,outputs:this.getOutputs()}}catch(t){throw t}}getOutputs(){return this.context.outputs}async dispose(){if(this.closed)return;const 
t=this.context.eventBus;t.emit("session:closing",{sessionId:this.id});try{await(this.context.internal.engine?.dispose())}finally{this.closed=!0}t.emit("session:closed",{sessionId:this.id})}async ensureEngine(t){if(this.closed)throw new Error("Session is closed");if(!this.context.internal.engine){const e=t?.params?.url??this.context.url;if(!await P(this.context,{url:e}))throw new Error("No engine found")}}createContext(t=this.options){const e=new m.EventEmitter;return(0,v.defaultsDeep)({...t,id:this.id,eventBus:e,outputs:{},internal:{},execute:async t=>this.execute(t),action:async function(t,e,i){return this.execute({name:t,params:e,...i})}},u)}},j=class{constructor(t={}){this.defaults=t}async createSession(t){const e={...this.defaults,...t||{}};return new U(e)}async fetch(t,e){"string"!=typeof t&&(t=(e=t).url);const i=await this.createSession(e);try{const s=e?.actions||[];t&&0!==s.findIndex(e=>"goto"===e.id&&e.params?.url===t)&&s.unshift({id:"goto",params:{url:t}});return await i.executeAll(s)}finally{await i.dispose()}}},F=require("crawlee"),O=c(require("cheerio")),T=require("@isdk/common-error"),_=class extends A{async buildResponse(t){const{request:e,response:i,body:s,$:n}=t,r=n?.html();let o="string"==typeof s?s:Buffer.isBuffer(s)?s.toString("utf-8"):String(s??"");return r&&r!==o&&(o=r),{url:e.url,finalUrl:e.loadedUrl||e.url,statusCode:i?.statusCode??200,statusText:i?.statusMessage,headers:i?.headers,body:s,html:o,text:o}}async _querySelectorAll(t,e){const{$:i,el:s}=t;return s.find(e).toArray().map(t=>({$:i,el:i(t)}))}async _extractValue(t,e){const{el:i}=e,{attribute:s,type:n="string"}=t;if(0===i.length)return null;let r="";if(r=s?i.attr(s)??null:"html"===n?i.html():i.text().trim(),null===r)return null;switch(n){case"number":return parseFloat(r.replace(/[^0-9.-]+/g,""))||null;case"boolean":const t=r.toLowerCase();return"true"===t||"1"===t;default:return r}}async executeAction(t,e){const{$:i}=t;switch(e.type){case"dispose":return;case"extract":if(!i)throw new 
T.CommonError(`Cheerio context not available for action: ${e.type}`,"extract");return this._extract(e.schema,{$:i,el:i.root()});case"click":{if(!i)throw new T.CommonError(`Cheerio context not available for action: ${e.type}`,"click");const s=e.selector,n=i(s).first();let r;if(0===n.length)try{r=new URL(s,t.request.loadedUrl||t.request.url).href}catch{throw new T.CommonError(`click: selector not found or invalid URL: ${s}`,"click")}else{if(!n.is("a")||!n.attr("href")){if(n.is('input[type="submit"], button[type="submit"], button, input')){const e=n.closest("form");if(e.length)return this.executeAction(t,{type:"submit",selector:e});throw new T.CommonError("click: submit-like element without form","click")}throw new T.CommonError(`click: unsupported element for http simulate. Selector: ${s}`,"click")}{const e=n.attr("href");r=new URL(e,t.request.loadedUrl||t.request.url).href}}const o=await t.sendRequest({url:r});return void this._updateStateAfterNavigation(t,o)}case"fill":{if(!i)throw new T.CommonError(`Cheerio context not available for action: ${e.type}`),"fill";const s=i(e.selector).first();if(0===s.length)throw new T.CommonError(`fill: selector not found: ${e.selector}`);if(!s.is("input, textarea, select"))throw new T.CommonError(`fill: not a form field: ${e.selector}`,"fill");{s.val(e.value);const i=this.buildResponse(t);this.lastResponse=i}return}case"waitFor":return void(e.options?.ms&&await new Promise(t=>setTimeout(t,e.options.ms)));case"pause":const s=this.ctx?.onPause;return void(s?(console.info(e.message||"Execution paused for manual intervention."),await s({message:e.message}),console.info("Resuming execution...")):console.warn("[PauseAction] was called, but no `onPause` handler was provided in fetchWeb options. 
Skipped."));case"submit":{if(!i)throw new T.CommonError(`Cheerio context not available for action: ${e.type}`,"submit");const s="string"==typeof e.selector?i(e.selector).first():null!=e.selector?e.selector:i("form").first();if(0===s.length)throw new T.NotFoundError(e.selector,"submit");const n=s.attr("action")||t.request.loadedUrl||t.request.url,r=(s.attr("method")||"GET").toUpperCase(),o=new URL(n,t.request.loadedUrl||t.request.url).href,a={};let c;if(s.find("input, select, textarea").each((t,e)=>{const s=i(e),n=s.attr("name");if(!n)return;const r=s.val();null!=r&&(a[n]=String(r))}),"GET"===r){const e=new URL(o);Object.entries(a).forEach(([t,i])=>e.searchParams.set(t,i)),c=await t.sendRequest({url:e.href,method:"GET"})}else{let i;const n={};"application/json"===(e.options?.enctype||s.attr("enctype")||"application/x-www-form-urlencoded")?(i=JSON.stringify(a),n["Content-Type"]="application/json"):(i=new URLSearchParams(a).toString(),n["Content-Type"]="application/x-www-form-urlencoded"),c=await t.sendRequest({url:o,method:"POST",body:i,headers:n})}return void this._updateStateAfterNavigation(t,c)}case"getContent":return this.buildResponse(t);default:throw new T.CommonError(`Unknown action type: ${e.type}`,"CheerioFetchEngine.executeAction",T.ErrorCode.NotSupported)}}_updateStateAfterNavigation(t,e){const i=e.response||e,{body:s,headers:n,statusCode:r,statusMessage:o}=i,{url:a,loadedUrl:c}=e,l="string"==typeof s?s:Buffer.isBuffer(s)?s.toString("utf-8"):String(s??"");n&&n["content-type"]?.includes("html")&&(t.$=O.load(l)),this.lastResponse={url:a,finalUrl:c||a,statusCode:r,statusText:o,headers:n||{},body:s,html:l,text:l}}_createCrawler(t){return new F.CheerioCrawler(t)}_getSpecificCrawlerOptions(t){const e=this.opts?.proxy?"string"==typeof this.opts.proxy?[this.opts.proxy]:this.opts.proxy:void 0,i=e?.length?new F.ProxyConfiguration({proxyUrls:e}):void 
0;return{additionalMimeTypes:["text/plain"],maxRequestRetries:1,requestHandlerTimeoutSecs:Math.max(5,Math.floor((this.opts?.timeoutMs||3e4)/1e3)),proxyConfiguration:i,preNavigationHooks:[(e,i)=>{i.throwHttpErrors=t.throwHttpErrors,this.opts?.timeoutMs&&(i.timeout={request:this.opts.timeoutMs})}]}}async goto(t,e){this.isPageActive&&this.dispatchAction({type:"dispose"}).catch(()=>{});const i="req-"+ ++this.requestCounter,s=new Promise((t,s)=>{const n=e?.timeoutMs||this.opts?.timeoutMs||3e4,r=setTimeout(()=>{this.pendingRequests.delete(i),this.navigationLock.release(),s(new T.CommonError(`goto timed out after ${n}ms.`,"gotoTimeout",T.ErrorCode.RequestTimeout))},n);this.pendingRequests.set(i,{resolve:e=>{clearTimeout(r),t(e)},reject:t=>{clearTimeout(r),s(t)}})});return this.requestQueue.addRequest({...e,url:t,headers:{...this.hdrs,...e?.headers},userData:{requestId:i},uniqueKey:`${t}-${i}`}).catch(t=>{const e=this.pendingRequests.get(i);e&&(this.pendingRequests.delete(i),this.navigationLock.release(),e.reject(t))}),await this.navigationLock,this.navigationLock=R(),s}};_.id="cheerio",_.mode="http",A.register(_);var N=require("crawlee"),M=require("playwright"),B=require("camoufox-js"),G=require("@isdk/common-error"),D=class extends A{async buildResponse(t){const{page:e,response:i,request:s}=t;if(!e||e.isClosed())return{url:s.url,finalUrl:s.loadedUrl||s.url,statusCode:i?.status(),statusText:i?.statusText(),headers:await(i?.allHeaders())||{},body:"",html:"",text:""};const n=await e.content(),r=await e.textContent("body");return{url:e.url(),finalUrl:e.url(),statusCode:i?.status(),statusText:i?.statusText(),headers:await(i?.allHeaders())||{},body:n,html:n,text:r||""}}async _querySelectorAll(t,e){return t.locator(e).all()}async _extractValue(t,e){const{attribute:i,type:s="string"}=t;if(0===await e.count())return null;let n="";if(n=i?await e.getAttribute(i):"html"===s?await e.innerHTML():await e.textContent(),null===n)return null;switch(n=n.trim(),s){case"number":return 
parseFloat(n.replace(/[^0-9.-]+/g,""))||null;case"boolean":const t=n.toLowerCase();return"true"===t||"1"===t;default:return n}}async executeAction(t,e){const{page:i}=t,s=this.opts?.timeoutMs||3e4;switch(e.type){case"navigate":{const s=await i.goto(e.url,{waitUntil:e.opts?.waitUntil||"domcontentloaded",timeout:this.opts?.timeoutMs||3e4});s&&(t={...t,response:s});const n=await this.buildResponse(t);return this.lastResponse=n,n}case"extract":return this._extract(e.schema,i.locator("body"));case"click":{await i.click(e.selector,{timeout:s}),await i.waitForLoadState("networkidle",{timeout:s});const n=await this.buildResponse(t);return void(this.lastResponse=n)}case"fill":await i.fill(e.selector,e.value,{timeout:s});const n=await this.buildResponse(t);return void(this.lastResponse=n);case"waitFor":return e.options?.selector&&await i.waitForSelector(e.options.selector,{timeout:s}),e.options?.networkIdle&&await i.waitForLoadState("networkidle",{timeout:s}),void(e.options?.ms&&await i.waitForTimeout(e.options.ms));case"submit":{const n=e.selector||"form",r=i.locator(n).first();if(0===await r.count())throw new G.NotFoundError(n,"submit");if("application/json"===(e.options?.enctype||"application/x-www-form-urlencoded")){const t=await r.elementHandle();if(!t)throw new G.CommonError(`submit: could not get form handle for ${n}`,"submit");const e=await t.evaluate(async t=>{const e=new FormData(t),i={};e.forEach((t,e)=>{i[e]=t.toString()});const s=await fetch(t.action,{method:t.method,headers:{"Content-Type":"application/json"},body:JSON.stringify(i)}),n=await s.text();return{status:s.status,statusText:s.statusText,headers:Object.fromEntries(s.headers.entries()),body:n,html:n,text:n,url:t.action,finalUrl:s.url}});return await t.dispose(),await i.setContent(e.html),void(this.lastResponse=e)}return await r.evaluate(t=>t.submit()),await i.waitForLoadState("networkidle",{timeout:s}),void(this.lastResponse=await this.buildResponse(t))}case"pause":{const t=this.ctx?.onPause;return 
void(t?(console.info(e.message||"Execution paused for manual intervention."),await t({message:e.message}),console.info("Resuming execution...")):console.warn("[PauseAction] was called, but no `onPause` handler was provided in fetchWeb options. Skipped."))}case"getContent":return this.buildResponse(t);default:throw new G.CommonError(`Unknown action type: ${e.type}`,"PlaywrightFetchEngine.executeAction",G.ErrorCode.NotSupported)}}_createCrawler(t){return new N.PlaywrightCrawler(t)}async _getSpecificCrawlerOptions(t){const e=t.browser?.headless??!0,i={maxRequestRetries:t.retries||3,headless:e,preNavigationHooks:[async({page:e,request:i},s)=>{s.throwHttpErrors=t.throwHttpErrors,this.jar.length>0&&await e.context().addCookies(this.jar.map(t=>({...t,url:i.url,domain:t.domain||new URL(i.url).hostname})));const n=this.blockedTypes;n.size>0&&await e.route("**/*",t=>{n.has(t.request().resourceType())?t.abort():t.continue()})}]};if(this.opts?.antibot){console.log("[DEBUG] antibot enabled, configuring camoufox..."),i.browserPoolOptions={useFingerprints:!1},console.log("[DEBUG] Calling launchOptions...");const t=await(0,B.launchOptions)({headless:e});console.log("[DEBUG] launchOptions returned."),i.launchContext={launcher:M.firefox,launchOptions:t},i.postNavigationHooks=[async({page:t,handleCloudflareChallenge:e})=>{console.log(`[DEBUG] In postNavigationHook for ${t.url()}. 
Calling handleCloudflareChallenge...`),await e(),console.log("[DEBUG] handleCloudflareChallenge returned.")}],console.log("[DEBUG] camoufox configuration complete.")}return i}async goto(t,e){if(this.isPageActive)return this.dispatchAction({type:"navigate",url:t,opts:e});if(!this.requestQueue)throw new G.CommonError("RequestQueue not initialized","goto");const i="req-"+ ++this.requestCounter,s=new Promise((t,e)=>{this.pendingRequests.set(i,{resolve:t,reject:e})});return await this.requestQueue.addRequest({url:t,headers:this.hdrs,userData:{requestId:i,waitUntil:e?.waitUntil||"domcontentloaded"},uniqueKey:`${t}-${i}`}),s}};D.id="playwright",D.mode="browser",A.register(D);var H=class extends f{async onExecute(t,e){const{selector:i,...s}=e?.params||{};if(!i)throw new Error("Selector is required for click action");await this.delegateToEngine(t,"click",i,s)}};H.id="click",H.returnType="none",H.capabilities={http:"simulate",browser:"native"},f.register(H);var L=class extends f{async onExecute(t,e){const{selector:i,value:s,...n}=e?.params||{};if(!i)throw new Error("Selector is required for fill action");if(void 0===s)throw new Error("Value is required for fill action");await this.delegateToEngine(t,"fill",i,s,n)}};L.id="fill",L.returnType="none",L.capabilities={http:"simulate",browser:"native"},f.register(L);var z=class extends f{async onExecute(t,e){return await this.delegateToEngine(t,"getContent",e?.params)}};z.id="getContent",z.returnType="response",z.capabilities={http:"native",browser:"native"},f.register(z);var I=class extends f{async onExecute(t,e,i){const s=e?.params,n=s?.url||t.url;if(!n)throw new Error("URL is required for goto action");const r=t.internal.engine;if(!r)throw new Error("No engine available");t.url=n;return await r.goto(n,s)}};I.id="goto",I.returnType="response",I.capabilities={http:"native",browser:"native"},f.register(I);var W=class extends f{async onExecute(t,e){const{selector:i,...s}=e?.params||{};await 
this.delegateToEngine(t,"submit",i,s)}};W.id="submit",W.returnType="none",W.capabilities={http:"simulate",browser:"native"},f.register(W);var J=class extends f{async onExecute(t,e){const i=t.internal.engine;if(!i)throw new Error("No engine available");await i.waitFor(e?.params)}};J.id="waitFor",J.returnType="none",J.capabilities={http:"native",browser:"native"},f.register(J);var V=class extends f{async onExecute(t,e){const i=e?.params;if(!i)throw new Error("Schema is required for extract action");return this.delegateToEngine(t,"extract",i)}};V.id="extract",V.returnType="any",V.capabilities={http:"native",browser:"native"},f.register(V);var K=class extends f{async onExecute(t,e){const{selector:i,message:s,attribute:n}=e?.params||{},r=t.internal.engine;if("browser"===r?.mode){if(i){if(!await(r?.extract({selector:i,attribute:n})))return}r&&"pause"in r?await r.pause(s):console.warn("[PauseAction] was called, but the current engine does not support `pause`. Skipped.")}else console.warn("[PauseAction] can only run in browser engine. Skipped.")}};async function Q(t,e){return(new j).fetch(t,e)}K.id="pause",K.capabilities={http:"native",browser:"native"},K.returnType="none",f.register(K);
+ "use strict";var t,e=Object.create,i=Object.defineProperty,s=Object.getOwnPropertyDescriptor,r=Object.getOwnPropertyNames,n=Object.getPrototypeOf,o=Object.prototype.hasOwnProperty,a=(t,e,n,a)=>{if(e&&"object"==typeof e||"function"==typeof e)for(let c of r(e))o.call(t,c)||c===n||i(t,c,{get:()=>e[c],enumerable:!(a=s(e,c))||a.enumerable});return t},c=(t,s,r)=>(r=null!=t?e(n(t)):{},a(!s&&t&&t.__esModule?r:i(r,"default",{value:t,enumerable:!0}),t)),l={};((t,e)=>{for(var s in e)i(t,s,{get:e[s],enumerable:!0})})(l,{CheerioFetchEngine:()=>_,ClickAction:()=>G,DefaultFetcherProperties:()=>u,ExtractAction:()=>V,FetchAction:()=>f,FetchActionResultStatus:()=>h,FetchEngine:()=>A,FetchSession:()=>j,FillAction:()=>z,GetContentAction:()=>D,GotoAction:()=>W,PauseAction:()=>K,PlaywrightFetchEngine:()=>B,SubmitAction:()=>I,WaitForAction:()=>J,WebFetcher:()=>F,fetchWeb:()=>Q}),module.exports=(t=l,a(i({},"__esModule",{value:!0}),t));var u={enableSmart:!0,useSiteRegistry:!0,antibot:!1,headers:{},cookies:[],reuseCookies:!0,proxy:[],blockResources:[],ignoreSslErrors:!0,browser:{engine:"playwright",headless:!0,waitUntil:"domcontentloaded"},http:{method:"GET"},timeoutMs:6e4,maxConcurrency:1,maxRequestsPerMinute:1e3,delayBetweenRequestsMs:0,retries:0,sites:[]},h=(t=>(t[t.Failed=0]="Failed",t[t.Success=1]="Success",t[t.Skipped=2]="Skipped",t))(h||{}),w=class t{static register(t){const e=t.id;if(!e)throw new Error("FetchAction.register: actionClass.id is required");this.registry.set(e,t)}static get(t){return this.registry.get(t)}static create(t){const e="string"==typeof t?t:t.id||t.name;if(!e)throw new Error("Action must have id or name");const i=this.registry.get(e);return i?new i:void 0}static has(t){return this.registry.has(t)}static list(){return Array.from(this.registry.keys())}static getCapability(t){return this.capabilities[t]??"noop"}getCapability(t){return this.constructor.getCapability(t)}get id(){return this.constructor.id}get returnType(){return this.constructor.returnType}get 
capabilities(){return this.constructor.capabilities}async delegateToEngine(t,e,...i){const s=t.internal.engine;if(!s)throw new Error("No engine available");if("function"!=typeof s[e])throw new Error(`Engine does not have a method named '${String(e)}'`);return await s[e](...i)}installCollectors(e,i){const s=i?.collectors;if(!s?.length)return;const r=[],n=new Set;for(const i of s){const s=d(i.activateOn),o=d(i.collectOn),a=d(i.deactivateOn),c=!(i.background??!0),l=t.create(i);if(!l)continue;let u=!1,h=!1,w=0;const f=async t=>{if(!u&&!h){u=!0;try{await(l.onBeforeExec?.(e,i))}catch(t){e.eventBus.emit("collector:error",{action:this.id,collector:l.id,phase:"before",error:t})}}},m=async(t,s)=>{if(!h){u||await f(s);try{const r=Promise.resolve(l.onExecute?.(e,i,s)).then(s=>{var r,n;if(i.storeAs){((r=e.outputs)[n=i.storeAs]||(r[n]=[])).push(s)}return e.eventBus.emit("collector:result",{action:this.id,collector:i.id||i.name,event:t,result:s}),s}).catch(s=>{e.eventBus.emit("collector:error",{action:this.id,collector:i.id||i.name,event:t,phase:"exec",error:s})}).finally(()=>{w++});c&&(n.add(r),r.finally(()=>n.delete(r)))}catch(i){e.eventBus.emit("collector:error",{action:this.id,collector:l.id,event:t,phase:"exec",error:i})}}},g=async()=>{if(!h){0===w&&m("collector:after"),h=!0;try{await(l.onAfterExec?.(e,i))}catch(t){e.eventBus.emit("collector:error",{action:this.id,collector:i.id||i.name,phase:"after",error:t})}finally{e.eventBus.emit("collector:end",{action:this.id,collector:i.id||i.name}),v.forEach(t=>t())}}},b=p(e,s,f),v=y(e,o,m),x=p(e,a,g);if(r.push(...b,...v,...x),!s.length&&!o.length&&!a.length){const t=()=>{g()};e.eventBus.once(`action:${this.id}.end`,t),r.push(()=>e.eventBus.off("fetcher:action:end",t))}}return r.length||n.size>0?{cleanup:()=>r.forEach(t=>t()),awaitExecPendings:async()=>{n.size>0&&await Promise.allSettled(Array.from(n))}}:void 0}async beforeExec(t,e){t.internal.actionStack||(t.internal.actionStack=[]);const 
i=t.internal.actionStack,s=i.length,r=i.length>0?i[i.length-1].id:void 0,n={...e,id:this.id,depth:s,parent:r};i.push(n),t.currentAction=n;const o={action:this,context:t,options:e,depth:s,stack:[...i]};t.eventBus.emit(`action:${this.id}.start`,o),t.eventBus.emit("action:start",o),await(this.onBeforeExec?.(t,e));return{entry:o,collectors:this.installCollectors(t,e)}}async afterExec(t,e,i,s){const r=t.internal.actionStack,n=r.length-1,o=s?.collectors;try{await(o?.awaitExecPendings()),t.lastResult=i,"response"!==i?.returnType||i.error||(t.lastResponse=i.result),e?.storeAs&&(t.outputs[e.storeAs]=i?.result),i?.error&&(t.currentAction.error=i.error),await(this.onAfterExec?.(t,e));const s={action:this,context:t,options:e,result:i,depth:n,stack:[...r]};i?.error&&(s.error=i.error);try{t.eventBus.emit(`action:${this.id}.end`,s)}catch(t){}try{t.eventBus.emit("action:end",s)}catch(t){}}finally{try{o?.cleanup()}finally{r.pop();const e=r.length;t.currentAction=e>0?r[e-1]:void 0}}}async execute(t,e){const i=await this.beforeExec(t,e);let s;try{const i=e?.failOnError??!0;return t.throwHttpErrors=i,s=await this.onExecute(t,e),s&&s.returnType||(s={status:1,returnType:this.returnType??"any",result:s}),s}catch(i){if(s={status:0,error:i,meta:{id:this.id,engineType:t.engine,capability:this.getCapability(t.engine)}},e?.failOnError)throw i;return s}finally{await this.afterExec(t,e,s,i)}}};w.registry=new Map,w.returnType="any",w.capabilities={http:"noop",browser:"noop"};var f=w;function d(t){return t?Array.isArray(t)?t:[t]:[]}function p(t,e,i){const s=[];for(const r of e)if("string"==typeof r||r instanceof RegExp){const e=(...t)=>{i(t[0])};t.eventBus.once(r,e),s.push(()=>t.eventBus.off(r,e))}return s}function y(t,e,i){const s=[];for(const r of e)if("string"==typeof r||r instanceof RegExp){const e=t=>i(r,t);t.eventBus.on(r,e),s.push(()=>t.eventBus.off(r,e))}return s}var m=require("events-ex");var 
g,b,v=require("lodash-es"),x=c(require("crypto"),1),q=t=>((t=>{!g||g.length<t?(g=Buffer.allocUnsafe(128*t),x.default.randomFillSync(g),b=0):b+t>g.length&&(x.default.randomFillSync(g),b=0),b+=t})(t|=0),g.subarray(b-t,b)),E=((t,e=21)=>((t,e,i)=>{let s=(2<<31-Math.clz32(t.length-1|1))-1,r=Math.ceil(1.6*s*e/t.length);return(n=e)=>{let o="";for(;;){let e=i(r),a=r;for(;a--;)if(o+=t[e[a]&s]||"",o.length===n)return o}}})(t,e,q))("0123456789abcdefghijklmnopqrstuvwxyz",12);var C=require("lodash-es"),S=require("events-ex"),k=require("@isdk/common-error"),$=require("crawlee");function R(){let t=()=>{};const e=new Promise(e=>{t=e});return e.release=t,e}$.Configuration.getGlobalConfig().set("persistStorage",!1);var A=class{constructor(){this.hdrs={},this.jar=[],this.pendingRequests=new Map,this.requestCounter=0,this.actionEmitter=new S.EventEmitter,this.isPageActive=!1,this.navigationLock=function(){const t=R();return t.release(),t}(),this.blockedTypes=new Set}static register(t){const e=t.id;if(!e)throw new Error("Engine must define static id");if(this.registry.has(e))throw new Error(`Engine id duplicated: ${e}`);this.registry.set(e,t)}static get(t){return this.registry.get(t)}static getByMode(t){for(const[e,i]of this.registry.entries())if(i.mode===t)return i}static async create(t,e){const i=(0,C.defaultsDeep)(e,t,u),s=i.engine??t.engine,r=s?this.get(s)??this.getByMode(s):null;if(r){const e=new r;return await e.initialize(t,i),e}}async _extract(t,e){const i=t.type;if(!e)return"array"===i?[]:null;if("object"===i){const{selector:i,properties:s}=t;let r=e;if(i){const t=await this._querySelectorAll(e,i);r=t.length>0?t[0]:null}if(!r)return null;const n={};for(const t in s)n[t]=await this._extract(s[t],r);return n}if("array"===i){const{selector:i,items:s}=t,r=i?await this._querySelectorAll(e,i):[e],n=[];for(const t of r)n.push(await this._extract(s,t));return n}const{selector:s}=t;let r=e;if(s){const t=await this._querySelectorAll(e,s);r=t.length>0?t[0]:null}return 
r?this._extractValue(t,r):null}waitFor(t){return this.dispatchAction({type:"waitFor",options:t})}click(t){return this.dispatchAction({type:"click",selector:t})}fill(t,e){return this.dispatchAction({type:"fill",selector:t,value:e})}submit(t,e){return this.dispatchAction({type:"submit",selector:t,options:e})}pause(t){return this.dispatchAction({type:"pause",message:t})}extract(t){const e=this._normalizeSchema(t);return this.dispatchAction({type:"extract",schema:e})}_normalizeSchema(t){const e=JSON.parse(JSON.stringify(t));if(e.properties)for(const t in e.properties)e.properties[t]=this._normalizeSchema(e.properties[t]);if(e.items&&(e.items=this._normalizeSchema(e.items)),"array"===e.type&&(e.attribute&&!e.items&&(e.items={attribute:e.attribute},delete e.attribute),e.items||(e.items={type:"string"})),e.selector&&(e.has||e.exclude)){const{selector:t,has:i,exclude:s}=e,r=t.split(",").map(t=>{let e=t.trim();return i&&(e=`${e}:has(${i})`),s&&(e=`${e}:not(${s})`),e}).join(", ");e.selector=r,delete e.has,delete e.exclude}return e}get id(){return this.constructor.id}get mode(){return this.constructor.mode}get context(){return this.ctx}async initialize(t,e){if(this.ctx)return;(0,C.merge)(t,e),this.ctx=t,this.opts=t,this.hdrs=function(t){const e={};if(t&&"object"==typeof t)for(const[i,s]of Object.entries(t))e[i.toLowerCase()]=s;return e}(t.headers),this.jar=[...t.cookies??[]],t.internal||(t.internal={}),t.internal.engine=this,t.engine=this.mode,this.requestQueue=await $.RequestQueue.open();const i=await 
this._getSpecificCrawlerOptions(t),s={...(0,C.defaultsDeep)(i,{requestQueue:this.requestQueue,maxConcurrency:1,minConcurrency:1,useSessionPool:!0,persistCookiesPerSession:!0,sessionPoolOptions:{maxPoolSize:1,persistenceOptions:{enable:!1},sessionOptions:{maxUsageCount:1e3,maxErrorScore:3}}}),requestHandler:this._requestHandler.bind(this),errorHandler:this._failedRequestHandler.bind(this),failedRequestHandler:this._failedRequestHandler.bind(this)};this.crawler=this._createCrawler(s),this.crawler.run().then(()=>{this.isCrawlerReady=!0}).catch(t=>{this.isCrawlerReady=!1,console.error("Crawler background error:",t)})}async cleanup(){await(this._cleanup?.()),await this._commonCleanup();const t=this.ctx;t&&t.internal?.engine===this&&(t.internal.engine=void 0),this.ctx=void 0,this.opts=void 0}async _executePendingActions(t){await new Promise(e=>{const i=async({action:e,resolve:i,reject:s})=>{try{if("dispose"===e.type)return this.actionEmitter.emit("dispose"),void i();i(await this.executeAction(t,e))}catch(t){s(t)}};this.actionEmitter.on("dispatch",i),this.actionEmitter.once("dispose",()=>{this.actionEmitter.removeListener("dispatch",i),e()})})}async _sharedRequestHandler(t){try{const{request:e}=t;this.isPageActive=!0;const i=this.pendingRequests.get(e.userData.requestId);if(i){const s=await this.buildResponse(t),r=!s.statusCode||s.statusCode>=400;if(this.ctx?.throwHttpErrors&&r){const t=new k.CommonError(`Request for ${s.finalUrl} failed with status ${s.statusCode||"N/A"}`,"request",s.statusCode);i.reject(t)}else this.lastResponse=s,i.resolve(s);this.pendingRequests.delete(e.userData.requestId)}await this._executePendingActions(t)}finally{this.isPageActive=!1,this.navigationLock.release()}}async _sharedFailedRequestHandler(t,e){const{request:i}=t,s=this.pendingRequests.get(i.userData.requestId);if(s&&e&&this.ctx?.throwHttpErrors){this.pendingRequests.delete(i.userData.requestId);const t=e.response,r=t?.statusCode||500,n=t?.url?t.url:i.url,o=new k.CommonError(`Request${n?" 
for "+n:""} failed: ${e.message}`,"request",r);s.reject(o)}return this._sharedRequestHandler(t)}async dispatchAction(t){if(!this.isPageActive)throw new Error("No active page. Call goto() before performing actions.");return new Promise((e,i)=>{this.actionEmitter.emit("dispatch",{action:t,resolve:e,reject:i})})}async _requestHandler(t){await this._sharedRequestHandler(t)}async _failedRequestHandler(t,e){await this._sharedFailedRequestHandler(t,e)}async _commonCleanup(){if(this.isPageActive&&await this.dispatchAction({type:"dispose"}).catch(()=>{}),this.pendingRequests.size>0)for(const[,t]of this.pendingRequests)t.reject(new Error("Cleanup:Request cancelled"));if(this.actionEmitter.removeAllListeners(),this.crawler){try{await(this.crawler.teardown?.())}catch(t){console.error("ccrawler teardown error:",t)}this.crawler=void 0}this.isCrawlerReady=void 0,this.requestQueue&&(await this.requestQueue.drop(),this.requestQueue=void 0),this.pendingRequests.clear()}async blockResources(t,e){return e&&this.blockedTypes.clear(),t.forEach(t=>this.blockedTypes.add(t)),t.length}getContent(){return this.lastResponse?Promise.resolve(this.lastResponse):Promise.reject(new Error("No content fetched yet. 
Call goto() first."))}async headers(t,e){if(void 0===t)return{...this.hdrs};if("string"==typeof t&&void 0===e)return this.hdrs[t.toLowerCase()]||"";if(null!==t&&"object"==typeof t){const i={};for(const[e,s]of Object.entries(t))i[e.toLowerCase()]=String(s);return this.hdrs=!0===e?i:{...this.hdrs,...i},!0}return"string"==typeof t&&("string"==typeof e?this.hdrs[t.toLowerCase()]=e:null===e&&delete this.hdrs[t.toLowerCase()],!0)}async cookies(t){return Array.isArray(t)?(this.jar=[...t],!0):null===t?(this.jar=[],!0):[...this.jar]}async dispose(){await this.cleanup()}};async function P(t,e){const i=function(t,e){if(!t||!e?.length)return null;const i=new URL(t);let s=e.find(t=>t.domain===i.hostname);s||(s=e.find(t=>i.hostname.endsWith(t.domain)));if(!s)return null;if(s.pathScope?.length){if(!s.pathScope.some(t=>i.pathname.startsWith(t)))return null}return s}(e?.url||t.url,t.sites),s=t.engine||i?.engine||"auto";let r=await A.create(t,{engine:s});return r||(r=await A.create(t,{engine:"http"})),r}A.registry=new Map;var j=class{constructor(t={}){this.options=t,this.closed=!1,this.id=E(),this.context=this.createContext(t)}async execute(t){await this.ensureEngine(t);const e=f.create(t);if(!e)throw new Error(`Unknown action: ${t.id||t.name}`);let i,s;this.context.internal.actionIndex=(this.context.internal.actionIndex||0)+1,this.context.currentAction={...t,index:this.context.internal.actionIndex,startedAt:Date.now()};try{return i=await e.execute(this.context,t),i}catch(t){throw s=t,s}finally{this.context.currentAction=void 0}}async executeAll(t){try{for(let e=0;e<t.length;e++){const i=t[e];await this.execute(i)}const e=await this.execute({id:"getContent"});return{result:e?.result,outputs:this.getOutputs()}}catch(t){throw t}}getOutputs(){return this.context.outputs}async dispose(){if(this.closed)return;const 
t=this.context.eventBus;t.emit("session:closing",{sessionId:this.id});try{await(this.context.internal.engine?.dispose())}finally{this.closed=!0}t.emit("session:closed",{sessionId:this.id})}async ensureEngine(t){if(this.closed)throw new Error("Session is closed");if(!this.context.internal.engine){const e=t?.params?.url??this.context.url;if(!await P(this.context,{url:e}))throw new Error("No engine found")}}createContext(t=this.options){const e=new m.EventEmitter;return(0,v.defaultsDeep)({...t,id:this.id,eventBus:e,outputs:{},internal:{},execute:async t=>this.execute(t),action:async function(t,e,i){return this.execute({name:t,params:e,...i})}},u)}},F=class{constructor(t={}){this.defaults=t}async createSession(t){const e={...this.defaults,...t||{}};return new j(e)}async fetch(t,e){"string"!=typeof t&&(t=(e=t).url);const i=await this.createSession(e);try{const s=e?.actions||[];t&&0!==s.findIndex(e=>"goto"===e.id&&e.params?.url===t)&&s.unshift({id:"goto",params:{url:t}});return await i.executeAll(s)}finally{await i.dispose()}}},U=require("crawlee"),O=c(require("cheerio")),T=require("@isdk/common-error"),_=class extends A{async buildResponse(t){const{request:e,response:i,body:s,$:r}=t,n=r?.html();let o="string"==typeof s?s:Buffer.isBuffer(s)?s.toString("utf-8"):String(s??"");return n&&n!==o&&(o=n),{url:e.url,finalUrl:e.loadedUrl||e.url,statusCode:i?.statusCode??200,statusText:i?.statusMessage,headers:i?.headers,body:s,html:o,text:o}}async _querySelectorAll(t,e){const{$:i,el:s}=t;return s.find(e).toArray().map(t=>({$:i,el:i(t)}))}async _extractValue(t,e){const{el:i}=e,{attribute:s,type:r="string"}=t;if(0===i.length)return null;let n="";if(n=s?i.attr(s)??null:"html"===r?i.html():i.text().trim(),null===n)return null;switch(r){case"number":return parseFloat(n.replace(/[^0-9.-]+/g,""))||null;case"boolean":const t=n.toLowerCase();return"true"===t||"1"===t;default:return n}}async executeAction(t,e){const{$:i}=t;switch(e.type){case"dispose":return;case"extract":if(!i)throw new 
T.CommonError(`Cheerio context not available for action: ${e.type}`,"extract");return this._extract(e.schema,{$:i,el:i.root()});case"click":{if(!i)throw new T.CommonError(`Cheerio context not available for action: ${e.type}`,"click");const s=e.selector,r=i(s).first();let n;if(0===r.length)try{n=new URL(s,t.request.loadedUrl||t.request.url).href}catch{throw new T.CommonError(`click: selector not found or invalid URL: ${s}`,"click")}else{if(!r.is("a")||!r.attr("href")){if(r.is('input[type="submit"], button[type="submit"], button, input')){const e=r.closest("form");if(e.length)return this.executeAction(t,{type:"submit",selector:e});throw new T.CommonError("click: submit-like element without form","click")}throw new T.CommonError(`click: unsupported element for http simulate. Selector: ${s}`,"click")}{const e=r.attr("href");n=new URL(e,t.request.loadedUrl||t.request.url).href}}const o=await t.sendRequest({url:n});return void this._updateStateAfterNavigation(t,o)}case"fill":{if(!i)throw new T.CommonError(`Cheerio context not available for action: ${e.type}`),"fill";const s=i(e.selector).first();if(0===s.length)throw new T.CommonError(`fill: selector not found: ${e.selector}`);if(!s.is("input, textarea, select"))throw new T.CommonError(`fill: not a form field: ${e.selector}`,"fill");{s.val(e.value);const i=this.buildResponse(t);this.lastResponse=i}return}case"waitFor":return void(e.options?.ms&&await new Promise(t=>setTimeout(t,e.options.ms)));case"pause":const s=this.ctx?.onPause;return void(s?(console.info(e.message||"Execution paused for manual intervention."),await s({message:e.message}),console.info("Resuming execution...")):console.warn("[PauseAction] was called, but no `onPause` handler was provided in fetchWeb options. 
Skipped."));case"submit":{if(!i)throw new T.CommonError(`Cheerio context not available for action: ${e.type}`,"submit");const s="string"==typeof e.selector?i(e.selector).first():null!=e.selector?e.selector:i("form").first();if(0===s.length)throw new T.NotFoundError(e.selector,"submit");const r=s.attr("action")||t.request.loadedUrl||t.request.url,n=(s.attr("method")||"GET").toUpperCase(),o=new URL(r,t.request.loadedUrl||t.request.url).href,a={};let c;if(s.find("input, select, textarea").each((t,e)=>{const s=i(e),r=s.attr("name");if(!r)return;const n=s.val();null!=n&&(a[r]=String(n))}),"GET"===n){const e=new URL(o);Object.entries(a).forEach(([t,i])=>e.searchParams.set(t,i)),c=await t.sendRequest({url:e.href,method:"GET"})}else{let i;const r={};"application/json"===(e.options?.enctype||s.attr("enctype")||"application/x-www-form-urlencoded")?(i=JSON.stringify(a),r["Content-Type"]="application/json"):(i=new URLSearchParams(a).toString(),r["Content-Type"]="application/x-www-form-urlencoded"),c=await t.sendRequest({url:o,method:"POST",body:i,headers:r})}return void this._updateStateAfterNavigation(t,c)}case"getContent":return this.buildResponse(t);default:throw new T.CommonError(`Unknown action type: ${e.type}`,"CheerioFetchEngine.executeAction",T.ErrorCode.NotSupported)}}_updateStateAfterNavigation(t,e){const i=e.response||e,{body:s,headers:r,statusCode:n,statusMessage:o}=i,{url:a,loadedUrl:c}=e,l="string"==typeof s?s:Buffer.isBuffer(s)?s.toString("utf-8"):String(s??"");r&&r["content-type"]?.includes("html")&&(t.$=O.load(l)),this.lastResponse={url:a,finalUrl:c||a,statusCode:n,statusText:o,headers:r||{},body:s,html:l,text:l}}_createCrawler(t){return new U.CheerioCrawler(t)}_getSpecificCrawlerOptions(t){const e=this.opts?.proxy?"string"==typeof this.opts.proxy?[this.opts.proxy]:this.opts.proxy:void 0,i=e?.length?new U.ProxyConfiguration({proxyUrls:e}):void 
0;return{additionalMimeTypes:["text/plain"],maxRequestRetries:1,requestHandlerTimeoutSecs:Math.max(5,Math.floor((this.opts?.timeoutMs||3e4)/1e3)),proxyConfiguration:i,preNavigationHooks:[(e,i)=>{i.throwHttpErrors=t.throwHttpErrors,this.opts?.timeoutMs&&(i.timeout={request:this.opts.timeoutMs})}]}}async goto(t,e){this.isPageActive&&this.dispatchAction({type:"dispose"}).catch(()=>{});const i="req-"+ ++this.requestCounter,s=new Promise((t,s)=>{const r=e?.timeoutMs||this.opts?.timeoutMs||3e4,n=setTimeout(()=>{this.pendingRequests.delete(i),this.navigationLock.release(),s(new T.CommonError(`goto timed out after ${r}ms.`,"gotoTimeout",T.ErrorCode.RequestTimeout))},r);this.pendingRequests.set(i,{resolve:e=>{clearTimeout(n),t(e)},reject:t=>{clearTimeout(n),s(t)}})});return this.requestQueue.addRequest({...e,url:t,headers:{...this.hdrs,...e?.headers},userData:{requestId:i},uniqueKey:`${t}-${i}`}).catch(t=>{const e=this.pendingRequests.get(i);e&&(this.pendingRequests.delete(i),this.navigationLock.release(),e.reject(t))}),await this.navigationLock,this.navigationLock=R(),s}};_.id="cheerio",_.mode="http",A.register(_);var M=require("crawlee"),N=require("playwright"),H=require("camoufox-js"),L=require("@isdk/common-error"),B=class extends A{async buildResponse(t){const{page:e,response:i,request:s}=t;if(!e||e.isClosed())return{url:s.url,finalUrl:s.loadedUrl||s.url,statusCode:i?.status(),statusText:i?.statusText(),headers:await(i?.allHeaders())||{},body:"",html:"",text:""};const r=await e.content(),n=await e.textContent("body");return{url:e.url(),finalUrl:e.url(),statusCode:i?.status(),statusText:i?.statusText(),headers:await(i?.allHeaders())||{},body:r,html:r,text:n||""}}async _querySelectorAll(t,e){return t.locator(e).all()}async _extractValue(t,e){const{attribute:i,type:s="string"}=t;if(0===await e.count())return null;let r="";if(r=i?await e.getAttribute(i):"html"===s?await e.innerHTML():await e.textContent(),null===r)return null;switch(r=r.trim(),s){case"number":return 
parseFloat(r.replace(/[^0-9.-]+/g,""))||null;case"boolean":const t=r.toLowerCase();return"true"===t||"1"===t;default:return r}}async executeAction(t,e){const{page:i}=t,s=this.opts?.timeoutMs||3e4;switch(e.type){case"navigate":{const s=await i.goto(e.url,{waitUntil:e.opts?.waitUntil||"domcontentloaded",timeout:this.opts?.timeoutMs||3e4});s&&(t={...t,response:s});const r=await this.buildResponse(t);return this.lastResponse=r,r}case"extract":return this._extract(e.schema,i.locator("body"));case"click":{await i.click(e.selector,{timeout:s}),await i.waitForLoadState("networkidle",{timeout:s});const r=await this.buildResponse(t);return void(this.lastResponse=r)}case"fill":await i.fill(e.selector,e.value,{timeout:s});const r=await this.buildResponse(t);return void(this.lastResponse=r);case"waitFor":return e.options?.selector&&await i.waitForSelector(e.options.selector,{timeout:s}),e.options?.networkIdle&&await i.waitForLoadState("networkidle",{timeout:s}),void(e.options?.ms&&await i.waitForTimeout(e.options.ms));case"submit":{const r=e.selector||"form",n=i.locator(r).first();if(0===await n.count())throw new L.NotFoundError(r,"submit");if("application/json"===(e.options?.enctype||"application/x-www-form-urlencoded")){const t=await n.elementHandle();if(!t)throw new L.CommonError(`submit: could not get form handle for ${r}`,"submit");const e=await t.evaluate(async t=>{const e=new FormData(t),i={};e.forEach((t,e)=>{i[e]=t.toString()});const s=await fetch(t.action,{method:t.method,headers:{"Content-Type":"application/json"},body:JSON.stringify(i)}),r=await s.text();return{status:s.status,statusText:s.statusText,headers:Object.fromEntries(s.headers.entries()),body:r,html:r,text:r,url:t.action,finalUrl:s.url}});return await t.dispose(),await i.setContent(e.html),void(this.lastResponse=e)}return await n.evaluate(t=>t.submit()),await i.waitForLoadState("networkidle",{timeout:s}),void(this.lastResponse=await this.buildResponse(t))}case"pause":{const t=this.ctx?.onPause;return 
void(t?(console.info(e.message||"Execution paused for manual intervention."),await t({message:e.message}),console.info("Resuming execution...")):console.warn("[PauseAction] was called, but no `onPause` handler was provided in fetchWeb options. Skipped."))}case"getContent":return this.buildResponse(t);default:throw new L.CommonError(`Unknown action type: ${e.type}`,"PlaywrightFetchEngine.executeAction",L.ErrorCode.NotSupported)}}_createCrawler(t){return new M.PlaywrightCrawler(t)}async _getSpecificCrawlerOptions(t){const e=t.browser?.headless??!0,i={maxRequestRetries:t.retries||3,headless:e,preNavigationHooks:[async({page:e,request:i},s)=>{s.throwHttpErrors=t.throwHttpErrors,this.jar.length>0&&await e.context().addCookies(this.jar.map(t=>({...t,url:i.url,domain:t.domain||new URL(i.url).hostname})));const r=this.blockedTypes;r.size>0&&await e.route("**/*",t=>{r.has(t.request().resourceType())?t.abort():t.continue()})}]};if(this.opts?.antibot){i.browserPoolOptions={useFingerprints:!1};const t=await(0,H.launchOptions)({headless:e});i.launchContext={launcher:N.firefox,launchOptions:t},i.postNavigationHooks=[async({page:t,handleCloudflareChallenge:e})=>{await e()}]}return i}async goto(t,e){if(this.isPageActive)return this.dispatchAction({type:"navigate",url:t,opts:e});if(!this.requestQueue)throw new L.CommonError("RequestQueue not initialized","goto");const i="req-"+ ++this.requestCounter,s=new Promise((t,e)=>{this.pendingRequests.set(i,{resolve:t,reject:e})});return await this.requestQueue.addRequest({url:t,headers:this.hdrs,userData:{requestId:i,waitUntil:e?.waitUntil||"domcontentloaded"},uniqueKey:`${t}-${i}`}),s}};B.id="playwright",B.mode="browser",A.register(B);var G=class extends f{async onExecute(t,e){const{selector:i,...s}=e?.params||{};if(!i)throw new Error("Selector is required for click action");await this.delegateToEngine(t,"click",i,s)}};G.id="click",G.returnType="none",G.capabilities={http:"simulate",browser:"native"},f.register(G);var z=class extends 
f{async onExecute(t,e){const{selector:i,value:s,...r}=e?.params||{};if(!i)throw new Error("Selector is required for fill action");if(void 0===s)throw new Error("Value is required for fill action");await this.delegateToEngine(t,"fill",i,s,r)}};z.id="fill",z.returnType="none",z.capabilities={http:"simulate",browser:"native"},f.register(z);var D=class extends f{async onExecute(t,e){return await this.delegateToEngine(t,"getContent",e?.params)}};D.id="getContent",D.returnType="response",D.capabilities={http:"native",browser:"native"},f.register(D);var W=class extends f{async onExecute(t,e,i){const s=e?.params,r=s?.url||t.url;if(!r)throw new Error("URL is required for goto action");const n=t.internal.engine;if(!n)throw new Error("No engine available");t.url=r;return await n.goto(r,s)}};W.id="goto",W.returnType="response",W.capabilities={http:"native",browser:"native"},f.register(W);var I=class extends f{async onExecute(t,e){const{selector:i,...s}=e?.params||{};await this.delegateToEngine(t,"submit",i,s)}};I.id="submit",I.returnType="none",I.capabilities={http:"simulate",browser:"native"},f.register(I);var J=class extends f{async onExecute(t,e){const i=t.internal.engine;if(!i)throw new Error("No engine available");await i.waitFor(e?.params)}};J.id="waitFor",J.returnType="none",J.capabilities={http:"native",browser:"native"},f.register(J);var V=class extends f{async onExecute(t,e){const i=e?.params;if(!i)throw new Error("Schema is required for extract action");return this.delegateToEngine(t,"extract",i)}};V.id="extract",V.returnType="any",V.capabilities={http:"native",browser:"native"},f.register(V);var K=class extends f{async onExecute(t,e){const{selector:i,message:s,attribute:r}=e?.params||{},n=t.internal.engine;if("browser"===n?.mode){if(i){if(!await(n?.extract({selector:i,attribute:r})))return}n&&"pause"in n?await n.pause(s):console.warn("[PauseAction] was called, but the current engine does not support `pause`. 
Skipped.")}else console.warn("[PauseAction] can only run in browser engine. Skipped.")}};async function Q(t,e){return(new F).fetch(t,e)}K.id="pause",K.capabilities={http:"native",browser:"native"},K.returnType="none",f.register(K);