jumpy-lion 0.0.36 → 0.0.39
- package/dist/page.d.ts +45 -14
- package/dist/page.d.ts.map +1 -1
- package/dist/page.js +126 -80
- package/dist/page.js.map +1 -1
- package/dist/tsconfig.build.tsbuildinfo +1 -1
- package/package.json +1 -1
- package/README.md +0 -510
package/package.json
CHANGED
package/README.md
DELETED
|
@@ -1,510 +0,0 @@
|
|
|
1
|
-
# Crawler Documentation
|
|
2
|
-
|
|
3
|
-
## Table of Contents
|
|
4
|
-
|
|
5
|
-
- [Overview](#overview)
- [NPM Package](#npm-package)
- [Usage](#usage)
  - [Example Project](#example-project)
  - [Internal Guide](#internal-guide)
  - [Examples and Configuration](#examples-and-configuration)
  - [Advanced Fingerprints Usage](#advanced-fingerprints-usage)
  - [Syncing BrowserPool and launchOptions fingerprints](#syncing-browserpool-and-launchoptions-fingerprints)
- [Configurable Fingerprint Options](#configurable-fingerprint-options)
  - [Usage](#usage-1)
  - [Available Options](#available-options)
    - [Core Stealth Options](#core-stealth-options)
    - [Fingerprint Spoofing](#fingerprint-spoofing)
    - [Platform Configuration](#platform-configuration)
    - [Additional Features](#additional-features)
  - [Default Behavior](#default-behavior)
  - [Best Practices](#best-practices)
  - [Performance Considerations](#performance-considerations)
- [Crawler Class Documentation](#crawler-class-documentation)
  - [Constructor](#constructor)
- [CdpPage Class Documentation](#cdppage-class-documentation)
  - [Constructor](#constructor-1)
  - [Static Methods](#static-methods)
  - [Public Methods](#public-methods)
- [Utility Functions](#utility-functions)
  - [createCDPRouter](#createcdprouter)
|
|
31
|
-
|
|
32
|
-
## Overview
|
|
33
|
-
|
|
34
|
-
The `Crawler` class is a custom implementation of Crawlee's `BrowserCrawler` that uses the Chrome DevTools Protocol (CDP) for advanced anti-blocking capabilities.
|
|
35
|
-
|
|
36
|
-
## NPM Package
|
|
37
|
-
|
|
38
|
-
`jumpy-lion` is the official CDP crawler package. See it on [npm](https://www.npmjs.com/package/jumpy-lion).
|
|
39
|
-
|
|
40
|
-
---
|
|
42
|
-
|
|
43
|
-
## Usage
|
|
44
|
-
|
|
45
|
-
### Example Project
|
|
46
|
-
|
|
47
|
-
Refer to this [GitHub repository](https://github.com/apify-projects/cdp-crawler-example) for a complete example of using the `Crawler` class.
|
|
48
|
-
|
|
49
|
-
### Internal Guide
|
|
50
|
-
|
|
51
|
-
Check out the [CDP Crawler internal guide](https://www.notion.so/apify/CDP-Crawler-internal-guide-183f39950a2280be81d7c86dc048a47a?pvs=4) for a tutorial.
|
|
52
|
-
|
|
53
|
-
### Examples and Configuration
|
|
54
|
-
|
|
55
|
-
For detailed examples and configuration patterns, see the [Examples README](./examples/README.md). The examples include:
|
|
56
|
-
|
|
57
|
-
- **Basic Configuration**: Simple fingerprint setup for common use cases
|
|
58
|
-
- **Comprehensive Configuration**: Full feature setup with all spoofing options
|
|
59
|
-
- **Platform-Specific Configurations**: macOS, Windows, and Linux targeting
|
|
60
|
-
- **Performance-Focused Configuration**: Optimized settings for speed
|
|
61
|
-
- **Minimal Configuration**: Using intelligent defaults
|
|
62
|
-
|
|
63
|
-
The examples demonstrate real-world usage patterns and best practices for different scenarios.
|
|
64
|
-
|
|
65
|
-
### Advanced Fingerprints Usage
|
|
66
|
-
|
|
67
|
-
To use advanced fingerprints, you need to set the `useExperimentalFingerprints` option to `true` in the `launchContext.launchOptions` of the `Crawler` constructor.
|
|
68
|
-
|
|
69
|
-
```typescript
const crawler = new Crawler({
    launchContext: {
        launchOptions: {
            useExperimentalFingerprints: true,
        },
    },
});
```
|
|
78
|
-
|
|
79
|
-
---
|
|
80
|
-
|
|
81
|
-
### Syncing BrowserPool and launchOptions fingerprints
|
|
82
|
-
|
|
83
|
-
**Always keep the operating system in sync between BrowserPool fingerprints and `launchOptions.fingerprintOptions`.** A mismatch can produce inconsistent signals (for example in `navigator.platform`, the User-Agent, WebGL, or fonts) and reduce anti-bot effectiveness.
|
|
84
|
-
|
|
85
|
-
- **launchOptions side**: Set `launchContext.launchOptions.fingerprintOptions.platform` to the desired platform string.
|
|
86
|
-
- **BrowserPool side**: When `browserPoolOptions.useFingerprints` is `true`, set `browserPoolOptions.fingerprintOptions.fingerprintGeneratorOptions.operatingSystems` to the corresponding OS.
|
|
87
|
-
|
|
88
|
-
Mapping guidance:
|
|
89
|
-
- `platform: 'Win32'` ↔ `operatingSystems: ['windows']`
|
|
90
|
-
- `platform: 'MacIntel'` ↔ `operatingSystems: ['macos']`
|
|
91
|
-
- `platform: 'Linux x86_64'` ↔ `operatingSystems: ['linux']`
|
|
92
|
-
|
|
93
|
-
Example:
|
|
94
|
-
|
|
95
|
-
```typescript
const crawler = new Crawler({
    launchContext: {
        launchOptions: {
            useExperimentalFingerprints: true,
            fingerprintOptions: {
                platform: 'Win32', // Keep this in sync with BrowserPool OS
            },
        },
    },
    browserPoolOptions: {
        useFingerprints: true,
        fingerprintOptions: {
            fingerprintGeneratorOptions: {
                browsers: ['chrome'],
                operatingSystems: ['windows'], // Matches platform: 'Win32'
                devices: ['desktop'],
            },
        },
    },
});
```
|
|
117
|
-
|
|
118
|
-
Note: This configuration surface will be unified later. We are currently testing our custom fingerprint injector so it works even with the BrowserPool built‑in fingerprints turned off. If you prefer, you can rely solely on the custom injector by setting `browserPoolOptions.useFingerprints: false` and keeping `launchOptions.useExperimentalFingerprints: true`.
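Until the configuration surface is unified, one way to avoid a mismatch is to derive both fragments from a single platform value. The sketch below only builds a plain options object following the mapping guidance above. `buildSyncedFingerprintOptions` and `PLATFORM_TO_OS` are hypothetical names and are not part of the package.

```typescript
// Platform strings and OS names are the ones listed in the mapping guidance.
type Platform = 'Win32' | 'MacIntel' | 'Linux x86_64';
type OperatingSystem = 'windows' | 'macos' | 'linux';

const PLATFORM_TO_OS: Record<Platform, OperatingSystem> = {
    'Win32': 'windows',
    'MacIntel': 'macos',
    'Linux x86_64': 'linux',
};

// Hypothetical helper: builds both option fragments from one platform value,
// so the launchOptions side and the BrowserPool side cannot drift apart.
function buildSyncedFingerprintOptions(platform: Platform) {
    return {
        launchContext: {
            launchOptions: {
                useExperimentalFingerprints: true,
                fingerprintOptions: { platform },
            },
        },
        browserPoolOptions: {
            useFingerprints: true,
            fingerprintOptions: {
                fingerprintGeneratorOptions: {
                    browsers: ['chrome'],
                    operatingSystems: [PLATFORM_TO_OS[platform]],
                    devices: ['desktop'],
                },
            },
        },
    };
}

// The result can be spread into the object passed to the Crawler constructor.
const syncedOptions = buildSyncedFingerprintOptions('MacIntel');
console.log(syncedOptions.browserPoolOptions.fingerprintOptions.fingerprintGeneratorOptions.operatingSystems);
```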
|
|
119
|
-
|
|
120
|
-
---
|
|
121
|
-
|
|
122
|
-
## Configurable Fingerprint Options
|
|
123
|
-
|
|
124
|
-
The CDP crawler supports configurable fingerprint options that can be passed through the crawler options. This allows you to customize the fingerprint spoofing behavior for different use cases.
|
|
125
|
-
|
|
126
|
-
### Usage
|
|
127
|
-
|
|
128
|
-
You can configure fingerprint options by adding them to the `launchContext.launchOptions.fingerprintOptions` in your crawler configuration:
|
|
129
|
-
|
|
130
|
-
```typescript
// The package is published on npm as `jumpy-lion`.
import { Crawler } from 'jumpy-lion';

const crawler = new Crawler({
    launchContext: {
        launchOptions: {
            fingerprintOptions: {
                // Enable advanced stealth features
                enableAdvancedStealth: true,

                // Bypass Runtime.enable detection
                bypassRuntimeEnable: true,

                // Humanize mouse interactions
                humanizeInteractions: true,

                // Spoof WebGL fingerprinting
                spoofWebGL: true,

                // Spoof audio context fingerprinting
                spoofAudioContext: true,

                // Add variations to client rect measurements
                spoofClientRects: true,

                // Mask automation flags
                maskAutomationFlags: true,

                // Use hardcoded defaults (true, the default) rather than
                // fingerprint-generator values (false)
                useFingerprintDefaults: true,

                // Platform to spoof (defaults to Win32 for better evasion)
                platform: 'Win32', // 'Win32' | 'MacIntel' | 'Linux x86_64'

                // Spoof font measurements
                spoofFonts: true,

                // Spoof performance timing
                spoofPerformance: true,

                // Spoof locale settings
                spoofLocale: true,

                // Detect timezone from proxy (useful with residential proxies)
                detectTimezone: true,
            },
        },
    },
    // ... other crawler options
});
```
|
|
181
|
-
|
|
182
|
-
### Available Options
|
|
183
|
-
|
|
184
|
-
#### Core Stealth Options
|
|
185
|
-
|
|
186
|
-
- **`enableAdvancedStealth`** (boolean): Enables advanced stealth features including WebGPU spoofing and platform consistency
|
|
187
|
-
- **`bypassRuntimeEnable`** (boolean): Prevents CDP detection through Runtime.enable bypass techniques
|
|
188
|
-
- **`humanizeInteractions`** (boolean): Generates human-like mouse movements using bezier curves
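
`humanizeInteractions` is described as generating mouse movements along bezier curves. The following is a minimal sketch of that idea, assuming a cubic curve whose two control points are offset randomly. The function names, step count, and jitter factor are illustrative and are not taken from the package.

```typescript
interface Point {
    x: number;
    y: number;
}

// Evaluate a cubic Bezier curve at parameter t in [0, 1].
function cubicBezier(p0: Point, p1: Point, p2: Point, p3: Point, t: number): Point {
    const u = 1 - t;
    const a = u * u * u;
    const b = 3 * u * u * t;
    const c = 3 * u * t * t;
    const d = t * t * t;
    return {
        x: a * p0.x + b * p1.x + c * p2.x + d * p3.x,
        y: a * p0.y + b * p1.y + c * p2.y + d * p3.y,
    };
}

// Build a curved path from `from` to `to` with `steps + 1` points.
// Control points sit at one third and two thirds of the straight line and are
// shifted by a random offset, so repeated moves between the same points differ.
function mousePath(from: Point, to: Point, steps = 20, random: () => number = Math.random): Point[] {
    const spread = Math.hypot(to.x - from.x, to.y - from.y) * 0.3; // assumed jitter factor
    const jitter = (): number => (random() - 0.5) * 2 * spread;
    const c1: Point = {
        x: from.x + (to.x - from.x) / 3 + jitter(),
        y: from.y + (to.y - from.y) / 3 + jitter(),
    };
    const c2: Point = {
        x: from.x + (2 * (to.x - from.x)) / 3 + jitter(),
        y: from.y + (2 * (to.y - from.y)) / 3 + jitter(),
    };
    const path: Point[] = [];
    for (let i = 0; i <= steps; i++) {
        path.push(cubicBezier(from, c1, c2, to, i / steps));
    }
    return path;
}

const path = mousePath({ x: 0, y: 0 }, { x: 300, y: 120 });
console.log(path.length);
```

Each point on such a path would then be dispatched as a separate mouse-move event, with a short delay between events.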
|
|
189
|
-
|
|
190
|
-
#### Fingerprint Spoofing
|
|
191
|
-
|
|
192
|
-
- **`spoofWebGL`** (boolean): Spoofs WebGL fingerprinting by modifying GPU adapter information
|
|
193
|
-
- **`spoofAudioContext`** (boolean): Adds noise to audio processing to prevent audio fingerprinting
|
|
194
|
-
- **`spoofClientRects`** (boolean): Adds small variations to getBoundingClientRect results
|
|
195
|
-
- **`spoofFonts`** (boolean): Hides platform-specific fonts and adds font measurement variations
|
|
196
|
-
- **`spoofPerformance`** (boolean): Modifies timing characteristics to match the target platform
|
|
197
|
-
- **`spoofLocale`** (boolean): Ensures consistent locale formatting across all browser properties
|
|
198
|
-
|
|
199
|
-
#### Platform Configuration
|
|
200
|
-
|
|
201
|
-
- **`platform`** (string): Target platform to spoof. Options: `'Win32'`, `'MacIntel'`, `'Linux x86_64'`
|
|
202
|
-
- **`useFingerprintDefaults`** (boolean): Chooses the source of fingerprint values. When `true` (default), hardcoded defaults are used; when `false`, values generated by fingerprint-generator are used
|
|
203
|
-
|
|
204
|
-
#### Additional Features
|
|
205
|
-
|
|
206
|
-
- **`maskAutomationFlags`** (boolean): Masks automation-related flags in the browser
|
|
207
|
-
- **`detectTimezone`** (boolean): Automatically detects the timezone from the proxy IP (useful with residential proxies)
|
|
208
|
-
|
|
209
|
-
### Default Behavior
|
|
210
|
-
|
|
211
|
-
When no fingerprint options are provided, the crawler uses intelligent defaults:
|
|
212
|
-
|
|
213
|
-
- **On Apify**: Uses Apify-recommended settings optimized for the Apify environment
|
|
214
|
-
- **On other platforms**: Uses a comprehensive set of stealth features with Windows platform spoofing
|
|
215
|
-
|
|
216
|
-
### Best Practices
|
|
217
|
-
|
|
218
|
-
1. **Use `platform: 'Win32'`** for better evasion on Linux servers (like Apify)
|
|
219
|
-
2. **Enable `detectTimezone: true`** when using residential proxies
|
|
220
|
-
3. **Use `useFingerprintDefaults: false`** to leverage fingerprint-generator's realistic values
|
|
221
|
-
4. **Enable `bypassRuntimeEnable: true`** for sites that detect automation
|
|
222
|
-
5. **Use `enableAdvancedStealth: true`** for maximum protection against fingerprinting
|
|
223
|
-
6. **Keep OS settings in sync** between `launchOptions.fingerprintOptions.platform` and `browserPoolOptions.fingerprintOptions.fingerprintGeneratorOptions.operatingSystems`
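
The first five practices can be checked mechanically before a crawl starts. The sketch below is a hypothetical lint helper, not part of the package: `FingerprintOptionsSketch` lists only the documented option names it inspects, and `bestPracticeWarnings` and the `usesResidentialProxy` flag are assumptions made for illustration.

```typescript
// Subset of the documented options; not the package's exported type.
interface FingerprintOptionsSketch {
    enableAdvancedStealth?: boolean;
    bypassRuntimeEnable?: boolean;
    detectTimezone?: boolean;
    useFingerprintDefaults?: boolean;
    platform?: 'Win32' | 'MacIntel' | 'Linux x86_64';
}

// Hypothetical helper: returns one message per best practice that is not followed.
function bestPracticeWarnings(
    options: FingerprintOptionsSketch,
    context: { usesResidentialProxy: boolean },
): string[] {
    const warnings: string[] = [];
    if (options.platform !== undefined && options.platform !== 'Win32') {
        warnings.push("Consider platform: 'Win32' when running on Linux servers.");
    }
    if (context.usesResidentialProxy && !options.detectTimezone) {
        warnings.push('Enable detectTimezone when using residential proxies.');
    }
    // The documented default is true, so anything except an explicit false counts.
    if (options.useFingerprintDefaults !== false) {
        warnings.push('Set useFingerprintDefaults: false to use fingerprint-generator values.');
    }
    if (!options.bypassRuntimeEnable) {
        warnings.push('Enable bypassRuntimeEnable for sites that detect automation.');
    }
    if (!options.enableAdvancedStealth) {
        warnings.push('Enable enableAdvancedStealth for stronger fingerprint protection.');
    }
    return warnings;
}

console.log(bestPracticeWarnings({ platform: 'Win32' }, { usesResidentialProxy: true }));
```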
|
|
224
|
-
|
|
225
|
-
### Performance Considerations
|
|
226
|
-
|
|
227
|
-
- Each additional fingerprint option that is enabled slightly increases CPU usage
|
|
228
|
-
- WebGPU spoofing may add a small delay to page loads
|
|
229
|
-
- Humanized interactions add realistic delays to mouse movements
|
|
230
|
-
|
|
231
|
-
The fingerprint options are designed to provide maximum protection while maintaining good performance for web scraping tasks.
|
|
232
|
-
|
|
233
|
-
For more configuration examples and patterns, see the [Examples README](./examples/README.md).
|
|
234
|
-
|
|
235
|
-
---
|
|
236
|
-
|
|
237
|
-
## `Crawler` Class Documentation
|
|
238
|
-
|
|
239
|
-
### Constructor
|
|
240
|
-
|
|
241
|
-
#### `constructor(options: BrowserCrawlerOptions = {}, override readonly config = Configuration.getGlobalConfig())`
|
|
242
|
-
|
|
243
|
-
Initializes the `Crawler` instance with default and provided options.
|
|
244
|
-
|
|
245
|
-
- **Parameters**:

  - `options` (BrowserCrawlerOptions): Configuration options for the crawler.
    - `launchContext`: Specifies browser launch parameters.
      - Default: `{}`
    - `headless`: Runs the browser in headless mode.
      - Default: `false`
    - `browserPoolOptions`: Configuration for managing browser instances.
  - `config` (Configuration): Global Crawlee configuration.
    - Default: `Configuration.getGlobalConfig()`
|
|
255
|
-
|
|
256
|
-
- **Default Behavior**:
|
|
257
|
-
- Throws an error if `launchContext.proxyUrl` is provided. Use `proxyConfiguration` instead.
|
|
258
|
-
- Throws an error if `browserPoolOptions.browserPlugins` is set. Use `launchContext.launcher` instead.
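
The two checks above could look roughly like the following. This is a sketch under stated assumptions: `OptionsSketch` is a minimal stand-in for the real Crawlee option types, and `assertSupportedOptions` and the error messages are hypothetical rather than the package's actual code.

```typescript
// Minimal shapes for illustration only; the real types come from Crawlee.
interface OptionsSketch {
    launchContext?: { proxyUrl?: string; launcher?: unknown };
    browserPoolOptions?: { browserPlugins?: unknown[] };
}

// Hypothetical helper mirroring the two documented constructor checks.
function assertSupportedOptions(options: OptionsSketch): void {
    if (options.launchContext?.proxyUrl) {
        throw new Error('launchContext.proxyUrl is not supported. Use proxyConfiguration instead.');
    }
    if (options.browserPoolOptions?.browserPlugins) {
        throw new Error('browserPoolOptions.browserPlugins is not supported. Use launchContext.launcher instead.');
    }
}

// Passes: neither unsupported option is present.
assertSupportedOptions({ launchContext: {}, browserPoolOptions: {} });
```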
|
|
259
|
-
|
|
260
|
-
---
|
|
261
|
-
|
|
262
|
-
## `CdpPage` Class Documentation
|
|
263
|
-
|
|
264
|
-
### Constructor
|
|
265
|
-
|
|
266
|
-
#### `constructor(client: CDP.Client)`
|
|
267
|
-
|
|
268
|
-
Initializes the `CdpPage` instance with a CDP client.
|
|
269
|
-
|
|
270
|
-
- **Parameters**:
|
|
271
|
-
|
|
272
|
-
- `client` (CDP.Client): The Chrome DevTools Protocol client.
|
|
273
|
-
|
|
274
|
-
- **Emitted Events**:
|
|
275
|
-
- `PAGE_CREATED`: Triggered upon the creation of the page.
|
|
276
|
-
|
|
277
|
-
### Static Methods
|
|
278
|
-
|
|
279
|
-
#### `static async create(client: CDP.Client): Promise<CdpPage>`
|
|
280
|
-
|
|
281
|
-
Creates and initializes a new `CdpPage` instance.
|
|
282
|
-
|
|
283
|
-
- **Parameters**:
|
|
284
|
-
|
|
285
|
-
- `client` (CDP.Client): The CDP client.
|
|
286
|
-
|
|
287
|
-
- **Returns**:
|
|
288
|
-
- `Promise<CdpPage>`: A promise resolving to the new `CdpPage` instance.
|
|
289
|
-
|
|
290
|
-
---
|
|
291
|
-
|
|
292
|
-
### Public Methods
|
|
293
|
-
|
|
294
|
-
#### `async url(): Promise<string>`
|
|
295
|
-
Gets the current URL of the page.
|
|
296
|
-
|
|
297
|
-
- **Returns**:
|
|
298
|
-
- `Promise<string>`: The current URL.
|
|
299
|
-
|
|
300
|
-
#### `async goto(url: string, options?: GotoOptions): Promise<void>`
|
|
301
|
-
Navigates to a specified URL.
|
|
302
|
-
|
|
303
|
-
- **Parameters**:
  - `url` (string): The URL to navigate to.
  - `options` (GotoOptions): Navigation options, including:
    - `waitUntil`: When to consider navigation finished (`domcontentloaded` or `load`).
    - `timeout`: Maximum time to wait for navigation in milliseconds.
|
|
308
|
-
|
|
309
|
-
#### `async click(selector: string): Promise<void>`
|
|
310
|
-
Simulates a click on an element identified by the selector.
|
|
311
|
-
|
|
312
|
-
- **Parameters**:
|
|
313
|
-
- `selector` (string): CSS selector of the element.
|
|
314
|
-
|
|
315
|
-
#### `async type(selector: string, text: string, options?: { delay?: number }): Promise<void>`
|
|
316
|
-
Types text into an input field.
|
|
317
|
-
|
|
318
|
-
- **Parameters**:
  - `selector` (string): CSS selector of the element.
  - `text` (string): Text to type.
  - `options` (object): Options for typing:
    - `delay`: Time in milliseconds between key presses.
|
|
323
|
-
|
|
324
|
-
#### `async screenshot(options?: { path?: string; fullPage?: boolean; format?: 'png' | 'jpeg' }): Promise<Buffer>`
|
|
325
|
-
Takes a screenshot of the page, with support for PNG and JPEG formats.
|
|
326
|
-
|
|
327
|
-
- **Parameters**:
  - `options` (object): Screenshot options:
    - `path`: File path to save the screenshot.
    - `fullPage`: Capture the entire page.
    - `format`: Image format, either `'png'` (default) or `'jpeg'`.
|
|
332
|
-
|
|
333
|
-
- **Returns**:
|
|
334
|
-
- `Promise<Buffer>`: The screenshot as a buffer.
|
|
335
|
-
|
|
336
|
-
#### `async content(): Promise<string>`
|
|
337
|
-
Gets the HTML content of the page.
|
|
338
|
-
|
|
339
|
-
- **Returns**:
|
|
340
|
-
- `Promise<string>`: The page's HTML.
|
|
341
|
-
|
|
342
|
-
#### `async toCheerio(): Promise<cheerio.CheerioAPI>`
|
|
343
|
-
Converts the current page content to a Cheerio instance for DOM manipulation.
|
|
344
|
-
|
|
345
|
-
- **Returns**:
|
|
346
|
-
- `Promise<cheerio.CheerioAPI>`: A Cheerio API instance.
|
|
347
|
-
|
|
348
|
-
#### `async setViewport(viewport: Viewport): Promise<void>`
|
|
349
|
-
Sets the page's viewport dimensions.
|
|
350
|
-
|
|
351
|
-
- **Parameters**:
|
|
352
|
-
- `viewport` (Viewport): Object with `width` and `height` properties.
|
|
353
|
-
|
|
354
|
-
#### `async setUserAgent(userAgent: string): Promise<void>`
|
|
355
|
-
Overrides the user-agent string.
|
|
356
|
-
|
|
357
|
-
- **Parameters**:
|
|
358
|
-
- `userAgent` (string): The new user-agent string.
|
|
359
|
-
|
|
360
|
-
#### `async setExtraHTTPHeaders(headers: Record<string, string>): Promise<void>`
|
|
361
|
-
Sets additional HTTP headers for requests.
|
|
362
|
-
|
|
363
|
-
- **Parameters**:
|
|
364
|
-
- `headers` (Record<string, string>): Key-value pairs of headers.
|
|
365
|
-
|
|
366
|
-
#### `async waitForResponse(urlPart: string, statusCode?: number, timeout?: number): Promise<any>`
|
|
367
|
-
Waits for a specific network response.
|
|
368
|
-
|
|
369
|
-
- **Parameters**:
|
|
370
|
-
- `urlPart` (string): Part of the URL to match.
|
|
371
|
-
- `statusCode` (number): Expected HTTP status code.
|
|
372
|
-
- `timeout` (number): Maximum wait time in milliseconds.
|
|
373
|
-
|
|
374
|
-
- **Returns**:
|
|
375
|
-
- `Promise<any>`: The response.
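
The matching rule implied by the parameters can be written as a small predicate. The sketch assumes that `urlPart` is matched as a plain substring and that an omitted `statusCode` accepts any status; the README states neither explicitly. `ResponseSketch` and `matchesResponse` are hypothetical names.

```typescript
// Minimal stand-in for a network response; only the fields used for matching.
interface ResponseSketch {
    url: string;
    status: number;
}

// Assumed rule: the URL must contain `urlPart`, and the status must equal
// `statusCode` when one is given.
function matchesResponse(response: ResponseSketch, urlPart: string, statusCode?: number): boolean {
    if (!response.url.includes(urlPart)) {
        return false;
    }
    return statusCode === undefined || response.status === statusCode;
}

const sample: ResponseSketch = { url: 'https://example.com/api/items?page=2', status: 200 };
console.log(matchesResponse(sample, '/api/items', 200));
```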
|
|
376
|
-
|
|
377
|
-
#### `async setCookies(cookies: Cookie[]): Promise<void>`
|
|
378
|
-
Sets cookies for the page.
|
|
379
|
-
|
|
380
|
-
- **Parameters**:
|
|
381
|
-
- `cookies` (Cookie[]): Array of cookies to set.
|
|
382
|
-
|
|
383
|
-
#### `async getCookies(urls?: string[]): Promise<Cookie[]>`
|
|
384
|
-
Retrieves cookies for the given URLs or all cookies if no URLs are specified.
|
|
385
|
-
|
|
386
|
-
- **Parameters**:
|
|
387
|
-
- `urls` (string[]): Optional array of URLs.
|
|
388
|
-
|
|
389
|
-
- **Returns**:
|
|
390
|
-
- `Promise<Cookie[]>`: Array of cookies.
|
|
391
|
-
|
|
392
|
-
#### `async waitForSelector(selector: string, options?: { timeout?: number }): Promise<void>`
|
|
393
|
-
Waits for an element matching the selector to appear.
|
|
394
|
-
|
|
395
|
-
- **Parameters**:
  - `selector` (string): CSS selector of the element.
  - `options` (object): Options for waiting:
    - `timeout`: Maximum wait time in milliseconds.
|
|
399
|
-
|
|
400
|
-
#### `async elementExists(selector: string): Promise<boolean>`
|
|
401
|
-
Checks if an element exists.
|
|
402
|
-
|
|
403
|
-
- **Parameters**:
|
|
404
|
-
- `selector` (string): CSS selector of the element.
|
|
405
|
-
|
|
406
|
-
- **Returns**:
|
|
407
|
-
- `Promise<boolean>`: `true` if the element exists, `false` otherwise.
|
|
408
|
-
|
|
409
|
-
#### `async getTextContent(selector: string): Promise<string>`
|
|
410
|
-
Gets the text content of an element.
|
|
411
|
-
|
|
412
|
-
- **Parameters**:
|
|
413
|
-
- `selector` (string): CSS selector of the element.
|
|
414
|
-
|
|
415
|
-
- **Returns**:
|
|
416
|
-
- `Promise<string>`: The element's text content.
|
|
417
|
-
|
|
418
|
-
#### `async getHref(selector: string): Promise<string>`
|
|
419
|
-
Gets the `href` attribute of an anchor element.
|
|
420
|
-
|
|
421
|
-
- **Parameters**:
|
|
422
|
-
- `selector` (string): CSS selector of the anchor element.
|
|
423
|
-
|
|
424
|
-
- **Returns**:
|
|
425
|
-
- `Promise<string>`: The `href` value.
|
|
426
|
-
|
|
427
|
-
#### `async reload(options?: GotoOptions): Promise<void>`
|
|
428
|
-
Reloads the current page.
|
|
429
|
-
|
|
430
|
-
- **Parameters**:
  - `options` (GotoOptions): Navigation options, including:
    - `waitUntil`: When to consider reload finished (`domcontentloaded` or `load`).
    - `timeout`: Maximum time to wait for reload in milliseconds.
|
|
434
|
-
|
|
435
|
-
#### `async deleteInput(selector: string): Promise<void>`
|
|
436
|
-
Clears the value of an input field specified by the selector.
|
|
437
|
-
|
|
438
|
-
- **Parameters**:
|
|
439
|
-
- `selector` (string): CSS selector of the input element.
|
|
440
|
-
|
|
441
|
-
#### `async isVisible(selector: string): Promise<boolean>`
|
|
442
|
-
Checks if the element specified by selector is visible (not `display: none` and not `visibility: hidden`).
|
|
443
|
-
The selector should target the outermost element that can be hidden; otherwise this method may return a false positive.
|
|
444
|
-
|
|
445
|
-
- **Parameters**:
|
|
446
|
-
- `selector` (string): CSS selector of the element.
|
|
447
|
-
- **Returns**:
|
|
448
|
-
- `Promise<boolean>`: `true` if the element is visible, `false` otherwise.
|
|
449
|
-
|
|
450
|
-
#### `async selectOption(selector: string, targetSelector: string | string[], options?: SelectOptionOptions): Promise<void>`
|
|
451
|
-
Selects one or more options from a select element or dropdown. Supports both regular HTML select elements and custom virtualized dropdowns with hidden options.
|
|
452
|
-
|
|
453
|
-
- **Parameters**:
  - `selector` (string): CSS selector for the select element or dropdown trigger.
  - `targetSelector` (string | string[]): CSS selector(s) for the option(s) to select. Can be a single selector or array of selectors.
  - `options` (SelectOptionOptions): Optional configuration object with the following properties:
    - `timeout` (number): Maximum wait time in milliseconds. Default: 30000.
    - `force` (boolean): Bypass visibility and disabled checks. Default: false.
    - `waitForOptions` (boolean): Wait for dropdown options to load. Default: true.
    - `maxScrollAttempts` (number): Maximum scroll attempts for virtualized dropdowns. Default: 10.
    - `optionSelector` (string): CSS selector for option elements in custom dropdowns (required for non-select elements).
    - `dropdownSelector` (string): CSS selector for dropdown container in custom dropdowns (required for non-select elements).
|
|
463
|
-
|
|
464
|
-
#### `async waitForElementPositionToStabilize(selector: string, timeout?: number, checkInterval?: number, stabilityThreshold?: number, tolerance?: number): Promise<void>`
|
|
465
|
-
Waits for an element's position to stabilize by polling its bounding box. Useful before interacting with an element that may still be moving after scrolling or animations.
|
|
466
|
-
|
|
467
|
-
- **Parameters**:
|
|
468
|
-
- `selector` (string): Target element selector
|
|
469
|
-
- `timeout` (number): Max time to wait. Default: 2000
|
|
470
|
-
- `checkInterval` (number): Polling interval. Default: 100
|
|
471
|
-
- `stabilityThreshold` (number): Consecutive stable checks required. Default: 3
|
|
472
|
-
- `tolerance` (number): Max pixel delta to consider stable. Default: 1
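
The polling described above can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: `getBox` stands in for whatever reads the element's bounding box, `timeout` and `checkInterval` are assumed to be milliseconds, and rejecting on timeout is a guess because the README does not say what happens when the position never settles. Unlike the documented method, the sketch returns the final box.

```typescript
interface Box {
    x: number;
    y: number;
    width: number;
    height: number;
}

// True when every field of the two boxes differs by at most `tolerance` pixels.
function boxesMatch(a: Box, b: Box, tolerance: number): boolean {
    return (
        Math.abs(a.x - b.x) <= tolerance &&
        Math.abs(a.y - b.y) <= tolerance &&
        Math.abs(a.width - b.width) <= tolerance &&
        Math.abs(a.height - b.height) <= tolerance
    );
}

// Poll `getBox` until `stabilityThreshold` consecutive samples match their
// predecessor, or reject once `timeout` milliseconds have elapsed.
async function waitForStableBox(
    getBox: () => Promise<Box>,
    timeout = 2000,
    checkInterval = 100,
    stabilityThreshold = 3,
    tolerance = 1,
): Promise<Box> {
    const deadline = Date.now() + timeout;
    let previous = await getBox();
    let stableChecks = 0;
    while (Date.now() < deadline) {
        await new Promise((resolve) => setTimeout(resolve, checkInterval));
        const current = await getBox();
        // Any movement beyond the tolerance resets the streak.
        stableChecks = boxesMatch(previous, current, tolerance) ? stableChecks + 1 : 0;
        if (stableChecks >= stabilityThreshold) {
            return current;
        }
        previous = current;
    }
    throw new Error(`Element position did not stabilize within ${timeout} ms`);
}
```

With the documented defaults, the earliest such a loop can succeed is after three polling intervals, or about 300 ms.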
|
|
473
|
-
|
|
474
|
-
- **Usage Examples** (for `selectOption`):

```typescript
// Regular HTML select element
await page.selectOption('select#country', 'option[value="us"]');

// Multiple selection
await page.selectOption('select#languages', ['option[value="en"]', 'option[value="es"]']);

// Custom dropdown with specific selectors
await page.selectOption(
    '.dropdown-trigger',
    '.option[data-value="premium"]',
    {
        dropdownSelector: '.dropdown-menu',
        optionSelector: '.option',
        timeout: 5000,
    },
);
```
|
|
493
|
-
|
|
494
|
-
---
|
|
495
|
-
|
|
496
|
-
## Utility Functions
|
|
497
|
-
|
|
498
|
-
### `createCDPRouter`
|
|
499
|
-
|
|
500
|
-
#### `export function createCDPRouter<Context extends CDPCrawlingContext = CDPCrawlingContext, UserData extends Dictionary = GetUserDataFromRequest<Context['request']>>(routes?: RouterRoutes<Context, UserData>): Router<Context>`
|
|
501
|
-
|
|
502
|
-
Creates a custom router for handling crawling routes using CDP.
|
|
503
|
-
|
|
504
|
-
- **Parameters**:
|
|
505
|
-
- `routes` (RouterRoutes<Context, UserData>): Optional routes for defining crawl logic.
|
|
506
|
-
|
|
507
|
-
- **Returns**:
|
|
508
|
-
- `Router<Context>`: A configured router instance.
|
|
509
|
-
|
|
510
|
-
---
|