@isdk/web-searcher 0.1.1
- package/README.cn.md +274 -0
- package/README.md +274 -0
- package/dist/index.d.mts +321 -0
- package/dist/index.d.ts +321 -0
- package/dist/index.js +1 -0
- package/dist/index.mjs +1 -0
- package/docs/README.md +278 -0
- package/docs/classes/GoogleSearcher.md +695 -0
- package/docs/classes/WebSearcher.md +661 -0
- package/docs/globals.md +26 -0
- package/docs/interfaces/CustomTimeRange.md +29 -0
- package/docs/interfaces/PaginationConfig.md +86 -0
- package/docs/interfaces/SearchContext.md +41 -0
- package/docs/interfaces/SearchOptions.md +105 -0
- package/docs/interfaces/StandardSearchResult.md +58 -0
- package/docs/type-aliases/SafeSearchLevel.md +11 -0
- package/docs/type-aliases/SearchCategory.md +11 -0
- package/docs/type-aliases/SearchTimeRange.md +11 -0
- package/docs/type-aliases/SearchTimeRangePreset.md +11 -0
- package/docs/type-aliases/SearcherConstructor.md +23 -0
- package/package.json +87 -0
@@ -0,0 +1,695 @@
[**@isdk/web-searcher**](../README.md)

***

[@isdk/web-searcher](../globals.md) / GoogleSearcher

# Class: GoogleSearcher

Defined in: [web-searcher/src/engines/google.ts:24](https://github.com/isdk/web-searcher.js/blob/e9a6e5ec9526780489427743389b927a5c16db5c/src/engines/google.ts#L24)

A sample implementation of a Google Search scraper.

## Remarks

**⚠️ DEMO ONLY ⚠️**

This class serves as a **reference implementation** to demonstrate how to extend
the `WebSearcher` base class. It is **NOT intended for production use**.

Google frequently changes its HTML structure and employs sophisticated anti-bot measures.
A production-grade Google scraper would require robust proxy rotation, CAPTCHA solving,
and constant maintenance of selectors, or usage of an official API.

Use this class to understand:
1. How to define a fetch template with variable injection.
2. How to map standard options (like time range) to engine-specific URL parameters.
3. How to handle pagination.
4. How to transform and clean raw extracted data.

## Extends

- [`WebSearcher`](WebSearcher.md)

## Constructors

### Constructor

> **new GoogleSearcher**(`options?`): `GoogleSearcher`

Defined in: web-fetcher/dist/index.d.ts:2192

Creates a new FetchSession.

#### Parameters

##### options?

`FetcherOptions`

Configuration options for the fetcher.

#### Returns

`GoogleSearcher`

#### Inherited from

[`WebSearcher`](WebSearcher.md).[`constructor`](WebSearcher.md#constructor)

## Properties

### closed

> `protected` **closed**: `boolean`

Defined in: web-fetcher/dist/index.d.ts:2186

#### Inherited from

[`WebSearcher`](WebSearcher.md).[`closed`](WebSearcher.md#closed)

***

### context

> `readonly` **context**: `FetchContext`

Defined in: web-fetcher/dist/index.d.ts:2185

The execution context for this session, containing configurations, event bus, and shared state.

#### Inherited from

[`WebSearcher`](WebSearcher.md).[`context`](WebSearcher.md#context)

***

### id

> `readonly` **id**: `string`

Defined in: web-fetcher/dist/index.d.ts:2181

Unique identifier for the session.

#### Inherited from

[`WebSearcher`](WebSearcher.md).[`id`](WebSearcher.md#id)

***

### options

> `protected` **options**: `FetcherOptions`

Defined in: web-fetcher/dist/index.d.ts:2177

#### Inherited from

[`WebSearcher`](WebSearcher.md).[`options`](WebSearcher.md#options)

***

### \_isFactory

> `static` **\_isFactory**: `boolean` = `false`

Defined in: [web-searcher/src/searcher.ts:33](https://github.com/isdk/web-searcher.js/blob/e9a6e5ec9526780489427743389b927a5c16db5c/src/searcher.ts#L33)

#### Inherited from

[`WebSearcher`](WebSearcher.md).[`_isFactory`](WebSearcher.md#_isfactory)

***

### alias

> `static` **alias**: `string`[]

Defined in: [web-searcher/src/engines/google.ts:25](https://github.com/isdk/web-searcher.js/blob/e9a6e5ec9526780489427743389b927a5c16db5c/src/engines/google.ts#L25)

Engine alias(es). Can be a single string or an array of strings.
Useful for registering shorthand names (e.g., 'g' for 'Google').

#### Overrides

[`WebSearcher`](WebSearcher.md).[`alias`](WebSearcher.md#alias)

***

### createObject()

> `static` **createObject**: (`name`, ...`args`) => [`WebSearcher`](WebSearcher.md)

Defined in: [web-searcher/src/searcher.ts:78](https://github.com/isdk/web-searcher.js/blob/e9a6e5ec9526780489427743389b927a5c16db5c/src/searcher.ts#L78)

Creates an instance of the registered search engine.

#### Parameters

##### name

`string`

The name of the engine.

##### args

...`any`[]

Arguments to pass to the constructor.

#### Returns

[`WebSearcher`](WebSearcher.md)

An instance of the search engine.

#### Inherited from

[`WebSearcher`](WebSearcher.md).[`createObject`](WebSearcher.md#createobject)

***

### forEach()

> `static` **forEach**: (`cb`) => `void`

Defined in: [web-searcher/src/searcher.ts:85](https://github.com/isdk/web-searcher.js/blob/e9a6e5ec9526780489427743389b927a5c16db5c/src/searcher.ts#L85)

Iterates over all registered engines.

#### Parameters

##### cb

(`ctor`, `name`) => `void`

Callback function to invoke for each registered engine.

#### Returns

`void`

#### Inherited from

[`WebSearcher`](WebSearcher.md).[`forEach`](WebSearcher.md#foreach)

***

### get()

> `static` **get**: (`name`) => *typeof* [`WebSearcher`](WebSearcher.md)

Defined in: [web-searcher/src/searcher.ts:69](https://github.com/isdk/web-searcher.js/blob/e9a6e5ec9526780489427743389b927a5c16db5c/src/searcher.ts#L69)

Retrieves a registered search engine class by name.

#### Parameters

##### name

`string`

The name of the engine (e.g., 'Google').

#### Returns

*typeof* [`WebSearcher`](WebSearcher.md)

The search engine class constructor.

#### Inherited from

[`WebSearcher`](WebSearcher.md).[`get`](WebSearcher.md#get)

***

### name?

> `static` `optional` **name**: `string`

Defined in: [web-searcher/src/searcher.ts:40](https://github.com/isdk/web-searcher.js/blob/e9a6e5ec9526780489427743389b927a5c16db5c/src/searcher.ts#L40)

Custom engine name. If not provided, it is derived from the class name.
For example, `GoogleSearcher` becomes `Google`.
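
As a rough illustration of the naming rule, the derivation could be done by stripping a trailing `Searcher` suffix from the class name. `deriveEngineName` is a hypothetical helper written for this note, not part of the package's API, and the actual logic in `searcher.ts` may differ.

```ts
// Hypothetical helper (not the library's code): derive an engine name from a
// class name by dropping a trailing "Searcher" suffix, matching the documented
// example `GoogleSearcher` -> `Google`.
function deriveEngineName(className: string): string {
  const suffix = 'Searcher';
  // Leave the name unchanged when it lacks the suffix or is the suffix alone.
  if (className.length > suffix.length && className.endsWith(suffix)) {
    return className.slice(0, -suffix.length);
  }
  return className;
}

console.log(deriveEngineName('GoogleSearcher')); // "Google"
```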

#### Inherited from

[`WebSearcher`](WebSearcher.md).[`name`](WebSearcher.md#name)

***

### register()

> `static` **register**: (`ctor`, `options?`) => `boolean`

Defined in: [web-searcher/src/searcher.ts:54](https://github.com/isdk/web-searcher.js/blob/e9a6e5ec9526780489427743389b927a5c16db5c/src/searcher.ts#L54)

Registers a search engine class.

#### Parameters

##### ctor

*typeof* [`WebSearcher`](WebSearcher.md)

The search engine class to register.

##### options?

Registration options. If a string is provided, it is used as the registered name.

`string` | `IBaseFactoryOptions`

#### Returns

`boolean`

`true` if registration was successful.

#### Inherited from

[`WebSearcher`](WebSearcher.md).[`register`](WebSearcher.md#register)

***

### setAliases()

> `static` **setAliases**: (`ctor`, ...`aliases`) => `void`

Defined in: [web-searcher/src/searcher.ts:93](https://github.com/isdk/web-searcher.js/blob/e9a6e5ec9526780489427743389b927a5c16db5c/src/searcher.ts#L93)

Sets aliases for a registered engine.

#### Parameters

##### ctor

*typeof* [`WebSearcher`](WebSearcher.md)

The search engine class.

##### aliases

...`string`[]

Aliases to add.

#### Returns

`void`

#### Inherited from

[`WebSearcher`](WebSearcher.md).[`setAliases`](WebSearcher.md#setaliases)

***

### unregister()

> `static` **unregister**: (`name?`) => `void`

Defined in: [web-searcher/src/searcher.ts:61](https://github.com/isdk/web-searcher.js/blob/e9a6e5ec9526780489427743389b927a5c16db5c/src/searcher.ts#L61)

Unregisters a search engine.

#### Parameters

##### name?

The name or class to unregister.

`string` | *typeof* [`WebSearcher`](WebSearcher.md)

#### Returns

`void`

#### Inherited from

[`WebSearcher`](WebSearcher.md).[`unregister`](WebSearcher.md#unregister)

## Accessors

### pagination

#### Get Signature

> **get** **pagination**(): [`PaginationConfig`](../interfaces/PaginationConfig.md)

Defined in: [web-searcher/src/engines/google.ts:61](https://github.com/isdk/web-searcher.js/blob/e9a6e5ec9526780489427743389b927a5c16db5c/src/engines/google.ts#L61)

Configures pagination for Google Search results.
Uses the 'start' URL parameter, incrementing by 10 for each page.
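
The offset arithmetic behind the `start` parameter can be sketched as follows. This is only an illustration: `googleStartOffset` is a hypothetical helper, the zero-based page index is an assumption, and the real behavior is configured through `PaginationConfig`, whose fields are documented separately.

```ts
// Hypothetical helper: value of Google's `start` parameter for a zero-based
// page index, assuming 10 results per page as described in this section.
function googleStartOffset(pageIndex: number, pageSize: number = 10): number {
  if (!Number.isInteger(pageIndex) || pageIndex < 0) {
    throw new RangeError('pageIndex must be a non-negative integer');
  }
  return pageIndex * pageSize;
}

// Pages 0, 1, 2 map to start=0, start=10, start=20.
const exampleOffsets = [0, 1, 2].map((page) => googleStartOffset(page));
console.log(exampleOffsets); // [ 0, 10, 20 ]
```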

##### Returns

[`PaginationConfig`](../interfaces/PaginationConfig.md)

#### Overrides

[`WebSearcher`](WebSearcher.md).[`pagination`](WebSearcher.md#pagination)

***

### template

#### Get Signature

> **get** **template**(): `FetcherOptions`

Defined in: [web-searcher/src/engines/google.ts:32](https://github.com/isdk/web-searcher.js/blob/e9a6e5ec9526780489427743389b927a5c16db5c/src/engines/google.ts#L32)

Defines the fetch template for Google Search.

##### Returns

`FetcherOptions`

The fetcher configuration including the URL pattern and extraction rules.

#### Overrides

[`WebSearcher`](WebSearcher.md).[`template`](WebSearcher.md#template)

## Methods

### createContext()

> `protected` **createContext**(`options`): `FetchContext`

Defined in: [web-searcher/src/searcher.ts:155](https://github.com/isdk/web-searcher.js/blob/e9a6e5ec9526780489427743389b927a5c16db5c/src/searcher.ts#L155)

#### Parameters

##### options

`FetcherOptions` = `...`

#### Returns

`FetchContext`

#### Inherited from

[`WebSearcher`](WebSearcher.md).[`createContext`](WebSearcher.md#createcontext)

***

### dispose()

> **dispose**(): `Promise`\<`void`\>

Defined in: web-fetcher/dist/index.d.ts:2251

Disposes of the session and its associated engine.

#### Returns

`Promise`\<`void`\>

#### Remarks

This method should be called when the session is no longer needed to free up resources
(e.g., closing browser instances, purging temporary storage).

#### Inherited from

[`WebSearcher`](WebSearcher.md).[`dispose`](WebSearcher.md#dispose)

***

### execute()

> **execute**\<`R`\>(`actionOptions`, `context?`): `Promise`\<`FetchActionResult`\<`R`\>\>

Defined in: web-fetcher/dist/index.d.ts:2206

Executes a single action within the session.

#### Type Parameters

##### R

`R` *extends* `FetchReturnType` = `"response"`

The expected return type of the action.

#### Parameters

##### actionOptions

`_RequireAtLeastOne`

Configuration for the action to be executed.

##### context?

`FetchContext`

Optional context override for this specific execution. Defaults to the session context.

#### Returns

`Promise`\<`FetchActionResult`\<`R`\>\>

A promise that resolves to the result of the action.

#### Example

```ts
await session.execute({ name: 'goto', params: { url: 'https://example.com' } });
```

#### Inherited from

[`WebSearcher`](WebSearcher.md).[`execute`](WebSearcher.md#execute)

***

### executeAll()

> **executeAll**(`actions`, `options?`): `Promise`\<\{ `outputs`: `Record`\<`string`, `any`\>; `result`: `FetchResponse` \| `undefined`; \}\>

Defined in: web-fetcher/dist/index.d.ts:2223

Executes a sequence of actions.

#### Parameters

##### actions

`_RequireAtLeastOne`\<`FetchActionProperties`, `"id"` \| `"name"` \| `"action"`\>[]

An array of action options to be executed in order.

##### options?

`Partial`\<`FetcherOptions`\> & `object`

Optional temporary configuration overrides (e.g., timeoutMs, headers) for this batch of actions.
These overrides do not affect the main session context.

#### Returns

`Promise`\<\{ `outputs`: `Record`\<`string`, `any`\>; `result`: `FetchResponse` \| `undefined`; \}\>

A promise that resolves to an object containing the result of the last action and all accumulated outputs.

#### Example

```ts
const { result, outputs } = await session.executeAll([
  { name: 'goto', params: { url: 'https://example.com' } },
  { name: 'extract', params: { schema: { title: 'h1' } }, storeAs: 'data' }
], { timeoutMs: 30000 });
```

#### Inherited from

[`WebSearcher`](WebSearcher.md).[`executeAll`](WebSearcher.md#executeall)

***

### formatOptions()

> `protected` **formatOptions**(`options`): `Record`\<`string`, `any`\>

Defined in: [web-searcher/src/engines/google.ts:82](https://github.com/isdk/web-searcher.js/blob/e9a6e5ec9526780489427743389b927a5c16db5c/src/engines/google.ts#L82)

Maps standard `SearchOptions` to Google's specific URL parameters.

- `timeRange` -> `tbs` (e.g., 'qdr:d' for day)
- `category` -> `tbm` (e.g., 'isch' for images)
- `region` -> `gl`
- `language` -> `hl`
- `safeSearch` -> `safe`
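
The mapping listed above can be sketched as a plain function. Everything named here is an assumption made for illustration: `SketchSearchOptions`, its preset values, and `toGoogleParams` are hypothetical stand-ins, and the package's real `SearchOptions` type and `formatOptions` implementation may accept other values or emit additional variables.

```ts
// Assumed option shapes, for illustration only.
type TimeRangePreset = 'day' | 'week' | 'month' | 'year';
type Category = 'images' | 'videos' | 'news';

interface SketchSearchOptions {
  timeRange?: TimeRangePreset;
  category?: Category;
  region?: string;
  language?: string;
  safeSearch?: 'off' | 'strict';
}

// Google URL parameter values for each assumed preset.
const TBS_BY_RANGE: Record<TimeRangePreset, string> = {
  day: 'qdr:d',
  week: 'qdr:w',
  month: 'qdr:m',
  year: 'qdr:y',
};
const TBM_BY_CATEGORY: Record<Category, string> = {
  images: 'isch',
  videos: 'vid',
  news: 'nws',
};

// Hypothetical helper: build the URL variables, omitting options that are not set.
function toGoogleParams(options: SketchSearchOptions): Record<string, string> {
  const params: Record<string, string> = {};
  if (options.timeRange) params.tbs = TBS_BY_RANGE[options.timeRange];
  if (options.category) params.tbm = TBM_BY_CATEGORY[options.category];
  if (options.region) params.gl = options.region;
  if (options.language) params.hl = options.language;
  if (options.safeSearch) params.safe = options.safeSearch === 'strict' ? 'active' : 'off';
  return params;
}
```

Omitting unset keys, rather than emitting empty strings, keeps the resulting query string free of parameters the caller never asked for.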

#### Parameters

##### options

[`SearchOptions`](../interfaces/SearchOptions.md)

The user-provided search options.

#### Returns

`Record`\<`string`, `any`\>

A map of variables to inject into the URL template.

#### Overrides

[`WebSearcher`](WebSearcher.md).[`formatOptions`](WebSearcher.md#formatoptions)

***

### getOutputs()

> **getOutputs**(): `Record`\<`string`, `any`\>

Defined in: web-fetcher/dist/index.d.ts:2234

Retrieves all outputs accumulated during the session.

#### Returns

`Record`\<`string`, `any`\>

A record of stored output data.

#### Inherited from

[`WebSearcher`](WebSearcher.md).[`getOutputs`](WebSearcher.md#getoutputs)

***

### getState()

> **getState**(): `Promise`\<\{ `cookies`: `Cookie`[]; `sessionState?`: `any`; \} \| `undefined`\>

Defined in: web-fetcher/dist/index.d.ts:2240

Gets the current state of the session, including cookies and engine-specific state.

#### Returns

`Promise`\<\{ `cookies`: `Cookie`[]; `sessionState?`: `any`; \} \| `undefined`\>

A promise resolving to the session state, or undefined if no engine is initialized.

#### Inherited from

[`WebSearcher`](WebSearcher.md).[`getState`](WebSearcher.md#getstate)

***

### search()

> **search**(`query`, `options`): `Promise`\<[`StandardSearchResult`](../interfaces/StandardSearchResult.md)[]\>

Defined in: [web-searcher/src/searcher.ts:182](https://github.com/isdk/web-searcher.js/blob/e9a6e5ec9526780489427743389b927a5c16db5c/src/searcher.ts#L182)

Executes a search query.

This method handles the pagination loop, variable injection, fetching,
and result transformation.
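
A pagination loop of this kind might look like the sketch below. It is not the library's implementation: `FetchPage`, `collectResults`, and the `maxPages` safety cap are hypothetical, and the real method also performs variable injection and result transformation, which are left out here.

```ts
// Hypothetical page fetcher: resolves to the items of one page, or to an
// empty array once the engine has no more results.
type FetchPage<T> = (pageIndex: number) => Promise<T[]>;

// Fetch pages in order until `limit` items are collected, a page comes back
// empty, or `maxPages` pages have been requested.
async function collectResults<T>(
  fetchPage: FetchPage<T>,
  limit: number,
  maxPages: number = 10,
): Promise<T[]> {
  const results: T[] = [];
  for (let page = 0; page < maxPages && results.length < limit; page++) {
    const items = await fetchPage(page);
    if (items.length === 0) break; // nothing left to fetch
    results.push(...items);
  }
  // The last page may overshoot the limit, so trim the surplus.
  return results.slice(0, limit);
}
```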

#### Parameters

##### query

`string`

The search query string.

##### options

[`SearchOptions`](../interfaces/SearchOptions.md) = `{}`

Optional search parameters (e.g., limit, timeRange).

#### Returns

`Promise`\<[`StandardSearchResult`](../interfaces/StandardSearchResult.md)[]\>

A promise resolving to an array of standardized search results.

#### Inherited from

[`WebSearcher`](WebSearcher.md).[`search`](WebSearcher.md#search)

***

### transform()

> `protected` **transform**(`outputs`): `Promise`\<`any`[]\>

Defined in: [web-searcher/src/engines/google.ts:144](https://github.com/isdk/web-searcher.js/blob/e9a6e5ec9526780489427743389b927a5c16db5c/src/engines/google.ts#L144)

Cleans and normalizes the extracted results.
Specifically, it unwraps Google's redirect URLs (starting with `/url?q=`).
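
Unwrapping such a redirect could be done roughly as follows. `unwrapGoogleRedirect` is a hypothetical helper written for this note; the actual `transform()` in `google.ts` may handle more cases, such as absolute redirect URLs.

```ts
// Hypothetical helper: if `href` is a Google redirect of the form
// `/url?q=<encoded target>&...`, return the decoded target; otherwise return
// `href` unchanged.
function unwrapGoogleRedirect(href: string): string {
  const prefix = '/url?q=';
  if (!href.startsWith(prefix)) return href;
  const rest = href.slice(prefix.length);
  // The target ends at the next `&`, which begins Google's tracking parameters.
  const ampIndex = rest.indexOf('&');
  const encoded = ampIndex === -1 ? rest : rest.slice(0, ampIndex);
  try {
    return decodeURIComponent(encoded);
  } catch {
    // Malformed percent-encoding: fall back to the original value.
    return href;
  }
}

console.log(unwrapGoogleRedirect('/url?q=https%3A%2F%2Fexample.com%2Fa%3Fb%3D1&sa=U'));
// "https://example.com/a?b=1"
```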

#### Parameters

##### outputs

`Record`\<`string`, `any`\>

The raw outputs from the fetcher.

#### Returns

`Promise`\<`any`[]\>

An array of cleaned search results.

#### Overrides

[`WebSearcher`](WebSearcher.md).[`transform`](WebSearcher.md#transform)

***

### search()

> `static` **search**(`engineName`, `query`, `options`): `Promise`\<[`StandardSearchResult`](../interfaces/StandardSearchResult.md)[]\>

Defined in: [web-searcher/src/searcher.ts:106](https://github.com/isdk/web-searcher.js/blob/e9a6e5ec9526780489427743389b927a5c16db5c/src/searcher.ts#L106)

Static helper to execute a one-off search.

It creates an instance of the specified engine, executes the search, and then
automatically disposes of the session.
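
The create, search, dispose sequence can be sketched with a minimal stand-in interface. `DisposableSearcher` and `searchOnce` are hypothetical names, and disposing inside `finally` (so cleanup also runs when the search fails) is an assumption about error handling that this page does not confirm.

```ts
// Minimal stand-in for a searcher session; the real class exposes far more.
interface DisposableSearcher {
  search(query: string): Promise<string[]>;
  dispose(): Promise<void>;
}

// Hypothetical one-off helper: create a session, run one query, then always
// release the session's resources.
async function searchOnce(
  create: () => DisposableSearcher,
  query: string,
): Promise<string[]> {
  const searcher = create();
  try {
    return await searcher.search(query);
  } finally {
    // Runs on success and on failure alike.
    await searcher.dispose();
  }
}
```

For repeated queries, keeping one instance alive and calling `dispose()` once at the end avoids paying the session setup cost for every search.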

#### Parameters

##### engineName

`string`

The name of the engine to use (e.g., 'Google').

##### query

`string`

The search query string.

##### options

[`SearchOptions`](../interfaces/SearchOptions.md) & `FetcherOptions` = `{}`

Combined search options and fetcher options.

#### Returns

`Promise`\<[`StandardSearchResult`](../interfaces/StandardSearchResult.md)[]\>

A promise resolving to an array of standardized search results.

#### Inherited from

[`WebSearcher`](WebSearcher.md).[`search`](WebSearcher.md#search-2)