@isdk/web-searcher 0.1.4 → 0.1.6
- package/README.cn.md +196 -7
- package/README.md +196 -7
- package/dist/index.d.mts +234 -11
- package/dist/index.d.ts +234 -11
- package/dist/index.js +1 -1
- package/dist/index.mjs +1 -1
- package/docs/README.md +196 -7
- package/docs/classes/GoogleSearcher.md +289 -60
- package/docs/classes/WebSearcher.md +264 -61
- package/docs/functions/extractDate.md +42 -0
- package/docs/functions/extractMetadataFrom.md +40 -0
- package/docs/functions/fetchHeaders.md +34 -0
- package/docs/functions/fetchPartial.md +41 -0
- package/docs/functions/normalizeDate.md +29 -0
- package/docs/functions/parseHeaders.md +28 -0
- package/docs/functions/parseHtml.md +31 -0
- package/docs/functions/testUrlsByLatency.md +42 -0
- package/docs/globals.md +18 -0
- package/docs/interfaces/CustomTimeRange.md +3 -3
- package/docs/interfaces/ExtractOptions.md +54 -0
- package/docs/interfaces/FetchExtractorOptions.md +35 -0
- package/docs/interfaces/FetcherOptions.md +436 -0
- package/docs/interfaces/HtmlData.md +53 -0
- package/docs/interfaces/MetadataResult.md +27 -0
- package/docs/interfaces/PaginationConfig.md +9 -9
- package/docs/interfaces/SearchContext.md +30 -4
- package/docs/interfaces/SearchOptions.md +77 -11
- package/docs/interfaces/StandardSearchResult.md +10 -10
- package/docs/interfaces/VerifiedUrl.md +25 -0
- package/docs/type-aliases/MetadataType.md +13 -0
- package/docs/type-aliases/SafeSearchLevel.md +1 -1
- package/docs/type-aliases/SearchCategory.md +2 -2
- package/docs/type-aliases/SearchTimeRange.md +1 -1
- package/docs/type-aliases/SearchTimeRangePreset.md +1 -1
- package/docs/type-aliases/SearcherConstructor.md +2 -2
- package/package.json +3 -2
|
@@ -6,7 +6,7 @@
|
|
|
6
6
|
|
|
7
7
|
# Abstract Class: WebSearcher
|
|
8
8
|
|
|
9
|
-
Defined in: [web-searcher/src/searcher.ts:
|
|
9
|
+
Defined in: [web-searcher/src/searcher.ts:32](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/searcher.ts#L32)
|
|
10
10
|
|
|
11
11
|
The abstract base class for all search engines.
|
|
12
12
|
|
|
@@ -41,7 +41,7 @@ WebSearcher.register(MySearcher);
|
|
|
41
41
|
|
|
42
42
|
> **new WebSearcher**(`options?`): `WebSearcher`
|
|
43
43
|
|
|
44
|
-
Defined in: web-fetcher/dist/index.d.ts:
|
|
44
|
+
Defined in: web-fetcher/dist/index.d.ts:1172
|
|
45
45
|
|
|
46
46
|
Creates a new FetchSession.
|
|
47
47
|
|
|
@@ -49,7 +49,7 @@ Creates a new FetchSession.
|
|
|
49
49
|
|
|
50
50
|
##### options?
|
|
51
51
|
|
|
52
|
-
`FetcherOptions`
|
|
52
|
+
[`FetcherOptions`](../interfaces/FetcherOptions.md)
|
|
53
53
|
|
|
54
54
|
Configuration options for the fetcher.
|
|
55
55
|
|
|
@@ -67,7 +67,7 @@ Configuration options for the fetcher.
|
|
|
67
67
|
|
|
68
68
|
> `protected` **closed**: `boolean`
|
|
69
69
|
|
|
70
|
-
Defined in: web-fetcher/dist/index.d.ts:
|
|
70
|
+
Defined in: web-fetcher/dist/index.d.ts:1166
|
|
71
71
|
|
|
72
72
|
#### Inherited from
|
|
73
73
|
|
|
@@ -79,7 +79,7 @@ Defined in: web-fetcher/dist/index.d.ts:2281
|
|
|
79
79
|
|
|
80
80
|
> `readonly` **context**: `FetchContext`
|
|
81
81
|
|
|
82
|
-
Defined in: web-fetcher/dist/index.d.ts:
|
|
82
|
+
Defined in: web-fetcher/dist/index.d.ts:1165
|
|
83
83
|
|
|
84
84
|
The execution context for this session, containing configurations, event bus, and shared state.
|
|
85
85
|
|
|
@@ -93,7 +93,7 @@ The execution context for this session, containing configurations, event bus, an
|
|
|
93
93
|
|
|
94
94
|
> `readonly` **id**: `string`
|
|
95
95
|
|
|
96
|
-
Defined in: web-fetcher/dist/index.d.ts:
|
|
96
|
+
Defined in: web-fetcher/dist/index.d.ts:1161
|
|
97
97
|
|
|
98
98
|
Unique identifier for the session.
|
|
99
99
|
|
|
@@ -105,9 +105,9 @@ Unique identifier for the session.
|
|
|
105
105
|
|
|
106
106
|
### options
|
|
107
107
|
|
|
108
|
-
> `protected` **options**: `FetcherOptions`
|
|
108
|
+
> `protected` **options**: [`FetcherOptions`](../interfaces/FetcherOptions.md)
|
|
109
109
|
|
|
110
|
-
Defined in: web-fetcher/dist/index.d.ts:
|
|
110
|
+
Defined in: web-fetcher/dist/index.d.ts:1157
|
|
111
111
|
|
|
112
112
|
#### Inherited from
|
|
113
113
|
|
|
@@ -115,11 +115,21 @@ Defined in: web-fetcher/dist/index.d.ts:2272
|
|
|
115
115
|
|
|
116
116
|
***
|
|
117
117
|
|
|
118
|
+
### \_defaultOptions?
|
|
119
|
+
|
|
120
|
+
> `static` `optional` **\_defaultOptions**: [`SearchOptions`](../interfaces/SearchOptions.md)
|
|
121
|
+
|
|
122
|
+
Defined in: [web-searcher/src/searcher.ts:55](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/searcher.ts#L55)
|
|
123
|
+
|
|
124
|
+
**`Internal`**
|
|
125
|
+
|
|
126
|
+
***
|
|
127
|
+
|
|
118
128
|
### \_isFactory
|
|
119
129
|
|
|
120
130
|
> `static` **\_isFactory**: `boolean` = `false`
|
|
121
131
|
|
|
122
|
-
Defined in: [web-searcher/src/searcher.ts:
|
|
132
|
+
Defined in: [web-searcher/src/searcher.ts:34](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/searcher.ts#L34)
|
|
123
133
|
|
|
124
134
|
***
|
|
125
135
|
|
|
@@ -127,7 +137,7 @@ Defined in: [web-searcher/src/searcher.ts:33](https://github.com/isdk/web-search
|
|
|
127
137
|
|
|
128
138
|
> `static` `optional` **alias**: `string` \| `string`[]
|
|
129
139
|
|
|
130
|
-
Defined in: [web-searcher/src/searcher.ts:
|
|
140
|
+
Defined in: [web-searcher/src/searcher.ts:46](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/searcher.ts#L46)
|
|
131
141
|
|
|
132
142
|
Engine alias(es). Can be a single string or an array of strings.
|
|
133
143
|
Useful for registering shorthand names (e.g., 'g' for 'Google').
|
|
@@ -138,7 +148,7 @@ Useful for registering shorthand names (e.g., 'g' for 'Google').
|
|
|
138
148
|
|
|
139
149
|
> `static` **createObject**: (`name`, ...`args`) => `WebSearcher`
|
|
140
150
|
|
|
141
|
-
Defined in: [web-searcher/src/searcher.ts:
|
|
151
|
+
Defined in: [web-searcher/src/searcher.ts:122](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/searcher.ts#L122)
|
|
142
152
|
|
|
143
153
|
Creates an instance of the registered search engine.
|
|
144
154
|
|
|
@@ -164,11 +174,31 @@ An instance of the search engine.
|
|
|
164
174
|
|
|
165
175
|
***
|
|
166
176
|
|
|
177
|
+
### currentInstanceIndex?
|
|
178
|
+
|
|
179
|
+
> `static` `optional` **currentInstanceIndex**: `number`
|
|
180
|
+
|
|
181
|
+
Defined in: [web-searcher/src/searcher.ts:52](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/searcher.ts#L52)
|
|
182
|
+
|
|
183
|
+
Globally shared index for tracking the currently active instance (node) across sessions.
|
|
184
|
+
|
|
185
|
+
***
|
|
186
|
+
|
|
187
|
+
### defaultBaseUrls?
|
|
188
|
+
|
|
189
|
+
> `static` `optional` **defaultBaseUrls**: `string`[]
|
|
190
|
+
|
|
191
|
+
Defined in: [web-searcher/src/searcher.ts:49](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/searcher.ts#L49)
|
|
192
|
+
|
|
193
|
+
Default base URLs for engines that support multiple instances.
|
|
194
|
+
|
|
195
|
+
***
|
|
196
|
+
|
|
167
197
|
### forEach()
|
|
168
198
|
|
|
169
199
|
> `static` **forEach**: (`cb`) => `void`
|
|
170
200
|
|
|
171
|
-
Defined in: [web-searcher/src/searcher.ts:
|
|
201
|
+
Defined in: [web-searcher/src/searcher.ts:129](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/searcher.ts#L129)
|
|
172
202
|
|
|
173
203
|
Iterates over all registered engines.
|
|
174
204
|
|
|
@@ -190,7 +220,7 @@ Callback function to invoke for each registered engine.
|
|
|
190
220
|
|
|
191
221
|
> `static` **get**: (`name`) => *typeof* `WebSearcher`
|
|
192
222
|
|
|
193
|
-
Defined in: [web-searcher/src/searcher.ts:
|
|
223
|
+
Defined in: [web-searcher/src/searcher.ts:113](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/searcher.ts#L113)
|
|
194
224
|
|
|
195
225
|
Retrieves a registered search engine class by name.
|
|
196
226
|
|
|
@@ -214,7 +244,7 @@ The search engine class constructor.
|
|
|
214
244
|
|
|
215
245
|
> `static` `optional` **name**: `string`
|
|
216
246
|
|
|
217
|
-
Defined in: [web-searcher/src/searcher.ts:
|
|
247
|
+
Defined in: [web-searcher/src/searcher.ts:41](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/searcher.ts#L41)
|
|
218
248
|
|
|
219
249
|
Custom engine name. If not provided, it is derived from the class name.
|
|
220
250
|
For example, `GoogleSearcher` becomes `Google`.
|
|
@@ -225,7 +255,7 @@ For example, `GoogleSearcher` becomes `Google`.
|
|
|
225
255
|
|
|
226
256
|
> `static` **register**: (`ctor`, `options?`) => `boolean`
|
|
227
257
|
|
|
228
|
-
Defined in: [web-searcher/src/searcher.ts:
|
|
258
|
+
Defined in: [web-searcher/src/searcher.ts:98](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/searcher.ts#L98)
|
|
229
259
|
|
|
230
260
|
Registers a search engine class.
|
|
231
261
|
|
|
@@ -255,7 +285,7 @@ Registration options. If a string is provided, it is used as the registered name
|
|
|
255
285
|
|
|
256
286
|
> `static` **setAliases**: (`ctor`, ...`aliases`) => `void`
|
|
257
287
|
|
|
258
|
-
Defined in: [web-searcher/src/searcher.ts:
|
|
288
|
+
Defined in: [web-searcher/src/searcher.ts:137](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/searcher.ts#L137)
|
|
259
289
|
|
|
260
290
|
Sets aliases for a registered engine.
|
|
261
291
|
|
|
@@ -283,7 +313,7 @@ Aliases to add.
|
|
|
283
313
|
|
|
284
314
|
> `static` **unregister**: (`name?`) => `void`
|
|
285
315
|
|
|
286
|
-
Defined in: [web-searcher/src/searcher.ts:
|
|
316
|
+
Defined in: [web-searcher/src/searcher.ts:105](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/searcher.ts#L105)
|
|
287
317
|
|
|
288
318
|
Unregisters a search engine.
|
|
289
319
|
|
|
@@ -307,7 +337,7 @@ The name or class to unregister.
|
|
|
307
337
|
|
|
308
338
|
> **get** **pagination**(): [`PaginationConfig`](../interfaces/PaginationConfig.md) \| `undefined`
|
|
309
339
|
|
|
310
|
-
Defined in: [web-searcher/src/searcher.ts:
|
|
340
|
+
Defined in: [web-searcher/src/searcher.ts:236](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/searcher.ts#L236)
|
|
311
341
|
|
|
312
342
|
Optional pagination configuration.
|
|
313
343
|
Defines how the searcher navigates to subsequent pages.
|
|
@@ -324,15 +354,17 @@ If undefined, the searcher will only fetch the first page.
|
|
|
324
354
|
|
|
325
355
|
#### Get Signature
|
|
326
356
|
|
|
327
|
-
> **get**
|
|
357
|
+
> **get** **template**(): [`FetcherOptions`](../interfaces/FetcherOptions.md)
|
|
328
358
|
|
|
329
|
-
Defined in: [web-searcher/src/searcher.ts:
|
|
359
|
+
Defined in: [web-searcher/src/searcher.ts:226](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/searcher.ts#L226)
|
|
330
360
|
|
|
331
361
|
The declarative template for the fetch options.
|
|
332
362
|
|
|
333
|
-
Subclasses
|
|
363
|
+
Subclasses can implement this getter to provide the engine configuration,
|
|
334
364
|
including the base URL, search parameters pattern, and extraction rules.
|
|
335
365
|
|
|
366
|
+
This getter is **optional** if you override [getTemplate](#gettemplate).
|
|
367
|
+
|
|
336
368
|
Supports variable injection using syntax like `${query}`, `${offset}`, etc.
|
|
337
369
|
|
|
338
370
|
##### Example
|
|
@@ -348,21 +380,128 @@ get template() {
|
|
|
348
380
|
|
|
349
381
|
##### Returns
|
|
350
382
|
|
|
351
|
-
`FetcherOptions`
|
|
383
|
+
[`FetcherOptions`](../interfaces/FetcherOptions.md)
|
|
384
|
+
|
|
385
|
+
***
|
|
386
|
+
|
|
387
|
+
### defaultOptions
|
|
388
|
+
|
|
389
|
+
#### Get Signature
|
|
390
|
+
|
|
391
|
+
> **get** `static` **defaultOptions**(): [`SearchOptions`](../interfaces/SearchOptions.md)
|
|
392
|
+
|
|
393
|
+
Defined in: [web-searcher/src/searcher.ts:61](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/searcher.ts#L61)
|
|
394
|
+
|
|
395
|
+
Gets or sets the default search parameters for this specific engine class.
|
|
396
|
+
This does not include settings from parent classes.
|
|
397
|
+
|
|
398
|
+
##### Returns
|
|
399
|
+
|
|
400
|
+
[`SearchOptions`](../interfaces/SearchOptions.md)
|
|
401
|
+
|
|
402
|
+
#### Set Signature
|
|
403
|
+
|
|
404
|
+
> **set** `static` **defaultOptions**(`options`): `void`
|
|
405
|
+
|
|
406
|
+
Defined in: [web-searcher/src/searcher.ts:68](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/searcher.ts#L68)
|
|
407
|
+
|
|
408
|
+
##### Parameters
|
|
409
|
+
|
|
410
|
+
###### options
|
|
411
|
+
|
|
412
|
+
[`SearchOptions`](../interfaces/SearchOptions.md)
|
|
413
|
+
|
|
414
|
+
##### Returns
|
|
415
|
+
|
|
416
|
+
`void`
|
|
352
417
|
|
|
353
418
|
## Methods
|
|
354
419
|
|
|
420
|
+
### \_execute()
|
|
421
|
+
|
|
422
|
+
> `protected` **\_execute**\<`R`\>(`actionOptions`, `context?`): `Promise`\<`FetchActionResult`\<`R`\>\>
|
|
423
|
+
|
|
424
|
+
Defined in: web-fetcher/dist/index.d.ts:1187
|
|
425
|
+
|
|
426
|
+
Executes a single action within the session.
|
|
427
|
+
|
|
428
|
+
#### Type Parameters
|
|
429
|
+
|
|
430
|
+
##### R
|
|
431
|
+
|
|
432
|
+
`R` *extends* `FetchReturnType` = `"response"`
|
|
433
|
+
|
|
434
|
+
The expected return type of the action.
|
|
435
|
+
|
|
436
|
+
#### Parameters
|
|
437
|
+
|
|
438
|
+
##### actionOptions
|
|
439
|
+
|
|
440
|
+
`_RequireAtLeastOne`
|
|
441
|
+
|
|
442
|
+
Configuration for the action to be executed.
|
|
443
|
+
|
|
444
|
+
##### context?
|
|
445
|
+
|
|
446
|
+
`FetchContext`
|
|
447
|
+
|
|
448
|
+
Optional context override for this specific execution. Defaults to the session context.
|
|
449
|
+
|
|
450
|
+
#### Returns
|
|
451
|
+
|
|
452
|
+
`Promise`\<`FetchActionResult`\<`R`\>\>
|
|
453
|
+
|
|
454
|
+
A promise that resolves to the result of the action.
|
|
455
|
+
|
|
456
|
+
#### Example
|
|
457
|
+
|
|
458
|
+
```ts
|
|
459
|
+
await session.execute({ name: 'goto', params: { url: 'https://example.com' } });
|
|
460
|
+
```
|
|
461
|
+
|
|
462
|
+
#### Inherited from
|
|
463
|
+
|
|
464
|
+
`FetchSession._execute`
|
|
465
|
+
|
|
466
|
+
***
|
|
467
|
+
|
|
468
|
+
### \_logDebug()
|
|
469
|
+
|
|
470
|
+
> `protected` **\_logDebug**(`category`, ...`args`): `void`
|
|
471
|
+
|
|
472
|
+
Defined in: web-fetcher/dist/index.d.ts:1173
|
|
473
|
+
|
|
474
|
+
#### Parameters
|
|
475
|
+
|
|
476
|
+
##### category
|
|
477
|
+
|
|
478
|
+
`string`
|
|
479
|
+
|
|
480
|
+
##### args
|
|
481
|
+
|
|
482
|
+
...`any`[]
|
|
483
|
+
|
|
484
|
+
#### Returns
|
|
485
|
+
|
|
486
|
+
`void`
|
|
487
|
+
|
|
488
|
+
#### Inherited from
|
|
489
|
+
|
|
490
|
+
`FetchSession._logDebug`
|
|
491
|
+
|
|
492
|
+
***
|
|
493
|
+
|
|
355
494
|
### createContext()
|
|
356
495
|
|
|
357
496
|
> `protected` **createContext**(`options`): `FetchContext`
|
|
358
497
|
|
|
359
|
-
Defined in: [web-searcher/src/searcher.ts:
|
|
498
|
+
Defined in: [web-searcher/src/searcher.ts:254](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/searcher.ts#L254)
|
|
360
499
|
|
|
361
500
|
#### Parameters
|
|
362
501
|
|
|
363
502
|
##### options
|
|
364
503
|
|
|
365
|
-
`FetcherOptions` = `...`
|
|
504
|
+
[`FetcherOptions`](../interfaces/FetcherOptions.md) = `...`
|
|
366
505
|
|
|
367
506
|
#### Returns
|
|
368
507
|
|
|
@@ -378,7 +517,7 @@ Defined in: [web-searcher/src/searcher.ts:155](https://github.com/isdk/web-searc
|
|
|
378
517
|
|
|
379
518
|
> **dispose**(): `Promise`\<`void`\>
|
|
380
519
|
|
|
381
|
-
Defined in: web-fetcher/dist/index.d.ts:
|
|
520
|
+
Defined in: web-fetcher/dist/index.d.ts:1233
|
|
382
521
|
|
|
383
522
|
Disposes of the session and its associated engine.
|
|
384
523
|
|
|
@@ -401,9 +540,7 @@ This method should be called when the session is no longer needed to free up res
|
|
|
401
540
|
|
|
402
541
|
> **execute**\<`R`\>(`actionOptions`, `context?`): `Promise`\<`FetchActionResult`\<`R`\>\>
|
|
403
542
|
|
|
404
|
-
Defined in: web-fetcher/dist/index.d.ts:
|
|
405
|
-
|
|
406
|
-
Executes a single action within the session.
|
|
543
|
+
Defined in: web-fetcher/dist/index.d.ts:1188
|
|
407
544
|
|
|
408
545
|
#### Type Parameters
|
|
409
546
|
|
|
@@ -411,34 +548,20 @@ Executes a single action within the session.
|
|
|
411
548
|
|
|
412
549
|
`R` *extends* `FetchReturnType` = `"response"`
|
|
413
550
|
|
|
414
|
-
The expected return type of the action.
|
|
415
|
-
|
|
416
551
|
#### Parameters
|
|
417
552
|
|
|
418
553
|
##### actionOptions
|
|
419
554
|
|
|
420
555
|
`_RequireAtLeastOne`
|
|
421
556
|
|
|
422
|
-
Configuration for the action to be executed.
|
|
423
|
-
|
|
424
557
|
##### context?
|
|
425
558
|
|
|
426
559
|
`FetchContext`
|
|
427
560
|
|
|
428
|
-
Optional context override for this specific execution. Defaults to the session context.
|
|
429
|
-
|
|
430
561
|
#### Returns
|
|
431
562
|
|
|
432
563
|
`Promise`\<`FetchActionResult`\<`R`\>\>
|
|
433
564
|
|
|
434
|
-
A promise that resolves to the result of the action.
|
|
435
|
-
|
|
436
|
-
#### Example
|
|
437
|
-
|
|
438
|
-
```ts
|
|
439
|
-
await session.execute({ name: 'goto', params: { url: 'https://example.com' } });
|
|
440
|
-
```
|
|
441
|
-
|
|
442
565
|
#### Inherited from
|
|
443
566
|
|
|
444
567
|
`FetchSession.execute`
|
|
@@ -449,7 +572,7 @@ await session.execute({ name: 'goto', params: { url: 'https://example.com' } });
|
|
|
449
572
|
|
|
450
573
|
> **executeAll**(`actions`, `options?`): `Promise`\<\{ `outputs`: `Record`\<`string`, `any`\>; `result`: `FetchResponse` \| `undefined`; \}\>
|
|
451
574
|
|
|
452
|
-
Defined in: web-fetcher/dist/index.d.ts:
|
|
575
|
+
Defined in: web-fetcher/dist/index.d.ts:1205
|
|
453
576
|
|
|
454
577
|
Executes a sequence of actions.
|
|
455
578
|
|
|
@@ -457,13 +580,13 @@ Executes a sequence of actions.
|
|
|
457
580
|
|
|
458
581
|
##### actions
|
|
459
582
|
|
|
460
|
-
`_RequireAtLeastOne`\<`FetchActionProperties`, `"
|
|
583
|
+
`_RequireAtLeastOne`\<`FetchActionProperties`, `"name"` \| `"id"` \| `"action"`\>[]
|
|
461
584
|
|
|
462
585
|
An array of action options to be executed in order.
|
|
463
586
|
|
|
464
587
|
##### options?
|
|
465
588
|
|
|
466
|
-
`Partial
|
|
589
|
+
`Partial`\<[`FetcherOptions`](../interfaces/FetcherOptions.md)\> & `object`
|
|
467
590
|
|
|
468
591
|
Optional temporary configuration overrides (e.g., timeoutMs, headers) for this batch of actions.
|
|
469
592
|
These overrides do not affect the main session context.
|
|
@@ -493,7 +616,7 @@ const { result, outputs } = await session.executeAll([
|
|
|
493
616
|
|
|
494
617
|
> `protected` **formatOptions**(`options`): `Record`\<`string`, `any`\>
|
|
495
618
|
|
|
496
|
-
Defined in: [web-searcher/src/searcher.ts:
|
|
619
|
+
Defined in: [web-searcher/src/searcher.ts:498](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/searcher.ts#L498)
|
|
497
620
|
|
|
498
621
|
Transforms standard options into engine-specific template variables.
|
|
499
622
|
|
|
@@ -521,7 +644,7 @@ A dictionary of variables to be injected into the template.
|
|
|
521
644
|
|
|
522
645
|
> **getOutputs**(): `Record`\<`string`, `any`\>
|
|
523
646
|
|
|
524
|
-
Defined in: web-fetcher/dist/index.d.ts:
|
|
647
|
+
Defined in: web-fetcher/dist/index.d.ts:1216
|
|
525
648
|
|
|
526
649
|
Retrieves all outputs accumulated during the session.
|
|
527
650
|
|
|
@@ -541,7 +664,7 @@ A record of stored output data.
|
|
|
541
664
|
|
|
542
665
|
> **getState**(): `Promise`\<\{ `cookies`: `Cookie`[]; `sessionState?`: `any`; \} \| `undefined`\>
|
|
543
666
|
|
|
544
|
-
Defined in: web-fetcher/dist/index.d.ts:
|
|
667
|
+
Defined in: web-fetcher/dist/index.d.ts:1222
|
|
545
668
|
|
|
546
669
|
Gets the current state of the session, including cookies and engine-specific state.
|
|
547
670
|
|
|
@@ -557,16 +680,49 @@ A promise resolving to the session state, or undefined if no engine is initializ
|
|
|
557
680
|
|
|
558
681
|
***
|
|
559
682
|
|
|
683
|
+
### getTemplate()
|
|
684
|
+
|
|
685
|
+
> `protected` **getTemplate**(`variables`, `options`): [`FetcherOptions`](../interfaces/FetcherOptions.md)
|
|
686
|
+
|
|
687
|
+
Defined in: [web-searcher/src/searcher.ts:250](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/searcher.ts#L250)
|
|
688
|
+
|
|
689
|
+
Dynamically retrieves the fetch template based on current variables and search options.
|
|
690
|
+
|
|
691
|
+
Subclasses can override this method to return different extraction rules (actions)
|
|
692
|
+
or URL patterns based on the search category, region, or other parameters.
|
|
693
|
+
|
|
694
|
+
#### Parameters
|
|
695
|
+
|
|
696
|
+
##### variables
|
|
697
|
+
|
|
698
|
+
`Record`\<`string`, `any`\>
|
|
699
|
+
|
|
700
|
+
The calculated variables (from formatOptions, pagination, etc.).
|
|
701
|
+
|
|
702
|
+
##### options
|
|
703
|
+
|
|
704
|
+
[`SearchOptions`](../interfaces/SearchOptions.md)
|
|
705
|
+
|
|
706
|
+
The original search options provided by the user.
|
|
707
|
+
|
|
708
|
+
#### Returns
|
|
709
|
+
|
|
710
|
+
[`FetcherOptions`](../interfaces/FetcherOptions.md)
|
|
711
|
+
|
|
712
|
+
The fetcher configuration to be used for the current request.
|
|
713
|
+
|
|
714
|
+
***
|
|
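Reviewer note: the new `getTemplate(variables, options)` hook and the `${query}` / `${offset}` injection described for `template` can be sketched as follows. This is a minimal, self-contained illustration and not the package's implementation: `TemplateSketch`, `SearchOptsSketch`, `inject`, the example URLs and selectors are all hypothetical, and whether the real library URL-encodes injected values is an assumption.

```typescript
// Hypothetical, trimmed-down shapes; the documented FetcherOptions has many more fields.
interface TemplateSketch {
  url: string;
  actions: { name: string; selector: string }[];
}
interface SearchOptsSketch {
  category?: 'web' | 'news';
}

// Replace `${name}` placeholders with URL-encoded values.
// Placeholders without a matching variable are left untouched.
function inject(template: string, variables: Record<string, unknown>): string {
  return template.replace(/\$\{(\w+)\}/g, (match: string, key: string) =>
    key in variables ? encodeURIComponent(String(variables[key])) : match,
  );
}

// Pick a URL pattern and extraction rule per category, then inject the variables.
function getTemplate(
  variables: Record<string, unknown>,
  options: SearchOptsSketch,
): TemplateSketch {
  const isNews = options.category === 'news';
  const pattern = isNews
    ? 'https://news.example.com/search?q=${query}&start=${offset}'
    : 'https://www.example.com/search?q=${query}&start=${offset}';
  return {
    url: inject(pattern, variables),
    actions: [{ name: 'extract', selector: isNews ? 'article.story' : 'div.result' }],
  };
}

const newsTemplate = getTemplate({ query: 'a b', offset: 10 }, { category: 'news' });
```

Keeping the category switch inside one function mirrors the documented intent: a static `template` getter covers the simple case, while `getTemplate` lets a subclass vary rules per request.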
715
|
+
|
|
560
716
|
### search()
|
|
561
717
|
|
|
562
718
|
> **search**(`query`, `options`): `Promise`\<[`StandardSearchResult`](../interfaces/StandardSearchResult.md)[]\>
|
|
563
719
|
|
|
564
|
-
Defined in: [web-searcher/src/searcher.ts:
|
|
720
|
+
Defined in: [web-searcher/src/searcher.ts:284](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/searcher.ts#L284)
|
|
565
721
|
|
|
566
722
|
Executes a search query.
|
|
567
723
|
|
|
568
|
-
This method handles the pagination loop, variable injection,
|
|
569
|
-
and result transformation.
|
|
724
|
+
This method handles the pagination loop, multi-instance failover, variable injection,
|
|
725
|
+
fetching, and result transformation.
|
|
570
726
|
|
|
571
727
|
#### Parameters
|
|
572
728
|
|
|
@@ -594,7 +750,7 @@ A promise resolving to an array of standardized search results.
|
|
|
594
750
|
|
|
595
751
|
> `protected` **transform**(`outputs`, `context`): `Promise`\<[`StandardSearchResult`](../interfaces/StandardSearchResult.md)[]\>
|
|
596
752
|
|
|
597
|
-
Defined in: [web-searcher/src/searcher.ts:
|
|
753
|
+
Defined in: [web-searcher/src/searcher.ts:480](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/searcher.ts#L480)
|
|
598
754
|
|
|
599
755
|
Transform and clean the raw extracted results.
|
|
600
756
|
|
|
@@ -623,24 +779,71 @@ A promise resolving to an array of standardized search results.
|
|
|
623
779
|
|
|
624
780
|
***
|
|
625
781
|
|
|
782
|
+
### validateFetchResult()
|
|
783
|
+
|
|
784
|
+
> `protected` **validateFetchResult**(`results`, `context`): `Promise`\<`boolean`\>
|
|
785
|
+
|
|
786
|
+
Defined in: [web-searcher/src/searcher.ts:462](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/searcher.ts#L462)
|
|
787
|
+
|
|
788
|
+
Hook for subclasses to validate fetched results before they are accepted.
|
|
789
|
+
If this returns false, the instance manager will consider the fetch a failure
|
|
790
|
+
and automatically switch to the next available baseUrl (if any).
|
|
791
|
+
|
|
792
|
+
#### Parameters
|
|
793
|
+
|
|
794
|
+
##### results
|
|
795
|
+
|
|
796
|
+
[`StandardSearchResult`](../interfaces/StandardSearchResult.md)[]
|
|
797
|
+
|
|
798
|
+
The extracted results.
|
|
799
|
+
|
|
800
|
+
##### context
|
|
801
|
+
|
|
802
|
+
[`SearchContext`](../interfaces/SearchContext.md)
|
|
803
|
+
|
|
804
|
+
Context including the current baseUrl and page.
|
|
805
|
+
|
|
806
|
+
#### Returns
|
|
807
|
+
|
|
808
|
+
`Promise`\<`boolean`\>
|
|
809
|
+
|
|
810
|
+
A promise resolving to true if valid, false otherwise.
|
|
811
|
+
|
|
812
|
+
***
|
|
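Reviewer note: the interaction between `validateFetchResult()`, `defaultBaseUrls` and `currentInstanceIndex` might look roughly like the sketch below. It is an assumption-laden illustration, not the library's instance manager: every name here is hypothetical, and the code is kept synchronous for brevity even though the documented hook returns a `Promise`.

```typescript
// Hypothetical stand-ins for illustration; not the package's real types.
interface ResultSketch {
  title: string;
  url: string;
}
interface PageContextSketch {
  baseUrl: string;
  page: number;
}
type FetchPage = (baseUrl: string) => ResultSketch[];
type Validate = (results: ResultSketch[], context: PageContextSketch) => boolean;

// Shared across calls, playing the role the docs give to `currentInstanceIndex`.
let currentInstanceIndex = 0;

// Start from the last known-good instance and rotate through the list once.
// A thrown error and a failed validation both count as a failure.
function fetchWithFailover(
  baseUrls: string[],
  fetchPage: FetchPage,
  validate: Validate,
): { baseUrl: string; results: ResultSketch[] } {
  const start = baseUrls.length > 0 ? currentInstanceIndex % baseUrls.length : 0;
  const ordered = baseUrls.slice(start).concat(baseUrls.slice(0, start));
  for (const baseUrl of ordered) {
    try {
      const results = fetchPage(baseUrl);
      if (validate(results, { baseUrl, page: 0 })) {
        currentInstanceIndex = baseUrls.indexOf(baseUrl); // remember the working instance
        return { baseUrl, results };
      }
    } catch {
      // Treat the error like a failed validation and try the next instance.
    }
  }
  throw new Error('All instances failed or returned invalid results');
}
```

Remembering the index of the working instance means later searches skip instances that were just observed to fail, which appears to be the purpose of sharing the index globally.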
813
|
+
|
|
814
|
+
### getDefaultOptions()
|
|
815
|
+
|
|
816
|
+
> `static` **getDefaultOptions**(): [`SearchOptions`](../interfaces/SearchOptions.md)
|
|
817
|
+
|
|
818
|
+
Defined in: [web-searcher/src/searcher.ts:76](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/searcher.ts#L76)
|
|
819
|
+
|
|
820
|
+
Retrieves the combined default search options by traversing the prototype chain.
|
|
821
|
+
Priority: Current class > Parent class > WebSearcher base class.
|
|
822
|
+
|
|
823
|
+
#### Returns
|
|
824
|
+
|
|
825
|
+
[`SearchOptions`](../interfaces/SearchOptions.md)
|
|
826
|
+
|
|
827
|
+
***
|
|
828
|
+
|
|
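Reviewer note: the difference between the per-class `defaultOptions` accessor (own settings only) and `getDefaultOptions()` (merged along the prototype chain, most-derived class wins) can be shown with a small stand-alone sketch. The classes and the trimmed `OptionsSketch` interface are hypothetical, and the merge strategy is a guess at how the documented priority could be implemented.

```typescript
// Hypothetical, simplified stand-in for the documented SearchOptions interface.
interface OptionsSketch {
  limit?: number;
  region?: string;
  safeSearch?: string;
}

class BaseSearcher {
  // Per-class storage; each subclass gets its own copy on first assignment.
  static _defaultOptions?: OptionsSketch;

  // Own defaults only: settings from parent classes are not included.
  static get defaultOptions(): OptionsSketch {
    return Object.prototype.hasOwnProperty.call(this, '_defaultOptions')
      ? (this._defaultOptions ?? {})
      : {};
  }

  static set defaultOptions(options: OptionsSketch) {
    // Assigning through `this` creates an own property on the class being configured.
    this._defaultOptions = options;
  }

  // Walk from the current class up to the base class and merge what is found.
  static getDefaultOptions(): OptionsSketch {
    let merged: OptionsSketch = {};
    let ctor: { _defaultOptions?: OptionsSketch } | null = this;
    while (ctor) {
      if (Object.prototype.hasOwnProperty.call(ctor, '_defaultOptions')) {
        // Values collected so far come from more-derived classes, so they win.
        merged = { ...ctor._defaultOptions, ...merged };
      }
      ctor = Object.getPrototypeOf(ctor);
    }
    return merged;
  }
}

class EngineSearcher extends BaseSearcher {}
class RegionalSearcher extends EngineSearcher {}

BaseSearcher.defaultOptions = { limit: 10, safeSearch: 'moderate' };
EngineSearcher.defaultOptions = { limit: 20 };
RegionalSearcher.defaultOptions = { region: 'de' };

const ownOnly = RegionalSearcher.defaultOptions;
const combined = RegionalSearcher.getDefaultOptions();
```

Here `ownOnly` holds just the region set on `RegionalSearcher`, while `combined` also carries the limit from `EngineSearcher` and the safe-search level from the base class.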
626
829
|
### search()
|
|
627
830
|
|
|
628
|
-
> `static` **search**(`
|
|
831
|
+
> `static` **search**(`engineNames`, `query`, `options`): `Promise`\<[`StandardSearchResult`](../interfaces/StandardSearchResult.md)[]\>
|
|
629
832
|
|
|
630
|
-
Defined in: [web-searcher/src/searcher.ts:
|
|
833
|
+
Defined in: [web-searcher/src/searcher.ts:150](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/searcher.ts#L150)
|
|
631
834
|
|
|
632
|
-
Static helper to execute a one-off search.
|
|
835
|
+
Static helper to execute a one-off search or a fallback chain.
|
|
633
836
|
|
|
634
|
-
It creates an instance of the specified engine, executes the search, and
|
|
635
|
-
|
|
837
|
+
It creates an instance of the specified engine(s), executes the search, and automatically
|
|
838
|
+
falls back to the next engine in the list if the current one fails or is exhausted.
|
|
636
839
|
|
|
637
840
|
#### Parameters
|
|
638
841
|
|
|
639
|
-
#####
|
|
842
|
+
##### engineNames
|
|
640
843
|
|
|
641
|
-
|
|
844
|
+
The name(s) of the engine(s) to use (e.g., 'Google' or ['SearXNG', 'Google']).
|
|
642
845
|
|
|
643
|
-
|
|
846
|
+
`string` | `string`[]
|
|
644
847
|
|
|
645
848
|
##### query
|
|
646
849
|
|
|
@@ -650,7 +853,7 @@ The search query string.
|
|
|
650
853
|
|
|
651
854
|
##### options
|
|
652
855
|
|
|
653
|
-
[`SearchOptions`](../interfaces/SearchOptions.md) & `FetcherOptions` = `{}`
|
|
856
|
+
[`SearchOptions`](../interfaces/SearchOptions.md) & [`FetcherOptions`](../interfaces/FetcherOptions.md) = `{}`
|
|
654
857
|
|
|
655
858
|
Combined search options and fetcher options.
|
|
656
859
|
|
|
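Reviewer note: the new `engineNames` parameter turns the static `search()` helper into a fallback chain. The sketch below shows one way such a chain could behave. It is not the package's code: the registry, `searchWithFallback` and the engine functions are hypothetical, "exhausted" is interpreted here as "returned no results", and the real helper is asynchronous.

```typescript
// Hypothetical result and engine shapes for illustration only.
interface HitSketch {
  title: string;
  url: string;
}
type EngineFn = (query: string) => HitSketch[];

// Hypothetical registry keyed by engine name; not the package's registry.
const engines: Record<string, EngineFn> = {};

// Try each named engine in order and return the first non-empty result set.
function searchWithFallback(engineNames: string | string[], query: string): HitSketch[] {
  const names = Array.isArray(engineNames) ? engineNames : [engineNames];
  const errors: string[] = [];
  for (const name of names) {
    const engine: EngineFn | undefined = engines[name];
    if (!engine) {
      errors.push(`${name}: not registered`);
      continue;
    }
    try {
      const hits = engine(query);
      if (hits.length > 0) return hits;
      errors.push(`${name}: no results`);
    } catch (err) {
      errors.push(`${name}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`All engines failed: ${errors.join('; ')}`);
}

// Example registrations: the first engine is unavailable, the second one answers.
engines['SearXNG'] = () => {
  throw new Error('instance unreachable');
};
engines['Google'] = (query) => [{ title: `Result for ${query}`, url: 'https://example.com/1' }];

const hits = searchWithFallback(['SearXNG', 'Google'], 'typescript');
```

Collecting the per-engine failure reasons makes the final error actionable when the whole chain fails.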
@@ -0,0 +1,42 @@
|
|
|
1
|
+
[**@isdk/web-searcher**](../README.md)
|
|
2
|
+
|
|
3
|
+
***
|
|
4
|
+
|
|
5
|
+
[@isdk/web-searcher](../globals.md) / extractDate
|
|
6
|
+
|
|
7
|
+
# Function: extractDate()
|
|
8
|
+
|
|
9
|
+
> **extractDate**(`url`, `options`): `Promise`\<`string` \| `null`\>
|
|
10
|
+
|
|
11
|
+
Defined in: [web-searcher/src/utils/extractor/date-extractor.ts:30](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/utils/extractor/date-extractor.ts#L30)
|
|
12
|
+
|
|
13
|
+
High-level convenience function to extract the publication or modification date from a URL.
|
|
14
|
+
It performs a partial fetch of the content and applies multiple extraction rules
|
|
15
|
+
(LD+JSON, Meta tags, Time tags, Headers) to find the most reliable date.
|
|
16
|
+
|
|
17
|
+
## Parameters
|
|
18
|
+
|
|
19
|
+
### url
|
|
20
|
+
|
|
21
|
+
`string`
|
|
22
|
+
|
|
23
|
+
The web page URL to analyze.
|
|
24
|
+
|
|
25
|
+
### options
|
|
26
|
+
|
|
27
|
+
[`ExtractOptions`](../interfaces/ExtractOptions.md) = `{}`
|
|
28
|
+
|
|
29
|
+
Fetch and extraction options.
|
|
30
|
+
|
|
31
|
+
## Returns
|
|
32
|
+
|
|
33
|
+
`Promise`\<`string` \| `null`\>
|
|
34
|
+
|
|
35
|
+
An ISO 8601 date string, or null if no valid date could be found.
|
|
36
|
+
|
|
37
|
+
## Example
|
|
38
|
+
|
|
39
|
+
```ts
|
|
40
|
+
const date = await extractDate('https://example.com/article');
|
|
41
|
+
console.log(date); // "2024-01-20T12:00:00.000Z"
|
|
42
|
+
```
|
|
@@ -0,0 +1,40 @@
|
|
|
1
|
+
[**@isdk/web-searcher**](../README.md)
|
|
2
|
+
|
|
3
|
+
***
|
|
4
|
+
|
|
5
|
+
[@isdk/web-searcher](../globals.md) / extractMetadataFrom
|
|
6
|
+
|
|
7
|
+
# Function: extractMetadataFrom()
|
|
8
|
+
|
|
9
|
+
> **extractMetadataFrom**(`result`, `type`): `string` \| `null`
|
|
10
|
+
|
|
11
|
+
Defined in: [web-searcher/src/utils/extractor/extractor.ts:27](https://github.com/isdk/web-searcher.js/blob/0c4757eb75b3b7c5af0231806f11e7b3c3166736/src/utils/extractor/extractor.ts#L27)
|
|
12
|
+
|
|
13
|
+
Extracts specific metadata from parsed HTML and headers based on a requested type.
|
|
14
|
+
Currently supports 'date' extraction with a prioritized fallback mechanism.
|
|
15
|
+
|
|
16
|
+
## Parameters
|
|
17
|
+
|
|
18
|
+
### result
|
|
19
|
+
|
|
20
|
+
An object containing the raw HTML content and response headers.
|
|
21
|
+
|
|
22
|
+
#### content
|
|
23
|
+
|
|
24
|
+
`string`
|
|
25
|
+
|
|
26
|
+
#### headers
|
|
27
|
+
|
|
28
|
+
`Headers`
|
|
29
|
+
|
|
30
|
+
### type
|
|
31
|
+
|
|
32
|
+
`string`
|
|
33
|
+
|
|
34
|
+
The type of metadata to extract.
|
|
35
|
+
|
|
36
|
+
## Returns
|
|
37
|
+
|
|
38
|
+
`string` \| `null`
|
|
39
|
+
|
|
40
|
+
The extracted and normalized value, or null if not found.
|
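Reviewer note: the "prioritized fallback mechanism" mentioned for `extractMetadataFrom()` and the rule list given for `extractDate()` (LD+JSON, Meta tags, Time tags, Headers) could work along the lines of the sketch below. This is only an illustration under stated assumptions: the function names, the regular expressions, the specific meta property, and the plain-object `headers` argument are hypothetical, and the assumed priority simply follows the order in which the docs list the rules.

```typescript
// Normalize any parseable date string to ISO 8601, or return null if it cannot be parsed.
function toIso(value: string | undefined): string | null {
  if (!value) return null;
  const time = Date.parse(value);
  return isNaN(time) ? null : new Date(time).toISOString();
}

// Try each source in priority order and return the first value that parses as a date.
// The regular expressions are deliberately simple and will miss many real-world pages.
function extractDateFrom(content: string, headers: Record<string, string>): string | null {
  const candidates: (string | undefined)[] = [
    // 1. JSON-LD: "datePublished": "..."
    /"datePublished"\s*:\s*"([^"]+)"/.exec(content)?.[1],
    // 2. Meta tag: <meta property="article:published_time" content="...">
    /<meta[^>]+property=["']article:published_time["'][^>]+content=["']([^"']+)["']/i.exec(content)?.[1],
    // 3. Time tag: <time datetime="...">
    /<time[^>]+datetime=["']([^"']+)["']/i.exec(content)?.[1],
    // 4. Response header (keys assumed to be lower-cased)
    headers['last-modified'],
  ];
  for (const candidate of candidates) {
    const iso = toIso(candidate);
    if (iso) return iso;
  }
  return null;
}

const html =
  '<script type="application/ld+json">{"datePublished": "2024-01-20T12:00:00Z"}</script>' +
  '<time datetime="2023-05-01T08:00:00Z">May 1</time>';
const fromJsonLd = extractDateFrom(html, {});
```

Because unparseable candidates are skipped rather than returned, a malformed JSON-LD value falls through to the next rule instead of producing an invalid date.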