@isdk/web-searcher 0.1.4 → 0.1.5

Files changed (36)
  1. package/README.cn.md +156 -7
  2. package/README.md +156 -7
  3. package/dist/index.d.mts +220 -11
  4. package/dist/index.d.ts +220 -11
  5. package/dist/index.js +1 -1
  6. package/dist/index.mjs +1 -1
  7. package/docs/README.md +156 -7
  8. package/docs/classes/GoogleSearcher.md +171 -44
  9. package/docs/classes/WebSearcher.md +158 -45
  10. package/docs/functions/extractDate.md +42 -0
  11. package/docs/functions/extractMetadataFrom.md +40 -0
  12. package/docs/functions/fetchHeaders.md +34 -0
  13. package/docs/functions/fetchPartial.md +41 -0
  14. package/docs/functions/normalizeDate.md +29 -0
  15. package/docs/functions/parseHeaders.md +28 -0
  16. package/docs/functions/parseHtml.md +31 -0
  17. package/docs/functions/testUrlsByLatency.md +38 -0
  18. package/docs/globals.md +18 -0
  19. package/docs/interfaces/CustomTimeRange.md +3 -3
  20. package/docs/interfaces/ExtractOptions.md +54 -0
  21. package/docs/interfaces/FetchExtractorOptions.md +35 -0
  22. package/docs/interfaces/FetcherOptions.md +424 -0
  23. package/docs/interfaces/HtmlData.md +53 -0
  24. package/docs/interfaces/MetadataResult.md +27 -0
  25. package/docs/interfaces/PaginationConfig.md +9 -9
  26. package/docs/interfaces/SearchContext.md +30 -4
  27. package/docs/interfaces/SearchOptions.md +77 -11
  28. package/docs/interfaces/StandardSearchResult.md +10 -10
  29. package/docs/interfaces/VerifiedUrl.md +25 -0
  30. package/docs/type-aliases/MetadataType.md +13 -0
  31. package/docs/type-aliases/SafeSearchLevel.md +1 -1
  32. package/docs/type-aliases/SearchCategory.md +2 -2
  33. package/docs/type-aliases/SearchTimeRange.md +1 -1
  34. package/docs/type-aliases/SearchTimeRangePreset.md +1 -1
  35. package/docs/type-aliases/SearcherConstructor.md +2 -2
  36. package/package.json +3 -2
@@ -6,7 +6,7 @@
 
  # Abstract Class: WebSearcher
 
- Defined in: [web-searcher/src/searcher.ts:31](https://github.com/isdk/web-searcher.js/blob/7bcd8cca4a3a7fc201a5cf3e3b4283f267eadcea/src/searcher.ts#L31)
+ Defined in: [web-searcher/src/searcher.ts:32](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/searcher.ts#L32)
 
  The abstract base class for all search engines.
 
@@ -41,7 +41,7 @@ WebSearcher.register(MySearcher);
 
  > **new WebSearcher**(`options?`): `WebSearcher`
 
- Defined in: web-fetcher/dist/index.d.ts:2287
+ Defined in: web-fetcher/dist/index.d.ts:1171
 
  Creates a new FetchSession.
 
@@ -49,7 +49,7 @@ Creates a new FetchSession.
 
  ##### options?
 
- `FetcherOptions`
+ [`FetcherOptions`](../interfaces/FetcherOptions.md)
 
  Configuration options for the fetcher.
 
@@ -67,7 +67,7 @@ Configuration options for the fetcher.
 
  > `protected` **closed**: `boolean`
 
- Defined in: web-fetcher/dist/index.d.ts:2281
+ Defined in: web-fetcher/dist/index.d.ts:1165
 
  #### Inherited from
 
@@ -79,7 +79,7 @@ Defined in: web-fetcher/dist/index.d.ts:2281
 
  > `readonly` **context**: `FetchContext`
 
- Defined in: web-fetcher/dist/index.d.ts:2280
+ Defined in: web-fetcher/dist/index.d.ts:1164
 
  The execution context for this session, containing configurations, event bus, and shared state.
 
@@ -93,7 +93,7 @@ The execution context for this session, containing configurations, event bus, an
 
  > `readonly` **id**: `string`
 
- Defined in: web-fetcher/dist/index.d.ts:2276
+ Defined in: web-fetcher/dist/index.d.ts:1160
 
  Unique identifier for the session.
 
@@ -105,9 +105,9 @@ Unique identifier for the session.
 
  ### options
 
- > `protected` **options**: `FetcherOptions`
+ > `protected` **options**: [`FetcherOptions`](../interfaces/FetcherOptions.md)
 
- Defined in: web-fetcher/dist/index.d.ts:2272
+ Defined in: web-fetcher/dist/index.d.ts:1156
 
  #### Inherited from
 
@@ -119,7 +119,7 @@ Defined in: web-fetcher/dist/index.d.ts:2272
 
  > `static` **\_isFactory**: `boolean` = `false`
 
- Defined in: [web-searcher/src/searcher.ts:33](https://github.com/isdk/web-searcher.js/blob/7bcd8cca4a3a7fc201a5cf3e3b4283f267eadcea/src/searcher.ts#L33)
+ Defined in: [web-searcher/src/searcher.ts:34](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/searcher.ts#L34)
 
  ***
 
@@ -127,7 +127,7 @@ Defined in: [web-searcher/src/searcher.ts:33](https://github.com/isdk/web-search
 
  > `static` `optional` **alias**: `string` \| `string`[]
 
- Defined in: [web-searcher/src/searcher.ts:45](https://github.com/isdk/web-searcher.js/blob/7bcd8cca4a3a7fc201a5cf3e3b4283f267eadcea/src/searcher.ts#L45)
+ Defined in: [web-searcher/src/searcher.ts:46](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/searcher.ts#L46)
 
  Engine alias(es). Can be a single string or an array of strings.
  Useful for registering shorthand names (e.g., 'g' for 'Google').
@@ -138,7 +138,7 @@ Useful for registering shorthand names (e.g., 'g' for 'Google').
 
  > `static` **createObject**: (`name`, ...`args`) => `WebSearcher`
 
- Defined in: [web-searcher/src/searcher.ts:78](https://github.com/isdk/web-searcher.js/blob/7bcd8cca4a3a7fc201a5cf3e3b4283f267eadcea/src/searcher.ts#L78)
+ Defined in: [web-searcher/src/searcher.ts:85](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/searcher.ts#L85)
 
  Creates an instance of the registered search engine.
 
@@ -164,11 +164,31 @@ An instance of the search engine.
 
  ***
 
+ ### currentInstanceIndex?
+
+ > `static` `optional` **currentInstanceIndex**: `number`
+
+ Defined in: [web-searcher/src/searcher.ts:52](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/searcher.ts#L52)
+
+ Globally shared index for tracking the currently active instance (node) across sessions.
+
+ ***
+
+ ### defaultBaseUrls?
+
+ > `static` `optional` **defaultBaseUrls**: `string`[]
+
+ Defined in: [web-searcher/src/searcher.ts:49](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/searcher.ts#L49)
+
+ Default base URLs for engines that support multiple instances.
+
+ ***
+
  ### forEach()
 
  > `static` **forEach**: (`cb`) => `void`
 
- Defined in: [web-searcher/src/searcher.ts:85](https://github.com/isdk/web-searcher.js/blob/7bcd8cca4a3a7fc201a5cf3e3b4283f267eadcea/src/searcher.ts#L85)
+ Defined in: [web-searcher/src/searcher.ts:92](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/searcher.ts#L92)
 
  Iterates over all registered engines.
 
@@ -190,7 +210,7 @@ Callback function to invoke for each registered engine.
 
  > `static` **get**: (`name`) => *typeof* `WebSearcher`
 
- Defined in: [web-searcher/src/searcher.ts:69](https://github.com/isdk/web-searcher.js/blob/7bcd8cca4a3a7fc201a5cf3e3b4283f267eadcea/src/searcher.ts#L69)
+ Defined in: [web-searcher/src/searcher.ts:76](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/searcher.ts#L76)
 
  Retrieves a registered search engine class by name.
 
@@ -214,7 +234,7 @@ The search engine class constructor.
 
  > `static` `optional` **name**: `string`
 
- Defined in: [web-searcher/src/searcher.ts:40](https://github.com/isdk/web-searcher.js/blob/7bcd8cca4a3a7fc201a5cf3e3b4283f267eadcea/src/searcher.ts#L40)
+ Defined in: [web-searcher/src/searcher.ts:41](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/searcher.ts#L41)
 
  Custom engine name. If not provided, it is derived from the class name.
  For example, `GoogleSearcher` becomes `Google`.
@@ -225,7 +245,7 @@ For example, `GoogleSearcher` becomes `Google`.
 
  > `static` **register**: (`ctor`, `options?`) => `boolean`
 
- Defined in: [web-searcher/src/searcher.ts:54](https://github.com/isdk/web-searcher.js/blob/7bcd8cca4a3a7fc201a5cf3e3b4283f267eadcea/src/searcher.ts#L54)
+ Defined in: [web-searcher/src/searcher.ts:61](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/searcher.ts#L61)
 
  Registers a search engine class.
 
@@ -255,7 +275,7 @@ Registration options. If a string is provided, it is used as the registered name
 
  > `static` **setAliases**: (`ctor`, ...`aliases`) => `void`
 
- Defined in: [web-searcher/src/searcher.ts:93](https://github.com/isdk/web-searcher.js/blob/7bcd8cca4a3a7fc201a5cf3e3b4283f267eadcea/src/searcher.ts#L93)
+ Defined in: [web-searcher/src/searcher.ts:100](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/searcher.ts#L100)
 
  Sets aliases for a registered engine.
 
@@ -283,7 +303,7 @@ Aliases to add.
 
  > `static` **unregister**: (`name?`) => `void`
 
- Defined in: [web-searcher/src/searcher.ts:61](https://github.com/isdk/web-searcher.js/blob/7bcd8cca4a3a7fc201a5cf3e3b4283f267eadcea/src/searcher.ts#L61)
+ Defined in: [web-searcher/src/searcher.ts:68](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/searcher.ts#L68)
 
  Unregisters a search engine.
 
@@ -307,7 +327,7 @@ The name or class to unregister.
 
  > **get** **pagination**(): [`PaginationConfig`](../interfaces/PaginationConfig.md) \| `undefined`
 
- Defined in: [web-searcher/src/searcher.ts:151](https://github.com/isdk/web-searcher.js/blob/7bcd8cca4a3a7fc201a5cf3e3b4283f267eadcea/src/searcher.ts#L151)
+ Defined in: [web-searcher/src/searcher.ts:198](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/searcher.ts#L198)
 
  Optional pagination configuration.
  Defines how the searcher navigates to subsequent pages.
@@ -324,15 +344,17 @@ If undefined, the searcher will only fetch the first page.
 
  #### Get Signature
 
- > **get** `abstract` **template**(): `FetcherOptions`
+ > **get** **template**(): [`FetcherOptions`](../interfaces/FetcherOptions.md)
 
- Defined in: [web-searcher/src/searcher.ts:143](https://github.com/isdk/web-searcher.js/blob/7bcd8cca4a3a7fc201a5cf3e3b4283f267eadcea/src/searcher.ts#L143)
+ Defined in: [web-searcher/src/searcher.ts:188](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/searcher.ts#L188)
 
  The declarative template for the fetch options.
 
- Subclasses **must** implement this getter to provide the engine configuration,
+ Subclasses can implement this getter to provide the engine configuration,
  including the base URL, search parameters pattern, and extraction rules.
 
+ This getter is **optional** if you override [getTemplate](#gettemplate).
+
  Supports variable injection using syntax like `${query}`, `${offset}`, etc.
 
  ##### Example
@@ -348,21 +370,47 @@ get template() {
 
  ##### Returns
 
- `FetcherOptions`
+ [`FetcherOptions`](../interfaces/FetcherOptions.md)
 
  ## Methods
 
+ ### \_logDebug()
+
+ > `protected` **\_logDebug**(`category`, ...`args`): `void`
+
+ Defined in: web-fetcher/dist/index.d.ts:1172
+
+ #### Parameters
+
+ ##### category
+
+ `string`
+
+ ##### args
+
+ ...`any`[]
+
+ #### Returns
+
+ `void`
+
+ #### Inherited from
+
+ `FetchSession._logDebug`
+
+ ***
+
  ### createContext()
 
  > `protected` **createContext**(`options`): `FetchContext`
 
- Defined in: [web-searcher/src/searcher.ts:155](https://github.com/isdk/web-searcher.js/blob/7bcd8cca4a3a7fc201a5cf3e3b4283f267eadcea/src/searcher.ts#L155)
+ Defined in: [web-searcher/src/searcher.ts:216](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/searcher.ts#L216)
 
  #### Parameters
 
  ##### options
 
- `FetcherOptions` = `...`
+ [`FetcherOptions`](../interfaces/FetcherOptions.md) = `...`
 
  #### Returns
 
@@ -378,7 +426,7 @@ Defined in: [web-searcher/src/searcher.ts:155](https://github.com/isdk/web-searc
 
  > **dispose**(): `Promise`\<`void`\>
 
- Defined in: web-fetcher/dist/index.d.ts:2346
+ Defined in: web-fetcher/dist/index.d.ts:1231
 
  Disposes of the session and its associated engine.
 
@@ -401,7 +449,7 @@ This method should be called when the session is no longer needed to free up res
 
  > **execute**\<`R`\>(`actionOptions`, `context?`): `Promise`\<`FetchActionResult`\<`R`\>\>
 
- Defined in: web-fetcher/dist/index.d.ts:2301
+ Defined in: web-fetcher/dist/index.d.ts:1186
 
  Executes a single action within the session.
 
@@ -449,7 +497,7 @@ await session.execute({ name: 'goto', params: { url: 'https://example.com' } });
 
  > **executeAll**(`actions`, `options?`): `Promise`\<\{ `outputs`: `Record`\<`string`, `any`\>; `result`: `FetchResponse` \| `undefined`; \}\>
 
- Defined in: web-fetcher/dist/index.d.ts:2318
+ Defined in: web-fetcher/dist/index.d.ts:1203
 
  Executes a sequence of actions.
 
@@ -457,13 +505,13 @@ Executes a sequence of actions.
 
  ##### actions
 
- `_RequireAtLeastOne`\<`FetchActionProperties`, `"id"` \| `"name"` \| `"action"`\>[]
+ `_RequireAtLeastOne`\<`FetchActionProperties`, `"name"` \| `"id"` \| `"action"`\>[]
 
  An array of action options to be executed in order.
 
  ##### options?
 
- `Partial`\<`FetcherOptions`\> & `object`
+ `Partial`\<[`FetcherOptions`](../interfaces/FetcherOptions.md)\> & `object`
 
  Optional temporary configuration overrides (e.g., timeoutMs, headers) for this batch of actions.
  These overrides do not affect the main session context.
@@ -493,7 +541,7 @@ const { result, outputs } = await session.executeAll([
 
  > `protected` **formatOptions**(`options`): `Record`\<`string`, `any`\>
 
- Defined in: [web-searcher/src/searcher.ts:312](https://github.com/isdk/web-searcher.js/blob/7bcd8cca4a3a7fc201a5cf3e3b4283f267eadcea/src/searcher.ts#L312)
+ Defined in: [web-searcher/src/searcher.ts:457](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/searcher.ts#L457)
 
  Transforms standard options into engine-specific template variables.
 
@@ -521,7 +569,7 @@ A dictionary of variables to be injected into the template.
 
  > **getOutputs**(): `Record`\<`string`, `any`\>
 
- Defined in: web-fetcher/dist/index.d.ts:2329
+ Defined in: web-fetcher/dist/index.d.ts:1214
 
  Retrieves all outputs accumulated during the session.
 
@@ -541,7 +589,7 @@ A record of stored output data.
 
  > **getState**(): `Promise`\<\{ `cookies`: `Cookie`[]; `sessionState?`: `any`; \} \| `undefined`\>
 
- Defined in: web-fetcher/dist/index.d.ts:2335
+ Defined in: web-fetcher/dist/index.d.ts:1220
 
  Gets the current state of the session, including cookies and engine-specific state.
 
@@ -557,16 +605,49 @@ A promise resolving to the session state, or undefined if no engine is initializ
 
  ***
 
+ ### getTemplate()
+
+ > `protected` **getTemplate**(`variables`, `options`): [`FetcherOptions`](../interfaces/FetcherOptions.md)
+
+ Defined in: [web-searcher/src/searcher.ts:212](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/searcher.ts#L212)
+
+ Dynamically retrieves the fetch template based on current variables and search options.
+
+ Subclasses can override this method to return different extraction rules (actions)
+ or URL patterns based on the search category, region, or other parameters.
+
+ #### Parameters
+
+ ##### variables
+
+ `Record`\<`string`, `any`\>
+
+ The calculated variables (from formatOptions, pagination, etc.).
+
+ ##### options
+
+ [`SearchOptions`](../interfaces/SearchOptions.md)
+
+ The original search options provided by the user.
+
+ #### Returns
+
+ [`FetcherOptions`](../interfaces/FetcherOptions.md)
+
+ The fetcher configuration to be used for the current request.
+
+ ***
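The `${query}`-style variable injection and a category-dependent template choice, as a `getTemplate` override might make it, can be sketched roughly as follows. This is a self-contained illustration and not the package's implementation: `TemplateSketch`, `injectVariables`, `pickTemplate`, the `url` field, the `"images"` category value, the example.com URLs, and the decision to URL-encode values are all assumptions made for the example.

```typescript
// Hypothetical stand-in for a fetch template; the real FetcherOptions is richer.
interface TemplateSketch {
  url: string;
}

// Replace each ${name} placeholder with the matching variable.
// Placeholders without a matching variable are left untouched.
function injectVariables(pattern: string, variables: Record<string, unknown>): string {
  return pattern.replace(/\$\{(\w+)\}/g, (match: string, key: string) =>
    Object.prototype.hasOwnProperty.call(variables, key)
      ? encodeURIComponent(String(variables[key]))
      : match
  );
}

// Pick a URL pattern from the search category (assumed category value).
function pickTemplate(category: string | undefined): TemplateSketch {
  return category === "images"
    ? { url: "https://example.com/images?q=${query}&start=${offset}" }
    : { url: "https://example.com/search?q=${query}&start=${offset}" };
}

const chosenTemplate = pickTemplate("images");
const resolvedUrl = injectVariables(chosenTemplate.url, { query: "hello world", offset: 10 });
console.log(resolvedUrl); // https://example.com/images?q=hello%20world&start=10
```

Keeping the template declarative and resolving placeholders in a separate step means the same pattern can be reused for every page of a paginated search by changing only the variables.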
+
  ### search()
 
  > **search**(`query`, `options`): `Promise`\<[`StandardSearchResult`](../interfaces/StandardSearchResult.md)[]\>
 
- Defined in: [web-searcher/src/searcher.ts:182](https://github.com/isdk/web-searcher.js/blob/7bcd8cca4a3a7fc201a5cf3e3b4283f267eadcea/src/searcher.ts#L182)
+ Defined in: [web-searcher/src/searcher.ts:246](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/searcher.ts#L246)
 
  Executes a search query.
 
- This method handles the pagination loop, variable injection, fetching,
- and result transformation.
+ This method handles the pagination loop, multi-instance failover, variable injection,
+ fetching, and result transformation.
 
  #### Parameters
 
@@ -594,7 +675,7 @@ A promise resolving to an array of standardized search results.
 
  > `protected` **transform**(`outputs`, `context`): `Promise`\<[`StandardSearchResult`](../interfaces/StandardSearchResult.md)[]\>
 
- Defined in: [web-searcher/src/searcher.ts:294](https://github.com/isdk/web-searcher.js/blob/7bcd8cca4a3a7fc201a5cf3e3b4283f267eadcea/src/searcher.ts#L294)
+ Defined in: [web-searcher/src/searcher.ts:439](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/searcher.ts#L439)
 
  Transform and clean the raw extracted results.
 
@@ -623,24 +704,56 @@ A promise resolving to an array of standardized search results.
 
  ***
 
+ ### validateFetchResult()
+
+ > `protected` **validateFetchResult**(`results`, `context`): `Promise`\<`boolean`\>
+
+ Defined in: [web-searcher/src/searcher.ts:421](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/searcher.ts#L421)
+
+ Hook for subclasses to validate fetched results before they are accepted.
+ If this returns false, the instance manager will consider the fetch a failure
+ and automatically switch to the next available baseUrl (if any).
+
+ #### Parameters
+
+ ##### results
+
+ [`StandardSearchResult`](../interfaces/StandardSearchResult.md)[]
+
+ The extracted results.
+
+ ##### context
+
+ [`SearchContext`](../interfaces/SearchContext.md)
+
+ Context including the current baseUrl and page.
+
+ #### Returns
+
+ `Promise`\<`boolean`\>
+
+ A promise resolving to true if valid, false otherwise.
+
+ ***
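The interplay between a validation hook and switching to the next base URL could look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the library's instance manager: `ResultSketch`, `fetchWithFailover`, the callback types, the wrap-around order, and the `*.example` URLs are hypothetical, and the module-level `currentInstanceIndex` only mimics the idea of a shared index.

```typescript
// Hypothetical result shape; the real StandardSearchResult has more fields.
interface ResultSketch {
  title: string;
  url: string;
}

type FetchFromInstance = (baseUrl: string) => Promise<ResultSketch[]>;
type ValidateResults = (results: ResultSketch[], baseUrl: string) => Promise<boolean>;

// Index of the instance to try first. It lives outside the function so that
// later calls start from the last instance that worked.
let currentInstanceIndex = 0;

async function fetchWithFailover(
  baseUrls: string[],
  fetchFrom: FetchFromInstance,
  validate: ValidateResults
): Promise<ResultSketch[]> {
  for (let attempt = 0; attempt < baseUrls.length; attempt++) {
    const index = (currentInstanceIndex + attempt) % baseUrls.length;
    const baseUrl = baseUrls[index];
    try {
      const results = await fetchFrom(baseUrl);
      if (await validate(results, baseUrl)) {
        currentInstanceIndex = index; // remember the working instance
        return results;
      }
      // Validation returned false: treat as a failure and try the next instance.
    } catch {
      // A thrown error also counts as a failure; move on to the next instance.
    }
  }
  throw new Error("All instances failed or returned invalid results");
}

// Demo: the first instance returns nothing, so the second one is used.
const failoverDemo = fetchWithFailover(
  ["https://a.example", "https://b.example"],
  async (baseUrl) =>
    baseUrl === "https://a.example" ? [] : [{ title: "Hit", url: baseUrl + "/1" }],
  async (results) => results.length > 0
);
failoverDemo.then((results) => console.log(results.length, currentInstanceIndex));
```

Treating a `false` validation the same way as a thrown error keeps the loop simple: an instance that answers with an empty or blocked page is skipped exactly like one that is unreachable.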
+
  ### search()
 
- > `static` **search**(`engineName`, `query`, `options`): `Promise`\<[`StandardSearchResult`](../interfaces/StandardSearchResult.md)[]\>
+ > `static` **search**(`engineNames`, `query`, `options`): `Promise`\<[`StandardSearchResult`](../interfaces/StandardSearchResult.md)[]\>
 
- Defined in: [web-searcher/src/searcher.ts:106](https://github.com/isdk/web-searcher.js/blob/7bcd8cca4a3a7fc201a5cf3e3b4283f267eadcea/src/searcher.ts#L106)
+ Defined in: [web-searcher/src/searcher.ts:113](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/searcher.ts#L113)
 
- Static helper to execute a one-off search.
+ Static helper to execute a one-off search or a fallback chain.
 
- It creates an instance of the specified engine, executes the search, and then
- automatically disposes of the session.
+ It creates an instance of the specified engine(s), executes the search, and automatically
+ falls back to the next engine in the list if the current one fails or is exhausted.
 
  #### Parameters
 
- ##### engineName
+ ##### engineNames
 
- `string`
+ The name(s) of the engine(s) to use (e.g., 'Google' or ['SearXNG', 'Google']).
 
- The name of the engine to use (e.g., 'Google').
+ `string` | `string`[]
 
  ##### query
 
@@ -650,7 +763,7 @@ The search query string.
 
  ##### options
 
- [`SearchOptions`](../interfaces/SearchOptions.md) & `FetcherOptions` = `{}`
+ [`SearchOptions`](../interfaces/SearchOptions.md) & [`FetcherOptions`](../interfaces/FetcherOptions.md) = `{}`
 
  Combined search options and fetcher options.
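The engine fallback chain that the new static `search` signature describes (a single name or an ordered list of names) might be approximated like this. The sketch is self-contained and hypothetical: `searchWithFallback`, `EngineSearch`, the `engineTable` lookup, and the rules "empty results count as exhausted" and "rethrow the last error if every engine fails" are assumptions, not documented behavior of the package.

```typescript
// Hypothetical result shape; the real StandardSearchResult has more fields.
interface ResultSketch {
  title: string;
  url: string;
}

type EngineSearch = (query: string) => Promise<ResultSketch[]>;

async function searchWithFallback(
  engineNames: string | string[],
  query: string,
  engineTable: Record<string, EngineSearch>
): Promise<ResultSketch[]> {
  // Accept a single engine name or an ordered list of names.
  const names = Array.isArray(engineNames) ? engineNames : [engineNames];
  let lastError: unknown;
  for (const engineName of names) {
    const engine = engineTable[engineName];
    if (!engine) {
      lastError = new Error(`Unknown engine: ${engineName}`);
      continue;
    }
    try {
      const results = await engine(query);
      if (results.length > 0) return results; // first engine with results wins
    } catch (error) {
      lastError = error; // remember the failure, then try the next engine
    }
  }
  if (lastError !== undefined) throw lastError;
  return [];
}

// Demo: the first engine fails, the second one answers.
const engineTable: Record<string, EngineSearch> = {
  SearXNG: async () => {
    throw new Error("instance unavailable");
  },
  Google: async (query) => [{ title: `Result for ${query}`, url: "https://example.com/1" }],
};
const fallbackDemo = searchWithFallback(["SearXNG", "Google"], "typescript", engineTable);
fallbackDemo.then((results) => console.log(results[0].title));
```

Ordering the list from the preferred engine to the last resort keeps the common case cheap, since later engines are only contacted when earlier ones fail.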
656
769
 
@@ -0,0 +1,42 @@
+ [**@isdk/web-searcher**](../README.md)
+
+ ***
+
+ [@isdk/web-searcher](../globals.md) / extractDate
+
+ # Function: extractDate()
+
+ > **extractDate**(`url`, `options`): `Promise`\<`string` \| `null`\>
+
+ Defined in: [web-searcher/src/utils/extractor/date-extractor.ts:30](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/utils/extractor/date-extractor.ts#L30)
+
+ High-level convenience function to extract the publication or modification date from a URL.
+ It performs a partial fetch of the content and applies multiple extraction rules
+ (LD+JSON, Meta tags, Time tags, Headers) to find the most reliable date.
+
+ ## Parameters
+
+ ### url
+
+ `string`
+
+ The web page URL to analyze.
+
+ ### options
+
+ [`ExtractOptions`](../interfaces/ExtractOptions.md) = `{}`
+
+ Fetch and extraction options.
+
+ ## Returns
+
+ `Promise`\<`string` \| `null`\>
+
+ An ISO 8601 date string, or null if no valid date could be found.
+
+ ## Example
+
+ ```ts
+ const date = await extractDate('https://example.com/article');
+ console.log(date); // "2024-01-20T12:00:00.000Z"
+ ```
@@ -0,0 +1,40 @@
+ [**@isdk/web-searcher**](../README.md)
+
+ ***
+
+ [@isdk/web-searcher](../globals.md) / extractMetadataFrom
+
+ # Function: extractMetadataFrom()
+
+ > **extractMetadataFrom**(`result`, `type`): `string` \| `null`
+
+ Defined in: [web-searcher/src/utils/extractor/extractor.ts:27](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/utils/extractor/extractor.ts#L27)
+
+ Extracts specific metadata from parsed HTML and headers based on a requested type.
+ Currently supports 'date' extraction with a prioritized fallback mechanism.
+
+ ## Parameters
+
+ ### result
+
+ An object containing the raw HTML content and response headers.
+
+ #### content
+
+ `string`
+
+ #### headers
+
+ `Headers`
+
+ ### type
+
+ `string`
+
+ The type of metadata to extract.
+
+ ## Returns
+
+ `string` \| `null`
+
+ The extracted and normalized value, or null if not found.
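A prioritized fallback for date extraction can be pictured as an ordered list of rules where the first parseable hit wins. The sketch below is an assumption-heavy illustration and not the package's extractor: `DateRule`, `dateRules`, `extractDateSketch`, and the regular expressions are hypothetical, the rule order simply follows the order in which the sources are commonly listed (JSON-LD, meta tags, time tags, headers), and headers are passed as a plain lowercase-keyed record rather than a `Headers` object.

```typescript
type DateRule = (content: string, headers: Record<string, string>) => string | undefined;

// Rules are tried in order; earlier rules are considered more reliable.
const dateRules: DateRule[] = [
  // 1. JSON-LD: "datePublished": "..."
  (content) => /"datePublished"\s*:\s*"([^"]+)"/.exec(content)?.[1],
  // 2. Meta tag: <meta property="article:published_time" content="...">
  (content) =>
    /<meta[^>]+property="article:published_time"[^>]+content="([^"]+)"/i.exec(content)?.[1],
  // 3. Time tag: <time datetime="...">
  (content) => /<time[^>]+datetime="([^"]+)"/i.exec(content)?.[1],
  // 4. HTTP header fallback
  (_content, headers) => headers["last-modified"],
];

function extractDateSketch(content: string, headers: Record<string, string>): string | null {
  for (const rule of dateRules) {
    const raw = rule(content, headers);
    if (raw === undefined) continue;
    const time = Date.parse(raw);
    // Skip candidates that do not parse and fall through to the next rule.
    if (!Number.isNaN(time)) return new Date(time).toISOString();
  }
  return null;
}

const sampleHtml =
  '<html><head><meta property="article:published_time" content="2024-01-20T12:00:00Z"></head>' +
  '<body><time datetime="2023-05-01">May</time></body></html>';
const extractedDate = extractDateSketch(sampleHtml, {
  "last-modified": "Sat, 20 Jan 2024 12:00:00 GMT",
});
console.log(extractedDate); // "2024-01-20T12:00:00.000Z" (the meta tag outranks the time tag)
```

Regex matching on raw HTML is fragile and only serves the illustration; a real extractor would more likely work on parsed markup.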
@@ -0,0 +1,34 @@
+ [**@isdk/web-searcher**](../README.md)
+
+ ***
+
+ [@isdk/web-searcher](../globals.md) / fetchHeaders
+
+ # Function: fetchHeaders()
+
+ > **fetchHeaders**(`url`, `options`): `Promise`\<`Headers` \| `null`\>
+
+ Defined in: [web-searcher/src/utils/extractor/fetcher.ts:19](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/utils/extractor/fetcher.ts#L19)
+
+ Fetches only the HTTP headers for a given URL using a HEAD request.
+ Useful for checking 'last-modified' without downloading the body.
+
+ ## Parameters
+
+ ### url
+
+ `string`
+
+ The URL to check.
+
+ ### options
+
+ [`FetchExtractorOptions`](../interfaces/FetchExtractorOptions.md) = `{}`
+
+ Request options.
+
+ ## Returns
+
+ `Promise`\<`Headers` \| `null`\>
+
+ The Headers object, or null on failure.
@@ -0,0 +1,41 @@
+ [**@isdk/web-searcher**](../README.md)
+
+ ***
+
+ [@isdk/web-searcher](../globals.md) / fetchPartial
+
+ # Function: fetchPartial()
+
+ > **fetchPartial**(`url`, `maxBytes`, `options`): `Promise`\<\{ `content`: `string`; `headers`: `Headers`; \} \| `null`\>
+
+ Defined in: [web-searcher/src/utils/extractor/fetcher.ts:55](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/utils/extractor/fetcher.ts#L55)
+
+ Fetches a partial amount of content from a URL.
+ Automatically handles character set detection from the Content-Type header.
+ Aborts the request once the specified maxBytes is reached.
+
+ ## Parameters
+
+ ### url
+
+ `string`
+
+ The URL to fetch.
+
+ ### maxBytes
+
+ `number` = `32768`
+
+ The maximum number of bytes to read. Defaults to 32KB.
+
+ ### options
+
+ [`FetchExtractorOptions`](../interfaces/FetchExtractorOptions.md) = `{}`
+
+ Request options.
+
+ ## Returns
+
+ `Promise`\<\{ `content`: `string`; `headers`: `Headers`; \} \| `null`\>
+
+ An object containing the decoded content string and the response headers.
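The "stop once `maxBytes` is reached" part of a partial fetch can be sketched independently of any network call. The following is a rough, self-contained illustration and not the package's code: `readAtMost` and `fakeBody` are hypothetical names, the input is modeled as an async iterable of byte chunks, and the text is decoded as UTF-8 instead of detecting the charset from `Content-Type`.

```typescript
// Read chunks until maxBytes have been collected, then stop consuming the source.
async function readAtMost(
  chunks: AsyncIterable<Uint8Array>,
  maxBytes: number = 32768
): Promise<string> {
  const collected: Uint8Array[] = [];
  let total = 0;
  for await (const chunk of chunks) {
    const remaining = maxBytes - total;
    if (remaining <= 0) break;
    // Keep only the part of the chunk that still fits into the budget.
    const piece = chunk.length > remaining ? chunk.subarray(0, remaining) : chunk;
    collected.push(piece);
    total += piece.length;
    // Leaving the loop early signals the source (via the iterator's return()) to stop.
    if (total >= maxBytes) break;
  }
  // Concatenate the pieces and decode them in one go.
  const bytes = new Uint8Array(total);
  let offset = 0;
  for (const piece of collected) {
    bytes.set(piece, offset);
    offset += piece.length;
  }
  return new TextDecoder("utf-8").decode(bytes);
}

// Demo source: three chunks standing in for a streamed response body.
async function* fakeBody(): AsyncGenerator<Uint8Array> {
  const encoder = new TextEncoder();
  yield encoder.encode("<html><head>");
  yield encoder.encode("<title>Demo</title>");
  yield encoder.encode("</head></html>");
}

const partialText = readAtMost(fakeBody(), 16);
partialText.then((text) => console.log(text)); // <html><head><tit
```

A byte budget can cut a multi-byte character in half; in that case the decoder emits a replacement character at the end, which is usually acceptable when only the document head is of interest.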
@@ -0,0 +1,29 @@
+ [**@isdk/web-searcher**](../README.md)
+
+ ***
+
+ [@isdk/web-searcher](../globals.md) / normalizeDate
+
+ # Function: normalizeDate()
+
+ > **normalizeDate**(`dateStr`): `string` \| `null`
+
+ Defined in: [web-searcher/src/utils/extractor/date-normalizer.ts:9](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/utils/extractor/date-normalizer.ts#L9)
+
+ Normalizes a date string into a standard ISO 8601 format (UTC).
+ It handles various formats (YYYY-MM-DD, RFC2822, etc.) and performs
+ aggressive cleaning and sanity checks.
+
+ ## Parameters
+
+ ### dateStr
+
+ The raw date string to normalize.
+
+ `string` | `null`
+
+ ## Returns
+
+ `string` \| `null`
+
+ An ISO 8601 string (e.g., "2024-01-20T00:00:00.000Z") or null if invalid.
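A simplified picture of such normalization, using only the built-in `Date`, might look like the sketch below. It is not the package's normalizer: `normalizeDateSketch` is a hypothetical name, the cleaning step is limited to whitespace, and the "reject dates more than one day in the future" rule is an assumed example of a sanity check. Parsing of non-ISO formats through `Date.parse` is also engine-dependent.

```typescript
function normalizeDateSketch(dateStr: string | null): string | null {
  if (!dateStr) return null;
  // Minimal cleaning: trim and collapse runs of whitespace.
  const cleaned = dateStr.trim().replace(/\s+/g, " ");
  const time = Date.parse(cleaned);
  if (Number.isNaN(time)) return null;
  // Assumed sanity check: reject dates more than one day in the future.
  if (time > Date.now() + 24 * 60 * 60 * 1000) return null;
  // toISOString always renders in UTC with millisecond precision.
  return new Date(time).toISOString();
}

console.log(normalizeDateSketch("2024-01-20")); // "2024-01-20T00:00:00.000Z"
console.log(normalizeDateSketch("not a date")); // null
```

Date-only ISO strings such as `2024-01-20` are interpreted as UTC midnight by the language specification, which is why the first call yields `T00:00:00.000Z` regardless of the local time zone.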
@@ -0,0 +1,28 @@
+ [**@isdk/web-searcher**](../README.md)
+
+ ***
+
+ [@isdk/web-searcher](../globals.md) / parseHeaders
+
+ # Function: parseHeaders()
+
+ > **parseHeaders**(`headers`): `Record`\<`string`, `string`\>
+
+ Defined in: [web-searcher/src/utils/extractor/parser.ts:25](https://github.com/isdk/web-searcher.js/blob/955bc509edda39926bd12c6c2b8c28da7eb13ff5/src/utils/extractor/parser.ts#L25)
+
+ Converts a Web API Headers object into a plain JavaScript record.
+ All header names are converted to lowercase for consistent access.
+
+ ## Parameters
+
+ ### headers
+
+ `Headers`
+
+ The Headers object to parse.
+
+ ## Returns
+
+ `Record`\<`string`, `string`\>
+
+ A record where keys are lowercase header names.
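The conversion described here amounts to copying name/value pairs into a plain object under lowercase keys. A minimal sketch, assuming the input can be iterated as `[name, value]` pairs, is shown below; `headersToRecord` and `headerPairs` are hypothetical names and the function is not the package's `parseHeaders`.

```typescript
// Accepts anything that iterates as [name, value] pairs.
function headersToRecord(headers: Iterable<[string, string]>): Record<string, string> {
  const record: Record<string, string> = {};
  for (const [key, value] of headers) {
    // Lowercase the name so lookups do not depend on the sender's casing.
    record[key.toLowerCase()] = value;
  }
  return record;
}

const headerPairs: [string, string][] = [
  ["Content-Type", "text/html; charset=utf-8"],
  ["Last-Modified", "Sat, 20 Jan 2024 12:00:00 GMT"],
];
const headerRecord = headersToRecord(headerPairs);
console.log(headerRecord["last-modified"]); // Sat, 20 Jan 2024 12:00:00 GMT
```

A Web API `Headers` object also iterates as `[name, value]` pairs, so it can be passed to the same function in environments whose typings mark it as iterable.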