@isdk/proxy 0.1.1 → 0.1.3

Files changed (36)
  1. package/README.cn.md +249 -9
  2. package/README.md +249 -7
  3. package/dist/index.d.mts +374 -41
  4. package/dist/index.d.ts +374 -41
  5. package/dist/index.js +1 -1
  6. package/dist/index.mjs +1 -1
  7. package/docs/README.md +249 -7
  8. package/docs/classes/OfflineCacheMissError.md +426 -0
  9. package/docs/classes/SmartCache.md +81 -13
  10. package/docs/functions/createCachedFetch.md +1 -1
  11. package/docs/functions/createFetchWithCache.md +1 -1
  12. package/docs/functions/extractData.md +34 -5
  13. package/docs/functions/fetchWithCache.md +18 -9
  14. package/docs/functions/generateCacheKey.md +34 -4
  15. package/docs/functions/getSiteConfig.md +39 -0
  16. package/docs/functions/isAllowed.md +35 -8
  17. package/docs/functions/isCacheable.md +27 -0
  18. package/docs/functions/isGlob.md +23 -0
  19. package/docs/functions/isMatch.md +44 -0
  20. package/docs/functions/prefetch.md +33 -0
  21. package/docs/globals.md +15 -0
  22. package/docs/interfaces/BodyFilterConfig.md +77 -0
  23. package/docs/interfaces/CacheEntry.md +9 -9
  24. package/docs/interfaces/CacheMetadata.md +8 -8
  25. package/docs/interfaces/CacheRule.md +80 -0
  26. package/docs/interfaces/FetchWithCacheContext.md +44 -16
  27. package/docs/interfaces/FetchWithCacheOptions.md +40 -12
  28. package/docs/interfaces/KeyFilterConfig.md +11 -7
  29. package/docs/interfaces/PrefetchOptions.md +107 -0
  30. package/docs/interfaces/PrefetchRequest.md +31 -0
  31. package/docs/interfaces/PrefetchResult.md +47 -0
  32. package/docs/interfaces/ProxyConfig.md +4 -4
  33. package/docs/interfaces/SiteCacheConfig.md +56 -11
  34. package/docs/interfaces/SmartCacheOptions.md +32 -6
  35. package/docs/variables/OfflineCacheMissErrorCode.md +18 -0
  36. package/package.json +5 -3
package/README.cn.md CHANGED
@@ -13,6 +13,8 @@
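  ## Key Features

  - **🚀 Hybrid Multi-tier Cache**: L1 (in-memory LRU) provides extremely fast responses, while L2 (content-addressable disk storage via `cacache`) provides persistence.
+ - **📥 HTTP POST & Multi-Method Support**: Full support for caching POST, PUT, and other non-GET methods, with built-in intelligent request-body fingerprinting.
+ - **🎯 Fine-Grained Rule Interception**: Use `cacheRules` for surgically precise cache control over specific paths or query parameters.
  - **🌊 Native Streaming**: Built entirely on stream pipelines internally, which inherently prevents out-of-memory (OOM) failures when proxying large files.
  - **🧠 Intelligent Metadata Residency**: Regardless of file size, metadata (Headers, Status, Policy) always stays in memory, ensuring nanosecond-level cache policy decisions.
  - **🔄 Stale-While-Revalidate (SWR)**: Returns stale data immediately while silently updating the cache in the background, for a "zero-wait" response.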
@@ -31,6 +33,8 @@ pnpm add @isdk/proxy

  The primary way to use `@isdk/proxy` is via the `fetchWithCache` function, which can wrap any HTTP request logic.

+ ### Basic Usage (GET Requests)
+
  ```typescript
  import { SmartCache, createCachedFetch } from '@isdk/proxy';

@@ -40,31 +44,114 @@ const cache = new SmartCache({
    maxMemorySize: 1024 * 1024 // memory threshold: 1MB
  });

- // 2. Create a pre-configured cached fetcher (automatically guards against cache stampede internally)
+ // 2. Create a pre-configured cached fetcher
  const myFetch = createCachedFetch({
    cache,
    config: {
      staleIfError: true,
-     forceCache: false // set to true to force-cache everything regardless of no-store; suited to offline apps
    },
    backgroundUpdate: true // enable SWR (silent background update after expiry)
  });

- // 3. Happily use it anywhere in your app!
- const request = new Request('https://api.example.com/data');
- const response = await myFetch(request, (req) => fetch(req)); // pass in any fetch function that returns a Promise<Response>
+ // 3. Happily use it!
+ const response = await myFetch(new Request('https://api.example.com/data'), (req) => fetch(req));
+ console.log(response.headers.get('x-proxy-cache'));
+ ```
+
+ ### Advanced Usage: Caching POST Requests
+
+ You can enable POST/PUT caching by configuring `methods`, and use the `body` filter to exclude dynamic fields in the request body (such as timestamps and nonces) so that cache keys stay stable.
+
+ ```typescript
+ const myPostFetch = createCachedFetch({
+   cache,
+   config: {
+     methods: ['GET', 'POST'], // allow caching POST
+     body: {
+       exclude: ['timestamp', 'nonce'] // ignore these dynamic fields when generating cache keys
+     },
+     cacheRules: [
+       { method: 'POST', path: '/api/v1/query' } // applies only to specific POST endpoints
+     ],
+     forceCache: true // backends usually send no Cache-Control for POST, so forced caching is recommended
+   }
+ });
+ ```
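Annotation (not part of the package diff): the `body.exclude` option in the example above is described as keeping cache keys stable. The following is a minimal sketch of that idea, assuming a JSON body and top-level fields only. `bodyFingerprint` is a hypothetical helper written for illustration; the diff does not show how the library actually computes its fingerprint.

```typescript
// Hypothetical helper, NOT the @isdk/proxy implementation: builds a stable
// string from a JSON body after dropping excluded top-level fields.
function bodyFingerprint(rawBody: string, exclude: string[]): string {
  const parsed: unknown = JSON.parse(rawBody);
  if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) {
    // Non-object JSON (number, string, array, null): nothing to filter.
    return JSON.stringify(parsed);
  }
  const record = parsed as Record<string, unknown>;
  const kept = Object.keys(record)
    .filter((key) => !exclude.includes(key))
    .sort(); // sort top-level keys so property order does not matter
  const canonical: Record<string, unknown> = {};
  for (const key of kept) canonical[key] = record[key];
  return JSON.stringify(canonical);
}

// Two requests that differ only in the excluded dynamic fields.
const first = bodyFingerprint('{"q":"cats","timestamp":1,"nonce":"a"}', ["timestamp", "nonce"]);
const second = bodyFingerprint('{"nonce":"b","timestamp":2,"q":"cats"}', ["timestamp", "nonce"]);
console.log(first === second); // true: only the stable field "q" remains
```

A real implementation would presumably hash the canonical string and handle nested objects; both are left out here to keep the sketch short.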
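+
+ ## Configuration Details: `SiteCacheConfig`
+
+ | Option | Type | Description |
+ | :--- | :--- | :--- |
+ | `methods` | `string[]` | List of HTTP methods allowed to be cached. Defaults to `['GET', 'HEAD']` only. |
+ | `cacheRules` | `CacheRule[]` | Fine-grained interception rules. If configured, a request must match at least one rule to be cached. |
+ | `query` | `KeyFilterConfig` | URL query parameter filter (`include` whitelist / `exclude` blacklist). |
+ | `headers` | `KeyFilterConfig` | Request header filter. |
+ | `cookies` | `KeyFilterConfig` | Cookie field filter. |
+ | `body` | `KeyFilterConfig` | Request body field filter. Supports field-level filtering for JSON; also supports extracting key data via an `extract` regex. |
+ | `staleIfError`| `boolean` | Whether to force-return the stale local cache when the network request fails. |
+ | `forceCache` | `boolean` | Whether to force caching regardless of origin directives; commonly used for offline apps. |
+ | `offline` | `boolean` | Offline mode. When enabled, only the cache is read; throws `OfflineCacheMissError` if nothing is cached. |
+
+ ### `CacheRule` Rule Object
+
+ - `method`: The HTTP method to match.
+ - `path`: Path matching (supports **regular expressions**, **Glob wildcards**, **array format**, or **prefix matching**).
+ - `query`: Key-value matching. A value can be a `string` (exact/Glob match), `true` (the parameter must exist), `false` (the parameter must not exist), or a `RegExp` (regex match).
+ - `bodyType`: Matches the body type; supports `'json'`, `'text'`, `'binary'`.
+ - `body`: Body content matching (supports **regular expressions**, **Glob wildcards**, or **array format**).
+
+ ---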
+
+ ### `fetchWithCache` Advanced Options
+
+ In addition to `SiteCacheConfig`, `fetchWithCache` also supports the following control options:
+
+ | Option | Type | Description |
+ | :--- | :--- | :--- |
+ | `backgroundUpdate` | `boolean` | Whether to enable asynchronous background updates (SWR). Defaults to `true`. |
+ | `onBackgroundUpdate`| `function` | Callback that receives the update Promise when a background update is triggered. Can be used for task tracking. |
+ | `generateKey` | `function` | Custom cache key generation function. |
+
+ ### Pattern Matching Notes
+
+ `@isdk/proxy` provides powerful pattern matching for all configurable fields:
+
+ | Pattern Type | Example | Description |
+ | :--- | :--- | :--- |
+ | **Regular expression** | `/api/v[12]/.*/i` | JavaScript regex (written as a string in JSON, e.g. `"/api/v[12]/.*/i"`) |
+ | **Glob wildcard** | `/**/*.json` | File-path-style wildcard matching |
+ | **Negation pattern** | `['!/api/private/**', '/api/**']` | Exclusion match (starts with `!`) |
+ | **Array format** | `['/api/v1/*', '/api/v2/*']` | Multiple patterns combined (OR logic, negation takes precedence) |
+ | **Boolean** | `true` / `false` | For query parameters: must exist / must not exist |

- console.log(response.headers.get('x-proxy-cache')); // prints: "MISS", "HIT", "STALE", or "STALE_IF_ERROR"
- const data = await response.json();
+ **Advanced pattern matching example:**
+
+ ```typescript
+ const myFetch = createCachedFetch({
+   cache,
+   config: {
+     cacheRules: [
+       {
+         path: ['/api/v1/items/*', '!/api/v1/items/private/*'], // v1 items, excluding private
+         query: {
+           format: '/^(json|xml)$/', // regex match on the format parameter
+           'page*': true // Glob: any parameter starting with "page" must exist
+         },
+         body: /\"action\"\s*:\s*\"query\"/ // regex match on the body content
+       }
+     ]
+   }
+ });
  ```
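
  ## Adapters

  `@isdk/proxy` aims to be a clean, environment-agnostic core. While the core library stays pure, you can easily integrate or find adapters for specific environments:

- - **MSW Adapter**: See `@isdk/proxy-msw` (separate package) to use this caching engine as an MSW interceptor.
+ - **HTTP Proxy Server (Node.js)**: See [@isdk/proxy-server](https://www.npmjs.com/package/@isdk/proxy-server) (separate package) for starting a standalone HTTP caching proxy server.
+ - **Crawlee Adapter**: See [@isdk/proxy-crawlee](https://www.npmjs.com/package/@isdk/proxy-crawlee) (separate package) for integrating into the Crawlee web crawler lifecycle.
+ - **MSW Adapter**: See `@isdk/proxy-msw` (separate package) to use this caching engine as an MSW interceptor.
  - **Axios Adapter**: Easily implemented by converting the Axios config into a Web-standard `Request`.
- - **Crawlee Adapter**: Can be integrated into the crawler lifecycle to reduce repeated fetching.

  ## Architecture Design in Detail
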
@@ -124,6 +211,159 @@ const data = await response.json();
  - **`options.maxMemorySize`**: Size threshold (in bytes) for a response body to enter memory (L1); files larger than this are streamed directly to disk (default `1048576`, i.e. 1MB).
  - **`options.storagePath`**: Physical storage path for the L2 disk cache (cacache); defaults to the operating system's temp directory.

+ ### Utility Functions
+
+ The following utility functions are exported for advanced usage:
+
+ #### `isMatch(pattern, value, usePrefix?)`
+
+ General-purpose pattern matching function. Supports regular expressions, Glob, array patterns (including negation), and string prefix/exact matching.
+
+ - **`pattern`**: `string | RegExp | (string | RegExp)[]`
+ - **`value`**: The string to test
+ - **`usePrefix`**: For plain strings, whether to use prefix matching instead of exact matching (default: `false`)
+ - **Returns**: `boolean`
+
+ ```typescript
+ import { isMatch } from '@isdk/proxy';
+
+ isMatch('/api/v[12]/.*', '/api/v1/users'); // regular expression
+ isMatch('/api/**/*.json', '/api/v1/data.json'); // Glob wildcard
+ isMatch(['!/private/**', '/api/**'], '/api/data'); // negation pattern
+ ```
+
+ #### `isGlob(pattern)`
+
+ Checks whether a string uses Glob syntax.
+
+ - **`pattern`**: `string`
+ - **Returns**: `boolean`
+
+ #### `getSiteConfig(urlString, proxyConfig)`
+
+ Gets the site-level cache configuration that corresponds to a URL.
+
+ - **`urlString`**: The full request URL
+ - **`proxyConfig`**: A `ProxyConfig` object containing the `sites` and `default` configuration
+ - **Returns**: `SiteCacheConfig`
+
+ ```typescript
+ import { getSiteConfig } from '@isdk/proxy';
+
+ const config = getSiteConfig('https://api.example.com/data', {
+   default: { methods: ['GET'] },
+   sites: {
+     'api.example.com': { methods: ['GET', 'POST'], forceCache: true },
+     '/internal/': { staleIfError: true } // prefix match
+   }
+ });
+ ```
+
+ #### `isAllowed(key, config, defaultAllowed?)`
+
+ Determines whether the given key is allowed to take part in cache fingerprint computation.
+
+ - **`key`**: The key name to check
+ - **`config`**: `KeyFilterConfig` object; supports `include` (whitelist) or `exclude` (blacklist)
+ - **`defaultAllowed`**: Optional. The default value used when there is no config or the config does not match
+ - **Returns**: `boolean | undefined`
+
+ **Priority logic**:
+
+ 1. `exclude` hit → returns `false` immediately (highest priority)
+ 2. `include` exists and hits → returns `true`
+ 3. `include` exists but does not hit → returns `false`
+ 4. Neither configured → uses `defaultAllowed` (returns `undefined` if not passed)
+
+ ```typescript
+ import { isAllowed } from '@isdk/proxy';
+
+ // No config
+ isAllowed('key'); // undefined (falsy)
+
+ // Whitelist
+ isAllowed('id', { include: ['id', 'name'] }); // true
+ isAllowed('email', { include: ['id', 'name'] }); // false
+
+ // Blacklist
+ isAllowed('password', { exclude: ['password'] }); // false
+ isAllowed('name', { exclude: ['password'] }); // undefined (falsy)
+
+ // defaultAllowed is needed to set a default
+ isAllowed('name', { exclude: ['password'] }, true); // true
+ ```
+
+ #### `extractData(source, config, defaultAllowed?)`
+
+ Extracts and normalizes data from a source object according to the filter config. Used for generating cache fingerprints.
+
+ - **`source`**: The original data object (Query, Headers, Cookies, etc.)
+ - **`config`**: `KeyFilterConfig` object
+ - **`defaultAllowed`**: Optional. Whether to allow extraction when there is no config or the config does not match (default `false`)
+ - **Returns**: `Record<string, string[]>`, the normalized data with lowercase keys and sorted array values
+
+ ```typescript
+ import { extractData } from '@isdk/proxy';
+
+ const headers = { 'Content-Type': 'application/json', 'X-Request-Id': '123' };
+
+ // No keys are extracted by default
+ extractData(headers); // {}
+
+ // Extract all keys
+ extractData(headers, undefined, true); // { 'content-type': ['application/json'], 'x-request-id': ['123'] }
+
+ // Whitelist
+ extractData(headers, { include: ['content-type'] }); // { 'content-type': ['application/json'] }
+
+ // Blacklist
+ extractData(headers, { include: ['*'], exclude: ['x-request-id'] }, true); // { 'content-type': ['application/json'] }
+ ```
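Annotation (not part of the package diff): the return value above is described as normalized data with lowercase keys and sorted array values. A minimal sketch of just that normalization step follows, with filtering left out. `normalizeEntries` is a hypothetical name, and merging keys that differ only by case is an assumption of this sketch rather than documented behavior.

```typescript
// Hypothetical normalizer, NOT the library's extractData: lowercases keys and
// sorts each key's values so equivalent inputs produce identical output.
function normalizeEntries(
  source: Record<string, string | string[]>,
): Record<string, string[]> {
  const result: Record<string, string[]> = {};
  for (const [key, value] of Object.entries(source)) {
    const lower = key.toLowerCase();
    const values = Array.isArray(value) ? value : [value];
    // Assumption: keys differing only by case are merged into one entry.
    result[lower] = [...(result[lower] ?? []), ...values].sort();
  }
  return result;
}

const normalized = normalizeEntries({
  "X-Tag": ["b", "a"],
  "Content-Type": "application/json",
});
console.log(JSON.stringify(normalized));
// {"x-tag":["a","b"],"content-type":["application/json"]}
```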
+
+ ### `prefetch(options)`
+
+ Pre-caching function that stores the content of a given list of URLs in the cache ahead of time.
+
+ - **`urls`**: `PrefetchRequest[]`. Each object contains a `url` and an optional `request` config.
+ - **`config`**: The full `ProxyConfig` configuration.
+ - **`cache`**: A `SmartCache` instance.
+ - **`concurrency`**: Number of concurrent requests (default `3`).
+ - **`onProgress`**: Progress callback `(completed, total, url) => void`.
+
+ ```typescript
+ import { prefetch } from '@isdk/proxy';
+
+ const result = await prefetch({
+   urls: [
+     { url: 'https://api.example.com/page1' },
+     { url: 'https://api.example.com/api2', request: { method: 'POST', body: '...' } }
+   ],
+   config,
+   cache,
+   onProgress: (c, t, url) => console.log(`Progress: ${c}/${t} - ${url}`)
+ });
+ console.log(`Succeeded: ${result.succeeded}, Failed: ${result.failed}`);
+ ```
347
+
348
+ ### 错误处理:`OfflineCacheMissError`
349
+
350
+ 在开启 `offline: true` 模式时,如果请求未命中缓存,将抛出此错误。
351
+
352
+ - **`name`**: `OfflineCacheMissError`
353
+ - **`code`**: `ERR_OFFLINE_CACHE_MISS` (可通过导入 `OfflineCacheMissErrorCode` 获得)
354
+
355
+ ```typescript
356
+ import { OfflineCacheMissError } from '@isdk/proxy';
357
+
358
+ try {
359
+ await myFetch(request);
360
+ } catch (e) {
361
+ if (e instanceof OfflineCacheMissError) {
362
+ // 处理离线未命中
363
+ }
364
+ }
365
+ ```
366
+
127
367
  ### 缓存状态标头 (Cache Status Headers)
128
368
 
129
369
  由 `@isdk/proxy` 处理并返回的所有 `Response`,其 Headers 中都会注入 `x-proxy-cache` 字段以便观测生命周期,可能的值有:
package/README.md CHANGED
@@ -15,6 +15,8 @@ In high-concurrency environments—like **API Proxies**, **Web Scrapers**, or **
  ## Key Features

  - **🚀 Hybrid Multi-tier Cache**: Extreme speed with L1 (LRU Memory) and persistence with L2 (Content Addressable Disk via `cacache`).
+ - **📥 HTTP POST & Method Support**: Full support for caching POST, PUT, and other methods with intelligent request body fingerprinting.
+ - **🎯 Precision Filtering**: Fine-grained `cacheRules` to intercept specific paths or query parameters.
  - **🌊 Streaming Native**: Fully stream-based internal pipeline natively prevents Out-Of-Memory (OOM) issues when proxying large files.
  - **🧠 Intelligent Meta-Residency**: Metadata (Headers, Status, Policy) stays in memory regardless of body size, ensuring nanosecond cache policy evaluations.
  - **🔄 Stale-While-Revalidate (SWR)**: Serve stale content instantly while updating the cache silently in the background.
@@ -33,6 +35,8 @@ pnpm add @isdk/proxy

  The primary way to use `@isdk/proxy` is via the `fetchWithCache` function, which can wrap any HTTP request logic.

+ ### Basic Usage (GET)
+
  ```typescript
  import { SmartCache, createCachedFetch } from '@isdk/proxy';

@@ -42,28 +46,112 @@ const cache = new SmartCache({
    maxMemorySize: 1024 * 1024 // 1MB threshold
  });

- // 2. Create a pre-configured cached fetcher (automatically tracks concurrent requests)
+ // 2. Create a pre-configured cached fetcher
  const myFetch = createCachedFetch({
    cache,
    config: {
      staleIfError: true,
-     forceCache: false // Set to true to cache everything (ignore no-store) for offline-first apps
    },
    backgroundUpdate: true // Enable SWR
  });

- // 3. Use it anywhere in your app!
- const request = new Request('https://api.example.com/data');
- const response = await myFetch(request, (req) => fetch(req));
+ // 3. Use it!
+ const response = await myFetch(new Request('https://api.example.com/data'), (req) => fetch(req));
+ console.log(response.headers.get('x-proxy-cache'));
+ ```
+
+ ### Advanced Usage: Caching POST Requests
+
+ You can cache POST/PUT requests by enabling methods and defining body filters to ignore dynamic fields (like timestamps) in the request body.
+
+ ```typescript
+ const myPostFetch = createCachedFetch({
+   cache,
+   config: {
+     methods: ['GET', 'POST'], // Enable POST caching
+     body: {
+       exclude: ['timestamp', 'nonce'] // Ignore these fields when generating cache keys
+     },
+     cacheRules: [
+       { method: 'POST', path: '/api/v1/query' } // Only cache specific POST endpoints
+     ],
+     forceCache: true // Often needed for POST if backend doesn't send Cache-Control
+   }
+ });
+ ```
+
+ ## Configuration: `SiteCacheConfig`
+
+ | Field | Type | Description |
+ | :--- | :--- | :--- |
+ | `methods` | `string[]` | List of allowed HTTP methods. Default: `['GET', 'HEAD']`. |
+ | `cacheRules` | `CacheRule[]` | Fine-grained rules. If set, a request must match at least one rule to be cached. |
+ | `query` | `KeyFilterConfig` | Filters for URL search parameters (`include`/`exclude`). |
+ | `headers` | `KeyFilterConfig` | Filters for request headers. |
+ | `cookies` | `KeyFilterConfig` | Filters for cookies. |
+ | `body` | `KeyFilterConfig` | Filters for request body fields. For JSON, supports field-level filtering; also supports extracting key data via `extract` regex. |
+ | `staleIfError`| `boolean` | Serve stale cache on network failure. |
+ | `forceCache` | `boolean` | Ignore `no-store` and force caching (useful for offline support). |
+ | `offline` | `boolean` | Offline mode. Only reads from cache; throws `OfflineCacheMissError` if no cache exists. |
+
+ ### `CacheRule` Object
+
+ - `method`: HTTP method to match.
+ - `path`: URL pathname matching (supports **RegExp**, **Glob**, **Array**, or **prefix match**).
+ - `query`: Key-value pairs. Values can be `string` (exact/Glob match), `true` (must exist), `false` (must not exist), or `RegExp`.
+ - `bodyType`: Match body type. Supports `'json'`, `'text'`, `'binary'`.
+ - `body`: Body content matching (supports **RegExp**, **Glob**, or **Array**).
+
+ ---
+
+ ### `fetchWithCache` Advanced Options
+
+ In addition to `SiteCacheConfig`, `fetchWithCache` supports the following control options:
+
+ | Option | Type | Description |
+ | :--- | :--- | :--- |
+ | `backgroundUpdate` | `boolean` | Whether to enable background async update (SWR). Default is `true`. |
+ | `onBackgroundUpdate`| `function` | Callback that receives the update Promise when a background update is triggered. Useful for task tracking. |
+ | `generateKey` | `function` | Custom cache key generation function. |
+
+ ### Pattern Matching
+
+ `@isdk/proxy` provides powerful pattern matching for all configurable fields:
+
+ | Pattern Type | Example | Description |
+ | :--- | :--- | :--- |
+ | **RegExp** | `/api/v[12]/.*/i` | JavaScript RegExp (in JSON, use string like `"/api/v[12]/.*/i"`) |
+ | **Glob** | `/**/*.json` | File path style wildcard matching |
+ | **Negation** | `['!/api/private/**', '/api/**']` | Exclude patterns (prefixed with `!`, checked first) |
+ | **Array** | `['/api/v1/*', '/api/v2/*']` | Multiple patterns (OR logic, negative takes precedence) |
+ | **Boolean** | `true` / `false` | For query params: must/must not exist |

- console.log(response.headers.get('x-proxy-cache')); // "MISS", "HIT", "STALE", or "STALE_IF_ERROR"
- const data = await response.json();
+ **Example with advanced pattern matching:**
+
+ ```typescript
+ const myFetch = createCachedFetch({
+   cache,
+   config: {
+     cacheRules: [
+       {
+         path: ['/api/v1/items/*', '!/api/v1/items/private/*'], // v1 items, exclude private
+         query: {
+           format: '/^(json|xml)$/', // Regex for format param
+           'page*': true // Glob: any param starting with 'page' must exist
+         },
+         body: /\"action\"\s*:\s*\"query\"/ // Regex body match
+       }
+     ]
+   }
+ });
  ```
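Annotation (not part of the package diff): the table above says negated patterns are checked first and the remaining patterns combine with OR logic. The sketch below illustrates that ordering with a deliberately small Glob subset. `globToRegExp` and `matchesAny` are hypothetical helpers, not the library's `isMatch`, and returning `false` when only negated patterns are given is an assumption of this sketch.

```typescript
// Hypothetical matcher, NOT the library's isMatch: supports only "*" (within a
// path segment), "**" (across segments) and a leading "!" for negation.
function globToRegExp(glob: string): RegExp {
  const source = glob
    .split("**")
    .map((part) =>
      part
        .split("*")
        .map((literal) => literal.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
        .join("[^/]*"),
    )
    .join(".*");
  return new RegExp(`^${source}$`);
}

function matchesAny(patterns: string[], value: string): boolean {
  const negative = patterns
    .filter((p) => p.startsWith("!"))
    .map((p) => globToRegExp(p.slice(1)));
  const positive = patterns
    .filter((p) => !p.startsWith("!"))
    .map((p) => globToRegExp(p));
  // Negated patterns are checked first: one hit rejects the value outright.
  if (negative.some((re) => re.test(value))) return false;
  // Otherwise any single positive pattern is enough (OR logic).
  return positive.some((re) => re.test(value));
}

const rules = ["!/api/private/**", "/api/**"];
console.log(matchesAny(rules, "/api/data")); // true
console.log(matchesAny(rules, "/api/private/secret")); // false
console.log(matchesAny(rules, "/static/app.js")); // false
```

Because the negated set is evaluated before the positive set, the order of entries inside the array does not change the result in this sketch.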

  ## Adapters

  `@isdk/proxy` is designed to be framework-agnostic. While the core library is pure, you can find (or build) adapters for specific environments:

+ - **HTTP Caching Proxy Server (Node.js)**: See [@isdk/proxy-server](https://www.npmjs.com/package/@isdk/proxy-server) (separate package) for running a standalone HTTP forward proxy.
+ - **Crawlee Adapter**: See [@isdk/proxy-crawlee](https://www.npmjs.com/package/@isdk/proxy-crawlee) (separate package) for integrating with Crawlee web scraping lifecycle.
  - **MSW Adapter**: See `@isdk/proxy-msw` (separate package) to use this caching engine as an MSW interceptor.
  - **Axios Adapter**: Easily implemented by converting Axios config to Web `Request`.

@@ -116,9 +204,163 @@ The hybrid multi-tier storage engine.
  - **`options.maxMemorySize`**: Threshold (in bytes) for offloading bodies to disk (default `1048576`, i.e., 1MB).
  - **`options.storagePath`**: Disk storage path for the `cacache` engine (defaults to a system temp folder).

+ ### Utility Functions
+
+ Exported from `@isdk/proxy` for advanced usage:
+
+ #### `isMatch(pattern, value, usePrefix?)`
+
+ Universal pattern matching function. Supports RegExp, Glob, array patterns (with negation), and string prefix/exact matching.
+
+ - **`pattern`**: `string | RegExp | (string | RegExp)[]`
+ - **`value`**: The string to test against
+ - **`usePrefix`**: For plain strings, use prefix match instead of exact match (default: `false`)
+ - **Returns**: `boolean`
+
+ ```typescript
+ import { isMatch } from '@isdk/proxy';
+
+ isMatch('/api/v[12]/.*', '/api/v1/users'); // RegExp
+ isMatch('/api/**/*.json', '/api/v1/data.json'); // Glob
+ isMatch(['!/private/**', '/api/**'], '/api/data'); // Negation
+ ```
+
+ #### `isGlob(pattern)`
+
+ Check if a pattern is Glob syntax.
+
+ - **`pattern`**: `string`
+ - **Returns**: `boolean`
+
+ #### `getSiteConfig(urlString, proxyConfig)`
+
+ Get the site-specific cache configuration for a given URL.
+
+ - **`urlString`**: Full URL to match
+ - **`proxyConfig`**: `ProxyConfig` object with `sites` and `default` config
+ - **Returns**: `SiteCacheConfig`
+
+ ```typescript
+ import { getSiteConfig } from '@isdk/proxy';
+
+ const config = getSiteConfig('https://api.example.com/data', {
+   default: { methods: ['GET'] },
+   sites: {
+     'api.example.com': { methods: ['GET', 'POST'], forceCache: true },
+     '/internal/': { staleIfError: true } // prefix match
+   }
+ });
+ ```
+
+ #### `isAllowed(key, config, defaultAllowed?)`
+
+ Check if a key is allowed to participate in cache key fingerprinting.
+
+ - **`key`**: The key name to check
+ - **`config`**: `KeyFilterConfig` with `include` (whitelist) or `exclude` (blacklist)
+ - **`defaultAllowed`**: Optional. Default value when no config or no match
+ - **Returns**: `boolean | undefined`
+
+ **Priority Logic**:
+
+ 1. `exclude` hit → returns `false` (highest priority)
+ 2. `include` exists and hits → returns `true`
+ 3. `include` exists but no hit → returns `false`
+ 4. No config → uses `defaultAllowed` (returns `undefined` if not provided)
+
+ ```typescript
+ import { isAllowed } from '@isdk/proxy';
+
+ // No config
+ isAllowed('key'); // undefined (falsy)
+
+ // Whitelist
+ isAllowed('id', { include: ['id', 'name'] }); // true
+ isAllowed('email', { include: ['id', 'name'] }); // false
+
+ // Blacklist
+ isAllowed('password', { exclude: ['password'] }); // false
+ isAllowed('name', { exclude: ['password'] }); // undefined (falsy)
+
+ // Need defaultAllowed to set default
+ isAllowed('name', { exclude: ['password'] }, true); // true
+ ```
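Annotation (not part of the package diff): the four priority rules listed above can be read as a short decision chain. The sketch below restates them under simplifying assumptions: keys are compared exactly and case-sensitively, with no Glob or RegExp support. `isAllowedSketch` and `KeyFilterSketch` are hypothetical names and not the library's code.

```typescript
// Hypothetical re-statement of the four documented priority rules, using
// exact, case-sensitive key comparison only (no Glob or RegExp support).
interface KeyFilterSketch {
  include?: string[];
  exclude?: string[];
}

function isAllowedSketch(
  key: string,
  config?: KeyFilterSketch,
  defaultAllowed?: boolean,
): boolean | undefined {
  // Rule 1: an exclude hit always wins.
  if (config?.exclude?.includes(key)) return false;
  // Rules 2 and 3: when include exists, membership alone decides.
  if (config?.include) return config.include.includes(key);
  // Rule 4: nothing decided, so fall back to the caller's default.
  return defaultAllowed;
}

console.log(isAllowedSketch("key")); // undefined
console.log(isAllowedSketch("id", { include: ["id", "name"] })); // true
console.log(isAllowedSketch("email", { include: ["id", "name"] })); // false
console.log(isAllowedSketch("password", { exclude: ["password"] })); // false
console.log(isAllowedSketch("name", { exclude: ["password"] })); // undefined
console.log(isAllowedSketch("name", { exclude: ["password"] }, true)); // true
```

The printed values mirror the outcomes listed in the README example above.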
+
+ #### `extractData(source, config, defaultAllowed?)`
+
+ Extract and normalize data from a source object based on filter config. Used for generating cache fingerprints.
+
+ - **`source`**: Original data object (Query, Headers, Cookies, etc.)
+ - **`config`**: `KeyFilterConfig` object
+ - **`defaultAllowed`**: Optional. Whether to allow extraction when no config or no match (default `false`)
+ - **Returns**: `Record<string, string[]>` normalized data with lowercase keys and sorted array values
+
+ ```typescript
+ import { extractData } from '@isdk/proxy';
+
+ const headers = { 'Content-Type': 'application/json', 'X-Request-Id': '123' };
+
+ // No extraction by default
+ extractData(headers); // {}
+
+ // Extract all keys
+ extractData(headers, undefined, true); // { 'content-type': ['application/json'], 'x-request-id': ['123'] }
+
+ // Whitelist
+ extractData(headers, { include: ['content-type'] }); // { 'content-type': ['application/json'] }
+
+ // Blacklist
+ extractData(headers, { include: ['*'], exclude: ['x-request-id'] }, true); // { 'content-type': ['application/json'] }
+ ```
+
+ ### `prefetch(options)`
+
+ Pre-cache function that fetches and stores a list of URLs into cache ahead of time.
+
+ - **`urls`**: `PrefetchRequest[]`. Each object contains `url` and optional `request` config.
+ - **`config`**: `ProxyConfig` full configuration.
+ - **`cache`**: `SmartCache` instance.
+ - **`concurrency`**: Concurrency limit (default `3`).
+ - **`onProgress`**: Progress callback `(completed, total, url) => void`.
+
+ ```typescript
+ import { prefetch } from '@isdk/proxy';
+
+ const result = await prefetch({
+   urls: [
+     { url: 'https://api.example.com/page1' },
+     { url: 'https://api.example.com/api2', request: { method: 'POST', body: '...' } }
+   ],
+   config,
+   cache,
+   onProgress: (c, t, url) => console.log(`Progress: ${c}/${t} - ${url}`)
+ });
+ console.log(`Succeeded: ${result.succeeded}, Failed: ${result.failed}`);
+ ```
+
+ ### Error Handling: `OfflineCacheMissError`
+
+ When `offline: true` mode is enabled and a request does not hit the cache, this error is thrown.
+
+ - **`name`**: `OfflineCacheMissError`
+ - **`code`**: `ERR_OFFLINE_CACHE_MISS` (can be imported as `OfflineCacheMissErrorCode`)
+
+ ```typescript
+ import { OfflineCacheMissError } from '@isdk/proxy';
+
+ try {
+   await myFetch(request);
+ } catch (e) {
+   if (e instanceof OfflineCacheMissError) {
+     // Handle offline cache miss
+   }
+ }
+ ```
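Annotation (not part of the package diff): besides `instanceof`, the documented `code` value offers another way to recognize this error. The sketch below assumes only the `name` and `code` values stated above. `OfflineMissSketch`, `isOfflineMiss`, and `describeFailure` are hypothetical stand-ins so the example runs without the package, and the error message text is invented for illustration.

```typescript
// Hypothetical stand-in class so the sketch runs without the package; only the
// `name` and `code` values are taken from the README text above.
const OFFLINE_MISS_CODE = "ERR_OFFLINE_CACHE_MISS";

class OfflineMissSketch extends Error {
  readonly code = OFFLINE_MISS_CODE;
  constructor(url: string) {
    super(`No cached response for ${url} while offline`);
    this.name = "OfflineCacheMissError";
  }
}

// Duck-typed check on `code`: an alternative to `instanceof` that does not
// depend on both sides sharing the same class object.
function isOfflineMiss(error: unknown): boolean {
  return (
    typeof error === "object" &&
    error !== null &&
    (error as { code?: unknown }).code === OFFLINE_MISS_CODE
  );
}

function describeFailure(error: unknown): string {
  return isOfflineMiss(error) ? "offline-miss" : "other-error";
}

console.log(describeFailure(new OfflineMissSketch("https://api.example.com/data"))); // offline-miss
console.log(describeFailure(new Error("network down"))); // other-error
```

Checking `code` can be more forgiving than `instanceof` when two copies of a package end up in one dependency tree, though whether that matters depends on the project setup.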
+
  ### Cache Status Headers

  Every response processed by `@isdk/proxy` will include an `x-proxy-cache` header indicating its lifecycle:
+
  - `HIT`: Served entirely from L1 or L2 cache.
  - `MISS`: Bypassed cache and fetched from the origin server.
  - `STALE`: Served from stale cache while a background update was initiated (SWR).