@mostafa.hanafy/prompt-cache 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md ADDED
@@ -0,0 +1,203 @@
# @mostafa.hanafy/prompt-cache

![npm](https://img.shields.io/npm/v/@mostafa.hanafy/prompt-cache)
![npm downloads](https://img.shields.io/npm/dm/@mostafa.hanafy/prompt-cache)
![license](https://img.shields.io/npm/l/@mostafa.hanafy/prompt-cache)

Reliable caching layer for LLM API calls to reduce cost, avoid duplicate requests, and improve latency.

## Why

LLM applications often repeat the same requests:

- retries from clients
- repeated user actions
- identical prompts within short time windows
- shared context across workflows

Without caching, you pay for the same request multiple times.

`@mostafa.hanafy/prompt-cache` helps you:

- avoid duplicate API calls
- reduce token usage cost
- improve response latency
- keep caching logic simple and deterministic

## Install

```bash
npm install @mostafa.hanafy/prompt-cache
# or
pnpm add @mostafa.hanafy/prompt-cache
# or
bun add @mostafa.hanafy/prompt-cache
```

## Key capabilities

- deterministic cache keys (`key` / `keyParts`)
- TTL-based expiration
- in-memory cache adapter
- **in-flight request deduplication (prevents duplicate concurrent calls)**
- pluggable cache adapters
- lifecycle hooks

## Quick start

```ts
import { withPromptCache } from "@mostafa.hanafy/prompt-cache";

const response = await withPromptCache({
  keyParts: ["openai", "gpt-4o-mini", prompt, context],
  ttlSeconds: 60,
  call: () =>
    openai.responses.create({
      model: "gpt-4o-mini",
      input: prompt,
    }),
});
```

## Example use case

Avoid duplicate AI calls in high-traffic APIs:

```ts
await withPromptCache({
  keyParts: ["chat", userId, prompt],
  ttlSeconds: 30,
  call: () => generateResponse(prompt),
});
```

If 10 users trigger the same request at the same time:

- without cache → 10 API calls ❌
- with @mostafa.hanafy/prompt-cache → 1 API call ✅
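
The single-call behavior comes from in-flight deduplication: concurrent callers with the same key share one promise instead of each firing a request. A minimal standalone sketch of that mechanism (illustrative only, not the package's exported API; `dedupe` is a hypothetical name):

```typescript
// Sketch of in-flight deduplication: concurrent callers with the same
// key share one promise instead of each triggering a new call.
const inFlight = new Map<string, Promise<unknown>>();

function dedupe<T>(key: string, call: () => Promise<T>): Promise<T> {
  const existing = inFlight.get(key);
  if (existing) return existing as Promise<T>;
  const request = call().finally(() => {
    // Once settled, drop the entry so later calls start fresh.
    inFlight.delete(key);
  });
  inFlight.set(key, request);
  return request;
}
```

`withPromptCache` applies the same idea keyed by the derived cache key, and additionally writes the settled result to the cache.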

## How it works

```text
request
   ↓
generate key
   ↓
cache lookup
   ├─ hit  → return cached result
   └─ miss → execute call → store result
```

## API

### `withPromptCache(options)`

Wraps an async LLM call with cache lookup, in-flight deduplication, and optional lifecycle hooks.

```ts
await withPromptCache({
  key?: string,
  keyParts?: unknown[],
  ttlSeconds?: number,
  cache?: CacheAdapter,
  shouldCache?: (value) => boolean,
  onHit?: (meta) => void,
  onMiss?: (meta) => void,
  onSet?: (meta) => void,
  onError?: (error, meta) => void,
  call: () => Promise<T>,
});
```

Options:

- `key`: explicit cache key (takes precedence over `keyParts`)
- `keyParts`: parts that are deterministically hashed into a key
- `ttlSeconds`: cache TTL in seconds (default `60`)
- `cache`: custom adapter (`memoryCache` is the default)
- `shouldCache`: return `false` to skip writing a result
- `onHit` / `onMiss` / `onSet` / `onError`: lifecycle hooks
- `call`: async operation to execute on cache miss

### `createCacheKey(parts)`

Creates a deterministic key by stable-stringifying the input and hashing it.

Use this when you want to precompute or inspect keys directly.
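
For reference, the derivation (visible in `dist/index.js` below) stable-stringifies the parts with sorted object keys and hashes the result with 32-bit FNV-1a, so logically equal inputs map to the same key regardless of property order. A self-contained sketch of that derivation (`cacheKeyOf` is a stand-in name):

```typescript
// Stable stringify: objects serialize with sorted keys, so logically
// equal inputs always produce the same string.
const stableStringify = (value: unknown): string => {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map(stableStringify).join(",")}]`;
  const entries = Object.entries(value as Record<string, unknown>)
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([k, v]) => `${JSON.stringify(k)}:${stableStringify(v)}`);
  return `{${entries.join(",")}}`;
};

// 32-bit FNV-1a hash, hex-encoded.
const hashString = (input: string): string => {
  let h = 2166136261;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 16777619);
  }
  return (h >>> 0).toString(16);
};

const cacheKeyOf = (parts: unknown[]): string => hashString(stableStringify(parts));
```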

### `createMemoryCache()` and `memoryCache`

- `createMemoryCache()`: creates a new isolated in-memory cache adapter
- `memoryCache`: shared default singleton adapter used by `withPromptCache`

## Lifecycle hooks

```ts
await withPromptCache({
  keyParts: [prompt],
  onHit: (meta) => console.log("cache hit", meta),
  onMiss: (meta) => console.log("cache miss", meta),
  onSet: (meta) => console.log("stored in cache", meta),
  onError: (error, meta) => console.error("cache error", error, meta),
  call: async () => aiCall(),
});
```

## Key and TTL guidance

- Include all request inputs in `keyParts` (model, prompt, context, options).
- The in-memory cache is suitable for single-instance apps.
- Use a custom adapter (for example Redis) in distributed deployments.
- Set `ttlSeconds` to match your freshness requirements.
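
A custom adapter only has to implement the `CacheAdapter` shape exported by the package: async `get` and `set`, plus an optional `delete`. As a hedged template (a hypothetical `createTtlMapCache`, not part of the package), these are the same three methods a Redis-backed adapter would implement:

```typescript
// Hypothetical custom adapter matching the package's CacheAdapter type:
// a Map with per-entry expiry, standing in for e.g. a Redis client.
type CacheAdapter = {
  get: (key: string) => Promise<unknown | undefined>;
  set: (key: string, value: unknown, ttlSeconds?: number) => Promise<void>;
  delete?: (key: string) => Promise<void>;
};

const createTtlMapCache = (): CacheAdapter => {
  const store = new Map<string, { value: unknown; expiresAt?: number }>();
  return {
    async get(key) {
      const item = store.get(key);
      if (!item) return undefined;
      if (item.expiresAt !== undefined && Date.now() > item.expiresAt) {
        // Lazily evict expired entries on read.
        store.delete(key);
        return undefined;
      }
      return item.value;
    },
    async set(key, value, ttlSeconds) {
      const expiresAt = typeof ttlSeconds === "number" ? Date.now() + ttlSeconds * 1000 : undefined;
      store.set(key, { value, expiresAt });
    },
    async delete(key) {
      store.delete(key);
    },
  };
};
```

Pass an instance via the `cache` option: `withPromptCache({ cache: createTtlMapCache(), ... })`.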

## Testing and verification

```bash
bun test
bun run typecheck
bun run lint
bun run build
```

## Performance verification (PRD §10)

Run the benchmark harness:

```bash
# preferred
bun bench/overhead.ts

# fallback
tsx bench/overhead.ts
```

It reports `avg/p50/p95/max` latency for:

- the cache-hit path
- the miss path
- the concurrent dedup path

Pass criterion: cache-hit path `p95 < 1ms`.
The benchmark exits with a non-zero code if this criterion fails.

## Limitations

- The in-memory cache does not work across multiple instances
- TTL-based invalidation only (no advanced invalidation yet)
- The cache key must include all relevant inputs

## Related packages

Part of a small AI developer toolkit:

- token-budget-guard — enforce token budgets for LLM calls
- llm-retry-guard — safe retry wrapper for LLM APIs
- ai-request-logger — structured logging for AI requests
- @mostafa.hanafy/prompt-cache — avoid duplicate LLM calls

## License

MIT
package/dist/index.cjs ADDED
@@ -0,0 +1,134 @@
"use strict";
var __defProp = Object.defineProperty;
var __getOwnPropDesc = Object.getOwnPropertyDescriptor;
var __getOwnPropNames = Object.getOwnPropertyNames;
var __hasOwnProp = Object.prototype.hasOwnProperty;
var __export = (target, all) => {
  for (var name in all)
    __defProp(target, name, { get: all[name], enumerable: true });
};
var __copyProps = (to, from, except, desc) => {
  if (from && typeof from === "object" || typeof from === "function") {
    for (let key of __getOwnPropNames(from))
      if (!__hasOwnProp.call(to, key) && key !== except)
        __defProp(to, key, { get: () => from[key], enumerable: !(desc = __getOwnPropDesc(from, key)) || desc.enumerable });
  }
  return to;
};
var __toCommonJS = (mod) => __copyProps(__defProp({}, "__esModule", { value: true }), mod);

// src/index.ts
var index_exports = {};
__export(index_exports, {
  createCacheKey: () => createCacheKey,
  createMemoryCache: () => createMemoryCache,
  memoryCache: () => memoryCache,
  withPromptCache: () => withPromptCache
});
module.exports = __toCommonJS(index_exports);

// src/key.ts
var stableStringify = (value) => {
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  const entries = Object.entries(value).sort(([a], [b]) => a.localeCompare(b)).map(([k, v]) => `${JSON.stringify(k)}:${stableStringify(v)}`);
  return `{${entries.join(",")}}`;
};
var hashString = (input) => {
  let h = 2166136261;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 16777619);
  }
  return (h >>> 0).toString(16);
};
var createCacheKey = (parts) => {
  return hashString(stableStringify(parts));
};

// src/memory.ts
var createMemoryCache = () => {
  const store = /* @__PURE__ */ new Map();
  return {
    async get(key) {
      const item = store.get(key);
      if (!item) return void 0;
      if (item.expiresAt && Date.now() > item.expiresAt) {
        store.delete(key);
        return void 0;
      }
      return item.value;
    },
    async set(key, value, ttlSeconds) {
      const expiresAt = typeof ttlSeconds === "number" ? Date.now() + ttlSeconds * 1e3 : void 0;
      store.set(key, { value, expiresAt });
    },
    async delete(key) {
      store.delete(key);
    }
  };
};
var memoryCache = createMemoryCache();

// src/cache.ts
var inFlight = /* @__PURE__ */ new Map();
var withPromptCache = async (options) => {
  const {
    key,
    keyParts,
    ttlSeconds = 60,
    cache = memoryCache,
    shouldCache,
    onHit,
    onMiss,
    onSet,
    onError,
    call
  } = options;
  const cacheKey = key ?? createCacheKey(keyParts ?? []);
  if (!cacheKey) {
    throw new Error("Provide `key` or `keyParts`");
  }
  const meta = { key: cacheKey, ttlSeconds };
  try {
    const cached = await cache.get(cacheKey);
    if (cached !== void 0) {
      onHit?.(meta);
      return cached;
    }
    onMiss?.(meta);
    const existing = inFlight.get(cacheKey);
    if (existing) {
      return existing;
    }
    const request = (async () => {
      const result = await call();
      const canCache = shouldCache ? shouldCache(result) : true;
      if (canCache) {
        await cache.set(cacheKey, result, ttlSeconds);
        onSet?.(meta);
      }
      return result;
    })();
    inFlight.set(cacheKey, request);
    try {
      return await request;
    } finally {
      inFlight.delete(cacheKey);
    }
  } catch (error) {
    onError?.(error, meta);
    throw error;
  }
};
// Annotate the CommonJS export names for ESM import in node:
0 && (module.exports = {
  createCacheKey,
  createMemoryCache,
  memoryCache,
  withPromptCache
});
@@ -0,0 +1,30 @@
type CacheAdapter = {
  get: (key: string) => Promise<unknown | undefined>;
  set: (key: string, value: unknown, ttlSeconds?: number) => Promise<void>;
  delete?: (key: string) => Promise<void>;
};
type CacheEventMeta = {
  key: string;
  ttlSeconds?: number;
};
type PromptCacheOptions<T> = {
  key?: string;
  keyParts?: unknown[];
  ttlSeconds?: number;
  cache?: CacheAdapter;
  shouldCache?: (value: T) => boolean;
  onHit?: (meta: CacheEventMeta) => void;
  onMiss?: (meta: CacheEventMeta) => void;
  onSet?: (meta: CacheEventMeta) => void;
  onError?: (error: unknown, meta: CacheEventMeta) => void;
  call: () => Promise<T>;
};

declare const withPromptCache: <T>(options: PromptCacheOptions<T>) => Promise<T>;

declare const createCacheKey: (parts: unknown[]) => string;

declare const createMemoryCache: () => CacheAdapter;
declare const memoryCache: CacheAdapter;

export { type CacheAdapter, type PromptCacheOptions, createCacheKey, createMemoryCache, memoryCache, withPromptCache };
@@ -0,0 +1,30 @@
type CacheAdapter = {
  get: (key: string) => Promise<unknown | undefined>;
  set: (key: string, value: unknown, ttlSeconds?: number) => Promise<void>;
  delete?: (key: string) => Promise<void>;
};
type CacheEventMeta = {
  key: string;
  ttlSeconds?: number;
};
type PromptCacheOptions<T> = {
  key?: string;
  keyParts?: unknown[];
  ttlSeconds?: number;
  cache?: CacheAdapter;
  shouldCache?: (value: T) => boolean;
  onHit?: (meta: CacheEventMeta) => void;
  onMiss?: (meta: CacheEventMeta) => void;
  onSet?: (meta: CacheEventMeta) => void;
  onError?: (error: unknown, meta: CacheEventMeta) => void;
  call: () => Promise<T>;
};

declare const withPromptCache: <T>(options: PromptCacheOptions<T>) => Promise<T>;

declare const createCacheKey: (parts: unknown[]) => string;

declare const createMemoryCache: () => CacheAdapter;
declare const memoryCache: CacheAdapter;

export { type CacheAdapter, type PromptCacheOptions, createCacheKey, createMemoryCache, memoryCache, withPromptCache };
package/dist/index.js ADDED
@@ -0,0 +1,104 @@
// src/key.ts
var stableStringify = (value) => {
  if (value === null || typeof value !== "object") {
    return JSON.stringify(value);
  }
  if (Array.isArray(value)) {
    return `[${value.map(stableStringify).join(",")}]`;
  }
  const entries = Object.entries(value).sort(([a], [b]) => a.localeCompare(b)).map(([k, v]) => `${JSON.stringify(k)}:${stableStringify(v)}`);
  return `{${entries.join(",")}}`;
};
var hashString = (input) => {
  let h = 2166136261;
  for (let i = 0; i < input.length; i++) {
    h ^= input.charCodeAt(i);
    h = Math.imul(h, 16777619);
  }
  return (h >>> 0).toString(16);
};
var createCacheKey = (parts) => {
  return hashString(stableStringify(parts));
};

// src/memory.ts
var createMemoryCache = () => {
  const store = /* @__PURE__ */ new Map();
  return {
    async get(key) {
      const item = store.get(key);
      if (!item) return void 0;
      if (item.expiresAt && Date.now() > item.expiresAt) {
        store.delete(key);
        return void 0;
      }
      return item.value;
    },
    async set(key, value, ttlSeconds) {
      const expiresAt = typeof ttlSeconds === "number" ? Date.now() + ttlSeconds * 1e3 : void 0;
      store.set(key, { value, expiresAt });
    },
    async delete(key) {
      store.delete(key);
    }
  };
};
var memoryCache = createMemoryCache();

// src/cache.ts
var inFlight = /* @__PURE__ */ new Map();
var withPromptCache = async (options) => {
  const {
    key,
    keyParts,
    ttlSeconds = 60,
    cache = memoryCache,
    shouldCache,
    onHit,
    onMiss,
    onSet,
    onError,
    call
  } = options;
  const cacheKey = key ?? createCacheKey(keyParts ?? []);
  if (!cacheKey) {
    throw new Error("Provide `key` or `keyParts`");
  }
  const meta = { key: cacheKey, ttlSeconds };
  try {
    const cached = await cache.get(cacheKey);
    if (cached !== void 0) {
      onHit?.(meta);
      return cached;
    }
    onMiss?.(meta);
    const existing = inFlight.get(cacheKey);
    if (existing) {
      return existing;
    }
    const request = (async () => {
      const result = await call();
      const canCache = shouldCache ? shouldCache(result) : true;
      if (canCache) {
        await cache.set(cacheKey, result, ttlSeconds);
        onSet?.(meta);
      }
      return result;
    })();
    inFlight.set(cacheKey, request);
    try {
      return await request;
    } finally {
      inFlight.delete(cacheKey);
    }
  } catch (error) {
    onError?.(error, meta);
    throw error;
  }
};
export {
  createCacheKey,
  createMemoryCache,
  memoryCache,
  withPromptCache
};
package/package.json ADDED
@@ -0,0 +1,62 @@
{
  "name": "@mostafa.hanafy/prompt-cache",
  "version": "0.1.0",
  "description": "Lightweight caching layer for LLM API calls with deterministic keys and in-flight deduplication.",
  "license": "MIT",
  "type": "module",
  "main": "./dist/index.cjs",
  "module": "./dist/index.js",
  "types": "./dist/index.d.ts",
  "exports": {
    ".": {
      "types": "./dist/index.d.ts",
      "import": "./dist/index.js",
      "require": "./dist/index.cjs"
    }
  },
  "files": [
    "dist",
    "README.md"
  ],
  "sideEffects": false,
  "engines": {
    "node": ">=18"
  },
  "scripts": {
    "build": "tsup src/index.ts --format esm,cjs --dts --clean",
    "dev": "tsup src/index.ts --format esm,cjs --dts --watch",
    "test": "bun test",
    "test:watch": "bun test --watch",
    "typecheck": "tsc --noEmit -p tsconfig.json",
    "lint": "biome lint src tests bench example.ts",
    "biome": "biome check src tests bench example.ts README.md package.json tsconfig.json",
    "format": "biome format --write .",
    "bench": "bun bench/overhead.ts",
    "prepublishOnly": "bun run test && bun run typecheck && bun run lint && bun run build"
  },
  "funding": {
    "type": "individual",
    "url": "https://buymeacoffee.com/mostafahanafy"
  },
  "publishConfig": {
    "access": "public"
  },
  "keywords": [
    "ai",
    "llm",
    "cache",
    "prompt-cache",
    "openai",
    "anthropic",
    "ai-performance",
    "ai-cost",
    "typescript",
    "nodejs"
  ],
  "devDependencies": {
    "@biomejs/biome": "^1.9.4",
    "@types/bun": "^1.3.11",
    "tsup": "^8.3.6",
    "typescript": "^5.7.3"
  }
}