phantomllm 0.2.3 → 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -5,7 +5,7 @@
  <h1 align="center">phantomllm</h1>
 
  <p align="center">
- Dockerized mock server for OpenAI-compatible APIs.<br/>
+ Mock server for OpenAI-compatible APIs.<br/>
  Test your LLM integrations against a real HTTP server instead of patching <code>fetch</code>.
  </p>
 
@@ -23,8 +23,7 @@ await mock.stop();
  ## Table of Contents
 
  - [Why phantomllm?](#why-phantomllm)
- - [Prerequisites](#prerequisites)
- - [Setup](#setup)
+ - [Installation](#installation)
  - [Getting the Server URL](#getting-the-server-url)
  - [API Reference](#api-reference)
    - [MockLLM](#mockllm)
@@ -32,6 +31,7 @@ await mock.stop();
    - [Streaming](#streaming)
    - [Embeddings](#embeddings)
    - [Error Simulation](#error-simulation)
+   - [API Key Validation](#api-key-validation)
    - [Stub Matching](#stub-matching)
  - [Integration Examples](#integration-examples)
    - [OpenAI Node.js SDK](#openai-nodejs-sdk)
@@ -46,72 +46,44 @@ await mock.stop();
    - [Jest](#jest)
    - [Shared Fixture for Multi-File Suites](#shared-fixture-for-multi-file-suites)
  - [Performance](#performance)
- - [Configuration](#configuration)
  - [Troubleshooting](#troubleshooting)
  - [License](#license)
 
  ## Why phantomllm?
 
- - **Real HTTP server** — no monkey-patching `fetch` or `http`. Your SDK makes actual network calls through a real TCP connection.
- - **Works with any client** — OpenAI SDK, Vercel AI SDK, LangChain, opencode, Python, curl anything that speaks the OpenAI API protocol.
+ - **Real HTTP server** — no monkey-patching `fetch` or `http`. Your SDK makes actual network calls.
+ - **Zero config** — `npm install phantomllm` and go. No Docker, no external services, no setup steps.
+ - **Works with any client** — OpenAI SDK, Vercel AI SDK, LangChain, opencode, Python, curl.
  - **Streaming support** — SSE chunked responses work exactly like the real OpenAI API.
- - **Fast** — ~1s container cold start, sub-millisecond response latency, 4,000+ req/s throughput.
- - **Simple API** — fluent `given/when` pattern: `mock.given.chatCompletion.forModel('gpt-4').willReturn('Hello')`.
+ - **Fast** — in-process Fastify server, sub-millisecond response latency.
+ - **Simple API** — fluent `given`/`expect` pattern: `mock.given.chatCompletion.forModel('gpt-4').willReturn('Hello')`.
 
- ## Prerequisites
-
- | Requirement | Version |
- |-------------|---------|
- | Node.js | 18+ |
- | Docker | 20.10+ |
-
- Docker must be running before you call `mock.start()`. Verify with:
+ ## Installation
 
  ```bash
- docker info
+ npm install phantomllm --save-dev
  ```
 
- ## Setup
-
- ### 1. Install the package
-
  ```bash
- npm install phantomllm --save-dev
+ pnpm add -D phantomllm
  ```
 
- ### 2. Build the Docker image
-
- The mock server runs inside Docker. You need to build the image once before running tests:
-
  ```bash
- # Clone the repo (if using from source)
- npm run docker:build
+ yarn add -D phantomllm
  ```
 
- This builds a ~170MB Alpine-based image tagged `phantomllm-server:latest`. The image contains only the compiled Fastify server — no dev dependencies.
-
- ### 3. Verify it works
-
- ```typescript
- import { MockLLM } from 'phantomllm';
-
- const mock = new MockLLM();
- await mock.start();
- console.log('Mock server running at:', mock.apiBaseUrl);
- // => "Mock server running at: http://localhost:55123/v1"
- await mock.stop();
- ```
+ That's it. No Docker, no image builds, no extra setup.
 
  ## Getting the Server URL
 
- `MockLLM` provides two URL getters. Use whichever fits your client:
+ `MockLLM` provides two URL getters:
 
  ```typescript
  const mock = new MockLLM();
  await mock.start();
 
- mock.baseUrl    // "http://localhost:55123" — raw host:port
- mock.apiBaseUrl // "http://localhost:55123/v1" — includes /v1 prefix
+ mock.baseUrl    // "http://127.0.0.1:55123" — raw host:port
+ mock.apiBaseUrl // "http://127.0.0.1:55123/v1" — includes /v1 prefix
  ```
 
  **Which one to use:**
@@ -121,10 +93,7 @@ mock.apiBaseUrl // "http://localhost:55123/v1" — includes /v1 prefix
  | OpenAI SDK (`baseURL`) | `mock.apiBaseUrl` | `new OpenAI({ baseURL: mock.apiBaseUrl })` |
  | Vercel AI SDK (`baseURL`) | `mock.apiBaseUrl` | `createOpenAI({ baseURL: mock.apiBaseUrl })` |
  | LangChain (`configuration.baseURL`) | `mock.apiBaseUrl` | `new ChatOpenAI({ configuration: { baseURL: mock.apiBaseUrl } })` |
- | opencode config | `mock.apiBaseUrl` | `"baseURL": "http://localhost:55123/v1"` |
- | Python openai (`base_url`) | `mock.apiBaseUrl` | `OpenAI(base_url=mock.apiBaseUrl)` |
- | Plain fetch | `mock.baseUrl` | `fetch(\`${mock.baseUrl}/v1/chat/completions\`)` |
- | Admin API | `mock.baseUrl` | `fetch(\`${mock.baseUrl}/_admin/health\`)` |
+ | Plain fetch | `mock.baseUrl` | `` fetch(`${mock.baseUrl}/v1/chat/completions`) `` |
 
  Most SDK clients expect the URL to end with `/v1`. Use `mock.apiBaseUrl` and you won't need to think about it.
 
@@ -132,27 +101,23 @@ Most SDK clients expect the URL to end with `/v1`. Use `mock.apiBaseUrl` and you
  ### MockLLM
 
- The main class. Creates and manages a Docker container running the mock OpenAI server.
+ The main class. Starts an in-process HTTP server that implements the OpenAI API.
 
  ```typescript
  import { MockLLM } from 'phantomllm';
 
- const mock = new MockLLM({
-   image: 'phantomllm-server:latest', // Docker image name (default)
-   containerPort: 8080,               // Port inside the container (default)
-   reuse: true,                       // Reuse container across runs (default)
-   startupTimeout: 30_000,            // Max ms to wait for startup (default)
- });
+ const mock = new MockLLM();
  ```
 
  | Method / Property | Returns | Description |
  |---|---|---|
- | `await mock.start()` | `void` | Start the Docker container. Idempotent — safe to call twice. |
- | `await mock.stop()` | `void` | Stop and remove the container. Idempotent. |
- | `mock.baseUrl` | `string` | Server URL without `/v1`, e.g. `http://localhost:55123`. |
- | `mock.apiBaseUrl` | `string` | Server URL with `/v1`, e.g. `http://localhost:55123/v1`. Pass this to SDK clients. |
- | `mock.given` | `GivenStubs` | Entry point for the fluent stubbing API. |
- | `await mock.clear()` | `void` | Remove all registered stubs. Call between tests. |
+ | `await mock.start()` | `void` | Start the mock server. Idempotent — safe to call twice. |
+ | `await mock.stop()` | `void` | Stop the server. Idempotent. |
+ | `mock.baseUrl` | `string` | Server URL without `/v1`, e.g. `http://127.0.0.1:55123`. |
+ | `mock.apiBaseUrl` | `string` | Server URL with `/v1`, e.g. `http://127.0.0.1:55123/v1`. Pass this to SDK clients. |
+ | `mock.given` | `GivenStubs` | Entry point for stubbing responses. |
+ | `mock.expect` | `ExpectConditions` | Entry point for configuring server behavior (API key validation, etc.). |
+ | `await mock.clear()` | `void` | Remove all stubs and reset server config. Call between tests. |
 
  `MockLLM` implements `Symbol.asyncDispose` for automatic cleanup:
 
@@ -262,6 +227,52 @@ Error responses follow the OpenAI error format:
  }
  ```
 
+ ### API Key Validation
+
+ Test that your code sends the correct API key.
+
+ ```typescript
+ mock.expect.apiKey('sk-test-key-123');
+
+ // Requests without a key or with the wrong key get 401
+ // { error: { message: "...", type: "authentication_error", code: "invalid_api_key" } }
+
+ // Only requests with the correct key succeed
+ const openai = new OpenAI({
+   baseURL: mock.apiBaseUrl,
+   apiKey: 'sk-test-key-123', // must match exactly
+ });
+ ```
+
+ `mock.expect` configures server constraints at runtime. API key validation applies to all `/v1/*` endpoints. Admin endpoints (`/_admin/*`) are always accessible.
+
+ Calling `mock.clear()` resets the API key requirement along with all stubs.
+
+ | Method | Description |
+ |---|---|
+ | `mock.expect.apiKey(key)` | Require `Authorization: Bearer <key>` on all `/v1/*` requests. |
+
+ **Testing auth error handling:**
+
+ ```typescript
+ it('handles invalid API key', async () => {
+   mock.expect.apiKey('correct-key');
+   mock.given.chatCompletion.willReturn('Hello');
+
+   const badClient = new OpenAI({
+     baseURL: mock.apiBaseUrl,
+     apiKey: 'wrong-key',
+   });
+
+   await expect(
+     badClient.chat.completions.create({
+       model: 'gpt-4o',
+       messages: [{ role: 'user', content: 'Hi' }],
+     }),
+   ).rejects.toThrow();
+ });
+ ```
+
  ### Stub Matching
 
  When multiple stubs are registered, the most specific match wins:
@@ -300,7 +311,7 @@ await mock.start();
 
  const openai = new OpenAI({
    baseURL: mock.apiBaseUrl,
-   apiKey: 'test-key', // any string — the mock doesn't validate keys
+   apiKey: 'test-key',
  });
 
  // Non-streaming
@@ -379,7 +390,7 @@ for await (const chunk of result.textStream) {
  await mock.stop();
  ```
 
- > **Note:** Use `provider.chat('model')` instead of `provider('model')` to ensure requests go through `/v1/chat/completions`. The default `provider('model')` in `@ai-sdk/openai` v3+ uses the Responses API.
+ > **Note:** Use `provider.chat('model')` instead of `provider('model')` to ensure requests go through `/v1/chat/completions`.
 
  ### opencode
 
@@ -390,7 +401,7 @@ Add a provider entry to your `opencode.json` pointing at the mock:
    "provider": {
      "mock": {
        "api": "openai",
-       "baseURL": "http://localhost:PORT/v1",
+       "baseURL": "http://127.0.0.1:PORT/v1",
        "apiKey": "test-key",
        "models": {
          "gpt-4o": { "id": "gpt-4o" }
@@ -400,13 +411,12 @@ Add a provider entry to your `opencode.json` pointing at the mock:
    }
  }
  ```
 
- Start the mock and print the URL to use:
+ Start the mock and print the URL:
 
  ```typescript
  const mock = new MockLLM();
  await mock.start();
  console.log(`Set baseURL to: ${mock.apiBaseUrl}`);
- // Update the port in opencode.json to match
  ```
 
  ### LangChain
@@ -436,13 +446,13 @@ await mock.stop();
 
  ### Python openai package
 
- The mock server is a real HTTP server — any language can use it. Start the mock from Node.js, then connect from Python:
+ The mock server is a real HTTP server — any language can connect to it. Start the mock from Node.js, then use it from Python:
 
  ```python
  import openai
 
  client = openai.OpenAI(
-     base_url="http://localhost:55123/v1",  # use mock.apiBaseUrl
+     base_url="http://127.0.0.1:55123/v1",  # use mock.apiBaseUrl
      api_key="test-key",
  )
 
@@ -472,7 +482,7 @@ console.log(data.choices[0].message.content);
  ### curl
 
  ```bash
- curl http://localhost:55123/v1/chat/completions \
+ curl http://127.0.0.1:55123/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
      "model": "gpt-4o",
@@ -496,14 +506,14 @@ describe('my LLM feature', () => {
    beforeAll(async () => {
      await mock.start();
      openai = new OpenAI({ baseURL: mock.apiBaseUrl, apiKey: 'test' });
-   }, 30_000);
+   });
 
    afterAll(async () => {
      await mock.stop();
    });
 
    beforeEach(async () => {
-     await mock.clear(); // reset stubs between tests
+     await mock.clear();
    });
 
    it('should summarize text', async () => {
@@ -562,7 +572,7 @@ describe('my LLM feature', () => {
    beforeAll(async () => {
      await mock.start();
      openai = new OpenAI({ baseURL: mock.apiBaseUrl, apiKey: 'test' });
-   }, 30_000); // container startup timeout
+   });
 
    afterAll(() => mock.stop());
    beforeEach(() => mock.clear());
@@ -582,7 +592,7 @@ describe('my LLM feature', () => {
 
  ### Shared Fixture for Multi-File Suites
 
- Start one container for your entire test suite. Each test file imports the shared mock and clears stubs between tests.
+ Start one server for your entire test suite:
 
  **`tests/support/mock.ts`**
 
@@ -593,7 +603,6 @@ export const mock = new MockLLM();
 
  export async function setup() {
    await mock.start();
-   // Make the URL available to other processes if needed
    process.env.PHANTOMLLM_URL = mock.apiBaseUrl;
  }
 
@@ -629,95 +638,27 @@ it('works', async () => {
 
  ## Performance
 
- Benchmarks on Apple Silicon (Docker via OrbStack):
-
- ### Container Lifecycle
-
- | Metric | Time |
- |--------|------|
- | Cold start (`mock.start()`) | ~1.1s |
- | Stop (`mock.stop()`) | ~130ms |
- | Full lifecycle (start, stub, request, stop) | ~1.2s |
-
- ### Request Latency (through Docker network)
-
- | Metric | Median | p95 |
- |--------|--------|-----|
- | Chat completion (non-streaming) | 0.6ms | 1.8ms |
- | Streaming TTFB | 0.7ms | 0.9ms |
- | Streaming total (8 chunks) | 0.7ms | 1.0ms |
- | Embedding (1536-dim) | 0.7ms | 1.6ms |
- | Embedding batch (10x1536) | 1.9ms | 2.7ms |
- | Stub registration | 0.5ms | 0.8ms |
- | Clear stubs | 0.2ms | 0.4ms |
-
- ### Throughput
-
- | Metric | Requests/sec |
- |--------|-------------|
- | Sequential chat completions | ~4,300 |
- | Concurrent (10 workers) | ~6,400 |
- | Health endpoint | ~5,900 |
-
- ### Tips
-
- - **Don't restart between tests.** Call `mock.clear()` (sub-millisecond) instead of `stop()`/`start()` (~1.2s).
- - **Use global setup.** Start one container for your entire suite. See [Shared Fixture](#shared-fixture-for-multi-file-suites).
- - **Cache the Docker image in CI.** Build it once and cache the layer:
-
- ```yaml
- # GitHub Actions
- - name: Build mock server image
-   run: npm run docker:build
-
- - name: Cache Docker layers
-   uses: actions/cache@v4
-   with:
-     path: /tmp/.buildx-cache
-     key: ${{ runner.os }}-docker-phantomllm-${{ hashFiles('Dockerfile') }}
- ```
-
- ## Configuration
-
- ### Constructor Options
-
- | Option | Type | Default | Description |
- |--------|------|---------|-------------|
- | `image` | `string` | `'phantomllm-server:latest'` | Docker image to run. |
- | `containerPort` | `number` | `8080` | Port the server listens on inside the container. |
- | `reuse` | `boolean` | `true` | Reuse a running container across test runs. |
- | `startupTimeout` | `number` | `30000` | Max milliseconds to wait for the container to become healthy. |
-
- ### Environment Variables
-
- | Variable | Description |
- |----------|-------------|
- | `PHANTOMLLM_IMAGE` | Override the Docker image name without changing code. Takes precedence over the default but not over the constructor `image` option. |
-
- ### OpenAI-Compatible Endpoints
+ The mock server runs in-process using Fastify — no Docker overhead:
 
- The mock server implements:
+ | Metric | Value |
+ |--------|-------|
+ | Server startup | < 5ms |
+ | Chat completion response | ~0.2ms median |
+ | Streaming (8 chunks) | ~0.2ms total |
+ | Embedding (1536-dim) | ~0.3ms median |
+ | Throughput | ~11,000 req/s |
 
- | Endpoint | Method | Description |
- |----------|--------|-------------|
- | `/v1/chat/completions` | POST | Chat completions (streaming and non-streaming) |
- | `/v1/embeddings` | POST | Text embeddings |
- | `/v1/models` | GET | List available models |
- | `/_admin/stubs` | POST | Register a stub |
- | `/_admin/stubs` | DELETE | Clear all stubs |
- | `/_admin/health` | GET | Health check |
+ **Tips:**
+ - Start the server once in `beforeAll`, call `mock.clear()` between tests.
+ - Use a [shared fixture](#shared-fixture-for-multi-file-suites) for multi-file test suites.
 
  ## Troubleshooting
 
  | Problem | Cause | Solution |
  |---|---|---|
  | `ContainerNotStartedError` | Using `baseUrl`, `given`, or `clear()` before `start()`. | Call `await mock.start()` first. |
- | Container startup timeout | Docker not running or image not built. | Run `docker info` to verify Docker. Run `npm run docker:build` to build the image. |
- | Connection refused | Wrong URL or container not ready. | Use `mock.apiBaseUrl` for SDK clients. Ensure `start()` has resolved. |
  | Stubs leaking between tests | Stubs persist until cleared. | Call `await mock.clear()` in `beforeEach`. |
  | 418 response | No stub matches the request. | Register a stub matching the model/content, or add a catch-all: `mock.given.chatCompletion.willReturn('...')`. |
- | `PHANTOMLLM_IMAGE` env var | Need a custom image. | Set `PHANTOMLLM_IMAGE=my-registry/image:tag` in your environment. |
- | Slow CI | Image rebuilt every run. | Cache Docker layers and enable container reuse. |
  | AI SDK uses wrong endpoint | `provider('model')` defaults to Responses API in v3+. | Use `provider.chat('model')` to target `/v1/chat/completions`. |
 
  ## License