phantomllm 0.3.0 → 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -5,7 +5,7 @@
  <h1 align="center">phantomllm</h1>

  <p align="center">
- Dockerized mock server for OpenAI-compatible APIs.<br/>
+ Mock server for OpenAI-compatible APIs.<br/>
  Test your LLM integrations against a real HTTP server instead of patching <code>fetch</code>.
  </p>

@@ -23,8 +23,7 @@ await mock.stop();
  ## Table of Contents

  - [Why phantomllm?](#why-phantomllm)
- - [Prerequisites](#prerequisites)
- - [Setup](#setup)
+ - [Installation](#installation)
  - [Getting the Server URL](#getting-the-server-url)
  - [API Reference](#api-reference)
  - [MockLLM](#mockllm)
@@ -47,72 +46,44 @@ await mock.stop();
  - [Jest](#jest)
  - [Shared Fixture for Multi-File Suites](#shared-fixture-for-multi-file-suites)
  - [Performance](#performance)
- - [Configuration](#configuration)
  - [Troubleshooting](#troubleshooting)
  - [License](#license)

  ## Why phantomllm?

- - **Real HTTP server** — no monkey-patching `fetch` or `http`. Your SDK makes actual network calls through a real TCP connection.
- - **Works with any client** — OpenAI SDK, Vercel AI SDK, LangChain, opencode, Python, curl anything that speaks the OpenAI API protocol.
+ - **Real HTTP server** — no monkey-patching `fetch` or `http`. Your SDK makes actual network calls.
+ - **Zero config** — `npm install phantomllm` and go. No Docker, no external services, no setup steps.
+ - **Works with any client** — OpenAI SDK, Vercel AI SDK, LangChain, opencode, Python, curl.
  - **Streaming support** — SSE chunked responses work exactly like the real OpenAI API.
- - **Fast** — ~1s container cold start, sub-millisecond response latency, 4,000+ req/s throughput.
- - **Simple API** — fluent `given`/`require` pattern: `mock.given.chatCompletion.forModel('gpt-4').willReturn('Hello')`.
+ - **Fast** — in-process Fastify server, sub-millisecond response latency.
+ - **Simple API** — fluent `given`/`expect` pattern: `mock.given.chatCompletion.forModel('gpt-4').willReturn('Hello')`.

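The streaming bullet above refers to OpenAI's public SSE wire format (`data:` lines carrying chunk JSON, terminated by `data: [DONE]`). A minimal sketch of what a client parses, assuming only that documented format — this is an illustration, not phantomllm source:

```typescript
// Illustration only: collecting text from OpenAI-style SSE chat chunks.
// The payload shape (`choices[0].delta.content`, the `[DONE]` sentinel) is the
// public OpenAI streaming format; nothing here is phantomllm-specific.
function collectSseText(body: string): string {
  const parts: string[] = [];
  for (const line of body.split("\n")) {
    if (!line.startsWith("data: ")) continue;      // SSE data lines only
    const payload = line.slice("data: ".length);
    if (payload === "[DONE]") break;               // end-of-stream sentinel
    const delta = JSON.parse(payload).choices?.[0]?.delta?.content;
    if (typeof delta === "string") parts.push(delta);
  }
  return parts.join("");
}

const sample = [
  'data: {"choices":[{"delta":{"content":"Hel"}}]}',
  'data: {"choices":[{"delta":{"content":"lo"}}]}',
  "data: [DONE]",
  "",
].join("\n");
console.log(collectSseText(sample)); // "Hello"
```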
- ## Prerequisites
-
- | Requirement | Version |
- |-------------|---------|
- | Node.js | 18+ |
- | Docker | 20.10+ |
-
- Docker must be running before you call `mock.start()`. Verify with:
+ ## Installation

  ```bash
- docker info
+ npm install phantomllm --save-dev
  ```

- ## Setup
-
- ### 1. Install the package
-
  ```bash
- npm install phantomllm --save-dev
+ pnpm add -D phantomllm
  ```

- ### 2. Build the Docker image
-
- The mock server runs inside Docker. You need to build the image once before running tests:
-
  ```bash
- # Clone the repo (if using from source)
- npm run docker:build
+ yarn add -D phantomllm
  ```

- This builds a ~170MB Alpine-based image tagged `phantomllm-server:latest`. The image contains only the compiled Fastify server — no dev dependencies.
-
- ### 3. Verify it works
-
- ```typescript
- import { MockLLM } from 'phantomllm';
-
- const mock = new MockLLM();
- await mock.start();
- console.log('Mock server running at:', mock.apiBaseUrl);
- // => "Mock server running at: http://localhost:55123/v1"
- await mock.stop();
- ```
+ That's it. No Docker, no image builds, no extra setup.

  ## Getting the Server URL

- `MockLLM` provides two URL getters. Use whichever fits your client:
+ `MockLLM` provides two URL getters:

  ```typescript
  const mock = new MockLLM();
  await mock.start();

- mock.baseUrl    // "http://localhost:55123" — raw host:port
- mock.apiBaseUrl // "http://localhost:55123/v1" — includes /v1 prefix
+ mock.baseUrl    // "http://127.0.0.1:55123" — raw host:port
+ mock.apiBaseUrl // "http://127.0.0.1:55123/v1" — includes /v1 prefix
  ```

  **Which one to use:**
@@ -122,10 +93,7 @@ mock.apiBaseUrl // "http://localhost:55123/v1" — includes /v1 prefix
  | OpenAI SDK (`baseURL`) | `mock.apiBaseUrl` | `new OpenAI({ baseURL: mock.apiBaseUrl })` |
  | Vercel AI SDK (`baseURL`) | `mock.apiBaseUrl` | `createOpenAI({ baseURL: mock.apiBaseUrl })` |
  | LangChain (`configuration.baseURL`) | `mock.apiBaseUrl` | `new ChatOpenAI({ configuration: { baseURL: mock.apiBaseUrl } })` |
- | opencode config | `mock.apiBaseUrl` | `"baseURL": "http://localhost:55123/v1"` |
- | Python openai (`base_url`) | `mock.apiBaseUrl` | `OpenAI(base_url=mock.apiBaseUrl)` |
- | Plain fetch | `mock.baseUrl` | `fetch(\`${mock.baseUrl}/v1/chat/completions\`)` |
- | Admin API | `mock.baseUrl` | `fetch(\`${mock.baseUrl}/_admin/health\`)` |
+ | Plain fetch | `mock.baseUrl` | `` fetch(`${mock.baseUrl}/v1/chat/completions`) `` |

  Most SDK clients expect the URL to end with `/v1`. Use `mock.apiBaseUrl` and you won't need to think about it.

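The two getters differ only by the `/v1` suffix. A quick sketch of the shapes involved — the port here is illustrative, since the real one is assigned when the mock starts:

```typescript
// Illustrative values only — the actual port is picked at mock.start().
const baseUrl = "http://127.0.0.1:55123";   // shape of mock.baseUrl
const apiBaseUrl = `${baseUrl}/v1`;         // shape of mock.apiBaseUrl

// SDK clients append endpoint paths under the /v1 prefix:
console.log(apiBaseUrl);                                    // "http://127.0.0.1:55123/v1"
console.log(new URL("/v1/chat/completions", baseUrl).href); // full endpoint URL
```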
@@ -133,28 +101,23 @@ Most SDK clients expect the URL to end with `/v1`. Use `mock.apiBaseUrl` and you

  ### MockLLM

- The main class. Creates and manages a Docker container running the mock OpenAI server.
+ The main class. Starts an in-process HTTP server that implements the OpenAI API.

  ```typescript
  import { MockLLM } from 'phantomllm';

- const mock = new MockLLM({
-   image: 'phantomllm-server:latest', // Docker image name (default)
-   containerPort: 8080,               // Port inside the container (default)
-   reuse: true,                       // Reuse container across runs (default)
-   startupTimeout: 30_000,            // Max ms to wait for startup (default)
- });
+ const mock = new MockLLM();
  ```

  | Method / Property | Returns | Description |
  |---|---|---|
- | `await mock.start()` | `void` | Start the Docker container. Idempotent — safe to call twice. |
- | `await mock.stop()` | `void` | Stop and remove the container. Idempotent. |
- | `mock.baseUrl` | `string` | Server URL without `/v1`, e.g. `http://localhost:55123`. |
- | `mock.apiBaseUrl` | `string` | Server URL with `/v1`, e.g. `http://localhost:55123/v1`. Pass this to SDK clients. |
+ | `await mock.start()` | `void` | Start the mock server. Idempotent — safe to call twice. |
+ | `await mock.stop()` | `void` | Stop the server. Idempotent. |
+ | `mock.baseUrl` | `string` | Server URL without `/v1`, e.g. `http://127.0.0.1:55123`. |
+ | `mock.apiBaseUrl` | `string` | Server URL with `/v1`, e.g. `http://127.0.0.1:55123/v1`. Pass this to SDK clients. |
  | `mock.given` | `GivenStubs` | Entry point for stubbing responses. |
  | `mock.expect` | `ExpectConditions` | Entry point for configuring server behavior (API key validation, etc.). |
- | `await mock.clear()` | `void` | Remove all stubs and reset server config (including API key). Call between tests. |
+ | `await mock.clear()` | `void` | Remove all stubs and reset server config. Call between tests. |

  `MockLLM` implements `Symbol.asyncDispose` for automatic cleanup:

@@ -266,7 +229,7 @@ Error responses follow the OpenAI error format:

  ### API Key Validation

- Test that your code sends the correct API key by requiring authentication on the mock server.
+ Test that your code sends the correct API key.

  ```typescript
  mock.expect.apiKey('sk-test-key-123');
@@ -281,13 +244,13 @@ const openai = new OpenAI({
  });
  ```

- `mock.expect` configures server constraints at runtime — no container restart needed. API key validation applies to all `/v1/*` endpoints (chat completions, embeddings, models). Admin endpoints (`/_admin/*`) are always accessible.
+ `mock.expect` configures server constraints at runtime. API key validation applies to all `/v1/*` endpoints. Admin endpoints (`/_admin/*`) are always accessible.

  Calling `mock.clear()` resets the API key requirement along with all stubs.

  | Method | Description |
  |---|---|
- | `mock.expect.apiKey(key)` | Require `Authorization: Bearer <key>` on all `/v1/*` requests. Single call, no chaining needed. |
+ | `mock.expect.apiKey(key)` | Require `Authorization: Bearer <key>` on all `/v1/*` requests. |

  **Testing auth error handling:**

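The documented behavior is a standard `Authorization: Bearer` comparison. A hypothetical sketch of that check — invented names, not phantomllm's actual source:

```typescript
// Hypothetical sketch, not phantomllm's real implementation. Mirrors the
// documented behavior: when a key is configured, /v1/* requests must carry
// `Authorization: Bearer <key>`; with no key configured, everything passes.
function isAuthorized(
  authHeader: string | undefined,
  requiredKey: string | null,
): boolean {
  if (requiredKey === null) return true; // no expect.apiKey() call: open server
  return authHeader === `Bearer ${requiredKey}`;
}

console.log(isAuthorized("Bearer sk-test-key-123", "sk-test-key-123")); // true
console.log(isAuthorized("Bearer wrong-key", "sk-test-key-123"));       // false
console.log(isAuthorized(undefined, null));                             // true
```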
@@ -296,7 +259,6 @@ it('handles invalid API key', async () => {
    mock.expect.apiKey('correct-key');
    mock.given.chatCompletion.willReturn('Hello');

-   // Client configured with wrong key
    const badClient = new OpenAI({
      baseURL: mock.apiBaseUrl,
      apiKey: 'wrong-key',
@@ -349,7 +311,7 @@ await mock.start();

  const openai = new OpenAI({
    baseURL: mock.apiBaseUrl,
-   apiKey: 'test-key', // any string — the mock doesn't validate keys
+   apiKey: 'test-key',
  });

  // Non-streaming
@@ -428,7 +390,7 @@ for await (const chunk of result.textStream) {
  await mock.stop();
  ```

- > **Note:** Use `provider.chat('model')` instead of `provider('model')` to ensure requests go through `/v1/chat/completions`. The default `provider('model')` in `@ai-sdk/openai` v3+ uses the Responses API.
+ > **Note:** Use `provider.chat('model')` instead of `provider('model')` to ensure requests go through `/v1/chat/completions`.

  ### opencode

@@ -439,7 +401,7 @@ Add a provider entry to your `opencode.json` pointing at the mock:
  "provider": {
    "mock": {
      "api": "openai",
-     "baseURL": "http://localhost:PORT/v1",
+     "baseURL": "http://127.0.0.1:PORT/v1",
      "apiKey": "test-key",
      "models": {
        "gpt-4o": { "id": "gpt-4o" }
@@ -449,13 +411,12 @@ Add a provider entry to your `opencode.json` pointing at the mock:
  }
  ```

- Start the mock and print the URL to use:
+ Start the mock and print the URL:

  ```typescript
  const mock = new MockLLM();
  await mock.start();
  console.log(`Set baseURL to: ${mock.apiBaseUrl}`);
- // Update the port in opencode.json to match
  ```

  ### LangChain
@@ -485,13 +446,13 @@ await mock.stop();
  ### Python openai package

- The mock server is a real HTTP server — any language can use it. Start the mock from Node.js, then connect from Python:
+ The mock server is a real HTTP server — any language can connect to it. Start the mock from Node.js, then use it from Python:

  ```python
  import openai

  client = openai.OpenAI(
-     base_url="http://localhost:55123/v1",  # use mock.apiBaseUrl
+     base_url="http://127.0.0.1:55123/v1",  # use mock.apiBaseUrl
      api_key="test-key",
  )

@@ -521,7 +482,7 @@ console.log(data.choices[0].message.content);
  ### curl

  ```bash
- curl http://localhost:55123/v1/chat/completions \
+ curl http://127.0.0.1:55123/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
      "model": "gpt-4o",
@@ -545,14 +506,14 @@ describe('my LLM feature', () => {
  beforeAll(async () => {
    await mock.start();
    openai = new OpenAI({ baseURL: mock.apiBaseUrl, apiKey: 'test' });
- }, 30_000);
+ });

  afterAll(async () => {
    await mock.stop();
  });

  beforeEach(async () => {
-   await mock.clear(); // reset stubs between tests
+   await mock.clear();
  });

  it('should summarize text', async () => {
@@ -611,7 +572,7 @@ describe('my LLM feature', () => {
  beforeAll(async () => {
    await mock.start();
    openai = new OpenAI({ baseURL: mock.apiBaseUrl, apiKey: 'test' });
- }, 30_000); // container startup timeout
+ });

  afterAll(() => mock.stop());
  beforeEach(() => mock.clear());
@@ -631,7 +592,7 @@ describe('my LLM feature', () => {
  ### Shared Fixture for Multi-File Suites

- Start one container for your entire test suite. Each test file imports the shared mock and clears stubs between tests.
+ Start one server for your entire test suite:

  **`tests/support/mock.ts`**

@@ -642,7 +603,6 @@ export const mock = new MockLLM();
  export async function setup() {
    await mock.start();
-   // Make the URL available to other processes if needed
    process.env.PHANTOMLLM_URL = mock.apiBaseUrl;
  }

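One way to wire a `setup`/`teardown` module like this into a runner is Vitest's `globalSetup` option, which loads a module exporting those two functions. A sketch, assuming Vitest and that the fixture lives at the path shown above:

```typescript
// vitest.config.ts — sketch; adjust the path to wherever the fixture lives.
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    // Runs the module's exported setup() once before the suite
    // and teardown() once after all test files finish.
    globalSetup: ["./tests/support/mock.ts"],
  },
});
```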
@@ -678,95 +638,27 @@ it('works', async () => {

  ## Performance

- Benchmarks on Apple Silicon (Docker via OrbStack):
-
- ### Container Lifecycle
-
- | Metric | Time |
- |--------|------|
- | Cold start (`mock.start()`) | ~1.1s |
- | Stop (`mock.stop()`) | ~130ms |
- | Full lifecycle (start, stub, request, stop) | ~1.2s |
-
- ### Request Latency (through Docker network)
-
- | Metric | Median | p95 |
- |--------|--------|-----|
- | Chat completion (non-streaming) | 0.6ms | 1.8ms |
- | Streaming TTFB | 0.7ms | 0.9ms |
- | Streaming total (8 chunks) | 0.7ms | 1.0ms |
- | Embedding (1536-dim) | 0.7ms | 1.6ms |
- | Embedding batch (10x1536) | 1.9ms | 2.7ms |
- | Stub registration | 0.5ms | 0.8ms |
- | Clear stubs | 0.2ms | 0.4ms |
-
- ### Throughput
-
- | Metric | Requests/sec |
- |--------|-------------|
- | Sequential chat completions | ~4,300 |
- | Concurrent (10 workers) | ~6,400 |
- | Health endpoint | ~5,900 |
-
- ### Tips
-
- - **Don't restart between tests.** Call `mock.clear()` (sub-millisecond) instead of `stop()`/`start()` (~1.2s).
- - **Use global setup.** Start one container for your entire suite. See [Shared Fixture](#shared-fixture-for-multi-file-suites).
- - **Cache the Docker image in CI.** Build it once and cache the layer:
-
- ```yaml
- # GitHub Actions
- - name: Build mock server image
-   run: npm run docker:build
-
- - name: Cache Docker layers
-   uses: actions/cache@v4
-   with:
-     path: /tmp/.buildx-cache
-     key: ${{ runner.os }}-docker-phantomllm-${{ hashFiles('Dockerfile') }}
- ```
-
- ## Configuration
-
- ### Constructor Options
-
- | Option | Type | Default | Description |
- |--------|------|---------|-------------|
- | `image` | `string` | `'phantomllm-server:latest'` | Docker image to run. |
- | `containerPort` | `number` | `8080` | Port the server listens on inside the container. |
- | `reuse` | `boolean` | `true` | Reuse a running container across test runs. |
- | `startupTimeout` | `number` | `30000` | Max milliseconds to wait for the container to become healthy. |
-
- ### Environment Variables
-
- | Variable | Description |
- |----------|-------------|
- | `PHANTOMLLM_IMAGE` | Override the Docker image name without changing code. Takes precedence over the default but not over the constructor `image` option. |
-
- ### OpenAI-Compatible Endpoints
+ The mock server runs in-process using Fastify — no Docker overhead:

- The mock server implements:
+ | Metric | Value |
+ |--------|-------|
+ | Server startup | < 5ms |
+ | Chat completion response | ~0.2ms median |
+ | Streaming (8 chunks) | ~0.2ms total |
+ | Embedding (1536-dim) | ~0.3ms median |
+ | Throughput | ~11,000 req/s |

- | Endpoint | Method | Description |
- |----------|--------|-------------|
- | `/v1/chat/completions` | POST | Chat completions (streaming and non-streaming) |
- | `/v1/embeddings` | POST | Text embeddings |
- | `/v1/models` | GET | List available models |
- | `/_admin/stubs` | POST | Register a stub |
- | `/_admin/stubs` | DELETE | Clear all stubs |
- | `/_admin/health` | GET | Health check |
+ **Tips:**
+ - Start the server once in `beforeAll`, call `mock.clear()` between tests.
+ - Use a [shared fixture](#shared-fixture-for-multi-file-suites) for multi-file test suites.

  ## Troubleshooting

  | Problem | Cause | Solution |
  |---|---|---|
  | `ContainerNotStartedError` | Using `baseUrl`, `given`, or `clear()` before `start()`. | Call `await mock.start()` first. |
- | Container startup timeout | Docker not running or image not built. | Run `docker info` to verify Docker. Run `npm run docker:build` to build the image. |
- | Connection refused | Wrong URL or container not ready. | Use `mock.apiBaseUrl` for SDK clients. Ensure `start()` has resolved. |
  | Stubs leaking between tests | Stubs persist until cleared. | Call `await mock.clear()` in `beforeEach`. |
  | 418 response | No stub matches the request. | Register a stub matching the model/content, or add a catch-all: `mock.given.chatCompletion.willReturn('...')`. |
- | `PHANTOMLLM_IMAGE` env var | Need a custom image. | Set `PHANTOMLLM_IMAGE=my-registry/image:tag` in your environment. |
- | Slow CI | Image rebuilt every run. | Cache Docker layers and enable container reuse. |
  | AI SDK uses wrong endpoint | `provider('model')` defaults to Responses API in v3+. | Use `provider.chat('model')` to target `/v1/chat/completions`. |
 
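The 418 row describes the stub-dispatch behavior: a specific stub wins, a catch-all fills in, and an unmatched request gets 418. A hypothetical sketch of that lookup — the types and names here are invented, as phantomllm's real matcher is internal:

```typescript
// Invented types/names — a sketch of "specific stub wins, else catch-all, else 418".
interface Stub {
  model?: string; // set by forModel(); undefined means catch-all
  reply: string;  // set by willReturn()
}

function dispatch(stubs: Stub[], model: string): { status: number; body: string } {
  const exact = stubs.find((s) => s.model === model);
  if (exact) return { status: 200, body: exact.reply };
  const catchAll = stubs.find((s) => s.model === undefined);
  if (catchAll) return { status: 200, body: catchAll.reply };
  return { status: 418, body: "no stub matched" }; // the 418 from the table above
}

const stubs: Stub[] = [{ model: "gpt-4", reply: "Hi" }, { reply: "fallback" }];
console.log(dispatch(stubs, "gpt-4").body);  // "Hi"
console.log(dispatch(stubs, "gpt-4o").body); // "fallback"
console.log(dispatch([], "gpt-4o").status);  // 418
```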
  ## License