phantomllm 0.3.0 → 1.0.0
This diff shows the contents of publicly released package versions as they appear in their respective public registries, and is provided for informational purposes only.
- package/README.md +48 -156
- package/dist/index.cjs +485 -42
- package/dist/index.cjs.map +1 -1
- package/dist/index.d.cts +2 -10
- package/dist/index.d.ts +2 -10
- package/dist/index.js +481 -43
- package/dist/index.js.map +1 -1
- package/package.json +3 -8
package/README.md
CHANGED
@@ -5,7 +5,7 @@
 <h1 align="center">phantomllm</h1>

 <p align="center">
-
+Mock server for OpenAI-compatible APIs.<br/>
 Test your LLM integrations against a real HTTP server instead of patching <code>fetch</code>.
 </p>

@@ -23,8 +23,7 @@ await mock.stop();
 ## Table of Contents

 - [Why phantomllm?](#why-phantomllm)
-- [
-- [Setup](#setup)
+- [Installation](#installation)
 - [Getting the Server URL](#getting-the-server-url)
 - [API Reference](#api-reference)
 - [MockLLM](#mockllm)
@@ -47,72 +46,44 @@ await mock.stop();
 - [Jest](#jest)
 - [Shared Fixture for Multi-File Suites](#shared-fixture-for-multi-file-suites)
 - [Performance](#performance)
-- [Configuration](#configuration)
 - [Troubleshooting](#troubleshooting)
 - [License](#license)

 ## Why phantomllm?

-- **Real HTTP server** — no monkey-patching `fetch` or `http`. Your SDK makes actual network calls
-- **
+- **Real HTTP server** — no monkey-patching `fetch` or `http`. Your SDK makes actual network calls.
+- **Zero config** — `npm install phantomllm` and go. No Docker, no external services, no setup steps.
+- **Works with any client** — OpenAI SDK, Vercel AI SDK, LangChain, opencode, Python, curl.
 - **Streaming support** — SSE chunked responses work exactly like the real OpenAI API.
-- **Fast** —
-- **Simple API** — fluent `given`/`
+- **Fast** — in-process Fastify server, sub-millisecond response latency.
+- **Simple API** — fluent `given`/`expect` pattern: `mock.given.chatCompletion.forModel('gpt-4').willReturn('Hello')`.

-##
-
-| Requirement | Version |
-|-------------|---------|
-| Node.js | 18+ |
-| Docker | 20.10+ |
-
-Docker must be running before you call `mock.start()`. Verify with:
+## Installation

 ```bash
-
+npm install phantomllm --save-dev
 ```

-## Setup
-
-### 1. Install the package
-
 ```bash
-
+pnpm add -D phantomllm
 ```

-### 2. Build the Docker image
-
-The mock server runs inside Docker. You need to build the image once before running tests:
-
 ```bash
-
-npm run docker:build
+yarn add -D phantomllm
 ```

-
-
-### 3. Verify it works
-
-```typescript
-import { MockLLM } from 'phantomllm';
-
-const mock = new MockLLM();
-await mock.start();
-console.log('Mock server running at:', mock.apiBaseUrl);
-// => "Mock server running at: http://localhost:55123/v1"
-await mock.stop();
-```
+That's it. No Docker, no image builds, no extra setup.

 ## Getting the Server URL

-`MockLLM` provides two URL getters
+`MockLLM` provides two URL getters:

 ```typescript
 const mock = new MockLLM();
 await mock.start();

-mock.baseUrl // "http://
-mock.apiBaseUrl // "http://
+mock.baseUrl // "http://127.0.0.1:55123" — raw host:port
+mock.apiBaseUrl // "http://127.0.0.1:55123/v1" — includes /v1 prefix
 ```

 **Which one to use:**
@@ -122,10 +93,7 @@ mock.apiBaseUrl // "http://localhost:55123/v1" — includes /v1 prefix
 | OpenAI SDK (`baseURL`) | `mock.apiBaseUrl` | `new OpenAI({ baseURL: mock.apiBaseUrl })` |
 | Vercel AI SDK (`baseURL`) | `mock.apiBaseUrl` | `createOpenAI({ baseURL: mock.apiBaseUrl })` |
 | LangChain (`configuration.baseURL`) | `mock.apiBaseUrl` | `new ChatOpenAI({ configuration: { baseURL: mock.apiBaseUrl } })` |
-
-| Python openai (`base_url`) | `mock.apiBaseUrl` | `OpenAI(base_url=mock.apiBaseUrl)` |
-| Plain fetch | `mock.baseUrl` | `fetch(\`${mock.baseUrl}/v1/chat/completions\`)` |
-| Admin API | `mock.baseUrl` | `fetch(\`${mock.baseUrl}/_admin/health\`)` |
+| Plain fetch | `mock.baseUrl` | `` fetch(`${mock.baseUrl}/v1/chat/completions`) `` |

 Most SDK clients expect the URL to end with `/v1`. Use `mock.apiBaseUrl` and you won't need to think about it.

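The two URL getters documented above differ only by the `/v1` suffix. A minimal TypeScript sketch of that relationship (the `apiBaseFrom` helper is hypothetical, not part of phantomllm's API):

```typescript
// Hypothetical helper mirroring the documented relationship between
// mock.baseUrl and mock.apiBaseUrl: the API base is the raw base plus "/v1".
function apiBaseFrom(baseUrl: string): string {
  // Strip any trailing slashes so we never produce "...//v1".
  return `${baseUrl.replace(/\/+$/, "")}/v1`;
}

const base = "http://127.0.0.1:55123";
console.log(apiBaseFrom(base)); // → "http://127.0.0.1:55123/v1"
```

Handy in tests that need to build `/v1/...` request URLs from a raw host:port by hand.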
@@ -133,28 +101,23 @@ Most SDK clients expect the URL to end with `/v1`. Use `mock.apiBaseUrl` and you

 ### MockLLM

-The main class.
+The main class. Starts an in-process HTTP server that implements the OpenAI API.

 ```typescript
 import { MockLLM } from 'phantomllm';

-const mock = new MockLLM(
-  image: 'phantomllm-server:latest', // Docker image name (default)
-  containerPort: 8080, // Port inside the container (default)
-  reuse: true, // Reuse container across runs (default)
-  startupTimeout: 30_000, // Max ms to wait for startup (default)
-});
+const mock = new MockLLM();
 ```

 | Method / Property | Returns | Description |
 |---|---|---|
-| `await mock.start()` | `void` | Start the
-| `await mock.stop()` | `void` | Stop
-| `mock.baseUrl` | `string` | Server URL without `/v1`, e.g. `http://
-| `mock.apiBaseUrl` | `string` | Server URL with `/v1`, e.g. `http://
+| `await mock.start()` | `void` | Start the mock server. Idempotent — safe to call twice. |
+| `await mock.stop()` | `void` | Stop the server. Idempotent. |
+| `mock.baseUrl` | `string` | Server URL without `/v1`, e.g. `http://127.0.0.1:55123`. |
+| `mock.apiBaseUrl` | `string` | Server URL with `/v1`, e.g. `http://127.0.0.1:55123/v1`. Pass this to SDK clients. |
 | `mock.given` | `GivenStubs` | Entry point for stubbing responses. |
 | `mock.expect` | `ExpectConditions` | Entry point for configuring server behavior (API key validation, etc.). |
-| `await mock.clear()` | `void` | Remove all stubs and reset server config
+| `await mock.clear()` | `void` | Remove all stubs and reset server config. Call between tests. |

 `MockLLM` implements `Symbol.asyncDispose` for automatic cleanup:

@@ -266,7 +229,7 @@ Error responses follow the OpenAI error format:

 ### API Key Validation

-Test that your code sends the correct API key
+Test that your code sends the correct API key.

 ```typescript
 mock.expect.apiKey('sk-test-key-123');
@@ -281,13 +244,13 @@ const openai = new OpenAI({
 });
 ```

-`mock.expect` configures server constraints at runtime
+`mock.expect` configures server constraints at runtime. API key validation applies to all `/v1/*` endpoints. Admin endpoints (`/_admin/*`) are always accessible.

 Calling `mock.clear()` resets the API key requirement along with all stubs.

 | Method | Description |
 |---|---|
-| `mock.expect.apiKey(key)` | Require `Authorization: Bearer <key>` on all `/v1/*` requests.
+| `mock.expect.apiKey(key)` | Require `Authorization: Bearer <key>` on all `/v1/*` requests. |

 **Testing auth error handling:**

@@ -296,7 +259,6 @@ it('handles invalid API key', async () => {
   mock.expect.apiKey('correct-key');
   mock.given.chatCompletion.willReturn('Hello');

-  // Client configured with wrong key
   const badClient = new OpenAI({
     baseURL: mock.apiBaseUrl,
     apiKey: 'wrong-key',
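The behavior this test exercises can be sketched server-side as follows. The `checkApiKey` function and `AuthResult` shape are illustrative assumptions, not phantomllm's internals; the error object follows the OpenAI error format the README references:

```typescript
// Sketch of bearer-token validation as the README describes it: requests to
// /v1/* must carry "Authorization: Bearer <key>" when a key is configured.
interface AuthResult {
  ok: boolean;
  status: number;
  error?: { message: string; type: string };
}

function checkApiKey(authHeader: string | undefined, expectedKey: string | null): AuthResult {
  // No key configured: every request passes.
  if (expectedKey === null) return { ok: true, status: 200 };
  if (authHeader === `Bearer ${expectedKey}`) return { ok: true, status: 200 };
  return {
    ok: false,
    status: 401,
    error: { message: "Incorrect API key provided.", type: "invalid_request_error" },
  };
}

console.log(checkApiKey("Bearer wrong-key", "correct-key").status); // → 401
```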
@@ -349,7 +311,7 @@ await mock.start();

 const openai = new OpenAI({
   baseURL: mock.apiBaseUrl,
-  apiKey: 'test-key',
+  apiKey: 'test-key',
 });

 // Non-streaming
@@ -428,7 +390,7 @@ for await (const chunk of result.textStream) {
 await mock.stop();
 ```

-> **Note:** Use `provider.chat('model')` instead of `provider('model')` to ensure requests go through `/v1/chat/completions`.
+> **Note:** Use `provider.chat('model')` instead of `provider('model')` to ensure requests go through `/v1/chat/completions`.

 ### opencode

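The note above is about routing; the stream body itself is ordinary server-sent events. A self-contained sketch of joining OpenAI-style `delta` chunks into the final message text (the `joinSseDeltas` parser is illustrative, not the package's code):

```typescript
// Minimal parser for an OpenAI-style SSE body: each event is a "data: <json>"
// line, and the stream ends with "data: [DONE]". Content deltas are
// concatenated into the final message text.
function joinSseDeltas(body: string): string {
  let text = "";
  for (const line of body.split("\n")) {
    if (!line.startsWith("data: ")) continue;
    const payload = line.slice("data: ".length).trim();
    if (payload === "[DONE]") break;
    const delta = JSON.parse(payload).choices?.[0]?.delta?.content;
    if (typeof delta === "string") text += delta;
  }
  return text;
}

const stream = [
  'data: {"choices":[{"delta":{"content":"Hel"}}]}',
  'data: {"choices":[{"delta":{"content":"lo"}}]}',
  "data: [DONE]",
].join("\n\n");

console.log(joinSseDeltas(stream)); // → "Hello"
```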
@@ -439,7 +401,7 @@ Add a provider entry to your `opencode.json` pointing at the mock:
   "provider": {
     "mock": {
       "api": "openai",
-      "baseURL": "http://
+      "baseURL": "http://127.0.0.1:PORT/v1",
       "apiKey": "test-key",
       "models": {
         "gpt-4o": { "id": "gpt-4o" }
@@ -449,13 +411,12 @@ Add a provider entry to your `opencode.json` pointing at the mock:
 }
 ```

-Start the mock and print the URL
+Start the mock and print the URL:

 ```typescript
 const mock = new MockLLM();
 await mock.start();
 console.log(`Set baseURL to: ${mock.apiBaseUrl}`);
-// Update the port in opencode.json to match
 ```

 ### LangChain
@@ -485,13 +446,13 @@ await mock.stop();

 ### Python openai package

-The mock server is a real HTTP server — any language can
+The mock server is a real HTTP server — any language can connect to it. Start the mock from Node.js, then use it from Python:

 ```python
 import openai

 client = openai.OpenAI(
-    base_url="http://
+    base_url="http://127.0.0.1:55123/v1",  # use mock.apiBaseUrl
     api_key="test-key",
 )

@@ -521,7 +482,7 @@ console.log(data.choices[0].message.content);
 ### curl

 ```bash
-curl http://
+curl http://127.0.0.1:55123/v1/chat/completions \
   -H "Content-Type: application/json" \
   -d '{
     "model": "gpt-4o",
@@ -545,14 +506,14 @@ describe('my LLM feature', () => {
   beforeAll(async () => {
     await mock.start();
     openai = new OpenAI({ baseURL: mock.apiBaseUrl, apiKey: 'test' });
-  }
+  });

   afterAll(async () => {
     await mock.stop();
   });

   beforeEach(async () => {
-  await mock.clear();
+    await mock.clear();
   });

   it('should summarize text', async () => {
@@ -611,7 +572,7 @@ describe('my LLM feature', () => {
   beforeAll(async () => {
     await mock.start();
     openai = new OpenAI({ baseURL: mock.apiBaseUrl, apiKey: 'test' });
-  }
+  });

   afterAll(() => mock.stop());
   beforeEach(() => mock.clear());
@@ -631,7 +592,7 @@ describe('my LLM feature', () => {

 ### Shared Fixture for Multi-File Suites

-Start one
+Start one server for your entire test suite:

 **`tests/support/mock.ts`**

@@ -642,7 +603,6 @@ export const mock = new MockLLM();

 export async function setup() {
   await mock.start();
-  // Make the URL available to other processes if needed
   process.env.PHANTOMLLM_URL = mock.apiBaseUrl;
 }

@@ -678,95 +638,27 @@ it('works', async () => {

 ## Performance

-
-
-### Container Lifecycle
-
-| Metric | Time |
-|--------|------|
-| Cold start (`mock.start()`) | ~1.1s |
-| Stop (`mock.stop()`) | ~130ms |
-| Full lifecycle (start, stub, request, stop) | ~1.2s |
-
-### Request Latency (through Docker network)
-
-| Metric | Median | p95 |
-|--------|--------|-----|
-| Chat completion (non-streaming) | 0.6ms | 1.8ms |
-| Streaming TTFB | 0.7ms | 0.9ms |
-| Streaming total (8 chunks) | 0.7ms | 1.0ms |
-| Embedding (1536-dim) | 0.7ms | 1.6ms |
-| Embedding batch (10x1536) | 1.9ms | 2.7ms |
-| Stub registration | 0.5ms | 0.8ms |
-| Clear stubs | 0.2ms | 0.4ms |
-
-### Throughput
-
-| Metric | Requests/sec |
-|--------|-------------|
-| Sequential chat completions | ~4,300 |
-| Concurrent (10 workers) | ~6,400 |
-| Health endpoint | ~5,900 |
-
-### Tips
-
-- **Don't restart between tests.** Call `mock.clear()` (sub-millisecond) instead of `stop()`/`start()` (~1.2s).
-- **Use global setup.** Start one container for your entire suite. See [Shared Fixture](#shared-fixture-for-multi-file-suites).
-- **Cache the Docker image in CI.** Build it once and cache the layer:
-
-```yaml
-# GitHub Actions
-- name: Build mock server image
-  run: npm run docker:build
-
-- name: Cache Docker layers
-  uses: actions/cache@v4
-  with:
-    path: /tmp/.buildx-cache
-    key: ${{ runner.os }}-docker-phantomllm-${{ hashFiles('Dockerfile') }}
-```
-
-## Configuration
-
-### Constructor Options
-
-| Option | Type | Default | Description |
-|--------|------|---------|-------------|
-| `image` | `string` | `'phantomllm-server:latest'` | Docker image to run. |
-| `containerPort` | `number` | `8080` | Port the server listens on inside the container. |
-| `reuse` | `boolean` | `true` | Reuse a running container across test runs. |
-| `startupTimeout` | `number` | `30000` | Max milliseconds to wait for the container to become healthy. |
-
-### Environment Variables
-
-| Variable | Description |
-|----------|-------------|
-| `PHANTOMLLM_IMAGE` | Override the Docker image name without changing code. Takes precedence over the default but not over the constructor `image` option. |
-
-### OpenAI-Compatible Endpoints
+The mock server runs in-process using Fastify — no Docker overhead:

-
+| Metric | Value |
+|--------|-------|
+| Server startup | < 5ms |
+| Chat completion response | ~0.2ms median |
+| Streaming (8 chunks) | ~0.2ms total |
+| Embedding (1536-dim) | ~0.3ms median |
+| Throughput | ~11,000 req/s |

-
-
-
-| `/v1/embeddings` | POST | Text embeddings |
-| `/v1/models` | GET | List available models |
-| `/_admin/stubs` | POST | Register a stub |
-| `/_admin/stubs` | DELETE | Clear all stubs |
-| `/_admin/health` | GET | Health check |
+**Tips:**
+- Start the server once in `beforeAll`, call `mock.clear()` between tests.
+- Use a [shared fixture](#shared-fixture-for-multi-file-suites) for multi-file test suites.

 ## Troubleshooting

 | Problem | Cause | Solution |
 |---|---|---|
 | `ContainerNotStartedError` | Using `baseUrl`, `given`, or `clear()` before `start()`. | Call `await mock.start()` first. |
-| Container startup timeout | Docker not running or image not built. | Run `docker info` to verify Docker. Run `npm run docker:build` to build the image. |
-| Connection refused | Wrong URL or container not ready. | Use `mock.apiBaseUrl` for SDK clients. Ensure `start()` has resolved. |
 | Stubs leaking between tests | Stubs persist until cleared. | Call `await mock.clear()` in `beforeEach`. |
 | 418 response | No stub matches the request. | Register a stub matching the model/content, or add a catch-all: `mock.given.chatCompletion.willReturn('...')`. |
-| `PHANTOMLLM_IMAGE` env var | Need a custom image. | Set `PHANTOMLLM_IMAGE=my-registry/image:tag` in your environment. |
-| Slow CI | Image rebuilt every run. | Cache Docker layers and enable container reuse. |
 | AI SDK uses wrong endpoint | `provider('model')` defaults to Responses API in v3+. | Use `provider.chat('model')` to target `/v1/chat/completions`. |

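The 418 row in the troubleshooting table above describes stub matching. A sketch of that first-match-wins selection with a catch-all fallback (the `Stub` type and `matchStub` function are hypothetical, not phantomllm's real matcher):

```typescript
// Illustrative stub matcher: pick the first stub whose model matches the
// request's model; a stub with no model acts as a catch-all. If nothing
// matches, answer 418 as the troubleshooting table describes.
interface Stub {
  model?: string; // undefined = catch-all
  reply: string;
}

function matchStub(stubs: Stub[], requestModel: string): { status: number; reply?: string } {
  for (const stub of stubs) {
    if (stub.model === undefined || stub.model === requestModel) {
      return { status: 200, reply: stub.reply };
    }
  }
  return { status: 418 }; // no stub matched the request
}

const stubs: Stub[] = [{ model: "gpt-4", reply: "Hi from gpt-4" }, { reply: "catch-all" }];
console.log(matchStub(stubs, "gpt-4o").reply); // → "catch-all"
```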
 ## License