ai-proxy-server 3.0.0.dev1__tar.gz → 3.0.2__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {ai_proxy_server-3.0.0.dev1 → ai_proxy_server-3.0.2}/PKG-INFO +170 -55
- {ai_proxy_server-3.0.0.dev1 → ai_proxy_server-3.0.2}/README.md +151 -50
- {ai_proxy_server-3.0.0.dev1 → ai_proxy_server-3.0.2}/lm_proxy/__main__.py +1 -0
- {ai_proxy_server-3.0.0.dev1 → ai_proxy_server-3.0.2}/lm_proxy/_app.py +5 -8
- {ai_proxy_server-3.0.0.dev1 → ai_proxy_server-3.0.2}/lm_proxy/api_key_check/__init__.py +1 -0
- {ai_proxy_server-3.0.0.dev1 → ai_proxy_server-3.0.2}/lm_proxy/api_key_check/allow_all.py +2 -4
- {ai_proxy_server-3.0.0.dev1 → ai_proxy_server-3.0.2}/lm_proxy/api_key_check/in_config.py +1 -0
- {ai_proxy_server-3.0.0.dev1 → ai_proxy_server-3.0.2}/lm_proxy/api_key_check/with_request.py +3 -4
- {ai_proxy_server-3.0.0.dev1 → ai_proxy_server-3.0.2}/lm_proxy/app.py +6 -6
- {ai_proxy_server-3.0.0.dev1 → ai_proxy_server-3.0.2}/lm_proxy/base_types.py +34 -2
- {ai_proxy_server-3.0.0.dev1 → ai_proxy_server-3.0.2}/lm_proxy/bootstrap.py +11 -7
- {ai_proxy_server-3.0.0.dev1 → ai_proxy_server-3.0.2}/lm_proxy/config.py +2 -0
- {ai_proxy_server-3.0.0.dev1 → ai_proxy_server-3.0.2}/lm_proxy/config_loaders/__init__.py +1 -0
- {ai_proxy_server-3.0.0.dev1 → ai_proxy_server-3.0.2}/lm_proxy/config_loaders/json.py +1 -0
- {ai_proxy_server-3.0.0.dev1 → ai_proxy_server-3.0.2}/lm_proxy/config_loaders/python.py +1 -0
- {ai_proxy_server-3.0.0.dev1 → ai_proxy_server-3.0.2}/lm_proxy/config_loaders/toml.py +1 -0
- {ai_proxy_server-3.0.0.dev1 → ai_proxy_server-3.0.2}/lm_proxy/core.py +58 -42
- ai_proxy_server-3.0.2/lm_proxy/errors.py +44 -0
- ai_proxy_server-3.0.2/lm_proxy/handlers/__init__.py +4 -0
- ai_proxy_server-3.0.2/lm_proxy/handlers/forward_http_headers.py +71 -0
- ai_proxy_server-3.0.2/lm_proxy/handlers/rate_limiter.py +87 -0
- {ai_proxy_server-3.0.0.dev1 → ai_proxy_server-3.0.2}/lm_proxy/loggers.py +6 -0
- {ai_proxy_server-3.0.0.dev1 → ai_proxy_server-3.0.2}/lm_proxy/models_endpoint.py +8 -10
- {ai_proxy_server-3.0.0.dev1 → ai_proxy_server-3.0.2}/lm_proxy/utils.py +2 -0
- {ai_proxy_server-3.0.0.dev1 → ai_proxy_server-3.0.2}/pyproject.toml +42 -16
- {ai_proxy_server-3.0.0.dev1 → ai_proxy_server-3.0.2}/LICENSE +0 -0
- {ai_proxy_server-3.0.0.dev1 → ai_proxy_server-3.0.2}/lm_proxy/__init__.py +0 -0
- {ai_proxy_server-3.0.0.dev1 → ai_proxy_server-3.0.2}/lm_proxy/config_loaders/yaml.py +0 -0
@@ -1,6 +1,6 @@
 Metadata-Version: 2.3
 Name: ai-proxy-server
-Version: 3.0.0.dev1
+Version: 3.0.2
 Summary: AI Proxy Server is an OpenAI-compatible http proxy server for inferencing various LLMs capable of working with Google, Anthropic, OpenAI APIs, local PyTorch inference, etc.
 License: MIT License
 
@@ -23,7 +23,7 @@ License: MIT License
 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 SOFTWARE.
-Keywords: llm,large language models,ai,gpt,openai,proxy,http,proxy-server
+Keywords: llm,large language models,ai,gpt,openai,proxy,http,proxy-server,llm gateway,openai,anthropic,google genai
 Author: Vitalii Stepanenko
 Author-email: mail@vitaliy.in
 Maintainer: Vitalii Stepanenko
@@ -36,16 +36,30 @@ Classifier: Programming Language :: Python :: 3.11
 Classifier: Programming Language :: Python :: 3.12
 Classifier: Programming Language :: Python :: 3.13
 Classifier: License :: OSI Approved :: MIT License
+Classifier: Operating System :: OS Independent
+Classifier: Intended Audience :: Developers
+Classifier: Topic :: Software Development :: Libraries :: Application Frameworks
+Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
+Classifier: Development Status :: 5 - Production/Stable
+Provides-Extra: all
+Provides-Extra: anthropic
+Provides-Extra: google
 Provides-Extra: test
-Requires-Dist: ai-microcore (>=5.
+Requires-Dist: ai-microcore (>=5.1.2,<6)
+Requires-Dist: anthropic (>=0.77,<1) ; extra == "all"
+Requires-Dist: anthropic (>=0.77,<1) ; extra == "anthropic"
 Requires-Dist: fastapi (>=0.121.3,<1)
+Requires-Dist: google-genai (>=1.62.0,<2) ; extra == "all"
+Requires-Dist: google-genai (>=1.62.0,<2) ; extra == "google"
 Requires-Dist: pydantic (>=2.12.5,<2.13.0)
 Requires-Dist: pytest (>=8.4.2,<8.5.0) ; extra == "test"
 Requires-Dist: pytest-asyncio (>=1.2.0,<1.3.0) ; extra == "test"
 Requires-Dist: pytest-cov (>=7.0.0,<7.1.0) ; extra == "test"
 Requires-Dist: requests (>=2.32.5,<2.33.0)
-Requires-Dist: typer (>=0.
-Requires-Dist: uvicorn (>=0.
+Requires-Dist: typer (>=0.24.0)
+Requires-Dist: uvicorn (>=0.41.0)
+Requires-Dist: websockets (>=14.0,<15)
+Project-URL: Bug Tracker, https://github.com/Nayjest/lm-proxy/issues
 Project-URL: Source Code, https://github.com/Nayjest/lm-proxy
 Description-Content-Type: text/markdown
 
@@ -92,13 +106,19 @@ It works as a drop-in replacement for OpenAI's API, allowing you to switch betwe
 - [Load Balancing Example](#load-balancing-example)
 - [Google Vertex AI Example](#google-vertex-ai-configuration-example)
 - [Using Tokens from OIDC Provider as Virtual/Client API Keys](#using-tokens-from-oidc-provider-as-virtualclient-api-keys)
-- [Add-on Components](
-  - [Database Connector](#database-connector)
+- [Add-on Components](#-add-on-components)
+  - [Database Connector](#database-connector)
+- [Request Handlers (Middleware)](#-request-handlers--middleware)
+- [Guides & Reference](#-guides--reference)
+- [Known Limitations](#-known-limitations)
 - [Debugging](#-debugging)
 - [Contributing](#-contributing)
 - [License](#-license)
 
-## ✨ Features
+<a href="#" align="center"><img alt="AI Proxy Server / Gateway" src="https://raw.githubusercontent.com/Nayjest/lm-proxy/main/press-kit/assets/lm-proxy_1_hacker_1600x672.png"></a>
+
+
+## ✨ Features<a id="-features"></a>
 
 - **Provider Agnostic**: Connect to OpenAI, Anthropic, Google AI, local models, and more using a single API
 - **Unified Interface**: Access all models through the standard OpenAI API format
@@ -108,21 +128,28 @@ It works as a drop-in replacement for OpenAI's API, allowing you to switch betwe
 - **Easy Configuration**: Simple TOML/YAML/JSON/Python configuration files for setup
 - **Extensible by Design**: Minimal core with clearly defined extension points, enabling seamless customization and expansion without modifying the core system.
 
-## 🚀 Getting Started
+
+## 🚀 Getting Started<a id="-getting-started"></a>
 
 ### Requirements
 Python 3.11 | 3.12 | 3.13
 
-### Installation
-
+### Installation<a id="installation"></a>
 ```bash
 pip install ai-proxy-server
 ```
+For proxying to Anthropic API or Google Gemini via Vertex AI or Google AI Studio, install optional dependencies:
+```
+pip install ai-proxy-server[anthropic,google]
+```
+or
+```
+pip install ai-proxy-server[all]
+```
 
-### Quick Start
+### Quick Start<a id="quick-start"></a>
 
 #### 1. Create a `config.toml` file:
-
 ```toml
 host = "0.0.0.0"
 port = 8000
@@ -149,7 +176,6 @@ api_keys = ["YOUR_API_KEY_HERE"]
 > To enhance security, consider storing upstream API keys in operating system environment variables rather than embedding them directly in the configuration file. You can reference these variables in the configuration using the env:<VAR_NAME> syntax.
 
 #### 2. Start the server:
-
 ```bash
 ai-proxy-server
 ```
@@ -159,7 +185,6 @@ python -m lm_proxy
 ```
 
 #### 3. Use it with any OpenAI-compatible client:
-
 ```python
 from openai import OpenAI
 
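
For context, the client code above appears only as diff fragments; assembled from the Quick Start section, a full call through the proxy looks roughly like this (the base URL and key are placeholders implied by the sample `config.toml`, not values taken from the package):

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # host/port from the sample config.toml
    api_key="YOUR_API_KEY_HERE",          # a virtual key from a [groups] entry
)
completion = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(completion.choices[0].message.content)
```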
@@ -176,7 +201,6 @@ print(completion.choices[0].message.content)
 ```
 
 Or use the same endpoint with Claude models:
-
 ```python
 completion = client.chat.completions.create(
     model="claude-opus-4-1-20250805",  # This will be routed to Anthropic based on config
@@ -184,12 +208,12 @@ completion = client.chat.completions.create(
 )
 ```
 
-## 📝 Configuration
 
-
+## 📝 Configuration<a id="-configuration"></a>
 
-
+AI Proxy Server is configured through a TOML/YAML/JSON/Python file that specifies connections, routing rules, and access control.
 
+### Basic Structure<a id="basic-structure"></a>
 ```toml
 host = "0.0.0.0" # Interface to bind to
 port = 8000 # Port to listen on
@@ -248,19 +272,18 @@ created_at = "created_at"
 duration = "duration"
 ```
 
-### Environment Variables
+### Environment Variables<a id="environment-variables"></a>
 
 You can reference environment variables in your configuration file by prefixing values with `env:`.
 
 For example:
-
 ```toml
 [connections.openai]
 api_key = "env:OPENAI_API_KEY"
 ```
 
 At runtime, AI Proxy Server automatically retrieves the value of the target variable
-(OPENAI_API_KEY) from your operating system
+(OPENAI_API_KEY) from your operating system's environment or from a .env file, if present.
 
 ### .env Files
 
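
The `env:` convention changed in this hunk can be illustrated with a minimal sketch; this is not the proxy's actual loader, just the lookup rule the README describes (with `.env` contents assumed to be loaded into the process environment first):

```python
import os

def resolve_config_value(value: str) -> str:
    """Resolve an "env:VAR_NAME" reference against the process environment."""
    if value.startswith("env:"):
        return os.environ.get(value[len("env:"):], "")
    return value

os.environ["OPENAI_API_KEY"] = "sk-demo"
assert resolve_config_value("env:OPENAI_API_KEY") == "sk-demo"
assert resolve_config_value("literal-value") == "literal-value"
```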
@@ -280,7 +303,6 @@ LM_PROXY_DEBUG=no
 ```
 
 You can also control `.env` file usage with the `--env` command-line option:
-
 ```bash
 # Use a custom .env file path
 ai-proxy-server --env="path/to/your/.env"
@@ -288,7 +310,8 @@ ai-proxy-server --env="path/to/your/.env"
 ai-proxy-server --env=""
 ```
 
-## 🔑 Proxy API Keys vs. Provider API Keys
+
+## 🔑 Proxy API Keys vs. Provider API Keys<a id="-proxy-api-keys-vs-provider-api-keys"></a>
 
 AI Proxy Server utilizes two distinct types of API keys to facilitate secure and efficient request handling.
 
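
A toy illustration of the separation this section describes (hypothetical data and helper, not lm_proxy code): the client's virtual key only authenticates the caller and selects a group, while the upstream key stays server-side.

```python
GROUPS = {"default": {"api_keys": {"VIRTUAL_KEY_1"}, "allowed_connections": "*"}}
CONNECTIONS = {"openai": {"api_key": "sk-real-upstream-key"}}

def upstream_api_key(virtual_key: str, connection: str) -> str:
    """Authenticate the caller by virtual key, then return the provider credential."""
    if not any(virtual_key in g["api_keys"] for g in GROUPS.values()):
        raise PermissionError("unknown virtual API key")
    return CONNECTIONS[connection]["api_key"]  # never exposed to the client

assert upstream_api_key("VIRTUAL_KEY_1", "openai") == "sk-real-upstream-key"
```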
@@ -309,18 +332,17 @@ This distinction ensures a clear separation of concerns:
 Virtual API Keys manage user authentication and access within the proxy,
 while Upstream API Keys handle secure communication with external providers.
 
-## 🔌 API Usage
 
-
+## 🔌 API Usage<a id="-api-usage"></a>
 
-
+AI Proxy Server implements the OpenAI chat completions API endpoint. You can use any OpenAI-compatible client to interact with it.
 
+### Chat Completions Endpoint<a id="chat-completions-endpoint"></a>
 ```http
 POST /v1/chat/completions
 ```
 
 #### Request Format
-
 ```json
 {
   "model": "gpt-3.5-turbo",
@@ -334,7 +356,6 @@ POST /v1/chat/completions
 ```
 
 #### Response Format
-
 ```json
 {
   "choices": [
@@ -351,12 +372,10 @@ POST /v1/chat/completions
 ```
 
 
-### Models List Endpoint
+### Models List Endpoint<a id="models-list-endpoint"></a>
 
 
 List and describe all models available through the API.
-
-
 ```http
 GET /v1/models
 ```
@@ -366,7 +385,6 @@ Routing keys can reference both **exact model names** and **model name patterns*
 
 By default, wildcard patterns are displayed as-is in the models list (e.g., `"gpt*"`, `"claude*"`).
 This behavior can be customized via the `model_listing_mode` configuration option:
-
 ```
 model_listing_mode = "as_is" | "ignore_wildcards" | "expand_wildcards"
 ```
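
A rough sketch of what the first two listing modes imply (hypothetical helper; `expand_wildcards` is deliberately left unimplemented here because its expansion source is not described in this diff):

```python
def listed_models(routing_keys: list[str], mode: str) -> list[str]:
    if mode == "as_is":
        return routing_keys                               # show "gpt*" literally
    if mode == "ignore_wildcards":
        return [k for k in routing_keys if "*" not in k]  # drop pattern entries
    raise NotImplementedError("expand_wildcards needs per-connection model data")

assert listed_models(["gpt-4.1", "gpt*"], "as_is") == ["gpt-4.1", "gpt*"]
assert listed_models(["gpt-4.1", "gpt*"], "ignore_wildcards") == ["gpt-4.1"]
```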
@@ -399,7 +417,6 @@ api_key = "env:ANTHROPIC_API_KEY"
 
 
 #### Response Format
-
 ```json
 {
   "object": "list",
@@ -420,23 +437,22 @@ api_key = "env:ANTHROPIC_API_KEY"
 }
 ```
 
-## 🔒 User Groups Configuration
+
+## 🔒 User Groups Configuration<a id="-user-groups-configuration"></a>
 
 The `[groups]` section in the configuration defines access control rules for different user groups.
 Each group can have its own set of virtual API keys and permitted connections.
 
-### Basic Group Definition
-
+### Basic Group Definition<a id="basic-group-definition"></a>
 ```toml
 [groups.default]
 api_keys = ["KEY1", "KEY2"]
 allowed_connections = "*" # Allow access to all connections
 ```
 
-### Group-based Access Control
+### Group-based Access Control<a id="group-based-access-control"></a>
 
 You can create multiple groups to segment your users and control their access:
-
 ```toml
 # Admin group with full access
 [groups.admin]
@@ -454,7 +470,7 @@ api_keys = ["FREE_KEY_1", "FREE_KEY_2"]
 allowed_connections = "openai" # Only allowed to use OpenAI connection
 ```
 
-### Connection Restrictions
+### Connection Restrictions<a id="connection-restrictions"></a>
 
 The `allowed_connections` parameter controls which upstream providers a group can access:
 
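
The `allowed_connections` check can be sketched as follows; treating the value as either `"*"`, a single connection name, or a list of names is an assumption for illustration, not taken from the package:

```python
from fnmatch import fnmatch

def connection_allowed(allowed: str | list[str], connection: str) -> bool:
    """Sketch: "*" matches everything, otherwise compare against named connections."""
    patterns = [allowed] if isinstance(allowed, str) else allowed
    return any(fnmatch(connection, pattern) for pattern in patterns)

assert connection_allowed("*", "anthropic")           # e.g. the admin group
assert connection_allowed("openai", "openai")         # e.g. the free group
assert not connection_allowed("openai", "anthropic")  # outside the group's scope
```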
@@ -468,7 +484,8 @@ This allows fine-grained control over which users can access which AI providers,
 - Implementing usage quotas per group
 - Billing and cost allocation by user group
 
-### Virtual API Key Validation
+
+### Virtual API Key Validation<a id="virtual-api-key-validation"></a>
 
 #### Overview
 
@@ -485,7 +502,6 @@ In the .py config representation, the validator function can be passed directly
 #### Example configuration for external API key validation using HTTP request to Keycloak / OpenID Connect
 
 This example shows how to validate API keys against an external service (e.g., Keycloak):
-
 ```toml
 [api_key_check]
 class = "lm_proxy.api_key_check.CheckAPIKeyWithRequest"
@@ -502,7 +518,6 @@ Authorization = "Bearer {api_key}"
 
 For more advanced authentication needs,
 you can implement a custom validator function:
-
 ```python
 # my_validators.py
 def validate_api_key(api_key: str) -> str | None:
@@ -523,7 +538,6 @@ def validate_api_key(api_key: str) -> str | None:
 ```
 
 Then reference it in your config:
-
 ```toml
 api_key_check = "my_validators.validate_api_key"
 ```
@@ -531,11 +545,11 @@ api_key_check = "my_validators.validate_api_key"
 > In this case, the `api_keys` lists in groups are ignored, and the custom function is responsible for all validation logic.
 
 
-## 🛠️ Advanced Usage
-### Dynamic Model Routing
+## 🛠️ Advanced Usage<a id="-advanced-usage"></a>
 
-
+### Dynamic Model Routing<a id="dynamic-model-routing"></a>
 
+The routing section allows flexible pattern matching with wildcards:
 ```toml
 [routing]
 "gpt-4*" = "openai.gpt-4" # Route gpt-4 requests to OpenAI GPT-4
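
The wildcard routing shown above can be sketched like this; the first-match-wins ordering and the extra table entries are illustrative assumptions, not taken from the package:

```python
from fnmatch import fnmatch

ROUTING = {
    "gpt-4*": "openai.gpt-4",   # from the config fragment above
    "claude*": "anthropic.*",   # hypothetical: "*" keeps the requested model name
    "*": "openai.*",            # hypothetical catch-all
}

def resolve_route(model: str) -> tuple[str, str]:
    for pattern, target in ROUTING.items():  # assumed: first matching pattern wins
        if fnmatch(model, pattern):
            connection, _, target_model = target.partition(".")
            return connection, model if target_model == "*" else target_model
    raise LookupError(f"no route for model {model!r}")

print(resolve_route("gpt-4-turbo"))    # -> ('openai', 'gpt-4')
print(resolve_route("claude-3-opus"))  # -> ('anthropic', 'claude-3-opus')
```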
@@ -548,18 +562,19 @@ The routing section allows flexible pattern matching with wildcards:
 Keys are model name patterns (with `*` wildcard support), and values are connection/model mappings.
 Connection names reference those defined in the `[connections]` section.
 
-### Load Balancing Example
+### Load Balancing Example<a id="load-balancing-example"></a>
 
 - [Simple load-balancer configuration](https://github.com/Nayjest/lm-proxy/blob/main/examples/load_balancer_config.py)
 This example demonstrates how to set up a load balancer that randomly
 distributes requests across multiple language model servers using the lm_proxy.
 
-### Google Vertex AI Configuration Example
+### Google Vertex AI Configuration Example<a id="google-vertex-ai-configuration-example"></a>
 
 - [vertex-ai.toml](https://github.com/Nayjest/lm-proxy/blob/main/examples/vertex-ai.toml)
 This example demonstrates how to connect AI Proxy Server to Google Gemini model via Vertex AI API
 
-### Using Tokens from OIDC Provider as Virtual/Client API Keys
+
+### Using Tokens from OIDC Provider as Virtual/Client API Keys<a id="using-tokens-from-oidc-provider-as-virtualclient-api-keys"></a>
@@ -594,9 +609,95 @@ Authorization = "Bearer {api_key}"
 
 Clients pass their OIDC access token as the API key when making requests to AI Proxy Server.
 
-## 🧩 Add-on Components
 
-### Database Connector
+## 🪝 Request Handlers (Middleware)<a id="-request-handlers--middleware"></a>
+
+Handlers intercept and modify requests *before* they reach the upstream LLM provider. They enable cross-cutting concerns such as rate limiting, logging, auditing, and header manipulation.
+
+Handlers are defined in the `before` list within the configuration file and execute sequentially in the order specified.
+
+### Built-in Handlers
+
+AI Proxy Server includes several built-in handlers for common operational needs.
+
+#### Rate Limiter
+
+The `RateLimiter` protects upstream credentials and manages traffic load using a sliding window algorithm.
+
+**Parameters:**
+
+| Parameter | Type | Description |
+|-----------|------|-------------|
+| `max_requests` | int | Maximum number of requests allowed per window |
+| `window_seconds` | int | Duration of the sliding window in seconds |
+| `per` | string | Scope of the limit: `api_key`, `ip`, `connection`, `group`, or `global` |
+
+**Configuration:**
+```toml
+[[before]]
+class = "lm_proxy.handlers.RateLimiter"
+max_requests = 10
+window_seconds = 60
+per = "api_key"
+
+[[before]]
+class = "lm_proxy.handlers.RateLimiter"
+max_requests = 1000
+window_seconds = 300
+per = "global"
+```
+
+#### HTTP Headers Forwarder
+
+The `HTTPHeadersForwarder` passes specific headers from incoming client requests to the upstream provider—useful for distributed tracing or tenant context propagation.
+
+Sensitive headers (`Authorization`, `Host`, `Content-Length`) are stripped by default to prevent protocol corruption and credential leaks.
+```toml
+[[before]]
+class = "lm_proxy.handlers.HTTPHeadersForwarder"
+white_list_headers = ["x-trace-id", "x-correlation-id", "x-tenant-id"]
+```
+See also [HTTP Header Management](https://github.com/Nayjest/lm-proxy/blob/main/doc/http_headers.md).
+
+### Custom Handlers
+
+Extend functionality by implementing custom handlers in Python. A handler is any callable (function or class instance) that accepts a `RequestContext`.
+
+#### Interface
+```python
+from lm_proxy.base_types import RequestContext
+
+async def my_custom_handler(ctx: RequestContext) -> None:
+    # Implementation here
+    pass
+```
+
+#### Example: Audit Logger
+```python
+# my_extensions.py
+import logging
+from lm_proxy.base_types import RequestContext
+
+class AuditLogger:
+    def __init__(self, prefix: str = "AUDIT"):
+        self.prefix = prefix
+
+    async def __call__(self, ctx: RequestContext) -> None:
+        user = ctx.user_info.get("name", "anonymous")
+        logging.info(f"[{self.prefix}] User '{user}' requested model '{ctx.model}'")
+```
+
+**Registration:**
+```toml
+[[before]]
+class = "my_extensions.AuditLogger"
+prefix = "SECURITY_AUDIT"
+```
+
+
+## 🧩 Add-on Components<a id="-add-on-components"></a>
+
+### Database Connector<a id="database-connector"></a>
 
 [ai-proxy-server-db-connector](https://github.com/nayjest/lm-proxy-db-connector) is a lightweight SQLAlchemy-based connector that enables AI Proxy Server to work with relational databases including PostgreSQL, MySQL/MariaDB, SQLite, Oracle, Microsoft SQL Server, and many others.
 
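
A minimal sketch of the sliding-window idea behind the new `RateLimiter` handler (not the code from `lm_proxy/handlers/rate_limiter.py`; the per-scope keying and eviction details are assumptions):

```python
import time
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Track request timestamps per scope key; evict ones older than the window."""

    def __init__(self, max_requests: int, window_seconds: int):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits: dict[str, deque] = defaultdict(deque)

    def allow(self, scope_key: str) -> bool:
        now = time.monotonic()
        hits = self._hits[scope_key]
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()                # timestamp fell out of the sliding window
        if len(hits) >= self.max_requests:
            return False                  # the handler would reject this request
        hits.append(now)
        return True

limiter = SlidingWindowLimiter(max_requests=10, window_seconds=60)
assert limiter.allow("api_key:KEY1")      # per = "api_key" scopes by caller key
```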
@@ -605,7 +706,21 @@ Clients pass their OIDC access token as the API key when making requests to AI P
 - Share database connections across components, extensions, and custom functions
 - Built-in database logger for structured logging of AI request data
 
-## 🔍 Debugging
+
+## 📚 Guides & Reference<a id="-guides--reference"></a>
+
+For more detailed information, check out these articles:
+- [HTTP Header Management](https://github.com/Nayjest/lm-proxy/blob/main/doc/http_headers.md)
+
+
+## 🚧 Known Limitations<a id="-known-limitations"></a>
+
+- **Multiple generations (n > 1):** When proxying requests to Google or Anthropic APIs, only the first generation is returned. Multi-generation support is tracked in [#35](https://github.com/Nayjest/lm-proxy/issues/35).
+
+- **Model listing with wildcards / forwarding actual model metadata:** The `/v1/models` endpoint does not query upstream providers to expand wildcard patterns (e.g., `gpt*`) or fetch model metadata. Only explicitly defined model names are listed [#36](https://github.com/Nayjest/lm-proxy/issues/36).
+
+
+## 🔍 Debugging<a id="-debugging"></a>
 
 ### Overview
 When **debugging mode** is enabled,
@@ -629,7 +744,7 @@ Alternatively, you can enable or disable debugging via the command-line argument
 > CLI arguments override environment variable settings.
 
 
-## 🤝 Contributing
+## 🤝 Contributing<a id="-contributing"></a>
 
 Contributions are welcome! Please feel free to submit a Pull Request.
 
@@ -640,7 +755,7 @@ Contributions are welcome! Please feel free to submit a Pull Request.
 5. Open a Pull Request
 
 
-## 📄 License
+## 📄 License<a id="-license"></a>
 
 This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
 © 2025–2026 [Vitalii Stepanenko](mailto:mail@vitaliy.in)