inference-proxy 3.0.0.dev1__tar.gz → 3.0.1__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (28)
  1. {inference_proxy-3.0.0.dev1 → inference_proxy-3.0.1}/PKG-INFO +167 -53
  2. {inference_proxy-3.0.0.dev1 → inference_proxy-3.0.1}/README.md +151 -50
  3. {inference_proxy-3.0.0.dev1 → inference_proxy-3.0.1}/lm_proxy/app.py +3 -0
  4. {inference_proxy-3.0.0.dev1 → inference_proxy-3.0.1}/lm_proxy/base_types.py +34 -2
  5. {inference_proxy-3.0.0.dev1 → inference_proxy-3.0.1}/lm_proxy/bootstrap.py +14 -3
  6. {inference_proxy-3.0.0.dev1 → inference_proxy-3.0.1}/lm_proxy/config.py +1 -0
  7. {inference_proxy-3.0.0.dev1 → inference_proxy-3.0.1}/lm_proxy/core.py +55 -33
  8. inference_proxy-3.0.1/lm_proxy/errors.py +43 -0
  9. inference_proxy-3.0.1/lm_proxy/handlers/__init__.py +7 -0
  10. inference_proxy-3.0.1/lm_proxy/handlers/forward_http_headers.py +70 -0
  11. inference_proxy-3.0.1/lm_proxy/handlers/rate_limiter.py +88 -0
  12. {inference_proxy-3.0.0.dev1 → inference_proxy-3.0.1}/lm_proxy/models_endpoint.py +2 -1
  13. {inference_proxy-3.0.0.dev1 → inference_proxy-3.0.1}/lm_proxy/utils.py +2 -0
  14. {inference_proxy-3.0.0.dev1 → inference_proxy-3.0.1}/pyproject.toml +39 -16
  15. {inference_proxy-3.0.0.dev1 → inference_proxy-3.0.1}/LICENSE +0 -0
  16. {inference_proxy-3.0.0.dev1 → inference_proxy-3.0.1}/lm_proxy/__init__.py +0 -0
  17. {inference_proxy-3.0.0.dev1 → inference_proxy-3.0.1}/lm_proxy/__main__.py +0 -0
  18. {inference_proxy-3.0.0.dev1 → inference_proxy-3.0.1}/lm_proxy/_app.py +0 -0
  19. {inference_proxy-3.0.0.dev1 → inference_proxy-3.0.1}/lm_proxy/api_key_check/__init__.py +0 -0
  20. {inference_proxy-3.0.0.dev1 → inference_proxy-3.0.1}/lm_proxy/api_key_check/allow_all.py +0 -0
  21. {inference_proxy-3.0.0.dev1 → inference_proxy-3.0.1}/lm_proxy/api_key_check/in_config.py +0 -0
  22. {inference_proxy-3.0.0.dev1 → inference_proxy-3.0.1}/lm_proxy/api_key_check/with_request.py +0 -0
  23. {inference_proxy-3.0.0.dev1 → inference_proxy-3.0.1}/lm_proxy/config_loaders/__init__.py +0 -0
  24. {inference_proxy-3.0.0.dev1 → inference_proxy-3.0.1}/lm_proxy/config_loaders/json.py +0 -0
  25. {inference_proxy-3.0.0.dev1 → inference_proxy-3.0.1}/lm_proxy/config_loaders/python.py +0 -0
  26. {inference_proxy-3.0.0.dev1 → inference_proxy-3.0.1}/lm_proxy/config_loaders/toml.py +0 -0
  27. {inference_proxy-3.0.0.dev1 → inference_proxy-3.0.1}/lm_proxy/config_loaders/yaml.py +0 -0
  28. {inference_proxy-3.0.0.dev1 → inference_proxy-3.0.1}/lm_proxy/loggers.py +0 -0
@@ -1,6 +1,6 @@
  Metadata-Version: 2.3
  Name: inference-proxy
- Version: 3.0.0.dev1
+ Version: 3.0.1
  Summary: Inference Proxy is an OpenAI-compatible http proxy server for inferencing various LLMs capable of working with Google, Anthropic, OpenAI APIs, local PyTorch inference, etc.
  License: MIT License

@@ -23,7 +23,7 @@ License: MIT License
  LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
  OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
  SOFTWARE.
- Keywords: llm,large language models,ai,gpt,openai,proxy,http,proxy-server
+ Keywords: llm,large language models,ai,gpt,openai,proxy,http,proxy-server,llm gateway,openai,anthropic,google genai
  Author: Vitalii Stepanenko
  Author-email: mail@vitaliy.in
  Maintainer: Vitalii Stepanenko
@@ -36,9 +36,21 @@ Classifier: Programming Language :: Python :: 3.11
  Classifier: Programming Language :: Python :: 3.12
  Classifier: Programming Language :: Python :: 3.13
  Classifier: License :: OSI Approved :: MIT License
+ Classifier: Operating System :: OS Independent
+ Classifier: Intended Audience :: Developers
+ Classifier: Topic :: Software Development :: Libraries :: Application Frameworks
+ Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
+ Classifier: Development Status :: 5 - Production/Stable
+ Provides-Extra: all
+ Provides-Extra: anthropic
+ Provides-Extra: google
  Provides-Extra: test
- Requires-Dist: ai-microcore (>=5.0.0.dev7,<6)
+ Requires-Dist: ai-microcore (>=5.1.2,<6)
+ Requires-Dist: anthropic (>=0.77,<1) ; extra == "all"
+ Requires-Dist: anthropic (>=0.77,<1) ; extra == "anthropic"
  Requires-Dist: fastapi (>=0.121.3,<1)
+ Requires-Dist: google-genai (>=1.62.0,<2) ; extra == "all"
+ Requires-Dist: google-genai (>=1.62.0,<2) ; extra == "google"
  Requires-Dist: pydantic (>=2.12.5,<2.13.0)
  Requires-Dist: pytest (>=8.4.2,<8.5.0) ; extra == "test"
  Requires-Dist: pytest-asyncio (>=1.2.0,<1.3.0) ; extra == "test"
@@ -46,6 +58,7 @@ Requires-Dist: pytest-cov (>=7.0.0,<7.1.0) ; extra == "test"
  Requires-Dist: requests (>=2.32.5,<2.33.0)
  Requires-Dist: typer (>=0.16.1)
  Requires-Dist: uvicorn (>=0.22.0)
+ Project-URL: Bug Tracker, https://github.com/Nayjest/lm-proxy/issues
  Project-URL: Source Code, https://github.com/Nayjest/lm-proxy
  Description-Content-Type: text/markdown

@@ -92,13 +105,19 @@ It works as a drop-in replacement for OpenAI's API, allowing you to switch betwe
  - [Load Balancing Example](#load-balancing-example)
  - [Google Vertex AI Example](#google-vertex-ai-configuration-example)
  - [Using Tokens from OIDC Provider as Virtual/Client API Keys](#using-tokens-from-oidc-provider-as-virtualclient-api-keys)
- - [Add-on Components](#add-on-components)
- - [Database Connector](#database-connector)
+ - [Add-on Components](#-add-on-components)
+ - [Database Connector](#database-connector)
+ - [Request Handlers (Middleware)](#-request-handlers--middleware)
+ - [Guides & Reference](#-guides--reference)
+ - [Known Limitations](#-known-limitations)
  - [Debugging](#-debugging)
  - [Contributing](#-contributing)
  - [License](#-license)

- ## Features
+ <a href="#" align="center"><img alt="Inference Proxy / Gateway" src="https://raw.githubusercontent.com/Nayjest/lm-proxy/main/press-kit/assets/lm-proxy_1_hacker_1600x672.png"></a>
+
+
+ ## ✨ Features<a id="-features"></a>

  - **Provider Agnostic**: Connect to OpenAI, Anthropic, Google AI, local models, and more using a single API
  - **Unified Interface**: Access all models through the standard OpenAI API format
@@ -108,21 +127,28 @@ It works as a drop-in replacement for OpenAI's API, allowing you to switch betwe
  - **Easy Configuration**: Simple TOML/YAML/JSON/Python configuration files for setup
  - **Extensible by Design**: Minimal core with clearly defined extension points, enabling seamless customization and expansion without modifying the core system.

- ## 🚀 Getting Started
+
+ ## 🚀 Getting Started<a id="-getting-started"></a>

  ### Requirements
  Python 3.11 | 3.12 | 3.13

- ### Installation
-
+ ### Installation<a id="installation"></a>
  ```bash
  pip install inference-proxy
  ```
+ For proxying to Anthropic API or Google Gemini via Vertex AI or Google AI Studio, install optional dependencies:
+ ```
+ pip install inference-proxy[anthropic,google]
+ ```
+ or
+ ```
+ pip install inference-proxy[all]
+ ```

- ### Quick Start
+ ### Quick Start<a id="quick-start"></a>

  #### 1. Create a `config.toml` file:
-
  ```toml
  host = "0.0.0.0"
  port = 8000
@@ -149,7 +175,6 @@ api_keys = ["YOUR_API_KEY_HERE"]
  > To enhance security, consider storing upstream API keys in operating system environment variables rather than embedding them directly in the configuration file. You can reference these variables in the configuration using the env:<VAR_NAME> syntax.

  #### 2. Start the server:
-
  ```bash
  inference-proxy
  ```
@@ -159,7 +184,6 @@ python -m lm_proxy
  ```

  #### 3. Use it with any OpenAI-compatible client:
-
  ```python
  from openai import OpenAI

@@ -176,7 +200,6 @@ print(completion.choices[0].message.content)
  ```

  Or use the same endpoint with Claude models:
-
  ```python
  completion = client.chat.completions.create(
      model="claude-opus-4-1-20250805", # This will be routed to Anthropic based on config
@@ -184,12 +207,12 @@ completion = client.chat.completions.create(
  )
  ```

- ## 📝 Configuration

- Inference Proxy is configured through a TOML/YAML/JSON/Python file that specifies connections, routing rules, and access control.
+ ## 📝 Configuration<a id="-configuration"></a>

- ### Basic Structure
+ Inference Proxy is configured through a TOML/YAML/JSON/Python file that specifies connections, routing rules, and access control.

+ ### Basic Structure<a id="basic-structure"></a>
  ```toml
  host = "0.0.0.0" # Interface to bind to
  port = 8000 # Port to listen on
@@ -248,19 +271,18 @@ created_at = "created_at"
  duration = "duration"
  ```

- ### Environment Variables
+ ### Environment Variables<a id="environment-variables"></a>

  You can reference environment variables in your configuration file by prefixing values with `env:`.

  For example:
-
  ```toml
  [connections.openai]
  api_key = "env:OPENAI_API_KEY"
  ```

  At runtime, Inference Proxy automatically retrieves the value of the target variable
- (OPENAI_API_KEY) from your operating systems environment or from a .env file, if present.
+ (OPENAI_API_KEY) from your operating system's environment or from a .env file, if present.

  ### .env Files

@@ -280,7 +302,6 @@ LM_PROXY_DEBUG=no
  ```

  You can also control `.env` file usage with the `--env` command-line option:
-
  ```bash
  # Use a custom .env file path
  inference-proxy --env="path/to/your/.env"
@@ -288,7 +309,8 @@ inference-proxy --env="path/to/your/.env"
  inference-proxy --env=""
  ```

- ## 🔑 Proxy API Keys vs. Provider API Keys
+
+ ## 🔑 Proxy API Keys vs. Provider API Keys<a id="-proxy-api-keys-vs-provider-api-keys"></a>

  Inference Proxy utilizes two distinct types of API keys to facilitate secure and efficient request handling.

@@ -309,18 +331,17 @@ This distinction ensures a clear separation of concerns:
  Virtual API Keys manage user authentication and access within the proxy,
  while Upstream API Keys handle secure communication with external providers.

- ## 🔌 API Usage

- Inference Proxy implements the OpenAI chat completions API endpoint. You can use any OpenAI-compatible client to interact with it.
+ ## 🔌 API Usage<a id="-api-usage"></a>

- ### Chat Completions Endpoint
+ Inference Proxy implements the OpenAI chat completions API endpoint. You can use any OpenAI-compatible client to interact with it.

+ ### Chat Completions Endpoint<a id="chat-completions-endpoint"></a>
  ```http
  POST /v1/chat/completions
  ```

  #### Request Format
-
  ```json
  {
    "model": "gpt-3.5-turbo",
@@ -334,7 +355,6 @@ POST /v1/chat/completions
  ```

  #### Response Format
-
  ```json
  {
    "choices": [
@@ -351,12 +371,10 @@ POST /v1/chat/completions
  ```


- ### Models List Endpoint
+ ### Models List Endpoint<a id="models-list-endpoint"></a>


  List and describe all models available through the API.
-
-
  ```http
  GET /v1/models
  ```
@@ -366,7 +384,6 @@ Routing keys can reference both **exact model names** and **model name patterns*

  By default, wildcard patterns are displayed as-is in the models list (e.g., `"gpt*"`, `"claude*"`).
  This behavior can be customized via the `model_listing_mode` configuration option:
-
  ```
  model_listing_mode = "as_is" | "ignore_wildcards" | "expand_wildcards"
  ```
@@ -399,7 +416,6 @@ api_key = "env:ANTHROPIC_API_KEY"


  #### Response Format
-
  ```json
  {
    "object": "list",
@@ -420,23 +436,22 @@ api_key = "env:ANTHROPIC_API_KEY"
  }
  ```

- ## 🔒 User Groups Configuration
+
+ ## 🔒 User Groups Configuration<a id="-user-groups-configuration"></a>

  The `[groups]` section in the configuration defines access control rules for different user groups.
  Each group can have its own set of virtual API keys and permitted connections.

- ### Basic Group Definition
-
+ ### Basic Group Definition<a id="basic-group-definition"></a>
  ```toml
  [groups.default]
  api_keys = ["KEY1", "KEY2"]
  allowed_connections = "*" # Allow access to all connections
  ```

- ### Group-based Access Control
+ ### Group-based Access Control<a id="group-based-access-control"></a>

  You can create multiple groups to segment your users and control their access:
-
  ```toml
  # Admin group with full access
  [groups.admin]
@@ -454,7 +469,7 @@ api_keys = ["FREE_KEY_1", "FREE_KEY_2"]
  allowed_connections = "openai" # Only allowed to use OpenAI connection
  ```

- ### Connection Restrictions
+ ### Connection Restrictions<a id="connection-restrictions"></a>

  The `allowed_connections` parameter controls which upstream providers a group can access:

@@ -468,7 +483,8 @@ This allows fine-grained control over which users can access which AI providers,
  - Implementing usage quotas per group
  - Billing and cost allocation by user group

- ### Virtual API Key Validation
+
+ ### Virtual API Key Validation<a id="virtual-api-key-validation"></a>

  #### Overview

@@ -485,7 +501,6 @@ In the .py config representation, the validator function can be passed directly

  #### Example configuration for external API key validation using HTTP request to Keycloak / OpenID Connect

  This example shows how to validate API keys against an external service (e.g., Keycloak):
-
  ```toml
  [api_key_check]
  class = "lm_proxy.api_key_check.CheckAPIKeyWithRequest"
@@ -502,7 +517,6 @@ Authorization = "Bearer {api_key}"

  For more advanced authentication needs,
  you can implement a custom validator function:
-
  ```python
  # my_validators.py
  def validate_api_key(api_key: str) -> str | None:
  ```
524
538
 
525
539
  Then reference it in your config:
526
-
527
540
  ```toml
528
541
  api_key_check = "my_validators.validate_api_key"
529
542
  ```
@@ -531,11 +544,11 @@ api_key_check = "my_validators.validate_api_key"
  > In this case, the `api_keys` lists in groups are ignored, and the custom function is responsible for all validation logic.


- ## 🛠️ Advanced Usage
- ### Dynamic Model Routing
+ ## 🛠️ Advanced Usage<a id="-advanced-usage"></a>

- The routing section allows flexible pattern matching with wildcards:
+ ### Dynamic Model Routing<a id="dynamic-model-routing"></a>

+ The routing section allows flexible pattern matching with wildcards:
  ```toml
  [routing]
  "gpt-4*" = "openai.gpt-4" # Route gpt-4 requests to OpenAI GPT-4
@@ -548,18 +561,19 @@ The routing section allows flexible pattern matching with wildcards:
  Keys are model name patterns (with `*` wildcard support), and values are connection/model mappings.
  Connection names reference those defined in the `[connections]` section.
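
To make the wildcard semantics concrete, here is a minimal illustrative sketch of how a requested model name could be resolved against such patterns with Python's `fnmatch` (editorial illustration only, not the package's actual routing code; the `"*"` catch-all entry is hypothetical):

```python
# Illustrative only: resolve a requested model name against wildcard routing
# patterns like the [routing] table above.
from fnmatch import fnmatch

routing = {
    "gpt-4*": "openai.gpt-4",     # taken from the example above
    "*": "openai.gpt-3.5-turbo",  # hypothetical catch-all fallback
}

def resolve(model: str) -> str:
    for pattern, target in routing.items():
        if fnmatch(model, pattern):
            return target
    raise LookupError(f"no route for model {model!r}")

print(resolve("gpt-4o"))  # -> openai.gpt-4
```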

- ### Load Balancing Example
+ ### Load Balancing Example<a id="load-balancing-example"></a>

  - [Simple load-balancer configuration](https://github.com/Nayjest/lm-proxy/blob/main/examples/load_balancer_config.py)
  This example demonstrates how to set up a load balancer that randomly
  distributes requests across multiple language model servers using the lm_proxy.

- ### Google Vertex AI Configuration Example
+ ### Google Vertex AI Configuration Example<a id="google-vertex-ai-configuration-example"></a>
  - [vertex-ai.toml](https://github.com/Nayjest/lm-proxy/blob/main/examples/vertex-ai.toml)
  This example demonstrates how to connect Inference Proxy to Google Gemini model via Vertex AI API

- ### Using Tokens from OIDC Provider as Virtual/Client API Keys
+
+ ### Using Tokens from OIDC Provider as Virtual/Client API Keys<a id="using-tokens-from-oidc-provider-as-virtualclient-api-keys"></a>
+
576
+ ### Using Tokens from OIDC Provider as Virtual/Client API Keys<a id="using-tokens-from-oidc-provider-as-virtualclient-api-keys"></a>
563
577
 
564
578
  You can configure Inference Proxy to validate tokens from OpenID Connect (OIDC) providers like Keycloak, Auth0, or Okta as API keys.
565
579
 
@@ -594,9 +608,95 @@ Authorization = "Bearer {api_key}"

  Clients pass their OIDC access token as the API key when making requests to Inference Proxy.
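
As an illustration of that flow (editorial sketch; the endpoint, model name, and token below are placeholders), a client that already holds an OIDC access token simply supplies it in place of an API key:

```python
# Illustrative client-side usage: the OIDC access token is sent as the API key,
# and Inference Proxy validates it via the configured api_key_check backend.
from openai import OpenAI

oidc_access_token = "eyJhbGciOi..."  # obtained from Keycloak/Auth0/Okta beforehand

client = OpenAI(
    base_url="http://localhost:8000/v1",  # your Inference Proxy endpoint
    api_key=oidc_access_token,
)
completion = client.chat.completions.create(
    model="gpt-3.5-turbo",  # any model name your routing config accepts
    messages=[{"role": "user", "content": "Hello!"}],
)
print(completion.choices[0].message.content)
```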

- ## 🧩 Add-on Components

- ### Database Connector
+ ## 🪝 Request Handlers (Middleware)<a id="-request-handlers--middleware"></a>
+
+ Handlers intercept and modify requests *before* they reach the upstream LLM provider. They enable cross-cutting concerns such as rate limiting, logging, auditing, and header manipulation.
+
+ Handlers are defined in the `before` list within the configuration file and execute sequentially in the order specified.
+
+ ### Built-in Handlers
+
+ Inference Proxy includes several built-in handlers for common operational needs.
+
+ #### Rate Limiter
+
+ The `RateLimiter` protects upstream credentials and manages traffic load using a sliding window algorithm.
+
+ **Parameters:**
+
+ | Parameter | Type | Description |
+ |-----------|------|-------------|
+ | `max_requests` | int | Maximum number of requests allowed per window |
+ | `window_seconds` | int | Duration of the sliding window in seconds |
+ | `per` | string | Scope of the limit: `api_key`, `ip`, `connection`, `group`, or `global` |
+
+ **Configuration:**
+ ```toml
+ [[before]]
+ class = "lm_proxy.handlers.RateLimiter"
+ max_requests = 10
+ window_seconds = 60
+ per = "api_key"
+
+ [[before]]
+ class = "lm_proxy.handlers.RateLimiter"
+ max_requests = 1000
+ window_seconds = 300
+ per = "global"
+ ```
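
For intuition, the sliding-window behaviour can be pictured with a short sketch (simplified editorial illustration; the actual `lm_proxy.handlers.RateLimiter` implementation may differ):

```python
# Simplified sliding-window limiter: keep request timestamps per scope key
# (e.g., per API key) and reject a request once the window is full.
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    def __init__(self, max_requests: int, window_seconds: int):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.events: dict[str, deque] = defaultdict(deque)

    def allow(self, scope_key: str) -> bool:
        now = time.monotonic()
        window = self.events[scope_key]
        while window and now - window[0] > self.window_seconds:
            window.popleft()  # drop timestamps that slid out of the window
        if len(window) >= self.max_requests:
            return False      # limit reached for this scope
        window.append(now)
        return True


limiter = SlidingWindowLimiter(max_requests=10, window_seconds=60)
print(limiter.allow("some-api-key"))  # True until the 11th call within 60 s
```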
+
+ #### HTTP Headers Forwarder
+
+ The `HTTPHeadersForwarder` passes specific headers from incoming client requests to the upstream provider—useful for distributed tracing or tenant context propagation.
+
+ Sensitive headers (`Authorization`, `Host`, `Content-Length`) are stripped by default to prevent protocol corruption and credential leaks.
+ ```toml
+ [[before]]
+ class = "lm_proxy.handlers.HTTPHeadersForwarder"
+ white_list_headers = ["x-trace-id", "x-correlation-id", "x-tenant-id"]
+ ```
+ See also [HTTP Header Management](https://github.com/Nayjest/lm-proxy/blob/main/doc/http_headers.md).
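
On the client side, whitelisted headers like these can be attached to every request, for example via the OpenAI SDK's `default_headers` option (editorial sketch; endpoint and key are placeholders):

```python
# Illustrative: attach tracing/tenant headers that the proxy is configured
# to forward upstream via HTTPHeadersForwarder.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # your Inference Proxy endpoint
    api_key="YOUR_VIRTUAL_API_KEY",
    default_headers={
        "x-trace-id": "7f9c2c4e-trace",
        "x-tenant-id": "acme-corp",
    },
)
```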
+
+ ### Custom Handlers
+
+ Extend functionality by implementing custom handlers in Python. A handler is any callable (function or class instance) that accepts a `RequestContext`.
+
+ #### Interface
+ ```python
+ from lm_proxy.base_types import RequestContext
+
+ async def my_custom_handler(ctx: RequestContext) -> None:
+     # Implementation here
+     pass
+ ```
+
+ #### Example: Audit Logger
+ ```python
+ # my_extensions.py
+ import logging
+ from lm_proxy.base_types import RequestContext
+
+ class AuditLogger:
+     def __init__(self, prefix: str = "AUDIT"):
+         self.prefix = prefix
+
+     async def __call__(self, ctx: RequestContext) -> None:
+         user = ctx.user_info.get("name", "anonymous")
+         logging.info(f"[{self.prefix}] User '{user}' requested model '{ctx.model}'")
+ ```
+
+ **Registration:**
+ ```toml
+ [[before]]
+ class = "my_extensions.AuditLogger"
+ prefix = "SECURITY_AUDIT"
+ ```
+
+
+ ## 🧩 Add-on Components<a id="-add-on-components"></a>
+
+ ### Database Connector<a id="database-connector"></a>

  [inference-proxy-db-connector](https://github.com/nayjest/lm-proxy-db-connector) is a lightweight SQLAlchemy-based connector that enables Inference Proxy to work with relational databases including PostgreSQL, MySQL/MariaDB, SQLite, Oracle, Microsoft SQL Server, and many others.

@@ -605,7 +705,21 @@ Clients pass their OIDC access token as the API key when making requests to Infe
  - Share database connections across components, extensions, and custom functions
  - Built-in database logger for structured logging of AI request data

- ## 🔍 Debugging
+
+ ## 📚 Guides & Reference<a id="-guides--reference"></a>
+
+ For more detailed information, check out these articles:
+ - [HTTP Header Management](https://github.com/Nayjest/lm-proxy/blob/main/doc/http_headers.md)
+
+
+ ## 🚧 Known Limitations<a id="-known-limitations"></a>
+
+ - **Multiple generations (n > 1):** When proxying requests to Google or Anthropic APIs, only the first generation is returned. Multi-generation support is tracked in [#35](https://github.com/Nayjest/lm-proxy/issues/35).
+
+ - **Model listing with wildcards / forwarding actual model metadata:** The `/v1/models` endpoint does not query upstream providers to expand wildcard patterns (e.g., `gpt*`) or fetch model metadata. Only explicitly defined model names are listed [#36](https://github.com/Nayjest/lm-proxy/issues/36).
+
+
+ ## 🔍 Debugging<a id="-debugging"></a>

  ### Overview
  When **debugging mode** is enabled,
@@ -629,7 +743,7 @@ Alternatively, you can enable or disable debugging via the command-line argument
  > CLI arguments override environment variable settings.


- ## 🤝 Contributing
+ ## 🤝 Contributing<a id="-contributing"></a>

  Contributions are welcome! Please feel free to submit a Pull Request.

@@ -640,7 +754,7 @@ Contributions are welcome! Please feel free to submit a Pull Request.
  5. Open a Pull Request


- ## 📄 License
+ ## 📄 License<a id="-license"></a>

  This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
  © 2025–2026 [Vitalii Stepanenko](mailto:mail@vitaliy.in)