PyPI - inference-proxy - Versions diffs - 0.2.2__tar.gz → 0.3.0__tar.gz - Mend

Files changed (15)

{inference_proxy-0.2.2 → inference_proxy-0.3.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.3
 Name: inference-proxy
-Version: 0.2.2
+Version: 0.3.0
 Summary: "Inference Proxy" is OpenAI-compatible http proxy server for inferencing various LLMs capable of working with Google, Anthropic, OpenAI APIs, local PyTorch inference, etc.
 License: MIT License
@@ -158,6 +158,10 @@ api_key = "env:OPENAI_API_KEY"
 api_type = "google_ai_studio"
 api_key = "env:GOOGLE_API_KEY"
+[connections.anthropic]
+api_type = "anthropic"
+api_key  = "env:ANTHROPIC_API_KEY"
 # Routing rules (model_pattern = "connection.model")
 [routing]
 "gpt*" = "openai.*"     # Route all GPT models to OpenAI
@@ -171,6 +175,25 @@ api_keys = [
     "KEY1",
     "KEY2"
 ]
+# optional
+[[loggers]]
+class = 'lm_proxy.loggers.BaseLogger'
+[loggers.log_writer]
+class = 'lm_proxy.loggers.log_writers.JsonLogWriter'
+file_name = 'storage/json.log'
+[loggers.entry_transformer]
+class = 'lm_proxy.loggers.LogEntryTransformer'
+completion_tokens = "response.usage.completion_tokens"
+prompt_tokens = "response.usage.prompt_tokens"
+prompt = "request.messages"
+response = "response"
+group = "group"
+connection = "connection"
+api_key_id = "api_key_id"
+remote_addr = "remote_addr"
+created_at = "created_at"
+duration = "duration"
 ```
 ### Environment Variables
@@ -184,6 +207,28 @@ api_key = "env:OPENAI_API_KEY"
 Load these from a `.env` file or set them in your environment before starting the server.
+## 🔑 Proxy API Keys vs. Provider API Keys
+Inference Proxy utilizes two distinct types of API keys to facilitate secure and efficient request handling.
+- **Proxy API Key (Virtual API Key, Client API Key):**
+A unique key generated and managed within the Inference Proxy.
+Clients use these keys to authenticate their requests to the proxy's API endpoints.
+Each Client API Key is associated with a specific group, which defines the scope of access and permissions for the client's requests.
+These keys allow users to securely interact with the proxy without direct access to external service credentials.
+- **Provider API Key (Upstream API Key):**
+A key provided by external LLM inference providers (e.g., OpenAI, Anthropic, Mistral, etc.) and configured within the Inference Proxy.
+The proxy uses these keys to authenticate and forward validated client requests to the respective external services.
+Provider API Keys remain hidden from end users, ensuring secure and transparent communication with provider APIs.
+This distinction ensures a clear separation of concerns:
+Virtual API Keys manage user authentication and access within the proxy,
+while Upstream API Keys handle secure communication with external providers.
 ## 🔌 API Usage
 Inference Proxy implements the OpenAI chat completions API endpoint. You can use any OpenAI-compatible client to interact with it.

{inference_proxy-0.2.2 → inference_proxy-0.3.0}/README.md RENAMED

@@ -112,6 +112,10 @@ api_key = "env:OPENAI_API_KEY"
 api_type = "google_ai_studio"
 api_key = "env:GOOGLE_API_KEY"
+[connections.anthropic]
+api_type = "anthropic"
+api_key  = "env:ANTHROPIC_API_KEY"
 # Routing rules (model_pattern = "connection.model")
 [routing]
 "gpt*" = "openai.*"     # Route all GPT models to OpenAI
@@ -125,6 +129,25 @@ api_keys = [
     "KEY1",
     "KEY2"
 ]
+# optional
+[[loggers]]
+class = 'lm_proxy.loggers.BaseLogger'
+[loggers.log_writer]
+class = 'lm_proxy.loggers.log_writers.JsonLogWriter'
+file_name = 'storage/json.log'
+[loggers.entry_transformer]
+class = 'lm_proxy.loggers.LogEntryTransformer'
+completion_tokens = "response.usage.completion_tokens"
+prompt_tokens = "response.usage.prompt_tokens"
+prompt = "request.messages"
+response = "response"
+group = "group"
+connection = "connection"
+api_key_id = "api_key_id"
+remote_addr = "remote_addr"
+created_at = "created_at"
+duration = "duration"
 ```
 ### Environment Variables
@@ -138,6 +161,28 @@ api_key = "env:OPENAI_API_KEY"
 Load these from a `.env` file or set them in your environment before starting the server.
+## 🔑 Proxy API Keys vs. Provider API Keys
+Inference Proxy utilizes two distinct types of API keys to facilitate secure and efficient request handling.
+- **Proxy API Key (Virtual API Key, Client API Key):**
+A unique key generated and managed within the Inference Proxy.
+Clients use these keys to authenticate their requests to the proxy's API endpoints.
+Each Client API Key is associated with a specific group, which defines the scope of access and permissions for the client's requests.
+These keys allow users to securely interact with the proxy without direct access to external service credentials.
+- **Provider API Key (Upstream API Key):**
+A key provided by external LLM inference providers (e.g., OpenAI, Anthropic, Mistral, etc.) and configured within the Inference Proxy.
+The proxy uses these keys to authenticate and forward validated client requests to the respective external services.
+Provider API Keys remain hidden from end users, ensuring secure and transparent communication with provider APIs.
+This distinction ensures a clear separation of concerns:
+Virtual API Keys manage user authentication and access within the proxy,
+while Upstream API Keys handle secure communication with external providers.
 ## 🔌 API Usage
 Inference Proxy implements the OpenAI chat completions API endpoint. You can use any OpenAI-compatible client to interact with it.
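
As a rough illustration of the API usage statement above, the sketch below builds (but does not send) a chat completions request using only the Python standard library. Several things here are assumptions rather than facts from this diff: the proxy is taken to listen on `http://localhost:8000`, the Proxy API Key is taken to be passed as an OpenAI-style `Authorization: Bearer` header, `KEY1` and `gpt-4o-mini` are placeholder values, and `build_chat_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_chat_request(
    base_url: str, proxy_api_key: str, model: str, messages: list[dict]
) -> urllib.request.Request:
    """Build a POST request for the OpenAI-style chat completions endpoint."""
    payload = {"model": model, "messages": messages}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the proxy accepts the client key as a Bearer token.
            "Authorization": f"Bearer {proxy_api_key}",
        },
        method="POST",
    )


request = build_chat_request(
    "http://localhost:8000",  # assumed host and port
    "KEY1",  # placeholder Proxy API Key
    "gpt-4o-mini",  # placeholder model name
    [{"role": "user", "content": "Hello"}],
)

# To actually send it (requires a running proxy):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["choices"][0]["message"]["content"])
```

Only the Proxy API Key appears in the client request; the Provider API Key stays in the proxy's configuration.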

{inference_proxy-0.2.2 → inference_proxy-0.3.0}/lm_proxy/app.py RENAMED

@@ -14,7 +14,7 @@ def run_server(
     config: str = typer.Option(None, help="Path to the configuration file"),
     debug: bool = typer.Option(False, help="Enable debug mode (more verbose logging)"),
 ):
-    bootstrap(config or 'config.toml')
+    bootstrap(config or "config.toml")
     uvicorn.run(
         "lm_proxy.app:web_app",
         host=env.config.host,
@@ -25,7 +25,9 @@ def run_server(
 def web_app():
-    app = FastAPI(title="LM-Proxy", description="OpenAI-compatible proxy server for LLM inference")
+    app = FastAPI(
+        title="LM-Proxy", description="OpenAI-compatible proxy server for LLM inference"
+    )
     app.add_api_route(
         path="/v1/chat/completions",
         endpoint=chat_completions,

{inference_proxy-0.2.2 → inference_proxy-0.3.0}/lm_proxy/bootstrap.py RENAMED

@@ -55,12 +55,13 @@ class Env:
                     env.connections[conn_name] = conn_config
                 else:
                     mc.configure(
-                        **conn_config,
-                        EMBEDDING_DB_TYPE=mc.EmbeddingDbType.NONE
+                        **conn_config, EMBEDDING_DB_TYPE=mc.EmbeddingDbType.NONE
                     )
                     env.connections[conn_name] = mc.env().llm_async_function
             except mc.LLMConfigError as e:
-                raise ValueError(f"Error in configuration for connection '{conn_name}': {e}")
+                raise ValueError(
+                    f"Error in configuration for connection '{conn_name}': {e}"
+                )
         logging.info(f"Done initializing {len(env.connections)} connections.")
@@ -68,9 +69,9 @@ class Env:
 env = Env()
-def bootstrap(config: str | Config = 'config.toml'):
-    load_dotenv('.env', override=True)
-    debug = '--debug' in sys.argv or get_bool_from_env('LM_PROXY_DEBUG', False)
+def bootstrap(config: str | Config = "config.toml"):
+    load_dotenv(".env", override=True)
+    debug = "--debug" in sys.argv or get_bool_from_env("LM_PROXY_DEBUG", False)
     setup_logging(logging.DEBUG if debug else logging.INFO)
     mc.logging.LoggingConfig.OUTPUT_METHOD = logging.info
     logging.info(

{inference_proxy-0.2.2 → inference_proxy-0.3.0}/lm_proxy/config.py RENAMED

@@ -2,6 +2,7 @@
 Configuration models for LM-Proxy settings.
 This module defines Pydantic models that match the structure of config.toml.
 """
 import os
 from typing import Union, Callable
 import tomllib
@@ -10,6 +11,8 @@ import importlib.util
 from pydantic import BaseModel, Field, ConfigDict
 from microcore.utils import resolve_callable
+from .utils import resolve_instance_or_callable
 class Group(BaseModel):
     api_keys: list[str] = Field(default_factory=list)
@@ -24,20 +27,30 @@ class Group(BaseModel):
 class Config(BaseModel):
     """Main configuration model matching config.toml structure."""
     model_config = ConfigDict(extra="forbid")
     enabled: bool = True
     host: str = "0.0.0.0"
     port: int = 8000
     dev_autoreload: bool = False
-    connections: dict[str, Union[dict, Callable]]
+    connections: dict[str, Union[dict, Callable]] = Field(
+        ...,  # Required field (no default)
+        description="Dictionary of connection configurations",
+        examples=[{"openai": {"api_key": "sk-..."}}],
+    )
     routing: dict[str, str] = Field(default_factory=dict)
     """ model_name_pattern* => connection_name.< model | * >, example: {"gpt-*": "oai.*"} """
     groups: dict[str, Group] = Field(default_factory=dict)
     check_api_key: Union[str, Callable] = Field(default="lm_proxy.core.check_api_key")
+    loggers: list[Union[str, Callable, dict]] = Field(default_factory=list)
+    encryption_key: str = Field(
+        default="Eclipse", description="Key for encrypting sensitive data"
+    )
     def __init__(self, **data):
         super().__init__(**data)
         self.check_api_key = resolve_callable(self.check_api_key)
+        self.loggers = [resolve_instance_or_callable(logger) for logger in self.loggers]
         if not self.groups:
             # Default group with no restrictions
             self.groups = {"default": Group()}

{inference_proxy-0.2.2 → inference_proxy-0.3.0}/lm_proxy/core.py RENAMED

@@ -4,16 +4,20 @@ import json
 import logging
 import secrets
 import time
+import hashlib
 from typing import List, Optional
 import microcore as mc
 from fastapi import HTTPException
+from lm_proxy.loggers import LogEntry
 from pydantic import BaseModel
 from starlette.requests import Request
 from starlette.responses import JSONResponse, Response, StreamingResponse
 from .bootstrap import env
 from .config import Config, Group
+from .loggers import log_non_blocking
+from .utils import get_client_ip
 class ChatCompletionRequest(BaseModel):
@@ -30,7 +34,9 @@ class ChatCompletionRequest(BaseModel):
     user: Optional[str] = None
-def resolve_connection_and_model(config: Config, external_model: str) -> tuple[str, str]:
+def resolve_connection_and_model(
+    config: Config, external_model: str
+) -> tuple[str, str]:
     for model_match, rule in config.routing.items():
         if fnmatch.fnmatchcase(external_model, model_match):
             connection_name, model_part = rule.split(".", 1)
@@ -45,11 +51,14 @@ def resolve_connection_and_model(config: Config, external_model: str) -> tuple[s
     raise ValueError(
         f"No routing rule matched model '{external_model}'. "
-        "Add a catch-all rule like \"*\" = \"openai.gpt-3.5-turbo\" if desired."
+        'Add a catch-all rule like "*" = "openai.gpt-3.5-turbo" if desired.'
     )
-async def process_stream(async_llm_func, prompt, llm_params):
+async def process_stream(
+    async_llm_func, request: ChatCompletionRequest, llm_params, log_entry: LogEntry
+):
+    prompt = request.messages
     queue = asyncio.Queue()
     stream_id = f"chatcmpl-{secrets.token_hex(12)}"
     created = int(time.time())
@@ -67,20 +76,18 @@ async def process_stream(async_llm_func, prompt, llm_params):
             "choices": [{"index": 0, "delta": delta}],
         }
         if error is not None:
-            obj['error'] = {'message': str(error), 'type': type(error).__name__}
+            obj["error"] = {"message": str(error), "type": type(error).__name__}
             if finish_reason is None:
-                finish_reason = 'error'
+                finish_reason = "error"
         if finish_reason is not None:
-            obj['choices'][0]['finish_reason'] = finish_reason
+            obj["choices"][0]["finish_reason"] = finish_reason
         return "data: " + json.dumps(obj) + "\n\n"
-    task = asyncio.create_task(
-        async_llm_func(prompt, **llm_params, callback=callback)
-    )
+    task = asyncio.create_task(async_llm_func(prompt, **llm_params, callback=callback))
     try:
         # Initial chunk: role
-        yield make_chunk(delta={'role': 'assistant'})
+        yield make_chunk(delta={"role": "assistant"})
         while not task.done():
             try:
@@ -96,13 +103,16 @@ async def process_stream(async_llm_func, prompt, llm_params):
     finally:
         try:
-            await task
+            result = await task
+            log_entry.response = result
         except Exception as e:
-            yield make_chunk(error={'message': str(e), 'type': type(e).__name__})
+            log_entry.error = e
+            yield make_chunk(error={"message": str(e), "type": type(e).__name__})
     # Final chunk: finish_reason
-    yield make_chunk(finish_reason='stop')
+    yield make_chunk(finish_reason="stop")
     yield "data: [DONE]\n\n"
+    await log_non_blocking(log_entry)
 def read_api_key(request: Request) -> str:
@@ -116,13 +126,33 @@ def read_api_key(request: Request) -> str:
     return ""
-def check_api_key(api_key: Optional[str]) -> Group:
+def check_api_key(api_key: Optional[str]) -> Optional[Group]:
+    """
+    Validates an Client API key against configured groups and returns the matching group.
+    Args:
+        api_key (Optional[str]): The Virtual / Client API key to validate.
+    Returns:
+        Optional[Group]: The Group object if the API key is valid and found in a group,
+        None otherwise.
+    """
     for group_name, group in env.config.groups.items():
         if api_key in group.api_keys:
             return group_name
+    return None
+def api_key_id(api_key: Optional[str]) -> str | None:
+    if not api_key:
+        return None
+    return hashlib.md5(
+        (api_key + env.config.encryption_key).encode("utf-8")
+    ).hexdigest()
-async def chat_completions(request: ChatCompletionRequest, raw_request: Request) -> Response:
+async def chat_completions(
+    request: ChatCompletionRequest, raw_request: Request
+) -> Response:
     """
     Endpoint for chat completions that mimics OpenAI's API structure.
     Streams the response from the LLM using microcore.
@@ -141,13 +171,19 @@ async def chat_completions(request: ChatCompletionRequest, raw_request: Request)
         )
     api_key = read_api_key(raw_request)
     group: str | bool | None = (env.config.check_api_key)(api_key)
+    log_entry = LogEntry(
+        request=request,
+        api_key_id=api_key_id(api_key),
+        group=group if isinstance(group, str) else None,
+        remote_addr=get_client_ip(raw_request),
+    )
     if not group:
         raise HTTPException(
             status_code=403,
             detail={
                 "error": {
                     "message": "Incorrect API key provided: "
-                               "your API key is invalid, expired, or revoked.",
+                    "your API key is invalid, expired, or revoked.",
                     "type": "invalid_request_error",
                     "param": None,
                     "code": "invalid_api_key",
@@ -155,17 +191,17 @@ async def chat_completions(request: ChatCompletionRequest, raw_request: Request)
             },
         )
-    llm_params = request.model_dump(exclude={'messages'}, exclude_none=True)
+    llm_params = request.model_dump(exclude={"messages"}, exclude_none=True)
     connection, llm_params["model"] = resolve_connection_and_model(
-        env.config,
-        llm_params.get("model", "default_model")
+        env.config, llm_params.get("model", "default_model")
     )
+    log_entry.connection = connection
     logging.debug(
         "Resolved routing for [%s] --> connection: %s, model: %s",
         request.model,
         connection,
-        llm_params["model"]
+        llm_params["model"],
     )
     if not env.config.groups[group].allows_connecting_to(connection):
@@ -186,18 +222,27 @@ async def chat_completions(request: ChatCompletionRequest, raw_request: Request)
     logging.info("Querying LLM... params: %s", llm_params)
     if request.stream:
         return StreamingResponse(
-            process_stream(async_llm_func, request.messages, llm_params),
-            media_type="text/event-stream"
+            process_stream(async_llm_func, request, llm_params, log_entry),
+            media_type="text/event-stream",
         )
-    out = await async_llm_func(request.messages, **llm_params)
-    logging.info("LLM response: %s", out)
+    try:
+        out = await async_llm_func(request.messages, **llm_params)
+        log_entry.response = out
+        logging.info("LLM response: %s", out)
+    except Exception as e:
+        log_entry.error = e
+        await log_non_blocking(log_entry)
+        raise
+    await log_non_blocking(log_entry)
     return JSONResponse(
         {
             "choices": [
                 {
                     "index": 0,
                     "message": {"role": "assistant", "content": str(out)},
-                    "finish_reason": "stop"
+                    "finish_reason": "stop",
                 }
             ]
         }
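
The new `api_key_id` helper in this file derives a loggable identifier from the client key and `encryption_key`, so logs need not contain the raw key. The sketch below restates that derivation in standalone form and, for comparison, shows an HMAC-SHA256 variant. The HMAC variant is not part of the package; it is a commonly used alternative, and both function names are hypothetical.

```python
import hashlib
import hmac
from typing import Optional


def md5_key_id(api_key: Optional[str], secret: str) -> Optional[str]:
    """Same idea as the diff: MD5 over the key concatenated with a secret."""
    if not api_key:
        return None
    return hashlib.md5((api_key + secret).encode("utf-8")).hexdigest()


def hmac_key_id(api_key: Optional[str], secret: str) -> Optional[str]:
    """Alternative (not in the package): HMAC-SHA256 keyed with the secret."""
    if not api_key:
        return None
    return hmac.new(
        secret.encode("utf-8"), api_key.encode("utf-8"), hashlib.sha256
    ).hexdigest()


# "Eclipse" is the default encryption_key shown in lm_proxy/config.py.
print(md5_key_id("KEY1", "Eclipse"))
print(hmac_key_id("KEY1", "Eclipse"))
```

Either way the identifier is stable for a given key and secret, which lets log entries be grouped per client. Since the default secret is published in the source, a deployment that relies on these identifiers would presumably want to set its own `encryption_key`.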

inference_proxy-0.3.0/lm_proxy/loggers/__init__.py ADDED

@@ -0,0 +1,11 @@
+from .base_logger import BaseLogger, LogEntryTransformer
+from .log_writers import JsonLogWriter
+from .core import LogEntry, log_non_blocking
+__all__ = [
+    "BaseLogger",
+    "LogEntryTransformer",
+    "JsonLogWriter",
+    "LogEntry",
+    "log_non_blocking",
+]

inference_proxy-0.3.0/lm_proxy/loggers/base_logger.py ADDED

@@ -0,0 +1,56 @@
+import abc
+from dataclasses import dataclass, field
+from lm_proxy.utils import resolve_instance_or_callable
+from ..utils import resolve_obj_path
+from .core import LogEntry
+class AbstractLogEntryTransformer(abc.ABC):
+    @abc.abstractmethod
+    def __call__(self, log_entry: LogEntry) -> dict:
+        raise NotImplementedError()
+class LogEntryTransformer(AbstractLogEntryTransformer):
+    def __init__(self, **kwargs):
+        self.mapping = kwargs
+    def __call__(self, log_entry: LogEntry) -> dict:
+        result = {}
+        for key, path in self.mapping.items():
+            result[key] = resolve_obj_path(log_entry, path)
+        return result
+class AbstractLogWriter(abc.ABC):
+    @abc.abstractmethod
+    def __call__(self, logged_data: dict) -> dict:
+        raise NotImplementedError()
+@dataclass
+class BaseLogger:
+    log_writer: AbstractLogWriter | str | dict
+    entry_transformer: AbstractLogEntryTransformer | str | dict = field(default=None)
+    def __post_init__(self):
+        self.entry_transformer = resolve_instance_or_callable(
+            self.entry_transformer,
+            debug_name="logging.<logger>.entry_transformer",
+        )
+        self.log_writer = resolve_instance_or_callable(
+            self.log_writer,
+            debug_name="logging.<logger>.log_writer",
+        )
+    def _transform(self, log_entry: LogEntry) -> dict:
+        return (
+            self.entry_transformer(log_entry)
+            if self.entry_transformer
+            else log_entry.to_dict()
+        )
+    def __call__(self, log_entry: LogEntry):
+        self.log_writer(self._transform(log_entry))
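
To make the `LogEntryTransformer` mapping concrete, here is a small self-contained sketch of how dotted paths such as `response.usage.prompt_tokens` turn a log entry into a flat record. The path resolver restates the logic of `resolve_obj_path` from `lm_proxy/utils.py` so the example runs without the package installed. The `entry` object is a hypothetical stand-in for a real `LogEntry`, and `response.cost` is a made-up path used only to show the fallback.

```python
from types import SimpleNamespace


def resolve_obj_path(obj, path: str, default=None):
    """Walk a dotted path through attributes and dict keys."""
    for part in path.split("."):
        try:
            obj = obj[part] if isinstance(obj, dict) else getattr(obj, part)
        except (AttributeError, KeyError, TypeError):
            return default
    return obj


# Hypothetical stand-in for a LogEntry; the real one holds richer objects.
entry = SimpleNamespace(
    group="default",
    response=SimpleNamespace(usage={"prompt_tokens": 12, "completion_tokens": 34}),
)

# Output key -> dotted path, as in the [loggers.entry_transformer] config table.
mapping = {
    "group": "group",
    "prompt_tokens": "response.usage.prompt_tokens",
    "missing": "response.cost",  # unresolved paths fall back to None
}
record = {key: resolve_obj_path(entry, path) for key, path in mapping.items()}
print(record)  # {'group': 'default', 'prompt_tokens': 12, 'missing': None}
```

A path that cannot be resolved yields `None` rather than raising, so a typo in the mapping shows up as a null field in the log, not as a failed request.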

inference_proxy-0.3.0/lm_proxy/loggers/core.py ADDED

@@ -0,0 +1,53 @@
+import asyncio
+import logging
+from typing import Optional, TYPE_CHECKING
+from dataclasses import dataclass, field
+from datetime import datetime
+import microcore as mc
+from ..bootstrap import env
+if TYPE_CHECKING:
+    from lm_proxy.core import ChatCompletionRequest, Group
+@dataclass
+class LogEntry:
+    request: "ChatCompletionRequest" = field()
+    response: Optional[mc.LLMResponse] = field(default=None)
+    error: Optional[Exception] = field(default=None)
+    group: "Group" = field(default=None)
+    connection: str = field(default=None)
+    api_key_id: Optional[str] = field(default=None)
+    remote_addr: Optional[str] = field(default=None)
+    created_at: Optional[datetime] = field(default_factory=datetime.now)
+    duration: Optional[float] = field(default=None)
+    def to_dict(self) -> dict:
+        data = self.__dict__.copy()
+        if self.request:
+            data["request"] = self.request.model_dump(mode="json")
+        return data
+async def log(log_entry: LogEntry):
+    if log_entry.duration is None and log_entry.created_at:
+        log_entry.duration = (datetime.now() - log_entry.created_at).total_seconds()
+    for handler in env.config.loggers:
+        # check if it is async, then run both sync and async loggers in non-blocking way (sync too)
+        if asyncio.iscoroutinefunction(handler):
+            asyncio.create_task(handler(log_entry))
+        else:
+            try:
+                handler(log_entry)
+            except Exception as e:
+                logging.error("Error in logger handler: %s", e)
+                raise e
+async def log_non_blocking(
+    log_entry: LogEntry,
+) -> Optional[asyncio.Task]:
+    if env.config.loggers:
+        task = asyncio.create_task(log(log_entry))
+        return task
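
The dispatch pattern in `log` (synchronous handlers called inline, coroutine functions scheduled as tasks) can be sketched in isolation as follows. This is a simplified illustration, not the package's code. The handler names and the dict-shaped entry are hypothetical, `inspect.iscoroutinefunction` is used in place of the `asyncio` alias, and the sketch awaits the scheduled tasks so that the demo output is deterministic, whereas the diff leaves them running in the background.

```python
import asyncio
import inspect

calls: list[str] = []


def sync_handler(entry: dict) -> None:
    calls.append(f"sync:{entry['id']}")


async def async_handler(entry: dict) -> None:
    await asyncio.sleep(0)  # stand-in for real async I/O
    calls.append(f"async:{entry['id']}")


async def dispatch(entry: dict, handlers: list) -> None:
    """Run sync handlers inline and schedule coroutine functions as tasks."""
    pending = []
    for handler in handlers:
        if inspect.iscoroutinefunction(handler):
            pending.append(asyncio.create_task(handler(entry)))
        else:
            handler(entry)
    # Awaited here only to make the demo deterministic.
    await asyncio.gather(*pending)


asyncio.run(dispatch({"id": 1}, [sync_handler, async_handler]))
print(calls)  # ['sync:1', 'async:1']
```

One consequence of this check is that a callable object whose `__call__` is `async def` is not detected as a coroutine function, so under this pattern it would be invoked like a synchronous handler and its coroutine would never be awaited.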

inference_proxy-0.3.0/lm_proxy/loggers/log_writers.py ADDED

@@ -0,0 +1,24 @@
+import os
+import json
+from dataclasses import dataclass
+from .base_logger import AbstractLogWriter
+from ..utils import CustomJsonEncoder
+@dataclass
+class JsonLogWriter(AbstractLogWriter):
+    file_name: str
+    def __post_init__(self):
+        dir_path = os.path.dirname(self.file_name)
+        if dir_path:
+            os.makedirs(dir_path, exist_ok=True)
+        # Create the file if it doesn't exist
+        with open(self.file_name, "a", encoding="utf-8"):
+            pass
+    def __call__(self, logged_data: dict):
+        with open(self.file_name, "a", encoding="utf-8") as f:
+            f.write(json.dumps(logged_data, cls=CustomJsonEncoder) + "\n")
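
Since `JsonLogWriter` appends one JSON document per line, the resulting file can be read back as JSON Lines. The sketch below totals token usage from such a file. It assumes the log was produced with the `prompt_tokens` and `completion_tokens` fields from the example `entry_transformer` mapping in the README; `sum_tokens` is a hypothetical helper, and the demo writes a throwaway file instead of reading a real proxy log.

```python
import json
import os
import tempfile


def sum_tokens(log_path: str) -> dict[str, int]:
    """Add up token counts across a JSON Lines log, skipping blank lines."""
    totals = {"prompt_tokens": 0, "completion_tokens": 0}
    with open(log_path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            record = json.loads(line)
            for key in totals:
                totals[key] += record.get(key) or 0  # missing or null counts as 0
    return totals


# Demo with a temporary file written in the same one-object-per-line format.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "json.log")
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"prompt_tokens": 10, "completion_tokens": 5}) + "\n")
        f.write(json.dumps({"prompt_tokens": 7, "completion_tokens": None}) + "\n")
    totals = sum_tokens(path)

print(totals)  # {'prompt_tokens': 17, 'completion_tokens': 5}
```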

inference_proxy-0.3.0/lm_proxy/utils.py ADDED

@@ -0,0 +1,73 @@
+import json
+import inspect
+from typing import Union, Callable
+from datetime import datetime, date, time
+from microcore.utils import resolve_callable
+from starlette.requests import Request
+def resolve_obj_path(obj, path: str, default=None):
+    """Resolves dotted path supporting both attributes and dict keys."""
+    for part in path.split("."):
+        try:
+            if isinstance(obj, dict):
+                obj = obj[part]
+            else:
+                obj = getattr(obj, part)
+        except (AttributeError, KeyError, TypeError):
+            return default
+    return obj
+def resolve_instance_or_callable(
+    item: Union[str, Callable, dict], class_key: str = "class", debug_name: str = None
+) -> Callable:
+    if not item:
+        return None
+    if isinstance(item, dict):
+        if class_key not in item:
+            raise ValueError(
+                f"'{class_key}' key is missing in {debug_name or 'item'} config: {item}"
+            )
+        class_name = item.pop(class_key)
+        constructor = resolve_callable(class_name)
+        return constructor(**item)
+    if isinstance(item, str):
+        fn = resolve_callable(item)
+        return fn() if inspect.isclass(fn) else fn
+    if callable(item):
+        return item() if inspect.isclass(item) else item
+    else:
+        raise ValueError(f"Invalid {debug_name or 'item'} config: {item}")
+class CustomJsonEncoder(json.JSONEncoder):
+    def default(self, obj):
+        if isinstance(obj, datetime):
+            return obj.isoformat()
+        elif isinstance(obj, date):
+            return obj.isoformat()
+        elif isinstance(obj, time):
+            return obj.isoformat()
+        elif hasattr(obj, "__dict__"):
+            return obj.__dict__
+        elif hasattr(obj, "model_dump"):
+            return obj.model_dump()
+        elif hasattr(obj, "dict"):
+            return obj.dict()
+        return super().default(obj)
+def get_client_ip(request: Request) -> str:
+    # Try different headers in order of preference
+    if forwarded_for := request.headers.get("X-Forwarded-For"):
+        return forwarded_for.split(",")[0].strip()
+    if real_ip := request.headers.get("X-Real-IP"):
+        return real_ip
+    if forwarded := request.headers.get("Forwarded"):
+        # Parse Forwarded header (RFC 7239)
+        return forwarded.split("for=")[1].split(";")[0].strip()
+    # Fallback to direct client
+    return request.client.host if request.client else "unknown"
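
The `Forwarded` branch of `get_client_ip` splits on the literal `for=`, which would raise `IndexError` for a header that carries no `for` parameter (for example `proto=https`). The following is a hedged sketch of a more defensive parse of the first `for` parameter, following the general shape of RFC 7239 (comma-separated elements, semicolon-separated `name=value` pairs, case-insensitive names). `first_forwarded_for` is a hypothetical helper, not part of the package, and it does not attempt full RFC 7239 validation.

```python
from typing import Optional


def first_forwarded_for(header: str) -> Optional[str]:
    """Return the 'for' value of the first Forwarded element, or None."""
    first_element = header.split(",", 1)[0]  # the first element is the closest to the client
    for pair in first_element.split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep and name.strip().lower() == "for":
            return value.strip().strip('"')  # drop optional surrounding quotes
    return None


print(first_forwarded_for("for=192.0.2.60;proto=http;by=203.0.113.43"))
print(first_forwarded_for("proto=https"))
```

With a helper like this, a `None` result can fall through to the direct client address. Note also that all of these headers are client-controlled unless a trusted reverse proxy overwrites them.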

{inference_proxy-0.2.2 → inference_proxy-0.3.0}/pyproject.toml RENAMED

@@ -1,6 +1,6 @@
 [project]
 name = "inference-proxy"
-version = "0.2.2"
+version = "0.3.0"
 description = "\"Inference Proxy\" is OpenAI-compatible http proxy server for inferencing various LLMs capable of working with Google, Anthropic, OpenAI APIs, local PyTorch inference, etc."
 readme = "README.md"
 keywords = ["llm", "large language models", "ai", "gpt", "openai", "proxy", "http", "proxy-server"]
@@ -43,7 +43,11 @@ package-mode = true
 packages = [{ include = "lm_proxy"}]
 [tool.poetry.group.test.dependencies]
-pytest = "^7.4.3"
+pytest = "~=8.4.2"
+pytest-asyncio = "~=1.2.0"
 [tool.poetry.scripts]
 inference-proxy = "lm_proxy.app:cli_app"
+[tool.pytest.ini_options]
+asyncio_mode = "auto"

{inference_proxy-0.2.2 → inference_proxy-0.3.0}/LICENSE RENAMED

File without changes

{inference_proxy-0.2.2 → inference_proxy-0.3.0}/lm_proxy/__init__.py RENAMED

File without changes

{inference_proxy-0.2.2 → inference_proxy-0.3.0}/lm_proxy/__main__.py RENAMED

File without changes
