polyrouter 1.0.0__tar.gz

@@ -0,0 +1,7 @@
1
+ prune examples
2
+ prune testenv
3
+ exclude .ignore
4
+ exclude README.md
5
+ exclude requirements.txt
6
+ exclude requirements-dev.txt
7
+ exclude temp.py
@@ -0,0 +1,304 @@
1
+ Metadata-Version: 2.4
2
+ Name: polyrouter
3
+ Version: 1.0.0
4
+ Summary: A routing/orchestration library (adjust as needed)
5
+ Author-email: Pratham Tomar <prathamtomar1733@gmail.com>
6
+ License: MIT
7
+ Classifier: Programming Language :: Python :: 3
8
+ Classifier: Programming Language :: Python :: 3.11
9
+ Classifier: License :: OSI Approved :: MIT License
10
+ Classifier: Operating System :: OS Independent
11
+ Requires-Python: >=3.11
12
+ Description-Content-Type: text/markdown
13
+ Requires-Dist: cerebras_cloud_sdk>=1.67.0
14
+ Requires-Dist: google-genai>=2.4.0
15
+ Requires-Dist: groq>=1.2.0
16
+
17
+ # PolyRouter
18
+
19
+ ![Python](https://img.shields.io/badge/Python-3-blue)
20
+ ![Architecture](https://img.shields.io/badge/Architecture-Config--Driven%20LLM%20Router-0A84FF)
21
+ ![License](https://img.shields.io/badge/License-TBD-lightgrey)
22
+
23
+ PolyRouter is a lightweight Python library that routes requests across multiple LLM providers. It helps applications achieve deterministic failover by rotating API keys, client providers, and model candidates when requests fail.
24
+
25
+ ## Overview
26
+
27
+ This repository provides an orchestration layer that can be embedded in your application to manage provider rotation, API-key pools, and model fallbacks. Keys are loaded from the environment (see `.env`), and the orchestrator tries configured provider/model/key combinations until a request succeeds or all combinations are exhausted.
28
+
29
+ Example provider adapters included in this snapshot:
30
+
31
+ - Groq
32
+ - Google Gemini
33
+ - Cerebras
34
+
35
+ ## Badges
36
+
37
+ | Badge | Meaning |
38
+ | ----------------------------------------------------------------------------------------------- | -------------------------------------- |
39
+ | ![Python](https://img.shields.io/badge/Python-3-blue) | Python implementation |
40
+ | ![Architecture](https://img.shields.io/badge/Architecture-Config--Driven%20LLM%20Router-0A84FF) | Multi-provider failover design |
41
+ | ![License](https://img.shields.io/badge/License-TBD-lightgrey) | Update once a formal license is chosen |
42
+
43
+ ## Key Features
44
+
45
+ | Capability | Technical Detail |
46
+ | ------------------------- | -------------------------------------------------------------------------------------------------------------------------------- |
47
+ | Provider rotation | Requests can move across Groq, Gemini, and Cerebras client pools without changing application code. |
48
+ | Key pool management | Each provider can be backed by multiple API keys, allowing the runtime to continue when a single key expires or is rate-limited. |
49
+ | Model pool fallback | Ordered model lists in `config.py` act as a preference chain, so the router can try alternate models before surfacing a failure. |
50
+ | Debug visibility | `DEBUG` and `IN_DEPTH_DEBUG` control log verbosity so you can switch between concise operational logs and deep request tracing. |
51
+ | Centralized configuration | Provider counts, model lists, and debug mode live in one place instead of being duplicated across call sites. |
52
+ | Failure isolation | Provider-specific errors do not have to terminate the entire workflow if another valid key/model combination is still available. |
53
+
54
+ ## How It Works
55
+
56
+ ```mermaid
57
+ flowchart TD
58
+ A[Application request] --> B[Load config.py]
59
+ B --> C[Load .env API keys]
60
+ C --> D[Try primary provider]
61
+ D --> E{Request succeeds?}
62
+ E -->|Yes| F[Return response]
63
+ E -->|No| G[Rotate key / model / client]
64
+ G --> H{Any combinations left?}
65
+ H -->|Yes| D
66
+ H -->|No| I[Raise exhaustion error]
67
+ ```
68
+
69
+ The intended behavior is simple:
70
+
71
+ 1. Read provider preferences, model lists, and key counts from `config.py`.
72
+ 2. Load API credentials from the environment.
73
+ 3. Attempt a request with the active provider/model combination.
74
+ 4. On provider failure, rotate through the next key or model.
75
+ 5. When a provider pool is exhausted, move to the next client family.
76
+ 6. Stop only when every configured combination has been tried.
77
+
78
+ > The repo is built for operational resilience, not for single-provider purity.
79
+
80
+ ## Project Structure
81
+
82
+ ```text
83
+ PolyRouter/
84
+ ├── examples/ # Example usage scenarios
85
+ │ └── basic_usage.py
86
+ ├── polyrouter/ # Library source
87
+ │ ├── Exceptions.py
88
+ │ ├── LLMClients.py
89
+ │ ├── LLMOrchestrator.py
90
+ │ └── __init__.py
91
+ ├── .env # (example present in repo snapshot)
92
+ ├── requirements-dev.txt # Development / runtime deps
93
+ └── README.md
94
+ ```
95
+
96
+ ## Installation
97
+
98
+ Local setup (recommended):
99
+
100
+ ```bash
101
+ git clone <repository-url>
102
+ cd PolyRouter
103
+ python3 -m venv .venv
104
+ source .venv/bin/activate
105
+ pip install --upgrade pip
106
+ pip install -r requirements-dev.txt
107
+ ```
108
+
109
+ ### Environment variables
110
+
111
+ This repo includes a `.env` file in the snapshot. In normal usage, copy it to a local file such as `.env.local` and fill in your own keys (do not commit secrets):
112
+
113
+ ```bash
114
+ cp .env .env.local
115
+ # edit .env.local and export provider API keys, e.g.:
116
+ # GROQ_API_KEY0=...
117
+ # GEMINI_API_KEY0=...
118
+ ```
119
+
120
+ The examples use environment variables named like `GROQ_API_KEY0`, `GEMINI_API_KEY0`, etc.
121
+
122
+ ### Dependencies
123
+
124
+ Dependencies used by examples and adapters may include provider SDKs and `python-dotenv`. Install via `requirements-dev.txt`.
125
+
126
+ <details>
127
+ <summary>Build / verification steps</summary>
128
+
129
+ This repository is a Python library-style project rather than a packaged service, so the essential validation step is import verification plus a smoke test in your host application.
130
+
131
+ ```bash
132
+ python -m compileall .
133
+ python - <<'PY'
134
+ from config import DEBUG, IN_DEPTH_DEBUG, GROQ_MODEL
135
+ print("config loaded:", DEBUG, IN_DEPTH_DEBUG, GROQ_MODEL)
136
+ PY
137
+ ```
138
+
139
+ </details>
140
+
141
+ ## Configuration
142
+
143
+ `config.py` is the primary customization point.
144
+
145
+ | Setting | Role |
146
+ | ---------------- | ------------------------------------------------------------ |
147
+ | `DEBUG` | Enables the main debug statement stream. |
148
+ | `IN_DEPTH_DEBUG` | Enables detailed trace output for low-level troubleshooting. |
149
+ | `GROQ_MODEL` | Ordered Groq model preference list. |
150
+ | `GEMINI_MODEL` | Ordered Gemini model preference list. |
151
+ | `CEREBRAS_MODEL` | Ordered Cerebras model preference list. |
152
+ | `GROQ_KEY` | Number of Groq API keys to scan. |
153
+ | `GEMINI_KEY` | Number of Gemini API keys to scan. |
154
+ | `CEREBRAS_KEY` | Number of Cerebras API keys to scan. |
155
+
156
+ Recommended operating model:
157
+
158
+ ```python
159
+ DEBUG = 1
160
+ IN_DEPTH_DEBUG = 0
161
+
162
+ GROQ_MODEL = ["openai/gpt-oss-120b", "llama-3.3-70b-versatile"]
163
+ GEMINI_MODEL = ["gemini-2.5-flash"]
164
+ CEREBRAS_MODEL = ["gpt-oss-120b"]
165
+
166
+ GROQ_KEY = 2
167
+ GEMINI_KEY = 1
168
+ CEREBRAS_KEY = 1
169
+ ```
170
+
171
+ ## Usage
172
+
173
+ See `examples/basic_usage.py` for a minimal example. The orchestrator can be constructed directly in your application; you do not need a `config.py` file if you prefer to pass provider settings programmatically.
174
+
175
+ Minimal usage:
176
+
177
+ ```python
178
+ from dotenv import load_dotenv
179
+ from polyrouter.LLMOrchestrator import LLMOrchestrator
180
+ import os
181
+
182
+ load_dotenv()
183
+
184
+ llm = LLMOrchestrator(
185
+ groq={
186
+ "groq_models": ["openai/gpt-oss-120b"],
187
+ "groq_keys": [os.getenv("GROQ_API_KEY0")],
188
+ },
189
+ debug=True,
190
+ verbose=True,
191
+ prompt="You are a helpful assistant",
192
+ )
193
+
194
+ response = llm.request()
195
+ print(response)
196
+ ```
197
+
198
+ When `debug` or `verbose` is enabled, the orchestrator prints its provider, model, and key selections along with each rotation decision.
199
+
200
+ ## API / CLI
201
+
202
+ No standalone CLI or HTTP API is exposed in this repository snapshot.
203
+
204
+ The public surface is intentionally library-oriented:
205
+
206
+ - `config.py` controls behavior
207
+ - `LLMClients.py` defines the client abstraction
208
+ - `LLMOrchestrator.py` is the orchestration boundary
209
+
210
+ If you add a CLI later, document it here with exact command syntax and exit codes.
211
+
212
+ ## Deployment
213
+
214
+ Because this project is a routing library, deployment usually means shipping it as part of a larger Python service or worker.
215
+
216
+ ### Recommended deployment checklist
217
+
218
+ 1. Pin dependencies with `requirements.txt`.
219
+ 2. Inject secrets through the runtime environment, not source control.
220
+ 3. Set `DEBUG = 0` and `IN_DEPTH_DEBUG = 0` for production unless you are actively diagnosing issues.
221
+ 4. Validate all required API keys are present before starting the process.
222
+ 5. Run the host service behind your preferred process manager, container runtime, or platform scheduler.
223
+
224
+ ### Containerized deployment
225
+
226
+ If you package the project into a container, copy only the source files, install requirements, and mount secrets through environment variables or secret storage.
227
+
228
+ ```dockerfile
229
+ FROM python:3.11-slim
230
+
231
+ WORKDIR /app
232
+ COPY requirements.txt .
233
+ RUN pip install --no-cache-dir -r requirements.txt
234
+
235
+ COPY . .
236
+ CMD ["python", "-c", "import config; print('LLM-Gateway-Service ready')"]
237
+ ```
238
+
239
+ ## Screenshots
240
+
241
+ > Screenshot placeholder: add an architecture diagram or runtime trace capture here once the project has a visual demo surface.
242
+
243
+ Suggested assets for a production repository:
244
+
245
+ - request-routing diagram
246
+ - provider rotation log snippet
247
+ - environment setup screenshot
248
+
249
+ ## Troubleshooting
250
+
251
+ | Symptom | Likely Cause | Resolution |
252
+ | -------------------------------------- | ------------------------------------------------------------ | -------------------------------------------------------------------------- |
253
+ | Import error for a provider SDK | Dependencies are missing from the active virtual environment | Re-run `pip install -r requirements.txt` inside the activated environment. |
254
+ | Requests stop after one provider fails | No fallback keys or models are configured | Add more keys to `.env` and expand the model pool in `config.py`. |
255
+ | All requests fail immediately | Environment variables are missing or misnamed | Verify `.env` matches `.env.template` exactly. |
256
+ | Debug logs are too noisy | Verbosity flags are enabled | Set `DEBUG = 0` and `IN_DEPTH_DEBUG = 0` for normal operation. |
257
+ | A specific model keeps failing | The model is unsupported, rate-limited, or exhausted | Remove it from the preference list or move it later in the rotation order. |
258
+
259
+ ## Contributing
260
+
261
+ Contributions are welcome if they improve correctness, observability, or provider coverage.
262
+
263
+ Please keep pull requests focused and include:
264
+
265
+ - a concise description of the routing behavior being changed
266
+ - reproduction steps for any failure-handling update
267
+ - updates to `.env.template` and `config.py` when configuration contracts change
268
+ - tests or a clear validation checklist when the orchestration flow changes
269
+
270
+ Guidelines:
271
+
272
+ 1. Do not hard-code secrets.
273
+ 2. Preserve backward-compatible configuration names whenever possible.
274
+ 3. Keep provider rotation behavior deterministic and well logged.
275
+ 4. Prefer small, isolated changes to client adapters and error handling.
276
+
277
+ ## Roadmap
278
+
279
+ Planned improvements that would strengthen the project further:
280
+
281
+ - formal public orchestration API with documented inputs and return types
282
+ - structured logging with request IDs and provider attempt history
283
+ - health checks for provider pools and exhausted key detection
284
+ - test coverage for failover, invalid-key handling, and model rotation
285
+ - optional CLI for smoke testing provider credentials
286
+ - metrics hooks for success rate, fallback rate, and exhaustion rate
287
+
288
+ ## Acknowledgements
289
+
290
+ PolyRouter builds on the ecosystem provided by:
291
+
292
+ - Groq
293
+ - Google Gemini / Google Gen AI SDK
294
+ - Cerebras Cloud SDK
295
+ - python-dotenv
296
+ - tenacity
297
+
298
+ It also follows a common open-source reliability pattern: fail over without forcing callers to understand vendor-specific error recovery.
299
+
300
+ ## License
301
+
302
+ License: TBD.
303
+
304
+ Add the repository's chosen license here once it is finalized, and keep the license file in sync with this section.
@@ -0,0 +1,288 @@
1
+ # PolyRouter
2
+
3
+ ![Python](https://img.shields.io/badge/Python-3-blue)
4
+ ![Architecture](https://img.shields.io/badge/Architecture-Config--Driven%20LLM%20Router-0A84FF)
5
+ ![License](https://img.shields.io/badge/License-TBD-lightgrey)
6
+
7
+ PolyRouter is a lightweight Python library that routes requests across multiple LLM providers. It helps applications achieve deterministic failover by rotating API keys, client providers, and model candidates when requests fail.
8
+
9
+ ## Overview
10
+
11
+ This repository provides an orchestration layer that can be embedded in your application to manage provider rotation, API-key pools, and model fallbacks. Keys are loaded from the environment (see `.env`), and the orchestrator tries configured provider/model/key combinations until a request succeeds or all combinations are exhausted.
12
+
13
+ Example provider adapters included in this snapshot:
14
+
15
+ - Groq
16
+ - Google Gemini
17
+ - Cerebras
18
+
19
+ ## Badges
20
+
21
+ | Badge | Meaning |
22
+ | ----------------------------------------------------------------------------------------------- | -------------------------------------- |
23
+ | ![Python](https://img.shields.io/badge/Python-3-blue) | Python implementation |
24
+ | ![Architecture](https://img.shields.io/badge/Architecture-Config--Driven%20LLM%20Router-0A84FF) | Multi-provider failover design |
25
+ | ![License](https://img.shields.io/badge/License-TBD-lightgrey) | Update once a formal license is chosen |
26
+
27
+ ## Key Features
28
+
29
+ | Capability | Technical Detail |
30
+ | ------------------------- | -------------------------------------------------------------------------------------------------------------------------------- |
31
+ | Provider rotation | Requests can move across Groq, Gemini, and Cerebras client pools without changing application code. |
32
+ | Key pool management | Each provider can be backed by multiple API keys, allowing the runtime to continue when a single key expires or is rate-limited. |
33
+ | Model pool fallback | Ordered model lists in `config.py` act as a preference chain, so the router can try alternate models before surfacing a failure. |
34
+ | Debug visibility | `DEBUG` and `IN_DEPTH_DEBUG` control log verbosity so you can switch between concise operational logs and deep request tracing. |
35
+ | Centralized configuration | Provider counts, model lists, and debug mode live in one place instead of being duplicated across call sites. |
36
+ | Failure isolation | Provider-specific errors do not have to terminate the entire workflow if another valid key/model combination is still available. |
37
+
38
+ ## How It Works
39
+
40
+ ```mermaid
41
+ flowchart TD
42
+ A[Application request] --> B[Load config.py]
43
+ B --> C[Load .env API keys]
44
+ C --> D[Try primary provider]
45
+ D --> E{Request succeeds?}
46
+ E -->|Yes| F[Return response]
47
+ E -->|No| G[Rotate key / model / client]
48
+ G --> H{Any combinations left?}
49
+ H -->|Yes| D
50
+ H -->|No| I[Raise exhaustion error]
51
+ ```
52
+
53
+ The intended behavior is simple:
54
+
55
+ 1. Read provider preferences, model lists, and key counts from `config.py`.
56
+ 2. Load API credentials from the environment.
57
+ 3. Attempt a request with the active provider/model combination.
58
+ 4. On provider failure, rotate through the next key or model.
59
+ 5. When a provider pool is exhausted, move to the next client family.
60
+ 6. Stop only when every configured combination has been tried.
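The six steps above can be sketched as a nested loop over providers, models, and keys. This is a minimal illustration, not the code in `LLMOrchestrator.py`: the names `route_request`, `AllCombinationsExhausted`, and the `pools` layout are hypothetical, and the ordering (all keys for a model first, then the next model, then the next provider) is an assumption about what "rotate through the next key or model" means.

```python
from itertools import product
from typing import Callable


class AllCombinationsExhausted(Exception):
    """Hypothetical stand-in for the library's exhaustion error."""


def route_request(
    pools: dict[str, dict[str, list[str]]],
    send: Callable[[str, str, str], str],
) -> str:
    """Try every provider/model/key combination in a fixed, repeatable order."""
    failures: list[str] = []
    for provider, pool in pools.items():  # dicts preserve insertion order
        # For each model, try every key before moving to the next model.
        for model, key in product(pool["models"], pool["keys"]):
            try:
                return send(provider, model, key)
            except Exception as exc:  # a real router would catch narrower errors
                failures.append(f"{provider}/{model}: {exc}")
    # Every configured combination has been tried without success.
    raise AllCombinationsExhausted("; ".join(failures))


def fake_send(provider: str, model: str, key: str) -> str:
    """Simulated transport: the first provider always fails."""
    if provider == "groq":
        raise RuntimeError("rate limited")
    return f"{provider}:{model}"


pools = {
    "groq": {"models": ["m1", "m2"], "keys": ["k0", "k1"]},
    "gemini": {"models": ["g1"], "keys": ["k0"]},
}
print(route_request(pools, fake_send))  # prints "gemini:g1"
```

Because the iteration order is derived only from the configured lists, two runs with the same configuration and the same failures visit the combinations in the same sequence, which is what makes the failover deterministic.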
61
+
62
+ > The repo is built for operational resilience, not for single-provider purity.
63
+
64
+ ## Project Structure
65
+
66
+ ```text
67
+ PolyRouter/
68
+ ├── examples/ # Example usage scenarios
69
+ │ └── basic_usage.py
70
+ ├── polyrouter/ # Library source
71
+ │ ├── Exceptions.py
72
+ │ ├── LLMClients.py
73
+ │ ├── LLMOrchestrator.py
74
+ │ └── __init__.py
75
+ ├── .env # (example present in repo snapshot)
76
+ ├── requirements-dev.txt # Development / runtime deps
77
+ └── README.md
78
+ ```
79
+
80
+ ## Installation
81
+
82
+ Local setup (recommended):
83
+
84
+ ```bash
85
+ git clone <repository-url>
86
+ cd PolyRouter
87
+ python3 -m venv .venv
88
+ source .venv/bin/activate
89
+ pip install --upgrade pip
90
+ pip install -r requirements-dev.txt
91
+ ```
92
+
93
+ ### Environment variables
94
+
95
+ This repo includes a `.env` file in the snapshot. In normal usage, copy it to a local file such as `.env.local` and fill in your own keys (do not commit secrets):
96
+
97
+ ```bash
98
+ cp .env .env.local
99
+ # edit .env.local and export provider API keys, e.g.:
100
+ # GROQ_API_KEY0=...
101
+ # GEMINI_API_KEY0=...
102
+ ```
103
+
104
+ The examples use environment variables named like `GROQ_API_KEY0`, `GEMINI_API_KEY0`, etc.
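One way to turn the numbered naming scheme into a key pool is to scan a fixed number of suffixes and keep only the values that are set. The helper below is a sketch under that assumption; `load_numbered_keys` is a hypothetical name and is not part of the package.

```python
import os
from typing import Mapping


def load_numbered_keys(
    prefix: str,
    count: int,
    env: Mapping[str, str] = os.environ,
) -> list[str]:
    """Collect PREFIX0 .. PREFIX{count-1}, skipping unset or empty values."""
    keys = []
    for index in range(count):
        value = env.get(f"{prefix}{index}", "").strip()
        if value:
            keys.append(value)
    return keys


# A plain dict stands in for the process environment in this demo.
demo_env = {"GROQ_API_KEY0": "key-a", "GROQ_API_KEY1": "", "GROQ_API_KEY2": "key-c"}
print(load_numbered_keys("GROQ_API_KEY", 3, demo_env))  # prints ['key-a', 'key-c']
```

Passing the environment as a parameter keeps the helper easy to test; in an application you would call it without `env` after loading your `.env` file.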
105
+
106
+ ### Dependencies
107
+
108
+ Dependencies used by examples and adapters may include provider SDKs and `python-dotenv`. Install via `requirements-dev.txt`.
109
+
110
+ <details>
111
+ <summary>Build / verification steps</summary>
112
+
113
+ This repository is a Python library-style project rather than a packaged service, so the essential validation step is import verification plus a smoke test in your host application.
114
+
115
+ ```bash
116
+ python -m compileall .
117
+ python - <<'PY'
118
+ from config import DEBUG, IN_DEPTH_DEBUG, GROQ_MODEL
119
+ print("config loaded:", DEBUG, IN_DEPTH_DEBUG, GROQ_MODEL)
120
+ PY
121
+ ```
122
+
123
+ </details>
124
+
125
+ ## Configuration
126
+
127
+ `config.py` is the primary customization point.
128
+
129
+ | Setting | Role |
130
+ | ---------------- | ------------------------------------------------------------ |
131
+ | `DEBUG` | Enables the main debug statement stream. |
132
+ | `IN_DEPTH_DEBUG` | Enables detailed trace output for low-level troubleshooting. |
133
+ | `GROQ_MODEL` | Ordered Groq model preference list. |
134
+ | `GEMINI_MODEL` | Ordered Gemini model preference list. |
135
+ | `CEREBRAS_MODEL` | Ordered Cerebras model preference list. |
136
+ | `GROQ_KEY` | Number of Groq API keys to scan. |
137
+ | `GEMINI_KEY` | Number of Gemini API keys to scan. |
138
+ | `CEREBRAS_KEY` | Number of Cerebras API keys to scan. |
139
+
140
+ Recommended operating model:
141
+
142
+ ```python
143
+ DEBUG = 1
144
+ IN_DEPTH_DEBUG = 0
145
+
146
+ GROQ_MODEL = ["openai/gpt-oss-120b", "llama-3.3-70b-versatile"]
147
+ GEMINI_MODEL = ["gemini-2.5-flash"]
148
+ CEREBRAS_MODEL = ["gpt-oss-120b"]
149
+
150
+ GROQ_KEY = 2
151
+ GEMINI_KEY = 1
152
+ CEREBRAS_KEY = 1
153
+ ```
154
+
155
+ ## Usage
156
+
157
+ See `examples/basic_usage.py` for a minimal example. The orchestrator can be constructed directly in your application; you do not need a `config.py` file if you prefer to pass provider settings programmatically.
158
+
159
+ Minimal usage:
160
+
161
+ ```python
162
+ from dotenv import load_dotenv
163
+ from polyrouter.LLMOrchestrator import LLMOrchestrator
164
+ import os
165
+
166
+ load_dotenv()
167
+
168
+ llm = LLMOrchestrator(
169
+ groq={
170
+ "groq_models": ["openai/gpt-oss-120b"],
171
+ "groq_keys": [os.getenv("GROQ_API_KEY0")],
172
+ },
173
+ debug=True,
174
+ verbose=True,
175
+ prompt="You are a helpful assistant",
176
+ )
177
+
178
+ response = llm.request()
179
+ print(response)
180
+ ```
181
+
182
+ When `debug` or `verbose` is enabled, the orchestrator prints its provider, model, and key selections along with each rotation decision.
183
+
184
+ ## API / CLI
185
+
186
+ No standalone CLI or HTTP API is exposed in this repository snapshot.
187
+
188
+ The public surface is intentionally library-oriented:
189
+
190
+ - `config.py` controls behavior
191
+ - `LLMClients.py` defines the client abstraction
192
+ - `LLMOrchestrator.py` is the orchestration boundary
193
+
194
+ If you add a CLI later, document it here with exact command syntax and exit codes.
195
+
196
+ ## Deployment
197
+
198
+ Because this project is a routing library, deployment usually means shipping it as part of a larger Python service or worker.
199
+
200
+ ### Recommended deployment checklist
201
+
202
+ 1. Pin dependencies with `requirements.txt`.
203
+ 2. Inject secrets through the runtime environment, not source control.
204
+ 3. Set `DEBUG = 0` and `IN_DEPTH_DEBUG = 0` for production unless you are actively diagnosing issues.
205
+ 4. Validate all required API keys are present before starting the process.
206
+ 5. Run the host service behind your preferred process manager, container runtime, or platform scheduler.
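Checklist item 4 can be implemented as a small fail-fast check that runs before the host service starts. The sketch below is illustrative only: `missing_variables` and `preflight` are hypothetical helpers, and the list of required names is whatever your own deployment expects.

```python
import os
from typing import Mapping


def missing_variables(
    required: list[str],
    env: Mapping[str, str] = os.environ,
) -> list[str]:
    """Return the required variable names that are unset or blank."""
    return [name for name in required if not env.get(name, "").strip()]


def preflight(required: list[str], env: Mapping[str, str] = os.environ) -> None:
    """Raise before startup if any required credential is absent."""
    missing = missing_variables(required, env)
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )


# A plain dict stands in for the process environment in this demo.
demo_env = {"GROQ_API_KEY0": "key-a", "GEMINI_API_KEY0": "  "}
print(missing_variables(["GROQ_API_KEY0", "GEMINI_API_KEY0"], demo_env))
# prints ['GEMINI_API_KEY0']
```

Reporting every missing name at once, rather than stopping at the first, avoids repeated restart cycles when several secrets were not injected.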
207
+
208
+ ### Containerized deployment
209
+
210
+ If you package the project into a container, copy only the source files, install requirements, and mount secrets through environment variables or secret storage.
211
+
212
+ ```dockerfile
213
+ FROM python:3.11-slim
214
+
215
+ WORKDIR /app
216
+ COPY requirements.txt .
217
+ RUN pip install --no-cache-dir -r requirements.txt
218
+
219
+ COPY . .
220
+ CMD ["python", "-c", "import config; print('LLM-Gateway-Service ready')"]
221
+ ```
222
+
223
+ ## Screenshots
224
+
225
+ > Screenshot placeholder: add an architecture diagram or runtime trace capture here once the project has a visual demo surface.
226
+
227
+ Suggested assets for a production repository:
228
+
229
+ - request-routing diagram
230
+ - provider rotation log snippet
231
+ - environment setup screenshot
232
+
233
+ ## Troubleshooting
234
+
235
+ | Symptom | Likely Cause | Resolution |
236
+ | -------------------------------------- | ------------------------------------------------------------ | -------------------------------------------------------------------------- |
237
+ | Import error for a provider SDK | Dependencies are missing from the active virtual environment | Re-run `pip install -r requirements.txt` inside the activated environment. |
238
+ | Requests stop after one provider fails | No fallback keys or models are configured | Add more keys to `.env` and expand the model pool in `config.py`. |
239
+ | All requests fail immediately | Environment variables are missing or misnamed | Verify `.env` matches `.env.template` exactly. |
240
+ | Debug logs are too noisy | Verbosity flags are enabled | Set `DEBUG = 0` and `IN_DEPTH_DEBUG = 0` for normal operation. |
241
+ | A specific model keeps failing | The model is unsupported, rate-limited, or exhausted | Remove it from the preference list or move it later in the rotation order. |
242
+
243
+ ## Contributing
244
+
245
+ Contributions are welcome if they improve correctness, observability, or provider coverage.
246
+
247
+ Please keep pull requests focused and include:
248
+
249
+ - a concise description of the routing behavior being changed
250
+ - reproduction steps for any failure-handling update
251
+ - updates to `.env.template` and `config.py` when configuration contracts change
252
+ - tests or a clear validation checklist when the orchestration flow changes
253
+
254
+ Guidelines:
255
+
256
+ 1. Do not hard-code secrets.
257
+ 2. Preserve backward-compatible configuration names whenever possible.
258
+ 3. Keep provider rotation behavior deterministic and well logged.
259
+ 4. Prefer small, isolated changes to client adapters and error handling.
260
+
261
+ ## Roadmap
262
+
263
+ Planned improvements that would strengthen the project further:
264
+
265
+ - formal public orchestration API with documented inputs and return types
266
+ - structured logging with request IDs and provider attempt history
267
+ - health checks for provider pools and exhausted key detection
268
+ - test coverage for failover, invalid-key handling, and model rotation
269
+ - optional CLI for smoke testing provider credentials
270
+ - metrics hooks for success rate, fallback rate, and exhaustion rate
271
+
272
+ ## Acknowledgements
273
+
274
+ PolyRouter builds on the ecosystem provided by:
275
+
276
+ - Groq
277
+ - Google Gemini / Google Gen AI SDK
278
+ - Cerebras Cloud SDK
279
+ - python-dotenv
280
+ - tenacity
281
+
282
+ It also follows a common open-source reliability pattern: fail over without forcing callers to understand vendor-specific error recovery.
283
+
284
+ ## License
285
+
286
+ License: TBD.
287
+
288
+ Add the repository's chosen license here once it is finalized, and keep the license file in sync with this section.
@@ -0,0 +1,49 @@
1
+ # This file defines all user-defined exceptions.
2
+ import logging
3
+
4
+ logger = logging.getLogger(__name__)
5
+
6
+ class LLMError(Exception):
7
+     def __init__(self, message):
8
+         logger.error(message)
9
+         super().__init__(message)
10
+
11
+
12
+ class AllModelsFailedError(LLMError):
13
+     def __init__(self, message):
14
+         super().__init__(message)
15
+
16
+
17
+ class ModelRateLimit(LLMError):
18
+     def __init__(self, message):
19
+         super().__init__(message)
20
+
21
+
22
+ class AllClientsExhaustedError(LLMError):
23
+     def __init__(self, message):
24
+         super().__init__(message)
25
+
26
+
27
+ class InvalidAPIKey(LLMError):
28
+     def __init__(self, message):
29
+         super().__init__(message)
30
+
31
+
32
+ class InvalidJSONResponseError(LLMError):
33
+     def __init__(self, message):
34
+         super().__init__(message)
35
+
36
+
37
+ class NoAPIKeysError(LLMError):
38
+     def __init__(self, message):
39
+         super().__init__(message)
40
+
41
+
42
+ class NoModelMentioned(LLMError):
43
+     def __init__(self, message):
44
+         super().__init__(message)
45
+
46
+
47
+ class UnknownError(LLMError):
48
+     def __init__(self, message):
49
+         super().__init__(message)
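Since every exception in this file derives from `LLMError`, a caller can catch the base class and branch on the concrete type. The sketch below re-declares two of the classes locally so that it runs without the package installed; the `classify` helper and the logger name `polyrouter_sketch` are hypothetical and exist only for illustration.

```python
import logging

logger = logging.getLogger("polyrouter_sketch")


class LLMError(Exception):
    """Local stand-in mirroring the base class in the diff above."""

    def __init__(self, message):
        logger.error(message)  # logged when the exception object is created
        super().__init__(message)


class ModelRateLimit(LLMError):
    pass


class AllClientsExhaustedError(LLMError):
    pass


def classify(exc: Exception) -> str:
    """Map an exception to a coarse handling decision."""
    if isinstance(exc, ModelRateLimit):
        return "retry-later"
    if isinstance(exc, LLMError):
        return "give-up"
    return "unexpected"


logging.basicConfig(level=logging.ERROR)
try:
    raise ModelRateLimit("429 from provider")
except LLMError as exc:
    print(classify(exc))  # prints "retry-later"
```

One consequence of logging inside `__init__` is that the message is emitted when the exception is constructed, even if the caller later handles it silently, so handled failures still appear in the error log.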