aetherroute 0.1.0__tar.gz
- aetherroute-0.1.0/CHANGELOG.md +29 -0
- aetherroute-0.1.0/LICENSE +21 -0
- aetherroute-0.1.0/MANIFEST.in +8 -0
- aetherroute-0.1.0/PKG-INFO +324 -0
- aetherroute-0.1.0/README.md +255 -0
- aetherroute-0.1.0/aetherroute/__init__.py +8 -0
- aetherroute-0.1.0/aetherroute/adapters/__init__.py +8 -0
- aetherroute-0.1.0/aetherroute/adapters/prompt.py +129 -0
- aetherroute-0.1.0/aetherroute/adapters/token_counter.py +44 -0
- aetherroute-0.1.0/aetherroute/cache/__init__.py +3 -0
- aetherroute-0.1.0/aetherroute/cache/semantic.py +153 -0
- aetherroute-0.1.0/aetherroute/config.py +65 -0
- aetherroute-0.1.0/aetherroute/context/__init__.py +3 -0
- aetherroute-0.1.0/aetherroute/context/curator.py +164 -0
- aetherroute-0.1.0/aetherroute/cost/__init__.py +3 -0
- aetherroute-0.1.0/aetherroute/cost/governor.py +119 -0
- aetherroute-0.1.0/aetherroute/observability/__init__.py +3 -0
- aetherroute-0.1.0/aetherroute/observability/dashboard.py +18 -0
- aetherroute-0.1.0/aetherroute/observability/logger.py +79 -0
- aetherroute-0.1.0/aetherroute/observability/report.py +237 -0
- aetherroute-0.1.0/aetherroute/orchestrator.py +350 -0
- aetherroute-0.1.0/aetherroute/providers/__init__.py +15 -0
- aetherroute-0.1.0/aetherroute/providers/anthropic.py +131 -0
- aetherroute-0.1.0/aetherroute/providers/base.py +128 -0
- aetherroute-0.1.0/aetherroute/providers/mistral.py +142 -0
- aetherroute-0.1.0/aetherroute/providers/ollama.py +108 -0
- aetherroute-0.1.0/aetherroute/providers/openai.py +122 -0
- aetherroute-0.1.0/aetherroute/providers/registry.py +120 -0
- aetherroute-0.1.0/aetherroute/py.typed +1 -0
- aetherroute-0.1.0/aetherroute/router/__init__.py +4 -0
- aetherroute-0.1.0/aetherroute/router/classifier.py +64 -0
- aetherroute-0.1.0/aetherroute/router/engine.py +250 -0
- aetherroute-0.1.0/aetherroute/security/__init__.py +9 -0
- aetherroute-0.1.0/aetherroute/security/permission.py +50 -0
- aetherroute-0.1.0/aetherroute/security/sanitizer.py +64 -0
- aetherroute-0.1.0/aetherroute/validation/__init__.py +9 -0
- aetherroute-0.1.0/aetherroute/validation/consistency.py +109 -0
- aetherroute-0.1.0/aetherroute/validation/validator.py +112 -0
- aetherroute-0.1.0/aetherroute.egg-info/PKG-INFO +324 -0
- aetherroute-0.1.0/aetherroute.egg-info/SOURCES.txt +47 -0
- aetherroute-0.1.0/aetherroute.egg-info/dependency_links.txt +1 -0
- aetherroute-0.1.0/aetherroute.egg-info/entry_points.txt +2 -0
- aetherroute-0.1.0/aetherroute.egg-info/requires.txt +26 -0
- aetherroute-0.1.0/aetherroute.egg-info/top_level.txt +1 -0
- aetherroute-0.1.0/config.yaml +90 -0
- aetherroute-0.1.0/pyproject.toml +81 -0
- aetherroute-0.1.0/requirements.txt +21 -0
- aetherroute-0.1.0/setup.cfg +4 -0
- aetherroute-0.1.0/tests/test_aetherroute.py +196 -0
|
@@ -0,0 +1,29 @@
|
|
|
1
|
+
# Changelog
|
|
2
|
+
|
|
3
|
+
All notable changes to **AetherRoute** are documented here.
|
|
4
|
+
This project follows [Semantic Versioning](https://semver.org/).
|
|
5
|
+
|
|
6
|
+
---
|
|
7
|
+
|
|
8
|
+
## [0.1.0] — 2025-06-14
|
|
9
|
+
|
|
10
|
+
### Added
|
|
11
|
+
- **Multi-provider orchestration** — OpenAI, Anthropic Claude, Mistral, and Ollama support with a unified async interface.
|
|
12
|
+
- **Intelligent task routing** — TF-IDF + scoring engine routes queries to the best-fit provider based on task category, latency, cost, and health history.
|
|
13
|
+
- **Hot failover** — Automatic provider failover on error or health registry `down` status.
|
|
14
|
+
- **Output validation & repair loops** — Pydantic schema validation with automatic re-prompt on failure (up to 3 retries).
|
|
15
|
+
- **Self-consistency checker** — High-stakes queries run against two providers; divergent responses are flagged.
|
|
16
|
+
- **Cost governor** — Per-request and per-session hard cost ceilings enforced from config.
|
|
17
|
+
- **Security sanitizer** — Prompt injection detection + RBAC permission guard.
|
|
18
|
+
- **Fuzzy semantic cache** — Redis-backed (in-memory fallback) cache with cosine similarity matching.
|
|
19
|
+
- **Context curation** — TF-IDF relevance ranking + async summarization to prevent context window overflow.
|
|
20
|
+
- **CLI observability report** — `aetherroute-report` command (powered by `rich`) renders routing analytics, cost tables, and request traces directly in the terminal.
|
|
21
|
+
- **Mock mode** — Full offline demo with schema-aware mock responses for all providers.
|
|
22
|
+
- **PyPI packaging** — Installable via `pip install aetherroute` with provider SDKs as optional extras.
|
|
23
|
+
|
|
24
|
+
### Changed
|
|
25
|
+
- Replaced Streamlit web dashboard with a zero-browser CLI report (`aetherroute-report`).
|
|
26
|
+
- Made provider SDK dependencies (`openai`, `anthropic`, `mistralai`) optional extras.
|
|
27
|
+
|
|
28
|
+
### Fixed
|
|
29
|
+
- Mock providers now synthesize schema-aware JSON responses instead of returning a hardcoded generic object, preventing false validation failures in offline mode.
|
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
MIT License
|
|
2
|
+
|
|
3
|
+
Copyright (c) 2025 Mithun Barath M R
|
|
4
|
+
|
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
|
7
|
+
in the Software without restriction, including without limitation the rights
|
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
|
10
|
+
furnished to do so, subject to the following conditions:
|
|
11
|
+
|
|
12
|
+
The above copyright notice and this permission notice shall be included in all
|
|
13
|
+
copies or substantial portions of the Software.
|
|
14
|
+
|
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
|
21
|
+
SOFTWARE.
|
|
@@ -0,0 +1,324 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: aetherroute
|
|
3
|
+
Version: 0.1.0
|
|
4
|
+
Summary: Production-grade multi-provider LLM orchestration, routing, and validation framework
|
|
5
|
+
Author-email: Mithun Barath M R <barathmithun1548@gmail.com>
|
|
6
|
+
License: MIT License
|
|
7
|
+
|
|
8
|
+
Copyright (c) 2025 Mithun Barath M R
|
|
9
|
+
|
|
10
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
11
|
+
of this software and associated documentation files (the "Software"), to deal
|
|
12
|
+
in the Software without restriction, including without limitation the rights
|
|
13
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
14
|
+
copies of the Software, and to permit persons to whom the Software is
|
|
15
|
+
furnished to do so, subject to the following conditions:
|
|
16
|
+
|
|
17
|
+
The above copyright notice and this permission notice shall be included in all
|
|
18
|
+
copies or substantial portions of the Software.
|
|
19
|
+
|
|
20
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
21
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
22
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
23
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
24
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
25
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
|
26
|
+
SOFTWARE.
|
|
27
|
+
|
|
28
|
+
Project-URL: Homepage, https://github.com/mithunbarath/aetherroute
|
|
29
|
+
Project-URL: Source, https://github.com/mithunbarath/aetherroute
|
|
30
|
+
Project-URL: Bug Tracker, https://github.com/mithunbarath/aetherroute/issues
|
|
31
|
+
Project-URL: Changelog, https://github.com/mithunbarath/aetherroute/blob/main/CHANGELOG.md
|
|
32
|
+
Keywords: llm,ai,openai,anthropic,mistral,routing,orchestration,validation,pydantic,async,failover
|
|
33
|
+
Classifier: Development Status :: 3 - Alpha
|
|
34
|
+
Classifier: Intended Audience :: Developers
|
|
35
|
+
Classifier: License :: OSI Approved :: MIT License
|
|
36
|
+
Classifier: Programming Language :: Python :: 3
|
|
37
|
+
Classifier: Programming Language :: Python :: 3.9
|
|
38
|
+
Classifier: Programming Language :: Python :: 3.10
|
|
39
|
+
Classifier: Programming Language :: Python :: 3.11
|
|
40
|
+
Classifier: Programming Language :: Python :: 3.12
|
|
41
|
+
Classifier: Topic :: Software Development :: Libraries :: Python Modules
|
|
42
|
+
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
|
|
43
|
+
Classifier: Framework :: AsyncIO
|
|
44
|
+
Requires-Python: >=3.9
|
|
45
|
+
Description-Content-Type: text/markdown
|
|
46
|
+
License-File: LICENSE
|
|
47
|
+
Requires-Dist: pydantic>=2.0.0
|
|
48
|
+
Requires-Dist: pyyaml>=6.0.0
|
|
49
|
+
Requires-Dist: httpx>=0.24.0
|
|
50
|
+
Requires-Dist: redis>=5.0.0
|
|
51
|
+
Requires-Dist: tiktoken>=0.5.0
|
|
52
|
+
Requires-Dist: rich>=13.0.0
|
|
53
|
+
Provides-Extra: openai
|
|
54
|
+
Requires-Dist: openai>=1.0.0; extra == "openai"
|
|
55
|
+
Provides-Extra: anthropic
|
|
56
|
+
Requires-Dist: anthropic>=0.18.0; extra == "anthropic"
|
|
57
|
+
Provides-Extra: mistral
|
|
58
|
+
Requires-Dist: mistralai>=0.1.0; extra == "mistral"
|
|
59
|
+
Provides-Extra: all
|
|
60
|
+
Requires-Dist: openai>=1.0.0; extra == "all"
|
|
61
|
+
Requires-Dist: anthropic>=0.18.0; extra == "all"
|
|
62
|
+
Requires-Dist: mistralai>=0.1.0; extra == "all"
|
|
63
|
+
Provides-Extra: dev
|
|
64
|
+
Requires-Dist: pytest>=7.0.0; extra == "dev"
|
|
65
|
+
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
|
|
66
|
+
Requires-Dist: build>=1.0.0; extra == "dev"
|
|
67
|
+
Requires-Dist: twine>=5.0.0; extra == "dev"
|
|
68
|
+
Dynamic: license-file
|
|
69
|
+
|
|
70
|
+
# AetherRoute ⚡
|
|
71
|
+
|
|
72
|
+
[PyPI](https://pypi.org/project/aetherroute/)
|
|
73
|
+
[Python](https://www.python.org/downloads/)
|
|
74
|
+
[License](LICENSE)
|
|
75
|
+
[GitHub](https://github.com/mithunbarath/aetherroute)
|
|
76
|
+
|
|
77
|
+
**AetherRoute** is a production-grade multi-provider LLM orchestration, validation, and routing framework. It is designed to mitigate critical LLM failure modes — provider outages, cost spikes, prompt injection attacks, structured formatting errors, and token window overflows — in professional environments.
|
|
78
|
+
|
```
Request ──► Input Sanitizer ──► Cache Lookup (Exact/Semantic)
                                       │
  ┌────────────────────────────────────┘ (Cache Miss)
  ▼
Context Curation (Sliding Window & Summarization)
  │
  ▼
Scoring & Decision Router (Task Fit, Cost, Latency, SQLite History)
  │
  ▼
Provider Pool (OpenAI, Anthropic Claude, Mistral, Ollama)
  │
  ├──► Success ──► Pydantic Validator ──► DB Logging ──► Response
  │                        │ (Validation Error)
  │                        ▼ (Up to 3 Retries)
  │              Repair Re-prompt Loop
  │
  └──► Failure ──► Hot Failover (Try next-best provider)
```
|
99
|
+
|
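The router stage in the diagram weighs task fit, cost, latency, and logged history. A minimal sketch of one way such a weighted score could be computed follows. The `ProviderStats` record, the weights, and the normalisation are all assumptions made for illustration and are not taken from AetherRoute's `router/engine.py`.

```python
from dataclasses import dataclass


@dataclass
class ProviderStats:
    """Hypothetical per-provider inputs to the routing score."""
    name: str
    task_fit: float        # 0..1, how well the provider suits the task category
    cost_per_1k: float     # USD per 1,000 tokens
    avg_latency_s: float   # recent average latency in seconds
    success_rate: float    # 0..1, derived from logged request history


def score(p: ProviderStats, max_cost: float, max_latency: float) -> float:
    """Weighted sum in which cheaper and faster providers score higher."""
    cost_term = 1.0 - min(p.cost_per_1k / max_cost, 1.0)
    latency_term = 1.0 - min(p.avg_latency_s / max_latency, 1.0)
    # Illustrative weights; a real router would likely read these from config.
    return 0.4 * p.task_fit + 0.2 * cost_term + 0.2 * latency_term + 0.2 * p.success_rate


def rank(providers: list[ProviderStats]) -> list[str]:
    """Return provider names ordered best-first."""
    # Fall back to 1.0 so an all-zero column (e.g. free local models) cannot divide by zero.
    max_cost = max(p.cost_per_1k for p in providers) or 1.0
    max_latency = max(p.avg_latency_s for p in providers) or 1.0
    ordered = sorted(providers, key=lambda p: score(p, max_cost, max_latency), reverse=True)
    return [p.name for p in ordered]
```

The ordered list doubles as the failover order: if the first provider fails, the next name in the list is the "next-best" candidate.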
|
100
|
+
---
|
|
101
|
+
|
|
102
|
+
## 🚀 Key Features & Solved Production Failure Modes
|
|
103
|
+
|
|
104
|
+
### 1. Resilient Failover & Registry
|
|
105
|
+
- **Problem**: OpenAI/Anthropic downtime or rate limiting crashes your application.
|
|
106
|
+
- **Solution**: A live health registry (`healthy`, `degraded`, `down`) updates via periodic pings. On failure, AetherRoute performs a **hot-failover** to the next-best provider seamlessly.
|
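The failover behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not AetherRoute's implementation: `HEALTH`, `ProviderError`, `call_provider`, and `query_with_failover` are hypothetical names, and the provider call is a stub.

```python
import asyncio

# Hypothetical registry using the three statuses named above.
HEALTH = {"openai": "down", "anthropic": "degraded", "mistral": "healthy"}
USABLE = {"healthy", "degraded"}  # providers marked "down" are skipped entirely


class ProviderError(Exception):
    """Raised by a provider call that fails (illustrative)."""


async def call_provider(name: str, prompt: str) -> str:
    # Stand-in for a real SDK call; "anthropic" fails here to exercise failover.
    if name == "anthropic":
        raise ProviderError(f"{name} timed out")
    return f"[{name}] reply to: {prompt}"


async def query_with_failover(ranked: list[str], prompt: str) -> str:
    """Try providers in ranked order, skipping any that are marked down."""
    last_error = None
    for name in ranked:
        if HEALTH.get(name) not in USABLE:
            continue
        try:
            return await call_provider(name, prompt)
        except ProviderError as exc:
            HEALTH[name] = "down"  # record the failure so later calls skip it
            last_error = exc
    raise RuntimeError("all providers failed") from last_error
```

Calling `asyncio.run(query_with_failover(["openai", "anthropic", "mistral"], "hi"))` skips `openai`, fails on `anthropic`, and returns the `mistral` reply.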
|
107
|
+
|
|
108
|
+
### 2. Prompt Normalization
|
|
109
|
+
- **Problem**: Providers expect different message structures (Anthropic alternating roles, OpenAI permissive).
|
|
110
|
+
- **Solution**: Unified prompt format dynamically mapped to each provider's API requirements.
|
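To make the normalization idea concrete, here is a small sketch that maps one unified message list to two payload shapes. The function names are hypothetical and the real adapter in `adapters/prompt.py` may work differently. The Anthropic-style shape assumes a top-level `system` field and strictly alternating turns, as the problem statement above suggests.

```python
def to_openai(messages: list[dict]) -> dict:
    """OpenAI-style chat payload: system messages stay inline."""
    return {"messages": [dict(m) for m in messages]}


def to_anthropic(messages: list[dict]) -> dict:
    """Anthropic-style payload: system text is hoisted out of the message
    list, and consecutive same-role turns are merged so that roles alternate."""
    system_parts = [m["content"] for m in messages if m["role"] == "system"]
    turns: list[dict] = []
    for m in messages:
        if m["role"] == "system":
            continue
        if turns and turns[-1]["role"] == m["role"]:
            # Merge into the previous turn instead of emitting two in a row.
            turns[-1]["content"] += "\n\n" + m["content"]
        else:
            turns.append({"role": m["role"], "content": m["content"]})
    payload = {"messages": turns}
    if system_parts:
        payload["system"] = "\n\n".join(system_parts)
    return payload
```

Keeping the unified format as plain role/content dictionaries means each provider adapter only needs a pure function like these, which is easy to unit-test offline.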
|
111
|
+
|
|
112
|
+
### 3. Real-Time Cost Governor
|
|
113
|
+
- **Problem**: Recursive loops and heavy context cause runaway API bills.
|
|
114
|
+
- **Solution**: Hard per-request and per-session cost ceilings enforced against live token counts. Blocked calls raise `CostLimitExceeded`.
|
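A rough sketch of how per-request and per-session ceilings might be enforced is shown below. Only the exception name `CostLimitExceeded` comes from the text above; the `CostGovernor` class, its parameters, and the pricing model are assumptions made for illustration.

```python
class CostLimitExceeded(Exception):
    """Same name as the exception mentioned above; this definition is illustrative."""


class CostGovernor:
    def __init__(self, per_request_usd: float, per_session_usd: float):
        self.per_request_usd = per_request_usd
        self.per_session_usd = per_session_usd
        self.session_spend: dict[str, float] = {}

    def check(self, session_id: str, tokens: int, usd_per_1k_tokens: float) -> float:
        """Estimate the call's cost and raise before it would breach a ceiling."""
        estimate = tokens / 1000 * usd_per_1k_tokens
        if estimate > self.per_request_usd:
            raise CostLimitExceeded(f"request estimate ${estimate:.4f} exceeds the per-request cap")
        spent = self.session_spend.get(session_id, 0.0)
        if spent + estimate > self.per_session_usd:
            raise CostLimitExceeded(f"session {session_id!r} would exceed its cap")
        # Only record spend for calls that are allowed through.
        self.session_spend[session_id] = spent + estimate
        return estimate
```

Checking before the provider call, rather than after, is what makes the ceiling "hard": a blocked request never reaches the API.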
|
115
|
+
|
|
116
|
+
### 4. Sliding Context Curation
|
|
117
|
+
- **Problem**: Full chat history causes token overflows and irrelevant retrieval.
|
|
118
|
+
- **Solution**: TF-IDF cosine relevance ranking prunes old messages, and cheap models asynchronously summarize older context.
|
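The relevance-ranking step can be illustrated with a standard-library TF-IDF and cosine similarity. This is a sketch only: the tokenizer, the smoothed IDF formula, and the `top_k_relevant` helper are assumptions, and the summarization half of the feature is not shown.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs: list[str]) -> list[dict[str, float]]:
    """Build one sparse TF-IDF vector per document (smoothed IDF)."""
    tokenized = [tokenize(d) for d in docs]
    df = Counter(term for toks in tokenized for term in set(toks))
    n = len(docs)
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({
            t: (c / len(toks)) * (math.log((1 + n) / (1 + df[t])) + 1.0)
            for t, c in tf.items()
        })
    return vectors


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def top_k_relevant(history: list[str], query: str, k: int) -> list[str]:
    """Keep the k history messages most similar to the query, in original order."""
    *hist_vecs, query_vec = tfidf_vectors(history + [query])
    by_score = sorted(range(len(history)), key=lambda i: cosine(hist_vecs[i], query_vec), reverse=True)
    return [history[i] for i in sorted(by_score[:k])]
```

Returning the survivors in their original order keeps the conversation readable for the model even after low-relevance turns are pruned.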
|
119
|
+
|
|
120
|
+
### 5. Output Validation & Repair Loops
|
|
121
|
+
- **Problem**: LLMs return invalid JSON or miss required fields.
|
|
122
|
+
- **Solution**: Pydantic schema validation with automatic re-prompting on failure (up to 3 retries with error detail injected into the repair prompt).
|
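The repair loop can be sketched as below. To keep the example runnable with only the standard library, a hand-written required-field check stands in for Pydantic validation. `validate_ticket`, `query_with_repair`, and the repair-prompt wording are all hypothetical.

```python
import json


def validate_ticket(raw: str) -> dict:
    """Stand-in for Pydantic validation: parse JSON and check required fields."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    missing = [f for f in ("ticket_id", "urgency") if f not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return data


def query_with_repair(call_model, prompt: str, max_retries: int = 3) -> dict:
    """Call the model; on validation failure, re-prompt with the error text."""
    attempt_prompt = prompt
    last_error = None
    for _ in range(1 + max_retries):  # one initial attempt plus the retries
        raw = call_model(attempt_prompt)
        try:
            return validate_ticket(raw)
        except ValueError as exc:
            last_error = exc
            # Inject the validation error so the model can correct itself.
            attempt_prompt = (
                f"{prompt}\n\nYour previous reply was invalid: {exc}\n"
                "Reply again with valid JSON only."
            )
    raise ValueError(f"validation failed after {max_retries} retries") from last_error
```

`call_model` is any callable that takes a prompt string and returns the model's raw text, which makes the loop easy to exercise with a fake in tests.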
|
123
|
+
|
|
124
|
+
### 6. Security Input Sanitization
|
|
125
|
+
- **Problem**: Prompt injection scripts bypass rules and leak system prompts.
|
|
126
|
+
- **Solution**: Regex-based injection detection + RBAC permission guards raise `SecurityBlockError`/`PermissionDeniedError`.
|
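As an illustration of regex-based injection detection, the sketch below blocks input matching a short pattern list. The exception name `SecurityBlockError` appears in the text above, but the patterns and the `sanitize` function are assumptions; the package's actual rule set is not shown in this diff, and the RBAC guard is omitted.

```python
import re


class SecurityBlockError(Exception):
    """Same name as the exception mentioned above; this definition is illustrative."""


# Example patterns only; a production list would be broader and regularly reviewed.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?(previous|prior|above)\s+instructions", re.I),
    re.compile(r"(reveal|print|show)\s+(your\s+)?system\s+prompt", re.I),
    re.compile(r"you\s+are\s+now\s+in\s+developer\s+mode", re.I),
]


def sanitize(user_input: str) -> str:
    """Return the input unchanged, or raise if it matches a known pattern."""
    for pattern in INJECTION_PATTERNS:
        if pattern.search(user_input):
            raise SecurityBlockError(f"blocked by pattern: {pattern.pattern}")
    return user_input
```

Pattern matching of this kind catches only known phrasings, so it is best treated as a first filter rather than a complete defence.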
|
127
|
+
|
|
128
|
+
### 7. Fuzzy Semantic Cache
|
|
129
|
+
- **Problem**: Semantically identical repeated queries waste money and increase latency.
|
|
130
|
+
- **Solution**: Async Redis cache (in-memory fallback) returns hits for both exact and semantically similar queries using cosine similarity.
|
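A toy in-memory version of the exact-plus-similar lookup might look like this. The bag-of-words `embed` function, the `SemanticCache` class, and the 0.8 threshold are assumptions; the diff does not show which vectorization or threshold AetherRoute uses, and the Redis backend is left out.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words embedding; a real cache would likely use a stronger one."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)  # Counter returns 0 for missing terms
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class SemanticCache:
    def __init__(self, threshold: float = 0.8):
        self.threshold = threshold
        self.entries: list[tuple[str, Counter, str]] = []

    def get(self, query: str):
        """Return a cached response for an exact or sufficiently similar query."""
        q = embed(query)
        best_score, best_response = 0.0, None
        for text, vec, response in self.entries:
            if text == query:
                return response  # exact hit
            s = cosine(q, vec)
            if s > best_score:
                best_score, best_response = s, response
        return best_response if best_score >= self.threshold else None

    def put(self, query: str, response: str) -> None:
        self.entries.append((query, embed(query), response))
```

The threshold trades hit rate against the risk of returning an answer to a different question, so it is worth tuning against real traffic.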
|
131
|
+
|
|
132
|
+
### 8. CLI Observability Report
|
|
133
|
+
- **Problem**: Black-box execution blocks debugging.
|
|
134
|
+
- **Solution**: Every routing decision, latency, cost, and validation retry is persisted in SQLite and rendered as a beautiful terminal report via `aetherroute-report`.
|
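The persistence side of this feature can be sketched with the standard `sqlite3` module. The table layout and the helper names below are hypothetical and are not the schema used by `observability/logger.py`.

```python
import sqlite3


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    """Create the (hypothetical) requests table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS requests (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               session_id TEXT, provider TEXT, category TEXT,
               latency_ms REAL, cost_usd REAL, retries INTEGER, success INTEGER
           )"""
    )
    return conn


def log_request(conn, session_id, provider, category, latency_ms, cost_usd, retries, success):
    """Insert one request row; parameters are bound, never interpolated."""
    conn.execute(
        "INSERT INTO requests (session_id, provider, category, latency_ms, cost_usd, retries, success)"
        " VALUES (?, ?, ?, ?, ?, ?, ?)",
        (session_id, provider, category, latency_ms, cost_usd, retries, int(success)),
    )
    conn.commit()


def provider_summary(conn):
    """Per-provider request count, total cost, and average latency."""
    return conn.execute(
        "SELECT provider, COUNT(*), SUM(cost_usd), AVG(latency_ms)"
        " FROM requests GROUP BY provider ORDER BY provider"
    ).fetchall()
```

An aggregate query like `provider_summary` is the kind of data a provider routing table in a terminal report would be built from.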
|
135
|
+
|
|
136
|
+
---
|
|
137
|
+
|
|
138
|
+
## 📦 Installation
|
|
139
|
+
|
|
140
|
+
### Core (no provider SDK)
|
|
141
|
+
```bash
|
|
142
|
+
pip install aetherroute
|
|
143
|
+
```
|
|
144
|
+
|
|
145
|
+
### With specific providers
|
```bash
# Quote the extras so shells such as zsh do not treat the brackets as a glob
pip install "aetherroute[openai]"      # OpenAI only
pip install "aetherroute[anthropic]"   # Anthropic Claude only
pip install "aetherroute[mistral]"     # Mistral AI only
pip install "aetherroute[all]"         # All providers
```
|
152
|
+
|
|
153
|
+
### From source
|
|
154
|
+
```bash
|
|
155
|
+
git clone https://github.com/mithunbarath/aetherroute.git
|
|
156
|
+
cd aetherroute
|
|
157
|
+
pip install -e ".[all,dev]"
|
|
158
|
+
```
|
|
159
|
+
|
|
160
|
+
---
|
|
161
|
+
|
|
162
|
+
## ⚙️ Configuration
|
|
163
|
+
|
|
164
|
+
Set API keys as environment variables (all optional — AetherRoute runs in mock mode without them):
|
|
165
|
+
|
|
166
|
+
```bash
|
|
167
|
+
# Windows CMD
|
|
168
|
+
set OPENAI_API_KEY=your-openai-key
|
|
169
|
+
set ANTHROPIC_API_KEY=your-anthropic-key
|
|
170
|
+
set MISTRAL_API_KEY=your-mistral-key
|
|
171
|
+
|
|
172
|
+
# Linux / macOS
|
|
173
|
+
export OPENAI_API_KEY=your-openai-key
|
|
174
|
+
export ANTHROPIC_API_KEY=your-anthropic-key
|
|
175
|
+
export MISTRAL_API_KEY=your-mistral-key
|
|
176
|
+
```
|
|
177
|
+
|
|
178
|
+
Or edit `config.yaml` directly in your project directory.
|
|
179
|
+
|
|
180
|
+
---
|
|
181
|
+
|
|
182
|
+
## ⚡ Quick Start
|
|
183
|
+
|
```python
import asyncio
from aetherroute.orchestrator import AetherRouteOrchestrator

async def main():
    orchestrator = AetherRouteOrchestrator()
    await orchestrator.start()

    response = await orchestrator.query(
        messages=[{"role": "user", "content": "Summarise the history of the Roman Empire."}],
        query="Summarise the history of the Roman Empire.",
        session_id="my-session"
    )
    print(response["text"])
    await orchestrator.close()

asyncio.run(main())
```
|
202
|
+
|
|
203
|
+
### Structured Output (with Pydantic)
|
|
204
|
+
|
```python
import asyncio

from pydantic import BaseModel, Field

from aetherroute.orchestrator import AetherRouteOrchestrator


class Ticket(BaseModel):
    ticket_id: int
    sentiment: str = Field(description="positive, neutral, or negative")
    urgency: str = Field(description="high, medium, or low")
    summary: str


async def main():
    # `await` is only valid inside a coroutine, so the call is wrapped in main().
    orchestrator = AetherRouteOrchestrator()
    await orchestrator.start()

    # `ticket` is the validated Ticket; `raw` is the second value returned alongside it.
    ticket, raw = await orchestrator.query_structured(
        messages=[{"role": "user", "content": "Ticket #42: customer is very angry, system is down."}],
        query="Extract ticket info.",
        response_model=Ticket,
        session_id="my-session"
    )
    print(ticket.urgency)  # e.g. "high"
    await orchestrator.close()


asyncio.run(main())
```
|
222
|
+
|
|
223
|
+
---
|
|
224
|
+
|
|
225
|
+
## 🎬 Running the Demo
|
|
226
|
+
|
|
227
|
+
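The `demo.py` script showcases **all 6 scenarios** using mock providers (runs fully offline). It is not among the files listed for this 0.1.0 archive, so it may need to be obtained from the source repository: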
|
|
228
|
+
|
|
229
|
+
```bash
|
|
230
|
+
python demo.py
|
|
231
|
+
```
|
|
232
|
+
|
|
233
|
+
Demonstrates:
|
|
234
|
+
- Routing decisions based on query classification
|
|
235
|
+
- Provider outage hot-failover
|
|
236
|
+
- Output validator repair loops
|
|
237
|
+
- Security injection blocking
|
|
238
|
+
- Session cost governor enforcement
|
|
239
|
+
- Fuzzy semantic cache hits
|
|
240
|
+
|
|
241
|
+
---
|
|
242
|
+
|
|
243
|
+
## 📊 CLI Observability Report
|
|
244
|
+
|
|
245
|
+
View routing analytics, cost breakdowns, and request traces directly in your terminal:
|
|
246
|
+
|
|
247
|
+
```bash
|
|
248
|
+
python -m aetherroute.observability.report
|
|
249
|
+
|
|
250
|
+
# Or if installed via pip:
|
|
251
|
+
aetherroute-report
|
|
252
|
+
|
|
253
|
+
# Options:
|
|
254
|
+
# --db PATH SQLite database path (default: aetherroute.db)
|
|
255
|
+
# --traces N Number of recent traces to display (default: 15)
|
|
256
|
+
aetherroute-report --db aetherroute.db --traces 25
|
|
257
|
+
```
|
|
258
|
+
|
|
259
|
+
The report shows:
|
|
260
|
+
- **Summary panel** — total requests, cost, average latency, success rate
|
|
261
|
+
- **Provider routing table** — request count, total cost, avg latency per provider
|
|
262
|
+
- **Task category breakdown** — coding, summarization, reasoning, creative, extraction
|
|
263
|
+
- **Session cost totals** — cost aggregated per session ID
|
|
264
|
+
- **Request trace table** — last N requests with colour-coded success/fail indicators
|
|
265
|
+
|
|
266
|
+
---
|
|
267
|
+
|
|
268
|
+
## 🧪 Running Tests
|
|
269
|
+
|
|
270
|
+
```bash
|
|
271
|
+
pip install "aetherroute[dev]"
|
|
272
|
+
pytest tests/ -v
|
|
273
|
+
```
|
|
274
|
+
|
|
275
|
+
---
|
|
276
|
+
|
|
277
|
+
## 📁 Project Structure
|
|
278
|
+
|
```
aetherroute/
├── adapters/           # Prompt normalization for each provider
├── cache/              # Semantic + exact caching (Redis/in-memory)
├── config.py           # YAML config loader (AetherRouteConfig)
├── context/            # Context curation (TF-IDF, summarization)
├── cost/               # Cost governor & SQLite transaction logging
├── observability/
│   ├── logger.py       # SQLite request logger
│   └── report.py       # Rich CLI observability report
├── orchestrator.py     # Main entry-point: AetherRouteOrchestrator
├── providers/          # OpenAI, Anthropic, Mistral, Ollama wrappers
├── router/             # Task classifier & routing engine
├── security/           # Input sanitizer & RBAC permission guard
└── validation/         # Pydantic validator & self-consistency checker
```
|
295
|
+
|
|
296
|
+
---
|
|
297
|
+
|
|
298
|
+
## 🤝 Contributing
|
|
299
|
+
|
|
300
|
+
Contributions, issues, and feature requests are welcome!
|
|
301
|
+
|
|
302
|
+
1. Fork the repository
|
|
303
|
+
2. Create a feature branch: `git checkout -b feature/my-feature`
|
|
304
|
+
3. Commit your changes: `git commit -m "Add my feature"`
|
|
305
|
+
4. Push to the branch: `git push origin feature/my-feature`
|
|
306
|
+
5. Open a Pull Request
|
|
307
|
+
|
|
308
|
+
Please ensure all tests pass (`pytest tests/`) and update the `CHANGELOG.md`.
|
|
309
|
+
|
|
310
|
+
---
|
|
311
|
+
|
|
312
|
+
## 👤 Author
|
|
313
|
+
|
|
314
|
+
**Mithun Barath M R**
|
|
315
|
+
- 📧 [barathmithun1548@gmail.com](mailto:barathmithun1548@gmail.com)
|
|
316
|
+
- 💼 [LinkedIn](https://www.linkedin.com/in/mithunbarathmr13/)
|
|
317
|
+
- 🐙 [GitHub](https://github.com/mithunbarath)
|
|
318
|
+
- 🏥 Co-founder, MedClara — Healthcare AI Automation
|
|
319
|
+
|
|
320
|
+
---
|
|
321
|
+
|
|
322
|
+
## 📄 License
|
|
323
|
+
|
|
324
|
+
This project is licensed under the **MIT License** — see [LICENSE](LICENSE) for details.
|
|
@@ -0,0 +1,255 @@
|
|
|
1
|
+
# AetherRoute ⚡
|
|
2
|
+
|
|
3
|
+
[PyPI](https://pypi.org/project/aetherroute/)
|
|
4
|
+
[Python](https://www.python.org/downloads/)
|
|
5
|
+
[License](LICENSE)
|
|
6
|
+
[GitHub](https://github.com/mithunbarath/aetherroute)
|
|
7
|
+
|
|
8
|
+
**AetherRoute** is a production-grade multi-provider LLM orchestration, validation, and routing framework. It is designed to mitigate critical LLM failure modes — provider outages, cost spikes, prompt injection attacks, structured formatting errors, and token window overflows — in professional environments.
|
|
9
|
+
|
```
Request ──► Input Sanitizer ──► Cache Lookup (Exact/Semantic)
                                       │
  ┌────────────────────────────────────┘ (Cache Miss)
  ▼
Context Curation (Sliding Window & Summarization)
  │
  ▼
Scoring & Decision Router (Task Fit, Cost, Latency, SQLite History)
  │
  ▼
Provider Pool (OpenAI, Anthropic Claude, Mistral, Ollama)
  │
  ├──► Success ──► Pydantic Validator ──► DB Logging ──► Response
  │                        │ (Validation Error)
  │                        ▼ (Up to 3 Retries)
  │              Repair Re-prompt Loop
  │
  └──► Failure ──► Hot Failover (Try next-best provider)
```
|
30
|
+
|
|
31
|
+
---
|
|
32
|
+
|
|
33
|
+
## 🚀 Key Features & Solved Production Failure Modes
|
|
34
|
+
|
|
35
|
+
### 1. Resilient Failover & Registry
|
|
36
|
+
- **Problem**: OpenAI/Anthropic downtime or rate limiting crashes your application.
|
|
37
|
+
- **Solution**: A live health registry (`healthy`, `degraded`, `down`) updates via periodic pings. On failure, AetherRoute performs a **hot-failover** to the next-best provider seamlessly.
|
|
38
|
+
|
|
39
|
+
### 2. Prompt Normalization
|
|
40
|
+
- **Problem**: Providers expect different message structures (Anthropic alternating roles, OpenAI permissive).
|
|
41
|
+
- **Solution**: Unified prompt format dynamically mapped to each provider's API requirements.
|
|
42
|
+
|
|
43
|
+
### 3. Real-Time Cost Governor
|
|
44
|
+
- **Problem**: Recursive loops and heavy context cause runaway API bills.
|
|
45
|
+
- **Solution**: Hard per-request and per-session cost ceilings enforced against live token counts. Blocked calls raise `CostLimitExceeded`.
|
|
46
|
+
|
|
47
|
+
### 4. Sliding Context Curation
|
|
48
|
+
- **Problem**: Full chat history causes token overflows and irrelevant retrieval.
|
|
49
|
+
- **Solution**: TF-IDF cosine relevance ranking prunes old messages, and cheap models asynchronously summarize older context.
|
|
50
|
+
|
|
51
|
+
### 5. Output Validation & Repair Loops
|
|
52
|
+
- **Problem**: LLMs return invalid JSON or miss required fields.
|
|
53
|
+
- **Solution**: Pydantic schema validation with automatic re-prompting on failure (up to 3 retries with error detail injected into the repair prompt).
|
|
54
|
+
|
|
55
|
+
### 6. Security Input Sanitization
|
|
56
|
+
- **Problem**: Prompt injection scripts bypass rules and leak system prompts.
|
|
57
|
+
- **Solution**: Regex-based injection detection + RBAC permission guards raise `SecurityBlockError`/`PermissionDeniedError`.
|
|
58
|
+
|
|
59
|
+
### 7. Fuzzy Semantic Cache
|
|
60
|
+
- **Problem**: Semantically identical repeated queries waste money and increase latency.
|
|
61
|
+
- **Solution**: Async Redis cache (in-memory fallback) returns hits for both exact and semantically similar queries using cosine similarity.
|
|
62
|
+
|
|
63
|
+
### 8. CLI Observability Report
|
|
64
|
+
- **Problem**: Black-box execution blocks debugging.
|
|
65
|
+
- **Solution**: Every routing decision, latency, cost, and validation retry is persisted in SQLite and rendered as a beautiful terminal report via `aetherroute-report`.
|
|
66
|
+
|
|
67
|
+
---
|
|
68
|
+
|
|
69
|
+
## 📦 Installation
|
|
70
|
+
|
|
71
|
+
### Core (no provider SDK)
|
|
72
|
+
```bash
|
|
73
|
+
pip install aetherroute
|
|
74
|
+
```
|
|
75
|
+
|
|
76
|
+
### With specific providers
|
```bash
# Quote the extras so shells such as zsh do not treat the brackets as a glob
pip install "aetherroute[openai]"      # OpenAI only
pip install "aetherroute[anthropic]"   # Anthropic Claude only
pip install "aetherroute[mistral]"     # Mistral AI only
pip install "aetherroute[all]"         # All providers
```
|
83
|
+
|
|
84
|
+
### From source
|
|
85
|
+
```bash
|
|
86
|
+
git clone https://github.com/mithunbarath/aetherroute.git
|
|
87
|
+
cd aetherroute
|
|
88
|
+
pip install -e ".[all,dev]"
|
|
89
|
+
```
|
|
90
|
+
|
|
91
|
+
---
|
|
92
|
+
|
|
93
|
+
## ⚙️ Configuration
|
|
94
|
+
|
|
95
|
+
Set API keys as environment variables (all optional — AetherRoute runs in mock mode without them):
|
|
96
|
+
|
|
97
|
+
```bash
|
|
98
|
+
# Windows CMD
|
|
99
|
+
set OPENAI_API_KEY=your-openai-key
|
|
100
|
+
set ANTHROPIC_API_KEY=your-anthropic-key
|
|
101
|
+
set MISTRAL_API_KEY=your-mistral-key
|
|
102
|
+
|
|
103
|
+
# Linux / macOS
|
|
104
|
+
export OPENAI_API_KEY=your-openai-key
|
|
105
|
+
export ANTHROPIC_API_KEY=your-anthropic-key
|
|
106
|
+
export MISTRAL_API_KEY=your-mistral-key
|
|
107
|
+
```
|
|
108
|
+
|
|
109
|
+
Or edit `config.yaml` directly in your project directory.
|
|
110
|
+
|
|
111
|
+
---
|
|
112
|
+
|
|
113
|
+
## ⚡ Quick Start
|
|
114
|
+
|
```python
import asyncio
from aetherroute.orchestrator import AetherRouteOrchestrator

async def main():
    orchestrator = AetherRouteOrchestrator()
    await orchestrator.start()

    response = await orchestrator.query(
        messages=[{"role": "user", "content": "Summarise the history of the Roman Empire."}],
        query="Summarise the history of the Roman Empire.",
        session_id="my-session"
    )
    print(response["text"])
    await orchestrator.close()

asyncio.run(main())
```
|
133
|
+
|
|
134
|
+
### Structured Output (with Pydantic)
|
|
135
|
+
|
```python
import asyncio

from pydantic import BaseModel, Field

from aetherroute.orchestrator import AetherRouteOrchestrator


class Ticket(BaseModel):
    ticket_id: int
    sentiment: str = Field(description="positive, neutral, or negative")
    urgency: str = Field(description="high, medium, or low")
    summary: str


async def main():
    # `await` is only valid inside a coroutine, so the call is wrapped in main().
    orchestrator = AetherRouteOrchestrator()
    await orchestrator.start()

    # `ticket` is the validated Ticket; `raw` is the second value returned alongside it.
    ticket, raw = await orchestrator.query_structured(
        messages=[{"role": "user", "content": "Ticket #42: customer is very angry, system is down."}],
        query="Extract ticket info.",
        response_model=Ticket,
        session_id="my-session"
    )
    print(ticket.urgency)  # e.g. "high"
    await orchestrator.close()


asyncio.run(main())
```
|
153
|
+
|
|
154
|
+
---
|
|
155
|
+
|
|
156
|
+
## 🎬 Running the Demo
|
|
157
|
+
|
|
158
|
+
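The `demo.py` script showcases **all 6 scenarios** using mock providers (runs fully offline). It is not among the files listed for this 0.1.0 archive, so it may need to be obtained from the source repository: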
|
|
159
|
+
|
|
160
|
+
```bash
|
|
161
|
+
python demo.py
|
|
162
|
+
```
|
|
163
|
+
|
|
164
|
+
Demonstrates:
|
|
165
|
+
- Routing decisions based on query classification
|
|
166
|
+
- Provider outage hot-failover
|
|
167
|
+
- Output validator repair loops
|
|
168
|
+
- Security injection blocking
|
|
169
|
+
- Session cost governor enforcement
|
|
170
|
+
- Fuzzy semantic cache hits
|
|
171
|
+
|
|
172
|
+
---
|
|
173
|
+
|
|
174
|
+
## 📊 CLI Observability Report
|
|
175
|
+
|
|
176
|
+
View routing analytics, cost breakdowns, and request traces directly in your terminal:
|
|
177
|
+
|
|
178
|
+
```bash
|
|
179
|
+
python -m aetherroute.observability.report
|
|
180
|
+
|
|
181
|
+
# Or if installed via pip:
|
|
182
|
+
aetherroute-report
|
|
183
|
+
|
|
184
|
+
# Options:
|
|
185
|
+
# --db PATH SQLite database path (default: aetherroute.db)
|
|
186
|
+
# --traces N Number of recent traces to display (default: 15)
|
|
187
|
+
aetherroute-report --db aetherroute.db --traces 25
|
|
188
|
+
```
|
|
189
|
+
|
|
190
|
+
The report shows:
|
|
191
|
+
- **Summary panel** — total requests, cost, average latency, success rate
|
|
192
|
+
- **Provider routing table** — request count, total cost, avg latency per provider
|
|
193
|
+
- **Task category breakdown** — coding, summarization, reasoning, creative, extraction
|
|
194
|
+
- **Session cost totals** — cost aggregated per session ID
|
|
195
|
+
- **Request trace table** — last N requests with colour-coded success/fail indicators
|
|
196
|
+
|
|
197
|
+
---
|
|
198
|
+
|
|
199
|
+
## 🧪 Running Tests
|
|
200
|
+
|
|
201
|
+
```bash
|
|
202
|
+
pip install "aetherroute[dev]"
|
|
203
|
+
pytest tests/ -v
|
|
204
|
+
```
|
|
205
|
+
|
|
206
|
+
---
|
|
207
|
+
|
|
208
|
+
## 📁 Project Structure
|
|
209
|
+
|
```
aetherroute/
├── adapters/           # Prompt normalization for each provider
├── cache/              # Semantic + exact caching (Redis/in-memory)
├── config.py           # YAML config loader (AetherRouteConfig)
├── context/            # Context curation (TF-IDF, summarization)
├── cost/               # Cost governor & SQLite transaction logging
├── observability/
│   ├── logger.py       # SQLite request logger
│   └── report.py       # Rich CLI observability report
├── orchestrator.py     # Main entry-point: AetherRouteOrchestrator
├── providers/          # OpenAI, Anthropic, Mistral, Ollama wrappers
├── router/             # Task classifier & routing engine
├── security/           # Input sanitizer & RBAC permission guard
└── validation/         # Pydantic validator & self-consistency checker
```
|
226
|
+
|
|
227
|
+
---
|
|
228
|
+
|
|
229
|
+
## 🤝 Contributing
|
|
230
|
+
|
|
231
|
+
Contributions, issues, and feature requests are welcome!
|
|
232
|
+
|
|
233
|
+
1. Fork the repository
|
|
234
|
+
2. Create a feature branch: `git checkout -b feature/my-feature`
|
|
235
|
+
3. Commit your changes: `git commit -m "Add my feature"`
|
|
236
|
+
4. Push to the branch: `git push origin feature/my-feature`
|
|
237
|
+
5. Open a Pull Request
|
|
238
|
+
|
|
239
|
+
Please ensure all tests pass (`pytest tests/`) and update the `CHANGELOG.md`.
|
|
240
|
+
|
|
241
|
+
---
|
|
242
|
+
|
|
243
|
+
## 👤 Author
|
|
244
|
+
|
|
245
|
+
**Mithun Barath M R**
|
|
246
|
+
- 📧 [barathmithun1548@gmail.com](mailto:barathmithun1548@gmail.com)
|
|
247
|
+
- 💼 [LinkedIn](https://www.linkedin.com/in/mithunbarathmr13/)
|
|
248
|
+
- 🐙 [GitHub](https://github.com/mithunbarath)
|
|
249
|
+
- 🏥 Co-founder, MedClara — Healthcare AI Automation
|
|
250
|
+
|
|
251
|
+
---
|
|
252
|
+
|
|
253
|
+
## 📄 License
|
|
254
|
+
|
|
255
|
+
This project is licensed under the **MIT License** — see [LICENSE](LICENSE) for details.
|