ai-lib-python 0.5.0__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (84)
  1. ai_lib_python/__init__.py +43 -0
  2. ai_lib_python/batch/__init__.py +15 -0
  3. ai_lib_python/batch/collector.py +244 -0
  4. ai_lib_python/batch/executor.py +224 -0
  5. ai_lib_python/cache/__init__.py +26 -0
  6. ai_lib_python/cache/backends.py +380 -0
  7. ai_lib_python/cache/key.py +237 -0
  8. ai_lib_python/cache/manager.py +332 -0
  9. ai_lib_python/client/__init__.py +37 -0
  10. ai_lib_python/client/builder.py +528 -0
  11. ai_lib_python/client/cancel.py +368 -0
  12. ai_lib_python/client/core.py +433 -0
  13. ai_lib_python/client/response.py +134 -0
  14. ai_lib_python/embeddings/__init__.py +36 -0
  15. ai_lib_python/embeddings/client.py +339 -0
  16. ai_lib_python/embeddings/types.py +234 -0
  17. ai_lib_python/embeddings/vectors.py +246 -0
  18. ai_lib_python/errors/__init__.py +41 -0
  19. ai_lib_python/errors/base.py +316 -0
  20. ai_lib_python/errors/classification.py +210 -0
  21. ai_lib_python/guardrails/__init__.py +35 -0
  22. ai_lib_python/guardrails/base.py +336 -0
  23. ai_lib_python/guardrails/filters.py +583 -0
  24. ai_lib_python/guardrails/validators.py +475 -0
  25. ai_lib_python/pipeline/__init__.py +55 -0
  26. ai_lib_python/pipeline/accumulate.py +248 -0
  27. ai_lib_python/pipeline/base.py +240 -0
  28. ai_lib_python/pipeline/decode.py +281 -0
  29. ai_lib_python/pipeline/event_map.py +506 -0
  30. ai_lib_python/pipeline/fan_out.py +284 -0
  31. ai_lib_python/pipeline/select.py +297 -0
  32. ai_lib_python/plugins/__init__.py +32 -0
  33. ai_lib_python/plugins/base.py +294 -0
  34. ai_lib_python/plugins/hooks.py +296 -0
  35. ai_lib_python/plugins/middleware.py +285 -0
  36. ai_lib_python/plugins/registry.py +294 -0
  37. ai_lib_python/protocol/__init__.py +71 -0
  38. ai_lib_python/protocol/loader.py +317 -0
  39. ai_lib_python/protocol/manifest.py +385 -0
  40. ai_lib_python/protocol/validator.py +460 -0
  41. ai_lib_python/py.typed +1 -0
  42. ai_lib_python/resilience/__init__.py +102 -0
  43. ai_lib_python/resilience/backpressure.py +225 -0
  44. ai_lib_python/resilience/circuit_breaker.py +318 -0
  45. ai_lib_python/resilience/executor.py +343 -0
  46. ai_lib_python/resilience/fallback.py +341 -0
  47. ai_lib_python/resilience/preflight.py +413 -0
  48. ai_lib_python/resilience/rate_limiter.py +291 -0
  49. ai_lib_python/resilience/retry.py +299 -0
  50. ai_lib_python/resilience/signals.py +283 -0
  51. ai_lib_python/routing/__init__.py +118 -0
  52. ai_lib_python/routing/manager.py +593 -0
  53. ai_lib_python/routing/strategy.py +345 -0
  54. ai_lib_python/routing/types.py +397 -0
  55. ai_lib_python/structured/__init__.py +33 -0
  56. ai_lib_python/structured/json_mode.py +281 -0
  57. ai_lib_python/structured/schema.py +316 -0
  58. ai_lib_python/structured/validator.py +334 -0
  59. ai_lib_python/telemetry/__init__.py +127 -0
  60. ai_lib_python/telemetry/exporters/__init__.py +9 -0
  61. ai_lib_python/telemetry/exporters/prometheus.py +111 -0
  62. ai_lib_python/telemetry/feedback.py +446 -0
  63. ai_lib_python/telemetry/health.py +409 -0
  64. ai_lib_python/telemetry/logger.py +389 -0
  65. ai_lib_python/telemetry/metrics.py +496 -0
  66. ai_lib_python/telemetry/tracer.py +473 -0
  67. ai_lib_python/tokens/__init__.py +25 -0
  68. ai_lib_python/tokens/counter.py +282 -0
  69. ai_lib_python/tokens/estimator.py +286 -0
  70. ai_lib_python/transport/__init__.py +34 -0
  71. ai_lib_python/transport/auth.py +141 -0
  72. ai_lib_python/transport/http.py +364 -0
  73. ai_lib_python/transport/pool.py +425 -0
  74. ai_lib_python/types/__init__.py +41 -0
  75. ai_lib_python/types/events.py +343 -0
  76. ai_lib_python/types/message.py +332 -0
  77. ai_lib_python/types/tool.py +191 -0
  78. ai_lib_python/utils/__init__.py +21 -0
  79. ai_lib_python/utils/tool_call_assembler.py +317 -0
  80. ai_lib_python-0.5.0.dist-info/METADATA +837 -0
  81. ai_lib_python-0.5.0.dist-info/RECORD +84 -0
  82. ai_lib_python-0.5.0.dist-info/WHEEL +4 -0
  83. ai_lib_python-0.5.0.dist-info/licenses/LICENSE-APACHE +201 -0
  84. ai_lib_python-0.5.0.dist-info/licenses/LICENSE-MIT +21 -0
@@ -0,0 +1,837 @@
Metadata-Version: 2.4
Name: ai-lib-python
Version: 0.5.0
Summary: Official Python Runtime for AI-Protocol - The canonical Pythonic implementation for unified AI model interaction
Project-URL: Homepage, https://github.com/hiddenpath/ai-lib-python
Project-URL: Documentation, https://github.com/hiddenpath/ai-lib-python#readme
Project-URL: Repository, https://github.com/hiddenpath/ai-lib-python
Project-URL: Issues, https://github.com/hiddenpath/ai-lib-python/issues
Author: AI-Protocol Team
License-Expression: MIT OR Apache-2.0
License-File: LICENSE-APACHE
License-File: LICENSE-MIT
Keywords: ai,anthropic,claude,gpt,llm,openai,protocol,streaming
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Typing :: Typed
Requires-Python: >=3.10
Requires-Dist: fastjsonschema>=2.19
Requires-Dist: httpx>=0.25.0
Requires-Dist: jsonpath-ng>=1.6
Requires-Dist: pydantic-settings>=2.0
Requires-Dist: pydantic>=2.0
Requires-Dist: pyyaml>=6.0
Provides-Extra: dev
Requires-Dist: mypy>=1.8; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.23; extra == 'dev'
Requires-Dist: pytest-cov>=4.0; extra == 'dev'
Requires-Dist: pytest-httpx>=0.30; extra == 'dev'
Requires-Dist: pytest>=8.0; extra == 'dev'
Requires-Dist: ruff>=0.2; extra == 'dev'
Requires-Dist: types-pyyaml>=6.0; extra == 'dev'
Provides-Extra: docs
Requires-Dist: mkdocs-material>=9.0; extra == 'docs'
Requires-Dist: mkdocs>=1.5; extra == 'docs'
Requires-Dist: mkdocstrings[python]>=0.24; extra == 'docs'
Provides-Extra: full
Requires-Dist: keyring>=24.0; extra == 'full'
Requires-Dist: opentelemetry-api>=1.20; extra == 'full'
Requires-Dist: opentelemetry-exporter-otlp>=1.20; extra == 'full'
Requires-Dist: opentelemetry-sdk>=1.20; extra == 'full'
Requires-Dist: tiktoken>=0.5; extra == 'full'
Requires-Dist: watchdog>=3.0; extra == 'full'
Provides-Extra: jupyter
Requires-Dist: ipywidgets>=8.0; extra == 'jupyter'
Provides-Extra: telemetry
Requires-Dist: opentelemetry-api>=1.20; extra == 'telemetry'
Requires-Dist: opentelemetry-exporter-otlp>=1.20; extra == 'telemetry'
Requires-Dist: opentelemetry-sdk>=1.20; extra == 'telemetry'
Provides-Extra: tokenizer
Requires-Dist: tiktoken>=0.5; extra == 'tokenizer'
Description-Content-Type: text/markdown

# ai-lib-python

**Official Python Runtime for AI-Protocol** - The canonical Pythonic implementation for unified AI model interaction.

[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![License](https://img.shields.io/badge/license-MIT%20OR%20Apache--2.0-green.svg)](LICENSE)
[![Tests](https://github.com/hiddenpath/ai-lib-python/actions/workflows/ci.yml/badge.svg)](https://github.com/hiddenpath/ai-lib-python/actions)
[![PyPI](https://img.shields.io/pypi/v/ai-lib-python.svg)](https://pypi.org/project/ai-lib-python/)

## Overview

`ai-lib-python` is the **official Python runtime** for the [AI-Protocol](https://github.com/hiddenpath/ai-protocol) specification. As the canonical Python implementation maintained by the AI-Protocol team, it embodies the core design principle:

> **All logic is operators, all configuration is protocol.**

Unlike traditional adapter libraries that hardcode provider-specific logic, `ai-lib-python` is a **protocol-driven runtime** that executes AI-Protocol specifications.

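To make "all configuration is protocol" concrete, a provider manifest might look roughly like the sketch below. The field names here are hypothetical illustrations, not the actual AI-Protocol schema; the SSE decoding and JSONPath selection they reference do, however, correspond to real modules in this package (`pipeline/decode.py`, `pipeline/select.py`).

```yaml
# Hypothetical manifest sketch -- field names are illustrative only.
provider: openai
base_url: https://api.openai.com/v1
endpoints:
  chat:
    path: /chat/completions
    method: POST
streaming:
  decoder: sse                                # server-sent events
  content_delta: $.choices[0].delta.content   # JSONPath selector
```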
## Features

- **Protocol-Driven**: All behavior is driven by YAML/JSON protocol files
- **Unified Interface**: Single API for all AI providers (OpenAI, Anthropic, Gemini, DeepSeek, etc.)
- **Streaming First**: Native async streaming with Python's `async for`
- **Type Safe**: Full type hints with Pydantic v2 models
- **Production Ready**: Built-in retry, rate limiting, circuit breaker, and fallback
- **Extensible**: Easy to add new providers via protocol configuration
- **Multimodal**: Support for text, images (base64/URL), and audio
- **Telemetry**: Structured logging, metrics, distributed tracing, and user feedback collection
- **Token Counting**: tiktoken integration and cost estimation
- **Connection Pooling**: Efficient HTTP connection management
- **Request Batching**: Parallel execution with concurrency control
- **Model Routing**: Smart model selection with load balancing strategies
- **Embeddings**: Embedding generation with vector operations
- **Structured Output**: JSON mode with schema validation
- **Response Caching**: Multi-backend caching with TTL support
- **Plugin System**: Extensible hooks and middleware architecture
- **Stream Cancellation**: Cooperative cancellation for streaming operations

## Installation

```bash
pip install ai-lib-python
```

With optional features:

```bash
# Full installation with all features
pip install ai-lib-python[full]

# For telemetry (OpenTelemetry integration)
pip install ai-lib-python[telemetry]

# For token counting (tiktoken)
pip install ai-lib-python[tokenizer]

# For Jupyter notebook integration
pip install ai-lib-python[jupyter]

# For development
pip install ai-lib-python[dev]
```

## Quick Start

### Basic Usage

```python
import asyncio
from ai_lib_python import AiClient, Message

async def main():
    # Create client with model
    client = await AiClient.create("openai/gpt-4o")

    # Simple chat completion
    response = await (
        client.chat()
        .user("Hello! What's 2+2?")
        .execute()
    )
    print(response.content)
    # Output: 2+2 equals 4.

    await client.close()

asyncio.run(main())
```

### Streaming

```python
async def stream_example():
    client = await AiClient.create("anthropic/claude-3-5-sonnet")

    async for event in (
        client.chat()
        .system("You are a helpful assistant.")
        .user("Tell me a short story.")
        .stream()
    ):
        if event.is_content_delta:
            print(event.as_content_delta.content, end="", flush=True)

    print()  # Newline at end
    await client.close()
```

### With Messages List

```python
from ai_lib_python import Message

messages = [
    Message.system("You are a Python expert."),
    Message.user("How do I read a file in Python?"),
]

response = await (
    client.chat()
    .messages(messages)
    .temperature(0.7)
    .max_tokens(1024)
    .execute()
)
```

### Tool Calling (Function Calling)

```python
from ai_lib_python import ToolDefinition

# Define a tool
weather_tool = ToolDefinition.from_function(
    name="get_weather",
    description="Get current weather for a location",
    parameters={
        "type": "object",
        "properties": {
            "location": {"type": "string", "description": "City name"},
            "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]}
        },
        "required": ["location"]
    }
)

# Use tool in request
response = await (
    client.chat()
    .user("What's the weather in Tokyo?")
    .tools([weather_tool])
    .execute()
)

# Check for tool calls
if response.tool_calls:
    for tool_call in response.tool_calls:
        print(f"Call {tool_call.function_name}: {tool_call.arguments}")
```

### Multimodal (Images)

```python
import base64

from ai_lib_python import Message, ContentBlock, MessageRole

# Image from URL
message = Message.user_with_image(
    "What's in this image?",
    image_url="https://example.com/image.jpg"
)

# Image from base64
with open("photo.jpg", "rb") as f:
    image_data = base64.b64encode(f.read()).decode()

message = Message(
    role=MessageRole.USER,
    content=[
        ContentBlock.text("Describe this image:"),
        ContentBlock.image_base64(image_data, "image/jpeg"),
    ]
)

response = await client.chat().messages([message]).execute()
```

### Production-Ready Configuration

```python
from ai_lib_python import AiClient

# Enable all resilience patterns
client = await (
    AiClient.builder()
    .model("openai/gpt-4o")
    .production_ready()  # Enables retry, rate limit, circuit breaker
    .with_fallbacks(["anthropic/claude-3-5-sonnet"])
    .build()
)

# Check resilience status
print(f"Circuit state: {client.circuit_state}")
print(f"In-flight requests: {client.current_inflight}")
print(client.get_resilience_stats())
```

### Custom Resilience Configuration

```python
from ai_lib_python import AiClient
from ai_lib_python.resilience import (
    RetryConfig,
    RateLimiterConfig,
    CircuitBreakerConfig,
)

client = await (
    AiClient.builder()
    .model("openai/gpt-4o")
    .with_retry(RetryConfig(
        max_retries=5,
        min_delay_ms=1000,
        max_delay_ms=30000,
    ))
    .with_rate_limit(RateLimiterConfig.from_rps(10))
    .with_circuit_breaker(CircuitBreakerConfig(
        failure_threshold=5,
        cooldown_seconds=30,
    ))
    .max_inflight(20)
    .build()
)
```

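The retry settings above drive exponential backoff with jitter. As a rough illustration of the delay schedule such a config implies (a sketch of the general pattern, not the library's actual implementation):

```python
import random

def backoff_delays(max_retries: int, min_delay_ms: int, max_delay_ms: int) -> list[float]:
    """Illustrative "full jitter" schedule: base delay doubles per attempt,
    capped at max_delay_ms, and the actual sleep is sampled in [0, base]."""
    delays = []
    for attempt in range(max_retries):
        base = min(min_delay_ms * (2 ** attempt), max_delay_ms)
        delays.append(random.uniform(0, base))
    return delays

# For max_retries=5, bases grow 1000, 2000, 4000, 8000, 16000 ms (capped at 30000).
print(backoff_delays(5, 1000, 30000))
```

Jitter spreads retries out so that many clients failing at once do not all retry at the same instant.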
### Context Manager

```python
async with await AiClient.create("openai/gpt-4o") as client:
    response = await client.chat().user("Hello!").execute()
    print(response.content)
# Client automatically closed
```

### Token Counting and Cost Estimation

```python
from ai_lib_python.tokens import TokenCounter, estimate_cost, get_model_pricing

# Count tokens
counter = TokenCounter.for_model("gpt-4o")
token_count = counter.count("Hello, how are you?")
print(f"Token count: {token_count}")

# Count message tokens
messages = [Message.user("Hello!"), Message.assistant("Hi there!")]
total_tokens = counter.count_messages(messages)

# Estimate cost
cost = estimate_cost(input_tokens=1000, output_tokens=500, model="gpt-4o")
print(f"Estimated cost: ${cost.total_cost:.4f}")

# Get model pricing info
pricing = get_model_pricing("gpt-4o")
print(f"Input: ${pricing.input_price_per_1k}/1K tokens")
print(f"Context window: {pricing.context_window} tokens")
```

### Metrics and Telemetry

```python
from ai_lib_python.telemetry import (
    get_logger,
    MetricsCollector,
    MetricLabels,
    Tracer,
)

# Structured logging
logger = get_logger("my_app")
logger.info("Request started", model="gpt-4o", tokens=100)

# Metrics collection
collector = MetricsCollector()
labels = MetricLabels(provider="openai", model="gpt-4o")
collector.record_request(labels, latency=0.5, status="success", tokens_in=100, tokens_out=50)

# Get metrics snapshot
snapshot = collector.get_snapshot()
print(f"Total requests: {snapshot.total_requests}")
print(f"P99 latency: {snapshot.latency_p99_ms:.2f}ms")

# Export to Prometheus format
prometheus_metrics = collector.to_prometheus()

# Distributed tracing
tracer = Tracer("my_service")
with tracer.span("api_call") as span:
    span.set_attribute("model", "gpt-4o")
    # ... do work
```

### Batch Processing

```python
from ai_lib_python.batch import BatchExecutor, BatchConfig

# Execute multiple requests concurrently
async def process_question(question: str) -> str:
    client = await AiClient.create("openai/gpt-4o")
    response = await client.chat().user(question).execute()
    await client.close()
    return response.content

questions = ["What is AI?", "What is Python?", "What is async?"]

executor = BatchExecutor(process_question, max_concurrent=5)
result = await executor.execute(questions)

print(f"Successful: {result.successful_count}")
print(f"Failed: {result.failed_count}")
for answer in result.get_successful_results():
    print(answer)
```

### Connection Pooling

```python
from ai_lib_python.transport import ConnectionPool, PoolConfig

# Create connection pool with custom config
pool = ConnectionPool(PoolConfig.high_throughput())

# Use pooled connections
async with pool:
    client = await pool.get_client("openai", "https://api.openai.com")
    response = await client.post("/v1/chat/completions", json=payload)

# Get pool statistics (keyed by provider)
stats = pool.get_stats()
print(f"Active connections: {stats['openai']['active_connections']}")
```

### Model Routing & Selection

```python
from ai_lib_python.routing import (
    ModelManager, ModelInfo, create_openai_models, create_anthropic_models,
    CostBasedSelector, QualityBasedSelector,
)

# Create a model manager with pre-configured models
manager = create_openai_models()
manager.merge(create_anthropic_models())

# Select model by capability
code_models = manager.filter_by_capability("code_generation")
print(f"Code models: {[m.name for m in code_models]}")

# Select cheapest model
selector = CostBasedSelector()
cheapest = selector.select(manager.list_models())
print(f"Cheapest: {cheapest.name} @ ${cheapest.pricing.input_cost_per_1k}/1K")

# Select highest quality model
quality_selector = QualityBasedSelector()
best = quality_selector.select(manager.list_models())
print(f"Best quality: {best.name}")

# Recommend model for use case
recommended = manager.recommend_for("chat")
```

### Stream Cancellation

```python
from ai_lib_python.client import create_cancel_pair, CancellableStream, CancelReason

async def cancellable_stream():
    client = await AiClient.create("openai/gpt-4o")

    # Create cancel token and handle
    token, handle = create_cancel_pair()

    # Start streaming with cancellation support
    stream = client.chat().user("Write a long story...").stream()
    cancellable = CancellableStream(stream, token)

    # In another task, you can cancel:
    # handle.cancel(CancelReason.USER_REQUEST)

    async for event in cancellable:
        if event.is_content_delta:
            print(event.as_content_delta.content, end="")

        # Check if cancelled
        if token.is_cancelled:
            print("\n[Cancelled]")
            break
```

### User Feedback Collection

```python
from ai_lib_python.telemetry import (
    RatingFeedback, ThumbsFeedback, ChoiceSelectionFeedback,
    InMemoryFeedbackSink, set_feedback_sink, report_feedback,
)

# Set up feedback collection
sink = InMemoryFeedbackSink(max_events=1000)
set_feedback_sink(sink)

# Report user feedback
await report_feedback(RatingFeedback(
    request_id="req-123",
    rating=5,
    category="helpfulness",
    comment="Great response!"
))

await report_feedback(ThumbsFeedback(
    request_id="req-456",
    is_positive=True
))

# Report multi-candidate selection (for A/B testing)
await report_feedback(ChoiceSelectionFeedback(
    request_id="req-789",
    chosen_index=0,
    rejected_indices=[1, 2],
    latency_to_select_ms=1500.0
))

# Retrieve feedback
all_feedback = sink.get_events()
request_feedback = sink.get_events_by_request("req-123")
```

### Embeddings

```python
from ai_lib_python.embeddings import (
    EmbeddingClient, cosine_similarity, find_most_similar
)

# Create embedding client
client = await EmbeddingClient.create("openai/text-embedding-3-small")

# Generate embeddings
response = await client.embed("Hello, world!")
embedding = response.first.vector
print(f"Dimensions: {len(embedding)}")

# Batch embeddings
texts = ["Hello", "World", "Python", "AI"]
response = await client.embed_batch(texts)

# Find most similar
query = response.embeddings[0].vector
candidates = [e.vector for e in response.embeddings[1:]]
results = find_most_similar(query, candidates, top_k=2)
for idx, score in results:
    print(f"Text '{texts[idx+1]}' similarity: {score:.4f}")

await client.close()
```

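`cosine_similarity` and `find_most_similar` compute standard vector math. A pure-Python sketch of what such helpers return (the library's actual implementations may differ in detail):

```python
import math

def cosine_sim(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot(a, b) / (|a| * |b|), in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k_similar(query, candidates, top_k=2):
    """Rank candidate vectors by cosine similarity to the query."""
    scored = [(i, cosine_sim(query, c)) for i, c in enumerate(candidates)]
    return sorted(scored, key=lambda t: t[1], reverse=True)[:top_k]

# Identical direction scores 1.0; orthogonal vectors score 0.0.
print(top_k_similar([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]))
```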
### Response Caching

```python
from ai_lib_python.cache import CacheManager, CacheConfig, MemoryCache

# Create cache manager
cache = CacheManager(
    config=CacheConfig(default_ttl_seconds=3600),
    backend=MemoryCache(max_size=1000)
)

# Cache responses
key = cache.generate_key(model="gpt-4o", messages=messages)

# Check cache first
cached = await cache.get(key)
if cached:
    print("Cache hit!")
    response = cached
else:
    response = await client.chat().messages(messages).execute()
    await cache.set(key, response)

# Get cache statistics
stats = cache.stats()
print(f"Hit ratio: {stats.hit_ratio:.2%}")
```

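`generate_key` must be deterministic: identical requests have to hash to identical keys. One plausible way such a key can be derived — a sketch, not necessarily `CacheKeyGenerator`'s actual scheme:

```python
import hashlib
import json

def make_cache_key(model: str, messages: list[dict], **params) -> str:
    """Hash a canonical JSON encoding: sorted keys and fixed separators
    guarantee the same inputs always serialize to the same bytes."""
    payload = json.dumps(
        {"model": model, "messages": messages, "params": params},
        sort_keys=True, separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

k1 = make_cache_key("gpt-4o", [{"role": "user", "content": "hi"}], temperature=0.7)
k2 = make_cache_key("gpt-4o", [{"role": "user", "content": "hi"}], temperature=0.7)
assert k1 == k2  # same inputs -> same key
```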
### Plugin System

```python
from ai_lib_python.plugins import (
    Plugin, PluginContext, PluginRegistry, HookType, HookManager
)

# Create a custom plugin
class LoggingPlugin(Plugin):
    def name(self) -> str:
        return "logging"

    async def on_before_request(self, ctx: PluginContext) -> None:
        print(f"Request to {ctx.model}: {ctx.request}")

    async def on_after_response(self, ctx: PluginContext) -> None:
        print(f"Response received: {ctx.response}")

# Register plugin
registry = PluginRegistry()
await registry.register(LoggingPlugin())

# Use hooks for fine-grained control
hooks = HookManager()
hooks.register(HookType.BEFORE_REQUEST, "log", lambda ctx: print(f"Starting {ctx.model}"))

# Trigger hooks
ctx = PluginContext(model="gpt-4o", request={"messages": [...]})
await registry.trigger_before_request(ctx)
```

## Supported Providers

| Provider | Models | Streaming | Tools | Vision |
|----------|--------|-----------|-------|--------|
| OpenAI | GPT-4o, GPT-4, GPT-3.5 | βœ… | βœ… | βœ… |
| Anthropic | Claude 3.5, Claude 3 | βœ… | βœ… | βœ… |
| Google | Gemini Pro, Gemini Flash | βœ… | βœ… | βœ… |
| DeepSeek | DeepSeek Chat, Coder | βœ… | βœ… | ❌ |
| Qwen | Qwen2.5, Qwen-Max | βœ… | βœ… | βœ… |
| Groq | Llama, Mixtral | βœ… | βœ… | ❌ |
| Mistral | Mistral Large, Medium | βœ… | βœ… | ❌ |

## API Reference

### Core Classes

- **`AiClient`**: Main entry point for AI model interaction
- **`Message`**: Represents a chat message with role and content
- **`ContentBlock`**: Content blocks for multimodal messages
- **`ToolDefinition`**: Tool/function definition for function calling
- **`StreamingEvent`**: Events from streaming responses

### Resilience Classes

- **`RetryPolicy`**: Exponential backoff with jitter
- **`RateLimiter`**: Token bucket rate limiting
- **`CircuitBreaker`**: Circuit breaker pattern
- **`Backpressure`**: Concurrency limiting
- **`FallbackChain`**: Multi-target failover
- **`PreflightChecker`**: Unified request gating
- **`SignalsSnapshot`**: Runtime state aggregation

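`CircuitBreaker` follows the classic pattern: trip open after a run of failures, then admit a probe request once a cooldown elapses. A minimal stand-alone sketch of that state machine — illustrative only, not the library's implementation, though the `failure_threshold`/`cooldown_seconds` knobs map onto the same idea:

```python
import time

class SketchCircuitBreaker:
    """Toy circuit breaker: closed -> open after N failures -> half-open after cooldown."""

    def __init__(self, failure_threshold: int = 5, cooldown_seconds: float = 30.0):
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.failures = 0
        self.opened_at: float | None = None

    @property
    def state(self) -> str:
        if self.opened_at is None:
            return "closed"
        if time.monotonic() - self.opened_at >= self.cooldown_seconds:
            return "half_open"  # allow one probe request through
        return "open"

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = time.monotonic()

    def record_success(self) -> None:
        # A success resets the breaker to closed.
        self.failures = 0
        self.opened_at = None
```

While open, the breaker rejects calls immediately instead of letting them pile onto a failing provider; a half-open probe decides whether to close again.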
### Routing Classes

- **`ModelManager`**: Centralized model management
- **`ModelInfo`**: Model information with capabilities
- **`ModelArray`**: Load balancing across endpoints
- **`ModelSelectionStrategy`**: Selection strategies (Cost, Quality, Performance, etc.)

### Telemetry Classes

- **`AiLibLogger`**: Structured logging with masking
- **`MetricsCollector`**: Request metrics collection
- **`Tracer`**: Distributed tracing
- **`HealthChecker`**: Health monitoring
- **`FeedbackSink`**: User feedback collection

### Embedding Classes

- **`EmbeddingClient`**: Embedding generation client
- **`Embedding`**: Single embedding result
- **`EmbeddingResponse`**: Response with usage stats

### Token Classes

- **`TokenCounter`**: Token counting interface
- **`CostEstimate`**: Cost estimation result
- **`ModelPricing`**: Model pricing information

### Cache Classes

- **`CacheManager`**: High-level cache management
- **`CacheBackend`**: Cache backend interface (Memory, Disk, Null)
- **`CacheKeyGenerator`**: Deterministic key generation

### Batch Classes

- **`BatchCollector`**: Request grouping
- **`BatchExecutor`**: Parallel execution

### Plugin Classes

- **`Plugin`**: Base plugin class
- **`PluginRegistry`**: Plugin management
- **`HookManager`**: Event-driven hooks
- **`Middleware`**: Request/response chain

### Transport Classes

- **`ConnectionPool`**: HTTP connection pooling
- **`PoolConfig`**: Pool configuration

### Cancellation Classes

- **`CancelToken`**: Cooperative cancellation token
- **`CancelHandle`**: Public cancel interface
- **`CancellableStream`**: Cancellable async iterator

### Error Classes

- **`AiLibError`**: Base error class
- **`ProtocolError`**: Protocol loading/validation errors
- **`TransportError`**: HTTP transport errors
- **`RemoteError`**: API errors from providers

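Because the hierarchy is rooted at `AiLibError`, handlers can catch narrow cases first and fall back to the base class. The classes below are local stand-ins that mirror the documented names purely to illustrate catch order; the real classes live in `ai_lib_python.errors`:

```python
# Stand-in classes mirroring the documented hierarchy (illustrative only).
class AiLibError(Exception): ...
class TransportError(AiLibError): ...
class RemoteError(AiLibError): ...

def classify(exc: Exception) -> str:
    """Demonstrate catch order: subclasses first, base class last."""
    try:
        raise exc
    except TransportError:
        return "retry: network-level failure"
    except RemoteError:
        return "inspect: provider returned an error"
    except AiLibError:
        return "library error"

print(classify(TransportError("timeout")))  # caught by the narrowest matching handler
```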
## Architecture

```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                         AiClient                             β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚  β”‚ ChatRequest β”‚  β”‚ Resilience  β”‚  β”‚      Protocol       β”‚ β”‚
β”‚  β”‚   Builder   β”‚  β”‚  Executor   β”‚  β”‚       Loader        β”‚ β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
          β”‚                    β”‚                   β”‚
          β–Ό                    β–Ό                   β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  HttpTransport  β”‚  β”‚   Pipeline   β”‚  β”‚  ProtocolManifest   β”‚
β”‚    (httpx)      β”‚  β”‚   (decodeβ†’   β”‚  β”‚    (YAML/JSON)      β”‚
β”‚                 β”‚  β”‚    selectβ†’   β”‚  β”‚                     β”‚
β”‚                 β”‚  β”‚    map)      β”‚  β”‚                     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```

## Development

```bash
# Clone the repository
git clone https://github.com/hiddenpath/ai-lib-python.git
cd ai-lib-python

# Install development dependencies
pip install -e ".[dev]"

# Run tests
pytest

# Run tests with coverage
pytest --cov=src/ai_lib_python

# Type checking
mypy src

# Linting
ruff check src tests

# Format code
ruff format src tests
```

## Project Structure

```
ai-lib-python/
β”œβ”€β”€ src/ai_lib_python/
β”‚   β”œβ”€β”€ __init__.py            # Package exports
β”‚   β”œβ”€β”€ types/                 # Type definitions
β”‚   β”‚   β”œβ”€β”€ message.py         # Message, ContentBlock
β”‚   β”‚   β”œβ”€β”€ tool.py            # ToolDefinition, ToolCall
β”‚   β”‚   └── events.py          # StreamingEvent types
β”‚   β”œβ”€β”€ protocol/              # Protocol layer
β”‚   β”‚   β”œβ”€β”€ manifest.py        # ProtocolManifest models
β”‚   β”‚   β”œβ”€β”€ loader.py          # Protocol loading
β”‚   β”‚   └── validator.py       # Schema validation (+ version/streaming checks)
β”‚   β”œβ”€β”€ transport/             # HTTP transport
β”‚   β”‚   β”œβ”€β”€ http.py            # HttpTransport
β”‚   β”‚   β”œβ”€β”€ auth.py            # API key resolution
β”‚   β”‚   └── pool.py            # ConnectionPool
β”‚   β”œβ”€β”€ pipeline/              # Stream processing
β”‚   β”‚   β”œβ”€β”€ decode.py          # SSE/NDJSON decoders
β”‚   β”‚   β”œβ”€β”€ select.py          # JSONPath selectors
β”‚   β”‚   β”œβ”€β”€ accumulate.py      # Tool call accumulator
β”‚   β”‚   β”œβ”€β”€ event_map.py       # Event mappers
β”‚   β”‚   └── fan_out.py         # FanOut, Replicate, Split transforms
β”‚   β”œβ”€β”€ resilience/            # Resilience patterns
β”‚   β”‚   β”œβ”€β”€ retry.py           # RetryPolicy
β”‚   β”‚   β”œβ”€β”€ rate_limiter.py    # RateLimiter
β”‚   β”‚   β”œβ”€β”€ circuit_breaker.py
β”‚   β”‚   β”œβ”€β”€ backpressure.py
β”‚   β”‚   β”œβ”€β”€ fallback.py        # FallbackChain
β”‚   β”‚   β”œβ”€β”€ executor.py        # ResilientExecutor
β”‚   β”‚   β”œβ”€β”€ signals.py         # SignalsSnapshot
β”‚   β”‚   └── preflight.py       # PreflightChecker
β”‚   β”œβ”€β”€ routing/               # Model routing & load balancing
β”‚   β”‚   β”œβ”€β”€ types.py           # ModelInfo, ModelArray
β”‚   β”‚   β”œβ”€β”€ strategy.py        # Selection strategies
β”‚   β”‚   └── manager.py         # ModelManager
β”‚   β”œβ”€β”€ client/                # User API
β”‚   β”‚   β”œβ”€β”€ core.py            # AiClient
β”‚   β”‚   β”œβ”€β”€ builder.py         # Builders
β”‚   β”‚   β”œβ”€β”€ response.py        # ChatResponse
β”‚   β”‚   └── cancel.py          # CancelToken, CancellableStream
β”‚   β”œβ”€β”€ embeddings/            # Embedding support
β”‚   β”‚   β”œβ”€β”€ client.py          # EmbeddingClient
β”‚   β”‚   β”œβ”€β”€ types.py           # Embedding, EmbeddingRequest
β”‚   β”‚   └── vectors.py         # Vector operations
β”‚   β”œβ”€β”€ cache/                 # Response caching
β”‚   β”‚   β”œβ”€β”€ manager.py         # CacheManager
β”‚   β”‚   β”œβ”€β”€ backends.py        # MemoryCache, DiskCache
β”‚   β”‚   └── key.py             # CacheKeyGenerator
β”‚   β”œβ”€β”€ tokens/                # Token counting
β”‚   β”‚   β”œβ”€β”€ counter.py         # TokenCounter, TiktokenCounter
β”‚   β”‚   └── estimator.py       # ModelPricing, CostEstimate
β”‚   β”œβ”€β”€ telemetry/             # Observability
β”‚   β”‚   β”œβ”€β”€ logger.py          # AiLibLogger
β”‚   β”‚   β”œβ”€β”€ metrics.py         # MetricsCollector
β”‚   β”‚   β”œβ”€β”€ tracer.py          # Tracer
β”‚   β”‚   β”œβ”€β”€ health.py          # HealthChecker
β”‚   β”‚   └── feedback.py        # Feedback types and sinks
β”‚   β”œβ”€β”€ batch/                 # Request batching
β”‚   β”‚   β”œβ”€β”€ collector.py       # BatchCollector
β”‚   β”‚   └── executor.py        # BatchExecutor
β”‚   β”œβ”€β”€ guardrails/            # Input/output guardrails
β”‚   β”‚   β”œβ”€β”€ base.py
β”‚   β”‚   β”œβ”€β”€ filters.py
β”‚   β”‚   └── validators.py
β”‚   β”œβ”€β”€ plugins/               # Plugin system
β”‚   β”‚   β”œβ”€β”€ base.py            # Plugin base class
β”‚   β”‚   β”œβ”€β”€ registry.py        # PluginRegistry
β”‚   β”‚   β”œβ”€β”€ hooks.py           # HookManager
β”‚   β”‚   └── middleware.py      # Middleware chain
β”‚   β”œβ”€β”€ structured/            # Structured output
β”‚   β”‚   β”œβ”€β”€ json_mode.py       # JsonModeConfig
β”‚   β”‚   β”œβ”€β”€ schema.py          # SchemaGenerator
β”‚   β”‚   └── validator.py       # OutputValidator
β”‚   β”œβ”€β”€ utils/                 # Utilities
β”‚   β”‚   └── tool_call_assembler.py  # ToolCallAssembler
β”‚   └── errors/                # Error hierarchy
β”œβ”€β”€ tests/
β”‚   β”œβ”€β”€ unit/                  # Unit tests
β”‚   └── integration/           # Integration tests
β”œβ”€β”€ docs/                      # Documentation
β”œβ”€β”€ examples/                  # Example scripts
└── pyproject.toml
```

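`pipeline/decode.py` handles the SSE and NDJSON wire formats. The essence of an SSE decoder is small; a self-contained sketch of the idea (not the module's actual code):

```python
def decode_sse(raw: str):
    """Minimal SSE sketch: events are separated by blank lines, and each
    `data:` line carries a payload. `[DONE]` is a common end-of-stream marker."""
    for block in raw.split("\n\n"):
        for line in block.splitlines():
            if line.startswith("data:"):
                payload = line[len("data:"):].strip()
                if payload != "[DONE]":
                    yield payload

events = list(decode_sse("data: {\"a\":1}\n\ndata: [DONE]\n\n"))
print(events)  # ['{"a":1}']
```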
## Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `OPENAI_API_KEY` | OpenAI API key | - |
| `ANTHROPIC_API_KEY` | Anthropic API key | - |
| `GOOGLE_API_KEY` | Google AI API key | - |
| `AI_PROTOCOL_PATH` | Custom protocol directory | - |
| `AI_HTTP_TIMEOUT_SECS` | HTTP timeout | 60 |
| `AI_LIB_MAX_INFLIGHT` | Max concurrent requests | 10 |

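A typical setup exports keys and overrides the defaults before running your app. The key values below are placeholders, not real credentials:

```shell
# Placeholder values -- substitute your real keys.
export OPENAI_API_KEY="sk-your-key-here"
export ANTHROPIC_API_KEY="your-anthropic-key"
export AI_HTTP_TIMEOUT_SECS=120   # raise the 60s default for slow models
export AI_LIB_MAX_INFLIGHT=20     # allow more concurrent requests than the default 10
```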
## Related Projects

- [AI-Protocol](https://github.com/hiddenpath/ai-protocol) - Protocol specification
- [ai-lib-rust](https://github.com/hiddenpath/ai-lib-rust) - Rust runtime implementation

## Contributing

Contributions are welcome! Please read our [Contributing Guide](CONTRIBUTING.md) for details.

## License

This project is licensed under either of:

- Apache License, Version 2.0 ([LICENSE-APACHE](LICENSE-APACHE) or http://www.apache.org/licenses/LICENSE-2.0)
- MIT License ([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT)

at your option.