genai-otel-instrument 0.1.12.dev0__py3-none-any.whl → 0.1.16__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release: this version of genai-otel-instrument might be problematic.

@@ -1,952 +1,959 @@
1
- Metadata-Version: 2.4
2
- Name: genai-otel-instrument
3
- Version: 0.1.12.dev0
4
- Summary: Comprehensive OpenTelemetry auto-instrumentation for LLM/GenAI applications
5
- Author-email: Kshitij Thakkar <kshitijthakkar@rocketmail.com>
6
- License: Apache-2.0
7
- Project-URL: Homepage, https://github.com/Mandark-droid/genai_otel_instrument
8
- Project-URL: Repository, https://github.com/Mandark-droid/genai_otel_instrument
9
- Project-URL: Documentation, https://github.com/Mandark-droid/genai_otel_instrument#readme
10
- Project-URL: Issues, https://github.com/Mandark-droid/genai_otel_instrument/issues
11
- Project-URL: Changelog, https://github.com/Mandark-droid/genai_otel_instrument/blob/main/CHANGELOG.md
12
- Keywords: opentelemetry,observability,llm,genai,instrumentation,tracing,metrics,monitoring
13
- Classifier: Development Status :: 4 - Beta
14
- Classifier: Intended Audience :: Developers
15
- Classifier: Topic :: Software Development :: Libraries :: Python Modules
16
- Classifier: Topic :: System :: Monitoring
17
- Classifier: License :: OSI Approved :: Apache Software License
18
- Classifier: Operating System :: OS Independent
19
- Classifier: Programming Language :: Python :: 3
20
- Classifier: Programming Language :: Python :: 3.9
21
- Classifier: Programming Language :: Python :: 3.10
22
- Classifier: Programming Language :: Python :: 3.11
23
- Classifier: Programming Language :: Python :: 3.12
24
- Requires-Python: >=3.9
25
- Description-Content-Type: text/markdown
26
- License-File: LICENSE
27
- Requires-Dist: opentelemetry-api<2.0.0,>=1.20.0
28
- Requires-Dist: opentelemetry-sdk<2.0.0,>=1.20.0
29
- Requires-Dist: opentelemetry-instrumentation>=0.41b0
30
- Requires-Dist: opentelemetry-semantic-conventions<1.0.0,>=0.45b0
31
- Requires-Dist: opentelemetry-exporter-otlp>=1.20.0
32
- Requires-Dist: opentelemetry-instrumentation-requests>=0.41b0
33
- Requires-Dist: opentelemetry-instrumentation-httpx>=0.41b0
34
- Requires-Dist: requests>=2.20.0
35
- Requires-Dist: wrapt>=1.14.0
36
- Requires-Dist: httpx>=0.23.0
37
- Requires-Dist: opentelemetry-instrumentation-mysql>=0.41b0
38
- Requires-Dist: mysql-connector-python<9.0.0,>=8.0.0
39
- Requires-Dist: opentelemetry-instrumentation-psycopg2>=0.41b0
40
- Requires-Dist: psycopg2-binary>=2.9.0
41
- Requires-Dist: opentelemetry-instrumentation-redis>=0.41b0
42
- Requires-Dist: redis
43
- Requires-Dist: opentelemetry-instrumentation-pymongo>=0.41b0
44
- Requires-Dist: pymongo
45
- Requires-Dist: opentelemetry-instrumentation-sqlalchemy>=0.41b0
46
- Requires-Dist: sqlalchemy>=1.4.0
47
- Requires-Dist: opentelemetry-instrumentation-kafka-python>=0.41b0
48
- Requires-Dist: kafka-python
49
- Provides-Extra: openinference
50
- Requires-Dist: openinference-instrumentation==0.1.31; extra == "openinference"
51
- Requires-Dist: openinference-instrumentation-litellm==0.1.19; extra == "openinference"
52
- Requires-Dist: openinference-instrumentation-mcp==1.3.0; extra == "openinference"
53
- Requires-Dist: openinference-instrumentation-smolagents==0.1.11; extra == "openinference"
54
- Requires-Dist: litellm>=1.0.0; extra == "openinference"
55
- Provides-Extra: gpu
56
- Requires-Dist: nvidia-ml-py>=11.495.46; extra == "gpu"
57
- Requires-Dist: codecarbon>=2.3.0; extra == "gpu"
58
- Provides-Extra: co2
59
- Requires-Dist: codecarbon>=2.3.0; extra == "co2"
60
- Provides-Extra: openai
61
- Requires-Dist: openai>=1.0.0; extra == "openai"
62
- Provides-Extra: anthropic
63
- Requires-Dist: anthropic>=0.18.0; extra == "anthropic"
64
- Provides-Extra: google
65
- Requires-Dist: google-generativeai>=0.3.0; extra == "google"
66
- Provides-Extra: aws
67
- Requires-Dist: boto3>=1.28.0; extra == "aws"
68
- Provides-Extra: azure
69
- Requires-Dist: azure-ai-openai>=1.0.0; extra == "azure"
70
- Provides-Extra: cohere
71
- Requires-Dist: cohere>=4.0.0; extra == "cohere"
72
- Provides-Extra: mistral
73
- Requires-Dist: mistralai>=0.4.2; extra == "mistral"
74
- Provides-Extra: together
75
- Requires-Dist: together>=0.2.0; extra == "together"
76
- Provides-Extra: groq
77
- Requires-Dist: groq>=0.4.0; extra == "groq"
78
- Provides-Extra: ollama
79
- Requires-Dist: ollama>=0.1.0; extra == "ollama"
80
- Provides-Extra: replicate
81
- Requires-Dist: replicate>=0.15.0; extra == "replicate"
82
- Provides-Extra: langchain
83
- Requires-Dist: langchain>=0.1.0; extra == "langchain"
84
- Provides-Extra: llamaindex
85
- Requires-Dist: llama-index>=0.9.0; extra == "llamaindex"
86
- Provides-Extra: huggingface
87
- Requires-Dist: transformers>=4.30.0; extra == "huggingface"
88
- Provides-Extra: databases
89
- Requires-Dist: opentelemetry-instrumentation-sqlalchemy>=0.41b0; extra == "databases"
90
- Requires-Dist: sqlalchemy>=1.4.0; extra == "databases"
91
- Requires-Dist: opentelemetry-instrumentation-redis>=0.41b0; extra == "databases"
92
- Requires-Dist: redis; extra == "databases"
93
- Requires-Dist: opentelemetry-instrumentation-pymongo>=0.41b0; extra == "databases"
94
- Requires-Dist: pymongo; extra == "databases"
95
- Requires-Dist: opentelemetry-instrumentation-psycopg2>=0.41b0; extra == "databases"
96
- Requires-Dist: psycopg2-binary>=2.9.0; extra == "databases"
97
- Requires-Dist: opentelemetry-instrumentation-mysql>=0.41b0; extra == "databases"
98
- Requires-Dist: mysql-connector-python<9.0.0,>=8.0.0; extra == "databases"
99
- Provides-Extra: messaging
100
- Requires-Dist: opentelemetry-instrumentation-kafka-python>=0.41b0; extra == "messaging"
101
- Requires-Dist: kafka-python; extra == "messaging"
102
- Provides-Extra: vector-dbs
103
- Requires-Dist: pinecone>=3.0.0; extra == "vector-dbs"
104
- Requires-Dist: weaviate-client>=3.0.0; extra == "vector-dbs"
105
- Requires-Dist: qdrant-client>=1.0.0; extra == "vector-dbs"
106
- Requires-Dist: chromadb>=0.4.0; extra == "vector-dbs"
107
- Requires-Dist: pymilvus>=2.3.0; extra == "vector-dbs"
108
- Requires-Dist: faiss-cpu>=1.7.0; extra == "vector-dbs"
109
- Provides-Extra: all-providers
110
- Requires-Dist: openai>=1.0.0; extra == "all-providers"
111
- Requires-Dist: anthropic>=0.18.0; extra == "all-providers"
112
- Requires-Dist: google-generativeai>=0.3.0; extra == "all-providers"
113
- Requires-Dist: boto3>=1.28.0; extra == "all-providers"
114
- Requires-Dist: azure-ai-openai>=1.0.0; extra == "all-providers"
115
- Requires-Dist: cohere>=4.0.0; extra == "all-providers"
116
- Requires-Dist: mistralai>=0.4.2; extra == "all-providers"
117
- Requires-Dist: together>=0.2.0; extra == "all-providers"
118
- Requires-Dist: groq>=0.4.0; extra == "all-providers"
119
- Requires-Dist: ollama>=0.1.0; extra == "all-providers"
120
- Requires-Dist: replicate>=0.15.0; extra == "all-providers"
121
- Requires-Dist: langchain>=0.1.0; extra == "all-providers"
122
- Requires-Dist: llama-index>=0.9.0; extra == "all-providers"
123
- Requires-Dist: transformers>=4.30.0; extra == "all-providers"
124
- Requires-Dist: litellm>=1.0.0; extra == "all-providers"
125
- Provides-Extra: all-mcp
126
- Requires-Dist: opentelemetry-instrumentation-sqlalchemy>=0.41b0; extra == "all-mcp"
127
- Requires-Dist: opentelemetry-instrumentation-redis>=0.41b0; extra == "all-mcp"
128
- Requires-Dist: opentelemetry-instrumentation-pymongo>=0.41b0; extra == "all-mcp"
129
- Requires-Dist: opentelemetry-instrumentation-psycopg2>=0.41b0; extra == "all-mcp"
130
- Requires-Dist: opentelemetry-instrumentation-mysql>=0.41b0; extra == "all-mcp"
131
- Requires-Dist: opentelemetry-instrumentation-kafka-python>=0.41b0; extra == "all-mcp"
132
- Requires-Dist: pinecone>=3.0.0; extra == "all-mcp"
133
- Requires-Dist: weaviate-client>=3.0.0; extra == "all-mcp"
134
- Requires-Dist: qdrant-client>=1.0.0; extra == "all-mcp"
135
- Requires-Dist: chromadb>=0.4.0; extra == "all-mcp"
136
- Requires-Dist: pymilvus>=2.3.0; extra == "all-mcp"
137
- Requires-Dist: faiss-cpu>=1.7.0; extra == "all-mcp"
138
- Requires-Dist: sqlalchemy; extra == "all-mcp"
139
- Provides-Extra: all
140
- Requires-Dist: openai>=1.0.0; extra == "all"
141
- Requires-Dist: anthropic>=0.18.0; extra == "all"
142
- Requires-Dist: google-generativeai>=0.3.0; extra == "all"
143
- Requires-Dist: boto3>=1.28.0; extra == "all"
144
- Requires-Dist: azure-ai-openai>=1.0.0; extra == "all"
145
- Requires-Dist: cohere>=4.0.0; extra == "all"
146
- Requires-Dist: mistralai>=0.4.2; extra == "all"
147
- Requires-Dist: together>=0.2.0; extra == "all"
148
- Requires-Dist: groq>=0.4.0; extra == "all"
149
- Requires-Dist: ollama>=0.1.0; extra == "all"
150
- Requires-Dist: replicate>=0.15.0; extra == "all"
151
- Requires-Dist: langchain>=0.1.0; extra == "all"
152
- Requires-Dist: llama-index>=0.9.0; extra == "all"
153
- Requires-Dist: transformers>=4.30.0; extra == "all"
154
- Requires-Dist: nvidia-ml-py>=11.495.46; extra == "all"
155
- Requires-Dist: opentelemetry-instrumentation-sqlalchemy>=0.41b0; extra == "all"
156
- Requires-Dist: opentelemetry-instrumentation-redis>=0.41b0; extra == "all"
157
- Requires-Dist: opentelemetry-instrumentation-pymongo>=0.41b0; extra == "all"
158
- Requires-Dist: opentelemetry-instrumentation-psycopg2>=0.41b0; extra == "all"
159
- Requires-Dist: opentelemetry-instrumentation-mysql>=0.41b0; extra == "all"
160
- Requires-Dist: opentelemetry-instrumentation-kafka-python>=0.41b0; extra == "all"
161
- Requires-Dist: pinecone>=3.0.0; extra == "all"
162
- Requires-Dist: weaviate-client>=3.0.0; extra == "all"
163
- Requires-Dist: qdrant-client>=1.0.0; extra == "all"
164
- Requires-Dist: chromadb>=0.4.0; extra == "all"
165
- Requires-Dist: pymilvus>=2.3.0; extra == "all"
166
- Requires-Dist: faiss-cpu>=1.7.0; extra == "all"
167
- Requires-Dist: sqlalchemy; extra == "all"
168
- Provides-Extra: dev
169
- Requires-Dist: pytest>=7.0.0; extra == "dev"
170
- Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
171
- Requires-Dist: pytest-mock>=3.10.0; extra == "dev"
172
- Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
173
- Requires-Dist: black>=23.0.0; extra == "dev"
174
- Requires-Dist: isort>=5.12.0; extra == "dev"
175
- Requires-Dist: pylint>=2.17.0; extra == "dev"
176
- Requires-Dist: mypy>=1.0.0; extra == "dev"
177
- Requires-Dist: build>=0.10.0; extra == "dev"
178
- Requires-Dist: twine>=4.0.0; extra == "dev"
179
- Dynamic: license-file
180
-
181
- # GenAI OpenTelemetry Auto-Instrumentation
182
-
183
- <div align="center">
184
- <img src=".github/images/Logo.jpg" alt="GenAI OpenTelemetry Instrumentation Logo" width="400"/>
185
- </div>
186
-
187
- <br/>
188
-
189
- [![PyPI version](https://badge.fury.io/py/genai-otel-instrument.svg)](https://badge.fury.io/py/genai-otel-instrument)
190
- [![Python Versions](https://img.shields.io/pypi/pyversions/genai-otel-instrument.svg)](https://pypi.org/project/genai-otel-instrument/)
191
- [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
192
- [![Downloads](https://static.pepy.tech/badge/genai-otel-instrument)](https://pepy.tech/project/genai-otel-instrument)
193
- [![Downloads/Month](https://static.pepy.tech/badge/genai-otel-instrument/month)](https://pepy.tech/project/genai-otel-instrument)
194
-
195
- [![GitHub Stars](https://img.shields.io/github/stars/Mandark-droid/genai_otel_instrument?style=social)](https://github.com/Mandark-droid/genai_otel_instrument)
196
- [![GitHub Forks](https://img.shields.io/github/forks/Mandark-droid/genai_otel_instrument?style=social)](https://github.com/Mandark-droid/genai_otel_instrument)
197
- [![GitHub Issues](https://img.shields.io/github/issues/Mandark-droid/genai_otel_instrument)](https://github.com/Mandark-droid/genai_otel_instrument/issues)
198
- [![GitHub Pull Requests](https://img.shields.io/github/issues-pr/Mandark-droid/genai_otel_instrument)](https://github.com/Mandark-droid/genai_otel_instrument/pulls)
199
-
200
- [![Code Coverage](https://img.shields.io/badge/coverage-90%25-brightgreen.svg)](https://github.com/Mandark-droid/genai_otel_instrument)
201
- [![Code Style: Black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
202
- [![Imports: isort](https://img.shields.io/badge/%20imports-isort-%231674b1?style=flat&labelColor=ef8336)](https://pycqa.github.io/isort/)
203
- [![Type Checked: mypy](https://img.shields.io/badge/type%20checked-mypy-blue.svg)](http://mypy-lang.org/)
204
-
205
- [![OpenTelemetry](https://img.shields.io/badge/OpenTelemetry-1.20%2B-blueviolet)](https://opentelemetry.io/)
206
- [![Semantic Conventions](https://img.shields.io/badge/OTel%20Semconv-GenAI%20v1.28-orange)](https://opentelemetry.io/docs/specs/semconv/gen-ai/)
207
- [![CI/CD](https://img.shields.io/badge/CI%2FCD-GitHub%20Actions-2088FF?logo=github-actions&logoColor=white)](https://github.com/Mandark-droid/genai_otel_instrument/actions)
208
-
209
- ---
210
-
211
- <div align="center">
212
- <img src=".github/images/Landing_Page.jpg" alt="GenAI OpenTelemetry Instrumentation Overview" width="800"/>
213
- </div>
214
-
215
- ---
216
-
217
- Production-ready OpenTelemetry instrumentation for GenAI/LLM applications with zero-code setup.
218
-
219
- ## Features
220
-
221
- 🚀 **Zero-Code Instrumentation** - Just install and set env vars
222
- 🤖 **15+ LLM Providers** - OpenAI, Anthropic, Google, AWS, Azure, and more
223
- 🔧 **MCP Tool Support** - Auto-instrument databases, APIs, caches, vector DBs
224
- 💰 **Cost Tracking** - Automatic cost calculation for both streaming and non-streaming requests
225
- ⚡ **Streaming Support** - Full observability for streaming responses with TTFT/TBT metrics and cost tracking
226
- 🎮 **GPU Metrics** - Real-time GPU utilization, memory, temperature, power, and electricity cost tracking
227
- 📊 **Complete Observability** - Traces, metrics, and rich span attributes
228
- ➕ **Service Instance ID & Environment** - Identify your services and environments
229
- ⏱️ **Configurable Exporter Timeout** - Set timeout for OTLP exporter
230
- 🔗 **OpenInference Instrumentors** - Smolagents, MCP, and LiteLLM instrumentation
231
-
232
- ## Quick Start
233
-
234
- ### Installation
235
-
236
- ```bash
237
- pip install genai-otel-instrument
238
- ```
239
-
240
- ### Usage
241
-
242
- **Option 1: Environment Variables (No code changes)**
243
-
244
- ```bash
245
- export OTEL_SERVICE_NAME=my-llm-app
246
- export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
247
- python your_app.py
248
- ```
249
-
250
- **Option 2: One line of code**
251
-
252
- ```python
253
- import genai_otel
254
- genai_otel.instrument()
255
-
256
- # Your existing code works unchanged
257
- import openai
258
- client = openai.OpenAI()
259
- response = client.chat.completions.create(...)
260
- ```
261
-
262
- **Option 3: CLI wrapper**
263
-
264
- ```bash
265
- genai-instrument python your_app.py
266
- ```
267
-
268
- For a more comprehensive demonstration of various LLM providers and MCP tools, refer to `example_usage.py` in the project root. Note that running this example requires setting up relevant API keys and external services (e.g., databases, Redis, Pinecone).
269
-
270
- ## What Gets Instrumented?
271
-
272
- ### LLM Providers (Auto-detected)
273
- - **With Full Cost Tracking**: OpenAI, Anthropic, Google AI, AWS Bedrock, Azure OpenAI, Cohere, Mistral AI, Together AI, Groq, Ollama, Vertex AI
274
- - **Hardware/Local Pricing**: Replicate (hardware-based $/second), HuggingFace (local execution with estimated costs)
275
- - **HuggingFace Support**: `pipeline()`, `AutoModelForCausalLM.generate()`, `AutoModelForSeq2SeqLM.generate()`, `InferenceClient` API calls
276
- - **Other Providers**: Anyscale
277
-
278
- ### Frameworks
279
- - LangChain (chains, agents, tools)
280
- - LlamaIndex (query engines, indices)
281
-
282
- ### MCP Tools (Model Context Protocol)
283
- - **Databases**: PostgreSQL, MySQL, MongoDB, SQLAlchemy
284
- - **Caching**: Redis
285
- - **Message Queues**: Apache Kafka
286
- - **Vector Databases**: Pinecone, Weaviate, Qdrant, ChromaDB, Milvus, FAISS
287
- - **APIs**: HTTP/REST requests (requests, httpx)
288
-
289
- ### OpenInference (Optional - Python 3.10+ only)
290
- - Smolagents - HuggingFace smolagents framework tracing
291
- - MCP - Model Context Protocol instrumentation
292
- - LiteLLM - Multi-provider LLM proxy
293
-
294
- **Cost Enrichment:** OpenInference instrumentors are automatically enriched with cost tracking! When cost tracking is enabled (`GENAI_ENABLE_COST_TRACKING=true`), a custom `CostEnrichmentSpanProcessor` extracts model and token usage from OpenInference spans and adds cost attributes (`gen_ai.usage.cost.total`, `gen_ai.usage.cost.prompt`, `gen_ai.usage.cost.completion`) using our comprehensive pricing database of 145+ models. A sketch of this enrichment arithmetic follows the list below.
295
-
296
- The processor supports OpenInference semantic conventions:
297
- - Model: `llm.model_name`, `embedding.model_name`
298
- - Tokens: `llm.token_count.prompt`, `llm.token_count.completion`
299
- - Operations: `openinference.span.kind` (LLM, EMBEDDING, CHAIN, RETRIEVER, etc.)
300
-
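The snippet below is a minimal, self-contained sketch of the cost arithmetic such a processor performs. The pricing values and the `cost_attributes` helper are illustrative only, not the package's actual `CostEnrichmentSpanProcessor` implementation.

```python
# Illustrative sketch only -- not the package's actual implementation.
# Maps OpenInference span attributes to gen_ai.usage.cost.* values
# using a per-model pricing table (USD per 1k tokens).

# Hypothetical pricing table: (prompt price, completion price) per 1k tokens.
PRICING = {"gpt-4o": (0.0025, 0.010)}

def cost_attributes(span_attrs: dict) -> dict:
    """Derive cost attributes from an OpenInference span, when possible."""
    model = span_attrs.get("llm.model_name") or span_attrs.get("embedding.model_name")
    prompt = span_attrs.get("llm.token_count.prompt")
    completion = span_attrs.get("llm.token_count.completion", 0)
    if model not in PRICING or prompt is None:
        return {}
    prompt_price, completion_price = PRICING[model]
    prompt_cost = prompt / 1000 * prompt_price
    completion_cost = completion / 1000 * completion_price
    return {
        "gen_ai.usage.cost.prompt": prompt_cost,
        "gen_ai.usage.cost.completion": completion_cost,
        "gen_ai.usage.cost.total": prompt_cost + completion_cost,
    }

# Example: 1200 prompt + 300 completion tokens on "gpt-4o"
# -> 1.2 * 0.0025 + 0.3 * 0.010 = 0.006 USD total.
```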
301
- **Note:** OpenInference instrumentors require Python >= 3.10. Install with:
302
- ```bash
303
- pip install genai-otel-instrument[openinference]
304
- ```
305
-
306
- ## Screenshots
307
-
308
- See the instrumentation in action across different LLM providers and observability backends.
309
-
310
- ### OpenAI Instrumentation
311
- Full trace capture for OpenAI API calls with token usage, costs, and latency metrics.
312
-
313
- <div align="center">
314
- <img src=".github/images/Screenshots/Traces_OpenAI.png" alt="OpenAI Traces" width="900"/>
315
- </div>
316
-
317
- ### Ollama (Local LLM) Instrumentation
318
- Zero-code instrumentation for local models running on Ollama with comprehensive observability.
319
-
320
- <div align="center">
321
- <img src=".github/images/Screenshots/Traces_Ollama.png" alt="Ollama Traces" width="900"/>
322
- </div>
323
-
324
- ### HuggingFace Transformers
325
- Direct instrumentation of HuggingFace Transformers with automatic token counting and cost estimation.
326
-
327
- <div align="center">
328
- <img src=".github/images/Screenshots/Trace_HuggingFace_Transformer_Models.png" alt="HuggingFace Transformer Traces" width="900"/>
329
- </div>
330
-
331
- ### SmolAgents Framework
332
- Complete agent workflow tracing with tool calls, iterations, and cost breakdown.
333
-
334
- <div align="center">
335
- <img src=".github/images/Screenshots/Traces_SmolAgent_with_tool_calls.png" alt="SmolAgent Traces with Tool Calls" width="900"/>
336
- </div>
337
-
338
- ### GPU Metrics Collection
339
- Real-time GPU utilization, memory, temperature, and power consumption metrics.
340
-
341
- <div align="center">
342
- <img src=".github/images/Screenshots/GPU_Metrics.png" alt="GPU Metrics Dashboard" width="900"/>
343
- </div>
344
-
345
- ### Additional Screenshots
346
-
347
- - **[Token Cost Breakdown](.github/images/Screenshots/Traces_SmolAgent_Token_Cost_breakdown.png)** - Detailed token usage and cost analysis for SmolAgent workflows
348
- - **[OpenSearch Dashboard](.github/images/Screenshots/GENAI_OpenSearch_output.png)** - GenAI metrics visualization in OpenSearch/Kibana
349
-
350
- ---
351
-
352
- ## Demo Video
353
-
354
- Watch a comprehensive walkthrough of GenAI OpenTelemetry Auto-Instrumentation in action, demonstrating setup, configuration, and real-time observability across multiple LLM providers.
355
-
356
- <div align="center">
357
-
358
- **🎥 [Watch Demo Video](https://youtu.be/YOUR_VIDEO_ID_HERE)**
359
- *(Coming Soon)*
360
-
361
- </div>
362
-
363
- ---
364
-
365
- ## Cost Tracking Coverage
366
-
367
- The library includes comprehensive cost tracking with pricing data for **145+ models** across **11 providers**:
368
-
369
- ### Providers with Full Token-Based Cost Tracking
370
- - **OpenAI**: GPT-4o, GPT-4 Turbo, GPT-3.5 Turbo, o1/o3 series, embeddings, audio, vision (35+ models)
371
- - **Anthropic**: Claude 3.5 Sonnet/Opus/Haiku, Claude 3 series (10+ models)
372
- - **Google AI**: Gemini 1.5/2.0 Pro/Flash, PaLM 2 (12+ models)
373
- - **AWS Bedrock**: Amazon Titan, Claude, Llama, Mistral models (20+ models)
374
- - **Azure OpenAI**: Same as OpenAI with Azure-specific pricing
375
- - **Cohere**: Command R/R+, Command Light, Embed v3/v2 (8+ models)
376
- - **Mistral AI**: Mistral Large/Medium/Small, Mixtral, embeddings (8+ models)
377
- - **Together AI**: DeepSeek-R1, Llama 3.x, Qwen, Mixtral (25+ models)
378
- - **Groq**: Llama 3.x series, Mixtral, Gemma models (15+ models)
379
- - **Ollama**: Local models with token tracking (pricing via cost estimation)
380
- - **Vertex AI**: Gemini models via Google Cloud with usage metadata extraction
381
-
382
- ### Special Pricing Models
383
- - **Replicate**: Hardware-based pricing ($/second of GPU/CPU time) - not token-based
384
- - **HuggingFace Transformers**: Local model execution with estimated costs based on parameter count
385
- - Supports `pipeline()`, `AutoModelForCausalLM.generate()`, `AutoModelForSeq2SeqLM.generate()`
386
- - Cost estimation uses GPU/compute resource pricing tiers (tiny/small/medium/large)
387
- - Automatic token counting from tensor shapes
388
-
389
- ### Pricing Features
390
- - **Differential Pricing**: Separate rates for prompt tokens vs. completion tokens
391
- - **Reasoning Tokens**: Special pricing for OpenAI o1/o3 reasoning tokens
392
- - **Cache Pricing**: Anthropic prompt caching costs (read/write)
393
- - **Granular Cost Metrics**: Per-request cost breakdown by token type
394
- - **Auto-Updated Pricing**: Pricing data maintained in `llm_pricing.json`
395
- - **Custom Pricing**: Add pricing for custom/proprietary models via environment variable
396
-
397
- ### Adding Custom Model Pricing
398
-
399
- For custom or proprietary models not in `llm_pricing.json`, you can provide custom pricing via the `GENAI_CUSTOM_PRICING_JSON` environment variable:
400
-
401
- ```bash
402
- # For chat models
403
- export GENAI_CUSTOM_PRICING_JSON='{"chat":{"my-custom-model":{"promptPrice":0.001,"completionPrice":0.002}}}'
404
-
405
- # For embeddings
406
- export GENAI_CUSTOM_PRICING_JSON='{"embeddings":{"my-custom-embeddings":0.00005}}'
407
-
408
- # For multiple categories
409
- export GENAI_CUSTOM_PRICING_JSON='{
410
- "chat": {
411
- "my-custom-chat": {"promptPrice": 0.001, "completionPrice": 0.002}
412
- },
413
- "embeddings": {
414
- "my-custom-embed": 0.00005
415
- },
416
- "audio": {
417
- "my-custom-tts": 0.02
418
- }
419
- }'
420
- ```
421
-
422
- **Pricing Format:**
423
- - **Chat models**: `{"promptPrice": <$/1k tokens>, "completionPrice": <$/1k tokens>}`
424
- - **Embeddings**: Single number for price per 1k tokens
425
- - **Audio**: Price per 1k characters (TTS) or per second (STT)
426
- - **Images**: Nested structure with quality/size pricing (see `llm_pricing.json` for examples)
427
-
428
- **Hybrid Pricing:** Custom prices are merged with default pricing from `llm_pricing.json`. If you provide custom pricing for an existing model, the custom price overrides the default.
429
-
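As a rough sketch of that merge behaviour (the default table below is hypothetical; the real defaults live in `llm_pricing.json`):

```python
# Sketch, assuming the merge semantics described above: custom entries
# from GENAI_CUSTOM_PRICING_JSON override defaults model-by-model.
import json
import os

defaults = {"chat": {"gpt-4o": {"promptPrice": 0.0025, "completionPrice": 0.010}}}
custom = json.loads(os.environ.get("GENAI_CUSTOM_PRICING_JSON", "{}"))

merged = {category: dict(models) for category, models in defaults.items()}
for category, models in custom.items():
    merged.setdefault(category, {}).update(models)

def chat_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Prices are USD per 1k tokens, so token counts are divided by 1000."""
    price = merged["chat"][model]
    return (prompt_tokens / 1000) * price["promptPrice"] \
         + (completion_tokens / 1000) * price["completionPrice"]

# chat_cost("gpt-4o", 1000, 500) -> 1.0 * 0.0025 + 0.5 * 0.010 = 0.0075 USD
```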
430
- **Coverage Statistics**: As of v0.1.3, the test suite reports 89% coverage with 415 passing tests, including comprehensive cost-calculation validation and cost enrichment processor tests (covering both GenAI and OpenInference semantic conventions).
431
-
432
- ## Collected Telemetry
433
-
434
- ### Traces
435
- Every LLM call, database query, API request, and vector search is traced with full context propagation.
436
-
437
- ### Metrics
438
-
439
- **GenAI Metrics:**
440
- - `gen_ai.requests` - Request counts by provider and model
441
- - `gen_ai.client.token.usage` - Token usage (prompt/completion)
442
- - `gen_ai.client.operation.duration` - Request latency histogram (optimized buckets for LLM workloads)
443
- - `gen_ai.usage.cost` - Total estimated costs in USD
444
- - `gen_ai.usage.cost.prompt` - Prompt tokens cost (granular)
445
- - `gen_ai.usage.cost.completion` - Completion tokens cost (granular)
446
- - `gen_ai.usage.cost.reasoning` - Reasoning tokens cost (OpenAI o1 models)
447
- - `gen_ai.usage.cost.cache_read` - Cache read cost (Anthropic)
448
- - `gen_ai.usage.cost.cache_write` - Cache write cost (Anthropic)
449
- - `gen_ai.client.errors` - Error counts by operation and type
450
- - `gen_ai.gpu.*` - GPU utilization, memory, temperature, power (ObservableGauges)
451
- - `gen_ai.co2.emissions` - CO2 emissions tracking (opt-in via `GENAI_ENABLE_CO2_TRACKING`)
452
- - `gen_ai.power.cost` - Cumulative electricity cost in USD based on GPU power consumption (configurable via `GENAI_POWER_COST_PER_KWH`)
453
- - `gen_ai.server.ttft` - Time to First Token for streaming responses (histogram, 1ms-10s buckets)
454
- `gen_ai.server.tbt` - Time Between Tokens for streaming responses (histogram, 10ms-2.5s buckets); see the sketch after this list for how TTFT/TBT are derived
455
-
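The library records TTFT/TBT automatically; the sketch below only illustrates how the two quantities are defined, using a plain OpenAI streaming call (the model name is arbitrary):

```python
# Illustrative only: TTFT = first chunk arrival - request start;
# TBT = gap between consecutive chunks.
import time

from openai import OpenAI

client = OpenAI()
start = time.perf_counter()
first_at, last_at, gaps = None, None, []

stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
    stream=True,
)
for chunk in stream:
    now = time.perf_counter()
    if first_at is None:
        first_at = now               # first chunk: fixes TTFT
    else:
        gaps.append(now - last_at)   # later chunks: one TBT sample each
    last_at = now

print("TTFT (s):", first_at - start)
print("avg TBT (s):", sum(gaps) / len(gaps) if gaps else 0.0)
```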
456
- **MCP Metrics (Database Operations):**
457
- - `mcp.requests` - Number of MCP/database requests
458
- - `mcp.client.operation.duration` - Operation duration histogram (1ms to 10s buckets)
459
- - `mcp.request.size` - Request payload size histogram (100B to 5MB buckets)
460
- - `mcp.response.size` - Response payload size histogram (100B to 5MB buckets)
461
-
462
- ### Span Attributes
463
- **Core Attributes:**
464
- - `gen_ai.system` - Provider name (e.g., "openai")
465
- - `gen_ai.operation.name` - Operation type (e.g., "chat")
466
- - `gen_ai.request.model` - Model identifier
467
- - `gen_ai.usage.prompt_tokens` / `gen_ai.usage.input_tokens` - Input tokens (dual emission supported)
468
- - `gen_ai.usage.completion_tokens` / `gen_ai.usage.output_tokens` - Output tokens (dual emission supported)
469
- - `gen_ai.usage.total_tokens` - Total tokens
470
-
471
- **Request Parameters:**
472
- - `gen_ai.request.temperature` - Temperature setting
473
- - `gen_ai.request.top_p` - Top-p sampling
474
- - `gen_ai.request.max_tokens` - Max tokens requested
475
- - `gen_ai.request.frequency_penalty` - Frequency penalty
476
- - `gen_ai.request.presence_penalty` - Presence penalty
477
-
478
- **Response Attributes:**
479
- - `gen_ai.response.id` - Response ID from provider
480
- - `gen_ai.response.model` - Actual model used (may differ from request)
481
- - `gen_ai.response.finish_reasons` - Array of finish reasons
482
-
483
- **Tool/Function Calls:**
484
- - `llm.tools` - JSON-serialized tool definitions
485
- - `llm.output_messages.{choice}.message.tool_calls.{index}.tool_call.id` - Tool call ID
486
- - `llm.output_messages.{choice}.message.tool_calls.{index}.tool_call.function.name` - Function name
487
- - `llm.output_messages.{choice}.message.tool_calls.{index}.tool_call.function.arguments` - Function arguments
488
-
489
- **Cost Attributes (granular):**
490
- - `gen_ai.usage.cost.total` - Total cost
491
- - `gen_ai.usage.cost.prompt` - Prompt tokens cost
492
- - `gen_ai.usage.cost.completion` - Completion tokens cost
493
- - `gen_ai.usage.cost.reasoning` - Reasoning tokens cost (o1 models)
494
- - `gen_ai.usage.cost.cache_read` - Cache read cost (Anthropic)
495
- - `gen_ai.usage.cost.cache_write` - Cache write cost (Anthropic)
496
-
497
- **Streaming Attributes:**
498
- - `gen_ai.server.ttft` - Time to First Token (seconds) for streaming responses
499
- - `gen_ai.streaming.token_count` - Total number of chunks in streaming response
500
- - `gen_ai.usage.prompt_tokens` - Actual prompt tokens (extracted from final chunk)
501
- - `gen_ai.usage.completion_tokens` - Actual completion tokens (extracted from final chunk)
502
- - `gen_ai.usage.total_tokens` - Total tokens (extracted from final chunk)
503
- - `gen_ai.usage.cost.total` - Total cost for streaming request
504
- - `gen_ai.usage.cost.prompt` - Prompt tokens cost for streaming request
505
- - `gen_ai.usage.cost.completion` - Completion tokens cost for streaming request
506
- - All granular cost attributes (reasoning, cache_read, cache_write) also available for streaming
507
-
508
- **Content Events (opt-in):**
509
- - `gen_ai.prompt.{index}` events with role and content
510
- - `gen_ai.completion.{index}` events with role and content
511
-
512
- **Additional:**
513
- - Database, vector DB, and API attributes from MCP instrumentation
514
-
515
- ## Configuration
516
-
517
- ### Environment Variables
518
-
519
- ```bash
520
- # Required
521
- OTEL_SERVICE_NAME=my-app
522
- OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
523
-
524
- # Optional
525
- OTEL_EXPORTER_OTLP_HEADERS=x-api-key=secret
526
- GENAI_ENABLE_GPU_METRICS=true
527
- GENAI_ENABLE_COST_TRACKING=true
528
- GENAI_ENABLE_MCP_INSTRUMENTATION=true
529
- GENAI_GPU_COLLECTION_INTERVAL=5 # GPU metrics collection interval in seconds (default: 5)
530
- OTEL_SERVICE_INSTANCE_ID=instance-1 # Optional service instance id
531
- OTEL_ENVIRONMENT=production # Optional environment
532
- OTEL_EXPORTER_OTLP_TIMEOUT=10.0 # Optional timeout for OTLP exporter
533
-
534
- # Semantic conventions (NEW)
535
- OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai # "gen_ai" for new conventions only, "gen_ai/dup" for dual emission
536
- GENAI_ENABLE_CONTENT_CAPTURE=false # WARNING: May capture sensitive data. Enable with caution.
537
-
538
- # Logging configuration
539
- GENAI_OTEL_LOG_LEVEL=INFO # DEBUG, INFO, WARNING, ERROR, CRITICAL. Logs are written to 'logs/genai_otel.log' with rotation (10 files, 10MB each).
540
-
541
- # Error handling
542
- GENAI_FAIL_ON_ERROR=false # true to fail fast, false to continue on errors
543
- ```
544
-
545
- ### Programmatic Configuration
546
-
547
- ```python
548
- import genai_otel
549
-
550
- genai_otel.instrument(
551
- service_name="my-app",
552
- endpoint="http://localhost:4318",
553
- enable_gpu_metrics=True,
554
- enable_cost_tracking=True,
555
- enable_mcp_instrumentation=True
556
- )
557
- ```
558
-
559
- ### Sample Environment File (`sample.env`)
560
-
561
- A `sample.env` file is provided in the project root. It contains commented-out examples of all supported environment variables, along with their default values or expected formats. Copy it to `.env` and uncomment or modify the variables to configure the instrumentation for your needs, as shown below.
562
-
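A typical dotenv workflow (assuming a POSIX shell):

```bash
cp sample.env .env
# Uncomment and edit the variables you need, then export them before starting:
set -a && source .env && set +a
python your_app.py
```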
563
- ## Advanced Features
564
-
565
- ### Session and User Tracking
566
-
567
- Track user sessions and identify users across multiple LLM requests for better analytics, debugging, and cost attribution.
568
-
569
- **Configuration:**
570
-
571
- ```python
572
- import genai_otel
573
- from genai_otel import OTelConfig
574
-
575
- # Define extractor functions
576
- def extract_session_id(instance, args, kwargs):
577
- """Extract session ID from request metadata."""
578
- # Option 1: From kwargs metadata
579
- metadata = kwargs.get("metadata", {})
580
- return metadata.get("session_id")
581
-
582
- # Option 2: From custom headers
583
- # headers = kwargs.get("headers", {})
584
- # return headers.get("X-Session-ID")
585
-
586
- # Option 3: From thread-local storage
587
- # import threading
588
- # return getattr(threading.current_thread(), "session_id", None)
589
-
590
- def extract_user_id(instance, args, kwargs):
591
- """Extract user ID from request metadata."""
592
- metadata = kwargs.get("metadata", {})
593
- return metadata.get("user_id")
594
-
595
- # Configure with extractors
596
- config = OTelConfig(
597
- service_name="my-rag-app",
598
- endpoint="http://localhost:4318",
599
- session_id_extractor=extract_session_id,
600
- user_id_extractor=extract_user_id,
601
- )
602
-
603
- genai_otel.instrument(config)
604
- ```
605
-
606
- **Usage:**
607
-
608
- ```python
609
- from openai import OpenAI
610
-
611
- client = OpenAI()
612
-
613
- # Pass session and user info via metadata
614
- response = client.chat.completions.create(
615
- model="gpt-3.5-turbo",
616
- messages=[{"role": "user", "content": "What is OpenTelemetry?"}],
617
- extra_body={"metadata": {"session_id": "sess_12345", "user_id": "user_alice"}}
618
- )
619
- ```
620
-
621
- **Span Attributes Added:**
622
- - `session.id` - Unique session identifier for tracking conversations
623
- - `user.id` - User identifier for per-user analytics and cost tracking
624
-
625
- **Use Cases:**
626
- - Track multi-turn conversations across requests
627
- - Analyze usage patterns per user
628
- - Debug session-specific issues
629
- - Calculate per-user costs and quotas
630
- - Build user-specific dashboards
631
-
632
- ### RAG and Embedding Attributes
633
-
634
- Enhanced observability for Retrieval-Augmented Generation (RAG) workflows, including embedding generation and document retrieval.
635
-
636
- **Helper Methods:**
637
-
638
- The `BaseInstrumentor` provides helper methods to add RAG-specific attributes to your spans:
639
-
640
- ```python
641
- from opentelemetry import trace
642
- from genai_otel.instrumentors.base import BaseInstrumentor
643
-
644
- # Get your instrumentor instance (or create spans manually)
645
- tracer = trace.get_tracer(__name__)
646
-
647
- # 1. Embedding Attributes
648
- with tracer.start_as_current_span("embedding.create") as span:
649
- # Your embedding logic
650
- embedding_response = client.embeddings.create(
651
- model="text-embedding-3-small",
652
- input="OpenTelemetry provides observability"
653
- )
654
-
655
- # Add embedding attributes (if using BaseInstrumentor)
656
- # instrumentor.add_embedding_attributes(
657
- # span,
658
- # model="text-embedding-3-small",
659
- # input_text="OpenTelemetry provides observability",
660
- # vector=embedding_response.data[0].embedding
661
- # )
662
-
663
- # Or manually set attributes
664
- span.set_attribute("embedding.model_name", "text-embedding-3-small")
665
- span.set_attribute("embedding.text", "OpenTelemetry provides observability"[:500])
666
- span.set_attribute("embedding.vector.dimension", len(embedding_response.data[0].embedding))
667
-
668
- # 2. Retrieval Attributes
669
- with tracer.start_as_current_span("retrieval.search") as span:
670
- # Your retrieval logic
671
- retrieved_docs = [
672
- {
673
- "id": "doc_001",
674
- "score": 0.95,
675
- "content": "OpenTelemetry is an observability framework...",
676
- "metadata": {"source": "docs.opentelemetry.io", "category": "intro"}
677
- },
678
- # ... more documents
679
- ]
680
-
681
- # Add retrieval attributes (if using BaseInstrumentor)
682
- # instrumentor.add_retrieval_attributes(
683
- # span,
684
- # documents=retrieved_docs,
685
- # query="What is OpenTelemetry?",
686
- # max_docs=5
687
- # )
688
-
689
- # Or manually set attributes
690
- span.set_attribute("retrieval.query", "What is OpenTelemetry?"[:500])
691
- span.set_attribute("retrieval.document_count", len(retrieved_docs))
692
-
693
- for i, doc in enumerate(retrieved_docs[:5]): # Limit to 5 docs
694
- prefix = f"retrieval.documents.{i}.document"
695
- span.set_attribute(f"{prefix}.id", doc["id"])
696
- span.set_attribute(f"{prefix}.score", doc["score"])
697
- span.set_attribute(f"{prefix}.content", doc["content"][:500])
698
-
699
- # Add metadata
700
- for key, value in doc.get("metadata", {}).items():
701
- span.set_attribute(f"{prefix}.metadata.{key}", str(value))
702
- ```
703
-
704
- **Embedding Attributes:**
705
- - `embedding.model_name` - Embedding model used
706
- - `embedding.text` - Input text (truncated to 500 chars)
707
- - `embedding.vector` - Embedding vector (optional, if configured)
708
- - `embedding.vector.dimension` - Vector dimensions
709
-
710
- **Retrieval Attributes:**
711
- - `retrieval.query` - Search query (truncated to 500 chars)
712
- - `retrieval.document_count` - Number of documents retrieved
713
- - `retrieval.documents.{i}.document.id` - Document ID
714
- - `retrieval.documents.{i}.document.score` - Relevance score
715
- - `retrieval.documents.{i}.document.content` - Document content (truncated to 500 chars)
716
- - `retrieval.documents.{i}.document.metadata.*` - Custom metadata fields
717
-
718
- **Safeguards:**
719
- - Text content truncated to 500 characters to avoid span size explosion
720
- - Document count limited to 5 by default (configurable via `max_docs`)
721
- Metadata values truncated to keep attribute sizes bounded
722
-
723
- **Complete RAG Workflow Example:**
724
-
725
- See `examples/phase4_session_rag_tracking.py` for a comprehensive demonstration of:
726
- - Session and user tracking across RAG pipeline
727
- - Embedding attribute capture
728
- - Retrieval attribute capture
729
- - End-to-end RAG workflow with full observability
730
-
731
- **Use Cases:**
732
- - Monitor retrieval quality and relevance scores
733
- - Debug RAG pipeline performance
734
- - Track embedding model usage
735
- - Analyze document retrieval patterns
736
- - Optimize vector search configurations
737
-
738
- ## Example: Full-Stack GenAI App
739
-
740
- ```python
741
- import genai_otel
742
- genai_otel.instrument()
743
-
744
- import openai
745
- import pinecone
746
- import redis
747
- import psycopg2
748
-
749
- # All of these are automatically instrumented:
750
-
751
- # Cache check
752
- cache = redis.Redis().get('key')
753
-
754
- # Vector search
755
- pinecone_index = pinecone.Index("embeddings")
756
- results = pinecone_index.query(vector=[...], top_k=5)
757
-
758
- # Database query
759
- conn = psycopg2.connect("dbname=mydb")
760
- cursor = conn.cursor()
761
- cursor.execute("SELECT * FROM context")
762
-
763
- # LLM call with full context
764
- client = openai.OpenAI()
765
- response = client.chat.completions.create(
766
- model="gpt-4",
767
- messages=[...]
768
- )
769
-
770
- # You get:
771
- # ✓ Distributed traces across all services
772
- # ✓ Cost tracking for the LLM call
773
- # ✓ Performance metrics for DB, cache, vector DB
774
- # ✓ GPU metrics if using local models
775
- # ✓ Complete observability with zero manual instrumentation
776
- ```
777
-
778
- ## Backend Integration
779
-
780
- Works with any OpenTelemetry-compatible backend:
781
- - Jaeger, Zipkin
782
- - Prometheus, Grafana
783
- - Datadog, New Relic, Honeycomb
784
- - AWS X-Ray, Google Cloud Trace
785
- - Elastic APM, Splunk
786
- - Self-hosted OTEL Collector
787
-
788
- ## Project Structure
789
-
790
- ```bash
791
- genai-otel-instrument/
792
- ├── setup.py
793
- ├── MANIFEST.in
794
- ├── README.md
795
- ├── LICENSE
796
- ├── example_usage.py
797
- └── genai_otel/
798
- ├── __init__.py
799
- ├── config.py
800
- ├── auto_instrument.py
801
- ├── cli.py
802
- ├── cost_calculator.py
803
- ├── gpu_metrics.py
804
- ├── instrumentors/
805
- │ ├── __init__.py
806
- │ ├── base.py
807
- │ └── (other instrumentor files)
808
- └── mcp_instrumentors/
809
- ├── __init__.py
810
- ├── manager.py
811
- └── (other mcp files)
812
- ```
813
-
814
- ## Roadmap
815
-
816
- ### Next Release (v0.2.0) - Q1 2026
817
-
818
- We're planning significant enhancements for the next major release, focusing on evaluation metrics and safety guardrails alongside completing OpenTelemetry semantic convention compliance.
819
-
820
- #### 🎯 Evaluation & Monitoring
821
-
822
- **LLM Output Quality Metrics**
823
- - **Bias Detection** - Automatically detect and measure bias in LLM responses
824
- - Gender, racial, political, and cultural bias detection
825
- - Bias score metrics with configurable thresholds
826
- - Integration with fairness libraries (e.g., Fairlearn, AIF360)
827
-
828
- - **Toxicity Detection** - Monitor and alert on toxic or harmful content
829
- - Perspective API integration for toxicity scoring
830
- - Custom toxicity models support
831
- - Real-time toxicity metrics and alerts
832
- - Configurable severity levels
833
-
834
- - **Hallucination Detection** - Track factual accuracy and groundedness
835
- - Fact-checking against provided context
836
- - Citation validation for RAG applications
837
- - Confidence scoring for generated claims
838
- - Hallucination rate metrics by model and use case
839
-
840
- **Implementation:**
841
- ```python
842
- import genai_otel
843
-
844
- # Enable evaluation metrics
845
- genai_otel.instrument(
846
- enable_bias_detection=True,
847
- enable_toxicity_detection=True,
848
- enable_hallucination_detection=True,
849
-
850
- # Configure thresholds
851
- bias_threshold=0.7,
852
- toxicity_threshold=0.5,
853
- hallucination_threshold=0.8
854
- )
855
- ```
856
-
857
- **Metrics Added:**
858
- - `gen_ai.eval.bias_score` - Bias detection scores (histogram)
859
- - `gen_ai.eval.toxicity_score` - Toxicity scores (histogram)
860
- - `gen_ai.eval.hallucination_score` - Hallucination probability (histogram)
861
- - `gen_ai.eval.violations` - Count of threshold violations by type
862
-
863
- #### 🛡️ Safety Guardrails
864
-
865
- **Input/Output Filtering**
866
- - **Prompt Injection Detection** - Protect against prompt injection attacks
867
- - Pattern-based detection (jailbreaking attempts)
868
- - ML-based classifier for sophisticated attacks
869
- - Real-time blocking with configurable policies
870
- - Attack attempt metrics and logging
871
-
872
- - **Restricted Topics** - Block sensitive or inappropriate topics
873
- - Configurable topic blacklists (legal, medical, financial advice)
874
- - Industry-specific content filters
875
- - Topic detection with confidence scoring
876
- - Custom topic definition support
877
-
878
- - **Sensitive Information Protection** - Prevent PII leakage
879
- - PII detection (emails, phone numbers, SSN, credit cards)
880
- - Automatic redaction or blocking
881
- - Compliance mode (GDPR, HIPAA, PCI-DSS)
882
- - Data leak prevention metrics
883
-
884
- **Implementation:**
885
- ```python
886
- import genai_otel
887
-
888
- # Configure guardrails
889
- genai_otel.instrument(
890
- enable_prompt_injection_detection=True,
891
- enable_restricted_topics=True,
892
- enable_sensitive_info_detection=True,
893
-
894
- # Custom configuration
895
- restricted_topics=["medical_advice", "legal_advice", "financial_advice"],
896
- pii_detection_mode="block", # or "redact", "warn"
897
-
898
- # Callbacks for custom handling
899
- on_guardrail_violation=my_violation_handler
900
- )
901
- ```
902
-
903
- **Metrics Added:**
904
- - `gen_ai.guardrail.prompt_injection_detected` - Injection attempts blocked
905
- - `gen_ai.guardrail.restricted_topic_blocked` - Restricted topic violations
906
- - `gen_ai.guardrail.pii_detected` - PII detection events
907
- - `gen_ai.guardrail.violations` - Total guardrail violations by type
908
-
909
- **Span Attributes:**
910
- - `gen_ai.guardrail.violation_type` - Type of violation detected
911
- - `gen_ai.guardrail.violation_severity` - Severity level (low, medium, high, critical)
912
- - `gen_ai.guardrail.blocked` - Whether request was blocked (boolean)
913
- - `gen_ai.eval.bias_categories` - Detected bias types (array)
914
- - `gen_ai.eval.toxicity_categories` - Toxicity categories (array)
915
-
916
- #### 🔄 Migration Support
917
-
918
- **Backward Compatibility:**
919
- - All new features are opt-in via configuration
920
- - Existing instrumentation continues to work unchanged
921
- - Gradual migration path for new semantic conventions
922
-
923
- **Version Support:**
924
- - Python 3.9+ (evaluation features require 3.10+)
925
- - OpenTelemetry SDK 1.20.0+
926
- - Backward compatible with existing dashboards
927
-
928
- ### Future Releases
929
-
930
- **v0.3.0 - Advanced Analytics**
931
- - Custom metric aggregations
932
- - Cost optimization recommendations
933
- - Automated performance regression detection
934
- - A/B testing support for prompts
935
-
936
- **v0.4.0 - Enterprise Features**
937
- - Multi-tenancy support
938
- - Role-based access control for telemetry
939
- - Advanced compliance reporting
940
- - SLA monitoring and alerting
941
-
942
- **Community Feedback**
943
-
944
- We welcome feedback on our roadmap! Please:
945
- - Open issues for feature requests
946
- - Join discussions on prioritization
947
- - Share your use cases and requirements
948
-
949
- See [Contributing.md](Contributing.md) for how to get involved.
950
-
951
- ## License
952
- Apache-2.0 license
1
+ Metadata-Version: 2.4
2
+ Name: genai-otel-instrument
3
+ Version: 0.1.16
4
+ Summary: Comprehensive OpenTelemetry auto-instrumentation for LLM/GenAI applications
5
+ Author-email: Kshitij Thakkar <kshitijthakkar@rocketmail.com>
6
+ License: AGPL-3.0-or-later
7
+ Project-URL: Homepage, https://github.com/Mandark-droid/genai_otel_instrument
8
+ Project-URL: Repository, https://github.com/Mandark-droid/genai_otel_instrument
9
+ Project-URL: Documentation, https://github.com/Mandark-droid/genai_otel_instrument#readme
10
+ Project-URL: Issues, https://github.com/Mandark-droid/genai_otel_instrument/issues
11
+ Project-URL: Changelog, https://github.com/Mandark-droid/genai_otel_instrument/blob/main/CHANGELOG.md
12
+ Keywords: opentelemetry,observability,llm,genai,instrumentation,tracing,metrics,monitoring
13
+ Classifier: Development Status :: 4 - Beta
14
+ Classifier: Intended Audience :: Developers
15
+ Classifier: Topic :: Software Development :: Libraries :: Python Modules
16
+ Classifier: Topic :: System :: Monitoring
17
+ Classifier: License :: OSI Approved :: GNU Affero General Public License v3 or later (AGPLv3+)
18
+ Classifier: Operating System :: OS Independent
19
+ Classifier: Programming Language :: Python :: 3
20
+ Classifier: Programming Language :: Python :: 3.9
21
+ Classifier: Programming Language :: Python :: 3.10
22
+ Classifier: Programming Language :: Python :: 3.11
23
+ Classifier: Programming Language :: Python :: 3.12
24
+ Requires-Python: >=3.9
25
+ Description-Content-Type: text/markdown
26
+ License-File: LICENSE
27
+ Requires-Dist: opentelemetry-api<2.0.0,>=1.20.0
28
+ Requires-Dist: opentelemetry-sdk<2.0.0,>=1.20.0
29
+ Requires-Dist: opentelemetry-instrumentation>=0.41b0
30
+ Requires-Dist: opentelemetry-semantic-conventions<1.0.0,>=0.45b0
31
+ Requires-Dist: opentelemetry-exporter-otlp>=1.20.0
32
+ Requires-Dist: opentelemetry-instrumentation-requests>=0.41b0
33
+ Requires-Dist: opentelemetry-instrumentation-httpx>=0.41b0
34
+ Requires-Dist: requests>=2.20.0
35
+ Requires-Dist: wrapt>=1.14.0
36
+ Requires-Dist: httpx>=0.23.0
37
+ Requires-Dist: opentelemetry-instrumentation-mysql>=0.41b0
38
+ Requires-Dist: mysql-connector-python<9.0.0,>=8.0.0
39
+ Requires-Dist: opentelemetry-instrumentation-psycopg2>=0.41b0
40
+ Requires-Dist: psycopg2-binary>=2.9.0
41
+ Requires-Dist: opentelemetry-instrumentation-redis>=0.41b0
42
+ Requires-Dist: redis
43
+ Requires-Dist: opentelemetry-instrumentation-pymongo>=0.41b0
44
+ Requires-Dist: pymongo
45
+ Requires-Dist: opentelemetry-instrumentation-sqlalchemy>=0.41b0
46
+ Requires-Dist: sqlalchemy>=1.4.0
47
+ Requires-Dist: opentelemetry-instrumentation-kafka-python>=0.41b0
48
+ Requires-Dist: kafka-python
49
+ Provides-Extra: openinference
50
+ Requires-Dist: openinference-instrumentation==0.1.31; extra == "openinference"
51
+ Requires-Dist: openinference-instrumentation-litellm==0.1.19; extra == "openinference"
52
+ Requires-Dist: openinference-instrumentation-mcp==1.3.0; extra == "openinference"
53
+ Requires-Dist: openinference-instrumentation-smolagents==0.1.11; extra == "openinference"
54
+ Requires-Dist: litellm>=1.0.0; extra == "openinference"
55
+ Provides-Extra: gpu
56
+ Requires-Dist: nvidia-ml-py>=11.495.46; extra == "gpu"
57
+ Requires-Dist: codecarbon>=2.3.0; extra == "gpu"
58
+ Provides-Extra: co2
59
+ Requires-Dist: codecarbon>=2.3.0; extra == "co2"
60
+ Provides-Extra: openai
61
+ Requires-Dist: openai>=1.0.0; extra == "openai"
62
+ Provides-Extra: anthropic
63
+ Requires-Dist: anthropic>=0.18.0; extra == "anthropic"
64
+ Provides-Extra: google
65
+ Requires-Dist: google-generativeai>=0.3.0; extra == "google"
66
+ Provides-Extra: aws
67
+ Requires-Dist: boto3>=1.28.0; extra == "aws"
68
+ Provides-Extra: azure
69
+ Requires-Dist: azure-ai-openai>=1.0.0; extra == "azure"
70
+ Provides-Extra: cohere
71
+ Requires-Dist: cohere>=4.0.0; extra == "cohere"
72
+ Provides-Extra: mistral
73
+ Requires-Dist: mistralai>=0.4.2; extra == "mistral"
74
+ Provides-Extra: together
75
+ Requires-Dist: together>=0.2.0; extra == "together"
76
+ Provides-Extra: groq
77
+ Requires-Dist: groq>=0.4.0; extra == "groq"
78
+ Provides-Extra: ollama
79
+ Requires-Dist: ollama>=0.1.0; extra == "ollama"
80
+ Provides-Extra: replicate
81
+ Requires-Dist: replicate>=0.15.0; extra == "replicate"
82
+ Provides-Extra: langchain
83
+ Requires-Dist: langchain>=0.1.0; extra == "langchain"
84
+ Provides-Extra: llamaindex
85
+ Requires-Dist: llama-index>=0.9.0; extra == "llamaindex"
86
+ Provides-Extra: huggingface
87
+ Requires-Dist: transformers>=4.30.0; extra == "huggingface"
88
+ Provides-Extra: databases
89
+ Requires-Dist: opentelemetry-instrumentation-sqlalchemy>=0.41b0; extra == "databases"
90
+ Requires-Dist: sqlalchemy>=1.4.0; extra == "databases"
91
+ Requires-Dist: opentelemetry-instrumentation-redis>=0.41b0; extra == "databases"
92
+ Requires-Dist: redis; extra == "databases"
93
+ Requires-Dist: opentelemetry-instrumentation-pymongo>=0.41b0; extra == "databases"
94
+ Requires-Dist: pymongo; extra == "databases"
95
+ Requires-Dist: opentelemetry-instrumentation-psycopg2>=0.41b0; extra == "databases"
96
+ Requires-Dist: psycopg2-binary>=2.9.0; extra == "databases"
97
+ Requires-Dist: opentelemetry-instrumentation-mysql>=0.41b0; extra == "databases"
98
+ Requires-Dist: mysql-connector-python<9.0.0,>=8.0.0; extra == "databases"
99
+ Provides-Extra: messaging
100
+ Requires-Dist: opentelemetry-instrumentation-kafka-python>=0.41b0; extra == "messaging"
101
+ Requires-Dist: kafka-python; extra == "messaging"
102
+ Provides-Extra: vector-dbs
103
+ Requires-Dist: pinecone>=3.0.0; extra == "vector-dbs"
104
+ Requires-Dist: weaviate-client>=3.0.0; extra == "vector-dbs"
105
+ Requires-Dist: qdrant-client>=1.0.0; extra == "vector-dbs"
106
+ Requires-Dist: chromadb>=0.4.0; extra == "vector-dbs"
107
+ Requires-Dist: pymilvus>=2.3.0; extra == "vector-dbs"
108
+ Requires-Dist: faiss-cpu>=1.7.0; extra == "vector-dbs"
109
+ Provides-Extra: all-providers
110
+ Requires-Dist: openai>=1.0.0; extra == "all-providers"
111
+ Requires-Dist: anthropic>=0.18.0; extra == "all-providers"
112
+ Requires-Dist: google-generativeai>=0.3.0; extra == "all-providers"
113
+ Requires-Dist: boto3>=1.28.0; extra == "all-providers"
114
+ Requires-Dist: azure-ai-openai>=1.0.0; extra == "all-providers"
115
+ Requires-Dist: cohere>=4.0.0; extra == "all-providers"
116
+ Requires-Dist: mistralai>=0.4.2; extra == "all-providers"
117
+ Requires-Dist: together>=0.2.0; extra == "all-providers"
118
+ Requires-Dist: groq>=0.4.0; extra == "all-providers"
119
+ Requires-Dist: ollama>=0.1.0; extra == "all-providers"
120
+ Requires-Dist: replicate>=0.15.0; extra == "all-providers"
121
+ Requires-Dist: langchain>=0.1.0; extra == "all-providers"
122
+ Requires-Dist: llama-index>=0.9.0; extra == "all-providers"
123
+ Requires-Dist: transformers>=4.30.0; extra == "all-providers"
124
+ Requires-Dist: litellm>=1.0.0; extra == "all-providers"
125
+ Provides-Extra: all-mcp
126
+ Requires-Dist: opentelemetry-instrumentation-sqlalchemy>=0.41b0; extra == "all-mcp"
127
+ Requires-Dist: opentelemetry-instrumentation-redis>=0.41b0; extra == "all-mcp"
128
+ Requires-Dist: opentelemetry-instrumentation-pymongo>=0.41b0; extra == "all-mcp"
129
+ Requires-Dist: opentelemetry-instrumentation-psycopg2>=0.41b0; extra == "all-mcp"
130
+ Requires-Dist: opentelemetry-instrumentation-mysql>=0.41b0; extra == "all-mcp"
131
+ Requires-Dist: opentelemetry-instrumentation-kafka-python>=0.41b0; extra == "all-mcp"
132
+ Requires-Dist: pinecone>=3.0.0; extra == "all-mcp"
133
+ Requires-Dist: weaviate-client>=3.0.0; extra == "all-mcp"
134
+ Requires-Dist: qdrant-client>=1.0.0; extra == "all-mcp"
135
+ Requires-Dist: chromadb>=0.4.0; extra == "all-mcp"
136
+ Requires-Dist: pymilvus>=2.3.0; extra == "all-mcp"
137
+ Requires-Dist: faiss-cpu>=1.7.0; extra == "all-mcp"
138
+ Requires-Dist: sqlalchemy; extra == "all-mcp"
139
+ Provides-Extra: all
140
+ Requires-Dist: openai>=1.0.0; extra == "all"
141
+ Requires-Dist: anthropic>=0.18.0; extra == "all"
142
+ Requires-Dist: google-generativeai>=0.3.0; extra == "all"
143
+ Requires-Dist: boto3>=1.28.0; extra == "all"
144
+ Requires-Dist: azure-ai-openai>=1.0.0; extra == "all"
145
+ Requires-Dist: cohere>=4.0.0; extra == "all"
146
+ Requires-Dist: mistralai>=0.4.2; extra == "all"
147
+ Requires-Dist: together>=0.2.0; extra == "all"
148
+ Requires-Dist: groq>=0.4.0; extra == "all"
149
+ Requires-Dist: ollama>=0.1.0; extra == "all"
150
+ Requires-Dist: replicate>=0.15.0; extra == "all"
151
+ Requires-Dist: langchain>=0.1.0; extra == "all"
152
+ Requires-Dist: llama-index>=0.9.0; extra == "all"
153
+ Requires-Dist: transformers>=4.30.0; extra == "all"
154
+ Requires-Dist: nvidia-ml-py>=11.495.46; extra == "all"
155
+ Requires-Dist: opentelemetry-instrumentation-sqlalchemy>=0.41b0; extra == "all"
156
+ Requires-Dist: opentelemetry-instrumentation-redis>=0.41b0; extra == "all"
157
+ Requires-Dist: opentelemetry-instrumentation-pymongo>=0.41b0; extra == "all"
158
+ Requires-Dist: opentelemetry-instrumentation-psycopg2>=0.41b0; extra == "all"
159
+ Requires-Dist: opentelemetry-instrumentation-mysql>=0.41b0; extra == "all"
160
+ Requires-Dist: opentelemetry-instrumentation-kafka-python>=0.41b0; extra == "all"
161
+ Requires-Dist: pinecone>=3.0.0; extra == "all"
162
+ Requires-Dist: weaviate-client>=3.0.0; extra == "all"
163
+ Requires-Dist: qdrant-client>=1.0.0; extra == "all"
164
+ Requires-Dist: chromadb>=0.4.0; extra == "all"
165
+ Requires-Dist: pymilvus>=2.3.0; extra == "all"
166
+ Requires-Dist: faiss-cpu>=1.7.0; extra == "all"
167
+ Requires-Dist: sqlalchemy; extra == "all"
168
+ Provides-Extra: dev
169
+ Requires-Dist: pytest>=7.0.0; extra == "dev"
170
+ Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
171
+ Requires-Dist: pytest-mock>=3.10.0; extra == "dev"
172
+ Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
173
+ Requires-Dist: black>=23.0.0; extra == "dev"
174
+ Requires-Dist: isort>=5.12.0; extra == "dev"
175
+ Requires-Dist: pylint>=2.17.0; extra == "dev"
176
+ Requires-Dist: mypy>=1.0.0; extra == "dev"
177
+ Requires-Dist: build>=0.10.0; extra == "dev"
178
+ Requires-Dist: twine>=4.0.0; extra == "dev"
179
+ Dynamic: license-file
180
+
181
+ # TraceVerde
182
+
183
+ <div align="center">
184
+ <img src=".github/images/Logo.jpg" alt="TraceVerde - GenAI OpenTelemetry Instrumentation Logo" width="400"/>
185
+ </div>
186
+
187
+ <br/>
188
+
189
+ [![PyPI version](https://badge.fury.io/py/genai-otel-instrument.svg)](https://badge.fury.io/py/genai-otel-instrument)
190
+ [![Python Versions](https://img.shields.io/pypi/pyversions/genai-otel-instrument.svg)](https://pypi.org/project/genai-otel-instrument/)
191
+ [![License](https://img.shields.io/badge/License-AGPL%203.0-blue.svg)](https://www.gnu.org/licenses/agpl-3.0)
192
+ [![Downloads](https://static.pepy.tech/badge/genai-otel-instrument)](https://pepy.tech/project/genai-otel-instrument)
193
+ [![Downloads/Month](https://static.pepy.tech/badge/genai-otel-instrument/month)](https://pepy.tech/project/genai-otel-instrument)
194
+
195
+ [![GitHub Stars](https://img.shields.io/github/stars/Mandark-droid/genai_otel_instrument?style=social)](https://github.com/Mandark-droid/genai_otel_instrument)
196
+ [![GitHub Forks](https://img.shields.io/github/forks/Mandark-droid/genai_otel_instrument?style=social)](https://github.com/Mandark-droid/genai_otel_instrument)
197
+ [![GitHub Issues](https://img.shields.io/github/issues/Mandark-droid/genai_otel_instrument)](https://github.com/Mandark-droid/genai_otel_instrument/issues)
198
+ [![GitHub Pull Requests](https://img.shields.io/github/issues-pr/Mandark-droid/genai_otel_instrument)](https://github.com/Mandark-droid/genai_otel_instrument/pulls)
199
+
200
+ [![Code Coverage](https://img.shields.io/badge/coverage-90%25-brightgreen.svg)](https://github.com/Mandark-droid/genai_otel_instrument)
201
+ [![Code Style: Black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
202
+ [![Imports: isort](https://img.shields.io/badge/%20imports-isort-%231674b1?style=flat&labelColor=ef8336)](https://pycqa.github.io/isort/)
203
+ [![Type Checked: mypy](https://img.shields.io/badge/type%20checked-mypy-blue.svg)](http://mypy-lang.org/)
204
+
205
+ [![OpenTelemetry](https://img.shields.io/badge/OpenTelemetry-1.20%2B-blueviolet)](https://opentelemetry.io/)
206
+ [![Semantic Conventions](https://img.shields.io/badge/OTel%20Semconv-GenAI%20v1.28-orange)](https://opentelemetry.io/docs/specs/semconv/gen-ai/)
207
+ [![CI/CD](https://img.shields.io/badge/CI%2FCD-GitHub%20Actions-2088FF?logo=github-actions&logoColor=white)](https://github.com/Mandark-droid/genai_otel_instrument/actions)
208
+
209
+ ---
210
+
211
+ <div align="center">
212
+ <img src=".github/images/Landing_Page.jpg" alt="GenAI OpenTelemetry Instrumentation Overview" width="800"/>
213
+ </div>
214
+
215
+ ---
216
+
217
+ Production-ready OpenTelemetry instrumentation for GenAI/LLM applications with zero-code setup.
218
+
219
+ ## Features
220
+
221
+ - 🚀 **Zero-Code Instrumentation** - Just install and set env vars
222
+ - 🤖 **15+ LLM Providers** - OpenAI, Anthropic, Google, AWS, Azure, and more
223
+ - 🔧 **MCP Tool Support** - Auto-instrument databases, APIs, caches, vector DBs
224
+ - 💰 **Cost Tracking** - Automatic cost calculation for both streaming and non-streaming requests
225
+ - ⚡ **Streaming Support** - Full observability for streaming responses with TTFT/TBT metrics and cost tracking
226
+ - 🎮 **GPU Metrics** - Real-time GPU utilization, memory, temperature, power, and electricity cost tracking
227
+ - 📊 **Complete Observability** - Traces, metrics, and rich span attributes
228
+ - ➕ **Service Instance ID & Environment** - Identify your services and environments
229
+ - ⏱️ **Configurable Exporter Timeout** - Set timeout for OTLP exporter
230
+ - 🔗 **OpenInference Instrumentors** - Smolagents, MCP, and LiteLLM instrumentation
231
+
232
+ ## Quick Start
233
+
234
+ ### Installation
235
+
236
+ ```bash
237
+ pip install genai-otel-instrument
238
+ ```
239
+
240
+ ### Usage
241
+
242
+ **Option 1: Environment Variables (No code changes)**
243
+
244
+ ```bash
245
+ export OTEL_SERVICE_NAME=my-llm-app
246
+ export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
247
+ python your_app.py
248
+ ```
249
+
250
+ **Option 2: One line of code**
251
+
252
+ ```python
253
+ import genai_otel
254
+ genai_otel.instrument()
255
+
256
+ # Your existing code works unchanged
257
+ import openai
258
+ client = openai.OpenAI()
259
+ response = client.chat.completions.create(...)
260
+ ```
261
+
262
+ **Option 3: CLI wrapper**
263
+
264
+ ```bash
265
+ genai-instrument python your_app.py
266
+ ```
267
+
268
+ For a more comprehensive demonstration of various LLM providers and MCP tools, refer to `example_usage.py` in the project root. Note that running this example requires setting up relevant API keys and external services (e.g., databases, Redis, Pinecone).
269
+
270
+ ## What Gets Instrumented?
271
+
272
+ ### LLM Providers (Auto-detected)
273
+ - **With Full Cost Tracking**: OpenAI, Anthropic, Google AI, AWS Bedrock, Azure OpenAI, Cohere, Mistral AI, Together AI, Groq, Ollama, Vertex AI
274
+ - **Hardware/Local Pricing**: Replicate (hardware-based $/second), HuggingFace (local execution with estimated costs)
275
+ - **HuggingFace Support**: `pipeline()`, `AutoModelForCausalLM.generate()`, `AutoModelForSeq2SeqLM.generate()`, `InferenceClient` API calls
276
+ - **Other Providers**: Anyscale
277
+
278
+ ### Frameworks
279
+ - LangChain (chains, agents, tools)
280
+ - LlamaIndex (query engines, indices)
281
+
282
+ ### MCP Tools (Model Context Protocol)
283
+ - **Databases**: PostgreSQL, MySQL, MongoDB, SQLAlchemy
284
+ - **Caching**: Redis
285
+ - **Message Queues**: Apache Kafka
286
+ - **Vector Databases**: Pinecone, Weaviate, Qdrant, ChromaDB, Milvus, FAISS
287
+ - **APIs**: HTTP/REST requests (requests, httpx)
288
+
289
+ ### OpenInference (Optional - Python 3.10+ only)
290
+ - Smolagents - HuggingFace smolagents framework tracing
291
+ - MCP - Model Context Protocol instrumentation
292
+ - LiteLLM - Multi-provider LLM proxy
293
+
294
+ **Cost Enrichment:** OpenInference instrumentors are automatically enriched with cost tracking! When cost tracking is enabled (`GENAI_ENABLE_COST_TRACKING=true`), a custom `CostEnrichmentSpanProcessor` extracts model and token usage from OpenInference spans and adds cost attributes (`gen_ai.usage.cost.total`, `gen_ai.usage.cost.prompt`, `gen_ai.usage.cost.completion`) using our comprehensive pricing database of 145+ models.
295
+
296
+ The processor supports OpenInference semantic conventions:
297
+ - Model: `llm.model_name`, `embedding.model_name`
298
+ - Tokens: `llm.token_count.prompt`, `llm.token_count.completion`
299
+ - Operations: `openinference.span.kind` (LLM, EMBEDDING, CHAIN, RETRIEVER, etc.)
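+
+ As a minimal sketch, assuming the attribute names above and illustrative per-1k-token prices, the core of this enrichment amounts to a pricing lookup plus two multiplications (the actual `CostEnrichmentSpanProcessor` covers far more models and both convention families):
+
+ ```python
+ # Hypothetical sketch of the cost-enrichment logic -- illustrative only,
+ # not the library's actual CostEnrichmentSpanProcessor.
+ PRICING = {  # $ per 1k tokens (illustrative figures)
+     "gpt-4o": {"promptPrice": 0.0025, "completionPrice": 0.01},
+ }
+
+ def cost_attributes(span_attrs: dict) -> dict:
+     """Derive gen_ai.usage.cost.* attributes from OpenInference span attributes."""
+     model = span_attrs.get("llm.model_name") or span_attrs.get("embedding.model_name")
+     prices = PRICING.get(model)
+     if prices is None:
+         return {}
+     prompt = span_attrs.get("llm.token_count.prompt", 0)
+     completion = span_attrs.get("llm.token_count.completion", 0)
+     prompt_cost = prompt / 1000 * prices["promptPrice"]
+     completion_cost = completion / 1000 * prices["completionPrice"]
+     return {
+         "gen_ai.usage.cost.prompt": prompt_cost,
+         "gen_ai.usage.cost.completion": completion_cost,
+         "gen_ai.usage.cost.total": prompt_cost + completion_cost,
+     }
+ ```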
300
+
301
+ **Note:** OpenInference instrumentors require Python >= 3.10. Install with:
302
+ ```bash
303
+ pip install genai-otel-instrument[openinference]
304
+ ```
305
+
306
+ ## Screenshots
307
+
308
+ See the instrumentation in action across different LLM providers and observability backends.
309
+
310
+ ### OpenAI Instrumentation
311
+ Full trace capture for OpenAI API calls with token usage, costs, and latency metrics.
312
+
313
+ <div align="center">
314
+ <img src=".github/images/Screenshots/Traces_OpenAI.png" alt="OpenAI Traces" width="900"/>
315
+ </div>
316
+
317
+ ### Ollama (Local LLM) Instrumentation
318
+ Zero-code instrumentation for local models running on Ollama with comprehensive observability.
319
+
320
+ <div align="center">
321
+ <img src=".github/images/Screenshots/Traces_Ollama.png" alt="Ollama Traces" width="900"/>
322
+ </div>
323
+
324
+ ### HuggingFace Transformers
325
+ Direct instrumentation of HuggingFace Transformers with automatic token counting and cost estimation.
326
+
327
+ <div align="center">
328
+ <img src=".github/images/Screenshots/Trace_HuggingFace_Transformer_Models.png" alt="HuggingFace Transformer Traces" width="900"/>
329
+ </div>
330
+
331
+ ### SmolAgents Framework
332
+ Complete agent workflow tracing with tool calls, iterations, and cost breakdown.
333
+
334
+ <div align="center">
335
+ <img src=".github/images/Screenshots/Traces_SmolAgent_with_tool_calls.png" alt="SmolAgent Traces with Tool Calls" width="900"/>
336
+ </div>
337
+
338
+ ### GPU Metrics Collection
339
+ Real-time GPU utilization, memory, temperature, and power consumption metrics.
340
+
341
+ <div align="center">
342
+ <img src=".github/images/Screenshots/GPU_Metrics.png" alt="GPU Metrics Dashboard" width="900"/>
343
+ </div>
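+
+ Under the hood, GPU gauges like these are typically sampled via NVIDIA's NVML. A minimal sketch with `pynvml` (an assumption about the mechanism, not necessarily this library's implementation):
+
+ ```python
+ # Sampling the raw values behind gen_ai.gpu.* -- illustrative sketch.
+ import pynvml
+
+ pynvml.nvmlInit()
+ handle = pynvml.nvmlDeviceGetHandleByIndex(0)
+ utilization = pynvml.nvmlDeviceGetUtilizationRates(handle).gpu  # percent
+ memory_used = pynvml.nvmlDeviceGetMemoryInfo(handle).used       # bytes
+ temperature = pynvml.nvmlDeviceGetTemperature(handle, pynvml.NVML_TEMPERATURE_GPU)  # Celsius
+ power_watts = pynvml.nvmlDeviceGetPowerUsage(handle) / 1000.0   # NVML reports milliwatts
+ pynvml.nvmlShutdown()
+ ```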
344
+
345
+ ### Additional Screenshots
346
+
347
+ - **[Token Cost Breakdown](.github/images/Screenshots/Traces_SmolAgent_Token_Cost_breakdown.png)** - Detailed token usage and cost analysis for SmolAgent workflows
348
+ - **[OpenSearch Dashboard](.github/images/Screenshots/GENAI_OpenSearch_output.png)** - GenAI metrics visualization in OpenSearch/Kibana
349
+
350
+ ---
351
+
352
+ ## Demo Video
353
+
354
+ Watch a comprehensive walkthrough of GenAI OpenTelemetry Auto-Instrumentation in action, demonstrating setup, configuration, and real-time observability across multiple LLM providers.
355
+
356
+ <div align="center">
357
+
358
+ **🎥 [Watch Demo Video](https://youtu.be/YOUR_VIDEO_ID_HERE)**
359
+ *(Coming Soon)*
360
+
361
+ </div>
362
+
363
+ ---
364
+
365
+ ## Cost Tracking Coverage
366
+
367
+ The library includes comprehensive cost tracking with pricing data for **145+ models** across **11 providers**:
368
+
369
+ ### Providers with Full Token-Based Cost Tracking
370
+ - **OpenAI**: GPT-4o, GPT-4 Turbo, GPT-3.5 Turbo, o1/o3 series, embeddings, audio, vision (35+ models)
371
+ - **Anthropic**: Claude 3.5 Sonnet/Opus/Haiku, Claude 3 series (10+ models)
372
+ - **Google AI**: Gemini 1.5/2.0 Pro/Flash, PaLM 2 (12+ models)
373
+ - **AWS Bedrock**: Amazon Titan, Claude, Llama, Mistral models (20+ models)
374
+ - **Azure OpenAI**: Same as OpenAI with Azure-specific pricing
375
+ - **Cohere**: Command R/R+, Command Light, Embed v3/v2 (8+ models)
376
+ - **Mistral AI**: Mistral Large/Medium/Small, Mixtral, embeddings (8+ models)
377
+ - **Together AI**: DeepSeek-R1, Llama 3.x, Qwen, Mixtral (25+ models)
378
+ - **Groq**: Llama 3.x series, Mixtral, Gemma models (15+ models)
379
+ - **Ollama**: Local models with token tracking (pricing via cost estimation)
380
+ - **Vertex AI**: Gemini models via Google Cloud with usage metadata extraction
381
+
382
+ ### Special Pricing Models
383
+ - **Replicate**: Hardware-based pricing ($/second of GPU/CPU time) - not token-based
384
+ - **HuggingFace Transformers**: Local model execution with estimated costs based on parameter count
385
+ - Supports `pipeline()`, `AutoModelForCausalLM.generate()`, `AutoModelForSeq2SeqLM.generate()`
386
+ - Cost estimation uses GPU/compute resource pricing tiers (tiny/small/medium/large)
387
+ - Automatic token counting from tensor shapes
388
+
389
+ ### Pricing Features
390
+ - **Differential Pricing**: Separate rates for prompt tokens vs. completion tokens
391
+ - **Reasoning Tokens**: Special pricing for OpenAI o1/o3 reasoning tokens
392
+ - **Cache Pricing**: Anthropic prompt caching costs (read/write)
393
+ - **Granular Cost Metrics**: Per-request cost breakdown by token type
394
+ - **Auto-Updated Pricing**: Pricing data maintained in `llm_pricing.json`
395
+ - **Custom Pricing**: Add pricing for custom/proprietary models via environment variable
396
+
397
+ ### Adding Custom Model Pricing
398
+
399
+ For custom or proprietary models not in `llm_pricing.json`, you can provide custom pricing via the `GENAI_CUSTOM_PRICING_JSON` environment variable:
400
+
401
+ ```bash
402
+ # For chat models
403
+ export GENAI_CUSTOM_PRICING_JSON='{"chat":{"my-custom-model":{"promptPrice":0.001,"completionPrice":0.002}}}'
404
+
405
+ # For embeddings
406
+ export GENAI_CUSTOM_PRICING_JSON='{"embeddings":{"my-custom-embeddings":0.00005}}'
407
+
408
+ # For multiple categories
409
+ export GENAI_CUSTOM_PRICING_JSON='{
410
+ "chat": {
411
+ "my-custom-chat": {"promptPrice": 0.001, "completionPrice": 0.002}
412
+ },
413
+ "embeddings": {
414
+ "my-custom-embed": 0.00005
415
+ },
416
+ "audio": {
417
+ "my-custom-tts": 0.02
418
+ }
419
+ }'
420
+ ```
421
+
422
+ **Pricing Format:**
423
+ - **Chat models**: `{"promptPrice": <$/1k tokens>, "completionPrice": <$/1k tokens>}`
424
+ - **Embeddings**: Single number for price per 1k tokens
425
+ - **Audio**: Price per 1k characters (TTS) or per second (STT)
426
+ - **Images**: Nested structure with quality/size pricing (see `llm_pricing.json` for examples)
427
+
428
+ **Hybrid Pricing:** Custom prices are merged with default pricing from `llm_pricing.json`. If you provide custom pricing for an existing model, the custom price overrides the default.
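+
+ A minimal sketch of this merge behaviour, assuming category-level dictionary semantics where custom entries win on key collisions:
+
+ ```python
+ # Assumed merge semantics -- illustrative, not the library's exact code.
+ import json
+ import os
+
+ defaults = {"chat": {"gpt-4o": {"promptPrice": 0.0025, "completionPrice": 0.01}}}
+ custom = json.loads(os.environ.get("GENAI_CUSTOM_PRICING_JSON", "{}"))
+
+ merged = {
+     category: {**defaults.get(category, {}), **custom.get(category, {})}
+     for category in defaults.keys() | custom.keys()
+ }
+ # A custom "gpt-4o" entry would now override the default one.
+ ```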
429
+
430
+ **Coverage Statistics**: As of v0.1.3, 89% test coverage with 415 passing tests, including comprehensive cost calculation validation and cost enrichment processor tests (supporting both GenAI and OpenInference semantic conventions).
431
+
432
+ ## Collected Telemetry
433
+
434
+ ### Traces
435
+ Every LLM call, database query, API request, and vector search is traced with full context propagation.
436
+
437
+ ### Metrics
438
+
439
+ **GenAI Metrics:**
440
+ - `gen_ai.requests` - Request counts by provider and model
441
+ - `gen_ai.client.token.usage` - Token usage (prompt/completion)
442
+ - `gen_ai.client.operation.duration` - Request latency histogram (optimized buckets for LLM workloads)
443
+ - `gen_ai.usage.cost` - Total estimated costs in USD
444
+ - `gen_ai.usage.cost.prompt` - Prompt tokens cost (granular)
445
+ - `gen_ai.usage.cost.completion` - Completion tokens cost (granular)
446
+ - `gen_ai.usage.cost.reasoning` - Reasoning tokens cost (OpenAI o1 models)
447
+ - `gen_ai.usage.cost.cache_read` - Cache read cost (Anthropic)
448
+ - `gen_ai.usage.cost.cache_write` - Cache write cost (Anthropic)
449
+ - `gen_ai.client.errors` - Error counts by operation and type
450
+ - `gen_ai.gpu.*` - GPU utilization, memory, temperature, power (ObservableGauges)
451
+ - `gen_ai.co2.emissions` - CO2 emissions tracking (opt-in via `GENAI_ENABLE_CO2_TRACKING`)
452
+ - `gen_ai.power.cost` - Cumulative electricity cost in USD based on GPU power consumption (configurable via `GENAI_POWER_COST_PER_KWH`)
453
+ - `gen_ai.server.ttft` - Time to First Token for streaming responses (histogram, 1ms-10s buckets)
454
+ - `gen_ai.server.tbt` - Time Between Tokens for streaming responses (histogram, 10ms-2.5s buckets); see the timing sketch below
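+
+ The two streaming timings are straightforward to reason about: TTFT is the delay from request start to the first chunk, and TBT is the gap between consecutive chunks. A minimal sketch of capturing them around any chunk iterator (illustrative, not the library's internals):
+
+ ```python
+ # Illustrative timing wrapper for a streaming response.
+ import time
+
+ def timed_stream(stream):
+     """Yield chunks while recording TTFT and per-chunk gaps (TBT)."""
+     start = time.perf_counter()
+     previous = None
+     for index, chunk in enumerate(stream):
+         now = time.perf_counter()
+         if index == 0:
+             ttft = now - start    # feeds gen_ai.server.ttft
+         else:
+             tbt = now - previous  # feeds gen_ai.server.tbt
+         previous = now
+         yield chunk
+ ```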
455
+
456
+ **MCP Metrics (Database Operations):**
457
+ - `mcp.requests` - Number of MCP/database requests
458
+ - `mcp.client.operation.duration` - Operation duration histogram (1ms to 10s buckets)
459
+ - `mcp.request.size` - Request payload size histogram (100B to 5MB buckets)
460
+ - `mcp.response.size` - Response payload size histogram (100B to 5MB buckets)
461
+
462
+ ### Span Attributes
463
+ **Core Attributes:**
464
+ - `gen_ai.system` - Provider name (e.g., "openai")
465
+ - `gen_ai.operation.name` - Operation type (e.g., "chat")
466
+ - `gen_ai.request.model` - Model identifier
467
+ - `gen_ai.usage.prompt_tokens` / `gen_ai.usage.input_tokens` - Input tokens (dual emission supported)
468
+ - `gen_ai.usage.completion_tokens` / `gen_ai.usage.output_tokens` - Output tokens (dual emission supported)
469
+ - `gen_ai.usage.total_tokens` - Total tokens
470
+
471
+ **Request Parameters:**
472
+ - `gen_ai.request.temperature` - Temperature setting
473
+ - `gen_ai.request.top_p` - Top-p sampling
474
+ - `gen_ai.request.max_tokens` - Max tokens requested
475
+ - `gen_ai.request.frequency_penalty` - Frequency penalty
476
+ - `gen_ai.request.presence_penalty` - Presence penalty
477
+
478
+ **Response Attributes:**
479
+ - `gen_ai.response.id` - Response ID from provider
480
+ - `gen_ai.response.model` - Actual model used (may differ from request)
481
+ - `gen_ai.response.finish_reasons` - Array of finish reasons
482
+
483
+ **Tool/Function Calls:**
484
+ - `llm.tools` - JSON-serialized tool definitions
485
+ - `llm.output_messages.{choice}.message.tool_calls.{index}.tool_call.id` - Tool call ID
486
+ - `llm.output_messages.{choice}.message.tool_calls.{index}.tool_call.function.name` - Function name
487
+ - `llm.output_messages.{choice}.message.tool_calls.{index}.tool_call.function.arguments` - Function arguments (flattening illustrated below)
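+
+ For example, a single tool call on the first choice would flatten into attributes like the following (values are hypothetical):
+
+ ```python
+ # Hypothetical flattened tool-call attributes for choice 0, tool call 0.
+ tool_call_attrs = {
+     "llm.output_messages.0.message.tool_calls.0.tool_call.id": "call_abc123",
+     "llm.output_messages.0.message.tool_calls.0.tool_call.function.name": "get_weather",
+     "llm.output_messages.0.message.tool_calls.0.tool_call.function.arguments": '{"city": "Paris"}',
+ }
+ ```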
488
+
489
+ **Cost Attributes (granular):**
490
+ - `gen_ai.usage.cost.total` - Total cost
491
+ - `gen_ai.usage.cost.prompt` - Prompt tokens cost
492
+ - `gen_ai.usage.cost.completion` - Completion tokens cost
493
+ - `gen_ai.usage.cost.reasoning` - Reasoning tokens cost (o1 models)
494
+ - `gen_ai.usage.cost.cache_read` - Cache read cost (Anthropic)
495
+ - `gen_ai.usage.cost.cache_write` - Cache write cost (Anthropic)
496
+
497
+ **Streaming Attributes:**
498
+ - `gen_ai.server.ttft` - Time to First Token (seconds) for streaming responses
499
+ - `gen_ai.streaming.token_count` - Total number of chunks in streaming response
500
+ - `gen_ai.usage.prompt_tokens` - Actual prompt tokens (extracted from final chunk)
501
+ - `gen_ai.usage.completion_tokens` - Actual completion tokens (extracted from final chunk)
502
+ - `gen_ai.usage.total_tokens` - Total tokens (extracted from final chunk)
503
+ - `gen_ai.usage.cost.total` - Total cost for streaming request
504
+ - `gen_ai.usage.cost.prompt` - Prompt tokens cost for streaming request
505
+ - `gen_ai.usage.cost.completion` - Completion tokens cost for streaming request
506
+ - All granular cost attributes (reasoning, cache_read, cache_write) also available for streaming
507
+
508
+ **Content Events (opt-in):**
509
+ - `gen_ai.prompt.{index}` events with role and content
510
+ - `gen_ai.completion.{index}` events with role and content (illustrated below)
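+
+ As an illustration (assumed event shape), a captured prompt could surface as a span event like this:
+
+ ```python
+ # Illustrative only: emitting a content event by hand. The instrumentation
+ # records these automatically when GENAI_ENABLE_CONTENT_CAPTURE=true.
+ from opentelemetry import trace
+
+ tracer = trace.get_tracer(__name__)
+ with tracer.start_as_current_span("chat gpt-4") as span:
+     span.add_event("gen_ai.prompt.0", {"role": "user", "content": "What is OpenTelemetry?"})
+ ```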
511
+
512
+ **Additional:**
513
+ - Database, vector DB, and API attributes from MCP instrumentation
514
+
515
+ ## Configuration
516
+
517
+ ### Environment Variables
518
+
519
+ ```bash
520
+ # Required
521
+ OTEL_SERVICE_NAME=my-app
522
+ OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318
523
+
524
+ # Optional
525
+ OTEL_EXPORTER_OTLP_HEADERS=x-api-key=secret
526
+ GENAI_ENABLE_GPU_METRICS=true
527
+ GENAI_ENABLE_COST_TRACKING=true
528
+ GENAI_ENABLE_MCP_INSTRUMENTATION=true
529
+ GENAI_GPU_COLLECTION_INTERVAL=5 # GPU metrics collection interval in seconds (default: 5)
530
+ OTEL_SERVICE_INSTANCE_ID=instance-1 # Optional service instance id
531
+ OTEL_ENVIRONMENT=production # Optional environment
532
+ OTEL_EXPORTER_OTLP_TIMEOUT=10.0 # Optional timeout for OTLP exporter
533
+
534
+ # Semantic conventions (NEW)
535
+ OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai # "gen_ai" for new conventions only, "gen_ai/dup" for dual emission
536
+ GENAI_ENABLE_CONTENT_CAPTURE=false # WARNING: May capture sensitive data. Enable with caution.
537
+
538
+ # Logging configuration
539
+ GENAI_OTEL_LOG_LEVEL=INFO # DEBUG, INFO, WARNING, ERROR, CRITICAL. Logs are written to 'logs/genai_otel.log' with rotation (10 files, 10MB each).
540
+
541
+ # Error handling
542
+ GENAI_FAIL_ON_ERROR=false # true to fail fast, false to continue on errors
543
+ ```
544
+
545
+ ### Programmatic Configuration
546
+
547
+ ```python
548
+ import genai_otel
549
+
550
+ genai_otel.instrument(
551
+ service_name="my-app",
552
+ endpoint="http://localhost:4318",
553
+ enable_gpu_metrics=True,
554
+ enable_cost_tracking=True,
555
+ enable_mcp_instrumentation=True
556
+ )
557
+ ```
558
+
559
+ ### Sample Environment File (`sample.env`)
560
+
561
+ A `sample.env` file is provided in the project root. It contains commented-out examples of every supported environment variable, along with their default values or expected formats. Copy it to `.env` and uncomment or modify the variables to configure the instrumentation for your needs.
562
+
563
+ ## Advanced Features
564
+
565
+ ### Session and User Tracking
566
+
567
+ Track user sessions and identify users across multiple LLM requests for better analytics, debugging, and cost attribution.
568
+
569
+ **Configuration:**
570
+
571
+ ```python
572
+ import genai_otel
573
+ from genai_otel import OTelConfig
574
+
575
+ # Define extractor functions
576
+ def extract_session_id(instance, args, kwargs):
577
+ """Extract session ID from request metadata."""
578
+ # Option 1: From kwargs metadata
579
+ metadata = kwargs.get("metadata", {})
580
+ return metadata.get("session_id")
581
+
582
+ # Option 2: From custom headers
583
+ # headers = kwargs.get("headers", {})
584
+ # return headers.get("X-Session-ID")
585
+
586
+ # Option 3: From thread-local storage
587
+ # import threading
588
+ # return getattr(threading.current_thread(), "session_id", None)
589
+
590
+ def extract_user_id(instance, args, kwargs):
591
+ """Extract user ID from request metadata."""
592
+ metadata = kwargs.get("metadata", {})
593
+ return metadata.get("user_id")
594
+
595
+ # Configure with extractors
596
+ config = OTelConfig(
597
+ service_name="my-rag-app",
598
+ endpoint="http://localhost:4318",
599
+ session_id_extractor=extract_session_id,
600
+ user_id_extractor=extract_user_id,
601
+ )
602
+
603
+ genai_otel.instrument(config)
604
+ ```
605
+
606
+ **Usage:**
607
+
608
+ ```python
609
+ from openai import OpenAI
610
+
611
+ client = OpenAI()
612
+
613
+ # Pass session and user info via metadata
614
+ response = client.chat.completions.create(
615
+ model="gpt-3.5-turbo",
616
+ messages=[{"role": "user", "content": "What is OpenTelemetry?"}],
617
+ extra_body={"metadata": {"session_id": "sess_12345", "user_id": "user_alice"}}
618
+ )
619
+ ```
620
+
621
+ **Span Attributes Added:**
622
+ - `session.id` - Unique session identifier for tracking conversations
623
+ - `user.id` - User identifier for per-user analytics and cost tracking
624
+
625
+ **Use Cases:**
626
+ - Track multi-turn conversations across requests
627
+ - Analyze usage patterns per user
628
+ - Debug session-specific issues
629
+ - Calculate per-user costs and quotas
630
+ - Build user-specific dashboards
631
+
632
+ ### RAG and Embedding Attributes
633
+
634
+ Enhanced observability for Retrieval-Augmented Generation (RAG) workflows, including embedding generation and document retrieval.
635
+
636
+ **Helper Methods:**
637
+
638
+ The `BaseInstrumentor` provides helper methods to add RAG-specific attributes to your spans:
639
+
640
+ ```python
641
+ from openai import OpenAI
+ from opentelemetry import trace
642
+ from genai_otel.instrumentors.base import BaseInstrumentor
643
+
644
+ # Get your instrumentor instance (or create spans manually)
645
+ tracer = trace.get_tracer(__name__)
+ client = OpenAI()  # client used by the embedding example below
646
+
647
+ # 1. Embedding Attributes
648
+ with tracer.start_as_current_span("embedding.create") as span:
649
+ # Your embedding logic
650
+ embedding_response = client.embeddings.create(
651
+ model="text-embedding-3-small",
652
+ input="OpenTelemetry provides observability"
653
+ )
654
+
655
+ # Add embedding attributes (if using BaseInstrumentor)
656
+ # instrumentor.add_embedding_attributes(
657
+ # span,
658
+ # model="text-embedding-3-small",
659
+ # input_text="OpenTelemetry provides observability",
660
+ # vector=embedding_response.data[0].embedding
661
+ # )
662
+
663
+ # Or manually set attributes
664
+ span.set_attribute("embedding.model_name", "text-embedding-3-small")
665
+ span.set_attribute("embedding.text", "OpenTelemetry provides observability"[:500])
666
+ span.set_attribute("embedding.vector.dimension", len(embedding_response.data[0].embedding))
667
+
668
+ # 2. Retrieval Attributes
669
+ with tracer.start_as_current_span("retrieval.search") as span:
670
+ # Your retrieval logic
671
+ retrieved_docs = [
672
+ {
673
+ "id": "doc_001",
674
+ "score": 0.95,
675
+ "content": "OpenTelemetry is an observability framework...",
676
+ "metadata": {"source": "docs.opentelemetry.io", "category": "intro"}
677
+ },
678
+ # ... more documents
679
+ ]
680
+
681
+ # Add retrieval attributes (if using BaseInstrumentor)
682
+ # instrumentor.add_retrieval_attributes(
683
+ # span,
684
+ # documents=retrieved_docs,
685
+ # query="What is OpenTelemetry?",
686
+ # max_docs=5
687
+ # )
688
+
689
+ # Or manually set attributes
690
+ span.set_attribute("retrieval.query", "What is OpenTelemetry?"[:500])
691
+ span.set_attribute("retrieval.document_count", len(retrieved_docs))
692
+
693
+ for i, doc in enumerate(retrieved_docs[:5]): # Limit to 5 docs
694
+ prefix = f"retrieval.documents.{i}.document"
695
+ span.set_attribute(f"{prefix}.id", doc["id"])
696
+ span.set_attribute(f"{prefix}.score", doc["score"])
697
+ span.set_attribute(f"{prefix}.content", doc["content"][:500])
698
+
699
+ # Add metadata
700
+ for key, value in doc.get("metadata", {}).items():
701
+ span.set_attribute(f"{prefix}.metadata.{key}", str(value))
702
+ ```
703
+
704
+ **Embedding Attributes:**
705
+ - `embedding.model_name` - Embedding model used
706
+ - `embedding.text` - Input text (truncated to 500 chars)
707
+ - `embedding.vector` - Embedding vector (optional, if configured)
708
+ - `embedding.vector.dimension` - Vector dimensions
709
+
710
+ **Retrieval Attributes:**
711
+ - `retrieval.query` - Search query (truncated to 500 chars)
712
+ - `retrieval.document_count` - Number of documents retrieved
713
+ - `retrieval.documents.{i}.document.id` - Document ID
714
+ - `retrieval.documents.{i}.document.score` - Relevance score
715
+ - `retrieval.documents.{i}.document.content` - Document content (truncated to 500 chars)
716
+ - `retrieval.documents.{i}.document.metadata.*` - Custom metadata fields
717
+
718
+ **Safeguards:**
719
+ - Text content truncated to 500 characters to avoid span size explosion
720
+ - Document count limited to 5 by default (configurable via `max_docs`)
721
+ - Metadata values truncated to prevent excessive attribute counts
722
+
723
+ **Complete RAG Workflow Example:**
724
+
725
+ See `examples/phase4_session_rag_tracking.py` for a comprehensive demonstration of:
726
+ - Session and user tracking across RAG pipeline
727
+ - Embedding attribute capture
728
+ - Retrieval attribute capture
729
+ - End-to-end RAG workflow with full observability
730
+
731
+ **Use Cases:**
732
+ - Monitor retrieval quality and relevance scores
733
+ - Debug RAG pipeline performance
734
+ - Track embedding model usage
735
+ - Analyze document retrieval patterns
736
+ - Optimize vector search configurations
737
+
738
+ ## Example: Full-Stack GenAI App
739
+
740
+ ```python
741
+ import genai_otel
742
+ genai_otel.instrument()
743
+
744
+ import openai
745
+ import pinecone
746
+ import redis
747
+ import psycopg2
748
+
749
+ # All of these are automatically instrumented:
750
+
751
+ # Cache check
752
+ cache = redis.Redis().get('key')
753
+
754
+ # Vector search
755
+ pinecone_index = pinecone.Pinecone().Index("embeddings")  # pinecone v3+ client API
756
+ results = pinecone_index.query(vector=[...], top_k=5)
757
+
758
+ # Database query
759
+ conn = psycopg2.connect("dbname=mydb")
760
+ cursor = conn.cursor()
761
+ cursor.execute("SELECT * FROM context")
762
+
763
+ # LLM call with full context
764
+ client = openai.OpenAI()
765
+ response = client.chat.completions.create(
766
+ model="gpt-4",
767
+ messages=[...]
768
+ )
769
+
770
+ # You get:
771
+ # ✓ Distributed traces across all services
772
+ # ✓ Cost tracking for the LLM call
773
+ # ✓ Performance metrics for DB, cache, vector DB
774
+ # ✓ GPU metrics if using local models
775
+ # ✓ Complete observability with zero manual instrumentation
776
+ ```
777
+
778
+ ## Backend Integration
779
+
780
+ Works with any OpenTelemetry-compatible backend:
781
+ - Jaeger, Zipkin
782
+ - Prometheus, Grafana
783
+ - Datadog, New Relic, Honeycomb
784
+ - AWS X-Ray, Google Cloud Trace
785
+ - Elastic APM, Splunk
786
+ - Self-hosted OTEL Collector
787
+
788
+ ## Project Structure
789
+
790
+ ```bash
791
+ genai-otel-instrument/
792
+ ├── setup.py
793
+ ├── MANIFEST.in
794
+ ├── README.md
795
+ ├── LICENSE
796
+ ├── example_usage.py
797
+ └── genai_otel/
798
+ ├── __init__.py
799
+ ├── config.py
800
+ ├── auto_instrument.py
801
+ ├── cli.py
802
+ ├── cost_calculator.py
803
+ ├── gpu_metrics.py
804
+ ├── instrumentors/
805
+ │ ├── __init__.py
806
+ │ ├── base.py
807
+ │ └── (other instrumentor files)
808
+ └── mcp_instrumentors/
809
+ ├── __init__.py
810
+ ├── manager.py
811
+ └── (other mcp files)
812
+ ```
813
+
814
+ ## Roadmap
815
+
816
+ ### Next Release (v0.2.0) - Q1 2026
817
+
818
+ We're planning significant enhancements for the next major release, focusing on evaluation metrics and safety guardrails alongside completing OpenTelemetry semantic convention compliance.
819
+
820
+ #### 🎯 Evaluation & Monitoring
821
+
822
+ **LLM Output Quality Metrics**
823
+ - **Bias Detection** - Automatically detect and measure bias in LLM responses
824
+ - Gender, racial, political, and cultural bias detection
825
+ - Bias score metrics with configurable thresholds
826
+ - Integration with fairness libraries (e.g., Fairlearn, AIF360)
827
+
828
+ - **Toxicity Detection** - Monitor and alert on toxic or harmful content
829
+ - Perspective API integration for toxicity scoring
830
+ - Custom toxicity models support
831
+ - Real-time toxicity metrics and alerts
832
+ - Configurable severity levels
833
+
834
+ - **Hallucination Detection** - Track factual accuracy and groundedness
835
+ - Fact-checking against provided context
836
+ - Citation validation for RAG applications
837
+ - Confidence scoring for generated claims
838
+ - Hallucination rate metrics by model and use case
839
+
840
+ **Implementation:**
841
+ ```python
842
+ import genai_otel
843
+
844
+ # Enable evaluation metrics
845
+ genai_otel.instrument(
846
+ enable_bias_detection=True,
847
+ enable_toxicity_detection=True,
848
+ enable_hallucination_detection=True,
849
+
850
+ # Configure thresholds
851
+ bias_threshold=0.7,
852
+ toxicity_threshold=0.5,
853
+ hallucination_threshold=0.8
854
+ )
855
+ ```
856
+
857
+ **Metrics Added:**
858
+ - `gen_ai.eval.bias_score` - Bias detection scores (histogram)
859
+ - `gen_ai.eval.toxicity_score` - Toxicity scores (histogram)
860
+ - `gen_ai.eval.hallucination_score` - Hallucination probability (histogram)
861
+ - `gen_ai.eval.violations` - Count of threshold violations by type
862
+
863
+ #### 🛡️ Safety Guardrails
864
+
865
+ **Input/Output Filtering**
866
+ - **Prompt Injection Detection** - Protect against prompt injection attacks
867
+ - Pattern-based detection (jailbreaking attempts)
868
+ - ML-based classifier for sophisticated attacks
869
+ - Real-time blocking with configurable policies
870
+ - Attack attempt metrics and logging
871
+
872
+ - **Restricted Topics** - Block sensitive or inappropriate topics
873
+ - Configurable topic blacklists (legal, medical, financial advice)
874
+ - Industry-specific content filters
875
+ - Topic detection with confidence scoring
876
+ - Custom topic definition support
877
+
878
+ - **Sensitive Information Protection** - Prevent PII leakage
879
+ - PII detection (emails, phone numbers, SSN, credit cards)
880
+ - Automatic redaction or blocking
881
+ - Compliance mode (GDPR, HIPAA, PCI-DSS)
882
+ - Data leak prevention metrics
883
+
884
+ **Implementation:**
885
+ ```python
886
+ import genai_otel
887
+
888
+ # Configure guardrails
889
+ genai_otel.instrument(
890
+ enable_prompt_injection_detection=True,
891
+ enable_restricted_topics=True,
892
+ enable_sensitive_info_detection=True,
893
+
894
+ # Custom configuration
895
+ restricted_topics=["medical_advice", "legal_advice", "financial_advice"],
896
+ pii_detection_mode="block", # or "redact", "warn"
897
+
898
+ # Callbacks for custom handling
899
+ on_guardrail_violation=my_violation_handler
900
+ )
901
+ ```
902
+
903
+ **Metrics Added:**
904
+ - `gen_ai.guardrail.prompt_injection_detected` - Injection attempts blocked
905
+ - `gen_ai.guardrail.restricted_topic_blocked` - Restricted topic violations
906
+ - `gen_ai.guardrail.pii_detected` - PII detection events
907
+ - `gen_ai.guardrail.violations` - Total guardrail violations by type
908
+
909
+ **Span Attributes:**
910
+ - `gen_ai.guardrail.violation_type` - Type of violation detected
911
+ - `gen_ai.guardrail.violation_severity` - Severity level (low, medium, high, critical)
912
+ - `gen_ai.guardrail.blocked` - Whether request was blocked (boolean)
913
+ - `gen_ai.eval.bias_categories` - Detected bias types (array)
914
+ - `gen_ai.eval.toxicity_categories` - Toxicity categories (array)
915
+
916
+ #### 🔄 Migration Support
917
+
918
+ **Backward Compatibility:**
919
+ - All new features are opt-in via configuration
920
+ - Existing instrumentation continues to work unchanged
921
+ - Gradual migration path for new semantic conventions
922
+
923
+ **Version Support:**
924
+ - Python 3.9+ (evaluation features require 3.10+)
925
+ - OpenTelemetry SDK 1.20.0+
926
+ - Backward compatible with existing dashboards
927
+
928
+ ### Future Releases
929
+
930
+ **v0.3.0 - Advanced Analytics**
931
+ - Custom metric aggregations
932
+ - Cost optimization recommendations
933
+ - Automated performance regression detection
934
+ - A/B testing support for prompts
935
+
936
+ **v0.4.0 - Enterprise Features**
937
+ - Multi-tenancy support
938
+ - Role-based access control for telemetry
939
+ - Advanced compliance reporting
940
+ - SLA monitoring and alerting
941
+
942
+ **Community Feedback**
943
+
944
+ We welcome feedback on our roadmap! Please:
945
+ - Open issues for feature requests
946
+ - Join discussions on prioritization
947
+ - Share your use cases and requirements
948
+
949
+ See [Contributing.md](Contributing.md) for how to get involved.
950
+
951
+ ## License
952
+
953
+ TraceVerde is licensed under the GNU Affero General Public License v3.0 or later (AGPL-3.0-or-later).
954
+
955
+ Copyright (C) 2025 Kshitij Thakkar
956
+
957
+ This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
958
+
959
+ See the [LICENSE](LICENSE) file for the full license text.