PyPI - genai-otel-instrument - Versions diffs - 0.1.1.dev0__tar.gz → 0.1.4.dev0__tar.gz - Mend

genai-otel-instrument 0.1.1.dev0__tar.gz → 0.1.4.dev0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.

This version of genai-otel-instrument might be problematic.

Files changed (210)

genai_otel_instrument-0.1.4.dev0/.claude/settings.local.json ADDED

@@ -0,0 +1,9 @@
+{
+  "permissions": {
+    "allow": [
+    ],
+    "deny": [],
+    "ask": []
+  }
+}

{genai_otel_instrument-0.1.1.dev0 → genai_otel_instrument-0.1.4.dev0}/CHANGELOG.md RENAMED

@@ -6,6 +6,217 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
+### Added
+- **HuggingFace InferenceClient Instrumentation**
+  - Added full instrumentation support for HuggingFace Inference API via `InferenceClient`
+  - Enables observability for smolagents workflows using `InferenceClientModel`
+  - Wraps `InferenceClient.chat_completion()` and `InferenceClient.text_generation()` methods
+  - Creates child spans showing actual HuggingFace API calls under agent/tool spans
+  - Extracts model name, temperature, max_tokens, top_p from API calls
+  - Supports both object and dict response formats for token usage
+  - Handles streaming responses with `gen_ai.server.ttft` and `gen_ai.streaming.token_count`
+  - Cost tracking enabled via fallback estimation based on model parameter count
+  - Implementation in `genai_otel/instrumentors/huggingface_instrumentor.py:141-222`
+  - Added 10 comprehensive tests covering all InferenceClient functionality
+  - Coverage increased from 85% → 98% for HuggingFace instrumentor
+  - Resolves issue where only AGENT and TOOL spans were visible without LLM child spans
+- **Fallback Cost Estimation for Local Models (Ollama & HuggingFace)**
+  - Added 36 Ollama models to `llm_pricing.json` with parameter-count-based pricing tiers
+  - Implemented intelligent fallback cost estimation for unknown local models in `CostCalculator`
+  - Automatically parses parameter count from model names (e.g., "360m", "7b", "70b")
+  - Supports both Ollama and HuggingFace model naming patterns:
+    - Explicit sizes: `llama3:7b`, `mistral-7b-v0.1`, `smollm2:360m`
+    - HuggingFace size indicators: `gpt2`, `gpt2-xl`, `bert-base`, `t5-xxl`, etc.
+  - Applies tiered pricing based on parameter count:
+    - Tiny (< 1B): $0.0001 / $0.0002 per 1k tokens
+    - Small (1-10B): $0.0003 / $0.0006
+    - Medium (10-20B): $0.0005 / $0.001
+    - Large (20-80B): $0.0008 / $0.0008
+    - XLarge (80B+): $0.0012 / $0.0012
+  - Acknowledges that local models are free but consume GPU power and electricity
+  - Provides synthetic cost estimates for carbon footprint and resource tracking
+  - Added `scripts/add_ollama_pricing.py` to update pricing database with new Ollama models
+  - Logs fallback pricing usage at INFO level for transparency
+### Improved
+- **CostEnrichmentSpanProcessor Performance Optimization**
+  - Added early-exit logic to skip spans that already have cost attributes
+  - Checks for `gen_ai.usage.cost.total` presence before attempting enrichment
+  - Saves processing compute by avoiding redundant cost calculations
+  - Eliminates warning messages for spans enriched by instrumentors
+  - Benefits all instrumentors that set cost attributes directly (Mistral, OpenAI, Anthropic, etc.)
+  - Implementation in `genai_otel/cost_enrichment_processor.py:69-74`
+  - Added comprehensive test coverage for skip logic
+  - Coverage increased from 94% → 98% for CostEnrichmentSpanProcessor
+### Fixed
+- **CRITICAL: Complete Rewrite of Mistral AI Instrumentor**
+  - **Root problem**: Original instrumentor used instance-level wrapping which didn't work reliably
+  - **Complete architectural rewrite** using class-level method wrapping with `wrapt.wrap_function_wrapper()`
+  - Now properly wraps `Chat.complete`, `Chat.stream`, and `Embeddings.create` at the class level
+  - All Mistral client instances now use instrumented methods automatically
+  - **Streaming support** with custom `_StreamWrapper` class:
+    - Iterates through streaming chunks and collects usage data
+    - Records TTFT (Time To First Token) metric
+    - Creates mock response objects for proper metrics recording
+  - **Proper error handling** with span exception recording
+  - **Cost tracking** now works correctly with BaseInstrumentor integration
+  - Fixed incorrect `_record_result_metrics()` signature usage
+  - Implementation in `genai_otel/instrumentors/mistralai_instrumentor.py` (180 lines, completely rewritten)
+  - All 5 Mistral tests passing with proper mocking
+  - Traces now collected with full details: model, tokens, costs, TTFT
+  - Resolves issue where no Mistral spans were being collected
+- **CRITICAL: Fixed Missing Granular Cost Counter Class Variables**
+  - Fixed `AttributeError: 'OllamaInstrumentor' object has no attribute '_shared_prompt_cost_counter'`
+  - **Root cause**: Granular cost counters were created in initialization but not declared as class variables
+  - **Impact**: Test suite failed with 34 errors when running full suite (but passed individually)
+  - Added missing class variable declarations in `BaseInstrumentor`:
+    - `_shared_prompt_cost_counter`
+    - `_shared_completion_cost_counter`
+    - `_shared_reasoning_cost_counter`
+    - `_shared_cache_read_cost_counter`
+    - `_shared_cache_write_cost_counter`
+  - Created instance variable references in `__init__` for all granular counters
+  - Updated all references to use instance variables instead of `_shared_*` variables
+  - Implementation in `genai_otel/instrumentors/base.py:85-90, 106-111`
+  - All 424 tests now passing consistently
+  - Affects all instrumentors using granular cost tracking
+- **CRITICAL: Fixed Cost Tracking Disabled by Wrong Variable Check**
+  - **Root cause**: Cost tracking checked `self._shared_cost_counter` which was always None
+  - Should have checked `self.config.enable_cost_tracking` flag only
+  - **Impact**: Cost attributes were never added to spans even when cost tracking was enabled
+  - Removed unnecessary `cost_counter` existence check
+  - Cost tracking now properly controlled by `GENAI_ENABLE_COST_TRACKING` environment variable
+  - Implementation in `genai_otel/instrumentors/base.py:384`
+  - Debug logging confirmed cost calculation working: "Calculating cost for model=smollm2:360m"
+  - Affects all instrumentors (Ollama, Mistral, OpenAI, Anthropic, etc.)
+- **CRITICAL: Fixed Token and Cost Attributes Not Being Set on Spans**
+  - Fixed critical bug where `gen_ai.usage.prompt_tokens`, `gen_ai.usage.completion_tokens`, and all cost attributes were not being set on spans
+  - **Root causes:**
+    1. Span attributes were only set if metric counters were available, but this check was too restrictive
+    2. Used wrong variable name (`self._shared_cost_counter` instead of `self.cost_counter`) in cost tracking check
+  - **Impact**: Cost calculation completely failed - only `gen_ai.usage.total_tokens` was set
+  - **Fixed by:**
+    1. Always setting span attributes regardless of metric availability
+    2. Using correct instance variables (`self.cost_counter`, `self.token_counter`)
+    3. Metrics recording is now optional, but span attributes are always set
+    4. Cost attributes (`gen_ai.usage.cost.total`, `gen_ai.usage.cost.prompt`, `gen_ai.usage.cost.completion`) are now always added
+  - This ensures cost tracking works even if metrics initialization fails
+  - Affects all instrumentors (OpenAI, Anthropic, Ollama, etc.)
+- **CRITICAL: Fixed 6 Instrumentors Missing `self._instrumented = True`**
+  - Ollama, Cohere, HuggingFace, Replicate, TogetherAI, and VertexAI instrumentors were completely broken
+  - No traces were being collected because `self._instrumented` flag was not set after wrapping functions
+  - The `create_span_wrapper()` checks this flag and skips instrumentation if False
+  - Added `self._instrumented = True` after successful wrapping in all 6 instrumentors
+  - All instrumentors now properly collect traces again
+- **CRITICAL: CostEnrichmentSpanProcessor Now Working**
+  - Fixed critical bug where `CostEnrichmentSpanProcessor` was calling `calculate_cost()` (returns float) but treating it as a dict
+  - This caused all cost enrichment to silently fail with `TypeError: 'float' object is not subscriptable`
+  - Now correctly calls `calculate_granular_cost()` which returns a proper dict with `total`, `prompt`, `completion` keys
+  - Cost attributes (`gen_ai.usage.cost.total`, `gen_ai.usage.cost.prompt`, `gen_ai.usage.cost.completion`) will now be added to OpenInference spans (smolagents, litellm, mcp)
+  - Improved error logging from `logger.debug` to `logger.warning` with full exception info for easier debugging
+  - Added logging of successful cost enrichment at `INFO` level with span name, model, and token details
+  - All 415 tests passing, including 20 cost enrichment processor tests
+- **Fixed OpenInference Instrumentor Loading Order**
+  - Corrected instrumentor initialization order to: smolagents → litellm → mcp
+  - This matches the correct order found in working implementations
+  - Ensures proper nested instrumentation and attribute capture
+## [0.1.3] - 2025-01-23
+### Added
+- **Cost Enrichment for OpenInference Instrumentors**
+  - **CostEnrichmentSpanProcessor**: New custom SpanProcessor that automatically adds cost tracking to spans created by OpenInference instrumentors (smolagents, litellm, mcp)
+    - Extracts model name and token usage from existing span attributes
+    - Calculates costs using the existing CostCalculator with 145+ model pricing data
+    - Adds granular cost attributes: `gen_ai.usage.cost.total`, `gen_ai.usage.cost.prompt`, `gen_ai.usage.cost.completion`
+    - **Dual Semantic Convention Support**: Works with both OpenTelemetry GenAI and OpenInference conventions
+      - GenAI: `gen_ai.request.model`, `gen_ai.usage.{prompt_tokens,completion_tokens,input_tokens,output_tokens}`
+      - OpenInference: `llm.model_name`, `embedding.model_name`, `llm.token_count.{prompt,completion}`
+      - OpenInference span kinds: LLM, EMBEDDING, CHAIN, RETRIEVER, RERANKER, TOOL, AGENT
+    - Maps operation names to call types (chat, embedding, image, audio) automatically
+    - Gracefully handles missing data and errors without failing span processing
+  - Enabled by default when `GENAI_ENABLE_COST_TRACKING=true`
+  - Works alongside OpenInference's native instrumentation without modifying upstream code
+  - 100% test coverage with 20 comprehensive test cases (includes 5 OpenInference-specific tests)
+- **Comprehensive Cost Tracking Enhancements**
+  - Added token usage extraction and cost calculation for **6 instrumentors**: Ollama, Cohere, Together AI, Vertex AI, HuggingFace, and Replicate
+  - Implemented `create_span_wrapper()` pattern across all instrumentors for consistent metrics recording
+  - Added `gen_ai.operation.name` attribute to all instrumentors for improved observability
+  - Total instrumentors with cost tracking increased from 8 to **11** (37.5% increase)
+- **Pricing Data Expansion**
+  - Added pricing for **45+ new LLM models** from 3 major providers:
+    - **Groq**: 9 models (Llama 3.1/3.3/4, Qwen, GPT-OSS, Kimi-K2)
+    - **Cohere**: 5 models (Command R/R+/R7B, Command A, updated legacy pricing)
+    - **Together AI**: 30+ models (DeepSeek R1/V3, Qwen 2.5/3, Mistral variants, GLM-4.5)
+  - All pricing verified from official provider documentation (2025 rates)
+- **Enhanced Instrumentor Implementations**
+  - **Ollama**: Extracts `prompt_eval_count` and `eval_count` from response (local model usage tracking)
+  - **Cohere**: Extracts from `meta.tokens` with `meta.billed_units` fallback
+  - **Together AI**: OpenAI-compatible format with dual API support (client + legacy Complete API)
+  - **Vertex AI**: Extracts `usage_metadata` with both snake_case and camelCase support
+  - **HuggingFace**: Documented as local/free execution (no API costs)
+  - **Replicate**: Documented as hardware-based pricing ($/second, not token-based)
+### Improved
+- **Standardization & Code Quality**
+  - Standardized all instrumentors to use `BaseInstrumentor.create_span_wrapper()` pattern
+  - Improved error handling with consistent `fail_on_error` support across all instrumentors
+  - Enhanced documentation with comprehensive docstrings explaining pricing models
+  - Added proper logging at all error points for better debugging
+  - Thread-safe metrics initialization across all instrumentors
+- **Test Coverage**
+  - All **415 tests passing** (100% test success rate)
+  - Increased overall code coverage to **89%**
+  - Individual instrumentor coverage: HuggingFace (98%), OpenAI (98%), Anthropic (95%), Groq (94%)
+  - Core modules at 100% coverage: config, metrics, logging, exceptions, __init__, cost_enrichment_processor
+  - Updated 40+ tests to match new `create_span_wrapper()` pattern
+  - Added 20 comprehensive tests for CostEnrichmentSpanProcessor (100% coverage)
+    - 15 tests for GenAI semantic conventions
+    - 5 tests for OpenInference semantic conventions
+- **Documentation**
+  - Updated all instrumentor docstrings to explain token extraction logic
+  - Added comments documenting non-standard pricing models (hardware-based, local execution)
+  - Improved code comments for complex fallback logic
+## [0.1.2.dev0] - 2025-01-22
+### Added
+- **GPU Power Consumption Metric**
+  - Added `gen_ai.gpu.power` observable gauge metric to track real-time GPU power consumption
+  - Metric reports power usage in Watts with `gpu_id` and `gpu_name` attributes
+  - Automatically collected alongside existing GPU metrics (utilization, memory, temperature)
+  - Implementation in `genai_otel/gpu_metrics.py:97-102, 195-220`
+  - Added test coverage in `tests/test_gpu_metrics.py:244-266`
+  - Completes the GPU metrics suite with 5 total metrics: utilization, memory, temperature, power, and CO2 emissions
+### Fixed
+- **Test Fixes for HuggingFace and MistralAI Instrumentors**
+  - Fixed HuggingFace instrumentor tests (2 failures) - corrected tracer mocking to use `instrumentor.tracer.start_span()` instead of `config.tracer.start_as_current_span()`
+  - Fixed HuggingFace instrumentor tests - added `instrumentor.request_counter` mock for proper metrics assertion
+  - Fixed MistralAI instrumentor test - corrected wrapt module mocking by adding to `sys.modules` instead of invalid module-level patch
+  - All 395 tests now passing with zero failures
+  - Tests modified: `tests/instrumentors/test_huggingface_instrumentor.py`, `tests/instrumentors/test_mistralai_instrumentor.py`
 ## [0.1.0] - 2025-01-20
 **First Beta Release** 🎉
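
Note on the hunk above (fallback cost estimation for local models): the changelog describes parsing a parameter count out of the model name and mapping it to a pricing tier. The following is a minimal sketch of that idea, not the package's `CostCalculator` code. The names `parse_param_count_b`, `fallback_price_per_1k`, and `estimate_cost` are hypothetical, and the regex only covers explicit sizes such as `7b` or `360m`; HuggingFace size indicators like `gpt2-xl` or `bert-base` are left out. Tier prices are copied from the changelog. How the package treats exact boundary values (for example, exactly 10B) is not stated, so the strict `<` comparisons are an assumption.

```python
import re
from typing import Optional, Tuple

# (upper bound in billions of parameters, prompt $/1k tokens, completion $/1k tokens)
# Values copied from the changelog's tier list.
_TIERS = [
    (1.0, 0.0001, 0.0002),   # Tiny (< 1B)
    (10.0, 0.0003, 0.0006),  # Small (1-10B)
    (20.0, 0.0005, 0.001),   # Medium (10-20B)
    (80.0, 0.0008, 0.0008),  # Large (20-80B)
]
_XLARGE = (0.0012, 0.0012)   # XLarge (80B+)

# Matches an explicit size such as "7b", "1.5b" or "360m" that is not
# glued to a longer alphanumeric token (so the "2" in "gpt2" is ignored).
_SIZE_RE = re.compile(r"(?<![a-z0-9.])(\d+(?:\.\d+)?)([mb])(?![a-z0-9])")


def parse_param_count_b(model_name: str) -> Optional[float]:
    """Return the parameter count in billions, or None if no size is found."""
    match = _SIZE_RE.search(model_name.lower())
    if match is None:
        return None
    value, unit = float(match.group(1)), match.group(2)
    return value / 1000.0 if unit == "m" else value


def fallback_price_per_1k(model_name: str) -> Optional[Tuple[float, float]]:
    """Return (prompt, completion) prices per 1k tokens, or None if unparseable."""
    size_b = parse_param_count_b(model_name)
    if size_b is None:
        return None
    for upper_bound, prompt_price, completion_price in _TIERS:
        if size_b < upper_bound:
            return prompt_price, completion_price
    return _XLARGE


def estimate_cost(model_name: str, prompt_tokens: int, completion_tokens: int) -> Optional[float]:
    """Return a synthetic cost estimate in USD, or None if no tier applies."""
    prices = fallback_price_per_1k(model_name)
    if prices is None:
        return None
    prompt_price, completion_price = prices
    return (prompt_tokens / 1000.0) * prompt_price + (completion_tokens / 1000.0) * completion_price
```

With these assumptions, `fallback_price_per_1k("smollm2:360m")` lands in the Tiny tier and `fallback_price_per_1k("gpt2")` returns `None`, since the name carries no explicit size.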
@@ -251,5 +462,6 @@ This is the first public release of genai-otel-instrument, a comprehensive OpenT
 - Fixed tests for base/redis and auto instrument (a701603)
 - Updated `test_auto_instrument.py` assertions to match new OTLP exporter configuration (exporters now read endpoint from environment variables instead of direct parameters)
-[Unreleased]: https://github.com/Mandark-droid/genai_otel_instrument/compare/v0.1.0...HEAD
+[Unreleased]: https://github.com/Mandark-droid/genai_otel_instrument/compare/v0.1.2.dev0...HEAD
+[0.1.2.dev0]: https://github.com/Mandark-droid/genai_otel_instrument/compare/v0.1.0...v0.1.2.dev0
 [0.1.0]: https://github.com/Mandark-droid/genai_otel_instrument/releases/tag/v0.1.0
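
Note on the Mistral instrumentor rewrite described in the changelog above: the entry contrasts instance-level wrapping with class-level wrapping and names `wrapt.wrap_function_wrapper()` as the mechanism. The sketch below illustrates why class-level patching reaches every instance. It deliberately swaps `wrapt` for plain `setattr` plus `functools.wraps` so that it runs with the standard library alone. `Chat` is a stand-in class, not the Mistral SDK's class, and `instrument_class_method` and `calls` are hypothetical names.

```python
import functools
from typing import Any, Callable, List


class Chat:
    """Stand-in for an SDK class; not the real Mistral client."""

    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


# Hypothetical sink standing in for span creation.
calls: List[str] = []


def instrument_class_method(cls: type, method_name: str) -> None:
    """Replace cls.<method_name> with a wrapper that records each call."""
    original: Callable[..., Any] = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self: Any, *args: Any, **kwargs: Any) -> Any:
        calls.append(f"{cls.__name__}.{method_name}")
        return original(self, *args, **kwargs)

    setattr(cls, method_name, wrapper)


# One instance is created before instrumentation and one after.
early = Chat()
instrument_class_method(Chat, "complete")
late = Chat()

# Both calls go through the wrapper, because Python looks the method up
# on the class at call time rather than copying it onto each instance.
early_result = early.complete("a")
late_result = late.complete("b")
```

Instance-level wrapping would only have affected the one object that was patched, which is consistent with the changelog's statement that all client instances now use the instrumented methods automatically.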

{genai_otel_instrument-0.1.1.dev0 → genai_otel_instrument-0.1.4.dev0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: genai-otel-instrument
-Version: 0.1.1.dev0
+Version: 0.1.4.dev0
 Summary: Comprehensive OpenTelemetry auto-instrumentation for LLM/GenAI applications
 Author-email: Kshitij Thakkar <kshitijthakkar@rocketmail.com>
 License: Apache-2.0
@@ -180,6 +180,26 @@ Dynamic: license-file
 # GenAI OpenTelemetry Auto-Instrumentation
+[![PyPI version](https://badge.fury.io/py/genai-otel-instrument.svg)](https://badge.fury.io/py/genai-otel-instrument)
+[![Python Versions](https://img.shields.io/pypi/pyversions/genai-otel-instrument.svg)](https://pypi.org/project/genai-otel-instrument/)
+[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
+[![Downloads](https://static.pepy.tech/badge/genai-otel-instrument)](https://pepy.tech/project/genai-otel-instrument)
+[![Downloads/Month](https://static.pepy.tech/badge/genai-otel-instrument/month)](https://pepy.tech/project/genai-otel-instrument)
+[![GitHub Stars](https://img.shields.io/github/stars/Mandark-droid/genai_otel_instrument?style=social)](https://github.com/Mandark-droid/genai_otel_instrument)
+[![GitHub Forks](https://img.shields.io/github/forks/Mandark-droid/genai_otel_instrument?style=social)](https://github.com/Mandark-droid/genai_otel_instrument)
+[![GitHub Issues](https://img.shields.io/github/issues/Mandark-droid/genai_otel_instrument)](https://github.com/Mandark-droid/genai_otel_instrument/issues)
+[![GitHub Pull Requests](https://img.shields.io/github/issues-pr/Mandark-droid/genai_otel_instrument)](https://github.com/Mandark-droid/genai_otel_instrument/pulls)
+[![Code Coverage](https://img.shields.io/badge/coverage-90%25-brightgreen.svg)](https://github.com/Mandark-droid/genai_otel_instrument)
+[![Code Style: Black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)
+[![Imports: isort](https://img.shields.io/badge/%20imports-isort-%231674b1?style=flat&labelColor=ef8336)](https://pycqa.github.io/isort/)
+[![Type Checked: mypy](https://img.shields.io/badge/type%20checked-mypy-blue.svg)](http://mypy-lang.org/)
+[![OpenTelemetry](https://img.shields.io/badge/OpenTelemetry-1.20%2B-blueviolet)](https://opentelemetry.io/)
+[![Semantic Conventions](https://img.shields.io/badge/OTel%20Semconv-GenAI%20v1.28-orange)](https://opentelemetry.io/docs/specs/semconv/gen-ai/)
+[![CI/CD](https://img.shields.io/badge/CI%2FCD-GitHub%20Actions-2088FF?logo=github-actions&logoColor=white)](https://github.com/Mandark-droid/genai_otel_instrument/actions)
 Production-ready OpenTelemetry instrumentation for GenAI/LLM applications with zero-code setup.
 ## Features
@@ -188,7 +208,7 @@ Production-ready OpenTelemetry instrumentation for GenAI/LLM applications with z
 🤖 **15+ LLM Providers** - OpenAI, Anthropic, Google, AWS, Azure, and more
 🔧 **MCP Tool Support** - Auto-instrument databases, APIs, caches, vector DBs
 💰 **Cost Tracking** - Automatic cost calculation per request
-🎮 **GPU Metrics** - Real-time GPU utilization, memory, temperature
+🎮 **GPU Metrics** - Real-time GPU utilization, memory, temperature, power
 📊 **Complete Observability** - Traces, metrics, and rich span attributes
 ➕ **Service Instance ID & Environment** - Identify your services and environments
 ⏱️ **Configurable Exporter Timeout** - Set timeout for OTLP exporter
@@ -235,9 +255,9 @@ For a more comprehensive demonstration of various LLM providers and MCP tools, r
 ## What Gets Instrumented?
 ### LLM Providers (Auto-detected)
-- OpenAI, Anthropic, Google AI, AWS Bedrock, Azure OpenAI
-- Cohere, Mistral AI, Together AI, Groq, Ollama
-- Vertex AI, Replicate, Anyscale, HuggingFace
+- **With Full Cost Tracking**: OpenAI, Anthropic, Google AI, AWS Bedrock, Azure OpenAI, Cohere, Mistral AI, Together AI, Groq, Ollama, Vertex AI
+- **Hardware/Local Pricing**: Replicate (hardware-based $/second), HuggingFace (local execution, free)
+- **Other Providers**: Anyscale
 ### Frameworks
 - LangChain (chains, agents, tools)
@@ -251,15 +271,52 @@ For a more comprehensive demonstration of various LLM providers and MCP tools, r
 - **APIs**: HTTP/REST requests (requests, httpx)
 ### OpenInference (Optional - Python 3.10+ only)
-- Smolagents
-- MCP
-- LiteLLM
+- Smolagents - HuggingFace smolagents framework tracing
+- MCP - Model Context Protocol instrumentation
+- LiteLLM - Multi-provider LLM proxy
+**Cost Enrichment:** OpenInference instrumentors are automatically enriched with cost tracking! When cost tracking is enabled (`GENAI_ENABLE_COST_TRACKING=true`), a custom `CostEnrichmentSpanProcessor` extracts model and token usage from OpenInference spans and adds cost attributes (`gen_ai.usage.cost.total`, `gen_ai.usage.cost.prompt`, `gen_ai.usage.cost.completion`) using our comprehensive pricing database of 145+ models.
+The processor supports OpenInference semantic conventions:
+- Model: `llm.model_name`, `embedding.model_name`
+- Tokens: `llm.token_count.prompt`, `llm.token_count.completion`
+- Operations: `openinference.span.kind` (LLM, EMBEDDING, CHAIN, RETRIEVER, etc.)
 **Note:** OpenInference instrumentors require Python >= 3.10. Install with:
 ```bash
 pip install genai-otel-instrument[openinference]
 ```
+## Cost Tracking Coverage
+The library includes comprehensive cost tracking with pricing data for **145+ models** across **11 providers**:
+### Providers with Full Token-Based Cost Tracking
+- **OpenAI**: GPT-4o, GPT-4 Turbo, GPT-3.5 Turbo, o1/o3 series, embeddings, audio, vision (35+ models)
+- **Anthropic**: Claude 3.5 Sonnet/Opus/Haiku, Claude 3 series (10+ models)
+- **Google AI**: Gemini 1.5/2.0 Pro/Flash, PaLM 2 (12+ models)
+- **AWS Bedrock**: Amazon Titan, Claude, Llama, Mistral models (20+ models)
+- **Azure OpenAI**: Same as OpenAI with Azure-specific pricing
+- **Cohere**: Command R/R+, Command Light, Embed v3/v2 (8+ models)
+- **Mistral AI**: Mistral Large/Medium/Small, Mixtral, embeddings (8+ models)
+- **Together AI**: DeepSeek-R1, Llama 3.x, Qwen, Mixtral (25+ models)
+- **Groq**: Llama 3.x series, Mixtral, Gemma models (15+ models)
+- **Ollama**: Local models with token tracking (pricing via cost estimation)
+- **Vertex AI**: Gemini models via Google Cloud with usage metadata extraction
+### Special Pricing Models
+- **Replicate**: Hardware-based pricing ($/second of GPU/CPU time) - not token-based
+- **HuggingFace Transformers**: Local execution - no API costs
+### Pricing Features
+- **Differential Pricing**: Separate rates for prompt tokens vs. completion tokens
+- **Reasoning Tokens**: Special pricing for OpenAI o1/o3 reasoning tokens
+- **Cache Pricing**: Anthropic prompt caching costs (read/write)
+- **Granular Cost Metrics**: Per-request cost breakdown by token type
+- **Auto-Updated Pricing**: Pricing data maintained in `llm_pricing.json`
+**Coverage Statistics**: As of v0.1.3, 89% test coverage with 415 passing tests, including comprehensive cost calculation validation and cost enrichment processor tests (supporting both GenAI and OpenInference semantic conventions).
 ## Collected Telemetry
 ### Traces
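
Note on the Cost Enrichment paragraph in the hunk above: the README says the processor reads model and token attributes from OpenInference spans and writes `gen_ai.usage.cost.*` attributes, and the changelog adds that GenAI attribute names are also supported and that spans already carrying `gen_ai.usage.cost.total` are skipped. The following is a minimal sketch of that flow under simplifying assumptions: span attributes are modelled as a plain dict rather than an OpenTelemetry span, prices are passed in directly instead of being looked up in `llm_pricing.json`, and `extract_usage` and `enrich` are hypothetical names.

```python
from typing import Any, Dict, Mapping, Optional, Sequence, Tuple

# Attribute names as listed in the README and changelog.
MODEL_KEYS = ("gen_ai.request.model", "llm.model_name", "embedding.model_name")
PROMPT_KEYS = (
    "gen_ai.usage.prompt_tokens",
    "gen_ai.usage.input_tokens",
    "llm.token_count.prompt",
)
COMPLETION_KEYS = (
    "gen_ai.usage.completion_tokens",
    "gen_ai.usage.output_tokens",
    "llm.token_count.completion",
)


def _first_present(attrs: Mapping[str, Any], keys: Sequence[str]) -> Any:
    """Return the first non-None value found under any of the given keys."""
    for key in keys:
        value = attrs.get(key)
        if value is not None:
            return value
    return None


def extract_usage(attrs: Mapping[str, Any]) -> Optional[Tuple[str, int, int]]:
    """Return (model, prompt_tokens, completion_tokens), or None if data is missing."""
    model = _first_present(attrs, MODEL_KEYS)
    prompt = _first_present(attrs, PROMPT_KEYS)
    completion = _first_present(attrs, COMPLETION_KEYS)
    if model is None or prompt is None:
        return None
    return str(model), int(prompt), int(completion or 0)


def enrich(
    attrs: Mapping[str, Any],
    prompt_price_per_1k: float,
    completion_price_per_1k: float,
) -> Dict[str, Any]:
    """Return a copy of attrs with cost attributes added when possible."""
    enriched = dict(attrs)
    if "gen_ai.usage.cost.total" in enriched:
        return enriched  # Already costed by an instrumentor: skip.
    usage = extract_usage(enriched)
    if usage is None:
        return enriched  # Missing data: leave the span untouched.
    _, prompt_tokens, completion_tokens = usage
    prompt_cost = prompt_tokens / 1000.0 * prompt_price_per_1k
    completion_cost = completion_tokens / 1000.0 * completion_price_per_1k
    enriched["gen_ai.usage.cost.prompt"] = prompt_cost
    enriched["gen_ai.usage.cost.completion"] = completion_cost
    enriched["gen_ai.usage.cost.total"] = prompt_cost + completion_cost
    return enriched
```

Returning the input unchanged on missing data mirrors the changelog's description of handling missing data without failing span processing.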
@@ -268,7 +325,7 @@ Every LLM call, database query, API request, and vector search is traced with fu
 ### Metrics
 **GenAI Metrics:**
-- `gen_ai.requests` - Request counts by provider/model
+- `gen_ai.requests` - Request counts by provider and model
 - `gen_ai.client.token.usage` - Token usage (prompt/completion)
 - `gen_ai.client.operation.duration` - Request latency histogram (optimized buckets for LLM workloads)
 - `gen_ai.usage.cost` - Total estimated costs in USD
@@ -278,7 +335,7 @@ Every LLM call, database query, API request, and vector search is traced with fu
 - `gen_ai.usage.cost.cache_read` - Cache read cost (Anthropic)
 - `gen_ai.usage.cost.cache_write` - Cache write cost (Anthropic)
 - `gen_ai.client.errors` - Error counts by operation and type
-- `gen_ai.gpu.*` - GPU utilization, memory, temperature (ObservableGauges)
+- `gen_ai.gpu.*` - GPU utilization, memory, temperature, power (ObservableGauges)
 - `gen_ai.co2.emissions` - CO2 emissions tracking (opt-in)
 - `gen_ai.server.ttft` - Time to First Token for streaming responses (histogram, 1ms-10s buckets)
 - `gen_ai.server.tbt` - Time Between Tokens for streaming responses (histogram, 10ms-2.5s buckets)
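
Note on `gen_ai.server.ttft` and `gen_ai.server.tbt` in the metrics list above: both are derived from timestamps taken while iterating a streaming response. The sketch below shows one way such timings can be collected, assuming the stream is any Python iterable and that timing starts when the wrapper is constructed. `TimedStream` is a hypothetical name and is not the package's `_StreamWrapper`; real instrumentation would record the values on OpenTelemetry histograms instead of storing them on the object. The clock is injectable so the behaviour can be checked deterministically.

```python
import time
from typing import Any, Callable, Iterable, Iterator, List, Optional


class TimedStream:
    """Wrap an iterable of chunks and record TTFT and inter-chunk gaps."""

    def __init__(
        self,
        stream: Iterable[Any],
        clock: Callable[[], float] = time.perf_counter,
    ) -> None:
        self._stream = stream
        self._clock = clock
        self._start = clock()             # Assumed to be "request sent" time.
        self.ttft: Optional[float] = None  # Seconds until the first chunk.
        self.tbt: List[float] = []         # Seconds between consecutive chunks.
        self.chunk_count = 0

    def __iter__(self) -> Iterator[Any]:
        previous = self._start
        for chunk in self._stream:
            now = self._clock()
            if self.ttft is None:
                self.ttft = now - self._start
            else:
                self.tbt.append(now - previous)
            previous = now
            self.chunk_count += 1
            yield chunk


# Usage with a fake clock that returns fixed timestamps.
_ticks = iter([0.0, 0.5, 0.75, 1.0])
timed = TimedStream(["a", "b", "c"], clock=lambda: next(_ticks))
chunks = list(timed)
```

The wrapper passes every chunk through unchanged, so the caller's iteration code does not need to know that timing is being collected.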
@@ -459,5 +516,168 @@ genai-otel-instrument/
         └── (other mcp files)
 ```
+## Roadmap
+### Next Release (v0.2.0) - Q1 2026
+We're planning significant enhancements for the next major release, focusing on evaluation metrics and safety guardrails alongside completing OpenTelemetry semantic convention compliance.
+#### 🎯 Evaluation & Monitoring
+**LLM Output Quality Metrics**
+- **Bias Detection** - Automatically detect and measure bias in LLM responses
+  - Gender, racial, political, and cultural bias detection
+  - Bias score metrics with configurable thresholds
+  - Integration with fairness libraries (e.g., Fairlearn, AIF360)
+- **Toxicity Detection** - Monitor and alert on toxic or harmful content
+  - Perspective API integration for toxicity scoring
+  - Custom toxicity models support
+  - Real-time toxicity metrics and alerts
+  - Configurable severity levels
+- **Hallucination Detection** - Track factual accuracy and groundedness
+  - Fact-checking against provided context
+  - Citation validation for RAG applications
+  - Confidence scoring for generated claims
+  - Hallucination rate metrics by model and use case
+**Implementation:**
+```python
+import genai_otel
+# Enable evaluation metrics
+genai_otel.instrument(
+    enable_bias_detection=True,
+    enable_toxicity_detection=True,
+    enable_hallucination_detection=True,
+    # Configure thresholds
+    bias_threshold=0.7,
+    toxicity_threshold=0.5,
+    hallucination_threshold=0.8
+)
+```
+**Metrics Added:**
+- `gen_ai.eval.bias_score` - Bias detection scores (histogram)
+- `gen_ai.eval.toxicity_score` - Toxicity scores (histogram)
+- `gen_ai.eval.hallucination_score` - Hallucination probability (histogram)
+- `gen_ai.eval.violations` - Count of threshold violations by type
+#### 🛡️ Safety Guardrails
+**Input/Output Filtering**
+- **Prompt Injection Detection** - Protect against prompt injection attacks
+  - Pattern-based detection (jailbreaking attempts)
+  - ML-based classifier for sophisticated attacks
+  - Real-time blocking with configurable policies
+  - Attack attempt metrics and logging
+- **Restricted Topics** - Block sensitive or inappropriate topics
+  - Configurable topic blacklists (legal, medical, financial advice)
+  - Industry-specific content filters
+  - Topic detection with confidence scoring
+  - Custom topic definition support
+- **Sensitive Information Protection** - Prevent PII leakage
+  - PII detection (emails, phone numbers, SSN, credit cards)
+  - Automatic redaction or blocking
+  - Compliance mode (GDPR, HIPAA, PCI-DSS)
+  - Data leak prevention metrics
+**Implementation:**
+```python
+import genai_otel
+# Configure guardrails
+genai_otel.instrument(
+    enable_prompt_injection_detection=True,
+    enable_restricted_topics=True,
+    enable_sensitive_info_detection=True,
+    # Custom configuration
+    restricted_topics=["medical_advice", "legal_advice", "financial_advice"],
+    pii_detection_mode="block",  # or "redact", "warn"
+    # Callbacks for custom handling
+    on_guardrail_violation=my_violation_handler
+)
+```
+**Metrics Added:**
+- `gen_ai.guardrail.prompt_injection_detected` - Injection attempts blocked
+- `gen_ai.guardrail.restricted_topic_blocked` - Restricted topic violations
+- `gen_ai.guardrail.pii_detected` - PII detection events
+- `gen_ai.guardrail.violations` - Total guardrail violations by type
+**Span Attributes:**
+- `gen_ai.guardrail.violation_type` - Type of violation detected
+- `gen_ai.guardrail.violation_severity` - Severity level (low, medium, high, critical)
+- `gen_ai.guardrail.blocked` - Whether request was blocked (boolean)
+- `gen_ai.eval.bias_categories` - Detected bias types (array)
+- `gen_ai.eval.toxicity_categories` - Toxicity categories (array)
+#### 📊 Enhanced OpenTelemetry Compliance
+Completing remaining items from [OTEL_SEMANTIC_GAP_ANALYSIS_AND_IMPLEMENTATION_PLAN.md](OTEL_SEMANTIC_GAP_ANALYSIS_AND_IMPLEMENTATION_PLAN.md):
+**Phase 4: Optional Enhancements**
+- ✅ Session & User Tracking - Track sessions and users across requests
+  ```python
+  genai_otel.instrument(
+      session_id_extractor=lambda ctx: ctx.get("session_id"),
+      user_id_extractor=lambda ctx: ctx.get("user_id")
+  )
+  ```
+- ✅ RAG/Embedding Attributes - Enhanced observability for retrieval-augmented generation
+  - `embedding.model_name` - Embedding model used
+  - `embedding.vector_dimensions` - Vector dimensions
+  - `retrieval.documents.{i}.document.id` - Retrieved document IDs
+  - `retrieval.documents.{i}.document.score` - Relevance scores
+  - `retrieval.documents.{i}.document.content` - Document content (truncated)
+- ✅ Agent Workflow Tracking - Better support for agentic workflows
+  - `agent.name` - Agent identifier
+  - `agent.iteration` - Current iteration number
+  - `agent.action` - Action taken
+  - `agent.observation` - Observation received
+#### 🔄 Migration Support
+**Backward Compatibility:**
+- All new features are opt-in via configuration
+- Existing instrumentation continues to work unchanged
+- Gradual migration path for new semantic conventions
+**Version Support:**
+- Python 3.9+ (evaluation features require 3.10+)
+- OpenTelemetry SDK 1.20.0+
+- Backward compatible with existing dashboards
+### Future Releases
+**v0.3.0 - Advanced Analytics**
+- Custom metric aggregations
+- Cost optimization recommendations
+- Automated performance regression detection
+- A/B testing support for prompts
+**v0.4.0 - Enterprise Features**
+- Multi-tenancy support
+- Role-based access control for telemetry
+- Advanced compliance reporting
+- SLA monitoring and alerting
+**Community Feedback**
+We welcome feedback on our roadmap! Please:
+- Open issues for feature requests
+- Join discussions on prioritization
+- Share your use cases and requirements
+See [Contributing.md](Contributing.md) for how to get involved.
 ## License
 Apache-2.0 license
