tokenator 0.1.15.tar.gz → 0.2.0.tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {tokenator-0.1.15 → tokenator-0.2.0}/PKG-INFO +63 -6
- {tokenator-0.1.15 → tokenator-0.2.0}/README.md +60 -4
- {tokenator-0.1.15 → tokenator-0.2.0}/pyproject.toml +5 -2
- {tokenator-0.1.15 → tokenator-0.2.0}/src/tokenator/__init__.py +8 -1
- {tokenator-0.1.15 → tokenator-0.2.0}/src/tokenator/base_wrapper.py +4 -1
- tokenator-0.2.0/src/tokenator/gemini/__init__.py +5 -0
- tokenator-0.2.0/src/tokenator/gemini/client_gemini.py +230 -0
- tokenator-0.2.0/src/tokenator/gemini/stream_interceptors.py +77 -0
- tokenator-0.2.0/src/tokenator/usage.py +590 -0
- {tokenator-0.1.15 → tokenator-0.2.0}/src/tokenator/utils.py +7 -4
- tokenator-0.1.15/src/tokenator/usage.py +0 -503
- {tokenator-0.1.15 → tokenator-0.2.0}/LICENSE +0 -0
- {tokenator-0.1.15 → tokenator-0.2.0}/src/tokenator/anthropic/client_anthropic.py +0 -0
- {tokenator-0.1.15 → tokenator-0.2.0}/src/tokenator/anthropic/stream_interceptors.py +0 -0
- {tokenator-0.1.15 → tokenator-0.2.0}/src/tokenator/create_migrations.py +0 -0
- {tokenator-0.1.15 → tokenator-0.2.0}/src/tokenator/migrations/env.py +0 -0
- {tokenator-0.1.15 → tokenator-0.2.0}/src/tokenator/migrations/script.py.mako +0 -0
- {tokenator-0.1.15 → tokenator-0.2.0}/src/tokenator/migrations/versions/f028b8155fed_adding_detailed_input_and_output_token_.py +0 -0
- {tokenator-0.1.15 → tokenator-0.2.0}/src/tokenator/migrations/versions/f6f1f2437513_initial_migration.py +0 -0
- {tokenator-0.1.15 → tokenator-0.2.0}/src/tokenator/migrations.py +0 -0
- {tokenator-0.1.15 → tokenator-0.2.0}/src/tokenator/models.py +0 -0
- {tokenator-0.1.15 → tokenator-0.2.0}/src/tokenator/openai/client_openai.py +0 -0
- {tokenator-0.1.15 → tokenator-0.2.0}/src/tokenator/openai/stream_interceptors.py +0 -0
- {tokenator-0.1.15 → tokenator-0.2.0}/src/tokenator/schemas.py +0 -0
- {tokenator-0.1.15 → tokenator-0.2.0}/src/tokenator/state.py +0 -0
{tokenator-0.1.15 → tokenator-0.2.0}/PKG-INFO

@@ -1,10 +1,10 @@
 Metadata-Version: 2.3
 Name: tokenator
-Version: 0.1.15
+Version: 0.2.0
 Summary: Token usage tracking wrapper for LLMs
 License: MIT
 Author: Ujjwal Maheshwari
-Author-email:
+Author-email: ujjwalm29@gmail.com
 Requires-Python: >=3.9,<4.0
 Classifier: License :: OSI Approved :: MIT License
 Classifier: Programming Language :: Python :: 3
@@ -15,23 +15,28 @@ Classifier: Programming Language :: Python :: 3.12
 Classifier: Programming Language :: Python :: 3.13
 Requires-Dist: alembic (>=1.13.0,<2.0.0)
 Requires-Dist: anthropic (>=0.43.0,<0.44.0)
+Requires-Dist: google-genai (>=1.3.0,<2.0.0)
 Requires-Dist: ipython
 Requires-Dist: openai (>=1.59.0,<2.0.0)
 Requires-Dist: requests (>=2.32.3,<3.0.0)
 Requires-Dist: sqlalchemy (>=2.0.0,<3.0.0)
 Description-Content-Type: text/markdown

-# Tokenator : Track
+# Tokenator : Track, analyze, compare LLM token usage and costs

 Have you ever wondered :
 - How many tokens does your AI agent consume?
-- How much does it cost to
+- How much does it cost to run a complex AI workflow with multiple LLM providers?
+- Which LLM is more cost effective for my use case?
 - How much money/tokens did you spend today on developing with LLMs?

-Afraid not, tokenator is here! With tokenator's easy to use
+Afraid not, tokenator is here! With tokenator's easy to use functions, you can start tracking LLM usage in a matter of minutes.

 Get started with just 3 lines of code!

+Tokenator supports the official SDKs from openai, anthropic and google-genai(the new one).
+LLM providers which use the openai SDK like perplexity, deepseek and xAI are also supported.
+
 ## Installation

 ```bash
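For orientation, the "wrap the client, keep the same API" pattern this description refers to looks roughly like the sketch below. It is not a quote from the package; it assumes the openai SDK is installed, OPENAI_API_KEY is set, and the model name is only an example.

```python
# Hedged sketch of the wrap-and-call pattern described above (assumed setup:
# `pip install tokenator openai` and OPENAI_API_KEY exported in the shell).
from openai import OpenAI

from tokenator import tokenator_openai, usage

client = tokenator_openai(OpenAI())  # drop-in wrapper around the official client

response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model name, not prescribed by tokenator
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
print(usage.last_execution().model_dump_json(indent=4))
```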
@@ -114,6 +119,10 @@ print(cost.last_hour().model_dump_json(indent=4))
 }
 ```

+## Cookbooks
+
+Want more code, example use cases and ideas? Check out our amazing [cookbooks](https://github.com/ujjwalm29/tokenator/tree/main/docs/cookbooks)!
+
 ## Features

 - Drop-in replacement for OpenAI, Anthropic client
@@ -173,6 +182,54 @@ print(usage.last_execution().model_dump_json(indent=4))
 """
 ```

+### Google (Gemini - through AI studio)
+
+```python
+from google import genai
+from tokenator import tokenator_gemini
+
+gemini_client = genai.Client(api_key=os.getenv("GEMINI_API_KEY"))
+
+# Wrap it with Tokenator
+client = tokenator_gemini(gemini_client)
+
+# Use it exactly like the google-genai client
+response = models.generate_content(
+    model="gemini-2.0-flash",
+    contents="hello how are you",
+)
+
+print(response)
+
+print(usage.last_execution().model_dump_json(indent=4))
+"""
+{
+    "total_cost": 0.0001,
+    "total_tokens": 23,
+    "prompt_tokens": 10,
+    "completion_tokens": 13,
+    "providers": [
+        {
+            "total_cost": 0.0001,
+            "total_tokens": 23,
+            "prompt_tokens": 10,
+            "completion_tokens": 13,
+            "provider": "gemini",
+            "models": [
+                {
+                    "total_cost": 0.0004,
+                    "total_tokens": 79,
+                    "prompt_tokens": 52,
+                    "completion_tokens": 27,
+                    "model": "gemini-2.0-flash"
+                }
+            ]
+        }
+    ]
+}
+"""
+```
+
 ### xAI

 You can use xAI models through the `openai` SDK and track usage using `provider` parameter in `tokenator`.
@@ -221,7 +278,7 @@ client = tokenator_openai(perplexity_client, db_path=temp_db, provider="perplexi

 # Use it exactly like the OpenAI client but with perplexity models
 response = client.chat.completions.create(
-    model="
+    model="sonar",
     messages=[{"role": "user", "content": "Hello!"}]
 )

{tokenator-0.1.15 → tokenator-0.2.0}/README.md

@@ -1,14 +1,18 @@
-# Tokenator : Track
+# Tokenator : Track, analyze, compare LLM token usage and costs

 Have you ever wondered :
 - How many tokens does your AI agent consume?
-- How much does it cost to
+- How much does it cost to run a complex AI workflow with multiple LLM providers?
+- Which LLM is more cost effective for my use case?
 - How much money/tokens did you spend today on developing with LLMs?

-Afraid not, tokenator is here! With tokenator's easy to use
+Afraid not, tokenator is here! With tokenator's easy to use functions, you can start tracking LLM usage in a matter of minutes.

 Get started with just 3 lines of code!

+Tokenator supports the official SDKs from openai, anthropic and google-genai(the new one).
+LLM providers which use the openai SDK like perplexity, deepseek and xAI are also supported.
+
 ## Installation

 ```bash
@@ -91,6 +95,10 @@ print(cost.last_hour().model_dump_json(indent=4))
 }
 ```

+## Cookbooks
+
+Want more code, example use cases and ideas? Check out our amazing [cookbooks](https://github.com/ujjwalm29/tokenator/tree/main/docs/cookbooks)!
+
 ## Features

 - Drop-in replacement for OpenAI, Anthropic client
@@ -150,6 +158,54 @@ print(usage.last_execution().model_dump_json(indent=4))
 """
 ```

+### Google (Gemini - through AI studio)
+
+```python
+from google import genai
+from tokenator import tokenator_gemini
+
+gemini_client = genai.Client(api_key=os.getenv("GEMINI_API_KEY"))
+
+# Wrap it with Tokenator
+client = tokenator_gemini(gemini_client)
+
+# Use it exactly like the google-genai client
+response = models.generate_content(
+    model="gemini-2.0-flash",
+    contents="hello how are you",
+)
+
+print(response)
+
+print(usage.last_execution().model_dump_json(indent=4))
+"""
+{
+    "total_cost": 0.0001,
+    "total_tokens": 23,
+    "prompt_tokens": 10,
+    "completion_tokens": 13,
+    "providers": [
+        {
+            "total_cost": 0.0001,
+            "total_tokens": 23,
+            "prompt_tokens": 10,
+            "completion_tokens": 13,
+            "provider": "gemini",
+            "models": [
+                {
+                    "total_cost": 0.0004,
+                    "total_tokens": 79,
+                    "prompt_tokens": 52,
+                    "completion_tokens": 27,
+                    "model": "gemini-2.0-flash"
+                }
+            ]
+        }
+    ]
+}
+"""
+```
+
 ### xAI

 You can use xAI models through the `openai` SDK and track usage using `provider` parameter in `tokenator`.
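The quickstart as published assigns the wrapper to `client` but then calls `models.generate_content(...)` on a name that is never defined; judging from the wrapper code added in this release (`client_gemini.py` below, whose `models` property returns the wrapper itself), the call presumably goes through the wrapped client. A self-contained sketch under that assumption:

```python
# Runnable variant of the Gemini quickstart above (assumes
# `pip install tokenator google-genai` and GEMINI_API_KEY in the environment).
import os

from google import genai

from tokenator import tokenator_gemini, usage

gemini_client = genai.Client(api_key=os.getenv("GEMINI_API_KEY"))
client = tokenator_gemini(gemini_client)

# Route the call through the wrapper's `models` property so usage is tracked.
response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="hello how are you",
)
print(response)
print(usage.last_execution().model_dump_json(indent=4))
```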
@@ -198,7 +254,7 @@ client = tokenator_openai(perplexity_client, db_path=temp_db, provider="perplexi

 # Use it exactly like the OpenAI client but with perplexity models
 response = client.chat.completions.create(
-    model="
+    model="sonar",
     messages=[{"role": "user", "content": "Hello!"}]
 )

{tokenator-0.1.15 → tokenator-0.2.0}/pyproject.toml

@@ -1,8 +1,8 @@
 [tool.poetry]
 name = "tokenator"
-version = "0.1.15"
+version = "0.2.0"
 description = "Token usage tracking wrapper for LLMs"
-authors = ["Ujjwal Maheshwari <
+authors = ["Ujjwal Maheshwari <ujjwalm29@gmail.com>"]
 readme = "README.md"
 license = "MIT"
 packages = [{include = "tokenator", from = "src"}]
@@ -15,12 +15,15 @@ requests = "^2.32.3"
 alembic = "^1.13.0"
 anthropic = "^0.43.0"
 ipython = "*"
+google-genai = "^1.3.0"

 [tool.poetry.group.dev.dependencies]
 pytest = "^8.0.0"
 pytest-asyncio = "^0.23.0"
 pytest-cov = "^4.1.0"
 ruff = "^0.8.4"
+langsmith = "^0.3.0"
+python-dotenv = "^1.0.1"

 [build-system]
 requires = ["poetry-core"]
{tokenator-0.1.15 → tokenator-0.2.0}/src/tokenator/__init__.py

@@ -3,11 +3,18 @@
 import logging
 from .openai.client_openai import tokenator_openai
 from .anthropic.client_anthropic import tokenator_anthropic
+from .gemini.client_gemini import tokenator_gemini
 from . import usage
 from .utils import get_default_db_path
 from .usage import TokenUsageService

 usage = TokenUsageService()  # noqa: F811
-__all__ = [
+__all__ = [
+    "tokenator_openai",
+    "tokenator_anthropic",
+    "tokenator_gemini",
+    "usage",
+    "get_default_db_path",
+]

 logger = logging.getLogger(__name__)
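After this change the package's public surface is the three wrapper factories, the module-level `usage` service, and `get_default_db_path`. A minimal import check (a sketch; it assumes tokenator 0.2.0 and its provider SDK dependencies are installed):

```python
# Everything listed in __all__ above should be importable from the top level.
from tokenator import (
    tokenator_openai,
    tokenator_anthropic,
    tokenator_gemini,
    usage,
    get_default_db_path,
)

print(get_default_db_path())   # default SQLite path used for usage logging
print(type(usage).__name__)    # "TokenUsageService", instantiated at import time
```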
{tokenator-0.1.15 → tokenator-0.2.0}/src/tokenator/base_wrapper.py

@@ -112,7 +112,10 @@ class BaseWrapper:
         try:
             self._log_usage_impl(token_usage_stats, session, execution_id)
             session.commit()
-            logger.debug(
+            logger.debug(
+                "Successfully committed token usage for execution_id: %s",
+                execution_id,
+            )
         except Exception as e:
             logger.error("Failed to log token usage: %s", str(e))
             session.rollback()
tokenator-0.2.0/src/tokenator/gemini/client_gemini.py

@@ -0,0 +1,230 @@
+"""Gemini client wrapper with token usage tracking."""
+
+from typing import Any, Optional, Iterator, AsyncIterator
+import logging
+
+from google import genai
+from google.genai.types import GenerateContentResponse
+
+from ..models import (
+    TokenMetrics,
+    TokenUsageStats,
+)
+from ..base_wrapper import BaseWrapper, ResponseType
+from .stream_interceptors import (
+    GeminiAsyncStreamInterceptor,
+    GeminiSyncStreamInterceptor,
+)
+from ..state import is_tokenator_enabled
+
+logger = logging.getLogger(__name__)
+
+
+def _create_usage_callback(execution_id, log_usage_fn):
+    """Creates a callback function for processing usage statistics from stream chunks."""
+
+    def usage_callback(chunks):
+        if not chunks:
+            return
+
+        # Skip if tokenator is disabled
+        if not is_tokenator_enabled:
+            logger.debug("Tokenator is disabled - skipping stream usage logging")
+            return
+
+        logger.debug("Processing stream usage for execution_id: %s", execution_id)
+
+        # Build usage_data from the first chunk's model
+        usage_data = TokenUsageStats(
+            model=chunks[0].model_version,
+            usage=TokenMetrics(),
+        )
+
+        # Only take usage from the last chunk as it contains complete usage info
+        last_chunk = chunks[-1]
+        if last_chunk.usage_metadata:
+            usage_data.usage.prompt_tokens = (
+                last_chunk.usage_metadata.prompt_token_count
+            )
+            usage_data.usage.completion_tokens = (
+                last_chunk.usage_metadata.candidates_token_count or 0
+            )
+            usage_data.usage.total_tokens = last_chunk.usage_metadata.total_token_count
+        log_usage_fn(usage_data, execution_id=execution_id)
+
+    return usage_callback
+
+
+class BaseGeminiWrapper(BaseWrapper):
+    def __init__(self, client, db_path=None, provider: str = "gemini"):
+        super().__init__(client, db_path)
+        self.provider = provider
+        self._async_wrapper = None
+
+    def _process_response_usage(
+        self, response: ResponseType
+    ) -> Optional[TokenUsageStats]:
+        """Process and log usage statistics from a response."""
+        try:
+            if isinstance(response, GenerateContentResponse):
+                if response.usage_metadata is None:
+                    return None
+                usage = TokenMetrics(
+                    prompt_tokens=response.usage_metadata.prompt_token_count,
+                    completion_tokens=response.usage_metadata.candidates_token_count,
+                    total_tokens=response.usage_metadata.total_token_count,
+                )
+                return TokenUsageStats(model=response.model_version, usage=usage)
+
+            elif isinstance(response, dict):
+                usage_dict = response.get("usage_metadata")
+                if not usage_dict:
+                    return None
+                usage = TokenMetrics(
+                    prompt_tokens=usage_dict.get("prompt_token_count", 0),
+                    completion_tokens=usage_dict.get("candidates_token_count", 0),
+                    total_tokens=usage_dict.get("total_token_count", 0),
+                )
+                return TokenUsageStats(
+                    model=response.get("model", "unknown"), usage=usage
+                )
+        except Exception as e:
+            logger.warning("Failed to process usage stats: %s", str(e))
+            return None
+        return None
+
+    @property
+    def chat(self):
+        return self
+
+    @property
+    def chats(self):
+        return self
+
+    @property
+    def models(self):
+        return self
+
+    @property
+    def aio(self):
+        if self._async_wrapper is None:
+            self._async_wrapper = AsyncGeminiWrapper(self)
+        return self._async_wrapper
+
+    def count_tokens(self, *args: Any, **kwargs: Any):
+        return self.client.models.count_tokens(*args, **kwargs)
+
+
+class AsyncGeminiWrapper:
+    """Async wrapper for Gemini client to match the official SDK structure."""
+
+    def __init__(self, wrapper: BaseGeminiWrapper):
+        self.wrapper = wrapper
+        self._models = None
+
+    @property
+    def models(self):
+        if self._models is None:
+            self._models = AsyncModelsWrapper(self.wrapper)
+        return self._models
+
+
+class AsyncModelsWrapper:
+    """Async wrapper for models to match the official SDK structure."""
+
+    def __init__(self, wrapper: BaseGeminiWrapper):
+        self.wrapper = wrapper
+
+    async def generate_content(
+        self, *args: Any, **kwargs: Any
+    ) -> GenerateContentResponse:
+        """Async method for generate_content."""
+        execution_id = kwargs.pop("execution_id", None)
+        return await self.wrapper.generate_content_async(
+            *args, execution_id=execution_id, **kwargs
+        )
+
+    async def generate_content_stream(
+        self, *args: Any, **kwargs: Any
+    ) -> AsyncIterator[GenerateContentResponse]:
+        """Async method for generate_content_stream."""
+        execution_id = kwargs.pop("execution_id", None)
+        return await self.wrapper.generate_content_stream_async(
+            *args, execution_id=execution_id, **kwargs
+        )
+
+
+class GeminiWrapper(BaseGeminiWrapper):
+    def generate_content(
+        self, *args: Any, execution_id: Optional[str] = None, **kwargs: Any
+    ) -> GenerateContentResponse:
+        """Generate content and log token usage."""
+        logger.debug("Generating content with args: %s, kwargs: %s", args, kwargs)
+
+        response = self.client.models.generate_content(*args, **kwargs)
+        usage_data = self._process_response_usage(response)
+        if usage_data:
+            self._log_usage(usage_data, execution_id=execution_id)
+
+        return response
+
+    def generate_content_stream(
+        self, *args: Any, execution_id: Optional[str] = None, **kwargs: Any
+    ) -> Iterator[GenerateContentResponse]:
+        """Generate content with streaming and log token usage."""
+        logger.debug(
+            "Generating content stream with args: %s, kwargs: %s", args, kwargs
+        )
+
+        base_stream = self.client.models.generate_content_stream(*args, **kwargs)
+        return GeminiSyncStreamInterceptor(
+            base_stream=base_stream,
+            usage_callback=_create_usage_callback(execution_id, self._log_usage),
+        )
+
+    async def generate_content_async(
+        self, *args: Any, execution_id: Optional[str] = None, **kwargs: Any
+    ) -> GenerateContentResponse:
+        """Generate content asynchronously and log token usage."""
+        logger.debug("Generating content async with args: %s, kwargs: %s", args, kwargs)
+
+        response = await self.client.aio.models.generate_content(*args, **kwargs)
+        usage_data = self._process_response_usage(response)
+        if usage_data:
+            self._log_usage(usage_data, execution_id=execution_id)
+
+        return response
+
+    async def generate_content_stream_async(
+        self, *args: Any, execution_id: Optional[str] = None, **kwargs: Any
+    ) -> AsyncIterator[GenerateContentResponse]:
+        """Generate content with async streaming and log token usage."""
+        logger.debug(
+            "Generating content stream async with args: %s, kwargs: %s", args, kwargs
+        )
+
+        base_stream = await self.client.aio.models.generate_content_stream(
+            *args, **kwargs
+        )
+        return GeminiAsyncStreamInterceptor(
+            base_stream=base_stream,
+            usage_callback=_create_usage_callback(execution_id, self._log_usage),
+        )
+
+
+def tokenator_gemini(
+    client: genai.Client,
+    db_path: Optional[str] = None,
+    provider: str = "gemini",
+) -> GeminiWrapper:
+    """Create a token-tracking wrapper for a Gemini client.
+
+    Args:
+        client: Gemini client instance
+        db_path: Optional path to SQLite database for token tracking
+        provider: Provider name, defaults to "gemini"
+    """
+    if not isinstance(client, genai.Client):
+        raise ValueError("Client must be an instance of genai.Client")
+
+    return GeminiWrapper(client=client, db_path=db_path, provider=provider)
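The wrapper keeps the official SDK's surface (`models`, `aio.models`, `count_tokens`) and defers usage logging for streams to the interceptors below. A usage sketch, assuming GEMINI_API_KEY is set and the model name is only an example:

```python
import asyncio
import os

from google import genai

from tokenator import tokenator_gemini, usage

client = tokenator_gemini(genai.Client(api_key=os.getenv("GEMINI_API_KEY")))

# Sync streaming: usage is logged only after the stream is fully consumed,
# since the last chunk is the one carrying complete usage_metadata.
for chunk in client.models.generate_content_stream(
    model="gemini-2.0-flash", contents="write a haiku about tokens"
):
    print(chunk.text or "", end="")

async def main() -> None:
    # The aio/models properties mirror the official SDK's async surface.
    response = await client.aio.models.generate_content(
        model="gemini-2.0-flash", contents="hello"
    )
    print(response.text)

asyncio.run(main())
print(usage.last_execution().model_dump_json(indent=4))
```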
tokenator-0.2.0/src/tokenator/gemini/stream_interceptors.py

@@ -0,0 +1,77 @@
+"""Stream interceptors for Gemini responses."""
+
+import logging
+from typing import AsyncIterator, Callable, List, Optional, TypeVar, Iterator
+
+
+logger = logging.getLogger(__name__)
+
+_T = TypeVar("_T")  # GenerateContentResponse
+
+
+class GeminiAsyncStreamInterceptor(AsyncIterator[_T]):
+    """
+    A wrapper around Gemini async stream that intercepts each chunk to handle usage or
+    logging logic.
+    """
+
+    def __init__(
+        self,
+        base_stream: AsyncIterator[_T],
+        usage_callback: Optional[Callable[[List[_T]], None]] = None,
+    ):
+        self._base_stream = base_stream
+        self._usage_callback = usage_callback
+        self._chunks: List[_T] = []
+
+    def __aiter__(self) -> AsyncIterator[_T]:
+        """Return self as async iterator."""
+        return self
+
+    async def __anext__(self) -> _T:
+        """Get next chunk and track it."""
+        try:
+            chunk = await self._base_stream.__anext__()
+        except StopAsyncIteration:
+            # Once the base stream is fully consumed, we can do final usage/logging.
+            if self._usage_callback and self._chunks:
+                self._usage_callback(self._chunks)
+            raise
+
+        # Intercept each chunk
+        self._chunks.append(chunk)
+        return chunk
+
+
+class GeminiSyncStreamInterceptor(Iterator[_T]):
+    """
+    A wrapper around Gemini sync stream that intercepts each chunk to handle usage or
+    logging logic.
+    """
+
+    def __init__(
+        self,
+        base_stream: Iterator[_T],
+        usage_callback: Optional[Callable[[List[_T]], None]] = None,
+    ):
+        self._base_stream = base_stream
+        self._usage_callback = usage_callback
+        self._chunks: List[_T] = []
+
+    def __iter__(self) -> Iterator[_T]:
+        """Return self as iterator."""
+        return self
+
+    def __next__(self) -> _T:
+        """Get next chunk and track it."""
+        try:
+            chunk = next(self._base_stream)
+        except StopIteration:
+            # Once the base stream is fully consumed, we can do final usage/logging.
+            if self._usage_callback and self._chunks:
+                self._usage_callback(self._chunks)
+            raise
+
+        # Intercept each chunk
+        self._chunks.append(chunk)
+        return chunk