chatlas 0.2.0__tar.gz → 0.3.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (152)
  1. {chatlas-0.2.0 → chatlas-0.3.0}/.github/workflows/release.yml +1 -4
  2. chatlas-0.3.0/CHANGELOG.md +26 -0
  3. {chatlas-0.2.0 → chatlas-0.3.0}/PKG-INFO +16 -5
  4. {chatlas-0.2.0 → chatlas-0.3.0}/README.md +11 -3
  5. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/_anthropic.py +101 -1
  6. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/_chat.py +198 -5
  7. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/_google.py +53 -1
  8. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/_ollama.py +8 -0
  9. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/_openai.py +63 -3
  10. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/_provider.py +17 -0
  11. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/types/anthropic/_client.py +0 -8
  12. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/types/anthropic/_submit.py +1 -2
  13. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/types/openai/_client.py +1 -0
  14. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/types/openai/_client_azure.py +1 -0
  15. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/types/openai/_submit.py +8 -2
  16. {chatlas-0.2.0 → chatlas-0.3.0}/docs/_quarto.yml +2 -0
  17. {chatlas-0.2.0 → chatlas-0.3.0}/docs/get-started.qmd +1 -1
  18. {chatlas-0.2.0 → chatlas-0.3.0}/docs/prompt-design.qmd +10 -10
  19. chatlas-0.3.0/docs/rag.qmd +78 -0
  20. {chatlas-0.2.0 → chatlas-0.3.0}/docs/web-apps.qmd +6 -6
  21. {chatlas-0.2.0 → chatlas-0.3.0}/pyproject.toml +3 -0
  22. {chatlas-0.2.0 → chatlas-0.3.0}/tests/test_provider_openai.py +1 -1
  23. chatlas-0.3.0/tests/test_tokens.py +72 -0
  24. chatlas-0.2.0/CHANGELOG.md +0 -12
  25. chatlas-0.2.0/tests/test_tokens.py +0 -34
  26. {chatlas-0.2.0 → chatlas-0.3.0}/.github/workflows/check-update-types.yml +0 -0
  27. {chatlas-0.2.0 → chatlas-0.3.0}/.github/workflows/docs-publish.yml +0 -0
  28. {chatlas-0.2.0 → chatlas-0.3.0}/.github/workflows/test.yml +0 -0
  29. {chatlas-0.2.0 → chatlas-0.3.0}/.gitignore +0 -0
  30. {chatlas-0.2.0 → chatlas-0.3.0}/.vscode/extensions.json +0 -0
  31. {chatlas-0.2.0 → chatlas-0.3.0}/.vscode/settings.json +0 -0
  32. {chatlas-0.2.0 → chatlas-0.3.0}/Makefile +0 -0
  33. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/__init__.py +0 -0
  34. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/_content.py +0 -0
  35. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/_content_image.py +0 -0
  36. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/_display.py +0 -0
  37. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/_github.py +0 -0
  38. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/_groq.py +0 -0
  39. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/_interpolate.py +0 -0
  40. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/_logging.py +0 -0
  41. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/_merge.py +0 -0
  42. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/_perplexity.py +0 -0
  43. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/_tokens.py +0 -0
  44. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/_tokens_old.py +0 -0
  45. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/_tools.py +0 -0
  46. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/_turn.py +0 -0
  47. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/_typing_extensions.py +0 -0
  48. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/_utils.py +0 -0
  49. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/types/__init__.py +0 -0
  50. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/types/anthropic/__init__.py +0 -0
  51. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/types/anthropic/_client_bedrock.py +0 -0
  52. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/types/google/__init__.py +0 -0
  53. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/types/google/_client.py +0 -0
  54. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/types/google/_submit.py +0 -0
  55. {chatlas-0.2.0 → chatlas-0.3.0}/chatlas/types/openai/__init__.py +0 -0
  56. {chatlas-0.2.0 → chatlas-0.3.0}/docs/.gitignore +0 -0
  57. {chatlas-0.2.0 → chatlas-0.3.0}/docs/_extensions/machow/interlinks/.gitignore +0 -0
  58. {chatlas-0.2.0 → chatlas-0.3.0}/docs/_extensions/machow/interlinks/_extension.yml +0 -0
  59. {chatlas-0.2.0 → chatlas-0.3.0}/docs/_extensions/machow/interlinks/interlinks.lua +0 -0
  60. {chatlas-0.2.0 → chatlas-0.3.0}/docs/_sidebar.yml +0 -0
  61. {chatlas-0.2.0 → chatlas-0.3.0}/docs/congressional-assets.png +0 -0
  62. {chatlas-0.2.0 → chatlas-0.3.0}/docs/examples/third-party-testing.txt +0 -0
  63. {chatlas-0.2.0 → chatlas-0.3.0}/docs/images/congressional-assets.png +0 -0
  64. {chatlas-0.2.0 → chatlas-0.3.0}/docs/images/tool-calling-right.svg +0 -0
  65. {chatlas-0.2.0 → chatlas-0.3.0}/docs/images/tool-calling-wrong.svg +0 -0
  66. {chatlas-0.2.0 → chatlas-0.3.0}/docs/index.py +0 -0
  67. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/favicon/android-chrome-192x192.png +0 -0
  68. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/favicon/android-chrome-512x512.png +0 -0
  69. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/favicon/apple-touch-icon.png +0 -0
  70. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/favicon/favicon-16x16.png +0 -0
  71. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/favicon/favicon-32x32.png +0 -0
  72. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/favicon/favicon.ico +0 -0
  73. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/icon/brand-yml-icon-black.png +0 -0
  74. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/icon/brand-yml-icon-black.svg +0 -0
  75. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/icon/brand-yml-icon-color.png +0 -0
  76. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/icon/brand-yml-icon-color.svg +0 -0
  77. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/icon/brand-yml-icon-white.png +0 -0
  78. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/icon/brand-yml-icon-white.svg +0 -0
  79. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/tall/brand-yml-tall-black.png +0 -0
  80. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/tall/brand-yml-tall-black.svg +0 -0
  81. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/tall/brand-yml-tall-color.png +0 -0
  82. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/tall/brand-yml-tall-color.svg +0 -0
  83. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/tall/brand-yml-tall-white.png +0 -0
  84. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/tall/brand-yml-tall-white.svg +0 -0
  85. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/wide/brand-yml-wide-black.png +0 -0
  86. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/wide/brand-yml-wide-black.svg +0 -0
  87. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/wide/brand-yml-wide-color.png +0 -0
  88. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/wide/brand-yml-wide-color.svg +0 -0
  89. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/wide/brand-yml-wide-large-black.png +0 -0
  90. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/wide/brand-yml-wide-large-color.png +0 -0
  91. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/wide/brand-yml-wide-large-white.png +0 -0
  92. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/wide/brand-yml-wide-white.png +0 -0
  93. {chatlas-0.2.0 → chatlas-0.3.0}/docs/logos/wide/brand-yml-wide-white.svg +0 -0
  94. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/Chat.qmd +0 -0
  95. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/ChatAnthropic.qmd +0 -0
  96. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/ChatAzureOpenAI.qmd +0 -0
  97. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/ChatBedrockAnthropic.qmd +0 -0
  98. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/ChatGithub.qmd +0 -0
  99. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/ChatGoogle.qmd +0 -0
  100. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/ChatGroq.qmd +0 -0
  101. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/ChatOllama.qmd +0 -0
  102. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/ChatOpenAI.qmd +0 -0
  103. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/ChatPerplexity.qmd +0 -0
  104. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/Provider.qmd +0 -0
  105. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/Tool.qmd +0 -0
  106. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/Turn.qmd +0 -0
  107. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/content_image_file.qmd +0 -0
  108. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/content_image_plot.qmd +0 -0
  109. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/content_image_url.qmd +0 -0
  110. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/image_file.qmd +0 -0
  111. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/image_plot.qmd +0 -0
  112. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/image_url.qmd +0 -0
  113. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/index.qmd +0 -0
  114. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/token_usage.qmd +0 -0
  115. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/types.ChatResponse.qmd +0 -0
  116. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/types.ChatResponseAsync.qmd +0 -0
  117. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/types.Content.qmd +0 -0
  118. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/types.ContentImage.qmd +0 -0
  119. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/types.ContentImageInline.qmd +0 -0
  120. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/types.ContentImageRemote.qmd +0 -0
  121. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/types.ContentJson.qmd +0 -0
  122. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/types.ContentText.qmd +0 -0
  123. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/types.ContentToolRequest.qmd +0 -0
  124. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/types.ContentToolResult.qmd +0 -0
  125. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/types.ImageContentTypes.qmd +0 -0
  126. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/types.MISSING.qmd +0 -0
  127. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/types.MISSING_TYPE.qmd +0 -0
  128. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/types.SubmitInputArgsT.qmd +0 -0
  129. {chatlas-0.2.0 → chatlas-0.3.0}/docs/reference/types.TokenUsage.qmd +0 -0
  130. {chatlas-0.2.0 → chatlas-0.3.0}/docs/structured-data.qmd +0 -0
  131. {chatlas-0.2.0 → chatlas-0.3.0}/docs/styles.scss +0 -0
  132. {chatlas-0.2.0 → chatlas-0.3.0}/docs/tool-calling.qmd +0 -0
  133. {chatlas-0.2.0 → chatlas-0.3.0}/pytest.ini +0 -0
  134. {chatlas-0.2.0 → chatlas-0.3.0}/scripts/_generate_anthropic_types.py +0 -0
  135. {chatlas-0.2.0 → chatlas-0.3.0}/scripts/_generate_google_types.py +0 -0
  136. {chatlas-0.2.0 → chatlas-0.3.0}/scripts/_generate_openai_types.py +0 -0
  137. {chatlas-0.2.0 → chatlas-0.3.0}/scripts/_utils.py +0 -0
  138. {chatlas-0.2.0 → chatlas-0.3.0}/scripts/main.py +0 -0
  139. {chatlas-0.2.0 → chatlas-0.3.0}/tests/__init__.py +0 -0
  140. {chatlas-0.2.0 → chatlas-0.3.0}/tests/__snapshots__/test_chat.ambr +0 -0
  141. {chatlas-0.2.0 → chatlas-0.3.0}/tests/conftest.py +0 -0
  142. {chatlas-0.2.0 → chatlas-0.3.0}/tests/test_chat.py +0 -0
  143. {chatlas-0.2.0 → chatlas-0.3.0}/tests/test_content.py +0 -0
  144. {chatlas-0.2.0 → chatlas-0.3.0}/tests/test_content_image.py +0 -0
  145. {chatlas-0.2.0 → chatlas-0.3.0}/tests/test_content_tools.py +0 -0
  146. {chatlas-0.2.0 → chatlas-0.3.0}/tests/test_interpolate.py +0 -0
  147. {chatlas-0.2.0 → chatlas-0.3.0}/tests/test_provider_anthropic.py +0 -0
  148. {chatlas-0.2.0 → chatlas-0.3.0}/tests/test_provider_azure.py +0 -0
  149. {chatlas-0.2.0 → chatlas-0.3.0}/tests/test_provider_bedrock.py +0 -0
  150. {chatlas-0.2.0 → chatlas-0.3.0}/tests/test_provider_google.py +0 -0
  151. {chatlas-0.2.0 → chatlas-0.3.0}/tests/test_turns.py +0 -0
  152. {chatlas-0.2.0 → chatlas-0.3.0}/tests/test_utils_merge.py +0 -0
.github/workflows/release.yml

@@ -5,8 +5,7 @@ on:
   types: [published]
 
 env:
-  UV_VERSION: "0.4.x"
-  PYTHON_VERSION: 3.13
+  PYTHON_VERSION: 3.12
 
 jobs:
   pypi-release:
@@ -27,8 +26,6 @@ jobs:
 
       - name: 🚀 Install uv
         uses: astral-sh/setup-uv@v3
-        with:
-          version: ${{ env.UV_VERSION }}
 
       - name: 🐍 Set up Python ${{ env.PYTHON_VERSION }}
         run: uv python install ${{ env.PYTHON_VERSION }}
CHANGELOG.md (new file)

@@ -0,0 +1,26 @@
+# Changelog
+
+<!--
+All notable changes to this project will be documented in this file.
+
+The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
+and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+-->
+
+## [0.3.0] - 2024-12-20
+
+### New features
+
+* `Chat`'s `.tokens()` method gains a `values` argument. Set it to `"discrete"` to get a result that can be summed to determine the token cost of submitting the current turns. The default (`"cumulative"`) remains the same: the result can be summed to determine the overall token cost of the conversation.
+* `Chat` gains a `.token_count()` method to help estimate the token cost of new input. (#23)
+
+### Bug fixes
+
+* `ChatOllama` no longer fails when an `OPENAI_API_KEY` environment variable is not set.
+* `ChatOpenAI` now correctly includes the relevant `detail` on `ContentImageRemote()` input.
+* `ChatGoogle` now correctly logs its `token_usage()`. (#23)
+
+
+## [0.2.0] - 2024-12-11
+
+First stable release of `chatlas`; see the website to learn more: <https://posit-dev.github.io/chatlas/>
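Read together, the two token-accounting entries translate into usage like the following. This is a minimal sketch, assuming a configured provider (here `ChatOpenAI`, which expects an `OPENAI_API_KEY`); the printed numbers vary by model:

```python
from chatlas import ChatOpenAI

chat = ChatOpenAI()
chat.chat("What is 2 + 2?", echo="none")
chat.chat("And 3 + 3?", echo="none")

# Cumulative values: one (input, output) pair per assistant turn (user
# turns are None); summing gives the conversation's overall token cost.
print(chat.tokens("cumulative"))

# Discrete values: one count per turn; the sum estimates what the current
# turns will cost to resubmit with the next request.
print(sum(chat.tokens("discrete")))

# Estimate the cost of a prompt before sending it at all.
print(chat.token_count("What is 5 + 5?"))
```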
PKG-INFO

@@ -1,6 +1,6 @@
-Metadata-Version: 2.3
+Metadata-Version: 2.4
 Name: chatlas
-Version: 0.2.0
+Version: 0.3.0
 Summary: A simple and consistent interface for chatting with LLMs
 Project-URL: Homepage, https://posit-dev.github.io/chatlas
 Project-URL: Documentation, https://posit-dev.github.io/chatlas
@@ -30,15 +30,18 @@ Requires-Dist: pillow; extra == 'dev'
 Requires-Dist: python-dotenv; extra == 'dev'
 Requires-Dist: ruff>=0.6.5; extra == 'dev'
 Requires-Dist: shiny; extra == 'dev'
+Requires-Dist: tiktoken; extra == 'dev'
 Provides-Extra: docs
 Requires-Dist: griffe>=1; extra == 'docs'
 Requires-Dist: ipykernel; extra == 'docs'
 Requires-Dist: ipywidgets; extra == 'docs'
 Requires-Dist: nbclient; extra == 'docs'
 Requires-Dist: nbformat; extra == 'docs'
+Requires-Dist: numpy; extra == 'docs'
 Requires-Dist: pandas; extra == 'docs'
 Requires-Dist: pyyaml; extra == 'docs'
 Requires-Dist: quartodoc>=0.7; extra == 'docs'
+Requires-Dist: sentence-transformers; extra == 'docs'
 Provides-Extra: test
 Requires-Dist: pyright>=1.1.379; extra == 'test'
 Requires-Dist: pytest-asyncio; extra == 'test'
@@ -48,6 +51,14 @@ Description-Content-Type: text/markdown
 
 # chatlas
 
+<p>
+  <!-- badges start -->
+  <a href="https://pypi.org/project/chatlas/"><img alt="PyPI" src="https://img.shields.io/pypi/v/chatlas?logo=python&logoColor=white&color=orange"></a>
+  <a href="https://choosealicense.com/licenses/mit/"><img src="https://img.shields.io/badge/License-MIT-blue.svg" alt="MIT License"></a>
+  <a href="https://github.com/posit-dev/chatlas"><img src="https://github.com/posit-dev/chatlas/actions/workflows/test.yml/badge.svg?branch=main" alt="Python Tests"></a>
+  <!-- badges end -->
+</p>
+
 chatlas provides a simple and unified interface across large language model (llm) providers in Python.
 It abstracts away complexity from common tasks like streaming chat interfaces, tool calling, structured output, and much more.
 chatlas helps you prototype faster without painting you into a corner; for example, switching providers is as easy as changing one line of code, but provider specific features are still accessible when needed.
@@ -123,7 +134,7 @@ From a `chat` instance, it's simple to start a web-based or terminal-based chat
 chat.app()
 ```
 
-<div style="display:flex;justify-content:center;">
+<div align="center">
 <img width="500" alt="A web app for chatting with an LLM via chatlas" src="https://github.com/user-attachments/assets/e43f60cb-3686-435a-bd11-8215cb024d2e" class="border rounded">
 </div>
 
@@ -279,7 +290,7 @@ asyncio.run(main())
 
 `chatlas` has full typing support, meaning that, among other things, autocompletion just works in your favorite editor:
 
-<div style="display:flex;justify-content:center;">
+<div align="center">
 <img width="500" alt="Autocompleting model options in ChatOpenAI" src="https://github.com/user-attachments/assets/163d6d8a-7d58-422d-b3af-cc9f2adee759" class="rounded">
 </div>
 
@@ -299,7 +310,7 @@ This shows important information like tool call results, finish reasons, and mor
 If the problem isn't self-evident, you can also reach into the `.get_last_turn()`, which contains the full response object, with full details about the completion.
 
 
-<div style="display:flex;justify-content:center;">
+<div align="center">
 <img width="500" alt="Turn completion details with typing support" src="https://github.com/user-attachments/assets/eaea338d-e44a-4e23-84a7-2e998d8af3ba" class="rounded">
 </div>
 
README.md

@@ -1,5 +1,13 @@
 # chatlas
 
+<p>
+  <!-- badges start -->
+  <a href="https://pypi.org/project/chatlas/"><img alt="PyPI" src="https://img.shields.io/pypi/v/chatlas?logo=python&logoColor=white&color=orange"></a>
+  <a href="https://choosealicense.com/licenses/mit/"><img src="https://img.shields.io/badge/License-MIT-blue.svg" alt="MIT License"></a>
+  <a href="https://github.com/posit-dev/chatlas"><img src="https://github.com/posit-dev/chatlas/actions/workflows/test.yml/badge.svg?branch=main" alt="Python Tests"></a>
+  <!-- badges end -->
+</p>
+
 chatlas provides a simple and unified interface across large language model (llm) providers in Python.
 It abstracts away complexity from common tasks like streaming chat interfaces, tool calling, structured output, and much more.
 chatlas helps you prototype faster without painting you into a corner; for example, switching providers is as easy as changing one line of code, but provider specific features are still accessible when needed.
@@ -75,7 +83,7 @@ From a `chat` instance, it's simple to start a web-based or terminal-based chat
 chat.app()
 ```
 
-<div style="display:flex;justify-content:center;">
+<div align="center">
 <img width="500" alt="A web app for chatting with an LLM via chatlas" src="https://github.com/user-attachments/assets/e43f60cb-3686-435a-bd11-8215cb024d2e" class="border rounded">
 </div>
 
@@ -231,7 +239,7 @@ asyncio.run(main())
 
 `chatlas` has full typing support, meaning that, among other things, autocompletion just works in your favorite editor:
 
-<div style="display:flex;justify-content:center;">
+<div align="center">
 <img width="500" alt="Autocompleting model options in ChatOpenAI" src="https://github.com/user-attachments/assets/163d6d8a-7d58-422d-b3af-cc9f2adee759" class="rounded">
 </div>
 
@@ -251,7 +259,7 @@ This shows important information like tool call results, finish reasons, and mor
 If the problem isn't self-evident, you can also reach into the `.get_last_turn()`, which contains the full response object, with full details about the completion.
 
 
-<div style="display:flex;justify-content:center;">
+<div align="center">
 <img width="500" alt="Turn completion details with typing support" src="https://github.com/user-attachments/assets/eaea338d-e44a-4e23-84a7-2e998d8af3ba" class="rounded">
 </div>
 
chatlas/_anthropic.py

@@ -20,7 +20,7 @@ from ._logging import log_model_default
 from ._provider import Provider
 from ._tokens import tokens_log
 from ._tools import Tool, basemodel_to_param_schema
-from ._turn import Turn, normalize_turns
+from ._turn import Turn, normalize_turns, user_turn
 
 if TYPE_CHECKING:
     from anthropic.types import (
@@ -380,6 +380,59 @@ class AnthropicProvider(Provider[Message, RawMessageStreamEvent, Message]):
     def value_turn(self, completion, has_data_model) -> Turn:
         return self._as_turn(completion, has_data_model)
 
+    def token_count(
+        self,
+        *args: Content | str,
+        tools: dict[str, Tool],
+        data_model: Optional[type[BaseModel]],
+    ) -> int:
+        kwargs = self._token_count_args(
+            *args,
+            tools=tools,
+            data_model=data_model,
+        )
+        res = self._client.messages.count_tokens(**kwargs)
+        return res.input_tokens
+
+    async def token_count_async(
+        self,
+        *args: Content | str,
+        tools: dict[str, Tool],
+        data_model: Optional[type[BaseModel]],
+    ) -> int:
+        kwargs = self._token_count_args(
+            *args,
+            tools=tools,
+            data_model=data_model,
+        )
+        res = await self._async_client.messages.count_tokens(**kwargs)
+        return res.input_tokens
+
+    def _token_count_args(
+        self,
+        *args: Content | str,
+        tools: dict[str, Tool],
+        data_model: Optional[type[BaseModel]],
+    ) -> dict[str, Any]:
+        turn = user_turn(*args)
+
+        kwargs = self._chat_perform_args(
+            stream=False,
+            turns=[turn],
+            tools=tools,
+            data_model=data_model,
+        )
+
+        args_to_keep = [
+            "messages",
+            "model",
+            "system",
+            "tools",
+            "tool_choice",
+        ]
+
+        return {arg: kwargs[arg] for arg in args_to_keep if arg in kwargs}
+
     def _as_message_params(self, turns: list[Turn]) -> list["MessageParam"]:
         messages: list["MessageParam"] = []
         for turn in turns:
@@ -575,6 +628,53 @@ def ChatBedrockAnthropic(
         Additional arguments to pass to the `anthropic.AnthropicBedrock()`
         client constructor.
 
+    Troubleshooting
+    ---------------
+
+    If you encounter 400 or 403 errors when trying to use the model, keep the
+    following in mind:
+
+    ::: {.callout-note}
+    #### Incorrect model name
+
+    If the model name is completely incorrect, you'll see an error like
+    `Error code: 400 - {'message': 'The provided model identifier is invalid.'}`.
+
+    Make sure the model name is correct and active in the specified region.
+    :::
+
+    ::: {.callout-note}
+    #### Models are region specific
+
+    If you encounter errors similar to `Error code: 403 - {'message': "You don't
+    have access to the model with the specified model ID."}`, make sure your
+    model is active in the relevant `aws_region`.
+
+    Keep in mind, if `aws_region` is not specified, and AWS_REGION is not set,
+    the region defaults to us-east-1, which may not match your AWS config's
+    default region.
+    :::
+
+    ::: {.callout-note}
+    #### Cross region inference ID
+
+    In some cases, even if you have the right model and the right region, you
+    may still encounter an error like `Error code: 400 - {'message':
+    'Invocation of model ID anthropic.claude-3-5-sonnet-20240620-v1:0 with
+    on-demand throughput isn't supported. Retry your request with the ID or ARN
+    of an inference profile that contains this model.'}`
+
+    In this case, you'll need to look up the 'cross region inference ID' for
+    your model. This might require opening your `aws-console` and navigating to
+    the 'Anthropic Bedrock' service page. From there, go to the 'cross region
+    inference' tab and copy the relevant ID.
+
+    For example, if the desired model ID is
+    `anthropic.claude-3-5-sonnet-20240620-v1:0`, the cross region ID might look
+    something like `us.anthropic.claude-3-5-sonnet-20240620-v1:0`.
+    :::
+
+
     Returns
     -------
     Chat
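The cross-region note above is the usual stumbling block. Here is a hedged sketch of a working call, reusing the docstring's own illustrative model ID and assuming the model is active in `us-east-1`:

```python
from chatlas import ChatBedrockAnthropic

# Pass the cross-region inference ID (note the "us." prefix), not the
# bare model ID, and pin the region you verified the model in.
chat = ChatBedrockAnthropic(
    model="us.anthropic.claude-3-5-sonnet-20240620-v1:0",
    aws_region="us-east-1",
)
chat.chat("Hello!")
```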
chatlas/_chat.py

@@ -16,6 +16,7 @@ from typing import (
     Optional,
     Sequence,
     TypeVar,
+    overload,
 )
 
 from pydantic import BaseModel
@@ -176,17 +177,209 @@ class Chat(Generic[SubmitInputArgsT, CompletionT]):
         if value is not None:
             self._turns.insert(0, Turn("system", value))
 
-    def tokens(self) -> list[tuple[int, int] | None]:
+    @overload
+    def tokens(self) -> list[tuple[int, int] | None]: ...
+
+    @overload
+    def tokens(
+        self,
+        values: Literal["cumulative"],
+    ) -> list[tuple[int, int] | None]: ...
+
+    @overload
+    def tokens(
+        self,
+        values: Literal["discrete"],
+    ) -> list[int]: ...
+
+    def tokens(
+        self,
+        values: Literal["cumulative", "discrete"] = "discrete",
+    ) -> list[int] | list[tuple[int, int] | None]:
         """
         Get the tokens for each turn in the chat.
 
+        Parameters
+        ----------
+        values
+            If "cumulative" (the default), the result can be summed to get the
+            chat's overall token usage (helpful for computing the overall cost of
+            the chat). If "discrete", the result can be summed to get the number of
+            tokens the turns will cost to generate the next response (helpful
+            for estimating the cost of the next response, or for determining if you
+            are about to exceed the token limit).
+
+        Returns
+        -------
+        list[int]
+            A list of token counts for each (non-system) turn in the chat. The
+            1st turn includes the token count for the system prompt (if any).
+
+        Raises
+        ------
+        ValueError
+            If the chat's turns (i.e., `.get_turns()`) are not in an expected
+            format. This may happen if the chat history is manually set (i.e.,
+            `.set_turns()`). In this case, you can inspect the "raw" token
+            values via the `.get_turns()` method (each turn has a `.tokens`
+            attribute).
+        """
+
+        turns = self.get_turns(include_system_prompt=False)
+
+        if values == "cumulative":
+            return [turn.tokens for turn in turns]
+
+        if len(turns) == 0:
+            return []
+
+        err_info = (
+            "This can happen if the chat history is manually set (i.e., `.set_turns()`). "
+            "Consider getting the 'raw' token values via the `.get_turns()` method "
+            "(each turn has a `.tokens` attribute)."
+        )
+
+        # Sanity checks for the assumptions made to figure out user token counts
+        if len(turns) == 1:
+            raise ValueError(
+                "Expected at least two turns in the chat history. " + err_info
+            )
+
+        if len(turns) % 2 != 0:
+            raise ValueError(
+                "Expected an even number of turns in the chat history. " + err_info
+            )
+
+        if turns[0].role != "user":
+            raise ValueError(
+                "Expected the 1st non-system turn to have role='user'. " + err_info
+            )
+
+        if turns[1].role != "assistant":
+            raise ValueError(
+                "Expected the 2nd non-system turn to have role='assistant'. " + err_info
+            )
+
+        if turns[1].tokens is None:
+            raise ValueError(
+                "Expected the 1st assistant turn to contain token counts. " + err_info
+            )
+
+        res: list[int] = [
+            # Implied token count for the 1st user input
+            turns[1].tokens[0],
+            # The token count for the 1st assistant response
+            turns[1].tokens[1],
+        ]
+        for i in range(1, len(turns) - 1, 2):
+            ti = turns[i]
+            tj = turns[i + 2]
+            if ti.role != "assistant" or tj.role != "assistant":
+                raise ValueError(
+                    "Expected even turns to have role='assistant'. " + err_info
+                )
+            if ti.tokens is None or tj.tokens is None:
+                raise ValueError(
+                    "Expected role='assistant' turns to contain token counts. "
+                    + err_info
+                )
+            res.extend(
+                [
+                    # Implied token count for the user input
+                    tj.tokens[0] - sum(ti.tokens),
+                    # The token count for the assistant response
+                    tj.tokens[1],
+                ]
+            )
+
+        return res
+
+    def token_count(
+        self,
+        *args: Content | str,
+        data_model: Optional[type[BaseModel]] = None,
+    ) -> int:
+        """
+        Get an estimated token count for the given input.
+
+        Estimate the token size of input content. This can help determine whether input(s)
+        and/or conversation history (i.e., `.get_turns()`) should be reduced in size before
+        sending it to the model.
+
+        Parameters
+        ----------
+        args
+            The input to get a token count for.
+        data_model
+            If the input is meant for data extraction (i.e., `.extract_data()`), then
+            this should be the Pydantic model that describes the structure of the data to
+            extract.
+
+        Returns
+        -------
+        int
+            The token count for the input.
+
+        Note
+        ----
+        Remember that the token count is an estimate. Also, models based on
+        `ChatOpenAI()` currently do not take tools into account when
+        estimating token counts.
+
+        Examples
+        --------
+        ```python
+        from chatlas import ChatAnthropic
+
+        chat = ChatAnthropic()
+        # Estimate the token count before sending the input
+        print(chat.token_count("What is 2 + 2?"))
+
+        # Once input is sent, you can get the actual input and output
+        # token counts from the chat object
+        chat.chat("What is 2 + 2?", echo="none")
+        print(chat.token_usage())
+        ```
+        """
+
+        return self.provider.token_count(
+            *args,
+            tools=self._tools,
+            data_model=data_model,
+        )
+
+    async def token_count_async(
+        self,
+        *args: Content | str,
+        data_model: Optional[type[BaseModel]] = None,
+    ) -> int:
+        """
+        Get an estimated token count for the given input asynchronously.
+
+        Estimate the token size of input content. This can help determine whether input(s)
+        and/or conversation history (i.e., `.get_turns()`) should be reduced in size before
+        sending it to the model.
+
+        Parameters
+        ----------
+        args
+            The input to get a token count for.
+        data_model
+            If this input is meant for data extraction (i.e., `.extract_data_async()`),
+            then this should be the Pydantic model that describes the structure of the
+            data to extract.
+
         Returns
         -------
-        list[tuple[int, int] | None]
-            A list of tuples, where each tuple contains the start and end token
-            indices for a turn.
+        int
+            The token count for the input.
         """
-        return [turn.tokens for turn in self._turns]
+
+        return await self.provider.token_count_async(
+            *args,
+            tools=self._tools,
+            data_model=data_model,
+        )
 
     def app(
         self,
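The `values="discrete"` branch above infers user-turn costs from the cumulative (input, output) pairs stored on assistant turns: each new user input costs `tj.tokens[0] - sum(ti.tokens)`. A small worked example with hypothetical counts:

```python
# Hypothetical cumulative (input, output) token pairs for two assistant turns.
assistant = [(14, 5), (28, 7)]

discrete = [
    assistant[0][0],  # 14: implied cost of the 1st user input
    assistant[0][1],  # 5: the 1st assistant response
    # 28 - (14 + 5) = 9: the 2nd user input is whatever the new cumulative
    # input count adds beyond everything already in the conversation.
    assistant[1][0] - sum(assistant[0]),
    assistant[1][1],  # 7: the 2nd assistant response
]
assert discrete == [14, 5, 9, 7]
assert sum(discrete) == 35
```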
chatlas/_google.py

@@ -17,8 +17,9 @@ from ._content import (
 )
 from ._logging import log_model_default
 from ._provider import Provider
+from ._tokens import tokens_log
 from ._tools import Tool, basemodel_to_param_schema
-from ._turn import Turn, normalize_turns
+from ._turn import Turn, normalize_turns, user_turn
 
 if TYPE_CHECKING:
     from google.generativeai.types.content_types import (
@@ -332,6 +333,55 @@ class GoogleProvider(
     def value_turn(self, completion, has_data_model) -> Turn:
         return self._as_turn(completion, has_data_model)
 
+    def token_count(
+        self,
+        *args: Content | str,
+        tools: dict[str, Tool],
+        data_model: Optional[type[BaseModel]],
+    ):
+        kwargs = self._token_count_args(
+            *args,
+            tools=tools,
+            data_model=data_model,
+        )
+
+        res = self._client.count_tokens(**kwargs)
+        return res.total_tokens
+
+    async def token_count_async(
+        self,
+        *args: Content | str,
+        tools: dict[str, Tool],
+        data_model: Optional[type[BaseModel]],
+    ):
+        kwargs = self._token_count_args(
+            *args,
+            tools=tools,
+            data_model=data_model,
+        )
+
+        res = await self._client.count_tokens_async(**kwargs)
+        return res.total_tokens
+
+    def _token_count_args(
+        self,
+        *args: Content | str,
+        tools: dict[str, Tool],
+        data_model: Optional[type[BaseModel]],
+    ) -> dict[str, Any]:
+        turn = user_turn(*args)
+
+        kwargs = self._chat_perform_args(
+            stream=False,
+            turns=[turn],
+            tools=tools,
+            data_model=data_model,
+        )
+
+        args_to_keep = ["contents", "tools"]
+
+        return {arg: kwargs[arg] for arg in args_to_keep if arg in kwargs}
+
     def _google_contents(self, turns: list[Turn]) -> list["ContentDict"]:
         contents: list["ContentDict"] = []
         for turn in turns:
@@ -421,6 +471,8 @@ class GoogleProvider(
             usage.candidates_token_count,
         )
 
+        tokens_log(self, tokens)
+
         finish = message.candidates[0].finish_reason
 
         return Turn(
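As with the Anthropic provider, these methods are reached through `Chat.token_count()`. A short sketch, assuming a Gemini API key is configured for `ChatGoogle`:

```python
from chatlas import ChatGoogle

chat = ChatGoogle()
# Delegates to the count_tokens endpoint wired up above.
print(chat.token_count("What is 2 + 2?"))
```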
chatlas/_ollama.py

@@ -48,6 +48,13 @@ def ChatOllama(
     (e.g. `ollama pull llama3.2`).
     :::
 
+    ::: {.callout-note}
+    ## Python requirements
+
+    `ChatOllama` requires the `openai` package (e.g., `pip install openai`).
+    :::
+
+
     Examples
     --------
 
@@ -103,6 +110,7 @@ def ChatOllama(
 
     return ChatOpenAI(
         system_prompt=system_prompt,
+        api_key="ollama",  # ignored
         turns=turns,
         base_url=f"{base_url}/v1",
         model=model,
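With the placeholder `api_key` in place, `ChatOllama` no longer depends on `OPENAI_API_KEY`. A minimal sketch, assuming a local Ollama server with `llama3.2` pulled:

```python
from chatlas import ChatOllama

# No OPENAI_API_KEY needed: the "ollama" placeholder key is supplied
# internally and ignored by the local server.
chat = ChatOllama(model="llama3.2")
chat.chat("Why is the sky blue?")
```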