PyPI - pydantic-evals - Versions diffs - 0.5.0__tar.gz → 1.12.0__tar.gz - Mend

pydantic-evals 0.5.0__tar.gz → 1.12.0__tar.gz


Potentially problematic release.

This version of pydantic-evals might be problematic.

Files changed (25)

{pydantic_evals-0.5.0 → pydantic_evals-1.12.0}/.gitignore RENAMED

@@ -10,7 +10,7 @@ env*/
 /TODO.md
 /postgres-data/
 .DS_Store
-examples/pydantic_ai_examples/.chat_app_messages.sqlite
+.chat_app_messages.sqlite
 .cache/
 .vscode/
 /question_graph_history.json
@@ -19,3 +19,5 @@ node_modules/
 **.idea/
 .coverage*
 /test_tmp/
+.mcp.json
+.claude/

{pydantic_evals-0.5.0 → pydantic_evals-1.12.0}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: pydantic-evals
-Version: 0.5.0
+Version: 1.12.0
 Summary: Framework for evaluating stochastic code execution, especially code making use of LLMs
 Project-URL: Homepage, https://ai.pydantic.dev/evals
 Project-URL: Source, https://github.com/pydantic/pydantic-ai
@@ -9,7 +9,7 @@ Project-URL: Changelog, https://github.com/pydantic/pydantic-ai/releases
 Author-email: Samuel Colvin <samuel@pydantic.dev>, Marcelo Trylesinski <marcelotryle@gmail.com>, David Montague <david@pydantic.dev>, Alex Hall <alex@pydantic.dev>, Douwe Maan <douwe@pydantic.dev>
 License-Expression: MIT
 License-File: LICENSE
-Classifier: Development Status :: 4 - Beta
+Classifier: Development Status :: 5 - Production/Stable
 Classifier: Environment :: Console
 Classifier: Environment :: MacOS X
 Classifier: Intended Audience :: Developers
@@ -21,23 +21,21 @@ Classifier: Operating System :: Unix
 Classifier: Programming Language :: Python
 Classifier: Programming Language :: Python :: 3
 Classifier: Programming Language :: Python :: 3 :: Only
-Classifier: Programming Language :: Python :: 3.9
 Classifier: Programming Language :: Python :: 3.10
 Classifier: Programming Language :: Python :: 3.11
 Classifier: Programming Language :: Python :: 3.12
 Classifier: Programming Language :: Python :: 3.13
 Classifier: Topic :: Internet
 Classifier: Topic :: Software Development :: Libraries :: Python Modules
-Requires-Python: >=3.9
+Requires-Python: >=3.10
 Requires-Dist: anyio>=0
-Requires-Dist: eval-type-backport>=0; python_version < '3.11'
-Requires-Dist: logfire-api>=1.2.0
-Requires-Dist: pydantic-ai-slim==0.5.0
+Requires-Dist: logfire-api>=3.14.1
+Requires-Dist: pydantic-ai-slim==1.12.0
 Requires-Dist: pydantic>=2.10
 Requires-Dist: pyyaml>=6.0.2
 Requires-Dist: rich>=13.9.4
 Provides-Extra: logfire
-Requires-Dist: logfire>=2.3; extra == 'logfire'
+Requires-Dist: logfire>=3.14.1; extra == 'logfire'
 Description-Content-Type: text/markdown
 # Pydantic Evals
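
The dependency changes in this hunk can also be extracted mechanically. The following is a minimal sketch, assuming the two PKG-INFO files are available as text; it parses them as RFC 822 style headers with the standard library and compares the `Requires-Dist` entries. The `OLD` and `NEW` strings are excerpts copied from the hunk above (the `logfire` extra is omitted for brevity), and the helper names `requirements` and `diff_requirements` are hypothetical, not part of pydantic-evals.

```python
from email.parser import HeaderParser

# Excerpts of the two PKG-INFO files, copied from the diff above.
OLD = """\
Metadata-Version: 2.4
Name: pydantic-evals
Version: 0.5.0
Requires-Python: >=3.9
Requires-Dist: anyio>=0
Requires-Dist: eval-type-backport>=0; python_version < '3.11'
Requires-Dist: logfire-api>=1.2.0
Requires-Dist: pydantic-ai-slim==0.5.0
Requires-Dist: pydantic>=2.10
Requires-Dist: pyyaml>=6.0.2
Requires-Dist: rich>=13.9.4
"""

NEW = """\
Metadata-Version: 2.4
Name: pydantic-evals
Version: 1.12.0
Requires-Python: >=3.10
Requires-Dist: anyio>=0
Requires-Dist: logfire-api>=3.14.1
Requires-Dist: pydantic-ai-slim==1.12.0
Requires-Dist: pydantic>=2.10
Requires-Dist: pyyaml>=6.0.2
Requires-Dist: rich>=13.9.4
"""


def requirements(pkg_info: str) -> set[str]:
    """Return every Requires-Dist value found in a PKG-INFO text (hypothetical helper)."""
    headers = HeaderParser().parsestr(pkg_info)
    return set(headers.get_all('Requires-Dist', []))


def diff_requirements(old: str, new: str) -> tuple[list[str], list[str]]:
    """Return (removed, added) requirement strings, each sorted alphabetically."""
    old_reqs, new_reqs = requirements(old), requirements(new)
    return sorted(old_reqs - new_reqs), sorted(new_reqs - old_reqs)


removed, added = diff_requirements(OLD, NEW)
print('removed:', removed)
print('added:', added)
```

Because the comparison works on whole requirement strings, a changed version bound such as `logfire-api>=1.2.0` to `logfire-api>=3.14.1` shows up as one removal plus one addition, which mirrors how the unified diff presents it.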

pydantic_evals-1.12.0/pydantic_evals/__init__.py ADDED

@@ -0,0 +1,16 @@
+"""A toolkit for evaluating the execution of arbitrary "stochastic functions", such as LLM calls.
+This package provides functionality for:
+- Creating and loading test datasets with structured inputs and outputs
+- Evaluating model performance using various metrics and evaluators
+- Generating reports for evaluation results
+"""
+from .dataset import Case, Dataset, increment_eval_metric, set_eval_attribute
+__all__ = (
+    'Case',
+    'Dataset',
+    'increment_eval_metric',
+    'set_eval_attribute',
+)

{pydantic_evals-0.5.0 → pydantic_evals-1.12.0}/pydantic_evals/_utils.py RENAMED

@@ -2,13 +2,20 @@ from __future__ import annotations as _annotations
 import asyncio
 import inspect
-from collections.abc import Awaitable, Sequence
+import warnings
+from collections.abc import Awaitable, Callable, Generator, Sequence
+from contextlib import contextmanager
 from functools import partial
-from typing import Any, Callable, TypeVar
+from pathlib import Path
+from typing import TYPE_CHECKING, Any, TypeVar
 import anyio
+import logfire_api
 from typing_extensions import ParamSpec, TypeIs
+_logfire = logfire_api.Logfire(otel_scope='pydantic-evals')
+logfire_api.add_non_user_code_prefix(Path(__file__).parent.absolute())
 class Unset:
     """A singleton to represent an unset value.
@@ -101,3 +108,28 @@ async def task_group_gather(tasks: Sequence[Callable[[], Awaitable[T]]]) -> list
             tg.start_soon(_run_task, task, i)
     return results
+try:
+    from logfire._internal.config import (
+        LogfireNotConfiguredWarning,  # pyright: ignore[reportAssignmentType,reportPrivateImportUsage]
+    )
+# TODO: Remove this `pragma: no cover` once we test evals without pydantic-ai (which includes logfire)
+except ImportError:  # pragma: no cover
+    class LogfireNotConfiguredWarning(UserWarning):
+        pass
+if TYPE_CHECKING:
+    logfire_span = _logfire.span
+else:
+    @contextmanager
+    def logfire_span(*args: Any, **kwargs: Any) -> Generator[logfire_api.LogfireSpan, None, None]:
+        """Create a Logfire span without warning if logfire is not configured."""
+        # TODO: Remove once Logfire has the ability to suppress this warning from non-user code
+        with warnings.catch_warnings():
+            warnings.filterwarnings('ignore', category=LogfireNotConfiguredWarning)
+            with _logfire.span(*args, **kwargs) as span:
+                yield span
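
The `logfire_span` wrapper added in this hunk relies on a general standard-library pattern: open a `warnings.catch_warnings()` block, install an `ignore` filter for one warning category, and yield from inside it. The sketch below reproduces that pattern without Logfire so it can be run in isolation. `NotConfiguredWarning`, `open_span`, `quiet_span`, and `count_warnings` are hypothetical stand-ins introduced for illustration; they are not part of pydantic-evals or Logfire.

```python
import warnings
from collections.abc import Generator
from contextlib import contextmanager


class NotConfiguredWarning(UserWarning):
    """Hypothetical stand-in for LogfireNotConfiguredWarning."""


def open_span(name: str) -> str:
    """Hypothetical stand-in for `_logfire.span`: warns on every call."""
    warnings.warn('tracing is not configured', NotConfiguredWarning)
    return name


@contextmanager
def quiet_span(name: str) -> Generator[str, None, None]:
    """Yield a span while ignoring NotConfiguredWarning inside the block."""
    with warnings.catch_warnings():
        # The filter is local to this block; catch_warnings restores the
        # previous filter list when the block exits.
        warnings.filterwarnings('ignore', category=NotConfiguredWarning)
        span = open_span(name)
        yield span


def count_warnings(use_wrapper: bool) -> int:
    """Return how many NotConfiguredWarning instances reach the caller."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter('always')
        if use_wrapper:
            with quiet_span('evaluate'):
                pass
        else:
            open_span('evaluate')
    return sum(issubclass(w.category, NotConfiguredWarning) for w in caught)


print(count_warnings(use_wrapper=False))
print(count_warnings(use_wrapper=True))
```

As in the diffed code, the `yield` sits inside the `catch_warnings()` block, so the `ignore` filter also applies to warnings of that category raised by the caller's code within the `with` body, not only to the one raised when the span is opened.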
