judgeval 0.5.0__tar.gz → 0.7.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {judgeval-0.5.0 → judgeval-0.7.0}/PKG-INFO +10 -47
- {judgeval-0.5.0 → judgeval-0.7.0}/README.md +6 -46
- {judgeval-0.5.0 → judgeval-0.7.0}/pyproject.toml +7 -1
- judgeval-0.7.0/src/judgeval/cli.py +65 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/common/api/api.py +44 -38
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/common/api/constants.py +18 -5
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/common/api/json_encoder.py +8 -9
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/common/tracer/core.py +448 -256
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/common/tracer/otel_span_processor.py +1 -1
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/common/tracer/span_processor.py +1 -1
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/common/tracer/span_transformer.py +2 -1
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/common/tracer/trace_manager.py +6 -1
- judgeval-0.7.0/src/judgeval/common/trainer/__init__.py +5 -0
- judgeval-0.7.0/src/judgeval/common/trainer/config.py +125 -0
- judgeval-0.7.0/src/judgeval/common/trainer/console.py +151 -0
- judgeval-0.7.0/src/judgeval/common/trainer/trainable_model.py +238 -0
- judgeval-0.7.0/src/judgeval/common/trainer/trainer.py +301 -0
- judgeval-0.7.0/src/judgeval/data/evaluation_run.py +104 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/data/judgment_types.py +37 -8
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/data/trace.py +1 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/data/trace_run.py +0 -2
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/integrations/langgraph.py +2 -1
- judgeval-0.7.0/src/judgeval/judgment_client.py +267 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/local_eval_queue.py +3 -5
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/run_evaluation.py +43 -299
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/scorers/base_scorer.py +9 -10
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/scorers/judgeval_scorers/api_scorers/prompt_scorer.py +17 -3
- {judgeval-0.5.0 → judgeval-0.7.0}/uv.lock +883 -25
- judgeval-0.5.0/src/judgeval/evaluation_run.py +0 -80
- judgeval-0.5.0/src/judgeval/judgment_client.py +0 -312
- {judgeval-0.5.0 → judgeval-0.7.0}/.github/ISSUE_TEMPLATE/bug_report.md +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/.github/ISSUE_TEMPLATE/config.yml +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/.github/ISSUE_TEMPLATE/feature_request.md +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/.github/pull_request_template.md +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/.github/workflows/blocked-pr.yaml +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/.github/workflows/ci.yaml +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/.github/workflows/lint.yaml +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/.github/workflows/merge-branch-check.yaml +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/.github/workflows/mypy.yaml +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/.github/workflows/pre-commit-autoupdate.yaml +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/.github/workflows/release.yaml +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/.github/workflows/validate-branch.yaml +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/.gitignore +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/.pre-commit-config.yaml +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/LICENSE.md +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/assets/Screenshot 2025-05-17 at 8.14.27 PM.png +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/assets/agent.gif +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/assets/agent_trace_example.png +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/assets/data.gif +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/assets/dataset_clustering_screenshot.png +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/assets/dataset_clustering_screenshot_dm.png +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/assets/datasets_preview_screenshot.png +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/assets/document.gif +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/assets/error_analysis_dashboard.png +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/assets/errors.png +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/assets/experiments_dashboard_screenshot.png +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/assets/experiments_page.png +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/assets/experiments_pagev2.png +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/assets/logo-dark.svg +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/assets/logo-light.svg +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/assets/monitoring_screenshot.png +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/assets/new_darkmode.svg +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/assets/new_lightmode.svg +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/assets/online_eval.png +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/assets/product_shot.png +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/assets/test.png +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/assets/tests.png +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/assets/trace.gif +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/assets/trace_demo.png +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/assets/trace_screenshot.png +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/assets/trace_screenshot_old.png +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/pytest.ini +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/.coveragerc +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/__init__.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/clients.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/common/__init__.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/common/api/__init__.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/common/exceptions.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/common/logger.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/common/storage/__init__.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/common/storage/s3_storage.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/common/tracer/__init__.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/common/tracer/constants.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/common/tracer/otel_exporter.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/common/tracer/providers.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/common/utils.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/constants.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/data/__init__.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/data/example.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/data/result.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/data/scorer_data.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/data/scripts/fix_default_factory.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/data/scripts/openapi_transform.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/data/tool.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/dataset.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/judges/__init__.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/judges/base_judge.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/judges/litellm_judge.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/judges/mixture_of_judges.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/judges/together_judge.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/judges/utils.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/rules.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/scorers/__init__.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/scorers/agent_scorer.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/scorers/api_scorer.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/scorers/example_scorer.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/scorers/exceptions.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/scorers/judgeval_scorers/__init__.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/scorers/judgeval_scorers/api_scorers/__init__.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/scorers/judgeval_scorers/api_scorers/answer_correctness.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/scorers/judgeval_scorers/api_scorers/answer_relevancy.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/scorers/judgeval_scorers/api_scorers/derailment_scorer.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/scorers/judgeval_scorers/api_scorers/execution_order.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/scorers/judgeval_scorers/api_scorers/faithfulness.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/scorers/judgeval_scorers/api_scorers/hallucination.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/scorers/judgeval_scorers/api_scorers/instruction_adherence.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/scorers/judgeval_scorers/api_scorers/tool_dependency.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/scorers/judgeval_scorers/api_scorers/tool_order.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/scorers/score.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/scorers/utils.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/tracer/__init__.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/utils/alerts.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/utils/async_utils.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/utils/file_utils.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/utils/requests.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/version_check.py +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/src/update_types.sh +0 -0
- {judgeval-0.5.0 → judgeval-0.7.0}/update_version.py +0 -0
{judgeval-0.5.0 → judgeval-0.7.0}/PKG-INFO

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: judgeval
-Version: 0.5.0
+Version: 0.7.0
 Summary: Judgeval Package
 Project-URL: Homepage, https://github.com/JudgmentLabs/judgeval
 Project-URL: Issues, https://github.com/JudgmentLabs/judgeval/issues
@@ -11,6 +11,8 @@ Classifier: Operating System :: OS Independent
 Classifier: Programming Language :: Python :: 3
 Requires-Python: >=3.11
 Requires-Dist: boto3
+Requires-Dist: click<8.2.0
+Requires-Dist: fireworks-ai>=0.19.18
 Requires-Dist: langchain-anthropic
 Requires-Dist: langchain-core
 Requires-Dist: langchain-huggingface
@@ -23,6 +25,7 @@ Requires-Dist: orjson>=3.9.0
 Requires-Dist: python-dotenv
 Requires-Dist: requests
 Requires-Dist: rich
+Requires-Dist: typer>=0.9.0
 Provides-Extra: langchain
 Requires-Dist: langchain-anthropic; extra == 'langchain'
 Requires-Dist: langchain-core; extra == 'langchain'
@@ -37,7 +40,7 @@ Description-Content-Type: text/markdown

 <br>
 <div style="font-size: 1.5em;">
-Enable self-learning agents with
+Enable self-learning agents with environment data and evals.
 </div>

 ## [Docs](https://docs.judgmentlabs.ai/) • [Judgment Cloud](https://app.judgmentlabs.ai/register) • [Self-Host](https://docs.judgmentlabs.ai/documentation/self-hosting/get-started) • [Landing Page](https://judgmentlabs.ai/)
@@ -54,11 +57,11 @@ We're hiring! Join us in our mission to enable self-learning agents by providing

 </div>

-Judgeval offers **open-source tooling** for
+Judgeval offers **open-source tooling** for evaluating autonomous, stateful agents. It **provides runtime data from agent-environment interactions** for continuous learning and self-improvement.

 ## 🎬 See Judgeval in Action

-**[Multi-Agent System](https://github.com/JudgmentLabs/judgment-cookbook/tree/main/cookbooks/agents/multi-agent) with complete observability:** (1) A multi-agent system spawns agents to research topics on the internet. (2) With just **3 lines of code**, Judgeval
+**[Multi-Agent System](https://github.com/JudgmentLabs/judgment-cookbook/tree/main/cookbooks/agents/multi-agent) with complete observability:** (1) A multi-agent system spawns agents to research topics on the internet. (2) With just **3 lines of code**, Judgeval captures all environment responses across all agent tool calls for monitoring. (3) After completion, (4) export all interaction data to enable further environment-specific learning and optimization.

 <table style="width: 100%; max-width: 800px; table-layout: fixed;">
 <tr>
@@ -67,8 +70,8 @@ Judgeval offers **open-source tooling** for tracing and evaluating autonomous, s
 <br><strong>🤖 Agents Running</strong>
 </td>
 <td align="center" style="padding: 8px; width: 50%;">
-<img src="assets/trace.gif" alt="
-<br><strong>📊
+<img src="assets/trace.gif" alt="Capturing Environment Data Demo" style="width: 100%; max-width: 350px; height: auto;" />
+<br><strong>📊 Capturing Environment Data </strong>
 </td>
 </tr>
 <tr>
@@ -109,54 +112,14 @@ export JUDGMENT_ORG_ID=...

 **If you don't have keys, [create an account](https://app.judgmentlabs.ai/register) on the platform!**

-## 🏁 Quickstarts
-
-### 🛰️ Tracing
-
-Create a file named `agent.py` with the following code:
-
-```python
-from judgeval.tracer import Tracer, wrap
-from openai import OpenAI
-
-client = wrap(OpenAI()) # tracks all LLM calls
-judgment = Tracer(project_name="my_project")
-
-@judgment.observe(span_type="tool")
-def format_question(question: str) -> str:
-    # dummy tool
-    return f"Question : {question}"
-
-@judgment.observe(span_type="function")
-def run_agent(prompt: str) -> str:
-    task = format_question(prompt)
-    response = client.chat.completions.create(
-        model="gpt-4.1",
-        messages=[{"role": "user", "content": task}]
-    )
-    return response.choices[0].message.content
-
-run_agent("What is the capital of the United States?")
-```
-You'll see your trace exported to the Judgment Platform:
-
-<p align="center"><img src="assets/online_eval.png" alt="Judgment Platform Trace Example" width="1500" /></p>
-
-
-[Click here](https://docs.judgmentlabs.ai/documentation/tracing/introduction) for a more detailed explanation.
-
-
-<!-- Created by https://github.com/ekalinin/github-markdown-toc -->
-

 ## ✨ Features

 | | |
 |:---|:---:|
-| <h3>🔍 Tracing</h3>Automatic agent tracing integrated with common frameworks (LangGraph, OpenAI, Anthropic). **Tracks inputs/outputs, agent tool calls, latency, cost, and custom metadata** at every step.<br><br>**Useful for:**<br>• 🐛 Debugging agent runs <br>• 📋 Collecting agent environment data <br>• 🔬 Pinpointing performance bottlenecks| <p align="center"><img src="assets/agent_trace_example.png" alt="Tracing visualization" width="1200"/></p> |
 | <h3>🧪 Evals</h3>Build custom evaluators on top of your agents. Judgeval supports LLM-as-a-judge, manual labeling, and code-based evaluators that connect with our metric-tracking infrastructure. <br><br>**Useful for:**<br>• ⚠️ Unit-testing <br>• 🔬 A/B testing <br>• 🛡️ Online guardrails | <p align="center"><img src="assets/test.png" alt="Evaluation metrics" width="800"/></p> |
 | <h3>📡 Monitoring</h3>Get Slack alerts for agent failures in production. Add custom hooks to address production regressions.<br><br> **Useful for:** <br>• 📉 Identifying degradation early <br>• 📈 Visualizing performance trends across agent versions and time | <p align="center"><img src="assets/errors.png" alt="Monitoring Dashboard" width="1200"/></p> |
-| <h3>📊 Datasets</h3>Export
+| <h3>📊 Datasets</h3>Export environment interactions and test cases to datasets for scaled analysis and optimization. Move datasets to/from Parquet, S3, etc. <br><br>Run evals on datasets as unit tests or to A/B test different agent configurations, enabling continuous learning from production interactions. <br><br> **Useful for:**<br>• 🗃️ Agent environment interaction data for optimization<br>• 🔄 Scaled analysis for A/B tests | <p align="center"><img src="assets/datasets_preview_screenshot.png" alt="Dataset management" width="1200"/></p> |

 ## 🏢 Self-Hosting

{judgeval-0.5.0 → judgeval-0.7.0}/README.md

@@ -5,7 +5,7 @@

 <br>
 <div style="font-size: 1.5em;">
-Enable self-learning agents with
+Enable self-learning agents with environment data and evals.
 </div>

 ## [Docs](https://docs.judgmentlabs.ai/) • [Judgment Cloud](https://app.judgmentlabs.ai/register) • [Self-Host](https://docs.judgmentlabs.ai/documentation/self-hosting/get-started) • [Landing Page](https://judgmentlabs.ai/)
@@ -22,11 +22,11 @@ We're hiring! Join us in our mission to enable self-learning agents by providing

 </div>

-Judgeval offers **open-source tooling** for
+Judgeval offers **open-source tooling** for evaluating autonomous, stateful agents. It **provides runtime data from agent-environment interactions** for continuous learning and self-improvement.

 ## 🎬 See Judgeval in Action

-**[Multi-Agent System](https://github.com/JudgmentLabs/judgment-cookbook/tree/main/cookbooks/agents/multi-agent) with complete observability:** (1) A multi-agent system spawns agents to research topics on the internet. (2) With just **3 lines of code**, Judgeval
+**[Multi-Agent System](https://github.com/JudgmentLabs/judgment-cookbook/tree/main/cookbooks/agents/multi-agent) with complete observability:** (1) A multi-agent system spawns agents to research topics on the internet. (2) With just **3 lines of code**, Judgeval captures all environment responses across all agent tool calls for monitoring. (3) After completion, (4) export all interaction data to enable further environment-specific learning and optimization.

 <table style="width: 100%; max-width: 800px; table-layout: fixed;">
 <tr>
@@ -35,8 +35,8 @@ Judgeval offers **open-source tooling** for tracing and evaluating autonomous, s
 <br><strong>🤖 Agents Running</strong>
 </td>
 <td align="center" style="padding: 8px; width: 50%;">
-<img src="assets/trace.gif" alt="
-<br><strong>📊
+<img src="assets/trace.gif" alt="Capturing Environment Data Demo" style="width: 100%; max-width: 350px; height: auto;" />
+<br><strong>📊 Capturing Environment Data </strong>
 </td>
 </tr>
 <tr>
@@ -77,54 +77,14 @@ export JUDGMENT_ORG_ID=...

 **If you don't have keys, [create an account](https://app.judgmentlabs.ai/register) on the platform!**

-## 🏁 Quickstarts
-
-### 🛰️ Tracing
-
-Create a file named `agent.py` with the following code:
-
-```python
-from judgeval.tracer import Tracer, wrap
-from openai import OpenAI
-
-client = wrap(OpenAI()) # tracks all LLM calls
-judgment = Tracer(project_name="my_project")
-
-@judgment.observe(span_type="tool")
-def format_question(question: str) -> str:
-    # dummy tool
-    return f"Question : {question}"
-
-@judgment.observe(span_type="function")
-def run_agent(prompt: str) -> str:
-    task = format_question(prompt)
-    response = client.chat.completions.create(
-        model="gpt-4.1",
-        messages=[{"role": "user", "content": task}]
-    )
-    return response.choices[0].message.content
-
-run_agent("What is the capital of the United States?")
-```
-You'll see your trace exported to the Judgment Platform:
-
-<p align="center"><img src="assets/online_eval.png" alt="Judgment Platform Trace Example" width="1500" /></p>
-
-
-[Click here](https://docs.judgmentlabs.ai/documentation/tracing/introduction) for a more detailed explanation.
-
-
-<!-- Created by https://github.com/ekalinin/github-markdown-toc -->
-

 ## ✨ Features

 | | |
 |:---|:---:|
-| <h3>🔍 Tracing</h3>Automatic agent tracing integrated with common frameworks (LangGraph, OpenAI, Anthropic). **Tracks inputs/outputs, agent tool calls, latency, cost, and custom metadata** at every step.<br><br>**Useful for:**<br>• 🐛 Debugging agent runs <br>• 📋 Collecting agent environment data <br>• 🔬 Pinpointing performance bottlenecks| <p align="center"><img src="assets/agent_trace_example.png" alt="Tracing visualization" width="1200"/></p> |
 | <h3>🧪 Evals</h3>Build custom evaluators on top of your agents. Judgeval supports LLM-as-a-judge, manual labeling, and code-based evaluators that connect with our metric-tracking infrastructure. <br><br>**Useful for:**<br>• ⚠️ Unit-testing <br>• 🔬 A/B testing <br>• 🛡️ Online guardrails | <p align="center"><img src="assets/test.png" alt="Evaluation metrics" width="800"/></p> |
 | <h3>📡 Monitoring</h3>Get Slack alerts for agent failures in production. Add custom hooks to address production regressions.<br><br> **Useful for:** <br>• 📉 Identifying degradation early <br>• 📈 Visualizing performance trends across agent versions and time | <p align="center"><img src="assets/errors.png" alt="Monitoring Dashboard" width="1200"/></p> |
-| <h3>📊 Datasets</h3>Export
+| <h3>📊 Datasets</h3>Export environment interactions and test cases to datasets for scaled analysis and optimization. Move datasets to/from Parquet, S3, etc. <br><br>Run evals on datasets as unit tests or to A/B test different agent configurations, enabling continuous learning from production interactions. <br><br> **Useful for:**<br>• 🗃️ Agent environment interaction data for optimization<br>• 🔄 Scaled analysis for A/B tests | <p align="center"><img src="assets/datasets_preview_screenshot.png" alt="Dataset management" width="1200"/></p> |

 ## 🏢 Self-Hosting

{judgeval-0.5.0 → judgeval-0.7.0}/pyproject.toml

@@ -1,6 +1,6 @@
 [project]
 name = "judgeval"
-version = "0.5.0"
+version = "0.7.0"
 authors = [
     { name = "Andrew Li", email = "andrew@judgmentlabs.ai" },
     { name = "Alex Shan", email = "alex@judgmentlabs.ai" },
@@ -29,12 +29,18 @@ dependencies = [
     "langchain-openai",
     "langchain-anthropic",
     "langchain-core",
+    "click<8.2.0",
+    "typer>=0.9.0",
+    "fireworks-ai>=0.19.18",
 ]

 [project.urls]
 Homepage = "https://github.com/JudgmentLabs/judgeval"
 Issues = "https://github.com/JudgmentLabs/judgeval/issues"

+[project.scripts]
+judgeval = "judgeval.cli:app"
+
 [build-system]
 requires = ["hatchling"]
 build-backend = "hatchling.build"
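Note: the new `[project.scripts]` table means that installing judgeval 0.7.0 also installs a `judgeval` console command, which dispatches to the Typer app defined in the new `src/judgeval/cli.py` shown next.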
judgeval-0.7.0/src/judgeval/cli.py

@@ -0,0 +1,65 @@
+#!/usr/bin/env python3
+
+import typer
+from pathlib import Path
+from dotenv import load_dotenv
+from judgeval.common.logger import judgeval_logger
+from judgeval.judgment_client import JudgmentClient
+
+load_dotenv()
+
+app = typer.Typer(
+    no_args_is_help=True,
+    rich_markup_mode=None,
+    rich_help_panel=None,
+    pretty_exceptions_enable=False,
+    pretty_exceptions_show_locals=False,
+    pretty_exceptions_short=False,
+)
+
+
+@app.command("upload_scorer")
+def upload_scorer(
+    scorer_file_path: str,
+    requirements_file_path: str,
+    unique_name: str = typer.Option(
+        None, help="Custom name for the scorer (auto-detected if not provided)"
+    ),
+):
+    # Validate file paths
+    if not Path(scorer_file_path).exists():
+        judgeval_logger.error(f"Scorer file not found: {scorer_file_path}")
+        raise typer.Exit(1)
+
+    if not Path(requirements_file_path).exists():
+        judgeval_logger.error(f"Requirements file not found: {requirements_file_path}")
+        raise typer.Exit(1)
+
+    try:
+        client = JudgmentClient()
+
+        result = client.upload_custom_scorer(
+            scorer_file_path=scorer_file_path,
+            requirements_file_path=requirements_file_path,
+            unique_name=unique_name,
+        )
+
+        if not result:
+            judgeval_logger.error("Failed to upload custom scorer")
+            raise typer.Exit(1)
+
+        raise typer.Exit(0)
+    except Exception:
+        raise
+
+
+@app.command()
+def version():
+    """Show version info"""
+    judgeval_logger.info("JudgEval CLI v0.0.0")
+
+
+if __name__ == "__main__":
+    app()
+
+# judgeval upload_scorer /Users/alanzhang/repo/JudgmentLabs/judgeval/src/demo/profile_match_scorer.py /Users/alanzhang/repo/JudgmentLabs/judgeval/src/demo/requirements.txt
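Going by the commands registered above and the `[project.scripts]` entry point, the CLI can be driven from a shell (e.g. `judgeval upload_scorer path/to/scorer.py path/to/requirements.txt --unique-name my_scorer`, where the paths and name are illustrative) or exercised programmatically with Typer's test runner. A minimal sketch, assuming the package is installed:

```python
# Smoke-test the new CLI entry point via Typer's CliRunner (illustrative only).
from typer.testing import CliRunner

from judgeval.cli import app

runner = CliRunner()
result = runner.invoke(app, ["version"])  # invokes the `version` command defined above
print(result.exit_code)

# upload_scorer takes the scorer file and requirements file as positional arguments, e.g.:
# runner.invoke(app, ["upload_scorer", "my_scorer.py", "requirements.txt"])
```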
{judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/common/api/api.py

@@ -20,13 +20,11 @@ from judgeval.common.api.constants import (
     JUDGMENT_EVAL_DELETE_API_URL,
     JUDGMENT_ADD_TO_RUN_EVAL_QUEUE_API_URL,
     JUDGMENT_GET_EVAL_STATUS_API_URL,
-    JUDGMENT_CHECK_EXPERIMENT_TYPE_API_URL,
-    JUDGMENT_EVAL_RUN_NAME_EXISTS_API_URL,
     JUDGMENT_SCORER_SAVE_API_URL,
     JUDGMENT_SCORER_FETCH_API_URL,
     JUDGMENT_SCORER_EXISTS_API_URL,
+    JUDGMENT_CUSTOM_SCORER_UPLOAD_API_URL,
     JUDGMENT_DATASETS_APPEND_TRACES_API_URL,
-    JUDGMENT_CHECK_EXAMPLE_KEYS_API_URL,
 )
 from judgeval.common.api.constants import (
     TraceFetchPayload,
@@ -45,12 +43,11 @@ from judgeval.common.api.constants import (
     DeleteEvalRunRequestBody,
     EvalLogPayload,
     EvalStatusPayload,
-    CheckExperimentTypePayload,
-    EvalRunNameExistsPayload,
     ScorerSavePayload,
     ScorerFetchPayload,
     ScorerExistsPayload,
-
+    CustomScorerUploadPayload,
+    CustomScorerTemplateResponse,
 )
 from judgeval.utils.requests import requests
 from judgeval.common.api.json_encoder import json_encoder
@@ -97,14 +94,20 @@ class JudgmentApiClient:
         method: Literal["POST", "PATCH", "GET", "DELETE"],
         url: str,
         payload: Any,
+        timeout: Optional[Union[float, tuple]] = None,
     ) -> Any:
+        # Prepare request kwargs with optional timeout
+        request_kwargs = self._request_kwargs()
+        if timeout is not None:
+            request_kwargs["timeout"] = timeout
+
         if method == "GET":
             r = requests.request(
                 method,
                 url,
                 params=payload,
                 headers=self._headers(),
-                **
+                **request_kwargs,
             )
         else:
             r = requests.request(
@@ -112,7 +115,7 @@ class JudgmentApiClient:
                 url,
                 json=json_encoder(payload),
                 headers=self._headers(),
-                **
+                **request_kwargs,
             )

         try:
@@ -186,10 +189,10 @@ class JudgmentApiClient:
         payload: EvalLogPayload = {"results": results, "run": run}
         return self._do_request("POST", JUDGMENT_EVAL_LOG_API_URL, payload)

-    def fetch_evaluation_results(self,
+    def fetch_evaluation_results(self, experiment_run_id: str, project_name: str):
         payload: EvalRunRequestBody = {
             "project_name": project_name,
-            "
+            "experiment_run_id": experiment_run_id,
         }
         return self._do_request("POST", JUDGMENT_EVAL_FETCH_API_URL, payload)

@@ -204,43 +207,21 @@ class JudgmentApiClient:
     def add_to_evaluation_queue(self, payload: Dict[str, Any]):
         return self._do_request("POST", JUDGMENT_ADD_TO_RUN_EVAL_QUEUE_API_URL, payload)

-    def get_evaluation_status(self,
+    def get_evaluation_status(self, experiment_run_id: str, project_name: str):
         payload: EvalStatusPayload = {
-            "
+            "experiment_run_id": experiment_run_id,
             "project_name": project_name,
             "judgment_api_key": self.api_key,
         }
         return self._do_request("GET", JUDGMENT_GET_EVAL_STATUS_API_URL, payload)

-    def
-
-
-            "project_name": project_name,
-            "judgment_api_key": self.api_key,
-            "is_trace": is_trace,
-        }
-        return self._do_request("POST", JUDGMENT_CHECK_EXPERIMENT_TYPE_API_URL, payload)
-
-    def check_eval_run_name_exists(self, eval_name: str, project_name: str):
-        payload: EvalRunNameExistsPayload = {
-            "eval_name": eval_name,
-            "project_name": project_name,
-            "judgment_api_key": self.api_key,
-        }
-        return self._do_request("POST", JUDGMENT_EVAL_RUN_NAME_EXISTS_API_URL, payload)
-
-    def check_example_keys(self, keys: List[str], eval_name: str, project_name: str):
-        payload: CheckExampleKeysPayload = {
-            "keys": keys,
-            "eval_name": eval_name,
-            "project_name": project_name,
-        }
-        return self._do_request("POST", JUDGMENT_CHECK_EXAMPLE_KEYS_API_URL, payload)
-
-    def save_scorer(self, name: str, prompt: str, options: Optional[dict] = None):
+    def save_scorer(
+        self, name: str, prompt: str, threshold: float, options: Optional[dict] = None
+    ):
         payload: ScorerSavePayload = {
             "name": name,
             "prompt": prompt,
+            "threshold": threshold,
             "options": options,
         }
         try:
@@ -292,6 +273,31 @@ class JudgmentApiClient:
                 request=e.request,
             )

+    def upload_custom_scorer(
+        self,
+        scorer_name: str,
+        scorer_code: str,
+        requirements_text: str,
+    ) -> CustomScorerTemplateResponse:
+        """Upload custom scorer to backend"""
+        payload: CustomScorerUploadPayload = {
+            "scorer_name": scorer_name,
+            "scorer_code": scorer_code,
+            "requirements_text": requirements_text,
+        }
+
+        try:
+            # Use longer timeout for custom scorer upload (5 minutes)
+            response = self._do_request(
+                "POST",
+                JUDGMENT_CUSTOM_SCORER_UPLOAD_API_URL,
+                payload,
+                timeout=(10, 300),
+            )
+            return response
+        except JudgmentAPIException as e:
+            raise e
+
     def push_dataset(
         self,
         dataset_alias: str,
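Here `timeout=(10, 300)` is the `requests` (connect, read) tuple form, so scorer uploads get 10 seconds to establish a connection and up to five minutes to finish, while other calls keep whatever `_request_kwargs()` already provides. This low-level method takes the scorer source and requirements as strings; the CLI instead goes through the higher-level `JudgmentClient.upload_custom_scorer`, which takes file paths. A rough sketch of that programmatic path (file names are illustrative, and valid `JUDGMENT_API_KEY`/`JUDGMENT_ORG_ID` credentials are assumed):

```python
# Programmatic equivalent of `judgeval upload_scorer ...` (sketch, not verbatim SDK docs).
from judgeval.judgment_client import JudgmentClient

client = JudgmentClient()
result = client.upload_custom_scorer(
    scorer_file_path="profile_match_scorer.py",  # path to the custom scorer source
    requirements_file_path="requirements.txt",   # extra dependencies the scorer needs
    unique_name=None,                            # optional; auto-detected when omitted
)
if not result:
    raise RuntimeError("Failed to upload custom scorer")
```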
{judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/common/api/constants.py

@@ -49,9 +49,9 @@ JUDGMENT_EVAL_DELETE_API_URL = (
 JUDGMENT_EVAL_DELETE_PROJECT_API_URL = f"{ROOT_API}/delete_eval_results_by_project/"
 JUDGMENT_ADD_TO_RUN_EVAL_QUEUE_API_URL = f"{ROOT_API}/add_to_run_eval_queue/"
 JUDGMENT_GET_EVAL_STATUS_API_URL = f"{ROOT_API}/get_evaluation_status/"
-
-
-
+
+# Custom Scorers API
+JUDGMENT_CUSTOM_SCORER_UPLOAD_API_URL = f"{ROOT_API}/upload_scorer/"


 # Evaluation API Payloads
@@ -73,9 +73,9 @@ class EvalLogPayload(TypedDict):


 class EvalStatusPayload(TypedDict):
-
-    project_name: str
+    experiment_run_id: str
     judgment_api_key: str
+    project_name: str


 class CheckExperimentTypePayload(TypedDict):
@@ -162,6 +162,7 @@ JUDGMENT_SCORER_EXISTS_API_URL = f"{ROOT_API}/scorer_exists/"
 class ScorerSavePayload(TypedDict):
     name: str
     prompt: str
+    threshold: float
     options: Optional[dict]


@@ -171,3 +172,15 @@ class ScorerFetchPayload(TypedDict):

 class ScorerExistsPayload(TypedDict):
     name: str
+
+
+class CustomScorerUploadPayload(TypedDict):
+    scorer_name: str
+    scorer_code: str
+    requirements_text: str
+
+
+class CustomScorerTemplateResponse(TypedDict):
+    scorer_name: str
+    status: str
+    message: str
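These TypedDicts only describe the request and response shapes; a short sketch of the payload the low-level client assembles for an upload (the scorer name and file names are illustrative):

```python
# Illustrative payload matching CustomScorerUploadPayload above.
from judgeval.common.api.constants import CustomScorerUploadPayload

payload: CustomScorerUploadPayload = {
    "scorer_name": "profile_match",                        # hypothetical scorer name
    "scorer_code": open("profile_match_scorer.py").read(),
    "requirements_text": open("requirements.txt").read(),
}
```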
{judgeval-0.5.0 → judgeval-0.7.0}/src/judgeval/common/api/json_encoder.py

@@ -84,7 +84,7 @@ def json_encoder(
     )

     # Sequences
-    if isinstance(obj, (list, set, frozenset,
+    if isinstance(obj, (list, set, frozenset, tuple, deque)):
         return _dump_sequence(
             obj=obj,
         )
@@ -169,16 +169,15 @@ def _dump_other(
     obj: Any,
 ) -> Any:
     """
-    Dump an object to a
+    Dump an object to a representation without iterating it.
+
+    Avoids calling dict(obj) which can consume iterators/generators or
+    invoke user-defined iteration protocols.
     """
     try:
-        data = dict(obj)
-    except Exception:
         return repr(obj)
-
-
-        data,
-    )
+    except Exception:
+        return str(obj)


 def iso_format(o: Union[datetime.date, datetime.time]) -> str:
@@ -218,7 +217,7 @@ ENCODERS_BY_TYPE: Dict[Type[Any], Callable[[Any], Any]] = {
     Enum: lambda o: o.value,
     frozenset: list,
     deque: list,
-    GeneratorType:
+    GeneratorType: repr,
     Path: str,
     Pattern: lambda o: o.pattern,
     SecretBytes: str,
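The net effect of the encoder changes: objects that cannot be safely converted now fall back to `repr()`/`str()` instead of being coerced with `dict(...)`, and generators are mapped straight to `repr`, so encoding no longer consumes them. A rough sketch, assuming `json_encoder` accepts the object to encode as its first argument:

```python
# Illustrative only -- the exact output format is an assumption.
from judgeval.common.api.json_encoder import json_encoder

gen = (i * i for i in range(3))
print(json_encoder(gen))  # something like "<generator object <genexpr> at 0x...>"
print(next(gen))          # 0 -- the generator is still intact after encoding
```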