versionhq 1.2.2.9__tar.gz → 1.2.2.10__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/.gitignore +2 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/PKG-INFO +2 -2
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/README.md +1 -1
- versionhq-1.2.2.10/docs/core/knowledge.md +11 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/docs/core/llm/index.md +2 -1
- versionhq-1.2.2.10/docs/core/memory.md +11 -0
- versionhq-1.2.2.10/docs/core/rag-tool.md +85 -0
- versionhq-1.2.2.10/docs/core/task/evaluation.md +77 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/docs/core/task/index.md +1 -1
- versionhq-1.2.2.10/docs/core/task/reference.md +105 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/docs/core/task/task-output.md +3 -2
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/docs/core/task/task-strc-response.md +1 -1
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/docs/core/tool.md +2 -1
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/mkdocs.yml +4 -1
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/pyproject.toml +1 -1
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/__init__.py +1 -1
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/llm/llm_vars.py +4 -1
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/tool/rag_tool.py +1 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq.egg-info/PKG-INFO +2 -2
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq.egg-info/SOURCES.txt +5 -1
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/llm/llm_connection_test.py +28 -1
- versionhq-1.2.2.10/tests/task/doc_eval_test.py +23 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/task/eval_test.py +1 -1
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/tool/rag_tool_test.py +3 -2
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/uv.lock +1 -1
- versionhq-1.2.2.9/docs/core/task/evaluation.md +0 -6
- versionhq-1.2.2.9/docs/core/task/task-ref.md +0 -38
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/.env.sample +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/.github/workflows/deploy_docs.yml +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/.github/workflows/publish.yml +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/.github/workflows/publish_testpypi.yml +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/.github/workflows/run_tests.yml +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/.github/workflows/security_check.yml +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/.pre-commit-config.yaml +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/.python-version +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/LICENSE +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/SECURITY.md +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/db/preprocess.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/docs/CNAME +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/docs/_logos/favicon.ico +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/docs/_logos/logo192.png +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/docs/core/agent/config.md +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/docs/core/agent/index.md +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/docs/core/agent/task-handling.md +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/docs/core/agent-network/config.md +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/docs/core/agent-network/form.md +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/docs/core/agent-network/index.md +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/docs/core/agent-network/ref.md +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/docs/core/task/response-field.md +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/docs/core/task/task-execution.md +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/docs/core/task-graph/index.md +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/docs/index.md +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/docs/quickstart.md +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/docs/stylesheets/main.css +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/docs/tags.md +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/requirements-dev.txt +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/requirements.txt +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/runtime.txt +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/setup.cfg +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/_utils/__init__.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/_utils/i18n.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/_utils/llm_as_a_judge.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/_utils/logger.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/_utils/process_config.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/_utils/usage_metrics.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/_utils/vars.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/agent/TEMPLATES/Backstory.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/agent/TEMPLATES/__init__.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/agent/__init__.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/agent/inhouse_agents.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/agent/model.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/agent/parser.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/agent/rpm_controller.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/agent_network/__init__.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/agent_network/formation.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/agent_network/model.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/cli/__init__.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/clients/__init__.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/clients/customer/__init__.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/clients/customer/model.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/clients/product/__init__.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/clients/product/model.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/clients/workflow/__init__.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/clients/workflow/model.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/knowledge/__init__.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/knowledge/_utils.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/knowledge/embedding.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/knowledge/model.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/knowledge/source.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/knowledge/source_docling.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/knowledge/storage.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/llm/__init__.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/llm/model.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/memory/__init__.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/memory/contextual_memory.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/memory/model.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/storage/__init__.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/storage/base.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/storage/ltm_sqlite_storage.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/storage/mem0_storage.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/storage/rag_storage.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/storage/task_output_storage.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/storage/utils.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/task/TEMPLATES/Description.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/task/__init__.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/task/evaluation.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/task/formatter.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/task/model.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/task/structured_response.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/task_graph/__init__.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/task_graph/colors.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/task_graph/draft.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/task_graph/model.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/tool/__init__.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/tool/cache_handler.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/tool/composio_tool.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/tool/composio_tool_vars.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/tool/decorator.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/tool/model.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq/tool/tool_handler.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq.egg-info/dependency_links.txt +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq.egg-info/requires.txt +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/src/versionhq.egg-info/top_level.txt +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/__init__.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/_sample/sample.csv +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/_sample/sample.json +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/agent/__init__.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/agent/agent_test.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/agent/doc_test.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/agent_network/Prompts/Demo_test.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/agent_network/__init__.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/agent_network/agent_network_test.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/agent_network/doc_test.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/cli/__init__.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/clients/customer_test.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/clients/product_test.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/clients/workflow_test.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/conftest.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/doc_test.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/formation_test.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/knowledge/__init__.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/knowledge/knowledge_test.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/knowledge/mock_report_compressed.pdf +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/llm/__init__.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/llm/llm_test.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/memory/__init__.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/memory/memory_test.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/task/__init__.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/task/doc_taskoutput_test.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/task/doc_test.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/task/task_test.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/task_graph/__init__.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/task_graph/doc_test.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/task_graph/task_graph_test.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/tool/__init__.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/tool/composio_test.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/tool/doc_test.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/tool/tool_test.py +0 -0
- {versionhq-1.2.2.9 → versionhq-1.2.2.10}/tests/usecase_test.py +0 -0
@@ -1,6 +1,6 @@
|
|
1
1
|
Metadata-Version: 2.2
|
2
2
|
Name: versionhq
|
3
|
-
Version: 1.2.2.
|
3
|
+
Version: 1.2.2.10
|
4
4
|
Summary: An agentic orchestration framework for building agent networks that handle task automation.
|
5
5
|
Author-email: Kuriko Iwai <kuriko@versi0n.io>
|
6
6
|
License: MIT License
|
@@ -100,7 +100,7 @@ Agentic orchestration framework for multi-agent networks and task graphs for com
|
|
100
100
|
**Visit:**
|
101
101
|
|
102
102
|
- [Playground](https://versi0n.io/)
|
103
|
-
- [
|
103
|
+
- [Documentation](https://docs.versi0n.io)
|
104
104
|
- [Github](https://github.com/versionHQ/)
|
105
105
|
- [Python SDK](https://pypi.org/project/versionhq/)
|
106
106
|
|
@@ -13,7 +13,7 @@ Agentic orchestration framework for multi-agent networks and task graphs for com
|
|
13
13
|
**Visit:**
|
14
14
|
|
15
15
|
- [Playground](https://versi0n.io/)
|
16
|
-
- [
|
16
|
+
- [Documentation](https://docs.versi0n.io)
|
17
17
|
- [Github](https://github.com/versionHQ/)
|
18
18
|
- [Python SDK](https://pypi.org/project/versionhq/)
|
19
19
|
|
@@ -0,0 +1,85 @@
|
|
1
|
+
---
|
2
|
+
tags:
|
3
|
+
- Utilities
|
4
|
+
---
|
5
|
+
|
6
|
+
|
7
|
+
# RAG Tool
|
8
|
+
|
9
|
+
<class>`class` versionhq.tool.rag_tool.<bold>RagTool<bold></class>
|
10
|
+
|
11
|
+
A Pydantic class to store RAG tools that the agent will use when it executes the task.
|
12
|
+
|
13
|
+
|
14
|
+
## Quick Start
|
15
|
+
|
16
|
+
Similar to the `Tool` class, you can run the RAG tool using `url` and `query` variables.
|
17
|
+
|
18
|
+
```python
|
19
|
+
import versionhq as vhq
|
20
|
+
|
21
|
+
rt = vhq.RagTool(
|
22
|
+
url="https://github.com/chroma-core/chroma/issues/3233",
|
23
|
+
query="What is the next action plan?"
|
24
|
+
)
|
25
|
+
res = rt.run()
|
26
|
+
|
27
|
+
assert rt.text is not None # text source from the url
|
28
|
+
assert res is not None
|
29
|
+
```
|
30
|
+
|
31
|
+
|
32
|
+
<hr>
|
33
|
+
|
34
|
+
## Using with Agents
|
35
|
+
|
36
|
+
You can call a specific agent when you run a RAG tool.
|
37
|
+
|
38
|
+
```python
|
39
|
+
import versionhq as vhq
|
40
|
+
|
41
|
+
rt = vhq.RagTool(url="https://github.com/chroma-core/chroma/issues/3233", query="What is the next action plan?")
|
42
|
+
|
43
|
+
agent = vhq.Agent(role="RAG Tool Tester")
|
44
|
+
res = rt.run(agent=agent)
|
45
|
+
|
46
|
+
assert agent.knowledge_sources is not None
|
47
|
+
assert rt.text is not None
|
48
|
+
assert res is not None
|
49
|
+
```
|
50
|
+
|
51
|
+
|
52
|
+
Agents can own RAG tools.
|
53
|
+
|
54
|
+
```python
|
55
|
+
import versionhq as vhq
|
56
|
+
|
57
|
+
rt = vhq.RagTool(url="https://github.com/chroma-core/chroma/issues/3233", query="What is the next action plan?")
|
58
|
+
|
59
|
+
agent = vhq.Agent(role="RAG Tool Tester", tools=[rt]) # adding RAG tool/s
|
60
|
+
task = vhq.Task(description="return a simple response", can_use_agent_tools=True, tool_res_as_final=True)
|
61
|
+
res = task.execute(agent=agent)
|
62
|
+
|
63
|
+
assert res.raw is not None
|
64
|
+
assert res.tool_output is not None
|
65
|
+
```
|
66
|
+
|
67
|
+
|
68
|
+
### Variables
|
69
|
+
|
70
|
+
| <div style="width:160px">**Variable**</div> | **Data Type** | **Default** | **Nullable** | **Description** |
|
71
|
+
| :--- | :--- | :--- | :--- | :--- |
|
72
|
+
| **`api_key_name`** | Optional[str] | None | True | API key name in .env file. |
|
73
|
+
| **`api_endpoint`** | Optional[str] | None | True |API endpoint. |
|
74
|
+
| **`url`** | Optional[str] | None | True | URLs to extract the text source. |
|
75
|
+
| **`headers`** | Optional[Dict[str, Any]] | dict() | - | Request headers |
|
76
|
+
| **`query`** | Optional[str] | None | True | Query. |
|
77
|
+
| **`text`** | Optional[str] | None | True | Text sources extracted from the URL or API call |
|
78
|
+
|
79
|
+
|
80
|
+
### Class Methods
|
81
|
+
|
82
|
+
| <div style="width:120px">**Method**</div> | <div style="width:300px">**Params**</div> | **Returns** | **Description** |
|
83
|
+
| :--- | :--- | :--- | :--- |
|
84
|
+
| **`store_data`** | <p>agent: Optional["vhq.Agent"] = None</p> | None | Stores the retrieved data in the storage. |
|
85
|
+
| **`run`** | *args, **kwargs | Any | Execute the tool. |
|
@@ -0,0 +1,77 @@
|
|
1
|
+
---
|
2
|
+
tags:
|
3
|
+
- Task Graph
|
4
|
+
---
|
5
|
+
|
6
|
+
# Evaluation
|
7
|
+
|
8
|
+
<class>`class` versionhq.task.evaluate.<bold>Evaluation<bold></class>
|
9
|
+
|
10
|
+
A Pydantic class to store conditions and results of the evaluation.
|
11
|
+
|
12
|
+
|
13
|
+
### Variables
|
14
|
+
|
15
|
+
| <div style="width:120px">**Variable**</div> | **Data Type** | **Default** | **Nullable** | **Description** |
|
16
|
+
| :--- | :--- | :--- | :--- | :--- |
|
17
|
+
| **`items`** | List[InstanceOf[EvaluationItem]] | list() | - | Stores evaluation items. |
|
18
|
+
| **`eval_by`** | Any | None | True | Stores the agent that evaluated the output. |
|
19
|
+
|
20
|
+
|
21
|
+
### Property
|
22
|
+
|
23
|
+
| <div style="width:120px">**Property**</div> | **Returns** | **Description** |
|
24
|
+
| :--- | :--- | :--- |
|
25
|
+
| **`aggregate_score`** | float | Calculates weighted average eval scores of the task output. |
|
26
|
+
| **`suggestion_summary`** | str | Returns summary of the suggestions. |
|
27
|
+
|
28
|
+
|
29
|
+
|
30
|
+
<hr>
|
31
|
+
|
32
|
+
## EvaluationItem
|
33
|
+
|
34
|
+
<class>`class` versionhq.task.evaluate.<bold>EvaluationItem<bold></class>
|
35
|
+
|
36
|
+
### Variables
|
37
|
+
|
38
|
+
| <div style="width:120px">**Variable**</div> | **Data Type** | **Default** | **Nullable** | **Description** |
|
39
|
+
| :--- | :--- | :--- | :--- | :--- |
|
40
|
+
| **`criteria`** | str | None | False | Stores evaluation criteria given by the client. |
|
41
|
+
| **`suggestion`** | str | None | True | Stores suggestion on improvement from the evaluator agent. |
|
42
|
+
| **`score`** | float | None | True | Stores the score on a 0 to 1 scale. |
|
43
|
+
|
44
|
+
|
45
|
+
<hr>
|
46
|
+
|
47
|
+
## Usage
|
48
|
+
|
49
|
+
Evaluator agents will evaluate the task output based on the given criteria, and store the results in the `TaskOutput` object.
|
50
|
+
|
51
|
+
|
52
|
+
```python
|
53
|
+
import versionhq as vhq
|
54
|
+
from pydantic import BaseModel
|
55
|
+
|
56
|
+
class CustomOutput(BaseModel):
|
57
|
+
test1: str
|
58
|
+
test2: list[str]
|
59
|
+
|
60
|
+
task = vhq.Task(
|
61
|
+
description="Research a topic to teach a kid aged 6 about math.",
|
62
|
+
pydantic_output=CustomOutput,
|
63
|
+
should_evaluate=True, # triggers evaluation
|
64
|
+
eval_criteria=["uniquness", "audience fit",],
|
65
|
+
|
66
|
+
)
|
67
|
+
res = task.execute()
|
68
|
+
|
69
|
+
assert isinstance(res.evaluation, vhq.Evaluation)
|
70
|
+
assert [item for item in res.evaluation.items if item.criteria == "uniquness" or item.criteria == "audience fit"]
|
71
|
+
assert res.evaluation.aggregate_score is not None
|
72
|
+
assert res.evaluation.suggestion_summary is not None
|
73
|
+
```
|
74
|
+
|
75
|
+
An `Evaluation` object provides scores for the given criteria.
|
76
|
+
|
77
|
+
For example, it might indicate a `uniqueness` score of 0.56, an `audience fit` score of 0.70, and an `aggregate score` of 0.63.
|
@@ -7,7 +7,7 @@ tags:
|
|
7
7
|
|
8
8
|
<class>`class` versionhq.task.model.<bold>Task<bold></class>
|
9
9
|
|
10
|
-
A class to store and manage information for individual tasks, including their assignment to agents or agent networks, and dependencies via a node-based system that tracks conditions and status.
|
10
|
+
A Pydantic class to store and manage information for individual tasks, including their assignment to agents or agent networks, and dependencies via a node-based system that tracks conditions and status.
|
11
11
|
|
12
12
|
Ref. Node / Edge / <a href="/core/task-graph">TaskGraph</a> class
|
13
13
|
|
@@ -0,0 +1,105 @@
|
|
1
|
+
## `Task`
|
2
|
+
|
3
|
+
### Variables
|
4
|
+
|
5
|
+
| <div style="width:160px">**Variable**</div> | **Data Type** | **Default** | **Nullable** | **Description** |
|
6
|
+
| :--- | :--- | :--- | :--- | :--- |
|
7
|
+
| **`id`** | UUID | uuid.uuid4() | False | Stores task `id` as an identifier. |
|
8
|
+
| **`name`** | Optional[str] | None | True | Stores a task name (Inherited as `node` identifier if the task is dependent) |
|
9
|
+
| **`description`** | str | None | False | Required field to store a concise task description |
|
10
|
+
| **`pydantic_output`** | Optional[Type[BaseModel]] | None | True | Stores pydantic custom output class for structured response |
|
11
|
+
| **`response_fields`** | Optional[List[ResponseField]] | list() | True | Stores JSON formats for structured response |
|
12
|
+
| **`tools`** | Optional[List[ToolSet | Tool | Any]] | None | True | Stores tools to be called when the agent executes the task. |
|
13
|
+
| **`can_use_agent_tools`** | bool | True | - | Whether to use the agent tools |
|
14
|
+
| **`tool_res_as_final`** | bool | False | - | Whether to make the tool response a final response from the agent |
|
15
|
+
| **`execution_type`** | TaskExecutionType | TaskExecutionType.SYNC | - | Sync or async execution |
|
16
|
+
| **`allow_delegation`** | bool | False | - | Whether to allow the agent to delegate the task to another agent |
|
17
|
+
| **`callback`** | Optional[Callable] | None | True | Callback function to be executed after LLM calling |
|
18
|
+
| **`callback_kwargs`** | Optional[Dict[str, Any]] | dict() | True | Args for the callback function (if any)|
|
19
|
+
| **`should_evaluate`** | bool | False | - | Whether to evaluate the task output using eval criteria |
|
20
|
+
| **`eval_criteria`** | Optional[List[str]] | list() | True | Evaluation criteria given by the human client |
|
21
|
+
| **`fsls`** | Optional[List[str]] | None | True | Examples of excellent and weak responses |
|
22
|
+
| **`processed_agents`** | Set[str] | set() | True | [Ops] Stores roles of the agents that executed the task |
|
23
|
+
| **`tool_errors`** | int | 0 | True | [Ops] Stores number of tool errors |
|
24
|
+
| **`delegation`** | int | 0 | True | [Ops] Stores number of agent delegations |
|
25
|
+
| **`output`** | Optional[TaskOutput] | None | True | [Ops] Stores `TaskOutput` object after the execution |
|
26
|
+
|
27
|
+
|
28
|
+
### Class Methods
|
29
|
+
|
30
|
+
| <div style="width:120px">**Method**</div> | <div style="width:300px">**Params**</div> | **Returns** | **Description** |
|
31
|
+
| :--- | :--- | :--- | :--- |
|
32
|
+
| **`execute`** | <p>type: TaskExecutionType = None<br>agent: Optional["vhq.Agent"] = None<br>context: Optional[Any] = None</p> | InstanceOf[`TaskOutput`] or None (error) | A main method to handle task execution. Auto-build an agent when the agent is not given. |
|
33
|
+
|
34
|
+
|
35
|
+
### Properties
|
36
|
+
|
37
|
+
| <div style="width:120px">**Property**</div> | **Returns** | **Description** |
|
38
|
+
| :--- | :--- | :--- |
|
39
|
+
| **`key`** | str | Returns task key based on its description and output format. |
|
40
|
+
| **`summary`** | str | Returns a summary of the task based on its id, description and tools. |
|
41
|
+
|
42
|
+
<hr>
|
43
|
+
|
44
|
+
## `TaskOutput`
|
45
|
+
|
46
|
+
### Variables
|
47
|
+
|
48
|
+
| <div style="width:120px">**Variable**</div> | **Data Type** | **Default** | **Nullable** | **Description** |
|
49
|
+
| :--- | :--- | :--- | :--- | :--- |
|
50
|
+
| **`task_id`** | UUID | uuid.uuid4() | False | Stores task `id` as an identifier. |
|
51
|
+
| **`raw`** | str | None | False | Stores response in plain text format. `None` or `""` when the model returned errors.|
|
52
|
+
| **`json_dict`** | Dict[str, Any] | None | False | Stores response in JSON serializable dictionary. When the system failed formatting or executing tasks without response_fields, `{ output: <res.raw> }` will be returned. |
|
53
|
+
| **`pydantic`** | Type[`BaseModel`] | None | True | Populates and stores Pydantic class object defined in the `pydantic_output` field. `None` if `pydantic_output` is NOT given. |
|
54
|
+
| **`tool_output`** | Optional[Any] | None | True | Stores results from the tools of the task or agents ONLY when `tool_res_as_final` set as `True`. |
|
55
|
+
| **`callback_output`** | Optional[Any] | None | True | Stores results from callback functions if any. |
|
56
|
+
| **`latency`** | Optional[float] | None | True | Stores job latency in milliseconds. |
|
57
|
+
| **`evaluation`** | Optional[InstanceOf[`Evaluation`]] | None | True | Stores overall evaluations and usage of the task output. |
|
58
|
+
|
59
|
+
|
60
|
+
### Class Methods
|
61
|
+
|
62
|
+
| <div style="width:120px">**Method**</div> | **Params** | **Returns** | **Description** |
|
63
|
+
| :--- | :--- | :--- | :--- |
|
64
|
+
| **`evaluate`** | task: InstanceOf[`Task`] | InstanceOf[`Evaluation`] | Evaluates task output based on the criteria |
|
65
|
+
|
66
|
+
|
67
|
+
### Property
|
68
|
+
|
69
|
+
| <div style="width:120px">**Property**</div> | **Returns** | **Description** |
|
70
|
+
| :--- | :--- | :--- |
|
71
|
+
| **`aggregate_score`** | float | Calculates weighted average eval scores of the task output. |
|
72
|
+
| **`json_string`** | str | Returns `json_dict` in string format. |
|
73
|
+
|
74
|
+
|
75
|
+
<hr>
|
76
|
+
|
77
|
+
## `Evaluation`
|
78
|
+
|
79
|
+
### Variables
|
80
|
+
|
81
|
+
| <div style="width:120px">**Variable**</div> | **Data Type** | **Default** | **Nullable** | **Description** |
|
82
|
+
| :--- | :--- | :--- | :--- | :--- |
|
83
|
+
| **`items`** | List[InstanceOf[EvaluationItem]] | list() | - | Stores evaluation items. |
|
84
|
+
| **`eval_by`** | Any | None | True | Stores the agent that evaluated the output. |
|
85
|
+
|
86
|
+
|
87
|
+
### Property
|
88
|
+
|
89
|
+
| <div style="width:120px">**Property**</div> | **Returns** | **Description** |
|
90
|
+
| :--- | :--- | :--- |
|
91
|
+
| **`aggregate_score`** | float | Calculates weighted average eval scores of the task output. |
|
92
|
+
| **`suggestion_summary`** | str | Returns summary of the suggestions. |
|
93
|
+
|
94
|
+
|
95
|
+
<hr>
|
96
|
+
|
97
|
+
## `EvaluationItem`
|
98
|
+
|
99
|
+
### Variables
|
100
|
+
|
101
|
+
| <div style="width:120px">**Variable**</div> | **Data Type** | **Default** | **Nullable** | **Description** |
|
102
|
+
| :--- | :--- | :--- | :--- | :--- |
|
103
|
+
| **`criteria`** | str | None | False | Stores evaluation criteria given by the client. |
|
104
|
+
| **`suggestion`** | str | None | True | Stores suggestion on improvement from the evaluator agent. |
|
105
|
+
| **`score`** | float | None | True | Stores the score on a 0 to 1 scale. |
|
@@ -7,7 +7,7 @@ tags:
|
|
7
7
|
|
8
8
|
<class>`class` versionhq.task.model.<bold>TaskOutput<bold></class>
|
9
9
|
|
10
|
-
A Pydantic class to store and manage
|
10
|
+
A Pydantic class to store and manage results of `Task`.
|
11
11
|
|
12
12
|
<hr />
|
13
13
|
|
@@ -18,9 +18,10 @@ A Pydantic class to store and manage response from the `Task` object.
|
|
18
18
|
| **`task_id`** | UUID | uuid.uuid4() | False | Stores task `id` as an identifier. |
|
19
19
|
| **`raw`** | str | None | False | Stores response in plain text format. `None` or `""` when the model returned errors.|
|
20
20
|
| **`json_dict`** | Dict[str, Any] | None | False | Stores response in JSON serializable dictionary. When the system failed formatting or executing tasks without response_fields, `{ output: <res.raw> }` will be returned. |
|
21
|
-
| **`pydantic`** | Type[`BaseModel`] | None | True | Populates and stores Pydantic class defined in the `pydantic_output` field. `None` if `pydantic_output` is NOT given. |
|
21
|
+
| **`pydantic`** | Type[`BaseModel`] | None | True | Populates and stores Pydantic class object defined in the `pydantic_output` field. `None` if `pydantic_output` is NOT given. |
|
22
22
|
| **`tool_output`** | Optional[Any] | None | True | Stores results from the tools of the task or agents ONLY when `tool_res_as_final` set as `True`. |
|
23
23
|
| **`callback_output`** | Optional[Any] | None | True | Stores results from callback functions if any. |
|
24
|
+
| **`latency`** | Optional[float] | None | True | Stores job latency in milliseconds. |
|
24
25
|
| **`evaluation`** | Optional[InstanceOf[`Evaluation`]] | None | True | Stores overall evaluations and usage of the task output. |
|
25
26
|
|
26
27
|
|
@@ -12,7 +12,7 @@ But you can choose to generate Pydantic class or specifig JSON object as respons
|
|
12
12
|
|
13
13
|
`[var]`<bold>`pydantic_output: Optional[Type[BaseModel]] = None`</bold>
|
14
14
|
|
15
|
-
|
15
|
+
Add a `custom Pydantic class` as a structured response format to the `pydantic_output` field.
|
16
16
|
|
17
17
|
The custom class can accept **one layer of a nested child** as you can see in the following code snippet:
|
18
18
|
|
@@ -131,9 +131,12 @@ nav:
|
|
131
131
|
- Executing: 'core/task/task-execution.md'
|
132
132
|
- Outputs: 'core/task/task-output.md'
|
133
133
|
- Evaluating: 'core/task/evaluation.md'
|
134
|
-
- Reference: 'core/task/
|
134
|
+
- Reference: 'core/task/reference.md'
|
135
135
|
- Components:
|
136
136
|
- Tool: 'core/tool.md'
|
137
|
+
- RAG Tool: core/rag-tool.md
|
138
|
+
- Memory: core/memory.md
|
139
|
+
- Knowledge: core/knowledge.md
|
137
140
|
- Archive: 'tags.md'
|
138
141
|
- Cases:
|
139
142
|
- Playground: https://versi0n.io/playground
|
@@ -15,7 +15,7 @@ exclude = ["test*", "__pycache__", "*.egg-info"]
|
|
15
15
|
|
16
16
|
[project]
|
17
17
|
name = "versionhq"
|
18
|
-
version = "1.2.2.
|
18
|
+
version = "1.2.2.10"
|
19
19
|
authors = [{ name = "Kuriko Iwai", email = "kuriko@versi0n.io" }]
|
20
20
|
description = "An agentic orchestration framework for building agent networks that handle task automation."
|
21
21
|
readme = "README.md"
|
@@ -19,6 +19,7 @@ ENDPOINT_PROVIDERS = [
|
|
19
19
|
|
20
20
|
MODELS = {
|
21
21
|
"openai": [
|
22
|
+
"gpt-4.5-preview-2025-02-27",
|
22
23
|
"gpt-4",
|
23
24
|
"gpt-4o",
|
24
25
|
"gpt-4o-mini",
|
@@ -77,13 +78,15 @@ ENV_VARS = {
|
|
77
78
|
}
|
78
79
|
|
79
80
|
|
81
|
+
|
80
82
|
"""
|
81
83
|
Max input token size by the model.
|
82
84
|
"""
|
83
85
|
LLM_CONTEXT_WINDOW_SIZES = {
|
84
|
-
"gpt-4":
|
86
|
+
"gpt-4.5-preview-2025-02-27": 128000,
|
85
87
|
"gpt-4o": 128000,
|
86
88
|
"gpt-4o-mini": 128000,
|
89
|
+
"gpt-4": 8192,
|
87
90
|
"o1-preview": 128000,
|
88
91
|
"o1-mini": 128000,
|
89
92
|
|
@@ -1,6 +1,6 @@
|
|
1
1
|
Metadata-Version: 2.2
|
2
2
|
Name: versionhq
|
3
|
-
Version: 1.2.2.
|
3
|
+
Version: 1.2.2.10
|
4
4
|
Summary: An agentic orchestration framework for building agent networks that handle task automation.
|
5
5
|
Author-email: Kuriko Iwai <kuriko@versi0n.io>
|
6
6
|
License: MIT License
|
@@ -100,7 +100,7 @@ Agentic orchestration framework for multi-agent networks and task graphs for com
|
|
100
100
|
**Visit:**
|
101
101
|
|
102
102
|
- [Playground](https://versi0n.io/)
|
103
|
-
- [
|
103
|
+
- [Documentation](https://docs.versi0n.io)
|
104
104
|
- [Github](https://github.com/versionHQ/)
|
105
105
|
- [Python SDK](https://pypi.org/project/versionhq/)
|
106
106
|
|
@@ -23,6 +23,9 @@ docs/quickstart.md
|
|
23
23
|
docs/tags.md
|
24
24
|
docs/_logos/favicon.ico
|
25
25
|
docs/_logos/logo192.png
|
26
|
+
docs/core/knowledge.md
|
27
|
+
docs/core/memory.md
|
28
|
+
docs/core/rag-tool.md
|
26
29
|
docs/core/tool.md
|
27
30
|
docs/core/agent/config.md
|
28
31
|
docs/core/agent/index.md
|
@@ -34,10 +37,10 @@ docs/core/agent-network/ref.md
|
|
34
37
|
docs/core/llm/index.md
|
35
38
|
docs/core/task/evaluation.md
|
36
39
|
docs/core/task/index.md
|
40
|
+
docs/core/task/reference.md
|
37
41
|
docs/core/task/response-field.md
|
38
42
|
docs/core/task/task-execution.md
|
39
43
|
docs/core/task/task-output.md
|
40
|
-
docs/core/task/task-ref.md
|
41
44
|
docs/core/task/task-strc-response.md
|
42
45
|
docs/core/task-graph/index.md
|
43
46
|
docs/stylesheets/main.css
|
@@ -137,6 +140,7 @@ tests/llm/llm_test.py
|
|
137
140
|
tests/memory/__init__.py
|
138
141
|
tests/memory/memory_test.py
|
139
142
|
tests/task/__init__.py
|
143
|
+
tests/task/doc_eval_test.py
|
140
144
|
tests/task/doc_taskoutput_test.py
|
141
145
|
tests/task/doc_test.py
|
142
146
|
tests/task/eval_test.py
|
@@ -37,7 +37,7 @@ def res_field_task():
|
|
37
37
|
return Task(description="return random values strictly following the given response format.", response_fields=demo_response_fields)
|
38
38
|
|
39
39
|
|
40
|
-
def
|
40
|
+
def test_con_bedrock(simple_task, tool_task, schema_task, res_field_task):
|
41
41
|
llms_to_test = [
|
42
42
|
"bedrock/converse/us.meta.llama3-3-70b-instruct-v1:0",
|
43
43
|
"bedrock/us.meta.llama3-2-11b-instruct-v1:0",
|
@@ -65,3 +65,30 @@ def test_con(simple_task, tool_task, schema_task, res_field_task):
|
|
65
65
|
|
66
66
|
res_4 = res_field_task.execute(agent=agent, context="running a test")
|
67
67
|
assert [v and type(v) == res_field_task.response_fields[i].data_type for i, (k, v) in enumerate(res_4.json_dict.items())]
|
68
|
+
|
69
|
+
|
70
|
+
def test_con_gpt(simple_task, tool_task, schema_task, res_field_task):
|
71
|
+
llms_to_test = [
|
72
|
+
"gpt-4.5-preview-2025-02-27",
|
73
|
+
]
|
74
|
+
agents = [set_agent(llm=llm) for llm in llms_to_test]
|
75
|
+
|
76
|
+
for agent in agents:
|
77
|
+
assert isinstance(agent.llm, LLM)
|
78
|
+
assert agent.llm.provider == "openai"
|
79
|
+
assert agent.llm._init_model_name and agent.llm.provider and agent.llm.llm_config["max_tokens"] == agent.llm_config["max_tokens"]
|
80
|
+
|
81
|
+
res_1 = simple_task.execute(agent=agent, context="running a test")
|
82
|
+
assert res_1.raw is not None
|
83
|
+
|
84
|
+
res_2 = tool_task.execute(agent=agent, context="running a test")
|
85
|
+
assert res_2.tool_output is not None
|
86
|
+
|
87
|
+
res_3 = schema_task.execute(agent=agent, context="running a test")
|
88
|
+
assert [
|
89
|
+
getattr(res_3.pydantic, k) and v.annotation == Demo.model_fields[k].annotation
|
90
|
+
for k, v in res_3.pydantic.model_fields.items()
|
91
|
+
]
|
92
|
+
|
93
|
+
res_4 = res_field_task.execute(agent=agent, context="running a test")
|
94
|
+
assert [v and type(v) == res_field_task.response_fields[i].data_type for i, (k, v) in enumerate(res_4.json_dict.items())]
|
@@ -0,0 +1,23 @@
|
|
1
|
+
def test_eval():
|
2
|
+
import versionhq as vhq
|
3
|
+
from pydantic import BaseModel
|
4
|
+
|
5
|
+
class CustomOutput(BaseModel):
|
6
|
+
test1: str
|
7
|
+
test2: list[str]
|
8
|
+
|
9
|
+
task = vhq.Task(
|
10
|
+
description="Research a topic to teach a kid aged 6 about math.",
|
11
|
+
pydantic_output=CustomOutput,
|
12
|
+
should_evaluate=True, # triggers evaluation
|
13
|
+
eval_criteria=["Uniquness", "Fit to audience",],
|
14
|
+
|
15
|
+
)
|
16
|
+
res = task.execute()
|
17
|
+
|
18
|
+
assert isinstance(res.evaluation, vhq.Evaluation)
|
19
|
+
assert [item for item in res.evaluation.items if item.criteria == "Uniquness" or item.criteria == "Fit to audience"]
|
20
|
+
assert res.evaluation.aggregate_score is not None
|
21
|
+
assert res.evaluation.suggestion_summary is not None
|
22
|
+
|
23
|
+
test_eval()
|
@@ -50,4 +50,4 @@ def test_eval_with_fsls():
|
|
50
50
|
assert [isinstance(item, vhq.EvaluationItem) and item.criteria in task.eval_criteria for item in res.evaluation.items]
|
51
51
|
assert res.latency and res._tokens
|
52
52
|
assert res.evaluation.aggregate_score is not None
|
53
|
-
assert res.evaluation.suggestion_summary
|
53
|
+
assert res.evaluation.suggestion_summary is not None
|
@@ -10,7 +10,7 @@ def test_rag_tool():
|
|
10
10
|
def test_rag_tool_with_agent():
|
11
11
|
import versionhq as vhq
|
12
12
|
|
13
|
-
agent = vhq.Agent(role="
|
13
|
+
agent = vhq.Agent(role="RAG Tool Tester")
|
14
14
|
rt = vhq.RagTool(url="https://github.com/chroma-core/chroma/issues/3233", query="What is the next action plan?")
|
15
15
|
res = rt.run(agent=agent)
|
16
16
|
|
@@ -23,8 +23,9 @@ def test_use_rag_tool():
|
|
23
23
|
import versionhq as vhq
|
24
24
|
|
25
25
|
rt = vhq.RagTool(url="https://github.com/chroma-core/chroma/issues/3233", query="What is the next action plan?")
|
26
|
-
agent = vhq.Agent(role="
|
26
|
+
agent = vhq.Agent(role="RAG Tool Tester", tools=[rt])
|
27
27
|
task = vhq.Task(description="return a simple response", can_use_agent_tools=True, tool_res_as_final=True)
|
28
28
|
res = task.execute(agent=agent)
|
29
29
|
|
30
30
|
assert res.raw is not None
|
31
|
+
assert res.tool_output is not None
|