vectara-agentic 0.4.8__tar.gz → 0.4.9__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Potentially problematic release.
This version of vectara-agentic might be problematic.
- {vectara_agentic-0.4.8/vectara_agentic.egg-info → vectara_agentic-0.4.9}/PKG-INFO +9 -10
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/README.md +1 -1
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/requirements.txt +7 -8
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/benchmark_models.py +12 -12
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_agent.py +4 -3
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_bedrock.py +12 -12
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_gemini.py +43 -21
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_groq.py +13 -117
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_openai.py +13 -13
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_react_streaming.py +26 -2
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic/_version.py +1 -1
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic/agent.py +18 -29
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic/agent_core/factory.py +11 -4
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic/agent_core/prompts.py +63 -8
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic/agent_core/serialization.py +3 -3
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic/agent_core/streaming.py +10 -15
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic/agent_core/utils/hallucination.py +33 -1
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic/db_tools.py +4 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic/llm_utils.py +54 -1
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic/utils.py +35 -10
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9/vectara_agentic.egg-info}/PKG-INFO +9 -10
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic.egg-info/requires.txt +7 -8
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/LICENSE +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/MANIFEST.in +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/setup.cfg +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/setup.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/__init__.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/conftest.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/endpoint.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/run_tests.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_agent_fallback_memory.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_agent_memory_consistency.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_agent_type.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_api_endpoint.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_fallback.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_private_llm.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_react_error_handling.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_react_memory.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_react_workflow_events.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_return_direct.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_serialization.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_session_memory.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_streaming.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_together.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_tools.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_vectara_llms.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_vhc.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_workflow.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic/__init__.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic/_callback.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic/_observability.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic/agent_config.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic/agent_core/__init__.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic/agent_core/utils/__init__.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic/agent_core/utils/logging.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic/agent_core/utils/schemas.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic/agent_core/utils/tools.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic/agent_endpoint.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic/sub_query_workflow.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic/tool_utils.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic/tools.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic/tools_catalog.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic/types.py +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic.egg-info/SOURCES.txt +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic.egg-info/dependency_links.txt +0 -0
- {vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/vectara_agentic.egg-info/top_level.txt +0 -0
{vectara_agentic-0.4.8/vectara_agentic.egg-info → vectara_agentic-0.4.9}/PKG-INFO

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: vectara_agentic
-Version: 0.4.8
+Version: 0.4.9
 Summary: A Python package for creating AI Assistants and AI Agents with Vectara
 Home-page: https://github.com/vectara/py-vectara-agentic
 Author: Ofer Mendelevitch
@@ -16,21 +16,20 @@ Classifier: Topic :: Software Development :: Libraries :: Python Modules
 Requires-Python: >=3.10
 Description-Content-Type: text/markdown
 License-File: LICENSE
-Requires-Dist: llama-index==0.14.
-Requires-Dist: llama-index-core==0.14.
-Requires-Dist: llama-index-workflows==2.
+Requires-Dist: llama-index==0.14.3
+Requires-Dist: llama-index-core==0.14.3
+Requires-Dist: llama-index-workflows==2.5.0
 Requires-Dist: llama-index-cli==0.5.1
 Requires-Dist: llama-index-indices-managed-vectara==0.5.1
 Requires-Dist: llama-index-llms-openai==0.5.6
 Requires-Dist: llama-index-llms-openai-like==0.5.1
-Requires-Dist: llama-index-llms-anthropic==0.
+Requires-Dist: llama-index-llms-anthropic==0.9.3
 Requires-Dist: llama-index-llms-together==0.4.1
 Requires-Dist: llama-index-llms-groq==0.4.1
 Requires-Dist: llama-index-llms-cohere==0.6.1
-Requires-Dist: llama-index-llms-google-genai==0.5.
-Requires-Dist:
-Requires-Dist:
-Requires-Dist: llama-index-llms-bedrock-converse==0.9.2
+Requires-Dist: llama-index-llms-google-genai==0.5.1
+Requires-Dist: google_genai==1.39.1
+Requires-Dist: llama-index-llms-bedrock-converse==0.9.5
 Requires-Dist: llama-index-tools-yahoo-finance==0.4.1
 Requires-Dist: llama-index-tools-arxiv==0.4.1
 Requires-Dist: llama-index-tools-database==0.4.1
@@ -887,7 +886,7 @@ The `AgentConfig` object may include the following items:
 - `main_llm_provider` and `tool_llm_provider`: the LLM provider for main agent and for the tools. Valid values are `OPENAI`, `ANTHROPIC`, `TOGETHER`, `GROQ`, `COHERE`, `BEDROCK`, `GEMINI` (default: `OPENAI`).

 > **Note:** Fireworks AI support has been removed. If you were using Fireworks, please migrate to one of the supported providers listed above.
-- `main_llm_model_name` and `tool_llm_model_name`: agent model name for agent and tools (default depends on provider: OpenAI uses gpt-4.1-mini, Anthropic uses claude-sonnet-4-
+- `main_llm_model_name` and `tool_llm_model_name`: agent model name for agent and tools (default depends on provider: OpenAI uses gpt-4.1-mini, Anthropic uses claude-sonnet-4-5, Gemini uses models/gemini-2.5-flash, Together.AI uses deepseek-ai/DeepSeek-V3, GROQ uses openai/gpt-oss-20b, Bedrock uses us.anthropic.claude-sonnet-4-20250514-v1:0, Cohere uses command-a-03-2025).
 - `observer`: the observer type; should be `ARIZE_PHOENIX` or if undefined no observation framework will be used.
 - `endpoint_api_key`: a secret key if using the API endpoint option (defaults to `dev-api-key`)

{vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/README.md

@@ -811,7 +811,7 @@ The `AgentConfig` object may include the following items:
 - `main_llm_provider` and `tool_llm_provider`: the LLM provider for main agent and for the tools. Valid values are `OPENAI`, `ANTHROPIC`, `TOGETHER`, `GROQ`, `COHERE`, `BEDROCK`, `GEMINI` (default: `OPENAI`).

 > **Note:** Fireworks AI support has been removed. If you were using Fireworks, please migrate to one of the supported providers listed above.
-- `main_llm_model_name` and `tool_llm_model_name`: agent model name for agent and tools (default depends on provider: OpenAI uses gpt-4.1-mini, Anthropic uses claude-sonnet-4-
+- `main_llm_model_name` and `tool_llm_model_name`: agent model name for agent and tools (default depends on provider: OpenAI uses gpt-4.1-mini, Anthropic uses claude-sonnet-4-5, Gemini uses models/gemini-2.5-flash, Together.AI uses deepseek-ai/DeepSeek-V3, GROQ uses openai/gpt-oss-20b, Bedrock uses us.anthropic.claude-sonnet-4-20250514-v1:0, Cohere uses command-a-03-2025).
 - `observer`: the observer type; should be `ARIZE_PHOENIX` or if undefined no observation framework will be used.
 - `endpoint_api_key`: a secret key if using the API endpoint option (defaults to `dev-api-key`)

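The `AgentConfig` options documented in the README hunk above map onto the configuration object exercised by the test suite later in this diff. As a quick illustration only (not part of the 0.4.9 changes), a config using the updated defaults might look like the sketch below; the `vectara_agentic.types` import location for `AgentType` and the particular provider/model pairing are assumptions based on the test files in this release:

```python
# Illustrative sketch, not part of the 0.4.9 diff.
# Import paths follow tests/test_agent.py in this release; AgentType's location is assumed.
from vectara_agentic.agent_config import AgentConfig
from vectara_agentic.types import AgentType, ModelProvider, ObserverType

config = AgentConfig(
    agent_type=AgentType.REACT,
    main_llm_provider=ModelProvider.ANTHROPIC,
    main_llm_model_name="claude-sonnet-4-5",        # new Anthropic default in 0.4.9
    tool_llm_provider=ModelProvider.GEMINI,
    tool_llm_model_name="models/gemini-2.5-flash",  # Gemini default per the README
    observer=ObserverType.ARIZE_PHOENIX,            # optional; omit to disable observability
)
```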
{vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/requirements.txt

@@ -1,18 +1,17 @@
-llama-index==0.14.
-llama-index-core==0.14.
-llama-index-workflows==2.
+llama-index==0.14.3
+llama-index-core==0.14.3
+llama-index-workflows==2.5.0
 llama-index-cli==0.5.1
 llama-index-indices-managed-vectara==0.5.1
 llama-index-llms-openai==0.5.6
 llama-index-llms-openai-like==0.5.1
-llama-index-llms-anthropic==0.
+llama-index-llms-anthropic==0.9.3
 llama-index-llms-together==0.4.1
 llama-index-llms-groq==0.4.1
 llama-index-llms-cohere==0.6.1
-llama-index-llms-google-genai==0.5.
-
-
-llama-index-llms-bedrock-converse==0.9.2
+llama-index-llms-google-genai==0.5.1
+google_genai==1.39.1
+llama-index-llms-bedrock-converse==0.9.5
 llama-index-tools-yahoo-finance==0.4.1
 llama-index-tools-arxiv==0.4.1
 llama-index-tools-database==0.4.1
{vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/benchmark_models.py

@@ -68,7 +68,7 @@ def validate_api_keys(models_to_test: List[Dict]) -> None:
             missing_keys.append(key)

    if missing_keys:
-        print("
+        print("ERROR: Missing required API keys for benchmark execution:")
         print()
         for key in sorted(missing_keys):
             print(f"  • {key}")
@@ -83,7 +83,7 @@ def validate_api_keys(models_to_test: List[Dict]) -> None:

        sys.exit(1)

-    print("
+    print("All required API keys are present")
     print(f"Found API keys for {len(required_keys)} required environment variables")


@@ -135,7 +135,7 @@ class ModelBenchmark:
         {"provider": ModelProvider.OPENAI, "model": "gpt-5-mini"},
         {"provider": ModelProvider.OPENAI, "model": "gpt-4o-mini"},
         {"provider": ModelProvider.OPENAI, "model": "gpt-4.1-mini"},
-        {"provider": ModelProvider.ANTHROPIC, "model": "claude-sonnet-4-
+        {"provider": ModelProvider.ANTHROPIC, "model": "claude-sonnet-4-5"},
         {"provider": ModelProvider.TOGETHER, "model": "deepseek-ai/DeepSeek-V3"},
         {"provider": ModelProvider.GROQ, "model": "openai/gpt-oss-20b"},
         {"provider": ModelProvider.GEMINI, "model": "models/gemini-2.5-flash-lite"},
@@ -817,11 +817,11 @@ class ModelBenchmark:
         observability_setup = setup_observer(dummy_config, verbose=True)
         if observability_setup:
             print(
-                "
+                "Arize Phoenix observability enabled - LLM calls will be traced\n"
             )
             _observability_initialized = True
         else:
-            print("
+            print("Arize Phoenix observability setup failed\n")

         # Create semaphore to limit concurrent model testing
         model_semaphore = asyncio.Semaphore(self.max_concurrent_models)
@@ -835,7 +835,7 @@ class ModelBenchmark:
             tasks.append(task)

         # Execute all model benchmarks in parallel
-        print("
+        print("Starting parallel benchmark execution...\n")
         await asyncio.gather(*tasks, return_exceptions=True)

     async def _run_model_benchmark(
@@ -857,9 +857,9 @@ class ModelBenchmark:
                     provider, model_name, test_name, test_config
                 )
             except Exception as e:
-                print(f"
+                print(f"Error in {model_name} - {test_name}: {e}")

-        print(f"
+        print(f"Completed: {provider.value} - {model_name}")

     async def _run_scenario_benchmark(
         self,
@@ -892,18 +892,18 @@ class ModelBenchmark:

                 if result.error:
                     print(
-                        f"
+                        f"{model_name}/{test_name} Iteration {iteration_num}: {result.error}"
                     )
                 else:
                     print(
-                        f"
+                        f"{model_name}/{test_name} Iteration {iteration_num}: "
                         f"{result.total_response_time:.2f}s, "
                         f"first token: {result.first_token_latency:.2f}s, "
                         f"{result.tokens_per_second:.1f} chars/sec"
                     )

             except Exception as e:
-                print(f"
+                print(f"{model_name}/{test_name} Iteration {iteration_num}: {e}")
                 # Create error result
                 error_result = BenchmarkResult(
                     model_name=model_name,
@@ -929,7 +929,7 @@ class ModelBenchmark:
         successful = len([r for r in iteration_results if r.error is None])
         success_rate = (successful / len(iteration_results)) * 100
         print(
-            f"
+            f"{model_name}/{test_name} complete: {successful}/{len(iteration_results)} successful ({success_rate:.1f}%)"
         )

         return iteration_results
{vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_agent.py

@@ -13,7 +13,6 @@ from vectara_agentic.agent_config import AgentConfig
 from vectara_agentic.types import ModelProvider, ObserverType
 from vectara_agentic.tools import ToolsFactory

-from vectara_agentic.agent_core.prompts import GENERAL_INSTRUCTIONS
 from conftest import mult, STANDARD_TEST_TOPIC, STANDARD_TEST_INSTRUCTIONS


@@ -54,9 +53,11 @@ class TestAgentPackage(unittest.TestCase):
             + date.today().strftime("%A, %B %d, %Y")
             + " with Always do as your mother tells you!"
         )
+        # Test format_prompt with dummy instructions since we're only testing template substitution
+        dummy_instructions = "Test instructions"
         self.assertEqual(
             format_prompt(
-                prompt_template,
+                prompt_template, dummy_instructions, topic, custom_instructions
             ),
             expected_output,
         )
@@ -83,7 +84,7 @@ class TestAgentPackage(unittest.TestCase):
         config = AgentConfig(
             agent_type=AgentType.REACT,
             main_llm_provider=ModelProvider.ANTHROPIC,
-            main_llm_model_name="claude-sonnet-4-
+            main_llm_model_name="claude-sonnet-4-5",
             tool_llm_provider=ModelProvider.TOGETHER,
             tool_llm_model_name="moonshotai/Kimi-K2-Instruct",
             observer=ObserverType.ARIZE_PHOENIX,
{vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_bedrock.py

@@ -95,9 +95,9 @@ class TestBedrock(unittest.IsolatedAsyncioTestCase):
             "then rephrase that summary as a 10-year-old would explain it."
         )

-        print("\
-        print(f"
-        print("
+        print("\nStarting Claude Sonnet 4 multi-tool chain test (Bedrock)")
+        print(f"Query: {complex_query}")
+        print("Streaming response:\n" + "="*50)

         stream = await agent.astream_chat(complex_query)

@@ -111,33 +111,33 @@ class TestBedrock(unittest.IsolatedAsyncioTestCase):
                 streaming_deltas.append(chunk)
                 full_response += chunk
                 # Display each streaming delta
-                print(f"
+                print(f"Delta: {repr(chunk)}")

                 # Track tool calls in the stream
                 if "mult" in chunk.lower():
                     if "mult" not in [call["tool"] for call in tool_calls_made]:
                         tool_calls_made.append({"tool": "mult", "order": len(tool_calls_made) + 1})
-                        print(f"
+                        print(f"Tool call detected: mult (#{len(tool_calls_made)})")
                 if "add" in chunk.lower():
                     if "add" not in [call["tool"] for call in tool_calls_made]:
                         tool_calls_made.append({"tool": "add", "order": len(tool_calls_made) + 1})
-                        print(f"
+                        print(f"Tool call detected: add (#{len(tool_calls_made)})")
                 if "summarize" in chunk.lower():
                     if "summarize_text" not in [call["tool"] for call in tool_calls_made]:
                         tool_calls_made.append({"tool": "summarize_text", "order": len(tool_calls_made) + 1})
-                        print(f"
+                        print(f"Tool call detected: summarize_text (#{len(tool_calls_made)})")
                 if "rephrase" in chunk.lower():
                     if "rephrase_text" not in [call["tool"] for call in tool_calls_made]:
                         tool_calls_made.append({"tool": "rephrase_text", "order": len(tool_calls_made) + 1})
-                        print(f"
+                        print(f"Tool call detected: rephrase_text (#{len(tool_calls_made)})")

         response = await stream.aget_response()

         print("="*50)
-        print(f"
-        print(f"
+        print(f"Streaming completed. Total deltas: {len(streaming_deltas)}")
+        print(f"Tool calls made: {[call['tool'] for call in tool_calls_made]}")
         print(f"📄 Final response length: {len(response.response)} chars")
-        print(f"
+        print(f"Final response: {response.response}")

         # Validate tool usage sequence
         tools_used = [call["tool"] for call in tool_calls_made]
@@ -154,7 +154,7 @@ class TestBedrock(unittest.IsolatedAsyncioTestCase):
                                  if result in all_text)

         print(f"🔢 Mathematical results found: {math_results_found}/3 expected")
-        print(f"
+        print(f"Full text searched: {all_text[:200]}...")

         # More lenient assertion - just check that some mathematical progress was made
         self.assertGreaterEqual(math_results_found, 1,
{vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_gemini.py

@@ -4,17 +4,20 @@ import warnings
 warnings.simplefilter("ignore", DeprecationWarning)

 import unittest
+import asyncio
+import gc

 from vectara_agentic.agent import Agent
 from vectara_agentic.tools import ToolsFactory
 from vectara_agentic.tools_catalog import ToolsCatalog
+from vectara_agentic.llm_utils import clear_llm_cache


 import nest_asyncio

 nest_asyncio.apply()

-from conftest import (
+from tests.conftest import (
     mult,
     add,
     fc_config_gemini,
@@ -23,8 +26,26 @@ from conftest import (
 )


-class TestGEMINI(unittest.
-    def 
+class TestGEMINI(unittest.IsolatedAsyncioTestCase):
+    def setUp(self):
+        """Set up test fixtures."""
+        super().setUp()
+        # Clear any cached LLM instances before each test
+        clear_llm_cache()
+        # Force garbage collection to clean up any lingering resources
+        gc.collect()
+
+    async def asyncTearDown(self):
+        """Clean up after each test - async version."""
+        await super().asyncTearDown()
+        # Clear cached LLM instances after each test
+        clear_llm_cache()
+        # Force garbage collection
+        gc.collect()
+        # Small delay to allow cleanup
+        await asyncio.sleep(0.01)
+
+    async def test_gemini(self):
         tools = [ToolsFactory().create_tool(mult)]

         agent = Agent(
@@ -33,14 +54,14 @@ class TestGEMINI(unittest.TestCase):
             topic=STANDARD_TEST_TOPIC,
             custom_instructions=STANDARD_TEST_INSTRUCTIONS,
         )
-        _ = agent.
-        _ = agent.
-        res = agent.
+        _ = await agent.achat("What is 5 times 10. Only give the answer, nothing else")
+        _ = await agent.achat("what is 3 times 7. Only give the answer, nothing else")
+        res = await agent.achat(
             "what is the result of multiplying the results of the last two multiplications. Only give the answer, nothing else."
         )
         self.assertIn("1050", res.response)

-    def test_gemini_single_prompt(self):
+    async def test_gemini_single_prompt(self):
         tools = [ToolsFactory().create_tool(mult)]

         agent = Agent(
@@ -49,12 +70,12 @@ class TestGEMINI(unittest.TestCase):
             topic=STANDARD_TEST_TOPIC,
             custom_instructions=STANDARD_TEST_INSTRUCTIONS,
         )
-        res = agent.
+        res = await agent.achat(
             "First, multiply 5 by 10. Then, multiply 3 by 7. Finally, multiply the results of the first two calculations."
         )
         self.assertIn("1050", res.response)

-    def test_gemini_25_flash_multi_tool_chain(self):
+    async def test_gemini_25_flash_multi_tool_chain(self):
         """Test Gemini 2.5 Flash with complex multi-step reasoning chain using multiple tools."""
         # Use Gemini config (Gemini 2.5 Flash)
         tools_catalog = ToolsCatalog(fc_config_gemini)
@@ -77,18 +98,19 @@ class TestGEMINI(unittest.TestCase):
             "Perform this calculation step by step: "
             "First multiply 3 by 8, then add 14 to that result, "
             "then multiply the new result by 3. "
-            "After getting the final number, 
-            "
-            "
+            "After getting the final number, create a text description of the entire mathematical process "
+            "(e.g., 'First I multiplied 3 by 8 to get 24, then added 14 to get 38, then multiplied by 3 to get 114'). "
+            "Then use the summarize_text tool to summarize that text description with expertise in 'mathematics education'. "
+            "Finally, use the rephrase_text tool to rephrase that summary as a 10-year-old would explain it."
         )

-        print("\
-        print(f"
+        print("\nStarting Gemini 2.5 Flash multi-tool chain test")
+        print(f"Query: {complex_query}")

-        # Note: Gemini tests use 
-        response = agent.
+        # Note: Gemini tests now use async chat
+        response = await agent.achat(complex_query)

-        print(f"
+        print(f"Final response: {response.response}")
         print(f"📄 Final response length: {len(response.response)} chars")

         # Check for mathematical results in the response
@@ -98,8 +120,8 @@ class TestGEMINI(unittest.TestCase):
         math_results_found = sum(1 for result in expected_intermediate_results
                                  if result in response_text)

-        print(f"
-        print(f"
+        print(f"Mathematical results found: {math_results_found}/3 expected")
+        print(f"Response text searched: {response_text[:200]}...")

         # More lenient assertion - just check that some mathematical progress was made
         self.assertGreaterEqual(math_results_found, 1,
@@ -110,10 +132,10 @@ class TestGEMINI(unittest.TestCase):
         self.assertGreater(len(response.response.strip()), 50, "Expected substantial response content")

         # Check for indications of multi-tool usage (math, summary, or explanation content)
-        multi_tool_indicators = ["calculate", "
+        multi_tool_indicators = ["calculate", "multipl", "add", "summary", "explain", "mathematical", "process"]
         indicators_found = sum(1 for indicator in multi_tool_indicators
                                if indicator in response_text)
-        self.assertGreaterEqual(indicators_found, 
+        self.assertGreaterEqual(indicators_found, 2,
                                 f"Expected multiple tool usage indicators. Found {indicators_found}: {response.response}")


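The Gemini test hunks above switch from synchronous chat calls to `await agent.achat(...)` and clear the cached LLM clients around each test via `clear_llm_cache()`. Outside a unittest runner, the same pattern is driven by an ordinary event loop; the sketch below is an illustration only (the `mult` tool, topic, and instructions are stand-in values), reusing the `Agent`, `ToolsFactory`, and `clear_llm_cache` interfaces as the test file does:

```python
# Illustrative sketch, not part of the 0.4.9 diff: driving the async Agent API directly.
import asyncio

from vectara_agentic.agent import Agent
from vectara_agentic.tools import ToolsFactory
from vectara_agentic.llm_utils import clear_llm_cache


def mult(x: float, y: float) -> float:
    """Multiply two numbers (stand-in for the conftest helper)."""
    return x * y


async def main() -> None:
    agent = Agent(
        tools=[ToolsFactory().create_tool(mult)],
        topic="arithmetic",                       # stand-in topic
        custom_instructions="Answer concisely.",  # stand-in instructions
    )
    res = await agent.achat("What is 5 times 10? Only give the answer.")
    print(res.response)
    clear_llm_cache()  # release cached LLM clients, mirroring the tests' teardown


asyncio.run(main())
```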
{vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_groq.py

@@ -68,112 +68,8 @@ class TestGROQ(unittest.IsolatedAsyncioTestCase):

         self.assertEqual(response3.response, "1050")

-
-
-        with ARIZE_LOCK:
-            # Create config for GPT-OSS-120B via GROQ
-            gpt_oss_config = AgentConfig(
-                agent_type=AgentType.FUNCTION_CALLING,
-                main_llm_provider=ModelProvider.GROQ,
-                main_llm_model_name="openai/gpt-oss-120b",
-                tool_llm_provider=ModelProvider.GROQ,
-                tool_llm_model_name="openai/gpt-oss-120b",
-            )
-
-            # Create multiple tools for complex reasoning
-            tools_catalog = ToolsCatalog(gpt_oss_config)
-            tools = [
-                ToolsFactory().create_tool(mult),
-                ToolsFactory().create_tool(add),
-                ToolsFactory().create_tool(tools_catalog.summarize_text),
-                ToolsFactory().create_tool(tools_catalog.rephrase_text),
-            ]
-
-            agent = Agent(
-                agent_config=gpt_oss_config,
-                tools=tools,
-                topic=STANDARD_TEST_TOPIC,
-                custom_instructions="You are a mathematical reasoning agent that explains your work step by step.",
-            )
-
-            # Complex multi-step reasoning task
-            complex_query = (
-                "Perform this calculation step by step: "
-                "First multiply 7 by 8, then add 15 to that result, "
-                "then multiply the new result by 3. "
-                "After getting the final number, summarize the entire mathematical process "
-                "with expertise in 'mathematics education', "
-                "then rephrase that summary as a 10-year-old would explain it."
-            )
-
-            print("\n🔍 Starting GPT-OSS-120B multi-tool chain test (GROQ)")
-            print(f"📝 Query: {complex_query}")
-            print("🌊 Streaming response:\n" + "="*50)
-
-            stream = await agent.astream_chat(complex_query)
-
-            # Capture streaming deltas and tool calls
-            streaming_deltas = []
-            tool_calls_made = []
-            full_response = ""
-
-            async for chunk in stream.async_response_gen():
-                if chunk and chunk.strip():
-                    streaming_deltas.append(chunk)
-                    full_response += chunk
-                    # Display each streaming delta
-                    print(f"📡 Delta: {repr(chunk)}")
-
-                    # Track tool calls in the stream
-                    if "mult" in chunk.lower():
-                        if "mult" not in [call["tool"] for call in tool_calls_made]:
-                            tool_calls_made.append({"tool": "mult", "order": len(tool_calls_made) + 1})
-                            print(f"🔧 Tool call detected: mult (#{len(tool_calls_made)})")
-                    if "add" in chunk.lower():
-                        if "add" not in [call["tool"] for call in tool_calls_made]:
-                            tool_calls_made.append({"tool": "add", "order": len(tool_calls_made) + 1})
-                            print(f"🔧 Tool call detected: add (#{len(tool_calls_made)})")
-                    if "summarize" in chunk.lower():
-                        if "summarize_text" not in [call["tool"] for call in tool_calls_made]:
-                            tool_calls_made.append({"tool": "summarize_text", "order": len(tool_calls_made) + 1})
-                            print(f"🔧 Tool call detected: summarize_text (#{len(tool_calls_made)})")
-                    if "rephrase" in chunk.lower():
-                        if "rephrase_text" not in [call["tool"] for call in tool_calls_made]:
-                            tool_calls_made.append({"tool": "rephrase_text", "order": len(tool_calls_made) + 1})
-                            print(f"🔧 Tool call detected: rephrase_text (#{len(tool_calls_made)})")
-
-            response = await stream.aget_response()
-
-            print("="*50)
-            print(f"✅ Streaming completed. Total deltas: {len(streaming_deltas)}")
-            print(f"🔧 Tool calls made: {[call['tool'] for call in tool_calls_made]}")
-            print(f"📄 Final response length: {len(response.response)} chars")
-            print(f"🎯 Final response: {response.response}")
-
-            # Validate tool usage sequence
-            tools_used = [call["tool"] for call in tool_calls_made]
-            print(f"🧪 Tools used in order: {tools_used}")
-
-            # Check that at least multiplication happened (basic requirement)
-            self.assertIn("mult", tools_used, f"Expected multiplication tool to be used. Tools used: {tools_used}")
-
-            # Check for mathematical results in the full response or streaming deltas
-            expected_intermediate_results = ["56", "71", "213"]
-            all_text = (full_response + " " + response.response).lower()
-            math_results_found = sum(1 for result in expected_intermediate_results
-                                     if result in all_text)
-
-            print(f"🔢 Mathematical results found: {math_results_found}/3 expected")
-            print(f"🔍 Full text searched: {all_text[:200]}...")
-
-            # More lenient assertion - just check that some mathematical progress was made
-            self.assertGreaterEqual(math_results_found, 1,
-                                    f"Expected at least 1 mathematical result. Found {math_results_found}. "
-                                    f"Full text: {all_text}")
-
-            # Verify that streaming actually produced content
-            self.assertGreater(len(streaming_deltas), 0, "Expected streaming deltas to be produced")
-            self.assertGreater(len(response.response.strip()), 0, "Expected non-empty final response")
+    # Skipping test_gpt_oss_120b due to model's internal tools conflicting with function calling
+    # GPT-OSS-120B has internal tools like repo_browser.open_file that cause validation errors

     async def test_gpt_oss_20b(self):
         """Test GPT-OSS-20B model with complex multi-step reasoning chain using multiple tools via GROQ."""
@@ -213,9 +109,9 @@ class TestGROQ(unittest.IsolatedAsyncioTestCase):
             "then rephrase that summary as a 10-year-old would explain it."
         )

-        print("\
-        print(f"
-        print("
+        print("\nStarting GPT-OSS-20B multi-tool chain test (GROQ)")
+        print(f"Query: {complex_query}")
+        print("Streaming response:\n" + "="*50)

         stream = await agent.astream_chat(complex_query)

@@ -235,27 +131,27 @@ class TestGROQ(unittest.IsolatedAsyncioTestCase):
                 if "mult" in chunk.lower():
                     if "mult" not in [call["tool"] for call in tool_calls_made]:
                         tool_calls_made.append({"tool": "mult", "order": len(tool_calls_made) + 1})
-                        print(f"
+                        print(f"Tool call detected: mult (#{len(tool_calls_made)})")
                 if "add" in chunk.lower():
                     if "add" not in [call["tool"] for call in tool_calls_made]:
                         tool_calls_made.append({"tool": "add", "order": len(tool_calls_made) + 1})
-                        print(f"
+                        print(f"Tool call detected: add (#{len(tool_calls_made)})")
                 if "summarize" in chunk.lower():
                     if "summarize_text" not in [call["tool"] for call in tool_calls_made]:
                         tool_calls_made.append({"tool": "summarize_text", "order": len(tool_calls_made) + 1})
-                        print(f"
+                        print(f"Tool call detected: summarize_text (#{len(tool_calls_made)})")
                 if "rephrase" in chunk.lower():
                     if "rephrase_text" not in [call["tool"] for call in tool_calls_made]:
                         tool_calls_made.append({"tool": "rephrase_text", "order": len(tool_calls_made) + 1})
-                        print(f"
+                        print(f"Tool call detected: rephrase_text (#{len(tool_calls_made)})")

         response = await stream.aget_response()

         print("="*50)
-        print(f"
-        print(f"
+        print(f"Streaming completed. Total deltas: {len(streaming_deltas)}")
+        print(f"Tool calls made: {[call['tool'] for call in tool_calls_made]}")
         print(f"📄 Final response length: {len(response.response)} chars")
-        print(f"
+        print(f"Final response: {response.response}")

         # Validate tool usage sequence
         tools_used = [call["tool"] for call in tool_calls_made]
@@ -272,7 +168,7 @@ class TestGROQ(unittest.IsolatedAsyncioTestCase):
                                  if result in all_text)

         print(f"🔢 Mathematical results found: {math_results_found}/3 expected")
-        print(f"
+        print(f"Full text searched: {all_text[:200]}...")

         # More lenient assertion - just check that some mathematical progress was made
         self.assertGreaterEqual(math_results_found, 1,
{vectara_agentic-0.4.8 → vectara_agentic-0.4.9}/tests/test_openai.py

@@ -186,9 +186,9 @@ class TestOpenAI(unittest.IsolatedAsyncioTestCase):
             "then rephrase that summary as a 10-year-old would explain it."
         )

-        print("\
-        print(f"
-        print("
+        print("\nStarting GPT-4.1-mini multi-tool chain test (OpenAI)")
+        print(f"Query: {complex_query}")
+        print("Streaming response:\n" + "="*50)

         stream = await agent.astream_chat(complex_query)

@@ -202,33 +202,33 @@ class TestOpenAI(unittest.IsolatedAsyncioTestCase):
                 streaming_deltas.append(chunk)
                 full_response += chunk
                 # Display each streaming delta
-                print(f"
+                print(f"Delta: {repr(chunk)}")

                 # Track tool calls in the stream
                 if "mult" in chunk.lower():
                     if "mult" not in [call["tool"] for call in tool_calls_made]:
                         tool_calls_made.append({"tool": "mult", "order": len(tool_calls_made) + 1})
-                        print(f"
+                        print(f"Tool call detected: mult (#{len(tool_calls_made)})")
                 if "add" in chunk.lower():
                     if "add" not in [call["tool"] for call in tool_calls_made]:
                         tool_calls_made.append({"tool": "add", "order": len(tool_calls_made) + 1})
-                        print(f"
+                        print(f"Tool call detected: add (#{len(tool_calls_made)})")
                 if "summarize" in chunk.lower():
                     if "summarize_text" not in [call["tool"] for call in tool_calls_made]:
                         tool_calls_made.append({"tool": "summarize_text", "order": len(tool_calls_made) + 1})
-                        print(f"
+                        print(f"Tool call detected: summarize_text (#{len(tool_calls_made)})")
                 if "rephrase" in chunk.lower():
                     if "rephrase_text" not in [call["tool"] for call in tool_calls_made]:
                         tool_calls_made.append({"tool": "rephrase_text", "order": len(tool_calls_made) + 1})
-                        print(f"
+                        print(f"Tool call detected: rephrase_text (#{len(tool_calls_made)})")

         response = await stream.aget_response()

         print("="*50)
-        print(f"
-        print(f"
+        print(f"Streaming completed. Total deltas: {len(streaming_deltas)}")
+        print(f"Tool calls made: {[call['tool'] for call in tool_calls_made]}")
         print(f"📄 Final response length: {len(response.response)} chars")
-        print(f"
+        print(f"Final response: {response.response}")

         # Validate tool usage sequence
         tools_used = [call["tool"] for call in tool_calls_made]
@@ -244,8 +244,8 @@ class TestOpenAI(unittest.IsolatedAsyncioTestCase):
         math_results_found = sum(1 for result in expected_intermediate_results
                                  if result in all_text)

-        print(f"
-        print(f"
+        print(f"Mathematical results found: {math_results_found}/3 expected")
+        print(f"Full text searched: {all_text[:200]}...")

         # More lenient assertion - just check that some mathematical progress was made
         self.assertGreaterEqual(math_results_found, 1,