universal-mcp-agents 0.1.15__tar.gz → 0.1.17__tar.gz

This diff shows the content of publicly available package versions as released to a supported registry. It is provided for informational purposes only and reflects the changes between the two versions as they appear in the public registry.

Potentially problematic release: this version of universal-mcp-agents might be problematic.

Files changed (72)
  1. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/PKG-INFO +1 -1
  2. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/bump_and_release.sh +1 -1
  3. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/pyproject.toml +1 -1
  4. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/codeact0/llm_tool.py +2 -1
  5. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/codeact0/playbook_agent.py +52 -41
  6. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/codeact0/state.py +2 -0
  7. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/codeact0/tools.py +12 -6
  8. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/codeact0/utils.py +1 -1
  9. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/utils.py +18 -0
  10. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/uv.lock +1 -1
  11. universal_mcp_agents-0.1.15/test.py +0 -25
  12. universal_mcp_agents-0.1.15/todo.md +0 -157
  13. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/.github/workflows/evals.yml +0 -0
  14. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/.github/workflows/lint.yml +0 -0
  15. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/.github/workflows/release-please.yml +0 -0
  16. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/.github/workflows/tests.yml +0 -0
  17. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/.gitignore +0 -0
  18. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/.pre-commit-config.yaml +0 -0
  19. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/GEMINI.md +0 -0
  20. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/PROMPTS.md +0 -0
  21. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/README.md +0 -0
  22. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/evals/__init__.py +0 -0
  23. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/evals/dataset.py +0 -0
  24. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/evals/datasets/codeact.jsonl +0 -0
  25. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/evals/datasets/exact.jsonl +0 -0
  26. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/evals/datasets/tasks.jsonl +0 -0
  27. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/evals/evaluators.py +0 -0
  28. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/evals/prompts.py +0 -0
  29. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/evals/run.py +0 -0
  30. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/evals/utils.py +0 -0
  31. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/tests/test_agents.py +0 -0
  32. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/__init__.py +0 -0
  33. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/base.py +0 -0
  34. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/bigtool/__init__.py +0 -0
  35. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/bigtool/__main__.py +0 -0
  36. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/bigtool/agent.py +0 -0
  37. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/bigtool/context.py +0 -0
  38. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/bigtool/graph.py +0 -0
  39. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/bigtool/prompts.py +0 -0
  40. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/bigtool/state.py +0 -0
  41. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/bigtool/tools.py +0 -0
  42. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/builder/__main__.py +0 -0
  43. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/builder/builder.py +0 -0
  44. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/builder/helper.py +0 -0
  45. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/builder/prompts.py +0 -0
  46. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/builder/state.py +0 -0
  47. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/cli.py +0 -0
  48. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/codeact/__init__.py +0 -0
  49. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/codeact/__main__.py +0 -0
  50. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/codeact/agent.py +0 -0
  51. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/codeact/models.py +0 -0
  52. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/codeact/prompts.py +0 -0
  53. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/codeact/sandbox.py +0 -0
  54. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/codeact/state.py +0 -0
  55. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/codeact/utils.py +0 -0
  56. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/codeact0/__init__.py +0 -0
  57. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/codeact0/__main__.py +0 -0
  58. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/codeact0/agent.py +0 -0
  59. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/codeact0/config.py +0 -0
  60. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/codeact0/langgraph_agent.py +0 -0
  61. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/codeact0/prompts.py +0 -0
  62. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/codeact0/sandbox.py +0 -0
  63. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/hil.py +0 -0
  64. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/llm.py +0 -0
  65. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/react.py +0 -0
  66. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/shared/__main__.py +0 -0
  67. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/shared/prompts.py +0 -0
  68. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/shared/tool_node.py +0 -0
  69. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/agents/simple.py +0 -0
  70. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/applications/llm/__init__.py +0 -0
  71. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/applications/llm/app.py +0 -0
  72. {universal_mcp_agents-0.1.15 → universal_mcp_agents-0.1.17}/src/universal_mcp/applications/ui/app.py +0 -0
PKG-INFO
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: universal-mcp-agents
-Version: 0.1.15
+Version: 0.1.17
 Summary: Add your description here
 Project-URL: Homepage, https://github.com/universal-mcp/applications
 Project-URL: Repository, https://github.com/universal-mcp/applications
bump_and_release.sh
@@ -9,7 +9,7 @@ uv sync --all-extras
 
 # Run tests with pytest
 echo "Running tests with pytest..."
-uv run pytest
+# uv run pytest
 
 echo "Tests passed!"
 
pyproject.toml
@@ -6,7 +6,7 @@ build-backend = "hatchling.build"
 
 [project]
 name = "universal-mcp-agents"
-version = "0.1.15"
+version = "0.1.17"
 description = "Add your description here"
 readme = "README.md"
 authors = [
src/universal_mcp/agents/codeact0/llm_tool.py
@@ -5,7 +5,7 @@ from typing import Any, Literal, cast
 from langchain.chat_models import init_chat_model
 from langchain_openai import AzureChatOpenAI
 
-from universal_mcp.agents.codeact0.utils import get_message_text
+from universal_mcp.agents.codeact0.utils import get_message_text, light_copy
 
 MAX_RETRIES = 3
 
@@ -27,6 +27,7 @@ def smart_print(data: Any) -> None:
     Args:
        data: Either a dictionary with string keys, or a list of such dictionaries
     """
+    print(light_copy(data))  # noqa
 
 
 def creative_writer(
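Taken together, these two hunks give smart_print its only statement: it now routes data through light_copy before printing. light_copy's body is not shown in this diff; assuming it copies nested data while truncating long string values to MAX_CHARS (consistent with the codeact0/utils.py hunk further down raising MAX_CHARS from 300 to 700), a minimal hypothetical sketch:

    MAX_CHARS = 700  # mirrors the value set in codeact0/utils.py after this release

    def light_copy(data):
        # Hypothetical: truncate long strings so smart_print output stays readable.
        if isinstance(data, str):
            return data if len(data) <= MAX_CHARS else data[:MAX_CHARS] + "..."
        if isinstance(data, dict):
            return {k: light_copy(v) for k, v in data.items()}
        if isinstance(data, list):
            return [light_copy(v) for v in data]
        return data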
src/universal_mcp/agents/codeact0/playbook_agent.py
@@ -25,7 +25,7 @@ from universal_mcp.agents.codeact0.state import CodeActState
 from universal_mcp.agents.codeact0.tools import create_meta_tools, enter_playbook_mode, exit_playbook_mode, get_valid_tools
 from universal_mcp.agents.codeact0.utils import inject_context, smart_truncate
 from universal_mcp.agents.llm import load_chat_model
-from universal_mcp.agents.utils import filter_retry_on, get_message_text
+from universal_mcp.agents.utils import filter_retry_on, get_message_text, convert_tool_ids_to_dict
 
 PLAYBOOK_PLANNING_PROMPT = """Now, you are tasked with creating a reusable playbook from the user's previous workflow.
 
@@ -69,6 +69,7 @@ class CodeActPlaybookAgent(BaseAgent):
         memory: BaseCheckpointSaver | None = None,
         tools: ToolConfig | None = None,
         registry: ToolRegistry | None = None,
+        playbook_registry: object | None = None,
         sandbox_timeout: int = 20,
         **kwargs,
     ):
@@ -82,33 +83,33 @@ class CodeActPlaybookAgent(BaseAgent):
         self.model_instance = load_chat_model(model, thinking=True)
         self.tools_config = tools or []
         self.registry = registry
+        self.playbook_registry = playbook_registry
         self.eval_fn = eval_unsafe
         self.sandbox_timeout = sandbox_timeout
         self.processed_tools: list[StructuredTool | Callable] = []
 
     async def _build_graph(self):
-        self.exported_tools = []
-        if self.tools_config:
-            # Convert dict format to list format if needed
-            if isinstance(self.tools_config, dict):
-                self.tools_config = [
-                    f"{provider}__{tool}"
-                    for provider, tools in self.tools_config.items()
-                    for tool in tools
-                ]
-            if not self.registry:
-                raise ValueError("Tools are configured but no registry is provided")
-            # Langchain tools are fine
-            self.exported_tools = await self.registry.export_tools(self.tools_config, ToolFormat.LANGCHAIN)
         meta_tools = create_meta_tools(self.registry)
-        await self.registry.export_tools(["exa__search_with_filters"], ToolFormat.LANGCHAIN)
         additional_tools = [smart_print, data_extractor, ai_classify, call_llm, meta_tools["web_search"]]
         self.additional_tools = [t if isinstance(t, StructuredTool) else create_tool(t) for t in additional_tools]
-        self.final_instructions, self.tools_context = create_default_prompt(
-            self.exported_tools, self.additional_tools, self.instructions
-        )
-
-        def call_model(state: CodeActState) -> Command[Literal["sandbox", "execute_tools"]]:
+        async def call_model(state: CodeActState) -> Command[Literal["sandbox", "execute_tools"]]:
+            self.exported_tools = []
+            if self.tools_config:
+                # Convert dict format to list format if needed
+                if isinstance(self.tools_config, dict):
+                    self.tools_config = [
+                        f"{provider}__{tool}"
+                        for provider, tools in self.tools_config.items()
+                        for tool in tools
+                    ]
+                if not self.registry:
+                    raise ValueError("Tools are configured but no registry is provided")
+                # Langchain tools are fine
+                self.tools_config.extend(state.get('selected_tool_ids', []))
+                self.exported_tools = await self.registry.export_tools(self.tools_config, ToolFormat.LANGCHAIN)
+            self.final_instructions, self.tools_context = create_default_prompt(
+                self.exported_tools, self.additional_tools, self.instructions
+            )
             messages = [{"role": "system", "content": self.final_instructions}] + state["messages"]
 
             # Run the model and potentially loop for reflection
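For reference, the dict-to-list conversion now performed inside call_model flattens a ToolConfig mapping into provider__tool ids, then appends any ids queued in state["selected_tool_ids"] before exporting. A small illustration with hypothetical provider and tool names (only the provider__tool shape is from the source):

    # Hypothetical config values, showing the flattening done in call_model above.
    tools_config = {"google_mail": ["send_email"], "slack": ["post_message"]}
    flat = [f"{provider}__{tool}" for provider, tools in tools_config.items() for tool in tools]
    # flat == ["google_mail__send_email", "slack__post_message"]
    flat.extend(["exa__search_with_filters"])  # e.g. ids accumulated in state["selected_tool_ids"]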
@@ -262,16 +263,16 @@ class CodeActPlaybookAgent(BaseAgent):
                # Extract plan from response text between triple backticks
                plan_match = re.search(r'```(.*?)```', response_text, re.DOTALL)
                if plan_match:
-                    self.plan = plan_match.group(1).strip()
+                    plan = plan_match.group(1).strip()
                else:
-                    self.plan = response_text.strip()
-                return Command(update={"messages": [response], "playbook_mode": "confirming"})
+                    plan = response_text.strip()
+                return Command(update={"messages": [response], "playbook_mode": "confirming", "plan": plan})
 
 
            elif playbook_mode == "confirming":
                confirmation_instructions = self.instructions + PLAYBOOK_CONFIRMING_PROMPT
                messages = [{"role": "system", "content": confirmation_instructions}] + state["messages"]
-                response = self.model_instance.invoke(messages)
+                response = self.model_instance.invoke(messages, stream=False)
                response = get_message_text(response)
                if "true" in response.lower():
                    return Command(goto="playbook", update={"playbook_mode": "generating"})
@@ -296,24 +297,34 @@ class CodeActPlaybookAgent(BaseAgent):
                else:
                    function_name = "generated_playbook"
 
+                # Save or update an Agent using the helper registry
+                saved_note = ""
                try:
-                    current_path = Path(__file__).resolve()
-                    repo_root = None
-                    for ancestor in current_path.parents:
-                        if ancestor.name == "src":
-                            repo_root = ancestor.parent
-                            break
-                    if repo_root is None:
-                        repo_root = current_path.parents[-1]
-
-                    playbooks_dir = repo_root / "playbooks"
-                    playbooks_dir.mkdir(parents=True, exist_ok=True)
-
-                    file_path = playbooks_dir / f"{function_name}.py"
-                    file_path.write_text(func_code, encoding="utf-8")
-                    saved_note = f"Playbook function saved to: {file_path} ```{func_code}```"
+                    if not self.playbook_registry:
+                        raise ValueError("Playbook registry is not configured")
+
+                    # Build instructions payload embedding the plan and function code
+                    instructions_payload = {
+                        "playbookPlan": state["plan"],
+                        "playbookScript": {
+                            "name": function_name,
+                            "code": func_code,
+                        },
+                    }
+
+                    # Convert tool ids list to dict
+                    tool_dict = convert_tool_ids_to_dict(state["selected_tool_ids"])
+
+                    res = self.playbook_registry.create_agent(
+                        name=function_name,
+                        description=f"Generated playbook: {function_name}",
+                        instructions=instructions_payload,
+                        tools=tool_dict,
+                        visibility="private",
+                    )
+                    saved_note = f"Successfully created your playbook! Check it out here: [View Playbook](https://wingmen.info/agents/{res.id})"
                except Exception as e:
-                    saved_note = f"Failed to save playbook function {function_name}: {e}"
+                    saved_note = f"Failed to save generated playbook as Agent '{function_name}': {e}"
 
                # Mock tool call for exit_playbook_mode (for testing/demonstration)
                mock_exit_tool_call = {
@@ -322,7 +333,7 @@ class CodeActPlaybookAgent(BaseAgent):
                    "id": "mock_exit_playbook_123"
                }
                mock_assistant_message = AIMessage(
-                    content="",  # Can be empty or a brief message
+                    content=saved_note,
                    tool_calls=[mock_exit_tool_call]
                )
src/universal_mcp/agents/codeact0/state.py
@@ -34,3 +34,5 @@ class CodeActState(AgentState):
     """State for the playbook agent."""
     selected_tool_ids: Annotated[list[str], _enqueue]
     """Queue for tools exported from registry"""
+    plan: str | None
+    """Plan for the playbook agent."""
src/universal_mcp/agents/codeact0/tools.py
@@ -4,6 +4,7 @@ from typing import Any
 
 from langchain_core.tools import tool
 from universal_mcp.tools.registry import ToolRegistry
+from universal_mcp.types import ToolFormat
 
 MAX_LENGHT=100
 
@@ -30,8 +31,8 @@ def create_meta_tools(tool_registry: ToolRegistry) -> dict[str, Any]:
        connections = await tool_registry.list_connected_apps()
        connected_apps = {connection["app_id"] for connection in connections}
 
-        # Use defaultdict to avoid key existence checks
-        app_tools = defaultdict(list)
+        app_tools = defaultdict(set)
+        MAX_LENGTH = 20
 
        # Process all queries concurrently
        search_tasks = []
@@ -40,20 +41,24 @@ def create_meta_tools(tool_registry: ToolRegistry) -> dict[str, Any]:
 
        query_results = await asyncio.gather(*search_tasks)
 
-        # Aggregate results with limit per app
+        # Aggregate results with limit per app and automatic deduplication
        for tools_list in query_results:
            for tool in tools_list:
                app = tool["id"].split("__")[0]
-                if len(app_tools[app]) < MAX_LENGHT:
+                tool_id = tool["id"]
+
+                # Check if within limit and add to set (automatically deduplicates)
+                if len(app_tools[app]) < MAX_LENGTH:
                    cleaned_desc = tool["description"].split("Context:")[0].strip()
-                    app_tools[app].append(f"{tool['id']}: {cleaned_desc}")
+                    app_tools[app].add(f"{tool_id}: {cleaned_desc}")
 
        # Build result string efficiently
        result_parts = []
        for app, tools in app_tools.items():
            app_status = "connected" if app in connected_apps else "NOT connected"
            result_parts.append(f"Tools from {app} (status: {app_status} by user):")
-            for tool in tools:
+            # Convert set to sorted list for consistent output
+            for tool in sorted(tools):
                result_parts.append(f" - {tool}")
            result_parts.append("")  # Empty line between apps
 
@@ -116,6 +121,7 @@ def create_meta_tools(tool_registry: ToolRegistry) -> dict[str, Any]:
        Example:
            results = await web_search(query="python programming")
        """
+        await tool_registry.export_tools(["exa__search_with_filters"], ToolFormat.LANGCHAIN)
        response = await tool_registry.call_tool(
            "exa__search_with_filters", {"query": query, "contents": {"summary": True}}
        )
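The list-to-set switch matters because the same tool can match several search queries: a set makes repeated adds a no-op, and sorted() restores deterministic ordering for the rendered output (note the per-app cap also moves from the misspelled module-level MAX_LENGHT = 100 to a local MAX_LENGTH = 20). The mechanism in isolation, with made-up tool ids:

    from collections import defaultdict

    app_tools = defaultdict(set)
    for tool_id in ["exa__search", "exa__search", "exa__find_similar"]:
        app_tools[tool_id.split("__")[0]].add(tool_id)  # duplicate adds are no-ops
    print(sorted(app_tools["exa"]))  # ['exa__find_similar', 'exa__search']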
src/universal_mcp/agents/codeact0/utils.py
@@ -6,7 +6,7 @@ from typing import Any
 
 from langchain_core.messages import BaseMessage
 
-MAX_CHARS = 300
+MAX_CHARS = 700
 
 
 def light_copy(data):

src/universal_mcp/agents/utils.py
@@ -214,3 +214,21 @@ def filter_retry_on(exc: Exception) -> bool:
 
    # Default: do not retry unknown exceptions
    return False
+
+
+def convert_tool_ids_to_dict(tool_ids: list[str]) -> dict[str, list[str]]:
+    """Convert list of tool ids like 'provider__tool' into a provider->tools dict.
+
+    Any ids without the expected delimiter are ignored.
+    """
+    provider_to_tools: dict[str, list[str]] = {}
+    for tool_id in tool_ids or []:
+        if "__" not in tool_id:
+            continue
+        provider, tool = tool_id.split("__", 1)
+        if not provider or not tool:
+            continue
+        if provider not in provider_to_tools:
+            provider_to_tools[provider] = []
+        provider_to_tools[provider].append(tool)
+    return provider_to_tools
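The new helper inverts the provider__tool naming convention used throughout the registry. With hypothetical ids:

    convert_tool_ids_to_dict(
        ["google_mail__send_email", "google_mail__list_messages", "slack__post_message", "malformed"]
    )
    # -> {"google_mail": ["send_email", "list_messages"], "slack": ["post_message"]}
    # "malformed" is dropped: it lacks the "__" delimiter.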
uv.lock
@@ -4925,7 +4925,7 @@ wheels = [
 
 [[package]]
 name = "universal-mcp-agents"
-version = "0.1.14"
+version = "0.1.16"
 source = { editable = "." }
 dependencies = [
    { name = "langchain-anthropic" },
universal_mcp_agents-0.1.15/test.py (deleted)
@@ -1,25 +0,0 @@
-import asyncio
-import time
-
-from loguru import logger
-
-from universal_mcp.agents import get_agent
-
-
-async def main():
-    agent_cls = get_agent("simple")
-    start_time = time.time()
-    agent = agent_cls(
-        name="Simple Agent",
-        instructions="You are a simple agent that can answer questions.",
-        model="anthropic/claude-sonnet-4-20250514",
-    )
-    logger.info(f"Time building agent taken: {time.time() - start_time} seconds")
-    await agent.ainit()
-    logger.info(f"Time initializing agent taken: {time.time() - start_time} seconds")
-    result = await agent.invoke(user_input="What is the capital of France?")
-    logger.info(f"Time invoking agent taken: {time.time() - start_time} seconds")
-
-
-if __name__ == "__main__":
-    asyncio.run(main())
universal_mcp_agents-0.1.15/todo.md (deleted)
@@ -1,157 +0,0 @@
-# Design Decisions and Learnings
-
-This section documents the key architectural decisions and learnings guiding the evolution of the `UnifiedAgent`. It will be updated continuously as the project progresses.
-
-## Core Design Shift: Embracing a Hybrid, Code-Centric Architecture
-
-Our primary design decision is to evolve the `UnifiedAgent` from a rigid, structured tool-calling system into a powerful hybrid agent. This new architecture combines the dynamic tool management of the original `UnifiedAgent` with the expressive, direct code execution model pioneered by `smolagents`.
-
-### Clarification: `execute_ipython_cell` vs. Direct Code Execution
-
-It's important to clarify the distinction between the current `execute_ipython_cell` tool and the proposed direct code execution model. Both execute code, but the architectural difference is profound:
-
-- **Current Model (Structured Call):** The LLM's native output is a structured command (a JSON object) to call the `execute_ipython_cell` tool. The code to be executed is merely a *string argument* within that command. This creates a layer of indirection and treats code execution as just another rigid tool.
-- **Proposed Model (Direct Output):** The LLM's native output *is the code itself*. The agent's framework is responsible for parsing this code directly from the response and executing it. This makes code the agent's primary mode of expression.
-
-The key shift is to move from treating code execution as a constrained, structured tool call to making it the **agent's native output**. This is the foundation for unlocking greater power and statefulness.
-
-### Why We Are Moving to a Code-Centric Approach
-
-The initial structured-call model is safe and predictable but severely limits the agent's capabilities. By shifting to a model where the LLM's primary output is a Python code snippet, we unlock several key advantages that are critical for a general-purpose agent:
-
-1. **Unlocks True Power and Composability:** Direct code output allows the agent to write scripts that use loops, conditionals, variables, and error handling. It can chain multiple tool calls together in a single, logical action, which is impossible with a one-tool-per-turn structured model. This moves the agent from a simple "tool caller" to a true "problem solver."
-
-2. **Increases Efficiency and Reduces Cost:** By accomplishing more in a single turn, the agent requires fewer back-and-forth interactions with the LLM. This directly translates to lower latency (faster task completion) and reduced operational costs (fewer API calls).
-
-3. **Enables On-the-Fly Data Manipulation:** An agent that thinks in code can process and transform data between tool calls without needing an extra LLM turn. It can reformat strings, perform calculations, and filter lists as part of a single, coherent script.
-
-4. **Leverages the Core Strengths of Modern LLMs:** State-of-the-art models are exceptionally proficient at code generation. This approach allows the agent to "think" in Python, a language it excels at, rather than constraining it to a rigid JSON schema.
-
-### The Hybrid "Planner-Executor" Model and the Role of a Stateful Executor
-
-The goal is a hybrid "Planner-Executor" model. The Planner (the `UnifiedAgent`'s graph) sets high-level strategic goals, and the Executor (`smolagents`-style code generation) writes powerful scripts to accomplish each goal.
-
-This model **only works** if the execution environment is **stateful**. A variable or function defined in one Executor script (to complete step 1 of the plan) *must* be available to the next Executor script (to complete step 2).
-
-This is why **Step 4 and 5** of the plan are critical. We will replace the current stateless `Sandbox` with the **stateful `LocalPythonExecutor`** from `smolagents`. This executor acts like a persistent Python session, maintaining its memory across multiple turns of the Planner's graph. It is the glue that connects the Planner's strategic steps, allowing the agent to build complex solutions over time.
-
-### The Hybrid Toolset: Achieving Efficiency and Generality via Prompting
-
-A key challenge is balancing the power of dynamic tool discovery with the efficiency needed for simple tasks. A pure "Search, Load, Execute" workflow is too slow for common requests (e.g., "What is 2+2?").
-
-**Our Solution:** We will implement a **hybrid toolset** model, controlled entirely through prompting to keep the architecture simple.
-
-1. **Pre-loaded Default Tools:** The agent will start with a small set of essential, always-available tools (e.g., `execute_ipython_cell`, `web_search`). These will be pre-loaded into the executor's environment.
-
-2. **Prompt-Driven Logic:** The system prompt will be structured to guide the LLM's reasoning. It will instruct the agent to **always attempt to solve the task using the default tools first**.
-
-3. **Fallback to Discovery:** The prompt will specify that the agent should **only use the `search_tools` meta-tool if it determines its default toolkit is insufficient** for the task at hand.
-
-This creates an efficient "fast path" for the majority of tasks while retaining the full power of dynamic tool discovery for specialized requests, all without adding complexity to the agent's graph architecture.
-
----
-
-# Comparison: Unified Agent vs. Smolagents' CodeAgent
-
-This document outlines the major differences, pros, and cons between the `unified` agent and the `CodeAgent` from the `smolagents` library.
-
-## 1. Core Architecture
-
-### Unified Agent
-- **Architecture:** State machine implemented with `langgraph`.
-- **Execution Model:** Relies on predefined tool calls (`search_tools`, `load_tools`, `execute_ipython_cell`). The LLM's output is a structured tool call.
-- **Flexibility:** Less flexible. The agent is constrained to the predefined tools and the logic of the graph.
-
-### Smolagents' CodeAgent
-- **Architecture:** ReAct-style loop (Reason, Act, Observe).
-- **Execution Model:** The LLM generates Python code directly, which is then executed in a `PythonExecutor`. This is a more direct and powerful approach.
-- **Flexibility:** Highly flexible. The agent can generate any Python code, allowing it to perform complex computations, define functions, and chain operations without being limited to a predefined set of tools.
-
-## 2. Prompting
-
-### Unified Agent
-- **Prompting:** A single, monolithic prompt that instructs the LLM on how to use the available tools.
-- **Few-shot Examples:** Lacks few-shot examples, which can make it harder for the LLM to understand the expected output format.
-
-### Smolagents' CodeAgent
-- **Prompting:** Uses a YAML file for prompts, which is cleaner and easier to manage.
-- **Few-shot Examples:** Incorporates few-shot examples to guide the LLM, which generally improves the quality and reliability of the generated code.
-
-## 3. Code Execution
-
-### Unified Agent
-- **Sandbox:** Uses a simple `Sandbox` to execute Python code snippets via `execute_ipython_cell`.
-- **State Management:** State is managed within the `langgraph` state, but the code execution environment is stateless between `execute_ipython_cell` calls unless explicitly handled.
-
-### Smolagents' CodeAgent
-- **Executor:** Uses a more advanced `PythonExecutor` that can be local, or remote (Docker, E2B).
-- **State Management:** The `PythonExecutor` maintains state between code executions, allowing the agent to define variables and functions that persist across turns.
-
-## 4. Pros and Cons
-
-### Unified Agent
-**Pros:**
-- **Structured:** The state machine provides a clear and predictable execution flow.
-- **Safe:** The agent is limited to a predefined set of tools, which can be seen as a safety feature.
-
-**Cons:**
-- **Rigid:** The predefined toolset and graph structure make it less flexible.
-- **Complex:** The `langgraph` implementation can be complex to understand and modify.
-- **Less Powerful:** The agent's capabilities are limited by the available tools.
-
-### Smolagents' CodeAgent
-**Pros:**
-- **Flexible:** The code-centric approach allows the agent to solve a wider range of tasks.
-- **Powerful:** The agent can leverage the full power of Python, including defining functions, classes, and using third-party libraries.
-- **Extensible:** Easy to add new tools by simply making them available in the Python execution environment.
-
-**Cons:**
-- **Less Structured:** The ReAct loop is less structured than a state machine, which can make it harder to debug.
-- **Security:** Executing arbitrary code from an LLM is a security risk, although this is mitigated by the use of sandboxed environments.
-
-## TODO: A Multi-Step Plan for a Hybrid Agent
-
-This plan outlines a series of small, atomic changes to evolve the `UnifiedAgent` into a more powerful hybrid agent that combines the structured, dynamic tool management of `langgraph` with the expressive code execution of `smolagents`.
-
-### Phase 1: Foundational Changes - Shifting to Code Generation
-
-- [x] **1. Externalize Prompting:**
-  - [x] Create a `prompts.yaml` file in `src/universal_mcp/agents/unified/`.
-  - [x] Move the `SYSTEM_PROMPT` from `prompts.py` into `prompts.yaml`.
-  - [x] Update `UnifiedAgent` in `agent.py` to load its system prompt from the new YAML file. This separates configuration from code and aligns with `smolagents` best practices.
-
-- [x] **2. Introduce a Code-Centric Prompt:**
-  - [x] In `prompts.yaml`, modify the system prompt to instruct the LLM to output Python code directly, using Markdown syntax for code blocks (e.g., ` ```python ... ``` `).
-  - [x] Add one or two few-shot examples to the prompt, demonstrating how to call a meta-tool (like `web_search`) directly within a code block.
-
-- [ ] **3. Adapt Graph for Direct Code Output:**
-  - [ ] In `graph.py`, modify the `agent_node` to parse the Python code block from the LLM's raw text response instead of expecting a structured `tool_calls` attribute.
-  - [ ] The output of the `agent_node` should now be the extracted code snippet (as a string).
-  - [ ] Update the `execute_tools_node` to wrap this code snippet into a call to the `execute_ipython_cell` tool. This maintains the existing execution logic while adapting to the new code-centric output from the LLM.
-
-### Phase 2: Integrating a More Powerful and Stateful Executor
-
-- [ ] **4. Replace Sandbox with `LocalPythonExecutor`:**
-  - [ ] Integrate `smolagents.local_python_executor.LocalPythonExecutor` into the `UnifiedAgent`.
-  - [ ] In `agent.py`, replace the `Sandbox` instance with a persistent `LocalPythonExecutor` instance.
-  - [ ] In `graph.py`, modify the `execute_tools_node` to call the `python_executor` directly with the code snippet.
-  - [ ] Remove the `execute_ipython_cell` meta-tool, as the executor now handles code execution directly.
-
-- [ ] **5. Enable Stateful Execution Across Turns:**
-  - [ ] Ensure the `LocalPythonExecutor` instance on the `UnifiedAgent` is the same one used across all steps in the graph.
-  - [ ] Verify that variables, functions, and imports defined in one turn are accessible in subsequent turns within the same `invoke` session.
-
-### Phase 3: Enhancing Tool Management and Workflow
-
-- [ ] **6. Dynamic Tool Injection into the Executor:**
-  - [ ] Modify the `load_tools` logic in `graph.py`. When a tool is loaded, inject it directly into the `python_executor`'s namespace (both static and custom tools).
-  - [ ] Update the system prompt in `prompts.yaml` to dynamically list the function signatures of the currently available tools in the executor's environment, making the LLM aware of what it can call.
-
-- [ ] **7. Simplify the Graph (Optional but Recommended):**
-  - [ ] Evaluate the necessity of the `validate_code_node`. Since `LocalPythonExecutor` can catch and report syntax errors, this validation might be redundant. Consider removing it and handling errors directly in the `execute_tools_node`.
-  - [ ] Consider merging the `format_final_answer_node`'s logic into the main agent prompt, instructing the agent to provide a user-friendly summary before calling `finish()`.
-
-- [ ] **8. Integrate Optional Planning:**
-  - [ ] Add a `planning_interval` parameter to the `UnifiedAgent`.
-  - [ ] Introduce a new `planning_node` in the graph that generates or updates a high-level plan.
-  - [ ] Add a conditional edge that routes the graph to the `planning_node` every N steps, injecting the plan back into the agent's state.
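The deleted todo.md argues for parsing code directly from the LLM's raw response and executing it in a stateful environment. A rough sketch of that idea under stated assumptions (regex extraction of a fenced block, and a module-level namespace standing in for smolagents' LocalPythonExecutor; the names are illustrative, not code from this package):

    import re

    _SESSION_NS: dict = {}  # persists across turns: the "stateful executor" idea

    def run_llm_code(response_text: str) -> None:
        # Extract the first ```python ... ``` block from the raw LLM response.
        match = re.search(r"```python\n(.*?)```", response_text, re.DOTALL)
        if match is None:
            raise ValueError("no code block in response")
        exec(match.group(1), _SESSION_NS)  # names defined here survive to the next call

Because _SESSION_NS is reused as the globals dict, a variable defined by one extracted script is visible to the next, which is the property the document says the Planner-Executor model depends on.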