massgen 0.0.3-py3-none-any.whl → 0.1.0-py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release.



Files changed (268)
  1. massgen/__init__.py +142 -8
  2. massgen/adapters/__init__.py +29 -0
  3. massgen/adapters/ag2_adapter.py +483 -0
  4. massgen/adapters/base.py +183 -0
  5. massgen/adapters/tests/__init__.py +0 -0
  6. massgen/adapters/tests/test_ag2_adapter.py +439 -0
  7. massgen/adapters/tests/test_agent_adapter.py +128 -0
  8. massgen/adapters/utils/__init__.py +2 -0
  9. massgen/adapters/utils/ag2_utils.py +236 -0
  10. massgen/adapters/utils/tests/__init__.py +0 -0
  11. massgen/adapters/utils/tests/test_ag2_utils.py +138 -0
  12. massgen/agent_config.py +329 -55
  13. massgen/api_params_handler/__init__.py +10 -0
  14. massgen/api_params_handler/_api_params_handler_base.py +99 -0
  15. massgen/api_params_handler/_chat_completions_api_params_handler.py +176 -0
  16. massgen/api_params_handler/_claude_api_params_handler.py +113 -0
  17. massgen/api_params_handler/_response_api_params_handler.py +130 -0
  18. massgen/backend/__init__.py +39 -4
  19. massgen/backend/azure_openai.py +385 -0
  20. massgen/backend/base.py +341 -69
  21. massgen/backend/base_with_mcp.py +1102 -0
  22. massgen/backend/capabilities.py +386 -0
  23. massgen/backend/chat_completions.py +577 -130
  24. massgen/backend/claude.py +1033 -537
  25. massgen/backend/claude_code.py +1203 -0
  26. massgen/backend/cli_base.py +209 -0
  27. massgen/backend/docs/BACKEND_ARCHITECTURE.md +126 -0
  28. massgen/backend/{CLAUDE_API_RESEARCH.md → docs/CLAUDE_API_RESEARCH.md} +18 -18
  29. massgen/backend/{GEMINI_API_DOCUMENTATION.md → docs/GEMINI_API_DOCUMENTATION.md} +9 -9
  30. massgen/backend/docs/Gemini MCP Integration Analysis.md +1050 -0
  31. massgen/backend/docs/MCP_IMPLEMENTATION_CLAUDE_BACKEND.md +177 -0
  32. massgen/backend/docs/MCP_INTEGRATION_RESPONSE_BACKEND.md +352 -0
  33. massgen/backend/docs/OPENAI_GPT5_MODELS.md +211 -0
  34. massgen/backend/{OPENAI_RESPONSES_API_FORMAT.md → docs/OPENAI_RESPONSE_API_TOOL_CALLS.md} +3 -3
  35. massgen/backend/docs/OPENAI_response_streaming.md +20654 -0
  36. massgen/backend/docs/inference_backend.md +257 -0
  37. massgen/backend/docs/permissions_and_context_files.md +1085 -0
  38. massgen/backend/external.py +126 -0
  39. massgen/backend/gemini.py +1850 -241
  40. massgen/backend/grok.py +40 -156
  41. massgen/backend/inference.py +156 -0
  42. massgen/backend/lmstudio.py +171 -0
  43. massgen/backend/response.py +1095 -322
  44. massgen/chat_agent.py +131 -113
  45. massgen/cli.py +1560 -275
  46. massgen/config_builder.py +2396 -0
  47. massgen/configs/BACKEND_CONFIGURATION.md +458 -0
  48. massgen/configs/README.md +559 -216
  49. massgen/configs/ag2/ag2_case_study.yaml +27 -0
  50. massgen/configs/ag2/ag2_coder.yaml +34 -0
  51. massgen/configs/ag2/ag2_coder_case_study.yaml +36 -0
  52. massgen/configs/ag2/ag2_gemini.yaml +27 -0
  53. massgen/configs/ag2/ag2_groupchat.yaml +108 -0
  54. massgen/configs/ag2/ag2_groupchat_gpt.yaml +118 -0
  55. massgen/configs/ag2/ag2_single_agent.yaml +21 -0
  56. massgen/configs/basic/multi/fast_timeout_example.yaml +37 -0
  57. massgen/configs/basic/multi/gemini_4o_claude.yaml +31 -0
  58. massgen/configs/basic/multi/gemini_gpt5nano_claude.yaml +36 -0
  59. massgen/configs/{gemini_4o_claude.yaml → basic/multi/geminicode_4o_claude.yaml} +3 -3
  60. massgen/configs/basic/multi/geminicode_gpt5nano_claude.yaml +36 -0
  61. massgen/configs/basic/multi/glm_gemini_claude.yaml +25 -0
  62. massgen/configs/basic/multi/gpt4o_audio_generation.yaml +30 -0
  63. massgen/configs/basic/multi/gpt4o_image_generation.yaml +31 -0
  64. massgen/configs/basic/multi/gpt5nano_glm_qwen.yaml +26 -0
  65. massgen/configs/basic/multi/gpt5nano_image_understanding.yaml +26 -0
  66. massgen/configs/{three_agents_default.yaml → basic/multi/three_agents_default.yaml} +8 -4
  67. massgen/configs/basic/multi/three_agents_opensource.yaml +27 -0
  68. massgen/configs/basic/multi/three_agents_vllm.yaml +20 -0
  69. massgen/configs/basic/multi/two_agents_gemini.yaml +19 -0
  70. massgen/configs/{two_agents.yaml → basic/multi/two_agents_gpt5.yaml} +14 -6
  71. massgen/configs/basic/multi/two_agents_opensource_lmstudio.yaml +31 -0
  72. massgen/configs/basic/multi/two_qwen_vllm_sglang.yaml +28 -0
  73. massgen/configs/{single_agent.yaml → basic/single/single_agent.yaml} +1 -1
  74. massgen/configs/{single_flash2.5.yaml → basic/single/single_flash2.5.yaml} +1 -2
  75. massgen/configs/basic/single/single_gemini2.5pro.yaml +16 -0
  76. massgen/configs/basic/single/single_gpt4o_audio_generation.yaml +22 -0
  77. massgen/configs/basic/single/single_gpt4o_image_generation.yaml +22 -0
  78. massgen/configs/basic/single/single_gpt4o_video_generation.yaml +24 -0
  79. massgen/configs/basic/single/single_gpt5nano.yaml +20 -0
  80. massgen/configs/basic/single/single_gpt5nano_file_search.yaml +18 -0
  81. massgen/configs/basic/single/single_gpt5nano_image_understanding.yaml +17 -0
  82. massgen/configs/basic/single/single_gptoss120b.yaml +15 -0
  83. massgen/configs/basic/single/single_openrouter_audio_understanding.yaml +15 -0
  84. massgen/configs/basic/single/single_qwen_video_understanding.yaml +15 -0
  85. massgen/configs/debug/code_execution/command_filtering_blacklist.yaml +29 -0
  86. massgen/configs/debug/code_execution/command_filtering_whitelist.yaml +28 -0
  87. massgen/configs/debug/code_execution/docker_verification.yaml +29 -0
  88. massgen/configs/debug/skip_coordination_test.yaml +27 -0
  89. massgen/configs/debug/test_sdk_migration.yaml +17 -0
  90. massgen/configs/docs/DISCORD_MCP_SETUP.md +208 -0
  91. massgen/configs/docs/TWITTER_MCP_ENESCINAR_SETUP.md +82 -0
  92. massgen/configs/providers/azure/azure_openai_multi.yaml +21 -0
  93. massgen/configs/providers/azure/azure_openai_single.yaml +19 -0
  94. massgen/configs/providers/claude/claude.yaml +14 -0
  95. massgen/configs/providers/gemini/gemini_gpt5nano.yaml +28 -0
  96. massgen/configs/providers/local/lmstudio.yaml +11 -0
  97. massgen/configs/providers/openai/gpt5.yaml +46 -0
  98. massgen/configs/providers/openai/gpt5_nano.yaml +46 -0
  99. massgen/configs/providers/others/grok_single_agent.yaml +19 -0
  100. massgen/configs/providers/others/zai_coding_team.yaml +108 -0
  101. massgen/configs/providers/others/zai_glm45.yaml +12 -0
  102. massgen/configs/{creative_team.yaml → teams/creative/creative_team.yaml} +16 -6
  103. massgen/configs/{travel_planning.yaml → teams/creative/travel_planning.yaml} +16 -6
  104. massgen/configs/{news_analysis.yaml → teams/research/news_analysis.yaml} +16 -6
  105. massgen/configs/{research_team.yaml → teams/research/research_team.yaml} +15 -7
  106. massgen/configs/{technical_analysis.yaml → teams/research/technical_analysis.yaml} +16 -6
  107. massgen/configs/tools/code-execution/basic_command_execution.yaml +25 -0
  108. massgen/configs/tools/code-execution/code_execution_use_case_simple.yaml +41 -0
  109. massgen/configs/tools/code-execution/docker_claude_code.yaml +32 -0
  110. massgen/configs/tools/code-execution/docker_multi_agent.yaml +32 -0
  111. massgen/configs/tools/code-execution/docker_simple.yaml +29 -0
  112. massgen/configs/tools/code-execution/docker_with_resource_limits.yaml +32 -0
  113. massgen/configs/tools/code-execution/multi_agent_playwright_automation.yaml +57 -0
  114. massgen/configs/tools/filesystem/cc_gpt5_gemini_filesystem.yaml +34 -0
  115. massgen/configs/tools/filesystem/claude_code_context_sharing.yaml +68 -0
  116. massgen/configs/tools/filesystem/claude_code_flash2.5.yaml +43 -0
  117. massgen/configs/tools/filesystem/claude_code_flash2.5_gptoss.yaml +49 -0
  118. massgen/configs/tools/filesystem/claude_code_gpt5nano.yaml +31 -0
  119. massgen/configs/tools/filesystem/claude_code_single.yaml +40 -0
  120. massgen/configs/tools/filesystem/fs_permissions_test.yaml +87 -0
  121. massgen/configs/tools/filesystem/gemini_gemini_workspace_cleanup.yaml +54 -0
  122. massgen/configs/tools/filesystem/gemini_gpt5_filesystem_casestudy.yaml +30 -0
  123. massgen/configs/tools/filesystem/gemini_gpt5nano_file_context_path.yaml +43 -0
  124. massgen/configs/tools/filesystem/gemini_gpt5nano_protected_paths.yaml +45 -0
  125. massgen/configs/tools/filesystem/gpt5mini_cc_fs_context_path.yaml +31 -0
  126. massgen/configs/tools/filesystem/grok4_gpt5_gemini_filesystem.yaml +32 -0
  127. massgen/configs/tools/filesystem/multiturn/grok4_gpt5_claude_code_filesystem_multiturn.yaml +58 -0
  128. massgen/configs/tools/filesystem/multiturn/grok4_gpt5_gemini_filesystem_multiturn.yaml +58 -0
  129. massgen/configs/tools/filesystem/multiturn/two_claude_code_filesystem_multiturn.yaml +47 -0
  130. massgen/configs/tools/filesystem/multiturn/two_gemini_flash_filesystem_multiturn.yaml +48 -0
  131. massgen/configs/tools/mcp/claude_code_discord_mcp_example.yaml +27 -0
  132. massgen/configs/tools/mcp/claude_code_simple_mcp.yaml +35 -0
  133. massgen/configs/tools/mcp/claude_code_twitter_mcp_example.yaml +32 -0
  134. massgen/configs/tools/mcp/claude_mcp_example.yaml +24 -0
  135. massgen/configs/tools/mcp/claude_mcp_test.yaml +27 -0
  136. massgen/configs/tools/mcp/five_agents_travel_mcp_test.yaml +157 -0
  137. massgen/configs/tools/mcp/five_agents_weather_mcp_test.yaml +103 -0
  138. massgen/configs/tools/mcp/gemini_mcp_example.yaml +24 -0
  139. massgen/configs/tools/mcp/gemini_mcp_filesystem_test.yaml +23 -0
  140. massgen/configs/tools/mcp/gemini_mcp_filesystem_test_sharing.yaml +23 -0
  141. massgen/configs/tools/mcp/gemini_mcp_filesystem_test_single_agent.yaml +17 -0
  142. massgen/configs/tools/mcp/gemini_mcp_filesystem_test_with_claude_code.yaml +24 -0
  143. massgen/configs/tools/mcp/gemini_mcp_test.yaml +27 -0
  144. massgen/configs/tools/mcp/gemini_notion_mcp.yaml +52 -0
  145. massgen/configs/tools/mcp/gpt5_nano_mcp_example.yaml +24 -0
  146. massgen/configs/tools/mcp/gpt5_nano_mcp_test.yaml +27 -0
  147. massgen/configs/tools/mcp/gpt5mini_claude_code_discord_mcp_example.yaml +38 -0
  148. massgen/configs/tools/mcp/gpt_oss_mcp_example.yaml +25 -0
  149. massgen/configs/tools/mcp/gpt_oss_mcp_test.yaml +28 -0
  150. massgen/configs/tools/mcp/grok3_mini_mcp_example.yaml +24 -0
  151. massgen/configs/tools/mcp/grok3_mini_mcp_test.yaml +27 -0
  152. massgen/configs/tools/mcp/multimcp_gemini.yaml +111 -0
  153. massgen/configs/tools/mcp/qwen_api_mcp_example.yaml +25 -0
  154. massgen/configs/tools/mcp/qwen_api_mcp_test.yaml +28 -0
  155. massgen/configs/tools/mcp/qwen_local_mcp_example.yaml +24 -0
  156. massgen/configs/tools/mcp/qwen_local_mcp_test.yaml +27 -0
  157. massgen/configs/tools/planning/five_agents_discord_mcp_planning_mode.yaml +140 -0
  158. massgen/configs/tools/planning/five_agents_filesystem_mcp_planning_mode.yaml +151 -0
  159. massgen/configs/tools/planning/five_agents_notion_mcp_planning_mode.yaml +151 -0
  160. massgen/configs/tools/planning/five_agents_twitter_mcp_planning_mode.yaml +155 -0
  161. massgen/configs/tools/planning/gpt5_mini_case_study_mcp_planning_mode.yaml +73 -0
  162. massgen/configs/tools/web-search/claude_streamable_http_test.yaml +43 -0
  163. massgen/configs/tools/web-search/gemini_streamable_http_test.yaml +43 -0
  164. massgen/configs/tools/web-search/gpt5_mini_streamable_http_test.yaml +43 -0
  165. massgen/configs/tools/web-search/gpt_oss_streamable_http_test.yaml +44 -0
  166. massgen/configs/tools/web-search/grok3_mini_streamable_http_test.yaml +43 -0
  167. massgen/configs/tools/web-search/qwen_api_streamable_http_test.yaml +44 -0
  168. massgen/configs/tools/web-search/qwen_local_streamable_http_test.yaml +43 -0
  169. massgen/coordination_tracker.py +708 -0
  170. massgen/docker/README.md +462 -0
  171. massgen/filesystem_manager/__init__.py +21 -0
  172. massgen/filesystem_manager/_base.py +9 -0
  173. massgen/filesystem_manager/_code_execution_server.py +545 -0
  174. massgen/filesystem_manager/_docker_manager.py +477 -0
  175. massgen/filesystem_manager/_file_operation_tracker.py +248 -0
  176. massgen/filesystem_manager/_filesystem_manager.py +813 -0
  177. massgen/filesystem_manager/_path_permission_manager.py +1261 -0
  178. massgen/filesystem_manager/_workspace_tools_server.py +1815 -0
  179. massgen/formatter/__init__.py +10 -0
  180. massgen/formatter/_chat_completions_formatter.py +284 -0
  181. massgen/formatter/_claude_formatter.py +235 -0
  182. massgen/formatter/_formatter_base.py +156 -0
  183. massgen/formatter/_response_formatter.py +263 -0
  184. massgen/frontend/__init__.py +1 -2
  185. massgen/frontend/coordination_ui.py +471 -286
  186. massgen/frontend/displays/base_display.py +56 -11
  187. massgen/frontend/displays/create_coordination_table.py +1956 -0
  188. massgen/frontend/displays/rich_terminal_display.py +1259 -619
  189. massgen/frontend/displays/simple_display.py +9 -4
  190. massgen/frontend/displays/terminal_display.py +27 -68
  191. massgen/logger_config.py +681 -0
  192. massgen/mcp_tools/README.md +232 -0
  193. massgen/mcp_tools/__init__.py +105 -0
  194. massgen/mcp_tools/backend_utils.py +1035 -0
  195. massgen/mcp_tools/circuit_breaker.py +195 -0
  196. massgen/mcp_tools/client.py +894 -0
  197. massgen/mcp_tools/config_validator.py +138 -0
  198. massgen/mcp_tools/docs/circuit_breaker.md +646 -0
  199. massgen/mcp_tools/docs/client.md +950 -0
  200. massgen/mcp_tools/docs/config_validator.md +478 -0
  201. massgen/mcp_tools/docs/exceptions.md +1165 -0
  202. massgen/mcp_tools/docs/security.md +854 -0
  203. massgen/mcp_tools/exceptions.py +338 -0
  204. massgen/mcp_tools/hooks.py +212 -0
  205. massgen/mcp_tools/security.py +780 -0
  206. massgen/message_templates.py +342 -64
  207. massgen/orchestrator.py +1515 -241
  208. massgen/stream_chunk/__init__.py +35 -0
  209. massgen/stream_chunk/base.py +92 -0
  210. massgen/stream_chunk/multimodal.py +237 -0
  211. massgen/stream_chunk/text.py +162 -0
  212. massgen/tests/mcp_test_server.py +150 -0
  213. massgen/tests/multi_turn_conversation_design.md +0 -8
  214. massgen/tests/test_azure_openai_backend.py +156 -0
  215. massgen/tests/test_backend_capabilities.py +262 -0
  216. massgen/tests/test_backend_event_loop_all.py +179 -0
  217. massgen/tests/test_chat_completions_refactor.py +142 -0
  218. massgen/tests/test_claude_backend.py +15 -28
  219. massgen/tests/test_claude_code.py +268 -0
  220. massgen/tests/test_claude_code_context_sharing.py +233 -0
  221. massgen/tests/test_claude_code_orchestrator.py +175 -0
  222. massgen/tests/test_cli_backends.py +180 -0
  223. massgen/tests/test_code_execution.py +679 -0
  224. massgen/tests/test_external_agent_backend.py +134 -0
  225. massgen/tests/test_final_presentation_fallback.py +237 -0
  226. massgen/tests/test_gemini_planning_mode.py +351 -0
  227. massgen/tests/test_grok_backend.py +7 -10
  228. massgen/tests/test_http_mcp_server.py +42 -0
  229. massgen/tests/test_integration_simple.py +198 -0
  230. massgen/tests/test_mcp_blocking.py +125 -0
  231. massgen/tests/test_message_context_building.py +29 -47
  232. massgen/tests/test_orchestrator_final_presentation.py +48 -0
  233. massgen/tests/test_path_permission_manager.py +2087 -0
  234. massgen/tests/test_rich_terminal_display.py +14 -13
  235. massgen/tests/test_timeout.py +133 -0
  236. massgen/tests/test_v3_3agents.py +11 -12
  237. massgen/tests/test_v3_simple.py +8 -13
  238. massgen/tests/test_v3_three_agents.py +11 -18
  239. massgen/tests/test_v3_two_agents.py +8 -13
  240. massgen/token_manager/__init__.py +7 -0
  241. massgen/token_manager/token_manager.py +400 -0
  242. massgen/utils.py +52 -16
  243. massgen/v1/agent.py +45 -91
  244. massgen/v1/agents.py +18 -53
  245. massgen/v1/backends/gemini.py +50 -153
  246. massgen/v1/backends/grok.py +21 -54
  247. massgen/v1/backends/oai.py +39 -111
  248. massgen/v1/cli.py +36 -93
  249. massgen/v1/config.py +8 -12
  250. massgen/v1/logging.py +43 -127
  251. massgen/v1/main.py +18 -32
  252. massgen/v1/orchestrator.py +68 -209
  253. massgen/v1/streaming_display.py +62 -163
  254. massgen/v1/tools.py +8 -12
  255. massgen/v1/types.py +9 -23
  256. massgen/v1/utils.py +5 -23
  257. massgen-0.1.0.dist-info/METADATA +1245 -0
  258. massgen-0.1.0.dist-info/RECORD +273 -0
  259. massgen-0.1.0.dist-info/entry_points.txt +2 -0
  260. massgen/frontend/logging/__init__.py +0 -9
  261. massgen/frontend/logging/realtime_logger.py +0 -197
  262. massgen-0.0.3.dist-info/METADATA +0 -568
  263. massgen-0.0.3.dist-info/RECORD +0 -76
  264. massgen-0.0.3.dist-info/entry_points.txt +0 -2
  265. massgen/backend/{Function calling openai responses.md → docs/Function calling openai responses.md} +0 -0
  266. {massgen-0.0.3.dist-info → massgen-0.1.0.dist-info}/WHEEL +0 -0
  267. {massgen-0.0.3.dist-info → massgen-0.1.0.dist-info}/licenses/LICENSE +0 -0
  268. {massgen-0.0.3.dist-info → massgen-0.1.0.dist-info}/top_level.txt +0 -0
massgen/backend/cli_base.py
@@ -0,0 +1,209 @@
+ # -*- coding: utf-8 -*-
+ """
+ CLI Backend Base Class - Abstract interface for CLI-based LLM backends.
+
+ This module provides the base class for backends that interact with LLM providers
+ through command-line interfaces (like Claude Code CLI, Gemini CLI, etc.).
+ """
+
+ import asyncio
+ import subprocess
+ import tempfile
+ from abc import abstractmethod
+ from pathlib import Path
+ from typing import Any, AsyncGenerator, Dict, List, Optional
+
+ from .base import LLMBackend, StreamChunk, TokenUsage
+
+
+ class CLIBackend(LLMBackend):
+     """Abstract base class for CLI-based LLM backends."""
+
+     def __init__(self, cli_command: str, api_key: Optional[str] = None, **kwargs):
+         super().__init__(api_key, **kwargs)
+         self.cli_command = cli_command
+         self.working_dir = kwargs.get("working_dir", Path.cwd())
+         self.timeout = kwargs.get("timeout", 300)  # 5 minutes default
+
+     @abstractmethod
+     def _build_command(self, messages: List[Dict[str, Any]], tools: List[Dict[str, Any]], **kwargs) -> List[str]:
+         """Build the CLI command to execute.
+
+         Args:
+             messages: Conversation messages
+             tools: Available tools
+             **kwargs: Additional parameters
+
+         Returns:
+             List of command arguments for subprocess
+         """
+
+     @abstractmethod
+     def _parse_output(self, output: str) -> Dict[str, Any]:
+         """Parse CLI output into structured format.
+
+         Args:
+             output: Raw CLI output
+
+         Returns:
+             Parsed response data
+         """
+
+     async def _execute_cli_command(self, command: List[str]) -> str:
+         """Execute CLI command asynchronously.
+
+         Args:
+             command: Command arguments
+
+         Returns:
+             Command output
+
+         Raises:
+             subprocess.CalledProcessError: If command fails
+             asyncio.TimeoutError: If command times out
+         """
+         process = await asyncio.create_subprocess_exec(
+             *command,
+             stdout=asyncio.subprocess.PIPE,
+             stderr=asyncio.subprocess.PIPE,
+             cwd=self.working_dir,
+         )
+
+         try:
+             stdout, stderr = await asyncio.wait_for(process.communicate(), timeout=self.timeout)
+
+             if process.returncode != 0:
+                 error_msg = stderr.decode("utf-8") if stderr else "Unknown error"
+                 raise subprocess.CalledProcessError(process.returncode, command, error_msg)
+
+             return stdout.decode("utf-8")
+
+         except asyncio.TimeoutError as exc:
+             process.kill()
+             await process.wait()
+             raise asyncio.TimeoutError(f"CLI command timed out after {self.timeout} seconds") from exc
+
+     def _create_temp_file(self, content: str, suffix: str = ".txt") -> Path:
+         """Create a temporary file with content.
+
+         Args:
+             content: File content
+             suffix: File suffix
+
+         Returns:
+             Path to temporary file
+         """
+         with tempfile.NamedTemporaryFile(mode="w", suffix=suffix, delete=False) as temp_file:
+             temp_file.write(content)
+         return Path(temp_file.name)
+
+     def _format_messages_for_cli(self, messages: List[Dict[str, Any]]) -> str:
+         """Format messages for CLI input.
+
+         Args:
+             messages: Conversation messages
+
+         Returns:
+             Formatted string for CLI
+         """
+         formatted_parts = []
+
+         for msg in messages:
+             role = msg.get("role", "user")
+             content = msg.get("content", "")
+
+             if role == "system":
+                 formatted_parts.append(f"System: {content}")
+             elif role == "user":
+                 formatted_parts.append(f"User: {content}")
+             elif role == "assistant":
+                 formatted_parts.append(f"Assistant: {content}")
+
+         return "\n\n".join(formatted_parts)
+
+     async def stream_with_tools(self, messages: List[Dict[str, Any]], tools: List[Dict[str, Any]], **kwargs) -> AsyncGenerator[StreamChunk, None]:
+         """Stream response with tools support."""
+         try:
+             # Build CLI command
+             command = self._build_command(messages, tools, **kwargs)
+
+             # Execute command
+             output = await self._execute_cli_command(command)
+
+             # Parse output
+             parsed_response = self._parse_output(output)
+
+             # Convert to stream chunks
+             async for chunk in self._convert_to_stream_chunks(parsed_response):
+                 yield chunk
+
+         except Exception as e:
+             yield StreamChunk(
+                 type="error",
+                 error=f"CLI backend error: {str(e)}",
+                 source=self.__class__.__name__,
+             )
+
+     async def _convert_to_stream_chunks(self, response: Dict[str, Any]) -> AsyncGenerator[StreamChunk, None]:
+         """Convert parsed response to stream chunks.
+
+         Args:
+             response: Parsed response data
+
+         Yields:
+             StreamChunk objects
+         """
+         # Yield content
+         if "content" in response and response["content"]:
+             yield StreamChunk(
+                 type="content",
+                 content=response["content"],
+                 source=self.__class__.__name__,
+             )
+
+         # Yield tool calls if present
+         if "tool_calls" in response and response["tool_calls"]:
+             yield StreamChunk(
+                 type="tool_calls",
+                 tool_calls=response["tool_calls"],
+                 source=self.__class__.__name__,
+             )
+
+         # Yield complete message
+         yield StreamChunk(
+             type="complete_message",
+             complete_message=response,
+             source=self.__class__.__name__,
+         )
+
+         # Yield done
+         yield StreamChunk(type="done", source=self.__class__.__name__)
+
+     def get_token_usage(self) -> TokenUsage:
+         """Get token usage statistics."""
+         # CLI backends typically don't provide detailed token usage
+         # This could be estimated or left as zero
+         return self.token_usage
+
+     def get_cost_per_token(self) -> Dict[str, float]:
+         """Get cost per token for this provider."""
+         # Override in specific implementations
+         return {"input": 0.0, "output": 0.0}
+
+     def get_model_name(self) -> str:
+         """Get the model name being used."""
+         return self.config.get("model", "unknown")
+
+     def get_provider_info(self) -> Dict[str, Any]:
+         """Get provider information."""
+         return {
+             "provider": self.__class__.__name__,
+             "cli_command": self.cli_command,
+             "model": self.get_model_name(),
+             "supports_tools": True,
+             "supports_streaming": True,
+         }
+
+     def get_provider_name(self) -> str:
+         """Get the name of this provider."""
+         return self.__class__.__name__
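A concrete backend only has to supply the two abstract hooks above. The sketch below is hypothetical — the `fake-llm` CLI, its flags, and the JSON it prints are invented for illustration, and a minimal `LLMBackend` stub stands in for `massgen.backend.base` so the snippet is self-contained:

```python
import json
from typing import Any, Dict, List


class LLMBackend:
    """Stub standing in for massgen.backend.base.LLMBackend so the sketch runs standalone."""

    def __init__(self, api_key=None, **kwargs):
        self.config = kwargs


class FakeCLIBackend(LLMBackend):
    """Hypothetical subclass wrapping an imaginary `fake-llm` CLI that prints one JSON object."""

    def _build_command(self, messages: List[Dict[str, Any]], tools: List[Dict[str, Any]], **kwargs) -> List[str]:
        # Flatten the conversation into a single prompt string, mirroring _format_messages_for_cli.
        prompt = "\n\n".join(f"{m['role'].capitalize()}: {m['content']}" for m in messages)
        return ["fake-llm", "--model", self.config.get("model", "unknown"), "--prompt", prompt]

    def _parse_output(self, output: str) -> Dict[str, Any]:
        # Map the CLI's JSON onto the dict shape _convert_to_stream_chunks expects.
        data = json.loads(output)
        return {"content": data.get("text", ""), "tool_calls": data.get("tool_calls", [])}


# Exercising both hooks without spawning a process:
backend = FakeCLIBackend(model="demo")
cmd = backend._build_command([{"role": "user", "content": "hi"}], tools=[])
parsed = backend._parse_output('{"text": "hello", "tool_calls": []}')
```

A real subclass would import `LLMBackend` from `massgen.backend.base` and let the inherited `stream_with_tools` drive these two hooks.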
massgen/backend/docs/BACKEND_ARCHITECTURE.md
@@ -0,0 +1,126 @@
+ # Backend Architecture: Stateful vs Stateless
+
+ ## Overview
+
+ The MassGen backend system supports two distinct architectural patterns for AI model backends: stateless and stateful. Understanding these patterns is crucial for proper agent implementation and state management.
+
+ ## Backend Types
+
+ ### Stateless Backends
+
+ **Examples:** `ChatCompletionBackend`, OpenAI GPT models, Gemini
+
+ **Characteristics:**
+ - No conversation state maintained between requests
+ - Each request is independent and self-contained
+ - Complete context must be provided with every request
+ - No memory of previous interactions
+ - Simpler to scale horizontally
+
+ **Implementation Pattern:**
+ ```python
+ # Each request includes full conversation history
+ # NOTE: Documentation uses .generate() for clarity; actual code uses .stream_with_tools()
+ response = backend.generate(
+     messages=[
+         {"role": "user", "content": "Previous context..."},
+         {"role": "assistant", "content": "Previous response..."},
+         {"role": "user", "content": "Current request..."}
+     ]
+ )
+ ```
+
+ ### Stateful Backends
+
+ **Examples:** `Claude Code CLI`, interactive CLI sessions
+
+ **Characteristics:**
+ - Maintains conversation context across interactions
+ - State persists between requests
+ - Can reference previous interactions without resending context
+ - Requires explicit state management (reset, clear, etc.)
+ - More complex but efficient for long conversations
+
+ **Implementation Pattern:**
+ ```python
+ # Only the current request is needed; context is maintained internally
+ # NOTE: Documentation uses .generate() for clarity; actual code uses .stream_with_tools()
+ response = backend.generate(message="Current request...")
+ ```
+
+ ## Current Agent Implementation Issue
+
+ The current agent implementation assumes all backends are **stateless**, which creates inefficiencies and potential issues:
+
+ ### Problems with Current Approach:
+ 1. **Redundant Context**: Sends complete conversation history to stateful backends
+ 2. **Inefficient Resource Usage**: Wastes bandwidth and processing power
+ 3. **State Confusion**: May conflict with the backend's internal state management
+ 4. **Reset Handling**: Doesn't properly clear stateful backend state on reset
+
+ ## Recommended Solution
+
+ ### 1. Backend Detection
+ Add capability detection to identify backend type:
+
+ ```python
+ class Backend:
+     @property
+     def is_stateful(self) -> bool:
+         """Returns True if backend maintains conversation state"""
+         return False  # Default to stateless
+ ```
+
+ ### 2. Conditional Context Management
+ Adjust message sending based on backend type:
+
+ ```python
+ # NOTE: Documentation uses .generate() for clarity; actual code uses .stream_with_tools()
+ def send_message(self, message: str):
+     if self.backend.is_stateful:
+         # Send only the current message
+         response = self.backend.generate(message)
+     else:
+         # Send the full conversation history
+         response = self.backend.generate(self.get_full_context())
+ ```
+
+ ### 3. Reset Handling
+ Handle resets differently for each backend type:
+
+ ```python
+ # NOTE: Methods shown are conceptual examples, not the current implementation
+ def reset_conversation(self):
+     if self.backend.is_stateful:
+         # Clear the backend's internal state
+         self.backend.reset()
+     else:
+         # Clear the local conversation history
+         self.conversation_history.clear()
+ ```
+
+ ## Implementation Files
+
+ - `base.py` - Base backend interface with `LLMBackend` abstract class
+ - `chat_completions.py` - Stateless ChatCompletion backends (OpenAI-compatible)
+ - `claude_code.py` - **Stateful** Claude Code backend with streaming support
+ - `cli_base.py` - Base CLI backend functionality
+
+ ## Benefits of Proper Implementation
+
+ 1. **Performance**: Reduced context transmission for stateful backends
+ 2. **Reliability**: Proper state management prevents confusion
+ 3. **Scalability**: Optimized resource usage
+ 4. **Consistency**: Uniform behavior across backend types
+ 5. **Maintainability**: Clear separation of concerns
+
+ ## Next Steps
+
+ 1. Add `is_stateful` property to backend interface
+ 2. Update agent logic to detect and handle backend types
+ 3. Implement proper reset mechanisms for both types
+ 4. Add tests for both stateful and stateless scenarios
+ 5. Update documentation for backend developers
+
+ TODO: Clean up the design - StreamChunk has grown complex with many optional fields for different reasoning types and provider-specific features
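The conditional-context pattern sketched in this document can be exercised end-to-end with stub backends. The snippet below is purely illustrative — `StatelessStub`, `StatefulStub`, and `Agent.send` are invented names, not the shipped agent code:

```python
from typing import List


class StatelessStub:
    """Stateless: sees only whatever context the agent resends each call."""
    is_stateful = False

    def generate(self, payload):
        return f"got {len(payload)} messages"


class StatefulStub:
    """Stateful: tracks conversation state itself across calls."""
    is_stateful = True

    def __init__(self):
        self._turns = 0

    def generate(self, payload):
        self._turns += 1
        return f"turn {self._turns}"


class Agent:
    def __init__(self, backend):
        self.backend = backend
        self.history: List[dict] = []

    def send(self, text: str):
        self.history.append({"role": "user", "content": text})
        if self.backend.is_stateful:
            return self.backend.generate(text)        # current message only
        return self.backend.generate(self.history)    # full history every time


a = Agent(StatelessStub())
a.send("one")
out_stateless = a.send("two")   # history now holds 2 messages

b = Agent(StatefulStub())
b.send("one")
out_stateful = b.send("two")    # backend counted 2 turns internally
```

The point of the dispatch is visible in the outputs: the stateless path's payload grows with every turn, while the stateful path transmits a constant-size message.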
@@ -2,14 +2,14 @@
2
2
 
3
3
  ## API Status & Availability (2025)
4
4
 
5
- ✅ **Production Ready**: Claude API is stable and production-ready
6
- ✅ **Active Development**: Regular updates with new features in 2025
7
- ✅ **Strong SDK Support**: Official Python SDK with async/sync support
5
+ ✅ **Production Ready**: Claude API is stable and production-ready
6
+ ✅ **Active Development**: Regular updates with new features in 2025
7
+ ✅ **Strong SDK Support**: Official Python SDK with async/sync support
8
8
 
9
9
  ## Models Available (2025)
10
10
 
11
11
  - **Claude 4 Opus**: Most capable, hybrid with extended thinking mode
12
- - **Claude 4 Sonnet**: Balanced performance, also available to free users
12
+ - **Claude 4 Sonnet**: Balanced performance, also available to free users
13
13
  - **Claude 3.7 Sonnet**: Previous generation, still supported
14
14
  - **Claude 3.5 Haiku**: Fastest, cost-effective option
15
15
 
@@ -17,7 +17,7 @@
17
17
 
18
18
  ### ✅ Excellent Multi-Tool Support
19
19
  **Key Advantage**: Claude can combine ALL tool types in a single request:
20
- - ✅ **Server-side tools** (web search, code execution)
20
+ - ✅ **Server-side tools** (web search, code execution)
21
21
  - ✅ **User-defined functions** (custom tools)
22
22
  - ✅ **File processing** via Files API
23
23
  - ✅ **No restrictions** on combining different tool types
@@ -114,7 +114,7 @@ response = await beta_client.beta.messages.create(
 
  ## Advanced Features (2025)
 
- ### New Beta Features
+ ### New Beta Features
  - **Code execution**: Python sandbox with server-side execution
  - Header: `"anthropic-beta": "code-execution-2025-05-22"`
  - Tool type: `code_execution_20250522`
@@ -144,7 +144,7 @@ response = await beta_client.beta.messages.create(
 
  **Production Readiness:**
  - ✅ Stable API with predictable pricing
- - ✅ No session limits or experimental restrictions
+ - ✅ No session limits or experimental restrictions
  - ✅ Strong error handling and rate limits
 
  ## Implementation Recommendation
@@ -165,7 +165,7 @@ response = await beta_client.beta.messages.create(
 
  ### Suggested Implementation Order:
  1. ✅ OpenAI Backend (completed)
- 2. ✅ Grok Backend (completed)
+ 2. ✅ Grok Backend (completed)
  3. 🎯 **Claude Backend** (recommended next)
  4. ⏳ Gemini Backend (when API supports multi-tools)
 
@@ -175,27 +175,27 @@ response = await beta_client.beta.messages.create(
  class ClaudeBackend(LLMBackend):
      def __init__(self, api_key: Optional[str] = None):
          self.client = anthropic.AsyncAnthropic(api_key=api_key)
-
+
      async def stream_with_tools(self, messages, tools, **kwargs):
          # Can freely combine all tool types
          combined_tools = []
-
-         # Add server-side tools
+
+         # Add server-side tools
          if kwargs.get("enable_web_search"):
              combined_tools.append({"type": "web_search_20250305"})
-
+
          if kwargs.get("enable_code_execution"):
              combined_tools.append({"type": "code_execution_20250522"})
-
+
          # Add user-defined tools
          if tools:
              combined_tools.extend(tools)
-
+
          # Single API call with all tools - USE BETA CLIENT FOR CODE EXECUTION
          headers = {}
          if kwargs.get("enable_code_execution"):
              headers["anthropic-beta"] = "code-execution-2025-05-22"
-
+
          stream = await self.client.beta.messages.create(
              model="claude-3-5-sonnet-20241022",
              messages=messages,
@@ -203,7 +203,7 @@ class ClaudeBackend(LLMBackend):
              headers=headers,
              stream=True
          )
-
+
          async for event in stream:
              yield StreamChunk(...)
  ```
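The tool-combination logic in the hunk above is easy to lift out as a pure helper. Below is a standalone sketch (the function name `combine_claude_tools` is hypothetical, not part of massgen's actual API) that reproduces the combining rules from the backend sketch so they can be checked without an API call:

```python
from typing import List, Optional, Tuple


def combine_claude_tools(
    user_tools: Optional[List[dict]] = None,
    enable_web_search: bool = False,
    enable_code_execution: bool = False,
) -> Tuple[List[dict], dict]:
    """Build the combined tool list and beta headers, mirroring the
    backend sketch above: server-side tools first, then user tools."""
    combined_tools: List[dict] = []
    headers: dict = {}
    if enable_web_search:
        combined_tools.append({"type": "web_search_20250305"})
    if enable_code_execution:
        combined_tools.append({"type": "code_execution_20250522"})
        # Code execution requires the beta header documented above.
        headers["anthropic-beta"] = "code-execution-2025-05-22"
    if user_tools:
        combined_tools.extend(user_tools)
    return combined_tools, headers


tools, headers = combine_claude_tools(
    user_tools=[{"name": "get_weather"}],
    enable_web_search=True,
    enable_code_execution=True,
)
print(tools)
print(headers)
```

Keeping this assembly step separate from the API call makes the "no restrictions on combining tool types" property directly unit-testable.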
@@ -219,13 +219,13 @@ class ClaudeBackend(LLMBackend):
  ### ✅ Tool Execution Pattern
  Claude's code execution is **server-side** - Claude executes the code and streams results back:
  1. Send request with `code_execution_20250522` tool
- 2. Claude generates code and executes it server-side
+ 2. Claude generates code and executes it server-side
  3. Claude streams back execution results automatically
  4. No client-side tool execution needed for code execution tools
 
  ### ✅ Streaming Event Types to Handle
  - `content_block_start`: Tool use begins
- - `content_block_delta`: Tool input streaming
+ - `content_block_delta`: Tool input streaming
  - `input_json_delta`: Tool arguments as JSON fragments
  - Tool execution results are streamed as additional content blocks
 
@@ -12,7 +12,7 @@ The Gemini API provides access to Google's latest generative AI models with mult
  ## Models Available
 
  1. **Gemini 2.5 Pro**: Most powerful thinking model with features for complex reasoning
- 2. **Gemini 2.5 Flash**: Newest multimodal model with next generation features
+ 2. **Gemini 2.5 Flash**: Newest multimodal model with next generation features
  3. **Gemini 2.5 Flash-Lite**: Lighter version
 
  **Note**: Starting April 29, 2025, Gemini 1.5 Pro and Gemini 1.5 Flash models are not available in projects with no prior usage.
@@ -41,7 +41,7 @@ print(response.text)
  ### Synchronous Streaming
  ```python
  for chunk in client.models.generate_content_stream(
-     model='gemini-2.0-flash',
+     model='gemini-2.0-flash',
      contents='Tell me a story in 300 words.'
  ):
      print(chunk.text)
@@ -51,7 +51,7 @@ for chunk in client.models.generate_content_stream(
  ### Asynchronous Streaming
  ```python
  async for chunk in await client.aio.models.generate_content_stream(
-     model='gemini-2.0-flash',
+     model='gemini-2.0-flash',
      contents="Write a cute story about cats."
  ):
      if chunk.text:
@@ -87,7 +87,7 @@ async def async_demo():
  - Allows models to interact with external tools and APIs
  - Three primary use cases:
    1. Augment Knowledge
-   2. Extend Capabilities
+   2. Extend Capabilities
    3. Take Actions
 
  ### Function Call Workflow
@@ -218,7 +218,7 @@ response = client.models.generate_content(
 
  **Response Format:**
  - `text`: Model's explanatory text
- - `executableCode`: Generated Python code
+ - `executableCode`: Generated Python code
  - `codeExecutionResult`: Execution output
  - Access via `response.candidates[0].content.parts`
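Walking those response parts can be sketched as a small splitter. This standalone example uses plain dicts in place of the SDK's part objects (the field names follow the list above; the mock part shapes are assumptions, not the SDK's exact types):

```python
def split_code_execution_parts(parts):
    """Separate a Gemini code-execution response into explanatory text,
    generated code, and execution output, per the response format above."""
    texts, code, results = [], [], []
    for part in parts:
        if "text" in part:
            texts.append(part["text"])
        elif "executableCode" in part:
            code.append(part["executableCode"]["code"])
        elif "codeExecutionResult" in part:
            results.append(part["codeExecutionResult"]["output"])
    return texts, code, results


# Mocked stand-in for response.candidates[0].content.parts
mock_parts = [
    {"text": "I'll sum the first 10 primes."},
    {"executableCode": {"code": "print(sum([2,3,5,7,11,13,17,19,23,29]))"}},
    {"codeExecutionResult": {"output": "129\n"}},
]
print(split_code_execution_parts(mock_parts))
```

A response can interleave several text/code/result parts, so iterating the whole parts list (rather than reading only the first part) is the safe pattern.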
 
@@ -243,7 +243,7 @@ grounding_tool = types.Tool(google_search=types.GoogleSearch())
  config = types.GenerateContentConfig(tools=[grounding_tool])
 
  response = client.models.generate_content(
-     model="gemini-2.5-flash",
+     model="gemini-2.5-flash",
      contents="Latest AI developments in 2025",
      config=config
  )
@@ -345,11 +345,11 @@ client.models.generate_content(...)
  - `function_declarations` only (user-defined tools)
 
  **❌ NOT Supported:**
- - `code_execution` + `function_declarations`
+ - `code_execution` + `function_declarations`
  - `grounding` + `function_declarations`
  - All three tool types together
 
- ### Live API (Preview/Experimental)
+ ### Live API (Preview/Experimental)
  **✅ Multi-Tool Support:**
  - Can combine `google_search` + `code_execution` + `function_declarations`
  - Full flexibility but comes with major limitations
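The standard-API restriction above can be captured in a small validity check before dispatching a request. This is a hypothetical helper for illustration (it encodes only the rule documented here: server-side tools cannot be mixed with user-defined `function_declarations` outside the Live API):

```python
# Server-side tool names as used in this document
SERVER_TOOLS = {"google_search", "code_execution", "grounding"}


def is_valid_standard_api_combo(tool_types):
    """Return True if a set of Gemini tool types is allowed by the
    standard generate_content API per the restrictions listed above."""
    tool_types = set(tool_types)
    has_server = bool(tool_types & SERVER_TOOLS)
    has_functions = "function_declarations" in tool_types
    # Documented restriction: server-side tools cannot be combined
    # with function_declarations in the standard API.
    return not (has_server and has_functions)


print(is_valid_standard_api_combo({"function_declarations"}))
print(is_valid_standard_api_combo({"code_execution", "function_declarations"}))
```

Failing fast on an unsupported combination gives a clearer error than letting the API reject the request.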
@@ -399,7 +399,7 @@ client.models.generate_content(...)
  **Usage Examples:**
  ```python
  # CLI usage
- python -m massgen.cli --backend gemini --model gemini-2.5-flash "Your question"
+ uv run python -m massgen.cli --backend gemini --model gemini-2.5-flash "Your question"
 
  # Configuration
  AgentConfig.create_gemini_config(