massgen 0.1.2__py3-none-any.whl → 0.1.3__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- massgen/__init__.py +1 -1
- massgen/agent_config.py +33 -7
- massgen/api_params_handler/_api_params_handler_base.py +3 -0
- massgen/backend/azure_openai.py +9 -1
- massgen/backend/base.py +4 -0
- massgen/backend/claude_code.py +9 -1
- massgen/backend/gemini.py +35 -6
- massgen/backend/gemini_utils.py +30 -0
- massgen/chat_agent.py +9 -3
- massgen/cli.py +291 -43
- massgen/config_builder.py +163 -18
- massgen/configs/README.md +52 -6
- massgen/configs/debug/restart_test_controlled.yaml +60 -0
- massgen/configs/debug/restart_test_controlled_filesystem.yaml +73 -0
- massgen/configs/tools/code-execution/docker_with_sudo.yaml +35 -0
- massgen/configs/tools/custom_tools/computer_use_browser_example.yaml +56 -0
- massgen/configs/tools/custom_tools/computer_use_docker_example.yaml +65 -0
- massgen/configs/tools/custom_tools/computer_use_example.yaml +50 -0
- massgen/configs/tools/custom_tools/crawl4ai_mcp_example.yaml +67 -0
- massgen/configs/tools/custom_tools/crawl4ai_multi_agent_example.yaml +68 -0
- massgen/configs/tools/custom_tools/multimodal_tools/playwright_with_img_understanding.yaml +98 -0
- massgen/configs/tools/custom_tools/multimodal_tools/understand_audio.yaml +33 -0
- massgen/configs/tools/custom_tools/multimodal_tools/understand_file.yaml +34 -0
- massgen/configs/tools/custom_tools/multimodal_tools/understand_image.yaml +33 -0
- massgen/configs/tools/custom_tools/multimodal_tools/understand_video.yaml +34 -0
- massgen/configs/tools/custom_tools/multimodal_tools/understand_video_example.yaml +54 -0
- massgen/configs/tools/custom_tools/multimodal_tools/youtube_video_analysis.yaml +59 -0
- massgen/configs/tools/memory/README.md +199 -0
- massgen/configs/tools/memory/gpt5mini_gemini_context_window_management.yaml +131 -0
- massgen/configs/tools/memory/gpt5mini_gemini_no_persistent_memory.yaml +133 -0
- massgen/configs/tools/memory/test_context_window_management.py +286 -0
- massgen/configs/tools/multimodal/gpt5mini_gpt5nano_documentation_evolution.yaml +97 -0
- massgen/docker/README.md +83 -0
- massgen/filesystem_manager/_code_execution_server.py +22 -7
- massgen/filesystem_manager/_docker_manager.py +21 -1
- massgen/filesystem_manager/_filesystem_manager.py +8 -0
- massgen/filesystem_manager/_workspace_tools_server.py +0 -997
- massgen/formatter/_gemini_formatter.py +73 -0
- massgen/frontend/coordination_ui.py +175 -257
- massgen/frontend/displays/base_display.py +29 -0
- massgen/frontend/displays/rich_terminal_display.py +155 -9
- massgen/frontend/displays/simple_display.py +21 -0
- massgen/frontend/displays/terminal_display.py +22 -2
- massgen/logger_config.py +50 -6
- massgen/message_templates.py +123 -3
- massgen/orchestrator.py +319 -38
- massgen/tests/test_code_execution.py +178 -0
- massgen/tests/test_orchestration_restart.py +204 -0
- massgen/tool/__init__.py +4 -0
- massgen/tool/_multimodal_tools/understand_audio.py +193 -0
- massgen/tool/_multimodal_tools/understand_file.py +550 -0
- massgen/tool/_multimodal_tools/understand_image.py +212 -0
- massgen/tool/_multimodal_tools/understand_video.py +313 -0
- massgen/tool/docs/multimodal_tools.md +779 -0
- massgen/tool/workflow_toolkits/__init__.py +26 -0
- massgen/tool/workflow_toolkits/post_evaluation.py +216 -0
- massgen/utils.py +1 -0
- {massgen-0.1.2.dist-info → massgen-0.1.3.dist-info}/METADATA +8 -3
- {massgen-0.1.2.dist-info → massgen-0.1.3.dist-info}/RECORD +63 -36
- {massgen-0.1.2.dist-info → massgen-0.1.3.dist-info}/WHEEL +0 -0
- {massgen-0.1.2.dist-info → massgen-0.1.3.dist-info}/entry_points.txt +0 -0
- {massgen-0.1.2.dist-info → massgen-0.1.3.dist-info}/licenses/LICENSE +0 -0
- {massgen-0.1.2.dist-info → massgen-0.1.3.dist-info}/top_level.txt +0 -0
massgen/configs/tools/memory/README.md (new file)
@@ -0,0 +1,199 @@
# Memory and Context Window Management Examples

This directory contains example configurations and tests for MassGen's memory system and automatic context window management.

## Features Demonstrated

- **Automatic Context Compression**: When the conversation history approaches 75% of the model's context window, older messages are automatically compressed
- **Token-Aware Management**: The system keeps the most recent messages within a 40% token budget
- **Persistent Memory Integration**: Compressed messages are stored in long-term memory using mem0
- **Graceful Degradation**: Works with or without persistent memory (with appropriate warnings)

## Files

### Configuration Files

#### `gpt5mini_gemini_context_window_management.yaml`
Example configuration showing how to configure memory directly in YAML.

Features two agents:
- **agent_a**: GPT-5-mini with medium reasoning
- **agent_b**: Gemini 2.5 Flash

**Memory Control** - configure directly in YAML:
```yaml
memory:
  enabled: true  # Master switch

  conversation_memory:
    enabled: true  # Short-term tracking

  persistent_memory:
    enabled: true  # Long-term storage (set to false to disable)
    on_disk: true
    agent_name: "storyteller_agent"
    # session_name: "test_session"  # Optional - auto-generated if not specified

  compression:
    trigger_threshold: 0.75  # Compress at 75%
    target_ratio: 0.40       # Target 40% afterwards
```

**Session Management:**
- If `session_name` is not specified, a unique ID is auto-generated (e.g., `agent_storyteller_20251023_143022_a1b2c3`)
- Each new run gets a fresh session by default
- To continue a previous session, specify the `session_name` explicitly

To disable persistent memory, set `memory.persistent_memory.enabled: false`.

#### `gpt5mini_gemini_no_persistent_memory.yaml`
Example showing what happens when persistent memory is disabled.

**Key difference**: Sets `memory.persistent_memory.enabled: false` to demonstrate the warning messages shown when the context fills up without long-term storage.

### Test Script

#### `test_context_window_management.py`
Complete test script demonstrating:
- Setup of ConversationMemory and PersistentMemory
- Integration with SingleAgent
- Both scenarios (with/without persistent memory)
- Logging of compression events

## Quick Start

### Prerequisites

```bash
# Install dependencies
pip install massgen mem0ai

# Set up API keys - create a .env file in the project root:
cat > .env << EOF
OPENAI_API_KEY='your-key-here'
GOOGLE_API_KEY='your-key-here'  # Optional, for Gemini
EOF

# Or export directly:
export OPENAI_API_KEY='your-key-here'
```

The test script automatically loads `.env` files from:
- Project root
- Current directory
- Script directory

### Run the Test

```bash
# Run with the default config (memory enabled)
python massgen/configs/tools/memory/test_context_window_management.py

# Run with a custom config
python massgen/configs/tools/memory/test_context_window_management.py --config path/to/config.yaml
```

The test script reads the `memory` section from the YAML and branches accordingly (see the sketch below):
- If `persistent_memory.enabled: true` → runs Test 1 (with persistent memory)
- If `persistent_memory.enabled: false` → runs Test 2 (without persistent memory)
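
The dispatch looks roughly like this (a minimal sketch assuming PyYAML; the branch bodies are placeholders, not the script's actual internals):

```python
import yaml

# Load the config and inspect its memory section, as the test script does.
with open("massgen/configs/tools/memory/gpt5mini_gemini_context_window_management.yaml") as f:
    config = yaml.safe_load(f)

memory_cfg = config.get("memory", {})
persistent_enabled = memory_cfg.get("persistent_memory", {}).get("enabled", False)

if persistent_enabled:
    print("Test 1: compression with persistent memory")     # placeholder
else:
    print("Test 2: compression without persistent memory")  # placeholder
```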

### Expected Output

**With Persistent Memory:**
```
📊 Context usage: 96,000 / 128,000 tokens (75.0%) - compressing old context
📦 Compressed 15 messages (60,000 tokens) into long-term memory
   Kept 8 messages (36,000 tokens) in context
```

**Without Persistent Memory:**
```
📊 Context usage: 96,000 / 128,000 tokens (75.0%) - compressing old context
⚠️ Warning: Dropping 15 messages (60,000 tokens)
   No persistent memory configured to retain this information
   Consider adding persistent_memory to avoid losing context
```

## How It Works

### Token Budget Allocation

After compression, the context window is allocated as follows:

| Component | Allocation | Purpose |
|-----------|------------|---------|
| Conversation History | 40% | Most recent messages kept in active context |
| New User Messages | 20% | Room for incoming requests |
| Retrieved Memories | 10% | Injected relevant facts from persistent memory |
| System Prompt | 10% | Overhead for instructions |
| Response Generation | 20% | Space for model output |
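
For a 128K-token model, the percentages work out as follows (illustrative arithmetic only):

```python
# Apply the allocation table above to a 128K-token context window.
context_window = 128_000

allocation = {
    "conversation_history": 0.40,
    "new_user_messages": 0.20,
    "retrieved_memories": 0.10,
    "system_prompt": 0.10,
    "response_generation": 0.20,
}

for component, share in allocation.items():
    print(f"{component}: {int(context_window * share):,} tokens")
# conversation_history: 51,200 tokens, new_user_messages: 25,600 tokens, ...
```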

### Compression Strategy

1. **Threshold**: Compression triggers at **75%** of the context window
2. **Target**: Reduces usage to **40%** of the context window after compression
3. **Selection**: Keeps the most recent messages that fit within the budget (sketched below)
4. **Preservation**: System messages are always kept (never compressed)
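
The selection rule from step 3 can be pictured as follows (a minimal sketch; `count_tokens` is a stand-in for the backend's tokenizer, and this is not MassGen's actual implementation):

```python
def count_tokens(message: dict) -> int:
    # Stand-in estimate; real backends use a model-specific tokenizer.
    return len(message["content"]) // 4


def select_messages(messages: list[dict], context_window: int, target_ratio: float = 0.40) -> list[dict]:
    """Keep system messages plus the most recent messages that fit the budget."""
    budget = int(context_window * target_ratio)
    kept = [m for m in messages if m["role"] == "system"]  # always preserved
    used = sum(count_tokens(m) for m in kept)

    recent = []
    for message in reversed([m for m in messages if m["role"] != "system"]):
        cost = count_tokens(message)
        if used + cost > budget:
            break  # everything older than this gets compressed or dropped
        recent.append(message)
        used += cost

    return kept + list(reversed(recent))
```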

### Model Context Windows

The system automatically detects context limits for each model:

| Model | Context Window | Compression at | Target After |
|-------|----------------|----------------|--------------|
| GPT-4o | 128K | 96K tokens | 51K tokens |
| GPT-4o-mini | 128K | 96K tokens | 51K tokens |
| Claude Sonnet 4 | 200K | 150K tokens | 80K tokens |
| Gemini 2.5 Flash | 1M | 750K tokens | 400K tokens |
| DeepSeek R1 | 128K | 96K tokens | 51K tokens |
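
The "Compression at" and "Target After" columns are simply the two ratios applied to each window (illustrative; rounded values match the table):

```python
# Reproduce the table above: trigger = 75% of the window, target = 40%.
context_windows = {
    "GPT-4o": 128_000,
    "GPT-4o-mini": 128_000,
    "Claude Sonnet 4": 200_000,
    "Gemini 2.5 Flash": 1_000_000,
    "DeepSeek R1": 128_000,
}

for model, window in context_windows.items():
    print(f"{model}: compress at {int(window * 0.75):,}, keep ~{int(window * 0.40):,} tokens")
# GPT-4o: compress at 96,000, keep ~51,200 tokens
```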

## Programmatic Usage

To use memory in your own code:

```python
from massgen.backend.chat_completions import ChatCompletionsBackend
from massgen.chat_agent import SingleAgent
from massgen.memory import ConversationMemory, PersistentMemory

# Create backends
llm = ChatCompletionsBackend(type="openai", model="gpt-4o-mini", api_key="...")
embedding = ChatCompletionsBackend(type="openai", model="text-embedding-3-small", api_key="...")

# Initialize memory
conversation_memory = ConversationMemory()
persistent_memory = PersistentMemory(
    agent_name="my_agent",
    session_name="session_1",
    llm_backend=llm,
    embedding_backend=embedding,
    on_disk=True,  # Persist across restarts
)

# Create agent with memory
agent = SingleAgent(
    backend=llm,
    agent_id="my_agent",
    system_message="You are a helpful assistant",
    conversation_memory=conversation_memory,
    persistent_memory=persistent_memory,
)

# Use normally - compression happens automatically
async for chunk in agent.chat([{"role": "user", "content": "Hello!"}]):
    if chunk.type == "content":
        print(chunk.content, end="")
```
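
Because compression is handled inside the chat loop, the calling code never tracks token counts itself; the `agent.chat()` call above behaves the same whether or not a compression pass ran between turns.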

## Related Documentation

- [Memory System Design](../../../massgen/memory/docs/DESIGN.md)
- [Memory Quickstart](../../../massgen/memory/docs/QUICKSTART.md)
- [Single Agent Memory Integration](../../../massgen/memory/docs/agent_use_memory.md)
- [Orchestrator Shared Memory](../../../massgen/memory/docs/orchestrator_use_memory.md)

## Related Issues

- [Issue #347](https://github.com/Leezekun/MassGen/issues/347): Handle context limit with summarization
- [Issue #348](https://github.com/Leezekun/MassGen/issues/348): Ensure memory persists across restarts ✅
- [Issue #349](https://github.com/Leezekun/MassGen/issues/349): File caching with memory (future work)
massgen/configs/tools/memory/gpt5mini_gemini_context_window_management.yaml (new file)
@@ -0,0 +1,131 @@
# Example Configuration: Context Window Management with Memory
#
# Use Case: Demonstrates automatic context compression when approaching token limits
#
# This configuration demonstrates:
# - Automatic context window monitoring and compression
# - Token-aware conversation management (75% threshold, 40% target)
# - Persistent memory integration for long-term knowledge retention
# - Graceful handling when the context window fills up
# - Multi-agent collaboration with shared context management
#
# Run with:
#   massgen \
#     --config massgen/configs/tools/memory/gpt5mini_gemini_context_window_management.yaml \
#     "Tell me a detailed story about a space explorer. After each paragraph, ask me what happens next, and I'll guide the story. Keep expanding the narrative with rich details about planets, aliens, technology, and adventures. Make each response at least 500 words."

# ====================
# AGENT DEFINITIONS
# ====================
agents:
  - id: "agent_a"
    backend:
      # Use GPT-5-mini with medium reasoning
      type: "openai"
      model: "gpt-5-mini"
      text:
        verbosity: "medium"
      reasoning:
        effort: "medium"
        summary: "auto"

  - id: "agent_b"
    backend:
      # Use Gemini 2.5 Flash for cost-effective testing
      type: "gemini"
      model: "gemini-2.5-flash"

# ====================
# MEMORY CONFIGURATION
# ====================
memory:
  # Master switch for the memory system (default: true)
  enabled: true

  # Memory configuration
  conversation_memory:
    enabled: true  # Short-term conversation tracking (recommended: always true)

  persistent_memory:
    enabled: true  # Long-term knowledge storage (set to false to disable)
    on_disk: true  # Persist across restarts
    # session_name: "test_session"  # Optional - if not specified, a unique ID is
    #   auto-generated (format: agent_storyteller_20251023_143022_a1b2c3).
    #   Specify it to continue a specific session.

    # Vector store backend (default: qdrant)
    vector_store: "qdrant"

  # Context window management thresholds
  compression:
    trigger_threshold: 0.75  # Compress when context usage exceeds 75%
    target_ratio: 0.40       # Target 40% of context after compression
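    # For a 128K-token model these ratios mean: compress once usage passes
    # ~96,000 tokens, then keep roughly 51,200 tokens of recent history.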

# Memory system behavior when enabled:
# - ConversationMemory: tracks short-term conversation history
# - PersistentMemory: stores long-term knowledge in a vector database
# - Automatic compression: triggers at 75% of the context window
# - Token budget: keeps 40% after compression
# - Persistence: saves to disk and survives restarts
#
# Session management:
# - Each agent gets its own memory (separated by agent_name)
# - New sessions start fresh (session_name is auto-generated if not specified)
# - To continue a previous session, specify the session_name
#
# To disable persistent memory for testing, set:
#   memory.persistent_memory.enabled: false
#
# See massgen/memory/docs/ for detailed documentation.

# ====================
# ORCHESTRATOR CONFIGURATION
# ====================
orchestrator:
  # Multi-turn mode to enable interactive storytelling
  session_storage: "memory_test_sessions"

  # Agent workspace for any file operations
  agent_temporary_workspace: "memory_test_workspaces"
  snapshot_storage: "memory_test_snapshots"

# ====================
# UI CONFIGURATION
# ====================
ui:
  display_type: "rich_terminal"
  logging_enabled: true

# ====================
# EXECUTION FLOW
# ====================
# What happens:
# 1. User starts an interactive story with the agent
# 2. Agent responds with a detailed narrative (400-600 words per turn)
# 3. As the conversation continues, token usage is monitored automatically
# 4. When context usage reaches 75% of the model's limit:
#    - System logs: "📊 Context usage: X / Y tokens (Z%) - compressing old context"
#    - Old messages are compressed into persistent memory (if configured)
#    - Recent messages (fitting in 40% of the context window) are kept
#    - Compression details are logged: "📦 Compressed N messages (X tokens) into long-term memory"
# 5. Agent continues seamlessly with the compressed context
# 6. Story maintains consistency by referencing persistent memories
# 7. The process repeats as needed for very long conversations
#
# Expected output with persistent memory:
#   📊 Context usage: 96,000 / 128,000 tokens (75.0%) - compressing old context
#   📦 Compressed 15 messages (60,000 tokens) into long-term memory
#      Kept 8 messages (36,000 tokens) in context
#
# Expected output WITHOUT persistent memory:
#   📊 Context usage: 96,000 / 128,000 tokens (75.0%) - compressing old context
#   ⚠️ Warning: Dropping 15 messages (60,000 tokens)
#      No persistent memory configured to retain this information
#      Consider adding persistent_memory to avoid losing context
#
# Token Budget Allocation (after compression):
# - Conversation history: 40% (kept in active context)
# - New user messages: 20%
# - Retrieved memories: 10%
# - System prompt overhead: 10%
# - Response generation: 20%
massgen/configs/tools/memory/gpt5mini_gemini_no_persistent_memory.yaml (new file)
@@ -0,0 +1,133 @@
# Example Configuration: Context Window Management WITHOUT Persistent Memory
#
# Use Case: Demonstrates the compression warnings emitted when no persistent memory is configured
#
# This configuration demonstrates what happens when:
# - Conversation memory is enabled (tracks short-term history)
# - Persistent memory is DISABLED (no long-term storage)
# - The context window fills up (triggering compression warnings)
#
# Run with:
#   python massgen/configs/tools/memory/test_context_window_management.py \
#     --config massgen/configs/tools/memory/gpt5mini_gemini_no_persistent_memory.yaml

# ====================
# AGENT DEFINITIONS
# ====================
agents:
  - id: "agent_a"
    system_message: |
      You are a creative storyteller who crafts detailed, immersive narratives.

      When telling stories:
      - Create rich, detailed descriptions of settings, characters, and events
      - Build on previous plot points and maintain narrative consistency
      - Ask engaging questions to guide the story forward
      - Make each response substantial and immersive (aim for 400-600 words)
      - Reference earlier story elements to create callbacks and continuity

      Your goal is to create a long, engaging narrative that will naturally fill up
      the context window over multiple turns, demonstrating how the system manages
      conversation history automatically.

    backend:
      # Use GPT-5-mini with medium reasoning
      type: "openai"
      model: "gpt-5-mini"

      # LLM parameters
      temperature: 0.8
      max_tokens: 2000

      text:
        verbosity: "medium"

      reasoning:
        effort: "medium"
        summary: "auto"

  - id: "agent_b"
    system_message: |
      You are a creative storyteller who crafts detailed, immersive narratives.

      When telling stories:
      - Create rich, detailed descriptions of settings, characters, and events
      - Build on previous plot points and maintain narrative consistency
      - Ask engaging questions to guide the story forward
      - Make each response substantial and immersive (aim for 400-600 words)
      - Reference earlier story elements to create callbacks and continuity

      Your goal is to create a long, engaging narrative that will naturally fill up
      the context window over multiple turns, demonstrating how the system manages
      conversation history automatically.

    backend:
      # Use Gemini 2.5 Flash
      type: "gemini"
      model: "gemini-2.5-flash"

      # LLM parameters
      temperature: 0.8
      max_tokens: 2000

# ====================
# MEMORY CONFIGURATION
# ====================
memory:
  # Memory is enabled
  enabled: true

  # Conversation memory tracks short-term history
  conversation_memory:
    enabled: true

  # Persistent memory is DISABLED - this will trigger warnings
  persistent_memory:
    enabled: false  # ⚠️ Set to false to see the warning behavior

  # Context window management still works
  compression:
    trigger_threshold: 0.75  # Compress when context usage exceeds 75%
    target_ratio: 0.40       # Target 40% of context after compression

# Expected behavior when the context fills:
#   ⚠️ Warning: Dropping N messages (X tokens)
#      No persistent memory configured to retain this information
#      Consider adding persistent_memory to avoid losing context
#
# The system will still compress the context, but the information is lost
# rather than stored in long-term memory.

# ====================
# ORCHESTRATOR CONFIGURATION
# ====================
orchestrator:
  # Multi-turn mode
  session_storage: "massgen_logs/memory_test_sessions"

  # Agent workspaces
  agent_temporary_workspace: "massgen_logs/memory_test_workspaces"
  snapshot_storage: "massgen_logs/memory_test_snapshots"

# ====================
# UI CONFIGURATION
# ====================
ui:
  display_type: "rich_terminal"
  logging_enabled: true

# ====================
# EXECUTION FLOW
# ====================
# What happens:
# 1. Conversation proceeds normally with short-term memory
# 2. When the context reaches 75% capacity:
#    - System logs: "📊 Context usage: X / Y tokens (Z%) - compressing old context"
#    - Warning shown: "⚠️ Warning: Dropping N messages"
#    - Warning shown: "No persistent memory configured"
# 3. Old messages are dropped (not saved anywhere)
# 4. Agent continues with the reduced context
# 5. Information from the dropped messages is permanently lost
#
# Compare this to the config with persistent memory enabled to see
# the difference between graceful compression and data loss.