massgen 0.1.3__py3-none-any.whl → 0.1.5__py3-none-any.whl

Potentially problematic release.


Files changed (90)
  1. massgen/__init__.py +1 -1
  2. massgen/api_params_handler/_chat_completions_api_params_handler.py +4 -0
  3. massgen/api_params_handler/_claude_api_params_handler.py +4 -0
  4. massgen/api_params_handler/_gemini_api_params_handler.py +4 -0
  5. massgen/api_params_handler/_response_api_params_handler.py +4 -0
  6. massgen/backend/base_with_custom_tool_and_mcp.py +25 -5
  7. massgen/backend/docs/permissions_and_context_files.md +2 -2
  8. massgen/backend/response.py +2 -0
  9. massgen/chat_agent.py +340 -20
  10. massgen/cli.py +326 -19
  11. massgen/configs/README.md +92 -41
  12. massgen/configs/memory/gpt5mini_gemini_baseline_research_to_implementation.yaml +94 -0
  13. massgen/configs/memory/gpt5mini_gemini_context_window_management.yaml +187 -0
  14. massgen/configs/memory/gpt5mini_gemini_research_to_implementation.yaml +127 -0
  15. massgen/configs/memory/gpt5mini_high_reasoning_gemini.yaml +107 -0
  16. massgen/configs/memory/single_agent_compression_test.yaml +64 -0
  17. massgen/configs/tools/custom_tools/crawl4ai_example.yaml +55 -0
  18. massgen/configs/tools/custom_tools/multimodal_tools/text_to_file_generation_multi.yaml +61 -0
  19. massgen/configs/tools/custom_tools/multimodal_tools/text_to_file_generation_single.yaml +29 -0
  20. massgen/configs/tools/custom_tools/multimodal_tools/text_to_image_generation_multi.yaml +51 -0
  21. massgen/configs/tools/custom_tools/multimodal_tools/text_to_image_generation_single.yaml +33 -0
  22. massgen/configs/tools/custom_tools/multimodal_tools/text_to_speech_generation_multi.yaml +55 -0
  23. massgen/configs/tools/custom_tools/multimodal_tools/text_to_speech_generation_single.yaml +33 -0
  24. massgen/configs/tools/custom_tools/multimodal_tools/text_to_video_generation_multi.yaml +47 -0
  25. massgen/configs/tools/custom_tools/multimodal_tools/text_to_video_generation_single.yaml +29 -0
  26. massgen/configs/tools/custom_tools/multimodal_tools/understand_audio.yaml +1 -1
  27. massgen/configs/tools/custom_tools/multimodal_tools/understand_file.yaml +1 -1
  28. massgen/configs/tools/custom_tools/multimodal_tools/understand_image.yaml +1 -1
  29. massgen/configs/tools/custom_tools/multimodal_tools/understand_video.yaml +1 -1
  30. massgen/configs/tools/custom_tools/multimodal_tools/youtube_video_analysis.yaml +1 -1
  31. massgen/filesystem_manager/_filesystem_manager.py +1 -0
  32. massgen/filesystem_manager/_path_permission_manager.py +148 -0
  33. massgen/memory/README.md +277 -0
  34. massgen/memory/__init__.py +26 -0
  35. massgen/memory/_base.py +193 -0
  36. massgen/memory/_compression.py +237 -0
  37. massgen/memory/_context_monitor.py +211 -0
  38. massgen/memory/_conversation.py +255 -0
  39. massgen/memory/_fact_extraction_prompts.py +333 -0
  40. massgen/memory/_mem0_adapters.py +257 -0
  41. massgen/memory/_persistent.py +687 -0
  42. massgen/memory/docker-compose.qdrant.yml +36 -0
  43. massgen/memory/docs/DESIGN.md +388 -0
  44. massgen/memory/docs/QUICKSTART.md +409 -0
  45. massgen/memory/docs/SUMMARY.md +319 -0
  46. massgen/memory/docs/agent_use_memory.md +408 -0
  47. massgen/memory/docs/orchestrator_use_memory.md +586 -0
  48. massgen/memory/examples.py +237 -0
  49. massgen/message_templates.py +160 -12
  50. massgen/orchestrator.py +223 -7
  51. massgen/tests/memory/test_agent_compression.py +174 -0
  52. massgen/{configs/tools → tests}/memory/test_context_window_management.py +30 -30
  53. massgen/tests/memory/test_force_compression.py +154 -0
  54. massgen/tests/memory/test_simple_compression.py +147 -0
  55. massgen/tests/test_agent_memory.py +534 -0
  56. massgen/tests/test_binary_file_blocking.py +274 -0
  57. massgen/tests/test_case_studies.md +12 -12
  58. massgen/tests/test_conversation_memory.py +382 -0
  59. massgen/tests/test_multimodal_size_limits.py +407 -0
  60. massgen/tests/test_orchestrator_memory.py +620 -0
  61. massgen/tests/test_persistent_memory.py +435 -0
  62. massgen/token_manager/token_manager.py +6 -0
  63. massgen/tool/_manager.py +7 -2
  64. massgen/tool/_multimodal_tools/image_to_image_generation.py +293 -0
  65. massgen/tool/_multimodal_tools/text_to_file_generation.py +455 -0
  66. massgen/tool/_multimodal_tools/text_to_image_generation.py +222 -0
  67. massgen/tool/_multimodal_tools/text_to_speech_continue_generation.py +226 -0
  68. massgen/tool/_multimodal_tools/text_to_speech_transcription_generation.py +217 -0
  69. massgen/tool/_multimodal_tools/text_to_video_generation.py +223 -0
  70. massgen/tool/_multimodal_tools/understand_audio.py +19 -1
  71. massgen/tool/_multimodal_tools/understand_file.py +6 -1
  72. massgen/tool/_multimodal_tools/understand_image.py +112 -8
  73. massgen/tool/_multimodal_tools/understand_video.py +32 -5
  74. massgen/tool/_web_tools/crawl4ai_tool.py +718 -0
  75. massgen/tool/docs/multimodal_tools.md +589 -0
  76. massgen/tools/__init__.py +8 -0
  77. massgen/tools/_planning_mcp_server.py +520 -0
  78. massgen/tools/planning_dataclasses.py +434 -0
  79. {massgen-0.1.3.dist-info → massgen-0.1.5.dist-info}/METADATA +142 -82
  80. {massgen-0.1.3.dist-info → massgen-0.1.5.dist-info}/RECORD +84 -41
  81. massgen/configs/tools/custom_tools/crawl4ai_mcp_example.yaml +0 -67
  82. massgen/configs/tools/custom_tools/crawl4ai_multi_agent_example.yaml +0 -68
  83. massgen/configs/tools/memory/README.md +0 -199
  84. massgen/configs/tools/memory/gpt5mini_gemini_context_window_management.yaml +0 -131
  85. massgen/configs/tools/memory/gpt5mini_gemini_no_persistent_memory.yaml +0 -133
  86. massgen/configs/tools/multimodal/gpt5mini_gpt5nano_documentation_evolution.yaml +0 -97
  87. {massgen-0.1.3.dist-info → massgen-0.1.5.dist-info}/WHEEL +0 -0
  88. {massgen-0.1.3.dist-info → massgen-0.1.5.dist-info}/entry_points.txt +0 -0
  89. {massgen-0.1.3.dist-info → massgen-0.1.5.dist-info}/licenses/LICENSE +0 -0
  90. {massgen-0.1.3.dist-info → massgen-0.1.5.dist-info}/top_level.txt +0 -0
massgen/cli.py CHANGED
@@ -488,8 +488,18 @@ def create_backend(backend_type: str, **kwargs) -> Any:
         raise ConfigurationError(f"Unsupported backend type: {backend_type}")


-def create_agents_from_config(config: Dict[str, Any], orchestrator_config: Optional[Dict[str, Any]] = None, config_path: Optional[str] = None) -> Dict[str, ConfigurableAgent]:
-    """Create agents from configuration."""
+def create_agents_from_config(
+    config: Dict[str, Any],
+    orchestrator_config: Optional[Dict[str, Any]] = None,
+    config_path: Optional[str] = None,
+    memory_session_id: Optional[str] = None,
+) -> Dict[str, ConfigurableAgent]:
+    """Create agents from configuration.
+
+    Args:
+        memory_session_id: Optional session ID to use for memory isolation.
+            If provided, overrides session_name from YAML config.
+    """
     agents = {}

     agent_entries = [config["agent"]] if "agent" in config else config.get("agents", None)
@@ -497,6 +507,43 @@ def create_agents_from_config(config: Dict[str, Any], orchestrator_config: Optio
     if not agent_entries:
         raise ConfigurationError("Configuration must contain either 'agent' or 'agents' section")

+    # Create shared Qdrant client for all agents (avoids concurrent access errors)
+    # ONE client can be used by multiple mem0 instances safely
+    shared_qdrant_client = None
+    global_memory_config = config.get("memory", {})
+    if global_memory_config.get("enabled", False) and global_memory_config.get("persistent_memory", {}).get("enabled", False):
+        try:
+            from qdrant_client import QdrantClient
+
+            pm_config = global_memory_config.get("persistent_memory", {})
+
+            # Support both server mode and file-based mode
+            qdrant_config = pm_config.get("qdrant", {})
+            mode = qdrant_config.get("mode", "local")  # "local" or "server"
+
+            if mode == "server":
+                # Server mode (RECOMMENDED for multi-agent)
+                host = qdrant_config.get("host", "localhost")
+                port = qdrant_config.get("port", 6333)
+                shared_qdrant_client = QdrantClient(host=host, port=port)
+                logger.info(f"🗄️ Shared Qdrant client created (server mode: {host}:{port})")
+            else:
+                # Local file-based mode (single agent only)
+                # WARNING: Does NOT support concurrent access by multiple agents
+                qdrant_path = pm_config.get("path", ".massgen/qdrant")
+                shared_qdrant_client = QdrantClient(path=qdrant_path)
+                logger.info(f"🗄️ Shared Qdrant client created (local mode: {qdrant_path})")
+                if len(agent_entries) > 1:
+                    logger.warning(
+                        "⚠️ Multi-agent setup detected with local Qdrant mode. "
+                        "This may cause concurrent access errors. "
+                        "Consider using server mode: set memory.persistent_memory.qdrant.mode='server'",
+                    )
+        except Exception as e:
+            logger.warning(f"⚠️ Failed to create shared Qdrant client: {e}")
+            logger.warning(" Persistent memory will be disabled for all agents")
+            logger.warning(" For multi-agent setup, start Qdrant server: docker-compose -f docker-compose.qdrant.yml up -d")
+
     for i, agent_data in enumerate(agent_entries, start=1):
         backend_config = agent_data.get("backend", {})

@@ -579,7 +626,201 @@ def create_agents_from_config(config: Dict[str, Any], orchestrator_config: Optio

         # Timeout configuration will be applied to orchestrator instead of individual agents

-        agent = ConfigurableAgent(config=agent_config, backend=backend)
+        # Merge global and per-agent memory configuration
+        global_memory_config = config.get("memory", {})
+        agent_memory_config = agent_data.get("memory", {})
+
+        # Deep merge: agent config overrides global config
+        def merge_configs(global_cfg, agent_cfg):
+            """Recursively merge agent config into global config."""
+            merged = global_cfg.copy()
+            for key, value in agent_cfg.items():
+                if isinstance(value, dict) and key in merged and isinstance(merged[key], dict):
+                    merged[key] = merge_configs(merged[key], value)
+                else:
+                    merged[key] = value
+            return merged
+
+        memory_config = merge_configs(global_memory_config, agent_memory_config)
+
+        # Create context monitor if memory config is enabled
+        context_monitor = None
+        if memory_config.get("enabled", False):
+            from .memory._context_monitor import ContextWindowMonitor
+
+            compression_config = memory_config.get("compression", {})
+            trigger_threshold = compression_config.get("trigger_threshold", 0.75)
+            target_ratio = compression_config.get("target_ratio", 0.40)
+
+            # Get model name from backend config
+            model_name = backend_config.get("model", "unknown")
+
+            # Normalize provider name for monitor
+            provider_map = {
+                "openai": "openai",
+                "anthropic": "anthropic",
+                "claude": "anthropic",
+                "google": "google",
+                "gemini": "google",
+            }
+            provider = provider_map.get(backend_type_lower, backend_type_lower)
+
+            context_monitor = ContextWindowMonitor(
+                model_name=model_name,
+                provider=provider,
+                trigger_threshold=trigger_threshold,
+                target_ratio=target_ratio,
+                enabled=True,
+            )
+            logger.info(
+                f"📊 Context monitor created for {agent_config.agent_id}: " f"{context_monitor.context_window:,} tokens, " f"trigger={trigger_threshold*100:.0f}%, target={target_ratio*100:.0f}%",
+            )
+
+        # Create per-agent memory objects if memory is enabled
+        conversation_memory = None
+        persistent_memory = None
+
+        if memory_config.get("enabled", False):
+            from .memory import ConversationMemory
+
+            # Create conversation memory for this agent
+            if memory_config.get("conversation_memory", {}).get("enabled", True):
+                conversation_memory = ConversationMemory()
+                logger.info(f"💾 Conversation memory created for {agent_config.agent_id}")
+
+            # Create persistent memory for this agent (if enabled)
+            if memory_config.get("persistent_memory", {}).get("enabled", False):
+                from .memory import PersistentMemory
+
+                pm_config = memory_config.get("persistent_memory", {})
+
+                # Get persistent memory configuration
+                agent_name = pm_config.get("agent_name", agent_config.agent_id)
+
+                # Use unified session: memory_session_id (from CLI) > YAML session_name > None
+                session_name = memory_session_id or pm_config.get("session_name")
+
+                on_disk = pm_config.get("on_disk", True)
+                qdrant_path = pm_config.get("path", ".massgen/qdrant")  # Project dir, not /tmp
+
+                try:
+                    # Configure LLM for memory operations (fact extraction)
+                    # RECOMMENDED: Use mem0's native LLMs (no adapter overhead, no async complexity)
+                    llm_cfg = pm_config.get("llm", {})
+
+                    if not llm_cfg:
+                        # Default: gpt-4.1-nano-2025-04-14 (mem0's default, fast and cheap for memory ops)
+                        llm_cfg = {
+                            "provider": "openai",
+                            "model": "gpt-4.1-nano-2025-04-14",
+                        }
+
+                    # Add API key if not specified
+                    if "api_key" not in llm_cfg:
+                        llm_provider = llm_cfg.get("provider", "openai")
+                        if llm_provider == "openai":
+                            llm_cfg["api_key"] = os.getenv("OPENAI_API_KEY")
+                        elif llm_provider == "anthropic":
+                            llm_cfg["api_key"] = os.getenv("ANTHROPIC_API_KEY")
+                        elif llm_provider == "groq":
+                            llm_cfg["api_key"] = os.getenv("GROQ_API_KEY")
+                        # Add more providers as needed
+
+                    # Configure embedding for persistent memory
+                    # RECOMMENDED: Use mem0's native embedders (no adapter overhead)
+                    embedding_cfg = pm_config.get("embedding", {})
+
+                    if not embedding_cfg:
+                        # Default: OpenAI text-embedding-3-small
+                        embedding_cfg = {
+                            "provider": "openai",
+                            "model": "text-embedding-3-small",
+                        }
+
+                    # Add API key if not specified
+                    if "api_key" not in embedding_cfg:
+                        emb_provider = embedding_cfg.get("provider", "openai")
+                        if emb_provider == "openai":
+                            api_key = os.getenv("OPENAI_API_KEY")
+                            if not api_key:
+                                logger.warning("⚠️ OPENAI_API_KEY not found in environment - embedding will fail!")
+                            else:
+                                logger.debug(f"✅ Using OPENAI_API_KEY from environment (key starts with: {api_key[:7]}...)")
+                            embedding_cfg["api_key"] = api_key
+                        elif emb_provider == "together":
+                            embedding_cfg["api_key"] = os.getenv("TOGETHER_API_KEY")
+                        elif emb_provider == "azure_openai":
+                            embedding_cfg["api_key"] = os.getenv("AZURE_OPENAI_API_KEY")
+                        # Add more providers as needed
+
+                    # Use shared Qdrant client if available
+                    if shared_qdrant_client:
+                        persistent_memory = PersistentMemory(
+                            agent_name=agent_name,
+                            session_name=session_name,
+                            llm_config=llm_cfg,  # Use native mem0 LLM
+                            embedding_config=embedding_cfg,  # Use native mem0 embedder
+                            qdrant_client=shared_qdrant_client,  # Share ONE client from server
+                            on_disk=on_disk,
+                        )
+                        logger.info(
+                            f"💾 Persistent memory created for {agent_config.agent_id} "
+                            f"(agent_name={agent_name}, session={session_name or 'cross-session'}, "
+                            f"llm={llm_cfg.get('provider')}/{llm_cfg.get('model')}, "
+                            f"embedder={embedding_cfg.get('provider')}/{embedding_cfg.get('model')}, shared_qdrant=True)",
+                        )
+                    else:
+                        # Fallback: create individual vector store (for backward compatibility)
+                        # WARNING: File-based Qdrant doesn't support concurrent access
+                        from mem0.vector_stores.configs import VectorStoreConfig
+
+                        vector_store_config = VectorStoreConfig(
+                            config={
+                                "on_disk": on_disk,
+                                "path": qdrant_path,
+                            },
+                        )
+
+                        persistent_memory = PersistentMemory(
+                            agent_name=agent_name,
+                            session_name=session_name,
+                            llm_config=llm_cfg,  # Use native mem0 LLM
+                            embedding_config=embedding_cfg,  # Use native mem0 embedder
+                            vector_store_config=vector_store_config,
+                            on_disk=on_disk,
+                        )
+                        logger.info(
+                            f"💾 Persistent memory created for {agent_config.agent_id} "
+                            f"(agent_name={agent_name}, session={session_name or 'cross-session'}, "
+                            f"llm={llm_cfg.get('provider')}/{llm_cfg.get('model')}, "
+                            f"embedder={embedding_cfg.get('provider')}/{embedding_cfg.get('model')}, path={qdrant_path})",
+                        )
+                except Exception as e:
+                    logger.warning(
+                        f"⚠️ Failed to create persistent memory for {agent_config.agent_id}: {e}",
+                    )
+                    persistent_memory = None
+
+        # Create agent
+        agent = ConfigurableAgent(
+            config=agent_config,
+            backend=backend,
+            conversation_memory=conversation_memory,
+            persistent_memory=persistent_memory,
+            context_monitor=context_monitor,
+        )
+
+        # Configure retrieval settings from YAML (if memory is enabled)
+        if memory_config.get("enabled", False):
+            retrieval_config = memory_config.get("retrieval", {})
+            agent._retrieval_limit = retrieval_config.get("limit", 5)
+            agent._retrieval_exclude_recent = retrieval_config.get("exclude_recent", True)
+
+            if retrieval_config:  # Only log if custom config provided
+                logger.info(
+                    f"🔧 Retrieval configured for {agent_config.agent_id}: " f"limit={agent._retrieval_limit}, exclude_recent={agent._retrieval_exclude_recent}",
+                )
+
         agents[agent.config.agent_id] = agent

     return agents
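
The nested `merge_configs` helper added in this hunk lets per-agent `memory` settings override the global ones key by key. A minimal standalone sketch of the same merge follows; the config values are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict


def merge_configs(global_cfg: Dict[str, Any], agent_cfg: Dict[str, Any]) -> Dict[str, Any]:
    """Recursively merge agent_cfg into a copy of global_cfg (agent values win)."""
    merged = global_cfg.copy()  # shallow copy, as in the diff
    for key, value in agent_cfg.items():
        if isinstance(value, dict) and key in merged and isinstance(merged[key], dict):
            # Both sides hold a dict: merge one level deeper
            merged[key] = merge_configs(merged[key], value)
        else:
            # Scalars, lists, and new keys: the agent value replaces the global one
            merged[key] = value
    return merged


# Hypothetical values, for illustration only
global_memory = {
    "enabled": True,
    "compression": {"trigger_threshold": 0.75, "target_ratio": 0.40},
}
agent_memory = {"compression": {"trigger_threshold": 0.5}}

memory_config = merge_configs(global_memory, agent_memory)
print(memory_config)
```

Because the copy is shallow, nested dicts that the agent config does not override are shared with the global config, so mutating them in place on the merged result would also change the global settings.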
@@ -696,21 +937,25 @@ def relocate_filesystem_paths(config: Dict[str, Any]) -> None:
             backend_config["cwd"] = str(massgen_dir / "workspaces" / user_cwd)


-def load_previous_turns(session_info: Dict[str, Any], session_storage: str) -> List[Dict[str, Any]]:
+def load_previous_turns(session_info: Dict[str, Any], session_storage: str) -> tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
     """
-    Load previous turns from session storage.
+    Load previous turns and winning agents history from session storage.

     Returns:
-        List of previous turn metadata dicts
+        tuple: (previous_turns, winning_agents_history)
+            - previous_turns: List of previous turn metadata dicts
+            - winning_agents_history: List of winning agents for memory sharing
+              Format: [{"agent_id": "agent_b", "turn": 1}, ...]
     """
     session_id = session_info.get("session_id")
     if not session_id:
-        return []
+        return [], []

     session_dir = Path(session_storage) / session_id
     if not session_dir.exists():
-        return []
+        return [], []

+    # Load previous turns
     previous_turns = []
     turn_num = 1

@@ -735,7 +980,17 @@ def load_previous_turns(session_info: Dict[str, Any], session_storage: str) -> L

         turn_num += 1

-    return previous_turns
+    # Load winning agents history for memory sharing across turns
+    winning_agents_history = []
+    winning_agents_file = session_dir / "winning_agents_history.json"
+    if winning_agents_file.exists():
+        try:
+            winning_agents_history = json.loads(winning_agents_file.read_text(encoding="utf-8"))
+            logger.info(f"📚 Loaded {len(winning_agents_history)} winning agent(s) from session storage: {winning_agents_history}")
+        except Exception as e:
+            logger.warning(f"⚠️ Failed to load winning agents history: {e}")
+
+    return previous_turns, winning_agents_history


 async def handle_session_persistence(
@@ -795,6 +1050,16 @@ async def handle_session_persistence(
     metadata_file = turn_dir / "metadata.json"
     metadata_file.write_text(json.dumps(metadata, indent=2), encoding="utf-8")

+    # Save winning agents history for memory sharing across turns
+    # This allows the orchestrator to restore winner tracking when recreated
+    if final_result.get("winning_agents_history"):
+        winning_agents_file = session_dir / "winning_agents_history.json"
+        winning_agents_file.write_text(
+            json.dumps(final_result["winning_agents_history"], indent=2),
+            encoding="utf-8",
+        )
+        logger.info(f"📚 Saved {len(final_result['winning_agents_history'])} winning agent(s) to session storage")
+
     # Create/update session summary for easy viewing
     session_summary_file = session_dir / "SESSION_SUMMARY.txt"
     summary_lines = []
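
The two hunks above form a save/load round trip: `handle_session_persistence` writes `winning_agents_history.json` into the session directory, and `load_previous_turns` reads it back on the next turn. A minimal sketch of that round trip, assuming nothing beyond the file name and record format shown in the diff. The function names `save_winning_agents` and `load_winning_agents` are hypothetical, and the sketch catches narrower exceptions than the `except Exception` used in the diff.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict, List

HISTORY_FILE = "winning_agents_history.json"  # file name taken from the diff


def save_winning_agents(session_dir: Path, history: List[Dict[str, Any]]) -> None:
    """Write the history as JSON; an empty history is skipped, as in the diff."""
    if history:
        (session_dir / HISTORY_FILE).write_text(json.dumps(history, indent=2), encoding="utf-8")


def load_winning_agents(session_dir: Path) -> List[Dict[str, Any]]:
    """Return the saved history, or [] if the file is missing or unreadable."""
    path = session_dir / HISTORY_FILE
    if not path.exists():
        return []
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return []


with tempfile.TemporaryDirectory() as tmp:
    session_dir = Path(tmp)
    before = load_winning_agents(session_dir)  # nothing saved yet
    save_winning_agents(session_dir, [{"agent_id": "agent_b", "turn": 1}])
    restored = load_winning_agents(session_dir)

print(before, restored)
```

Falling back to an empty list means a corrupted history file only loses winner tracking rather than aborting the session, which matches the warn-and-continue behaviour in the diff.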
@@ -896,8 +1161,8 @@ async def run_question_with_history(
             max_orchestration_restarts=coord_cfg.get("max_orchestration_restarts", 0),
         )

-    # Load previous turns from session storage for multi-turn conversations
-    previous_turns = load_previous_turns(session_info, session_storage)
+    # Load previous turns and winning agents history from session storage for multi-turn conversations
+    previous_turns, winning_agents_history = load_previous_turns(session_info, session_storage)

     orchestrator = Orchestrator(
         agents=agents,
@@ -905,6 +1170,7 @@ async def run_question_with_history(
         snapshot_storage=snapshot_storage,
         agent_temporary_workspace=agent_temporary_workspace,
         previous_turns=previous_turns,
+        winning_agents_history=winning_agents_history,  # Restore for memory sharing
     )
     # Create a fresh UI instance for each question to ensure clean state
     ui = CoordinationUI(
@@ -1883,6 +2149,7 @@ async def run_interactive_mode(
     original_config: Dict[str, Any] = None,
     orchestrator_cfg: Dict[str, Any] = None,
     config_path: Optional[str] = None,
+    memory_session_id: Optional[str] = None,
     **kwargs,
 ):
     """Run MassGen in interactive mode with conversation history."""
@@ -1971,8 +2238,13 @@ async def run_interactive_mode(
     if original_config and orchestrator_cfg:
         config_modified = prompt_for_context_paths(original_config, orchestrator_cfg)
         if config_modified:
-            # Recreate agents with updated context paths
-            agents = create_agents_from_config(original_config, orchestrator_cfg, config_path=config_path)
+            # Recreate agents with updated context paths (use same session)
+            agents = create_agents_from_config(
+                original_config,
+                orchestrator_cfg,
+                config_path=config_path,
+                memory_session_id=memory_session_id,
+            )
             print(f" {BRIGHT_GREEN}✓ Agents reloaded with updated context paths{RESET}", flush=True)
             print()

@@ -1982,7 +2254,8 @@ async def run_interactive_mode(
     conversation_history = []

     # Session management for multi-turn filesystem support
-    session_id = None
+    # Use memory_session_id (unified with memory system) if provided, otherwise create later
+    session_id = memory_session_id
     current_turn = 0
     session_storage = kwargs.get("orchestrator", {}).get("session_storage", "sessions")

@@ -2029,8 +2302,13 @@ async def run_interactive_mode(
                     new_turn_config = {"path": str(latest_turn_workspace.resolve()), "permission": "read"}
                     backend_config["context_paths"] = existing_context_paths + [new_turn_config]

-                # Recreate agents from modified config
-                agents = create_agents_from_config(modified_config, orchestrator_cfg, config_path=config_path)
+                # Recreate agents from modified config (use same session)
+                agents = create_agents_from_config(
+                    modified_config,
+                    orchestrator_cfg,
+                    config_path=config_path,
+                    memory_session_id=session_id,
+                )
                 logger.info(f"[CLI] Successfully recreated {len(agents)} agents with turn {current_turn} path as read-only context")

             question = input(f"\n{BRIGHT_BLUE}👤 User:{RESET} ").strip()
@@ -2322,7 +2600,28 @@ async def main(args):
                 ' agent_temporary_workspace: "your_temp_dir" # Directory for temporary agent workspaces',
             )

-    agents = create_agents_from_config(config, orchestrator_cfg, config_path=str(resolved_path) if resolved_path else None)
+    # Create unified session ID for memory system (before creating agents)
+    # This ensures memory is isolated per session and unifies orchestrator + memory sessions
+    memory_session_id = None
+    if args.question:
+        # Single question mode: Create temp session per run
+        from datetime import datetime
+
+        memory_session_id = f"temp_{datetime.now().strftime('%Y%m%d_%H%M%S')}"
+        logger.info(f"📝 Created temp session for single-question mode: {memory_session_id}")
+    else:
+        # Interactive mode: Create session now (will be reused by orchestrator)
+        from datetime import datetime
+
+        memory_session_id = f"session_{datetime.now().strftime('%Y%m%d_%H%M%S')}"
+        logger.info(f"📝 Created session for interactive mode: {memory_session_id}")
+
+    agents = create_agents_from_config(
+        config,
+        orchestrator_cfg,
+        config_path=str(resolved_path) if resolved_path else None,
+        memory_session_id=memory_session_id,
+    )

     if not agents:
         raise ConfigurationError("No agents configured")
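
The session ID created in this hunk is passed to `create_agents_from_config`, where it takes precedence over `session_name` from the YAML config (`memory_session_id or pm_config.get("session_name")`). A minimal sketch of both steps, using the ID format shown in the diff. The helper names `make_session_id` and `resolve_session_name` are hypothetical, and the `now` parameter is added here only to make the example reproducible.

```python
from datetime import datetime
from typing import Any, Dict, Optional


def make_session_id(single_question: bool, now: Optional[datetime] = None) -> str:
    """Build a timestamped ID: 'temp_' for one-shot runs, 'session_' for interactive mode."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    prefix = "temp" if single_question else "session"
    return f"{prefix}_{stamp}"


def resolve_session_name(memory_session_id: Optional[str], pm_config: Dict[str, Any]) -> Optional[str]:
    """The CLI-generated ID wins over YAML session_name; None is logged as 'cross-session'."""
    return memory_session_id or pm_config.get("session_name")


fixed = datetime(2025, 1, 2, 3, 4, 5)  # fixed time so the output is reproducible
temp_id = make_session_id(True, fixed)
interactive_id = make_session_id(False, fixed)
print(temp_id)         # temp_20250102_030405
print(interactive_id)  # session_20250102_030405
print(resolve_session_name(None, {"session_name": "from_yaml"}))     # from_yaml
print(resolve_session_name(temp_id, {"session_name": "from_yaml"}))  # temp_20250102_030405
```

Two details follow from the diff as shown. `main()` assigns an ID in both branches, so the YAML `session_name` only applies when `create_agents_from_config` is called without `memory_session_id`. The timestamp has one-second resolution, so two runs started within the same second would share an ID.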
@@ -2358,9 +2657,17 @@ async def main(args):
             # print(f"\n{BRIGHT_GREEN}Final Response:{RESET}", flush=True)
             # print(f"{response}", flush=True)
         else:
-            # Pass the config path to interactive mode
+            # Pass the config path and session_id to interactive mode
             config_file_path = str(resolved_path) if args.config and resolved_path else None
-            await run_interactive_mode(agents, ui_config, original_config=config, orchestrator_cfg=orchestrator_cfg, config_path=config_file_path, **kwargs)
+            await run_interactive_mode(
+                agents,
+                ui_config,
+                original_config=config,
+                orchestrator_cfg=orchestrator_cfg,
+                config_path=config_file_path,
+                memory_session_id=memory_session_id,
+                **kwargs,
+            )
     finally:
         # Cleanup all agents' filesystem managers (including Docker containers)
         for agent_id, agent in agents.items():
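
The shared Qdrant client set up in `create_agents_from_config` picks its connection settings from `memory.persistent_memory.qdrant.mode`. The sketch below mirrors only that selection logic as it appears in the diff and returns the keyword arguments instead of constructing a `QdrantClient`, so it runs without `qdrant_client` installed. The function name `qdrant_client_kwargs` and the sample configs are hypothetical.

```python
from typing import Any, Dict, Tuple


def qdrant_client_kwargs(config: Dict[str, Any], agent_count: int) -> Tuple[Dict[str, Any], bool]:
    """Return (keyword arguments for QdrantClient, whether to warn about concurrency)."""
    pm_config = config.get("memory", {}).get("persistent_memory", {})
    qdrant_config = pm_config.get("qdrant", {})

    if qdrant_config.get("mode", "local") == "server":
        # Server mode: recommended in the diff for multi-agent setups
        kwargs = {
            "host": qdrant_config.get("host", "localhost"),
            "port": qdrant_config.get("port", 6333),
        }
        return kwargs, False

    # Local file-based mode: the path is read from persistent_memory, not from qdrant
    kwargs = {"path": pm_config.get("path", ".massgen/qdrant")}
    return kwargs, agent_count > 1


# Hypothetical configs, for illustration only
server_cfg = {"memory": {"enabled": True, "persistent_memory": {"enabled": True, "qdrant": {"mode": "server"}}}}
server_result = qdrant_client_kwargs(server_cfg, agent_count=2)
local_result = qdrant_client_kwargs({}, agent_count=2)
print(server_result)
print(local_result)
```

In the diff, local mode with more than one agent only logs a warning and still creates the client; the boolean here stands in for that warning.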
massgen/configs/README.md CHANGED
@@ -227,53 +227,104 @@ Most configurations use environment variables for API keys:so
227
227
 
228
228
  ## Release History & Examples
229
229
 
230
- ### v0.1.3 - Latest
231
- **New Features:** Post-Evaluation Workflow, Custom Multimodal Understanding Tools, Docker Sudo Mode
230
+ ### v0.1.5 - Latest
231
+ **New Features:** Memory System with Semantic Retrieval
232
232
 
233
233
  **Configuration Files:**
234
- - `configs/tools/custom_tools/multimodal_tools/understand_image.yaml` - Image analysis configuration
235
- - `configs/tools/custom_tools/multimodal_tools/understand_audio.yaml` - Audio transcription configuration
236
- - `configs/tools/custom_tools/multimodal_tools/understand_video.yaml` - Video analysis configuration
237
- - `configs/tools/custom_tools/multimodal_tools/understand_file.yaml` - Document processing configuration
234
+ - `gpt5mini_gemini_context_window_management.yaml` - Multi-agent with automatic context compression
235
+ - `gpt5mini_gemini_research_to_implementation.yaml` - **Research-to-implementation workflow** (featured in case study)
236
+ - `gpt5mini_high_reasoning_gemini.yaml` - High reasoning agents with memory integration
237
+ - `gpt5mini_gemini_baseline_research_to_implementation.yaml` - Baseline research workflow
238
+ - `single_agent_compression_test.yaml` - Testing context compression behavior
239
+
240
+ **Documentation & Case Studies:**
241
+ - `docs/source/user_guide/memory.rst` - Complete memory system user guide
242
+ - `docs/source/examples/case_studies/multi-turn-persistent-memory.md` - **Memory case study with demo video**
243
+ - Memory design decisions and architecture documentation
244
+ - API reference for PersistentMemory, ConversationMemory, and ContextMonitor
238
245
 
239
- **Documentation:**
240
- - `massgen/tool/docs/multimodal_tools.md` - Complete 779-line multimodal tools guide
241
- - `docs/source/user_guide/multimodal.rst` - Updated multimodal documentation with custom tools
242
- - `docs/source/user_guide/code_execution.rst` - Enhanced with 98 lines documenting sudo mode
243
- - `massgen/docker/README.md` - Updated Docker documentation with sudo mode instructions
246
+ **Key Features:**
247
+ - **Long-Term Memory**: Semantic storage via mem0 with vector database integration
248
+ - **Context Compression**: Automatic compression when approaching token limits
249
+ - **Cross-Agent Sharing**: Agents learn from each other's experiences
250
+ - **Session Management**: Memory persistence across conversations
244
251
 
245
- **Case Study:**
246
- - [Multimodal Video Understanding](../../docs/case_studies/multimodal-case-study-video-analysis.md)
252
+ **Try it:**
253
+ ```bash
254
+ # Install or upgrade
255
+ pip install --upgrade massgen
256
+
257
+ # Multi-agent collaboration with context compression
258
+ massgen --config @examples/memory/gpt5mini_gemini_context_window_management \
259
+ "Analyze the MassGen codebase comprehensively. Create an architecture document that explains: (1) Core components and their responsibilities, (2) How different modules interact, (3) Key design patterns used, (4) Main entry points and request flows. Read > 30 files to build a complete understanding."
247
260
 
248
- **Example Resources:**
249
- - `configs/resources/v0.1.3-example/multimodality.jpg` - Image example
250
- - `configs/resources/v0.1.3-example/Sherlock_Holmes.mp3` - Audio example
251
- - `configs/resources/v0.1.3-example/oppenheimer_trailer_1920.mp4` - Video example
252
- - `configs/resources/v0.1.3-example/TUMIX.pdf` - PDF document example
261
+ # Research-to-implementation workflow with memory persistence
262
+ # Prerequisites: Start Qdrant and crawl4ai Docker containers
263
+ docker run -d -p 6333:6333 -p 6334:6334 \
264
+ -v $(pwd)/.massgen/qdrant_storage:/qdrant/storage:z qdrant/qdrant
265
+ docker run -d -p 11235:11235 --name crawl4ai --shm-size=1g unclecode/crawl4ai:latest
266
+
267
+ # Session 1 - Research phase:
268
+ massgen --config @examples/memory/gpt5mini_gemini_research_to_implementation \
269
+ "Use crawl4ai to research the latest multi-agent AI papers and techniques from 2025. Focus on: coordination mechanisms, voting strategies, tool-use patterns, and architectural innovations."
270
+
271
+ # Session 2 - Implementation analysis (continue in same session):
272
+ # "Based on the multi-agent research from earlier, which techniques should we implement in MassGen to make it more state-of-the-art? Consider MassGen's current architecture and what would be most impactful."
273
+
274
+ → See [Multi-Turn Persistent Memory Case Study](../../docs/source/examples/case_studies/multi-turn-persistent-memory.md) for detailed analysis
275
+
276
+ # Test automatic context compression
277
+ massgen --config @examples/memory/single_agent_compression_test \
278
+ "Analyze the MassGen codebase comprehensively. Create an architecture document that explains: (1) Core components and their responsibilities, (2) How different modules interact, (3) Key design patterns used, (4) Main entry points and request flows. Read > 30 files to build a complete understanding."
279
+ ```
280
+
281
+ ### v0.1.4
282
+ **New Features:** Multimodal Generation Tools, Binary File Protection, Crawl4AI Integration
283
+
284
+ **Configuration Files:**
285
+ - `text_to_image_generation_single.yaml` / `text_to_image_generation_multi.yaml` - Image generation
286
+ - `text_to_video_generation_single.yaml` / `text_to_video_generation_multi.yaml` - Video generation
287
+ - `text_to_speech_generation_single.yaml` / `text_to_speech_generation_multi.yaml` - Audio generation
288
+ - `text_to_file_generation_single.yaml` / `text_to_file_generation_multi.yaml` - Document generation
289
+ - `crawl4ai_example.yaml` - Web scraping configuration
253
290
 
254
291
  **Key Features:**
255
- - **Post-Evaluation Tools**: Submit and restart capabilities for winning agents with confidence assessments
256
- - **Multimodal Understanding**: Analyze images, audio, video, and documents using GPT-4.1
257
- - **Docker Sudo Mode**: Execute privileged commands in containerized environments
258
- - **Config Builder**: Improved workflow with auto-detection and better provider handling
292
+ - **Generation Tools**: Create images, videos, audio, and documents using OpenAI APIs
293
+ - **Binary File Protection**: Automatic blocking prevents text tools from reading 40+ binary file types
294
+ - **Web Scraping**: Crawl4AI integration for intelligent content extraction
295
+ - **Enhanced Security**: Smart tool suggestions guide users to appropriate specialized tools
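
The binary file protection and tool suggestion features listed above can be pictured as an extension lookup that runs before a text tool reads a file. The sketch below is an assumption about the general shape of such a check, not MassGen's implementation: the extension table is a small illustrative subset (the release notes mention 40+ types without listing them), the mapping to the `understand_*` tool names is a guess based on the v0.1.3 config names, and `check_text_read` is a hypothetical function.

```python
from pathlib import Path
from typing import Tuple

# Illustrative subset only; the real list of blocked types is not given in the notes.
SUGGESTED_TOOL_BY_EXTENSION = {
    ".jpg": "understand_image",
    ".png": "understand_image",
    ".mp3": "understand_audio",
    ".wav": "understand_audio",
    ".mp4": "understand_video",
    ".pdf": "understand_file",
    ".docx": "understand_file",
}


def check_text_read(path: str) -> Tuple[bool, str]:
    """Return (allowed, message) for a text-tool read of the given path."""
    ext = Path(path).suffix.lower()  # case-insensitive match on the extension
    tool = SUGGESTED_TOOL_BY_EXTENSION.get(ext)
    if tool is None:
        return True, "ok"
    # Block the read and point the caller at a specialized tool instead.
    return False, f"{ext} is a binary format; try the {tool} tool instead"


if __name__ == "__main__":
    for candidate in ["README.md", "Sherlock_Holmes.mp3", "TUMIX.pdf"]:
        print(candidate, check_text_read(candidate))
```

Keying the lookup on the extension keeps the check cheap, at the cost of missing binary files with misleading names; a content-based check would be a separate design choice.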
259
296
 
260
297
  **Try it:**
261
298
  ```bash
262
- # Install or upgrade
263
- pip install --upgrade massgen
299
+ # Generate an image from text
300
+ massgen --config @examples/tools/custom_tools/multimodal_tools/text_to_image_generation_single \
301
+ "Please generate an image of a cat in space."
302
+
303
+ # Generate a video from text
304
+ massgen --config @examples/tools/custom_tools/multimodal_tools/text_to_video_generation_single \
305
+ "Generate a 4 seconds video with neon-lit alley at night, light rain, slow push-in, cinematic."
306
+
307
+ # Generate documents (PDF, DOCX, etc.)
308
+ massgen --config @examples/tools/custom_tools/multimodal_tools/text_to_file_generation_single \
309
+ "Please generate a comprehensive technical report about the latest developments in Large Language Models (LLMs)."
310
+ ```
311
+
312
+ ### v0.1.3
313
+ **New Features:** Post-Evaluation Workflow, Custom Multimodal Understanding Tools, Docker Sudo Mode
264
314
 
315
+ **Configuration Files:**
316
+ - `understand_image.yaml`, `understand_audio.yaml`, `understand_video.yaml`, `understand_file.yaml`
317
+
318
+ **Key Features:**
319
+ - **Post-Evaluation Tools**: Submit and restart capabilities for winning agents
320
+ - **Multimodal Understanding**: Analyze images, audio, video, and documents
321
+ - **Docker Sudo Mode**: Execute privileged commands in containers
322
+
323
+ **Try it:**
324
+ ```bash
265
325
  # Try multimodal image understanding
266
- # (Requires OPENAI_API_KEY in .env)
267
326
  massgen --config @examples/tools/custom_tools/multimodal_tools/understand_image \
268
327
  "Please summarize the content in this image."
269
-
270
- # Try multimodal audio understanding
271
- massgen --config @examples/tools/custom_tools/multimodal_tools/understand_audio \
272
- "Please summarize the content in this audio."
273
-
274
- # Try multimodal video understanding
275
- massgen --config @examples/tools/custom_tools/multimodal_tools/understand_video \
276
- "What's happening in this video?"
277
328
  ```
278
329
 
279
330
  ### v0.1.2
@@ -284,7 +335,7 @@ massgen --config @examples/tools/custom_tools/multimodal_tools/understand_video
284
335
  - `configs/basic/multi/three_agents_default.yaml` - Updated with Grok-4-fast model
285
336
 
286
337
  **Documentation:**
287
- - `docs/case_studies/INTELLIGENT_PLANNING_MODE.md` - Complete intelligent planning mode guide
338
+ - `docs/dev_notes/intelligent_planning_mode.md` - Complete intelligent planning mode guide
288
339
 
289
340
  **Key Features:**
290
341
  - **Intelligent Planning Mode**: Automatic analysis of question irreversibility for dynamic MCP tool blocking
@@ -392,7 +443,7 @@ massgen --config @examples/tools/code-execution/docker_with_resource_limits \
392
443
  - `massgen/configs/basic/single/single_gpt4o_video_generation.yaml` - Video generation with OpenAI Sora-2
393
444
 
394
445
  **Case Study:**
395
- - [Universal Code Execution via MCP](../../docs/case_studies/universal-code-execution-mcp.md)
446
+ - [Universal Code Execution via MCP](../../docs/source/examples/case_studies/universal-code-execution-mcp.md)
396
447
 
397
448
  **Key Features:**
398
449
  - Universal `execute_command` tool works across Claude, Gemini, OpenAI (Response API), and Chat Completions providers (Grok, ZAI, etc.)
@@ -465,7 +516,7 @@ massgen --config @examples/tools/filesystem/cc_gpt5_gemini_filesystem \
465
516
  - New `FileOperationTracker` class for read-before-delete enforcement
466
517
  - Enhanced PathPermissionManager with operation tracking methods
467
518
 
468
- **Case Study:** [MCP Planning Mode](../../docs/case_studies/mcp-planning-mode.md)
519
+ **Case Study:** [MCP Planning Mode](../../docs/source/examples/case_studies/mcp-planning-mode.md)
469
520
 
470
521
  **Try it:**
471
522
  ```bash
@@ -492,7 +543,7 @@ massgen --config @examples/tools/planning/five_agents_twitter_mcp_planning_mode
492
543
  - New `ExternalAgentBackend` class bridging MassGen with external frameworks
493
544
  - Multiple code executor types: LocalCommandLineCodeExecutor, DockerCommandLineCodeExecutor, JupyterCodeExecutor, YepCodeCodeExecutor
494
545
 
495
- **Case Study:** [AG2 Framework Integration](../../docs/case_studies/ag2-framework-integration.md)
546
+ **Case Study:** [AG2 Framework Integration](../../docs/source/examples/case_studies/ag2-framework-integration.md)
496
547
 
497
548
  **Try it:**
498
549
  ```bash
@@ -561,7 +612,7 @@ massgen --config @examples/tools/filesystem/gemini_gpt5nano_file_context_path \
561
612
  - Automatic `.massgen` directory management for persistent conversation context
562
613
  - Enhanced path permissions with `will_be_writable` flag and smart exclusion patterns
563
614
 
564
- **Case Study:** [Multi-Turn Filesystem Support](../../docs/case_studies/multi-turn-filesystem-support.md)
615
+ **Case Study:** [Multi-Turn Filesystem Support](../../docs/source/examples/case_studies/multi-turn-filesystem-support.md)
565
616
  ```bash
566
617
  # Turn 1 - Initial creation
567
618
  Turn 1: Make a website about Bob Dylan
@@ -599,7 +650,7 @@ massgen --config @examples/basic/multi/two_qwen_vllm \
599
650
  - All configs now organized by provider & use case (basic/, providers/, tools/, teams/)
600
651
  - Use the same configs as v0.0.21 for compatibility, now with improved performance
601
652
 
602
- **Case Study:** [Advanced Filesystem with User Context Path Support](../../docs/case_studies/v0.0.21-v0.0.22-filesystem-permissions.md)
653
+ **Case Study:** [Advanced Filesystem with User Context Path Support](../../docs/source/examples/case_studies/v0.0.21-v0.0.22-filesystem-permissions.md)
603
654
  ```bash
604
655
  # Multi-agent collaboration with granular filesystem permissions
605
656
  massgen --config @examples/tools/filesystem/gpt5mini_cc_fs_context_path "Enhance the website in massgen/configs/resources with: 1) A dark/light theme toggle with smooth transitions, 2) An interactive feature that helps users engage with the blog content (your choice - could be search, filtering by topic, reading time estimates, social sharing, reactions, etc.), and 3) Visual polish with CSS animations or transitions that make the site feel more modern and responsive. Use vanilla JavaScript and be creative with the implementation details."
@@ -645,7 +696,7 @@ massgen --config @examples/tools/mcp/gpt5_nano_mcp_example \
645
696
 
646
697
  ### v0.0.16
647
698
  **New Features:** Unified Filesystem Support with MCP Integration
648
- **Case Study:** [Cross-Backend Collaboration with Gemini MCP Filesystem](../../docs/case_studies/unified-filesystem-mcp-integration.md)
699
+ **Case Study:** [Cross-Backend Collaboration with Gemini MCP Filesystem](../../docs/source/examples/case_studies/unified-filesystem-mcp-integration.md)
649
700
  ```bash
650
701
  # Gemini and Claude Code agents with unified filesystem via MCP
651
702
  massgen --config @examples/tools/mcp/gemini_mcp_filesystem_test_with_claude_code "Create a presentation that teaches a reinforcement learning algorithm and output it in LaTeX Beamer format. No figures should be added."
@@ -658,7 +709,7 @@ massgen --config @examples/tools/mcp/gemini_mcp_filesystem_test_with_claude_code
658
709
 
659
710
  ### v0.0.12 - v0.0.14
660
711
  **New Features:** Enhanced Logging and Workspace Management
661
- **Case Study:** [Claude Code Workspace Management with Comprehensive Logging](../../docs/case_studies/claude-code-workspace-management.md)
712
+ **Case Study:** [Claude Code Workspace Management with Comprehensive Logging](../../docs/source/examples/case_studies/claude-code-workspace-management.md)
662
713
  ```bash
663
714
  # Multi-agent Claude Code collaboration with enhanced workspace isolation
664
715
  massgen --config @examples/tools/filesystem/claude_code_context_sharing "Create a website about a diverse set of fun facts about LLMs, placing the output in one index.html file"