camel-ai 0.2.59__py3-none-any.whl → 0.2.82__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Potentially problematic release.
This version of camel-ai might be problematic.
- camel/__init__.py +3 -3
- camel/agents/__init__.py +2 -2
- camel/agents/_types.py +9 -4
- camel/agents/_utils.py +40 -2
- camel/agents/base.py +2 -2
- camel/agents/chat_agent.py +5012 -902
- camel/agents/critic_agent.py +2 -2
- camel/agents/deductive_reasoner_agent.py +56 -56
- camel/agents/embodied_agent.py +2 -2
- camel/agents/knowledge_graph_agent.py +20 -20
- camel/agents/mcp_agent.py +39 -36
- camel/agents/multi_hop_generator_agent.py +3 -3
- camel/agents/programmed_agent_instruction.py +2 -2
- camel/agents/repo_agent.py +4 -3
- camel/agents/role_assignment_agent.py +2 -2
- camel/agents/search_agent.py +2 -2
- camel/agents/task_agent.py +2 -2
- camel/agents/tool_agents/__init__.py +2 -2
- camel/agents/tool_agents/base.py +2 -2
- camel/agents/tool_agents/hugging_face_tool_agent.py +3 -3
- camel/benchmarks/__init__.py +2 -2
- camel/benchmarks/apibank.py +5 -5
- camel/benchmarks/apibench.py +2 -2
- camel/benchmarks/base.py +2 -2
- camel/benchmarks/browsecomp.py +44 -33
- camel/benchmarks/gaia.py +17 -13
- camel/benchmarks/mock_website/README.md +94 -0
- camel/benchmarks/mock_website/mock_web.py +299 -0
- camel/benchmarks/mock_website/requirements.txt +3 -0
- camel/benchmarks/mock_website/shopping_mall/app.py +465 -0
- camel/benchmarks/mock_website/task.json +104 -0
- camel/benchmarks/nexus.py +3 -3
- camel/benchmarks/ragbench.py +2 -2
- camel/bots/__init__.py +2 -2
- camel/bots/discord/__init__.py +2 -2
- camel/bots/discord/discord_app.py +2 -2
- camel/bots/discord/discord_installation.py +2 -2
- camel/bots/discord/discord_store.py +3 -3
- camel/bots/slack/__init__.py +2 -2
- camel/bots/slack/models.py +4 -4
- camel/bots/slack/slack_app.py +2 -2
- camel/bots/telegram_bot.py +2 -2
- camel/configs/__init__.py +26 -2
- camel/configs/aihubmix_config.py +90 -0
- camel/configs/aiml_config.py +2 -2
- camel/configs/amd_config.py +70 -0
- camel/configs/anthropic_config.py +8 -7
- camel/configs/base_config.py +2 -2
- camel/configs/bedrock_config.py +5 -3
- camel/configs/cerebras_config.py +98 -0
- camel/configs/cohere_config.py +3 -3
- camel/configs/cometapi_config.py +106 -0
- camel/configs/crynux_config.py +94 -0
- camel/configs/deepseek_config.py +9 -8
- camel/configs/gemini_config.py +6 -4
- camel/configs/groq_config.py +6 -4
- camel/configs/internlm_config.py +6 -4
- camel/configs/litellm_config.py +2 -2
- camel/configs/lmstudio_config.py +6 -4
- camel/configs/minimax_config.py +95 -0
- camel/configs/mistral_config.py +3 -3
- camel/configs/modelscope_config.py +5 -3
- camel/configs/moonshot_config.py +2 -2
- camel/configs/nebius_config.py +105 -0
- camel/configs/netmind_config.py +2 -2
- camel/configs/novita_config.py +2 -2
- camel/configs/nvidia_config.py +2 -2
- camel/configs/ollama_config.py +2 -2
- camel/configs/openai_config.py +8 -3
- camel/configs/openrouter_config.py +6 -4
- camel/configs/ppio_config.py +2 -2
- camel/configs/qianfan_config.py +85 -0
- camel/configs/qwen_config.py +2 -2
- camel/configs/reka_config.py +3 -3
- camel/configs/samba_config.py +8 -6
- camel/configs/sglang_config.py +2 -2
- camel/configs/siliconflow_config.py +2 -2
- camel/configs/togetherai_config.py +2 -2
- camel/configs/vllm_config.py +4 -2
- camel/configs/watsonx_config.py +2 -2
- camel/configs/yi_config.py +6 -4
- camel/configs/zhipuai_config.py +6 -4
- camel/{data_collector → data_collectors}/__init__.py +2 -2
- camel/{data_collector → data_collectors}/alpaca_collector.py +19 -10
- camel/{data_collector → data_collectors}/base.py +2 -2
- camel/{data_collector → data_collectors}/sharegpt_collector.py +3 -3
- camel/datagen/__init__.py +2 -2
- camel/datagen/cot_datagen.py +32 -37
- camel/datagen/evol_instruct/__init__.py +2 -2
- camel/datagen/evol_instruct/evol_instruct.py +2 -2
- camel/datagen/evol_instruct/scorer.py +24 -25
- camel/datagen/evol_instruct/templates.py +48 -48
- camel/datagen/self_improving_cot.py +5 -5
- camel/datagen/self_instruct/__init__.py +2 -2
- camel/datagen/self_instruct/filter/__init__.py +2 -2
- camel/datagen/self_instruct/filter/filter_function.py +2 -2
- camel/datagen/self_instruct/filter/filter_registry.py +2 -2
- camel/datagen/self_instruct/filter/instruction_filter.py +2 -2
- camel/datagen/self_instruct/self_instruct.py +2 -2
- camel/datagen/self_instruct/templates.py +47 -47
- camel/datagen/source2synth/__init__.py +2 -2
- camel/datagen/source2synth/data_processor.py +2 -2
- camel/datagen/source2synth/models.py +2 -2
- camel/datagen/source2synth/user_data_processor_config.py +2 -2
- camel/datahubs/__init__.py +2 -2
- camel/datahubs/base.py +2 -2
- camel/datahubs/huggingface.py +2 -2
- camel/datahubs/models.py +2 -2
- camel/datasets/__init__.py +2 -2
- camel/datasets/base_generator.py +41 -12
- camel/datasets/few_shot_generator.py +18 -18
- camel/datasets/models.py +3 -3
- camel/datasets/self_instruct_generator.py +2 -2
- camel/datasets/static_dataset.py +152 -2
- camel/embeddings/__init__.py +2 -2
- camel/embeddings/azure_embedding.py +2 -2
- camel/embeddings/base.py +2 -2
- camel/embeddings/gemini_embedding.py +2 -2
- camel/embeddings/jina_embedding.py +10 -3
- camel/embeddings/mistral_embedding.py +2 -2
- camel/embeddings/openai_compatible_embedding.py +2 -2
- camel/embeddings/openai_embedding.py +2 -2
- camel/embeddings/sentence_transformers_embeddings.py +4 -4
- camel/embeddings/together_embedding.py +2 -2
- camel/embeddings/vlm_embedding.py +11 -4
- camel/environments/__init__.py +14 -2
- camel/environments/models.py +2 -2
- camel/environments/multi_step.py +2 -2
- camel/environments/rlcards_env.py +860 -0
- camel/environments/single_step.py +30 -5
- camel/environments/tic_tac_toe.py +3 -3
- camel/extractors/__init__.py +2 -2
- camel/extractors/base.py +2 -2
- camel/extractors/python_strategies.py +2 -2
- camel/generators.py +2 -2
- camel/human.py +2 -2
- camel/interpreters/__init__.py +4 -2
- camel/interpreters/base.py +16 -3
- camel/interpreters/docker/Dockerfile +53 -7
- camel/interpreters/docker_interpreter.py +70 -11
- camel/interpreters/e2b_interpreter.py +59 -11
- camel/interpreters/internal_python_interpreter.py +81 -4
- camel/interpreters/interpreter_error.py +2 -2
- camel/interpreters/ipython_interpreter.py +23 -5
- camel/interpreters/microsandbox_interpreter.py +395 -0
- camel/interpreters/subprocess_interpreter.py +36 -4
- camel/loaders/__init__.py +17 -5
- camel/loaders/apify_reader.py +2 -2
- camel/loaders/base_io.py +2 -2
- camel/loaders/base_loader.py +85 -0
- camel/loaders/chunkr_reader.py +128 -93
- camel/loaders/crawl4ai_reader.py +2 -2
- camel/loaders/firecrawl_reader.py +6 -6
- camel/loaders/jina_url_reader.py +2 -2
- camel/loaders/markitdown.py +2 -2
- camel/loaders/mineru_extractor.py +2 -2
- camel/loaders/mistral_reader.py +148 -0
- camel/loaders/scrapegraph_reader.py +2 -2
- camel/loaders/unstructured_io.py +2 -2
- camel/logger.py +5 -5
- camel/memories/__init__.py +2 -2
- camel/memories/agent_memories.py +86 -3
- camel/memories/base.py +36 -2
- camel/memories/blocks/__init__.py +2 -2
- camel/memories/blocks/chat_history_block.py +126 -9
- camel/memories/blocks/vectordb_block.py +10 -3
- camel/memories/context_creators/__init__.py +2 -2
- camel/memories/context_creators/score_based.py +31 -239
- camel/memories/records.py +98 -13
- camel/messages/__init__.py +2 -2
- camel/messages/base.py +193 -46
- camel/messages/conversion/__init__.py +2 -2
- camel/messages/conversion/alpaca.py +2 -2
- camel/messages/conversion/conversation_models.py +2 -2
- camel/messages/conversion/sharegpt/__init__.py +2 -2
- camel/messages/conversion/sharegpt/function_call_formatter.py +2 -2
- camel/messages/conversion/sharegpt/hermes/__init__.py +2 -2
- camel/messages/conversion/sharegpt/hermes/hermes_function_formatter.py +2 -2
- camel/messages/func_message.py +54 -17
- camel/models/__init__.py +18 -2
- camel/models/_utils.py +3 -3
- camel/models/aihubmix_model.py +83 -0
- camel/models/aiml_model.py +11 -18
- camel/models/amd_model.py +101 -0
- camel/models/anthropic_model.py +127 -20
- camel/models/aws_bedrock_model.py +12 -35
- camel/models/azure_openai_model.py +263 -63
- camel/models/base_audio_model.py +5 -3
- camel/models/base_model.py +195 -26
- camel/models/cerebras_model.py +83 -0
- camel/models/cohere_model.py +81 -21
- camel/models/cometapi_model.py +83 -0
- camel/models/crynux_model.py +87 -0
- camel/models/deepseek_model.py +61 -59
- camel/models/fish_audio_model.py +8 -2
- camel/models/gemini_model.py +439 -30
- camel/models/groq_model.py +11 -19
- camel/models/internlm_model.py +11 -18
- camel/models/litellm_model.py +94 -34
- camel/models/lmstudio_model.py +17 -20
- camel/models/minimax_model.py +83 -0
- camel/models/mistral_model.py +84 -19
- camel/models/model_factory.py +49 -6
- camel/models/model_manager.py +33 -11
- camel/models/modelscope_model.py +13 -193
- camel/models/moonshot_model.py +195 -21
- camel/models/nebius_model.py +83 -0
- camel/models/nemotron_model.py +19 -9
- camel/models/netmind_model.py +11 -18
- camel/models/novita_model.py +11 -18
- camel/models/nvidia_model.py +11 -18
- camel/models/ollama_model.py +14 -21
- camel/models/openai_audio_models.py +2 -2
- camel/models/openai_compatible_model.py +234 -27
- camel/models/openai_model.py +255 -39
- camel/models/openrouter_model.py +11 -19
- camel/models/ppio_model.py +11 -18
- camel/models/qianfan_model.py +89 -0
- camel/models/qwen_model.py +13 -193
- camel/models/reka_model.py +90 -21
- camel/models/reward/__init__.py +2 -2
- camel/models/reward/base_reward_model.py +2 -2
- camel/models/reward/evaluator.py +2 -2
- camel/models/reward/nemotron_model.py +2 -2
- camel/models/reward/skywork_model.py +2 -2
- camel/models/samba_model.py +117 -49
- camel/models/sglang_model.py +162 -42
- camel/models/siliconflow_model.py +12 -35
- camel/models/stub_model.py +10 -7
- camel/models/togetherai_model.py +11 -18
- camel/models/vllm_model.py +10 -18
- camel/models/volcano_model.py +16 -20
- camel/models/watsonx_model.py +69 -19
- camel/models/yi_model.py +11 -18
- camel/models/zhipuai_model.py +70 -18
- camel/parsers/__init__.py +18 -0
- camel/parsers/mcp_tool_call_parser.py +176 -0
- camel/personas/__init__.py +2 -2
- camel/personas/persona.py +2 -2
- camel/personas/persona_hub.py +2 -2
- camel/prompts/__init__.py +2 -2
- camel/prompts/ai_society.py +2 -2
- camel/prompts/base.py +2 -2
- camel/prompts/code.py +2 -2
- camel/prompts/evaluation.py +2 -2
- camel/prompts/generate_text_embedding_data.py +2 -2
- camel/prompts/image_craft.py +2 -2
- camel/prompts/misalignment.py +2 -2
- camel/prompts/multi_condition_image_craft.py +2 -2
- camel/prompts/object_recognition.py +2 -2
- camel/prompts/persona_hub.py +3 -3
- camel/prompts/prompt_templates.py +2 -2
- camel/prompts/role_description_prompt_template.py +2 -2
- camel/prompts/solution_extraction.py +8 -8
- camel/prompts/task_prompt_template.py +2 -2
- camel/prompts/translation.py +2 -2
- camel/prompts/video_description_prompt.py +3 -3
- camel/responses/__init__.py +2 -2
- camel/responses/agent_responses.py +2 -2
- camel/retrievers/__init__.py +2 -2
- camel/retrievers/auto_retriever.py +23 -3
- camel/retrievers/base.py +2 -2
- camel/retrievers/bm25_retriever.py +3 -4
- camel/retrievers/cohere_rerank_retriever.py +2 -2
- camel/retrievers/hybrid_retrival.py +4 -4
- camel/retrievers/vector_retriever.py +2 -2
- camel/runtimes/Dockerfile.multi-toolkit +90 -0
- camel/{runtime → runtimes}/__init__.py +2 -2
- camel/runtimes/api.py +153 -0
- camel/{runtime → runtimes}/base.py +2 -2
- camel/{runtime → runtimes}/configs.py +13 -13
- camel/{runtime → runtimes}/daytona_runtime.py +18 -19
- camel/{runtime → runtimes}/docker_runtime.py +13 -13
- camel/{runtime → runtimes}/llm_guard_runtime.py +28 -28
- camel/{runtime → runtimes}/remote_http_runtime.py +12 -12
- camel/{runtime → runtimes}/ubuntu_docker_runtime.py +3 -3
- camel/{runtime → runtimes}/utils/__init__.py +2 -2
- camel/{runtime → runtimes}/utils/function_risk_toolkit.py +2 -2
- camel/{runtime → runtimes}/utils/ignore_risk_toolkit.py +2 -2
- camel/schemas/__init__.py +2 -2
- camel/schemas/base.py +2 -2
- camel/schemas/openai_converter.py +3 -3
- camel/schemas/outlines_converter.py +2 -2
- camel/services/agent_openapi_server.py +380 -0
- camel/societies/__init__.py +4 -2
- camel/societies/babyagi_playing.py +2 -2
- camel/societies/role_playing.py +201 -80
- camel/societies/workforce/__init__.py +10 -3
- camel/societies/workforce/base.py +9 -5
- camel/societies/workforce/events.py +143 -0
- camel/societies/workforce/prompts.py +258 -33
- camel/societies/workforce/role_playing_worker.py +95 -30
- camel/societies/workforce/single_agent_worker.py +659 -30
- camel/societies/workforce/structured_output_handler.py +512 -0
- camel/societies/workforce/task_channel.py +182 -38
- camel/societies/workforce/utils.py +784 -18
- camel/societies/workforce/worker.py +96 -28
- camel/societies/workforce/workflow_memory_manager.py +1746 -0
- camel/societies/workforce/workforce.py +5730 -366
- camel/societies/workforce/workforce_callback.py +103 -0
- camel/societies/workforce/workforce_logger.py +647 -0
- camel/societies/workforce/workforce_metrics.py +33 -0
- camel/storages/__init__.py +10 -2
- camel/storages/graph_storages/__init__.py +2 -2
- camel/storages/graph_storages/base.py +2 -2
- camel/storages/graph_storages/graph_element.py +2 -2
- camel/storages/graph_storages/nebula_graph.py +4 -4
- camel/storages/graph_storages/neo4j_graph.py +7 -7
- camel/storages/key_value_storages/__init__.py +2 -2
- camel/storages/key_value_storages/base.py +2 -2
- camel/storages/key_value_storages/in_memory.py +2 -2
- camel/storages/key_value_storages/json.py +17 -4
- camel/storages/key_value_storages/mem0_cloud.py +50 -49
- camel/storages/key_value_storages/redis.py +2 -2
- camel/storages/object_storages/__init__.py +2 -2
- camel/storages/object_storages/amazon_s3.py +2 -2
- camel/storages/object_storages/azure_blob.py +2 -2
- camel/storages/object_storages/base.py +2 -2
- camel/storages/object_storages/google_cloud.py +3 -3
- camel/storages/vectordb_storages/__init__.py +12 -2
- camel/storages/vectordb_storages/base.py +2 -2
- camel/storages/vectordb_storages/chroma.py +731 -0
- camel/storages/vectordb_storages/faiss.py +712 -0
- camel/storages/vectordb_storages/milvus.py +2 -2
- camel/storages/vectordb_storages/oceanbase.py +16 -17
- camel/storages/vectordb_storages/pgvector.py +349 -0
- camel/storages/vectordb_storages/qdrant.py +6 -6
- camel/storages/vectordb_storages/surreal.py +372 -0
- camel/storages/vectordb_storages/tidb.py +11 -8
- camel/storages/vectordb_storages/weaviate.py +714 -0
- camel/tasks/__init__.py +2 -2
- camel/tasks/task.py +366 -27
- camel/tasks/task_prompt.py +3 -3
- camel/terminators/__init__.py +2 -2
- camel/terminators/base.py +2 -2
- camel/terminators/response_terminator.py +2 -2
- camel/terminators/token_limit_terminator.py +2 -2
- camel/toolkits/__init__.py +58 -10
- camel/toolkits/aci_toolkit.py +66 -21
- camel/toolkits/arxiv_toolkit.py +8 -8
- camel/toolkits/ask_news_toolkit.py +2 -2
- camel/toolkits/async_browser_toolkit.py +174 -575
- camel/toolkits/audio_analysis_toolkit.py +3 -3
- camel/toolkits/base.py +65 -7
- camel/toolkits/bohrium_toolkit.py +318 -0
- camel/toolkits/browser_toolkit.py +306 -566
- camel/toolkits/browser_toolkit_commons.py +568 -0
- camel/toolkits/code_execution.py +67 -11
- camel/toolkits/context_summarizer_toolkit.py +684 -0
- camel/toolkits/craw4ai_toolkit.py +93 -0
- camel/toolkits/dappier_toolkit.py +12 -8
- camel/toolkits/data_commons_toolkit.py +2 -2
- camel/toolkits/dingtalk.py +1135 -0
- camel/toolkits/earth_science_toolkit.py +5367 -0
- camel/toolkits/edgeone_pages_mcp_toolkit.py +49 -0
- camel/toolkits/excel_toolkit.py +910 -70
- camel/toolkits/file_toolkit.py +1402 -0
- camel/toolkits/function_tool.py +128 -20
- camel/toolkits/github_toolkit.py +148 -43
- camel/toolkits/gmail_toolkit.py +1839 -0
- camel/toolkits/google_calendar_toolkit.py +40 -6
- camel/toolkits/google_drive_mcp_toolkit.py +54 -0
- camel/toolkits/google_maps_toolkit.py +2 -2
- camel/toolkits/google_scholar_toolkit.py +2 -2
- camel/toolkits/human_toolkit.py +36 -12
- camel/toolkits/hybrid_browser_toolkit/__init__.py +18 -0
- camel/toolkits/hybrid_browser_toolkit/config_loader.py +185 -0
- camel/toolkits/hybrid_browser_toolkit/hybrid_browser_toolkit.py +246 -0
- camel/toolkits/hybrid_browser_toolkit/hybrid_browser_toolkit_ts.py +1973 -0
- camel/toolkits/hybrid_browser_toolkit/installer.py +203 -0
- camel/toolkits/hybrid_browser_toolkit/ts/package-lock.json +4589 -0
- camel/toolkits/hybrid_browser_toolkit/ts/package.json +33 -0
- camel/toolkits/hybrid_browser_toolkit/ts/src/browser-scripts.js +125 -0
- camel/toolkits/hybrid_browser_toolkit/ts/src/browser-session.ts +1929 -0
- camel/toolkits/hybrid_browser_toolkit/ts/src/config-loader.ts +233 -0
- camel/toolkits/hybrid_browser_toolkit/ts/src/hybrid-browser-toolkit.ts +589 -0
- camel/toolkits/hybrid_browser_toolkit/ts/src/index.ts +7 -0
- camel/toolkits/hybrid_browser_toolkit/ts/src/parent-child-filter.ts +226 -0
- camel/toolkits/hybrid_browser_toolkit/ts/src/snapshot-parser.ts +219 -0
- camel/toolkits/hybrid_browser_toolkit/ts/src/som-screenshot-injected.ts +543 -0
- camel/toolkits/hybrid_browser_toolkit/ts/src/types.ts +129 -0
- camel/toolkits/hybrid_browser_toolkit/ts/tsconfig.json +27 -0
- camel/toolkits/hybrid_browser_toolkit/ts/websocket-server.js +319 -0
- camel/toolkits/hybrid_browser_toolkit/ws_wrapper.py +1037 -0
- camel/toolkits/hybrid_browser_toolkit_py/__init__.py +17 -0
- camel/toolkits/hybrid_browser_toolkit_py/actions.py +575 -0
- camel/toolkits/hybrid_browser_toolkit_py/agent.py +311 -0
- camel/toolkits/hybrid_browser_toolkit_py/browser_session.py +787 -0
- camel/toolkits/hybrid_browser_toolkit_py/config_loader.py +490 -0
- camel/toolkits/hybrid_browser_toolkit_py/hybrid_browser_toolkit.py +2390 -0
- camel/toolkits/hybrid_browser_toolkit_py/snapshot.py +233 -0
- camel/toolkits/hybrid_browser_toolkit_py/stealth_script.js +0 -0
- camel/toolkits/hybrid_browser_toolkit_py/unified_analyzer.js +1043 -0
- camel/toolkits/image_analysis_toolkit.py +3 -3
- camel/toolkits/image_generation_toolkit.py +390 -0
- camel/toolkits/jina_reranker_toolkit.py +195 -79
- camel/toolkits/klavis_toolkit.py +7 -3
- camel/toolkits/linkedin_toolkit.py +2 -2
- camel/toolkits/markitdown_toolkit.py +104 -0
- camel/toolkits/math_toolkit.py +66 -12
- camel/toolkits/mcp_toolkit.py +841 -600
- camel/toolkits/memory_toolkit.py +7 -3
- camel/toolkits/meshy_toolkit.py +2 -2
- camel/toolkits/message_agent_toolkit.py +608 -0
- camel/toolkits/message_integration.py +724 -0
- camel/toolkits/mineru_toolkit.py +2 -2
- camel/toolkits/minimax_mcp_toolkit.py +195 -0
- camel/toolkits/networkx_toolkit.py +2 -2
- camel/toolkits/note_taking_toolkit.py +277 -0
- camel/toolkits/notion_mcp_toolkit.py +224 -0
- camel/toolkits/notion_toolkit.py +2 -2
- camel/toolkits/open_api_specs/biztoc/__init__.py +2 -2
- camel/toolkits/open_api_specs/biztoc/ai-plugin.json +1 -1
- camel/toolkits/open_api_specs/coursera/__init__.py +2 -2
- camel/toolkits/open_api_specs/create_qr_code/__init__.py +2 -2
- camel/toolkits/open_api_specs/klarna/__init__.py +2 -2
- camel/toolkits/open_api_specs/nasa_apod/__init__.py +2 -2
- camel/toolkits/open_api_specs/outschool/__init__.py +2 -2
- camel/toolkits/open_api_specs/outschool/ai-plugin.json +1 -1
- camel/toolkits/open_api_specs/outschool/openapi.yaml +1 -1
- camel/toolkits/open_api_specs/outschool/paths/__init__.py +2 -2
- camel/toolkits/open_api_specs/outschool/paths/get_classes.py +2 -2
- camel/toolkits/open_api_specs/outschool/paths/search_teachers.py +2 -2
- camel/toolkits/open_api_specs/security_config.py +2 -2
- camel/toolkits/open_api_specs/speak/__init__.py +2 -2
- camel/toolkits/open_api_specs/web_scraper/__init__.py +2 -2
- camel/toolkits/open_api_specs/web_scraper/ai-plugin.json +1 -1
- camel/toolkits/open_api_specs/web_scraper/paths/__init__.py +2 -2
- camel/toolkits/open_api_specs/web_scraper/paths/scraper.py +2 -2
- camel/toolkits/open_api_toolkit.py +2 -2
- camel/toolkits/openbb_toolkit.py +7 -3
- camel/toolkits/origene_mcp_toolkit.py +56 -0
- camel/toolkits/page_script.js +86 -74
- camel/toolkits/playwright_mcp_toolkit.py +27 -32
- camel/toolkits/pptx_toolkit.py +790 -0
- camel/toolkits/pubmed_toolkit.py +2 -2
- camel/toolkits/pulse_mcp_search_toolkit.py +2 -2
- camel/toolkits/pyautogui_toolkit.py +2 -2
- camel/toolkits/reddit_toolkit.py +2 -2
- camel/toolkits/resend_toolkit.py +168 -0
- camel/toolkits/retrieval_toolkit.py +2 -2
- camel/toolkits/screenshot_toolkit.py +213 -0
- camel/toolkits/search_toolkit.py +539 -146
- camel/toolkits/searxng_toolkit.py +2 -2
- camel/toolkits/semantic_scholar_toolkit.py +2 -2
- camel/toolkits/slack_toolkit.py +108 -58
- camel/toolkits/sql_toolkit.py +712 -0
- camel/toolkits/stripe_toolkit.py +2 -2
- camel/toolkits/sympy_toolkit.py +3 -3
- camel/toolkits/task_planning_toolkit.py +134 -0
- camel/toolkits/terminal_toolkit/__init__.py +18 -0
- camel/toolkits/terminal_toolkit/terminal_toolkit.py +1070 -0
- camel/toolkits/terminal_toolkit/utils.py +532 -0
- camel/toolkits/thinking_toolkit.py +3 -3
- camel/toolkits/twitter_toolkit.py +8 -3
- camel/toolkits/vertex_ai_veo_toolkit.py +590 -0
- camel/toolkits/video_analysis_toolkit.py +112 -29
- camel/toolkits/video_download_toolkit.py +22 -16
- camel/toolkits/weather_toolkit.py +2 -2
- camel/toolkits/web_deploy_toolkit.py +1219 -0
- camel/toolkits/wechat_official_toolkit.py +483 -0
- camel/toolkits/whatsapp_toolkit.py +2 -2
- camel/toolkits/wolfram_alpha_toolkit.py +53 -25
- camel/toolkits/zapier_toolkit.py +7 -3
- camel/types/__init__.py +4 -4
- camel/types/agents/__init__.py +2 -2
- camel/types/agents/tool_calling_record.py +6 -3
- camel/types/enums.py +454 -35
- camel/types/mcp_registries.py +2 -2
- camel/types/openai_types.py +4 -4
- camel/types/unified_model_type.py +43 -6
- camel/utils/__init__.py +20 -2
- camel/utils/async_func.py +2 -2
- camel/utils/chunker/__init__.py +2 -2
- camel/utils/chunker/base.py +2 -2
- camel/utils/chunker/code_chunker.py +2 -2
- camel/utils/chunker/uio_chunker.py +2 -2
- camel/utils/commons.py +65 -7
- camel/utils/constants.py +5 -2
- camel/utils/context_utils.py +1134 -0
- camel/utils/deduplication.py +2 -2
- camel/utils/filename.py +2 -2
- camel/utils/langfuse.py +258 -0
- camel/utils/mcp.py +140 -6
- camel/utils/mcp_client.py +1056 -0
- camel/utils/message_summarizer.py +148 -0
- camel/utils/response_format.py +2 -2
- camel/utils/token_counting.py +45 -22
- camel/utils/tool_result.py +44 -0
- camel/verifiers/__init__.py +2 -2
- camel/verifiers/base.py +2 -2
- camel/verifiers/math_verifier.py +2 -2
- camel/verifiers/models.py +2 -2
- camel/verifiers/physics_verifier.py +2 -2
- camel/verifiers/python_verifier.py +2 -2
- {camel_ai-0.2.59.dist-info → camel_ai-0.2.82.dist-info}/METADATA +349 -108
- camel_ai-0.2.82.dist-info/RECORD +507 -0
- {camel_ai-0.2.59.dist-info → camel_ai-0.2.82.dist-info}/WHEEL +1 -1
- {camel_ai-0.2.59.dist-info → camel_ai-0.2.82.dist-info}/licenses/LICENSE +1 -1
- camel/loaders/pandas_reader.py +0 -368
- camel/runtime/api.py +0 -97
- camel/toolkits/dalle_toolkit.py +0 -171
- camel/toolkits/file_write_toolkit.py +0 -395
- camel/toolkits/openai_agent_toolkit.py +0 -135
- camel/toolkits/terminal_toolkit.py +0 -1037
- camel_ai-0.2.59.dist-info/RECORD +0 -410
camel/benchmarks/__init__.py
CHANGED
@@ -1,4 +1,4 @@
-# ========= Copyright 2023-
+# ========= Copyright 2023-2025 @ CAMEL-AI.org. All Rights Reserved. =========
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
 # You may obtain a copy of the License at
@@ -10,7 +10,7 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-# ========= Copyright 2023-
+# ========= Copyright 2023-2025 @ CAMEL-AI.org. All Rights Reserved. =========
 
 from .apibank import APIBankBenchmark
 from .apibench import APIBenchBenchmark
camel/benchmarks/apibank.py
CHANGED
@@ -1,4 +1,4 @@
-# ========= Copyright 2023-
+# ========= Copyright 2023-2025 @ CAMEL-AI.org. All Rights Reserved. =========
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
 # You may obtain a copy of the License at
@@ -10,7 +10,7 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-# ========= Copyright 2023-
+# ========= Copyright 2023-2025 @ CAMEL-AI.org. All Rights Reserved. =========
 
 import json
 import logging
@@ -542,9 +542,9 @@ replace the ApiName with the actual API name, and \
 replace the key and value with the actual parameters. \
 Your output should start with a square bracket "[" \
 and end with a square bracket "]". Do not output any \
-other explanation or prompt or the result of the API call in your output.
+other explanation or prompt or the result of the API call in your output.
 This year is 2023.
-Input:
+Input:
 User: [User's utterence]
 AI: [AI's utterence]
 
@@ -559,7 +559,7 @@ Based on the given API description and the existing \
 conversation history 1..t, please generate the next \
 dialog that the AI should response after the API call t.
 This year is 2023.
-Input:
+Input:
 User: [User's utterence]
 AI: [AI's utterence]
 [ApiName(key1='value1', key2='value2', …)]
camel/benchmarks/apibench.py
CHANGED
@@ -1,4 +1,4 @@
-# ========= Copyright 2023-
+# ========= Copyright 2023-2025 @ CAMEL-AI.org. All Rights Reserved. =========
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
 # You may obtain a copy of the License at
@@ -10,7 +10,7 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-# ========= Copyright 2023-
+# ========= Copyright 2023-2025 @ CAMEL-AI.org. All Rights Reserved. =========
 
 import json
 import logging
camel/benchmarks/base.py
CHANGED
@@ -1,4 +1,4 @@
-# ========= Copyright 2023-
+# ========= Copyright 2023-2025 @ CAMEL-AI.org. All Rights Reserved. =========
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
 # You may obtain a copy of the License at
@@ -10,7 +10,7 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-# ========= Copyright 2023-
+# ========= Copyright 2023-2025 @ CAMEL-AI.org. All Rights Reserved. =========
 
 import logging
 from abc import ABC, abstractmethod
camel/benchmarks/browsecomp.py
CHANGED
@@ -1,4 +1,4 @@
-# ========= Copyright 2023-
+# ========= Copyright 2023-2025 @ CAMEL-AI.org. All Rights Reserved. =========
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
 # You may obtain a copy of the License at
@@ -10,7 +10,7 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
-# ========= Copyright 2023-
+# ========= Copyright 2023-2025 @ CAMEL-AI.org. All Rights Reserved. =========
 
 import base64
 import hashlib
@@ -55,7 +55,7 @@ class QueryResponse(BaseModel):
 )
 exact_answer: str = Field(description="""your succinct, final answer.""")
 confidence: str = Field(
-description="""
+description=r"""
 your confidence score between 0|\%| and 100|\%| for your answer.
 """
 )
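A note on the hunk above: `\%` is not a recognized escape sequence in Python string literals, so a non-raw literal containing it triggers an invalid-escape warning (a `DeprecationWarning` since Python 3.6, a `SyntaxWarning` since 3.12). Switching the literal to `r"""` is presumably meant to silence that warning without changing the text. The following is a minimal sketch, not code from the package; the variable names are hypothetical, and it only checks that a raw literal and a doubled-backslash literal yield the same string.

```python
# Hypothetical illustration; not taken from camel/benchmarks/browsecomp.py.

# Raw literal: the backslash before % is kept as-is and no warning is raised.
raw_description = r"""
your confidence score between 0|\%| and 100|\%| for your answer.
"""

# Non-raw literal: the backslash must be doubled to produce the same text
# without relying on an unrecognized escape sequence.
escaped_description = """
your confidence score between 0|\\%| and 100|\\%| for your answer.
"""

# Both spellings produce identical strings containing two literal backslashes.
same_text = raw_description == escaped_description
backslash_count = raw_description.count("\\")
print(same_text, backslash_count)
```

Under this reading the change is cosmetic at runtime: the prompt text sent to the model stays the same, and only the warning at compile time goes away.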
@@ -72,27 +72,27 @@ class GradingResponse(BaseModel):
 extracted_final_answer: str = Field(
 description="""
 The final exact answer extracted from the [response].
-Put the extracted answer as 'None' if there is no exact, final answer to
+Put the extracted answer as 'None' if there is no exact, final answer to
 extract from the response."""
 )
 reasoning: str = Field(
 description="""
-Explain why the extracted_final_answer is correct or incorrect
-based on [correct_answer], focusing only on if there are meaningful
-differences between [correct_answer] and the extracted_final_answer.
-Do not comment on any background to the problem, do not attempt
-to solve the problem, do not argue for any answer different
+Explain why the extracted_final_answer is correct or incorrect
+based on [correct_answer], focusing only on if there are meaningful
+differences between [correct_answer] and the extracted_final_answer.
+Do not comment on any background to the problem, do not attempt
+to solve the problem, do not argue for any answer different
 than [correct_answer], focus only on whether the answers match."""
 )
 correct: str = Field(
-description="""Answer 'yes' if extracted_final_answer matches the
-[correct_answer] given above, or is within a small margin of error for
-numerical problems. Answer 'no' otherwise, i.e. if there if there is any
-inconsistency, ambiguity, non-equivalency, or if the extracted answer is
+description="""Answer 'yes' if extracted_final_answer matches the
+[correct_answer] given above, or is within a small margin of error for
+numerical problems. Answer 'no' otherwise, i.e. if there if there is any
+inconsistency, ambiguity, non-equivalency, or if the extracted answer is
 incorrect."""
 )
 confidence: str = Field(
-description="""The extracted confidence score between 0|\%|
+description=r"""The extracted confidence score between 0|\%|
 and 100|\%| from [response]. Put 100 if there is no confidence score available.
 """
 )
@@ -160,8 +160,8 @@ format content into json:
 {content}
 """
 
-GRADER_TEMPLATE = """
-Judge whether the following [response] to [question] is correct or not
+GRADER_TEMPLATE = r"""
+Judge whether the following [response] to [question] is correct or not
 based on the precise and unambiguous [correct_answer] below.
 
 [question]: {question}
@@ -171,26 +171,37 @@ based on the precise and unambiguous [correct_answer] below.
 Your judgement must be in the format and criteria specified below:
 
 extracted_final_answer: The final exact answer extracted from the [response].
-Put the extracted answer as 'None' if there is no exact, final answer to
+Put the extracted answer as 'None' if there is no exact, final answer to
+Put the extracted answer as 'None' if there is no exact, final answer to
 extract from the response.
 
 [correct_answer]: {correct_answer}
 
-reasoning: Explain why the extracted_final_answer is correct or incorrect
-based on [correct_answer], focusing only on if there are meaningful
-differences between [correct_answer] and the extracted_final_answer.
-Do not comment on any background to the problem, do not attempt
-to solve the problem, do not argue for any answer different
+reasoning: Explain why the extracted_final_answer is correct or incorrect
+based on [correct_answer], focusing only on if there are meaningful
+differences between [correct_answer] and the extracted_final_answer.
+Do not comment on any background to the problem, do not attempt
+to solve the problem, do not argue for any answer different
+reasoning: Explain why the extracted_final_answer is correct or incorrect
+based on [correct_answer], focusing only on if there are meaningful
+differences between [correct_answer] and the extracted_final_answer.
+Do not comment on any background to the problem, do not attempt
+to solve the problem, do not argue for any answer different
 than [correct_answer], focus only on whether the answers match.
|
|
185
191
|
|
|
186
|
-
correct: Answer 'yes' if extracted_final_answer matches the
|
|
187
|
-
[correct_answer] given above, or is within a small margin of error for
|
|
188
|
-
numerical problems. Answer 'no' otherwise, i.e. if there is any
|
|
189
|
-
inconsistency, ambiguity, non-equivalency, or if the extracted answer is
|
|
192
|
+
correct: Answer 'yes' if extracted_final_answer matches the
|
|
193
|
+
[correct_answer] given above, or is within a small margin of error for
|
|
194
|
+
numerical problems. Answer 'no' otherwise, i.e. if there is any
|
|
195
|
+
inconsistency, ambiguity, non-equivalency, or if the extracted answer is
|
|
196
|
+
correct: Answer 'yes' if extracted_final_answer matches the
|
|
197
|
+
[correct_answer] given above, or is within a small margin of error for
|
|
198
|
+
numerical problems. Answer 'no' otherwise, i.e. if there is any
|
|
199
|
+
inconsistency, ambiguity, non-equivalency, or if the extracted answer is
|
|
190
200
|
incorrect.
|
|
191
201
|
|
|
192
202
|
|
|
193
|
-
confidence: The extracted confidence score between 0|\%| and 100|\%|
|
|
203
|
+
confidence: The extracted confidence score between 0|\%| and 100|\%|
|
|
204
|
+
confidence: The extracted confidence score between 0|\%| and 100|\%|
|
|
194
205
|
from [response]. Put 100 if there is no confidence score available.
|
|
195
206
|
""".strip()
|
|
196
207
|
|
|
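Note on the `r"""` prefix added in the hunks above: `\%` is not a recognized Python escape sequence, so a normal string literal keeps the backslash but triggers an invalid-escape warning on recent Python versions. A raw literal produces the same text without the warning. The sketch below is a minimal illustration; the constant names and the shortened template text are hypothetical, and only the `|\%|` fragment and the `{question}` placeholder come from the diff.

```python
# Hypothetical stand-ins for the template text; only the `|\%|` fragment
# and the `{question}` placeholder are taken from the diff above.
RAW_FRAGMENT = r"between 0|\%| and 100|\%|"
ESCAPED_FRAGMENT = "between 0|\\%| and 100|\\%|"

# A raw literal keeps each backslash verbatim, so both spellings are equal.
same_text = RAW_FRAGMENT == ESCAPED_FRAGMENT
backslash_count = RAW_FRAGMENT.count("\\")

# The raw prefix does not interfere with str.format placeholders.
formatted = r"[question]: {question} 0|\%|".format(question="2+2?")
```

Because the resulting string is unchanged, this part of the diff should affect only lint and compile-time warnings, not the prompt sent to the grader.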
@@ -619,20 +630,20 @@ class BrowseCompBenchmark(BaseBenchmark):
|
|
|
619
630
|
assistant_response, user_response = pipeline.step(
|
|
620
631
|
input_msg
|
|
621
632
|
)
|
|
622
|
-
if assistant_response.terminated: # type: ignore[attr
|
|
633
|
+
if assistant_response.terminated: # type: ignore[union-attr]
|
|
623
634
|
break
|
|
624
|
-
if user_response.terminated: # type: ignore[attr
|
|
635
|
+
if user_response.terminated: # type: ignore[union-attr]
|
|
625
636
|
break
|
|
626
|
-
if "CAMEL_TASK_DONE" in user_response.msg.content: # type: ignore[attr
|
|
637
|
+
if "CAMEL_TASK_DONE" in user_response.msg.content: # type: ignore[union-attr]
|
|
627
638
|
break
|
|
628
639
|
|
|
629
640
|
chat_history.append(
|
|
630
|
-
f"AI User: {user_response.msg.content}" # type: ignore[attr
|
|
641
|
+
f"AI User: {user_response.msg.content}" # type: ignore[union-attr]
|
|
631
642
|
)
|
|
632
643
|
chat_history.append(
|
|
633
|
-
f"AI Assistant: {assistant_response.msg.content}" # type: ignore[attr
|
|
644
|
+
f"AI Assistant: {assistant_response.msg.content}" # type: ignore[union-attr]
|
|
634
645
|
)
|
|
635
|
-
input_msg = assistant_response.msg # type: ignore[attr
|
|
646
|
+
input_msg = assistant_response.msg # type: ignore[union-attr]
|
|
636
647
|
|
|
637
648
|
chat_history_str = "\n".join(chat_history)
|
|
638
649
|
if roleplaying_summarizer:
|
camel/benchmarks/gaia.py
CHANGED
|
@@ -1,4 +1,4 @@
|
|
|
1
|
-
# ========= Copyright 2023-
|
|
1
|
+
# ========= Copyright 2023-2025 @ CAMEL-AI.org. All Rights Reserved. =========
|
|
2
2
|
# Licensed under the Apache License, Version 2.0 (the "License");
|
|
3
3
|
# you may not use this file except in compliance with the License.
|
|
4
4
|
# You may obtain a copy of the License at
|
|
@@ -10,7 +10,7 @@
|
|
|
10
10
|
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
|
11
11
|
# See the License for the specific language governing permissions and
|
|
12
12
|
# limitations under the License.
|
|
13
|
-
# ========= Copyright 2023-
|
|
13
|
+
# ========= Copyright 2023-2025 @ CAMEL-AI.org. All Rights Reserved. =========
|
|
14
14
|
|
|
15
15
|
import json
|
|
16
16
|
import logging
|
|
@@ -165,6 +165,8 @@ class GAIABenchmark(BaseBenchmark):
|
|
|
165
165
|
force_download (bool, optional): Whether to
|
|
166
166
|
force download the data.
|
|
167
167
|
"""
|
|
168
|
+
import pandas as pd
|
|
169
|
+
|
|
168
170
|
if force_download:
|
|
169
171
|
logger.info("Force downloading data.")
|
|
170
172
|
self.download()
|
|
@@ -181,15 +183,17 @@ class GAIABenchmark(BaseBenchmark):
|
|
|
181
183
|
# Load metadata for both validation and test datasets
|
|
182
184
|
for path, label in zip([valid_dir, test_dir], ["valid", "test"]):
|
|
183
185
|
self._data[label] = []
|
|
184
|
-
|
|
185
|
-
|
|
186
|
-
|
|
187
|
-
|
|
188
|
-
|
|
189
|
-
|
|
190
|
-
|
|
191
|
-
|
|
192
|
-
|
|
186
|
+
metadata_file = path / "metadata.parquet"
|
|
187
|
+
df = pd.read_parquet(metadata_file)
|
|
188
|
+
for _, row in df.iterrows():
|
|
189
|
+
data = row.to_dict()
|
|
190
|
+
if data["task_id"] == "0-0-0-0-0":
|
|
191
|
+
continue
|
|
192
|
+
# convert level to int (parquet stores as string)
|
|
193
|
+
data["Level"] = int(data["Level"])
|
|
194
|
+
if data["file_name"]:
|
|
195
|
+
data["file_name"] = path / data["file_name"]
|
|
196
|
+
self._data[label].append(data)
|
|
193
197
|
return self
|
|
194
198
|
|
|
195
199
|
@property
|
|
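Note on the new metadata loop above: each row read from `metadata.parquet` is converted to a dict, the placeholder task is skipped, `Level` is cast to `int`, and a non-empty `file_name` is resolved against the split directory. The sketch below mirrors those per-row steps on plain dicts so it runs without pandas. The helper name `normalize_rows`, the constant name, and the sample records are hypothetical, not part of the package.

```python
from pathlib import Path
from typing import Any, Dict, Iterable, List

PLACEHOLDER_TASK_ID = "0-0-0-0-0"  # the task_id skipped by the loop in the diff


def normalize_rows(
    rows: Iterable[Dict[str, Any]], base_dir: Path
) -> List[Dict[str, Any]]:
    """Hypothetical helper mirroring the per-row steps of the new loop."""
    cleaned = []
    for row in rows:
        data = dict(row)  # copy so the caller's records are not mutated
        if data["task_id"] == PLACEHOLDER_TASK_ID:
            continue
        # The diff comments that parquet stores the level as a string.
        data["Level"] = int(data["Level"])
        if data["file_name"]:
            data["file_name"] = base_dir / data["file_name"]
        cleaned.append(data)
    return cleaned


# Made-up records standing in for the rows of the metadata DataFrame.
sample = [
    {"task_id": "0-0-0-0-0", "Level": "0", "file_name": ""},
    {"task_id": "abc", "Level": "2", "file_name": "abc.pdf"},
    {"task_id": "def", "Level": "1", "file_name": ""},
]
result = normalize_rows(sample, Path("data/valid"))
```

Rows with an empty `file_name` keep the empty string, which matches the truthiness check in the diff.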
@@ -333,7 +337,7 @@ class GAIABenchmark(BaseBenchmark):
|
|
|
333
337
|
}
|
|
334
338
|
self._results.append(result_data)
|
|
335
339
|
file_obj.write(
|
|
336
|
-
json.dumps(result_data, indent=2) + "\n"
|
|
340
|
+
json.dumps(result_data, indent=2, ensure_ascii=False) + "\n"
|
|
337
341
|
)
|
|
338
342
|
file_obj.flush()
|
|
339
343
|
|
|
@@ -354,7 +358,7 @@ class GAIABenchmark(BaseBenchmark):
|
|
|
354
358
|
}
|
|
355
359
|
self._results.append(error_data)
|
|
356
360
|
file_obj.write(
|
|
357
|
-
json.dumps(error_data, indent=2) + "\n"
|
|
361
|
+
json.dumps(error_data, indent=2, ensure_ascii=False) + "\n"
|
|
358
362
|
)
|
|
359
363
|
file_obj.flush()
|
|
360
364
|
|
|
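Note on `ensure_ascii=False` in the two hunks above: by default `json.dumps` escapes every non-ASCII character as `\uXXXX`, which makes result files hard to read when answers contain accented or CJK text. The sketch below shows the difference on made-up sample data; the variable names are illustrative only.

```python
import json

record = {"answer": "café", "note": "日本語"}  # made-up sample data

# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(record, indent=2)

# With ensure_ascii=False the characters are emitted as-is.
readable = json.dumps(record, indent=2, ensure_ascii=False)

# Both forms decode back to the same object.
round_trip_equal = json.loads(escaped) == json.loads(readable) == record
```

The readable form only stays intact if the output file is opened with an encoding that can represent these characters, such as UTF-8. Whether `gaia.py` does so is not visible in this diff.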
camel/benchmarks/mock_website/README.md
ADDED
|
@@ -0,0 +1,94 @@
|
|
|
1
|
+
# Mock Website Benchmarks for Web Agent Testing
|
|
2
|
+
|
|
3
|
+
This project provides a framework for testing web agents against various mock websites. It features a central dispatcher (`mock_web.py`) that manages different test environments, each simulating a specific type of website (e.g., an e-commerce site). The initial example project is an Amazon-like shopping website.
|
|
4
|
+
|
|
5
|
+
## Core Concepts
|
|
6
|
+
|
|
7
|
+
* **Dispatcher (`mock_web.py`):** The single entry point for running any benchmark. It handles:
|
|
8
|
+
* Downloading necessary web assets (HTML templates, CSS, JS) for a specific project from Hugging Face (https://huggingface.co/datasets/camel-ai/mock_websites).
|
|
9
|
+
* Reading the task configuration from `task.json`.
|
|
10
|
+
* Launching the project's dedicated Flask web server as a background process.
|
|
11
|
+
* Monitoring the server for task completion via API polling.
|
|
12
|
+
* Reporting the final results (success status, number of operations) and shutting down the server.
|
|
13
|
+
* **Projects (e.g., `shopping_mall/`):** Each project is a self-contained Flask application representing a unique website.
|
|
14
|
+
* **Task Configuration (`task.json`):** A central JSON file that defines the environment and the goal for the agent.
|
|
15
|
+
|
|
16
|
+
## Example Project: `shopping_mall`
|
|
17
|
+
|
|
18
|
+
The included `shopping_mall` project simulates an e-commerce website with the following features:
|
|
19
|
+
* **Product Display:** View a list of products with images, descriptions, ratings, and prices.
|
|
20
|
+
* **Product Detail Pages:** Click on a product to see its dedicated detail page.
|
|
21
|
+
* **Shopping Cart:** Add products, view the cart, and manage its contents.
|
|
22
|
+
* **API-Driven:** The backend provides API endpoints for all state-changing actions.
|
|
23
|
+
|
|
24
|
+
## Setup and Usage
|
|
25
|
+
|
|
26
|
+
1. **Install Dependencies:**
|
|
27
|
+
```bash
|
|
28
|
+
pip install -r requirements.txt
|
|
29
|
+
```
|
|
30
|
+
This will install `Flask`, `huggingface-hub`, `requests`, and other necessary packages.
|
|
31
|
+
|
|
32
|
+
2. **Configure the Task:**
|
|
33
|
+
Edit the `task.json` file to define the products available on the website and the agent's goal. The structure is as follows:
|
|
34
|
+
```json
|
|
35
|
+
{
|
|
36
|
+
"products": [
|
|
37
|
+
{
|
|
38
|
+
"id": 1,
|
|
39
|
+
"name": "Gaming Laptop",
|
|
40
|
+
"price": 1200,
|
|
41
|
+
"image": "assets/img/products/laptop.jpg",
|
|
42
|
+
"category": "Electronics",
|
|
43
|
+
"rating": 4.5,
|
|
44
|
+
"description": "High-performance gaming laptop with latest specs."
|
|
45
|
+
}
|
|
46
|
+
],
|
|
47
|
+
"ground_truth_cart": [
|
|
48
|
+
{
|
|
49
|
+
"id": 1,
|
|
50
|
+
"quantity": 1
|
|
51
|
+
}
|
|
52
|
+
]
|
|
53
|
+
}
|
|
54
|
+
```
|
|
55
|
+
* `products`: A list of all product objects available in the environment.
|
|
56
|
+
* `ground_truth_cart`: A list of items that defines the target state of the shopping cart for the task to be considered complete.
|
|
57
|
+
|
|
58
|
+
3. **Running the Benchmark:**
|
|
59
|
+
Use the `mock_web.py` dispatcher to run a project.
|
|
60
|
+
```bash
|
|
61
|
+
python mock_web.py --project shopping_mall
|
|
62
|
+
```
|
|
63
|
+
* `--project`: Specifies which mock website project to run (default: `shopping_mall`).
|
|
64
|
+
* `--port`: Specifies the port to run the server on (default: `5001`).
|
|
65
|
+
|
|
66
|
+
The dispatcher will start the server and begin polling for task completion. You or your agent can then interact with the website at `http://127.0.0.1:5001/`. Once the conditions defined in `ground_truth_cart` are met, the dispatcher will automatically detect it, report the results, and shut down. You can also stop it early by pressing `Ctrl+C`.
|
|
67
|
+
|
|
68
|
+
## Logging
|
|
69
|
+
|
|
70
|
+
* `dispatcher.log`: High-level log from the dispatcher, showing setup, status, and final results.
|
|
71
|
+
* `shopping_mall/app.log`: Detailed internal log from the Flask application for the `shopping_mall` project.
|
|
72
|
+
|
|
73
|
+
## Project Structure
|
|
74
|
+
|
|
75
|
+
```
|
|
76
|
+
.
|
|
77
|
+
├── mock_web.py # Main dispatcher for running benchmarks
|
|
78
|
+
├── task.json # Task configuration file
|
|
79
|
+
├── requirements.txt # Python dependencies
|
|
80
|
+
├── shopping_mall/ # Example project: shopping_mall
|
|
81
|
+
│ └── app.py # Flask application for this project
|
|
82
|
+
├── dispatcher.log # (Generated at runtime)
|
|
83
|
+
└── README.md # This file
|
|
84
|
+
```
|
|
85
|
+
The dispatcher will automatically download project-specific `templates` and `static` folders from Hugging Face Hub and place them inside the corresponding project directory at runtime.
|
|
86
|
+
|
|
87
|
+
## TODO: Automated Question Generation Module
|
|
88
|
+
|
|
89
|
+
A planned future module for this project is the development of an automated question generation system. This system would analyze the current state of the web application environment (e.g., visible elements, available products, cart status) and generate relevant questions or tasks for a web agent to solve.
|
|
90
|
+
|
|
91
|
+
This could involve:
|
|
92
|
+
* Identifying interactable elements and their states.
|
|
93
|
+
* Understanding the current context (e.g., on product page, in cart).
|
|
94
|
+
* Formulating natural language questions or goal descriptions based on this context (e.g., "Find a product under $50 in the Electronics category and add it to the cart," or "What is the current subtotal of the cart after adding two units of item X?").
|
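Note on the completion condition described in the README above: a task counts as complete once the cart matches `ground_truth_cart`. The actual check lives in `shopping_mall/app.py` and is not reproduced here. The sketch below is one plausible reading that assumes an order-insensitive match on `id` and `quantity`; the helper names `cart_to_counts` and `cart_matches` are hypothetical, and the task data is a trimmed copy of the README example.

```python
import json
from typing import Dict, List

# Trimmed copy of the README's task.json example.
TASK_JSON = """
{
  "products": [{"id": 1, "name": "Gaming Laptop", "price": 1200}],
  "ground_truth_cart": [{"id": 1, "quantity": 1}]
}
"""


def cart_to_counts(items: List[Dict[str, int]]) -> Dict[int, int]:
    """Collapse a list of {id, quantity} items into an id -> quantity map."""
    counts: Dict[int, int] = {}
    for item in items:
        counts[item["id"]] = counts.get(item["id"], 0) + item["quantity"]
    return counts


def cart_matches(cart: List[Dict[str, int]], target: List[Dict[str, int]]) -> bool:
    """Assumed rule: same product ids with the same quantities, in any order."""
    return cart_to_counts(cart) == cart_to_counts(target)


task = json.loads(TASK_JSON)
target = task["ground_truth_cart"]

empty_cart_done = cart_matches([], target)
exact_cart_done = cart_matches([{"id": 1, "quantity": 1}], target)
extra_units_done = cart_matches([{"id": 1, "quantity": 2}], target)
```

Under this assumed rule, adding extra units or extra products leaves the task incomplete, so an agent has to reach the exact target state.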