hud-python 0.4.60__tar.gz → 0.4.62__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (316)
  1. {hud_python-0.4.60 → hud_python-0.4.62}/PKG-INFO +44 -43
  2. {hud_python-0.4.60 → hud_python-0.4.62}/README.md +40 -40
  3. {hud_python-0.4.60 → hud_python-0.4.62}/environments/blank/README.md +3 -3
  4. {hud_python-0.4.60 → hud_python-0.4.62}/environments/browser/README.md +2 -2
  5. {hud_python-0.4.60 → hud_python-0.4.62}/environments/browser/server/pyproject.toml +1 -1
  6. {hud_python-0.4.60 → hud_python-0.4.62}/environments/deepresearch/README.md +3 -3
  7. hud_python-0.4.62/environments/jupyter/README.md +68 -0
  8. hud_python-0.4.62/environments/jupyter/server/pyproject.toml +34 -0
  9. hud_python-0.4.62/environments/online_mind2web/README.md +36 -0
  10. hud_python-0.4.62/environments/online_mind2web/pyproject.toml +22 -0
  11. hud_python-0.4.62/environments/remote_browser/src/hud_controller/providers/README.md +110 -0
  12. {hud_python-0.4.60 → hud_python-0.4.62}/environments/rubrics/README.md +3 -3
  13. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/__init__.py +12 -8
  14. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/build.py +93 -5
  15. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/flows/tasks.py +3 -3
  16. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/rl/remote_runner.py +1 -1
  17. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/tests/test_convert.py +13 -13
  18. {hud_python-0.4.60 → hud_python-0.4.62}/hud/clients/base.py +1 -1
  19. {hud_python-0.4.60 → hud_python-0.4.62}/hud/clients/fastmcp.py +1 -1
  20. {hud_python-0.4.60 → hud_python-0.4.62}/hud/samples/browser.py +1 -1
  21. {hud_python-0.4.60 → hud_python-0.4.62}/hud/settings.py +4 -4
  22. {hud_python-0.4.60 → hud_python-0.4.62}/hud/shared/exceptions.py +1 -1
  23. {hud_python-0.4.60 → hud_python-0.4.62}/hud/shared/tests/test_exceptions.py +1 -1
  24. hud_python-0.4.62/hud/tools/jupyter.py +313 -0
  25. hud_python-0.4.62/hud/tools/tests/test_jupyter_tool.py +176 -0
  26. {hud_python-0.4.60 → hud_python-0.4.62}/hud/types.py +1 -1
  27. {hud_python-0.4.60 → hud_python-0.4.62}/hud/utils/tests/test_version.py +1 -1
  28. {hud_python-0.4.60 → hud_python-0.4.62}/hud/version.py +1 -1
  29. {hud_python-0.4.60 → hud_python-0.4.62}/pyproject.toml +4 -3
  30. {hud_python-0.4.60 → hud_python-0.4.62}/.gitignore +0 -0
  31. {hud_python-0.4.60 → hud_python-0.4.62}/LICENSE +0 -0
  32. {hud_python-0.4.60 → hud_python-0.4.62}/environments/README.md +0 -0
  33. {hud_python-0.4.60 → hud_python-0.4.62}/environments/blank/environment/README.md +0 -0
  34. {hud_python-0.4.60 → hud_python-0.4.62}/environments/blank/environment/pyproject.toml +0 -0
  35. {hud_python-0.4.60 → hud_python-0.4.62}/environments/blank/server/README.md +0 -0
  36. {hud_python-0.4.60 → hud_python-0.4.62}/environments/blank/server/pyproject.toml +0 -0
  37. {hud_python-0.4.60 → hud_python-0.4.62}/environments/browser/browser-base/README.md +0 -0
  38. {hud_python-0.4.60 → hud_python-0.4.62}/environments/browser/environment/2048/README.md +0 -0
  39. {hud_python-0.4.60 → hud_python-0.4.62}/environments/browser/environment/2048/backend/pyproject.toml +0 -0
  40. {hud_python-0.4.60 → hud_python-0.4.62}/environments/browser/environment/README.md +0 -0
  41. {hud_python-0.4.60 → hud_python-0.4.62}/environments/browser/environment/pyproject.toml +0 -0
  42. {hud_python-0.4.60 → hud_python-0.4.62}/environments/browser/environment/todo/README.md +0 -0
  43. {hud_python-0.4.60 → hud_python-0.4.62}/environments/browser/environment/todo/backend/pyproject.toml +0 -0
  44. {hud_python-0.4.60 → hud_python-0.4.62}/environments/browser/pyproject.toml +0 -0
  45. {hud_python-0.4.60 → hud_python-0.4.62}/environments/deepresearch/environment/pyproject.toml +0 -0
  46. {hud_python-0.4.60 → hud_python-0.4.62}/environments/deepresearch/pyproject.toml +0 -0
  47. {hud_python-0.4.60 → hud_python-0.4.62}/environments/deepresearch/server/pyproject.toml +0 -0
  48. {hud_python-0.4.60/environments/remote_browser → hud_python-0.4.62/environments/online_mind2web}/src/hud_controller/providers/README.md +0 -0
  49. {hud_python-0.4.60 → hud_python-0.4.62}/environments/remote_browser/README.md +0 -0
  50. {hud_python-0.4.60 → hud_python-0.4.62}/environments/remote_browser/pyproject.toml +0 -0
  51. {hud_python-0.4.60 → hud_python-0.4.62}/environments/rubrics/environment/pyproject.toml +0 -0
  52. {hud_python-0.4.60 → hud_python-0.4.62}/environments/rubrics/pyproject.toml +0 -0
  53. {hud_python-0.4.60 → hud_python-0.4.62}/environments/rubrics/server/pyproject.toml +0 -0
  54. {hud_python-0.4.60 → hud_python-0.4.62}/environments/text_2048/README.md +0 -0
  55. {hud_python-0.4.60 → hud_python-0.4.62}/environments/text_2048/pyproject.toml +0 -0
  56. {hud_python-0.4.60 → hud_python-0.4.62}/examples/README.md +0 -0
  57. {hud_python-0.4.60 → hud_python-0.4.62}/hud/__init__.py +0 -0
  58. {hud_python-0.4.60 → hud_python-0.4.62}/hud/__main__.py +0 -0
  59. {hud_python-0.4.60 → hud_python-0.4.62}/hud/agents/__init__.py +0 -0
  60. {hud_python-0.4.60 → hud_python-0.4.62}/hud/agents/base.py +0 -0
  61. {hud_python-0.4.60 → hud_python-0.4.62}/hud/agents/claude.py +0 -0
  62. {hud_python-0.4.60 → hud_python-0.4.62}/hud/agents/gemini.py +0 -0
  63. {hud_python-0.4.60 → hud_python-0.4.62}/hud/agents/grounded_openai.py +0 -0
  64. {hud_python-0.4.60 → hud_python-0.4.62}/hud/agents/langchain.py +0 -0
  65. {hud_python-0.4.60 → hud_python-0.4.62}/hud/agents/lite_llm.py +0 -0
  66. {hud_python-0.4.60 → hud_python-0.4.62}/hud/agents/misc/__init__.py +0 -0
  67. {hud_python-0.4.60 → hud_python-0.4.62}/hud/agents/misc/integration_test_agent.py +0 -0
  68. {hud_python-0.4.60 → hud_python-0.4.62}/hud/agents/misc/response_agent.py +0 -0
  69. {hud_python-0.4.60 → hud_python-0.4.62}/hud/agents/openai.py +0 -0
  70. {hud_python-0.4.60 → hud_python-0.4.62}/hud/agents/openai_chat_generic.py +0 -0
  71. {hud_python-0.4.60 → hud_python-0.4.62}/hud/agents/tests/__init__.py +0 -0
  72. {hud_python-0.4.60 → hud_python-0.4.62}/hud/agents/tests/test_base.py +0 -0
  73. {hud_python-0.4.60 → hud_python-0.4.62}/hud/agents/tests/test_base_runtime.py +0 -0
  74. {hud_python-0.4.60 → hud_python-0.4.62}/hud/agents/tests/test_claude.py +0 -0
  75. {hud_python-0.4.60 → hud_python-0.4.62}/hud/agents/tests/test_client.py +0 -0
  76. {hud_python-0.4.60 → hud_python-0.4.62}/hud/agents/tests/test_gemini.py +0 -0
  77. {hud_python-0.4.60 → hud_python-0.4.62}/hud/agents/tests/test_grounded_openai_agent.py +0 -0
  78. {hud_python-0.4.60 → hud_python-0.4.62}/hud/agents/tests/test_openai.py +0 -0
  79. {hud_python-0.4.60 → hud_python-0.4.62}/hud/agents/utils.py +0 -0
  80. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/__main__.py +0 -0
  81. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/analyze.py +0 -0
  82. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/clone.py +0 -0
  83. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/debug.py +0 -0
  84. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/dev.py +0 -0
  85. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/eval.py +0 -0
  86. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/flows/__init__.py +0 -0
  87. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/flows/dev.py +0 -0
  88. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/get.py +0 -0
  89. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/init.py +0 -0
  90. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/list_func.py +0 -0
  91. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/pull.py +0 -0
  92. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/push.py +0 -0
  93. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/remove.py +0 -0
  94. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/rl/__init__.py +0 -0
  95. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/rl/celebrate.py +0 -0
  96. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/rl/config.py +0 -0
  97. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/rl/display.py +0 -0
  98. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/rl/gpu.py +0 -0
  99. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/rl/gpu_utils.py +0 -0
  100. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/rl/local_runner.py +0 -0
  101. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/rl/presets.py +0 -0
  102. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/rl/rl_api.py +0 -0
  103. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/rl/viewer.py +0 -0
  104. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/rl/vllm.py +0 -0
  105. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/rl/wait_utils.py +0 -0
  106. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/tests/__init__.py +0 -0
  107. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/tests/test_analyze.py +0 -0
  108. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/tests/test_analyze_metadata.py +0 -0
  109. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/tests/test_analyze_module.py +0 -0
  110. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/tests/test_build.py +0 -0
  111. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/tests/test_build_failure.py +0 -0
  112. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/tests/test_build_module.py +0 -0
  113. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/tests/test_cli_init.py +0 -0
  114. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/tests/test_cli_main.py +0 -0
  115. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/tests/test_cli_more_wrappers.py +0 -0
  116. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/tests/test_cli_root.py +0 -0
  117. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/tests/test_clone.py +0 -0
  118. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/tests/test_cursor.py +0 -0
  119. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/tests/test_debug.py +0 -0
  120. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/tests/test_eval.py +0 -0
  121. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/tests/test_list_func.py +0 -0
  122. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/tests/test_main_module.py +0 -0
  123. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/tests/test_mcp_server.py +0 -0
  124. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/tests/test_pull.py +0 -0
  125. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/tests/test_push.py +0 -0
  126. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/tests/test_push_happy.py +0 -0
  127. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/tests/test_push_wrapper.py +0 -0
  128. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/tests/test_registry.py +0 -0
  129. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/tests/test_utils.py +0 -0
  130. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/__init__.py +0 -0
  131. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/config.py +0 -0
  132. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/cursor.py +0 -0
  133. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/docker.py +0 -0
  134. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/env_check.py +0 -0
  135. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/environment.py +0 -0
  136. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/interactive.py +0 -0
  137. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/local_runner.py +0 -0
  138. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/logging.py +0 -0
  139. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/metadata.py +0 -0
  140. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/package_runner.py +0 -0
  141. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/registry.py +0 -0
  142. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/remote_runner.py +0 -0
  143. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/runner.py +0 -0
  144. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/server.py +0 -0
  145. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/source_hash.py +0 -0
  146. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/tasks.py +0 -0
  147. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/tests/__init__.py +0 -0
  148. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/tests/test_config.py +0 -0
  149. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/tests/test_docker.py +0 -0
  150. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/tests/test_docker_hints.py +0 -0
  151. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/tests/test_env_check.py +0 -0
  152. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/tests/test_environment.py +0 -0
  153. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/tests/test_interactive_module.py +0 -0
  154. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/tests/test_local_runner.py +0 -0
  155. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/tests/test_logging_utils.py +0 -0
  156. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/tests/test_metadata.py +0 -0
  157. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/tests/test_package_runner.py +0 -0
  158. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/tests/test_registry_utils.py +0 -0
  159. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/tests/test_remote_runner.py +0 -0
  160. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/tests/test_runner_modules.py +0 -0
  161. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/tests/test_source_hash.py +0 -0
  162. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/tests/test_tasks.py +0 -0
  163. {hud_python-0.4.60 → hud_python-0.4.62}/hud/cli/utils/version_check.py +0 -0
  164. {hud_python-0.4.60 → hud_python-0.4.62}/hud/clients/README.md +0 -0
  165. {hud_python-0.4.60 → hud_python-0.4.62}/hud/clients/__init__.py +0 -0
  166. {hud_python-0.4.60 → hud_python-0.4.62}/hud/clients/mcp_use.py +0 -0
  167. {hud_python-0.4.60 → hud_python-0.4.62}/hud/clients/tests/__init__.py +0 -0
  168. {hud_python-0.4.60 → hud_python-0.4.62}/hud/clients/tests/test_client_integration.py +0 -0
  169. {hud_python-0.4.60 → hud_python-0.4.62}/hud/clients/tests/test_fastmcp.py +0 -0
  170. {hud_python-0.4.60 → hud_python-0.4.62}/hud/clients/tests/test_mcp_use_retry.py +0 -0
  171. {hud_python-0.4.60 → hud_python-0.4.62}/hud/clients/tests/test_protocol.py +0 -0
  172. {hud_python-0.4.60 → hud_python-0.4.62}/hud/clients/utils/__init__.py +0 -0
  173. {hud_python-0.4.60 → hud_python-0.4.62}/hud/clients/utils/mcp_use_retry.py +0 -0
  174. {hud_python-0.4.60 → hud_python-0.4.62}/hud/clients/utils/retry.py +0 -0
  175. {hud_python-0.4.60 → hud_python-0.4.62}/hud/clients/utils/retry_transport.py +0 -0
  176. {hud_python-0.4.60 → hud_python-0.4.62}/hud/datasets/__init__.py +0 -0
  177. {hud_python-0.4.60 → hud_python-0.4.62}/hud/datasets/parallel.py +0 -0
  178. {hud_python-0.4.60 → hud_python-0.4.62}/hud/datasets/runner.py +0 -0
  179. {hud_python-0.4.60 → hud_python-0.4.62}/hud/datasets/tests/__init__.py +0 -0
  180. {hud_python-0.4.60 → hud_python-0.4.62}/hud/datasets/tests/test_runner.py +0 -0
  181. {hud_python-0.4.60 → hud_python-0.4.62}/hud/datasets/tests/test_utils.py +0 -0
  182. {hud_python-0.4.60 → hud_python-0.4.62}/hud/datasets/utils.py +0 -0
  183. {hud_python-0.4.60 → hud_python-0.4.62}/hud/misc/__init__.py +0 -0
  184. {hud_python-0.4.60 → hud_python-0.4.62}/hud/misc/claude_plays_pokemon.py +0 -0
  185. {hud_python-0.4.60 → hud_python-0.4.62}/hud/native/__init__.py +0 -0
  186. {hud_python-0.4.60 → hud_python-0.4.62}/hud/native/comparator.py +0 -0
  187. {hud_python-0.4.60 → hud_python-0.4.62}/hud/native/tests/__init__.py +0 -0
  188. {hud_python-0.4.60 → hud_python-0.4.62}/hud/native/tests/test_comparator.py +0 -0
  189. {hud_python-0.4.60 → hud_python-0.4.62}/hud/native/tests/test_native_init.py +0 -0
  190. {hud_python-0.4.60 → hud_python-0.4.62}/hud/otel/__init__.py +0 -0
  191. {hud_python-0.4.60 → hud_python-0.4.62}/hud/otel/collector.py +0 -0
  192. {hud_python-0.4.60 → hud_python-0.4.62}/hud/otel/config.py +0 -0
  193. {hud_python-0.4.60 → hud_python-0.4.62}/hud/otel/context.py +0 -0
  194. {hud_python-0.4.60 → hud_python-0.4.62}/hud/otel/exporters.py +0 -0
  195. {hud_python-0.4.60 → hud_python-0.4.62}/hud/otel/instrumentation.py +0 -0
  196. {hud_python-0.4.60 → hud_python-0.4.62}/hud/otel/processors.py +0 -0
  197. {hud_python-0.4.60 → hud_python-0.4.62}/hud/otel/tests/__init__.py +0 -0
  198. {hud_python-0.4.60 → hud_python-0.4.62}/hud/otel/tests/test_instrumentation.py +0 -0
  199. {hud_python-0.4.60 → hud_python-0.4.62}/hud/otel/tests/test_processors.py +0 -0
  200. {hud_python-0.4.60 → hud_python-0.4.62}/hud/py.typed +0 -0
  201. {hud_python-0.4.60 → hud_python-0.4.62}/hud/rl/README.md +0 -0
  202. {hud_python-0.4.60 → hud_python-0.4.62}/hud/rl/__init__.py +0 -0
  203. {hud_python-0.4.60 → hud_python-0.4.62}/hud/rl/actor.py +0 -0
  204. {hud_python-0.4.60 → hud_python-0.4.62}/hud/rl/buffer.py +0 -0
  205. {hud_python-0.4.60 → hud_python-0.4.62}/hud/rl/chat_template.jinja +0 -0
  206. {hud_python-0.4.60 → hud_python-0.4.62}/hud/rl/config.py +0 -0
  207. {hud_python-0.4.60 → hud_python-0.4.62}/hud/rl/distributed.py +0 -0
  208. {hud_python-0.4.60 → hud_python-0.4.62}/hud/rl/learner.py +0 -0
  209. {hud_python-0.4.60 → hud_python-0.4.62}/hud/rl/tests/__init__.py +0 -0
  210. {hud_python-0.4.60 → hud_python-0.4.62}/hud/rl/tests/test_learner.py +0 -0
  211. {hud_python-0.4.60 → hud_python-0.4.62}/hud/rl/train.py +0 -0
  212. {hud_python-0.4.60 → hud_python-0.4.62}/hud/rl/types.py +0 -0
  213. {hud_python-0.4.60 → hud_python-0.4.62}/hud/rl/utils/start_vllm_server.sh +0 -0
  214. {hud_python-0.4.60 → hud_python-0.4.62}/hud/rl/utils.py +0 -0
  215. {hud_python-0.4.60 → hud_python-0.4.62}/hud/rl/vllm_adapter.py +0 -0
  216. {hud_python-0.4.60 → hud_python-0.4.62}/hud/samples/__init__.py +0 -0
  217. {hud_python-0.4.60 → hud_python-0.4.62}/hud/server/__init__.py +0 -0
  218. {hud_python-0.4.60 → hud_python-0.4.62}/hud/server/context.py +0 -0
  219. {hud_python-0.4.60 → hud_python-0.4.62}/hud/server/helper/__init__.py +0 -0
  220. {hud_python-0.4.60 → hud_python-0.4.62}/hud/server/low_level.py +0 -0
  221. {hud_python-0.4.60 → hud_python-0.4.62}/hud/server/router.py +0 -0
  222. {hud_python-0.4.60 → hud_python-0.4.62}/hud/server/server.py +0 -0
  223. {hud_python-0.4.60 → hud_python-0.4.62}/hud/server/tests/__init__.py +0 -0
  224. {hud_python-0.4.60 → hud_python-0.4.62}/hud/server/tests/test_add_tool.py +0 -0
  225. {hud_python-0.4.60 → hud_python-0.4.62}/hud/server/tests/test_context.py +0 -0
  226. {hud_python-0.4.60 → hud_python-0.4.62}/hud/server/tests/test_mcp_server_handlers.py +0 -0
  227. {hud_python-0.4.60 → hud_python-0.4.62}/hud/server/tests/test_mcp_server_integration.py +0 -0
  228. {hud_python-0.4.60 → hud_python-0.4.62}/hud/server/tests/test_mcp_server_more.py +0 -0
  229. {hud_python-0.4.60 → hud_python-0.4.62}/hud/server/tests/test_run_wrapper.py +0 -0
  230. {hud_python-0.4.60 → hud_python-0.4.62}/hud/server/tests/test_server_extra.py +0 -0
  231. {hud_python-0.4.60 → hud_python-0.4.62}/hud/server/tests/test_sigterm_runner.py +0 -0
  232. {hud_python-0.4.60 → hud_python-0.4.62}/hud/shared/__init__.py +0 -0
  233. {hud_python-0.4.60 → hud_python-0.4.62}/hud/shared/hints.py +0 -0
  234. {hud_python-0.4.60 → hud_python-0.4.62}/hud/shared/requests.py +0 -0
  235. {hud_python-0.4.60 → hud_python-0.4.62}/hud/shared/tests/__init__.py +0 -0
  236. {hud_python-0.4.60 → hud_python-0.4.62}/hud/shared/tests/test_hints.py +0 -0
  237. {hud_python-0.4.60 → hud_python-0.4.62}/hud/shared/tests/test_requests.py +0 -0
  238. {hud_python-0.4.60 → hud_python-0.4.62}/hud/telemetry/__init__.py +0 -0
  239. {hud_python-0.4.60 → hud_python-0.4.62}/hud/telemetry/async_context.py +0 -0
  240. {hud_python-0.4.60 → hud_python-0.4.62}/hud/telemetry/instrument.py +0 -0
  241. {hud_python-0.4.60 → hud_python-0.4.62}/hud/telemetry/job.py +0 -0
  242. {hud_python-0.4.60 → hud_python-0.4.62}/hud/telemetry/replay.py +0 -0
  243. {hud_python-0.4.60 → hud_python-0.4.62}/hud/telemetry/tests/__init__.py +0 -0
  244. {hud_python-0.4.60 → hud_python-0.4.62}/hud/telemetry/tests/test_async_context.py +0 -0
  245. {hud_python-0.4.60 → hud_python-0.4.62}/hud/telemetry/tests/test_instrument.py +0 -0
  246. {hud_python-0.4.60 → hud_python-0.4.62}/hud/telemetry/tests/test_job.py +0 -0
  247. {hud_python-0.4.60 → hud_python-0.4.62}/hud/telemetry/tests/test_replay.py +0 -0
  248. {hud_python-0.4.60 → hud_python-0.4.62}/hud/telemetry/tests/test_trace.py +0 -0
  249. {hud_python-0.4.60 → hud_python-0.4.62}/hud/telemetry/trace.py +0 -0
  250. {hud_python-0.4.60 → hud_python-0.4.62}/hud/telemetry/utils.py +0 -0
  251. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/__init__.py +0 -0
  252. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/base.py +0 -0
  253. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/bash.py +0 -0
  254. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/computer/__init__.py +0 -0
  255. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/computer/anthropic.py +0 -0
  256. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/computer/gemini.py +0 -0
  257. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/computer/hud.py +0 -0
  258. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/computer/openai.py +0 -0
  259. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/computer/qwen.py +0 -0
  260. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/computer/settings.py +0 -0
  261. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/edit.py +0 -0
  262. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/executors/__init__.py +0 -0
  263. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/executors/base.py +0 -0
  264. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/executors/pyautogui.py +0 -0
  265. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/executors/tests/__init__.py +0 -0
  266. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/executors/tests/test_base_executor.py +0 -0
  267. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/executors/tests/test_pyautogui_executor.py +0 -0
  268. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/executors/xdo.py +0 -0
  269. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/grounding/__init__.py +0 -0
  270. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/grounding/config.py +0 -0
  271. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/grounding/grounded_tool.py +0 -0
  272. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/grounding/grounder.py +0 -0
  273. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/grounding/tests/__init__.py +0 -0
  274. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/grounding/tests/test_grounded_tool.py +0 -0
  275. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/playwright.py +0 -0
  276. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/response.py +0 -0
  277. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/submit.py +0 -0
  278. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/tests/__init__.py +0 -0
  279. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/tests/test_base.py +0 -0
  280. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/tests/test_bash.py +0 -0
  281. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/tests/test_bash_extended.py +0 -0
  282. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/tests/test_computer.py +0 -0
  283. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/tests/test_computer_actions.py +0 -0
  284. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/tests/test_edit.py +0 -0
  285. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/tests/test_init.py +0 -0
  286. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/tests/test_playwright_tool.py +0 -0
  287. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/tests/test_response.py +0 -0
  288. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/tests/test_submit.py +0 -0
  289. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/tests/test_tools.py +0 -0
  290. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/tests/test_tools_init.py +0 -0
  291. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/tests/test_types.py +0 -0
  292. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/tests/test_utils.py +0 -0
  293. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/types.py +0 -0
  294. {hud_python-0.4.60 → hud_python-0.4.62}/hud/tools/utils.py +0 -0
  295. {hud_python-0.4.60 → hud_python-0.4.62}/hud/utils/__init__.py +0 -0
  296. {hud_python-0.4.60 → hud_python-0.4.62}/hud/utils/agent_factories.py +0 -0
  297. {hud_python-0.4.60 → hud_python-0.4.62}/hud/utils/async_utils.py +0 -0
  298. {hud_python-0.4.60 → hud_python-0.4.62}/hud/utils/group_eval.py +0 -0
  299. {hud_python-0.4.60 → hud_python-0.4.62}/hud/utils/hud_console.py +0 -0
  300. {hud_python-0.4.60 → hud_python-0.4.62}/hud/utils/mcp.py +0 -0
  301. {hud_python-0.4.60 → hud_python-0.4.62}/hud/utils/pretty_errors.py +0 -0
  302. {hud_python-0.4.60 → hud_python-0.4.62}/hud/utils/progress.py +0 -0
  303. {hud_python-0.4.60 → hud_python-0.4.62}/hud/utils/task_tracking.py +0 -0
  304. {hud_python-0.4.60 → hud_python-0.4.62}/hud/utils/tasks.py +0 -0
  305. {hud_python-0.4.60 → hud_python-0.4.62}/hud/utils/telemetry.py +0 -0
  306. {hud_python-0.4.60 → hud_python-0.4.62}/hud/utils/tests/__init__.py +0 -0
  307. {hud_python-0.4.60 → hud_python-0.4.62}/hud/utils/tests/test_agent_factories.py +0 -0
  308. {hud_python-0.4.60 → hud_python-0.4.62}/hud/utils/tests/test_async_utils.py +0 -0
  309. {hud_python-0.4.60 → hud_python-0.4.62}/hud/utils/tests/test_init.py +0 -0
  310. {hud_python-0.4.60 → hud_python-0.4.62}/hud/utils/tests/test_mcp.py +0 -0
  311. {hud_python-0.4.60 → hud_python-0.4.62}/hud/utils/tests/test_pretty_errors.py +0 -0
  312. {hud_python-0.4.60 → hud_python-0.4.62}/hud/utils/tests/test_progress.py +0 -0
  313. {hud_python-0.4.60 → hud_python-0.4.62}/hud/utils/tests/test_tasks.py +0 -0
  314. {hud_python-0.4.60 → hud_python-0.4.62}/hud/utils/tests/test_telemetry.py +0 -0
  315. {hud_python-0.4.60 → hud_python-0.4.62}/hud/utils/tests/test_tool_shorthand.py +0 -0
  316. {hud_python-0.4.60 → hud_python-0.4.62}/hud/utils/tool_shorthand.py +0 -0
@@ -1,11 +1,11 @@
  Metadata-Version: 2.4
  Name: hud-python
- Version: 0.4.60
+ Version: 0.4.62
  Summary: SDK for the HUD platform.
  Project-URL: Homepage, https://github.com/hud-evals/hud-python
  Project-URL: Bug Tracker, https://github.com/hud-evals/hud-python/issues
- Project-URL: Documentation, https://docs.hud.so
- Author-email: HUD SDK <founders@hud.so>
+ Project-URL: Documentation, https://docs.hud.ai
+ Author-email: HUD <founders@hud.ai>
  License: MIT License
 
  Copyright (c) 2025 Human Union Data, Inc
@@ -59,6 +59,7 @@ Requires-Dist: pydantic<3,>=2.6
  Requires-Dist: questionary==2.1.0
  Requires-Dist: rich>=13.0.0
  Requires-Dist: toml>=0.10.2
+ Requires-Dist: tornado>=6.5.2
  Requires-Dist: typer>=0.9.0
  Requires-Dist: watchfiles>=0.21.0
  Requires-Dist: wrapt>=1.14.0
@@ -153,21 +154,21 @@ OSS RL environment + evals toolkit. Wrap software as environments, run benchmark
  [![Add docs to Cursor](https://img.shields.io/badge/Add%20docs%20to-Cursor-black?style=flat-square)](https://cursor.com/en/install-mcp?name=docs-hud-python&config=eyJ1cmwiOiJodHRwczovL2RvY3MuaHVkLnNvL21jcCJ9)
  [![Discord](https://img.shields.io/discord/1327447144772407390?label=Discord&logo=discord&style=flat-square)](https://discord.gg/wkjtmHYYjm)
  [![X Follow](https://img.shields.io/twitter/follow/hud_evals?style=social)](https://x.com/intent/user?screen_name=hud_evals)
- [![Shop](https://img.shields.io/badge/_-white.svg?label=shop&logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAJCAYAAAAywQxIAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAACxMAAAsTAQCanBgAAAF6SURBVChTlZA9ixNhFIWf8yaTpFHRRMXCKpAZhCAYFvwoLHZhwUKw9A9YCJb+Bq0sxGbBQrTxX1j41dvIRAjGZbdwRUUGIzPMeyw2swS3WZ/ynHvP5VylafoAWAd+5Xm+wX+SpukmcMf29RDCZrD9BViz3f53+CjYngKZpD5A2/Y7SQBMJpOkKIprdV1vdzqdHzHGblmW9Ww2+5pl2TmAxWKxmM/nP8fj8cmqqtZijJ9sb0u6ABBWjh0riuIt8CqE8LGu66e2d5MkeQ8QY3xme7fb7T4ZjUbrZVl+jjFuSXoEXGxCDgIl9WzfAO5LSmzvNB771R6vzG4Bx0MIt/M8vwV8aLyDQNt70+n0G1AspaTxVln+aghQluVsKbvxVysflT9NQK/XO7R/SGiQ9Nt2aftElmWXJd1kv0kbeANQVdWl4XB4XtJouXaqNRgMHkrqS+r0+/3XwD1JXdungRfAVWBi+6WkK8D3EMJz22cl3W21WgNgx3YAzvwFd0Chdq03gKUAAAAASUVORK5CYII=&style=social)](https://shop.hud.so)
+ [![Shop](https://img.shields.io/badge/_-white.svg?label=shop&logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAJCAYAAAAywQxIAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAACxMAAAsTAQCanBgAAAF6SURBVChTlZA9ixNhFIWf8yaTpFHRRMXCKpAZhCAYFvwoLHZhwUKw9A9YCJb+Bq0sxGbBQrTxX1j41dvIRAjGZbdwRUUGIzPMeyw2swS3WZ/ynHvP5VylafoAWAd+5Xm+wX+SpukmcMf29RDCZrD9BViz3f53+CjYngKZpD5A2/Y7SQBMJpOkKIprdV1vdzqdHzHGblmW9Ww2+5pl2TmAxWKxmM/nP8fj8cmqqtZijJ9sb0u6ABBWjh0riuIt8CqE8LGu66e2d5MkeQ8QY3xme7fb7T4ZjUbrZVl+jjFuSXoEXGxCDgIl9WzfAO5LSmzvNB771R6vzG4Bx0MIt/M8vwV8aLyDQNt70+n0G1AspaTxVln+aghQluVsKbvxVysflT9NQK/XO7R/SGiQ9Nt2aftElmWXJd1kv0kbeANQVdWl4XB4XtJouXaqNRgMHkrqS+r0+/3XwD1JXdungRfAVWBi+6WkK8D3EMJz22cl3W21WgNgx3YAzvwFd0Chdq03gKUAAAAASUVORK5CYII=&style=social)](https://shop.hud.ai)
 
 
  ### Are you a startup building agents?
 
- [📅 Hop on a call](https://cal.com/jay-ram-z6st6w/demo) or [📧 founders@hud.so](mailto:founders@hud.so)
+ [📅 Hop on a call](https://cal.com/jay-ram-z6st6w/demo) or [📧 founders@hud.ai](mailto:founders@hud.ai)
 
  ## Highlights
 
- - 🚀 **[MCP environment skeleton](https://docs.hud.so/core-concepts/mcp-protocol)** – any agent can call any environment.
- - ⚡️ **[Live telemetry](https://hud.so)** – inspect every tool call, observation, and reward in real time.
- - 🗂️ **[Public benchmarks](https://hud.so/leaderboards)** – OSWorld-Verified, SheetBench-50, and more.
+ - 🚀 **[MCP environment skeleton](https://docs.hud.ai/core-concepts/mcp-protocol)** – any agent can call any environment.
+ - ⚡️ **[Live telemetry](https://hud.ai)** – inspect every tool call, observation, and reward in real time.
+ - 🗂️ **[Public benchmarks](https://hud.ai/leaderboards)** – OSWorld-Verified, SheetBench-50, and more.
  - 🌐 **[Cloud browsers](environments/remote_browser/)** – AnchorBrowser, Steel, BrowserBase integrations for browser automation.
  - 🛠️ **[Hot-reload dev loop](environments/README.md#phase-5-hot-reload-development-with-cursor-agent)** – `hud dev` for iterating on environments without rebuilds.
- - 🎓 **[One-click RL](https://hud.so/models)** – Run `hud rl` to get a trained model on any environment.
+ - 🎓 **[One-click RL](https://hud.ai/models)** – Run `hud rl` to get a trained model on any environment.
 
  > We welcome contributors and feature requests – open an issue or hop on a call to discuss improvements!
 
@@ -182,10 +183,10 @@ uv tool install hud-python
  # uv tool update-shell
  ```
 
- > See [docs.hud.so](https://docs.hud.so), or add docs to any MCP client:
- > `claude mcp add --transport http docs-hud https://docs.hud.so/mcp`
+ > See [docs.hud.ai](https://docs.hud.ai), or add docs to any MCP client:
+ > `claude mcp add --transport http docs-hud https://docs.hud.ai/mcp`
 
- Before starting, get your HUD_API_KEY at [hud.so](https://hud.so).
+ Before starting, get your HUD_API_KEY at [hud.ai](https://hud.ai).
 
 
  ## Quickstart: Evals
@@ -203,17 +204,17 @@ import asyncio, hud, os
  from hud.settings import settings
  from hud.clients import MCPClient
  from hud.agents import ClaudeAgent
- from hud.datasets import Task # See docs: https://docs.hud.so/reference/tasks
+ from hud.datasets import Task # See docs: https://docs.hud.ai/reference/tasks
 
  async def main() -> None:
- with hud.trace("Quick Start 2048"): # All telemetry works for any MCP-based agent (see https://hud.so)
+ with hud.trace("Quick Start 2048"): # All telemetry works for any MCP-based agent (see https://hud.ai)
  task = {
  "prompt": "Reach 64 in 2048.",
  "mcp_config": {
  "hud": {
- "url": "https://mcp.hud.so/v3/mcp", # HUD's cloud MCP server (see https://docs.hud.so/core-concepts/architecture)
+ "url": "https://mcp.hud.ai/v3/mcp", # HUD's cloud MCP server (see https://docs.hud.ai/core-concepts/architecture)
  "headers": {
- "Authorization": f"Bearer {settings.api_key}", # Get your key at https://hud.so
+ "Authorization": f"Bearer {settings.api_key}", # Get your key at https://hud.ai
  "Mcp-Image": "hudpython/hud-text-2048:v1.2" # Docker image from https://hub.docker.com/u/hudpython
  }
  }
@@ -240,7 +241,7 @@ async def main() -> None:
240
241
  asyncio.run(main())
241
242
  ```
242
243
 
243
- The above example lets the agent play 2048 ([See replay](https://hud.so/trace/6feed7bd-5f67-4d66-b77f-eb1e3164604f))
244
+ The above example lets the agent play 2048 ([See replay](https://hud.ai/trace/6feed7bd-5f67-4d66-b77f-eb1e3164604f))
244
245
 
245
246
  ![Agent playing 2048](https://raw.githubusercontent.com/hud-evals/hud-python/main/docs/src/images/2048_1.gif)
246
247
 
@@ -253,7 +254,7 @@ hud get hud-evals/2048-basic # from HF
253
254
  hud rl 2048-basic.json
254
255
  ```
255
256
 
256
- > See [agent training docs](https://docs.hud.so/train-agents/quickstart)
257
+ > See [agent training docs](https://docs.hud.ai/train-agents/quickstart)
257
258
 
258
259
  Or make your own environment and dataset:
259
260
 
@@ -264,7 +265,7 @@ hud dev --interactive
264
265
  hud rl
265
266
  ```
266
267
 
267
- > See [environment design docs](https://docs.hud.so/build-environments)
268
+ > See [environment design docs](https://docs.hud.ai/build-environments)
268
269
 
269
270
  ## Benchmarking Agents
270
271
 
@@ -272,7 +273,7 @@ This is Claude Computer Use running on our proprietary financial analyst benchma
272
273
 
273
274
  ![Trace screenshot](https://raw.githubusercontent.com/hud-evals/hud-python/main/docs/src/images/trace_sheet.gif)
274
275
 
275
- > [See this trace on _hud.so_](https://hud.so/trace/9e212e9e-3627-4f1f-9eb5-c6d03c59070a)
276
+ > [See this trace on _hud.ai_](https://hud.ai/trace/9e212e9e-3627-4f1f-9eb5-c6d03c59070a)
276
277
 
277
278
  This example runs the full dataset (only takes ~20 minutes) using [run_evaluation.py](examples/run_evaluation.py):
278
279
 
@@ -290,7 +291,7 @@ from hud.agents import ClaudeAgent
290
291
  results = await run_dataset(
291
292
  name="My SheetBench-50 Evaluation",
292
293
  dataset="hud-evals/SheetBench-50", # <-- HuggingFace dataset
293
- agent_class=ClaudeAgent, # <-- Your custom agent can replace this (see https://docs.hud.so/evaluate-agents/create-agents)
294
+ agent_class=ClaudeAgent, # <-- Your custom agent can replace this (see https://docs.hud.ai/evaluate-agents/create-agents)
294
295
  agent_config={"model": "claude-sonnet-4-20250514"},
295
296
  max_concurrent=50,
296
297
  max_steps=30,
@@ -298,13 +299,13 @@ results = await run_dataset(
298
299
  print(f"Average reward: {sum(r.reward for r in results) / len(results):.2f}")
299
300
  ```
300
301
 
301
- > Running a dataset creates a job and streams results to the [hud.so](https://hud.so) platform for analysis and [leaderboard submission](https://docs.hud.so/evaluate-agents/leaderboards).
302
+ > Running a dataset creates a job and streams results to the [hud.ai](https://hud.ai) platform for analysis and [leaderboard submission](https://docs.hud.ai/evaluate-agents/leaderboards).
302
303
 
303
304
  ## Building Environments (MCP)
304
305
 
305
306
  This is how you can make any environment into an interactable one in 5 steps:
306
307
 
307
- 1. Define MCP server layer using [`MCPServer`](https://docs.hud.so/reference/environments)
308
+ 1. Define MCP server layer using [`MCPServer`](https://docs.hud.ai/reference/environments)
308
309
 
309
310
  ```python
310
311
  from hud.server import MCPServer
@@ -312,10 +313,10 @@ from hud.tools import HudComputerTool
312
313
 
313
314
  mcp = MCPServer("My Environment")
314
315
 
315
- # Add hud tools (see all tools: https://docs.hud.so/reference/tools)
316
+ # Add hud tools (see all tools: https://docs.hud.ai/reference/tools)
316
317
  mcp.tool(HudComputerTool())
317
318
 
318
- # Or custom tools (see https://docs.hud.so/build-environments/adapting-software)
319
+ # Or custom tools (see https://docs.hud.ai/build-environments/adapting-software)
319
320
  @mcp.tool("launch_app")
320
321
  def launch_app(name: str = "Gmail"):
321
322
  ...
@@ -389,16 +390,16 @@ Tools
389
390
  hud push # needs docker login, hud api key
390
391
  ```
391
392
 
392
- 5. Now you can use `mcp.hud.so` to launch 100s of instances of this environment in parallel with any agent, and see everything live on [hud.so](https://hud.so):
393
+ 5. Now you can use `mcp.hud.ai` to launch 100s of instances of this environment in parallel with any agent, and see everything live on [hud.ai](https://hud.ai):
393
394
 
394
395
  ```python
395
396
  from hud.agents import ClaudeAgent
396
397
 
397
- result = await ClaudeAgent().run({ # See all agents: https://docs.hud.so/reference/agents
398
+ result = await ClaudeAgent().run({ # See all agents: https://docs.hud.ai/reference/agents
398
399
  "prompt": "Please explore this environment",
399
400
  "mcp_config": {
400
401
  "my-environment": {
401
- "url": "https://mcp.hud.so/v3/mcp",
402
+ "url": "https://mcp.hud.ai/v3/mcp",
402
403
  "headers": {
403
404
  "Authorization": f"Bearer {os.getenv('HUD_API_KEY')}",
404
405
  "Mcp-Image": "my-name/my-environment:latest"
@@ -420,13 +421,13 @@ result = await ClaudeAgent().run({ # See all agents: https://docs.hud.so/refere
420
421
 
421
422
  ## Leaderboards & benchmarks
422
423
 
423
- All leaderboards are publicly available on [hud.so/leaderboards](https://hud.so/leaderboards) (see [docs](https://docs.hud.so/evaluate-agents/leaderboards))
424
+ All leaderboards are publicly available on [hud.ai/leaderboards](https://hud.ai/leaderboards) (see [docs](https://docs.hud.ai/evaluate-agents/leaderboards))
424
425
 
425
426
  ![Leaderboard](https://raw.githubusercontent.com/hud-evals/hud-python/main/docs/src/images/leaderboards_3.png)
426
427
 
427
428
  We highly suggest running 3-5 evaluations per dataset for the most consistent results across multiple jobs.
428
429
 
429
- Using the [`run_dataset`](https://docs.hud.so/reference/tasks#run_dataset) function with a HuggingFace dataset automatically assigns your job to that leaderboard page, and allows you to create a scorecard out of it:
430
+ Using the [`run_dataset`](https://docs.hud.ai/reference/tasks#run_dataset) function with a HuggingFace dataset automatically assigns your job to that leaderboard page, and allows you to create a scorecard out of it:
430
431
 
431
432
  ## Reinforcement Learning with GRPO
432
433
 
@@ -455,11 +456,11 @@ Supports multi‑turn RL for both:
455
456
  - Language‑only models (e.g., `Qwen/Qwen2.5-7B-Instruct`)
456
457
  - Vision‑Language models (e.g., `Qwen/Qwen2.5-VL-3B-Instruct`)
457
458
 
458
- By default, `hud rl` provisions a persistent server and trainer in the cloud, streams telemetry to `hud.so`, and lets you monitor/manage models at `hud.so/models`. Use `--local` to run entirely on your machines (typically 2+ GPUs: one for vLLM, the rest for training).
459
+ By default, `hud rl` provisions a persistent server and trainer in the cloud, streams telemetry to `hud.ai`, and lets you monitor/manage models at `hud.ai/models`. Use `--local` to run entirely on your machines (typically 2+ GPUs: one for vLLM, the rest for training).
459
460
 
460
- Any HUD MCP environment and evaluation works with our RL pipeline (including remote configurations). See the guided docs: `https://docs.hud.so/train-agents/quickstart`.
461
+ Any HUD MCP environment and evaluation works with our RL pipeline (including remote configurations). See the guided docs: `https://docs.hud.ai/train-agents/quickstart`.
461
462
 
462
- Pricing: Hosted vLLM and training GPU rates are listed in the [Training Quickstart → Pricing](https://docs.hud.so/train-agents/quickstart#pricing). Manage billing at the [HUD billing dashboard](https://hud.so/project/billing).
463
+ Pricing: Hosted vLLM and training GPU rates are listed in the [Training Quickstart → Pricing](https://docs.hud.ai/train-agents/quickstart#pricing). Manage billing at the [HUD billing dashboard](https://hud.ai/project/billing).
463
464
 
464
465
  ## Architecture
465
466
 
@@ -467,8 +468,8 @@ Pricing: Hosted vLLM and training GPU rates are listed in the [Training Quicksta
467
468
  %%{init: {"theme": "neutral", "themeVariables": {"fontSize": "14px"}} }%%
468
469
  graph LR
469
470
  subgraph "Platform"
470
- Dashboard["📊 hud.so"]
471
- API["🔌 mcp.hud.so"]
471
+ Dashboard["📊 hud.ai"]
472
+ API["🔌 mcp.hud.ai"]
472
473
  end
473
474
 
474
475
  subgraph "hud"
@@ -506,14 +507,14 @@ graph LR
506
507
 
507
508
  | Command | Purpose | Docs |
508
509
  | ----------------------- | ------------------------------------------ | ---- |
509
- | [`hud init`](https://docs.hud.so/reference/cli/init) | Create new environment with boilerplate. | [📖](https://docs.hud.so/reference/cli/init) |
510
- | [`hud dev`](https://docs.hud.so/reference/cli/dev) | Hot-reload development with Docker. | [📖](https://docs.hud.so/reference/cli/dev) |
511
- | [`hud build`](https://docs.hud.so/reference/cli/build) | Build image and generate lock file. | [📖](https://docs.hud.so/reference/cli/build) |
512
- | [`hud push`](https://docs.hud.so/reference/cli/push) | Share environment to registry. | [📖](https://docs.hud.so/reference/cli/push) |
513
- | [`hud pull <target>`](https://docs.hud.so/reference/cli/pull) | Get environment from registry. | [📖](https://docs.hud.so/reference/cli/pull) |
514
- | [`hud analyze <image>`](https://docs.hud.so/reference/cli/analyze) | Discover tools, resources, and metadata. | [📖](https://docs.hud.so/reference/cli/analyze) |
515
- | [`hud debug <image>`](https://docs.hud.so/reference/cli/debug) | Five-phase health check of an environment. | [📖](https://docs.hud.so/reference/cli/debug) |
516
- | [`hud run <image>`](https://docs.hud.so/reference/cli/run) | Run MCP server locally or remotely. | [📖](https://docs.hud.so/reference/cli/run) |
510
+ | [`hud init`](https://docs.hud.ai/reference/cli/init) | Create new environment with boilerplate. | [📖](https://docs.hud.ai/reference/cli/init) |
511
+ | [`hud dev`](https://docs.hud.ai/reference/cli/dev) | Hot-reload development with Docker. | [📖](https://docs.hud.ai/reference/cli/dev) |
512
+ | [`hud build`](https://docs.hud.ai/reference/cli/build) | Build image and generate lock file. | [📖](https://docs.hud.ai/reference/cli/build) |
513
+ | [`hud push`](https://docs.hud.ai/reference/cli/push) | Share environment to registry. | [📖](https://docs.hud.ai/reference/cli/push) |
514
+ | [`hud pull <target>`](https://docs.hud.ai/reference/cli/pull) | Get environment from registry. | [📖](https://docs.hud.ai/reference/cli/pull) |
515
+ | [`hud analyze <image>`](https://docs.hud.ai/reference/cli/analyze) | Discover tools, resources, and metadata. | [📖](https://docs.hud.ai/reference/cli/analyze) |
516
+ | [`hud debug <image>`](https://docs.hud.ai/reference/cli/debug) | Five-phase health check of an environment. | [📖](https://docs.hud.ai/reference/cli/debug) |
517
+ | [`hud run <image>`](https://docs.hud.ai/reference/cli/run) | Run MCP server locally or remotely. | [📖](https://docs.hud.ai/reference/cli/run) |
517
518
 
518
519
  ## Roadmap
519
520
 
@@ -13,21 +13,21 @@ OSS RL environment + evals toolkit. Wrap software as environments, run benchmark
13
13
  [![Add docs to Cursor](https://img.shields.io/badge/Add%20docs%20to-Cursor-black?style=flat-square)](https://cursor.com/en/install-mcp?name=docs-hud-python&config=eyJ1cmwiOiJodHRwczovL2RvY3MuaHVkLnNvL21jcCJ9)
14
14
  [![Discord](https://img.shields.io/discord/1327447144772407390?label=Discord&logo=discord&style=flat-square)](https://discord.gg/wkjtmHYYjm)
15
15
  [![X Follow](https://img.shields.io/twitter/follow/hud_evals?style=social)](https://x.com/intent/user?screen_name=hud_evals)
16
- [![Shop](https://img.shields.io/badge/_-white.svg?label=shop&logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAJCAYAAAAywQxIAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAACxMAAAsTAQCanBgAAAF6SURBVChTlZA9ixNhFIWf8yaTpFHRRMXCKpAZhCAYFvwoLHZhwUKw9A9YCJb+Bq0sxGbBQrTxX1j41dvIRAjGZbdwRUUGIzPMeyw2swS3WZ/ynHvP5VylafoAWAd+5Xm+wX+SpukmcMf29RDCZrD9BViz3f53+CjYngKZpD5A2/Y7SQBMJpOkKIprdV1vdzqdHzHGblmW9Ww2+5pl2TmAxWKxmM/nP8fj8cmqqtZijJ9sb0u6ABBWjh0riuIt8CqE8LGu66e2d5MkeQ8QY3xme7fb7T4ZjUbrZVl+jjFuSXoEXGxCDgIl9WzfAO5LSmzvNB771R6vzG4Bx0MIt/M8vwV8aLyDQNt70+n0G1AspaTxVln+aghQluVsKbvxVysflT9NQK/XO7R/SGiQ9Nt2aftElmWXJd1kv0kbeANQVdWl4XB4XtJouXaqNRgMHkrqS+r0+/3XwD1JXdungRfAVWBi+6WkK8D3EMJz22cl3W21WgNgx3YAzvwFd0Chdq03gKUAAAAASUVORK5CYII=&style=social)](https://shop.hud.so)
16
+ [![Shop](https://img.shields.io/badge/_-white.svg?label=shop&logo=data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAJCAYAAAAywQxIAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAACxMAAAsTAQCanBgAAAF6SURBVChTlZA9ixNhFIWf8yaTpFHRRMXCKpAZhCAYFvwoLHZhwUKw9A9YCJb+Bq0sxGbBQrTxX1j41dvIRAjGZbdwRUUGIzPMeyw2swS3WZ/ynHvP5VylafoAWAd+5Xm+wX+SpukmcMf29RDCZrD9BViz3f53+CjYngKZpD5A2/Y7SQBMJpOkKIprdV1vdzqdHzHGblmW9Ww2+5pl2TmAxWKxmM/nP8fj8cmqqtZijJ9sb0u6ABBWjh0riuIt8CqE8LGu66e2d5MkeQ8QY3xme7fb7T4ZjUbrZVl+jjFuSXoEXGxCDgIl9WzfAO5LSmzvNB771R6vzG4Bx0MIt/M8vwV8aLyDQNt70+n0G1AspaTxVln+aghQluVsKbvxVysflT9NQK/XO7R/SGiQ9Nt2aftElmWXJd1kv0kbeANQVdWl4XB4XtJouXaqNRgMHkrqS+r0+/3XwD1JXdungRfAVWBi+6WkK8D3EMJz22cl3W21WgNgx3YAzvwFd0Chdq03gKUAAAAASUVORK5CYII=&style=social)](https://shop.hud.ai)
17
17
 
18
18
 
19
19
  ### Are you a startup building agents?
20
20
 
21
- [📅 Hop on a call](https://cal.com/jay-ram-z6st6w/demo) or [📧 founders@hud.so](mailto:founders@hud.so)
21
+ [📅 Hop on a call](https://cal.com/jay-ram-z6st6w/demo) or [📧 founders@hud.ai](mailto:founders@hud.ai)
22
22
 
23
23
  ## Highlights
24
24
 
25
- - 🚀 **[MCP environment skeleton](https://docs.hud.so/core-concepts/mcp-protocol)** – any agent can call any environment.
26
- - ⚡️ **[Live telemetry](https://hud.so)** – inspect every tool call, observation, and reward in real time.
27
- - 🗂️ **[Public benchmarks](https://hud.so/leaderboards)** – OSWorld-Verified, SheetBench-50, and more.
25
+ - 🚀 **[MCP environment skeleton](https://docs.hud.ai/core-concepts/mcp-protocol)** – any agent can call any environment.
26
+ - ⚡️ **[Live telemetry](https://hud.ai)** – inspect every tool call, observation, and reward in real time.
27
+ - 🗂️ **[Public benchmarks](https://hud.ai/leaderboards)** – OSWorld-Verified, SheetBench-50, and more.
28
28
  - 🌐 **[Cloud browsers](environments/remote_browser/)** – AnchorBrowser, Steel, BrowserBase integrations for browser automation.
29
29
  - 🛠️ **[Hot-reload dev loop](environments/README.md#phase-5-hot-reload-development-with-cursor-agent)** – `hud dev` for iterating on environments without rebuilds.
30
- - 🎓 **[One-click RL](https://hud.so/models)** – Run `hud rl` to get a trained model on any environment.
30
+ - 🎓 **[One-click RL](https://hud.ai/models)** – Run `hud rl` to get a trained model on any environment.
31
31
 
32
32
  > We welcome contributors and feature requests – open an issue or hop on a call to discuss improvements!
33
33
 
@@ -42,10 +42,10 @@ uv tool install hud-python
42
42
  # uv tool update-shell
43
43
  ```
44
44
 
45
- > See [docs.hud.so](https://docs.hud.so), or add docs to any MCP client:
46
- > `claude mcp add --transport http docs-hud https://docs.hud.so/mcp`
45
+ > See [docs.hud.ai](https://docs.hud.ai), or add docs to any MCP client:
46
+ > `claude mcp add --transport http docs-hud https://docs.hud.ai/mcp`
47
47
 
48
- Before starting, get your HUD_API_KEY at [hud.so](https://hud.so).
48
+ Before starting, get your HUD_API_KEY at [hud.ai](https://hud.ai).
49
49
 
50
50
 
51
51
  ## Quickstart: Evals
@@ -63,17 +63,17 @@ import asyncio, hud, os
63
63
  from hud.settings import settings
64
64
  from hud.clients import MCPClient
65
65
  from hud.agents import ClaudeAgent
66
- from hud.datasets import Task # See docs: https://docs.hud.so/reference/tasks
66
+ from hud.datasets import Task # See docs: https://docs.hud.ai/reference/tasks
67
67
 
68
68
  async def main() -> None:
69
- with hud.trace("Quick Start 2048"): # All telemetry works for any MCP-based agent (see https://hud.so)
69
+ with hud.trace("Quick Start 2048"): # All telemetry works for any MCP-based agent (see https://hud.ai)
70
70
  task = {
71
71
  "prompt": "Reach 64 in 2048.",
72
72
  "mcp_config": {
73
73
  "hud": {
74
- "url": "https://mcp.hud.so/v3/mcp", # HUD's cloud MCP server (see https://docs.hud.so/core-concepts/architecture)
74
+ "url": "https://mcp.hud.ai/v3/mcp", # HUD's cloud MCP server (see https://docs.hud.ai/core-concepts/architecture)
75
75
  "headers": {
76
- "Authorization": f"Bearer {settings.api_key}", # Get your key at https://hud.so
76
+ "Authorization": f"Bearer {settings.api_key}", # Get your key at https://hud.ai
77
77
  "Mcp-Image": "hudpython/hud-text-2048:v1.2" # Docker image from https://hub.docker.com/u/hudpython
78
78
  }
79
79
  }
@@ -100,7 +100,7 @@ async def main() -> None:
100
100
  asyncio.run(main())
101
101
  ```
102
102
 
103
- The above example lets the agent play 2048 ([See replay](https://hud.so/trace/6feed7bd-5f67-4d66-b77f-eb1e3164604f))
103
+ The above example lets the agent play 2048 ([See replay](https://hud.ai/trace/6feed7bd-5f67-4d66-b77f-eb1e3164604f))
104
104
 
105
105
  ![Agent playing 2048](https://raw.githubusercontent.com/hud-evals/hud-python/main/docs/src/images/2048_1.gif)
106
106
 
@@ -113,7 +113,7 @@ hud get hud-evals/2048-basic # from HF
113
113
  hud rl 2048-basic.json
114
114
  ```
115
115
 
116
- > See [agent training docs](https://docs.hud.so/train-agents/quickstart)
116
+ > See [agent training docs](https://docs.hud.ai/train-agents/quickstart)
117
117
 
118
118
  Or make your own environment and dataset:
119
119
 
@@ -124,7 +124,7 @@ hud dev --interactive
124
124
  hud rl
125
125
  ```
126
126
 
127
- > See [environment design docs](https://docs.hud.so/build-environments)
127
+ > See [environment design docs](https://docs.hud.ai/build-environments)
128
128
 
129
129
  ## Benchmarking Agents
130
130
 
@@ -132,7 +132,7 @@ This is Claude Computer Use running on our proprietary financial analyst benchma
132
132
 
133
133
  ![Trace screenshot](https://raw.githubusercontent.com/hud-evals/hud-python/main/docs/src/images/trace_sheet.gif)
134
134
 
135
- > [See this trace on _hud.so_](https://hud.so/trace/9e212e9e-3627-4f1f-9eb5-c6d03c59070a)
135
+ > [See this trace on _hud.ai_](https://hud.ai/trace/9e212e9e-3627-4f1f-9eb5-c6d03c59070a)
136
136
 
137
137
  This example runs the full dataset (only takes ~20 minutes) using [run_evaluation.py](examples/run_evaluation.py):
138
138
 
@@ -150,7 +150,7 @@ from hud.agents import ClaudeAgent
150
150
  results = await run_dataset(
151
151
  name="My SheetBench-50 Evaluation",
152
152
  dataset="hud-evals/SheetBench-50", # <-- HuggingFace dataset
153
- agent_class=ClaudeAgent, # <-- Your custom agent can replace this (see https://docs.hud.so/evaluate-agents/create-agents)
153
+ agent_class=ClaudeAgent, # <-- Your custom agent can replace this (see https://docs.hud.ai/evaluate-agents/create-agents)
154
154
  agent_config={"model": "claude-sonnet-4-20250514"},
155
155
  max_concurrent=50,
156
156
  max_steps=30,
@@ -158,13 +158,13 @@ results = await run_dataset(
158
158
  print(f"Average reward: {sum(r.reward for r in results) / len(results):.2f}")
159
159
  ```
160
160
 
161
- > Running a dataset creates a job and streams results to the [hud.so](https://hud.so) platform for analysis and [leaderboard submission](https://docs.hud.so/evaluate-agents/leaderboards).
161
+ > Running a dataset creates a job and streams results to the [hud.ai](https://hud.ai) platform for analysis and [leaderboard submission](https://docs.hud.ai/evaluate-agents/leaderboards).
162
162
 
163
163
  ## Building Environments (MCP)
164
164
 
165
165
  This is how you can make any environment into an interactable one in 5 steps:
166
166
 
167
- 1. Define MCP server layer using [`MCPServer`](https://docs.hud.so/reference/environments)
167
+ 1. Define MCP server layer using [`MCPServer`](https://docs.hud.ai/reference/environments)
168
168
 
169
169
  ```python
170
170
  from hud.server import MCPServer
@@ -172,10 +172,10 @@ from hud.tools import HudComputerTool
172
172
 
173
173
  mcp = MCPServer("My Environment")
174
174
 
175
- # Add hud tools (see all tools: https://docs.hud.so/reference/tools)
175
+ # Add hud tools (see all tools: https://docs.hud.ai/reference/tools)
176
176
  mcp.tool(HudComputerTool())
177
177
 
178
- # Or custom tools (see https://docs.hud.so/build-environments/adapting-software)
178
+ # Or custom tools (see https://docs.hud.ai/build-environments/adapting-software)
179
179
  @mcp.tool("launch_app")
180
180
  def launch_app(name: str = "Gmail"):
181
181
  ...
@@ -249,16 +249,16 @@ Tools
249
249
  hud push # needs docker login, hud api key
250
250
  ```
251
251
 
252
- 5. Now you can use `mcp.hud.so` to launch 100s of instances of this environment in parallel with any agent, and see everything live on [hud.so](https://hud.so):
252
+ 5. Now you can use `mcp.hud.ai` to launch 100s of instances of this environment in parallel with any agent, and see everything live on [hud.ai](https://hud.ai):
253
253
 
254
254
  ```python
255
255
  from hud.agents import ClaudeAgent
256
256
 
257
- result = await ClaudeAgent().run({ # See all agents: https://docs.hud.so/reference/agents
257
+ result = await ClaudeAgent().run({ # See all agents: https://docs.hud.ai/reference/agents
258
258
  "prompt": "Please explore this environment",
259
259
  "mcp_config": {
260
260
  "my-environment": {
261
- "url": "https://mcp.hud.so/v3/mcp",
261
+ "url": "https://mcp.hud.ai/v3/mcp",
262
262
  "headers": {
263
263
  "Authorization": f"Bearer {os.getenv('HUD_API_KEY')}",
264
264
  "Mcp-Image": "my-name/my-environment:latest"
@@ -280,13 +280,13 @@ result = await ClaudeAgent().run({ # See all agents: https://docs.hud.so/refere
280
280
 
281
281
  ## Leaderboards & benchmarks
282
282
 
283
- All leaderboards are publicly available on [hud.so/leaderboards](https://hud.so/leaderboards) (see [docs](https://docs.hud.so/evaluate-agents/leaderboards))
283
+ All leaderboards are publicly available on [hud.ai/leaderboards](https://hud.ai/leaderboards) (see [docs](https://docs.hud.ai/evaluate-agents/leaderboards))
284
284
 
285
285
  ![Leaderboard](https://raw.githubusercontent.com/hud-evals/hud-python/main/docs/src/images/leaderboards_3.png)
286
286
 
287
287
  We highly suggest running 3-5 evaluations per dataset for the most consistent results across multiple jobs.
288
288
 
289
- Using the [`run_dataset`](https://docs.hud.so/reference/tasks#run_dataset) function with a HuggingFace dataset automatically assigns your job to that leaderboard page, and allows you to create a scorecard out of it:
289
+ Using the [`run_dataset`](https://docs.hud.ai/reference/tasks#run_dataset) function with a HuggingFace dataset automatically assigns your job to that leaderboard page, and allows you to create a scorecard out of it:
290
290
 
291
291
  ## Reinforcement Learning with GRPO
292
292
 
@@ -315,11 +315,11 @@ Supports multi‑turn RL for both:
315
315
  - Language‑only models (e.g., `Qwen/Qwen2.5-7B-Instruct`)
316
316
  - Vision‑Language models (e.g., `Qwen/Qwen2.5-VL-3B-Instruct`)
317
317
 
318
- By default, `hud rl` provisions a persistent server and trainer in the cloud, streams telemetry to `hud.so`, and lets you monitor/manage models at `hud.so/models`. Use `--local` to run entirely on your machines (typically 2+ GPUs: one for vLLM, the rest for training).
318
+ By default, `hud rl` provisions a persistent server and trainer in the cloud, streams telemetry to `hud.ai`, and lets you monitor/manage models at `hud.ai/models`. Use `--local` to run entirely on your machines (typically 2+ GPUs: one for vLLM, the rest for training).
319
319
 
320
- Any HUD MCP environment and evaluation works with our RL pipeline (including remote configurations). See the guided docs: `https://docs.hud.so/train-agents/quickstart`.
320
+ Any HUD MCP environment and evaluation works with our RL pipeline (including remote configurations). See the guided docs: `https://docs.hud.ai/train-agents/quickstart`.
321
321
 
322
- Pricing: Hosted vLLM and training GPU rates are listed in the [Training Quickstart → Pricing](https://docs.hud.so/train-agents/quickstart#pricing). Manage billing at the [HUD billing dashboard](https://hud.so/project/billing).
322
+ Pricing: Hosted vLLM and training GPU rates are listed in the [Training Quickstart → Pricing](https://docs.hud.ai/train-agents/quickstart#pricing). Manage billing at the [HUD billing dashboard](https://hud.ai/project/billing).
323
323
 
324
324
  ## Architecture
325
325
 
@@ -327,8 +327,8 @@ Pricing: Hosted vLLM and training GPU rates are listed in the [Training Quicksta
327
327
  %%{init: {"theme": "neutral", "themeVariables": {"fontSize": "14px"}} }%%
328
328
  graph LR
329
329
  subgraph "Platform"
330
- Dashboard["📊 hud.so"]
331
- API["🔌 mcp.hud.so"]
330
+ Dashboard["📊 hud.ai"]
331
+ API["🔌 mcp.hud.ai"]
332
332
  end
333
333
 
334
334
  subgraph "hud"
@@ -366,14 +366,14 @@ graph LR
366
366
 
367
367
  | Command | Purpose | Docs |
368
368
  | ----------------------- | ------------------------------------------ | ---- |
369
- | [`hud init`](https://docs.hud.so/reference/cli/init) | Create new environment with boilerplate. | [📖](https://docs.hud.so/reference/cli/init) |
370
- | [`hud dev`](https://docs.hud.so/reference/cli/dev) | Hot-reload development with Docker. | [📖](https://docs.hud.so/reference/cli/dev) |
371
- | [`hud build`](https://docs.hud.so/reference/cli/build) | Build image and generate lock file. | [📖](https://docs.hud.so/reference/cli/build) |
372
- | [`hud push`](https://docs.hud.so/reference/cli/push) | Share environment to registry. | [📖](https://docs.hud.so/reference/cli/push) |
373
- | [`hud pull <target>`](https://docs.hud.so/reference/cli/pull) | Get environment from registry. | [📖](https://docs.hud.so/reference/cli/pull) |
374
- | [`hud analyze <image>`](https://docs.hud.so/reference/cli/analyze) | Discover tools, resources, and metadata. | [📖](https://docs.hud.so/reference/cli/analyze) |
375
- | [`hud debug <image>`](https://docs.hud.so/reference/cli/debug) | Five-phase health check of an environment. | [📖](https://docs.hud.so/reference/cli/debug) |
376
- | [`hud run <image>`](https://docs.hud.so/reference/cli/run) | Run MCP server locally or remotely. | [📖](https://docs.hud.so/reference/cli/run) |
369
+ | [`hud init`](https://docs.hud.ai/reference/cli/init) | Create new environment with boilerplate. | [📖](https://docs.hud.ai/reference/cli/init) |
370
+ | [`hud dev`](https://docs.hud.ai/reference/cli/dev) | Hot-reload development with Docker. | [📖](https://docs.hud.ai/reference/cli/dev) |
371
+ | [`hud build`](https://docs.hud.ai/reference/cli/build) | Build image and generate lock file. | [📖](https://docs.hud.ai/reference/cli/build) |
372
+ | [`hud push`](https://docs.hud.ai/reference/cli/push) | Share environment to registry. | [📖](https://docs.hud.ai/reference/cli/push) |
373
+ | [`hud pull <target>`](https://docs.hud.ai/reference/cli/pull) | Get environment from registry. | [📖](https://docs.hud.ai/reference/cli/pull) |
374
+ | [`hud analyze <image>`](https://docs.hud.ai/reference/cli/analyze) | Discover tools, resources, and metadata. | [📖](https://docs.hud.ai/reference/cli/analyze) |
375
+ | [`hud debug <image>`](https://docs.hud.ai/reference/cli/debug) | Five-phase health check of an environment. | [📖](https://docs.hud.ai/reference/cli/debug) |
376
+ | [`hud run <image>`](https://docs.hud.ai/reference/cli/run) | Run MCP server locally or remotely. | [📖](https://docs.hud.ai/reference/cli/run) |
377
377
 
378
378
  ## Roadmap
379
379
 
@@ -1,7 +1,7 @@
1
1
  # Blank Environment
2
2
 
3
3
  Minimal starter template for building HUD environments.
4
- See [docs](https://docs.hud.so/build-environments) for the complete environment design workflow.
4
+ See [docs](https://docs.hud.ai/build-environments) for the complete environment design workflow.
5
5
 
6
6
  ## Architecture
7
7
 
@@ -120,9 +120,9 @@ save_tasks(tasks, repo_id="your-org/your-dataset")
120
120
  hud eval "your-org/your-dataset" claude
121
121
 
122
122
  # View results at:
123
- # hud.so/leaderboards/your-org/your-dataset
123
+ # hud.ai/leaderboards/your-org/your-dataset
124
124
  ```
125
125
 
126
126
  **Note**: Only public HuggingFace datasets appear as leaderboards!
127
127
 
128
- 📚 Learn more: [Creating Benchmarks](https://docs.hud.so/evaluate-agents/create-benchmarks) | [Leaderboards](https://docs.hud.so/evaluate-agents/leaderboards)
128
+ 📚 Learn more: [Creating Benchmarks](https://docs.hud.ai/evaluate-agents/create-benchmarks) | [Leaderboards](https://docs.hud.ai/evaluate-agents/leaderboards)
@@ -74,12 +74,12 @@ save_tasks(tasks, repo_id="your-org/your-dataset")
74
74
  hud eval "your-org/your-dataset" --agent claude
75
75
 
76
76
  # View results at:
77
- # hud.so/leaderboards/your-org/your-dataset
77
+ # hud.ai/leaderboards/your-org/your-dataset
78
78
  ```
79
79
 
80
80
  **Note**: Only public HuggingFace datasets appear as leaderboards!
81
81
 
82
- 📚 Learn more: [Creating Benchmarks](https://docs.hud.so/evaluate-agents/create-benchmarks) | [Leaderboards](https://docs.hud.so/evaluate-agents/leaderboards)
82
+ 📚 Learn more: [Creating Benchmarks](https://docs.hud.ai/evaluate-agents/create-benchmarks) | [Leaderboards](https://docs.hud.ai/evaluate-agents/leaderboards)
83
83
 
84
84
  ## Architecture Overview
85
85
 
@@ -4,7 +4,7 @@ version = "0.1.0"
4
4
  description = "HUD Browser MCP Server"
5
5
  requires-python = ">=3.11,<3.14"
6
6
  dependencies = [
7
- "hud-python>=0.4.60",
7
+ "hud-python>=0.4.62",
8
8
  "httpx",
9
9
  "playwright",
10
10
  "pyautogui",
@@ -1,7 +1,7 @@
1
1
  # Deep Research Environment
2
2
 
3
3
  Web research environment powered by Exa API for searching and fetching content.
4
- See [docs](https://docs.hud.so/build-environments) for the complete environment design workflow.
4
+ See [docs](https://docs.hud.ai/build-environments) for the complete environment design workflow.
5
5
 
6
6
  ## Architecture
7
7
 
@@ -141,12 +141,12 @@ save_tasks(tasks, repo_id="your-org/your-dataset")
141
141
  hud eval "your-org/your-dataset" --agent claude
142
142
 
143
143
  # View results at:
144
- # hud.so/leaderboards/your-org/your-dataset
144
+ # hud.ai/leaderboards/your-org/your-dataset
145
145
  ```
146
146
 
147
147
  **Note**: Only public HuggingFace datasets appear as leaderboards!
148
148
 
149
- 📚 Learn more: [Creating Benchmarks](https://docs.hud.so/evaluate-agents/create-benchmarks) | [Leaderboards](https://docs.hud.so/evaluate-agents/leaderboards)
149
+ 📚 Learn more: [Creating Benchmarks](https://docs.hud.ai/evaluate-agents/create-benchmarks) | [Leaderboards](https://docs.hud.ai/evaluate-agents/leaderboards)
150
150
 
151
151
  ## Example Research Workflow
152
152
 
@@ -0,0 +1,68 @@
1
+ # Jupyter Env (for SpreadSheetBench)
2
+
3
+ ## QuickStart
4
+
5
+ ### MCP Server from Docker Hub (No Need to Build the Docker Image)
6
+
7
+ Run the task with:
8
+ ```
9
+ hud eval Genteki/SpreadSheetBench
10
+ ```
11
+
12
+ ### Local MCP Server
13
+
14
+ First, build the Docker image:
15
+ ```
16
+ docker build -t <image/name> .
17
+ ```
18
+ Then set the Docker image name in `test_task.json`. Finally, export every required API key as an environment variable and run:
19
+
20
+ ```
21
+ hud eval
22
+ ```
23
+
24
+ ## File Structure
25
+
26
+ `environments/jupyter` file structure:
27
+ ```
28
+ ├── Dockerfile
29
+ ├── server
30
+ │ ├── config.py
31
+ │ ├── evaluate
32
+ │ │ ├── compare.py
33
+ │ │ ├── dumb.py
34
+ │ │ ├── eval_all.py
35
+ │ │ ├── eval_single.py
36
+ │ │ ├── generalize.py
37
+ │ │ └── __init__.py
38
+ │ ├── __init__.py
39
+ │ ├── main.py
40
+ │ ├── pyproject.toml
41
+ │ ├── setup
42
+ │ │ ├── __init__.py
43
+ │ │ └── load_spreadsheet.py
44
+ │ └── tools
45
+ │ ├── __init__.py
46
+ │ └── jupyter_with_record.py
47
+ └── test_task.json
48
+ ```
49
+ The main parts of the environment are:
50
+ * `main.py`: entry point of the MCP server
51
+ * `tools/jupyter_with_record.py`: provides the `execute_code` method, which lets the agent interact with the Jupyter kernel and records the solution
52
+ * `setup/`: setup methods for the eval task
53
+ * `evaluate/`: evaluation methods for the eval task
54
+
55
+
56
+ ## Related Links
57
+ ### Hugging Face
58
+ * [Genteki/SpreadSheetBench-Tiny](https://huggingface.co/datasets/Genteki/SpreadSheetBench-Tiny) (Size: 10)
59
+ * [Genteki/SpreadSheetBench-200](https://huggingface.co/datasets/Genteki/SpreadSheetBench-200) (Size: 200)
60
+ * [Genteki/SpreadSheetBench](https://huggingface.co/datasets/Genteki/SpreadSheetBench) (Size: 912)
61
+
62
+ ### Example Traces (May require permission)
63
+ * [Single Test Task](https://www.hud.ai/trace/d31de170-e70a-4abb-8f95-70512515dade)
64
+ * [Genteki/SpreadSheetBench-Tiny Test](https://www.hud.ai/jobs/2c426368-e352-4c79-af4a-aefb136e3f58)
65
+
66
+ ### GitHub
67
+
68
+ * Feature Branch: [New-Env-Jupyter](https://github.com/Genteki/hud-python/tree/New-Env-Jupyter)