azure-ai-evaluation 1.0.0__tar.gz → 1.0.0b2__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release: this version of azure-ai-evaluation might be problematic.

Files changed (204)
  1. azure_ai_evaluation-1.0.0b2/CHANGELOG.md +23 -0
  2. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/MANIFEST.in +0 -2
  3. azure_ai_evaluation-1.0.0b2/PKG-INFO +449 -0
  4. azure_ai_evaluation-1.0.0b2/README.md +389 -0
  5. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/__init__.py +5 -31
  6. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_common/constants.py +2 -9
  7. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_common/rai_service.py +452 -0
  8. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_common/utils.py +87 -0
  9. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_constants.py +6 -19
  10. {azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluate/_batch_run → azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluate/_batch_run_client}/__init__.py +2 -3
  11. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluate/_batch_run/eval_run_context.py → azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluate/_batch_run_client/batch_run_context.py +7 -23
  12. {azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluate/_batch_run → azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluate/_batch_run_client}/code_client.py +17 -33
  13. {azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluate/_batch_run → azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluate/_batch_run_client}/proxy_client.py +4 -32
  14. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluate/_eval_run.py +24 -81
  15. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluate/_evaluate.py +239 -393
  16. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluate/_telemetry/__init__.py +17 -17
  17. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluate/_utils.py +28 -82
  18. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluators/_bleu/_bleu.py +18 -17
  19. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluators/_chat/__init__.py +9 -0
  20. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluators/_chat/_chat.py +357 -0
  21. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluators/_chat/retrieval/__init__.py +9 -0
  22. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluators/_chat/retrieval/_retrieval.py +157 -0
  23. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluators/_chat/retrieval/retrieval.prompty +48 -0
  24. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluators/_coherence/_coherence.py +117 -0
  25. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluators/_coherence/coherence.prompty +62 -0
  26. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluators/_content_safety/__init__.py +4 -0
  27. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluators/_content_safety/_content_safety.py +106 -0
  28. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_multimodal/_content_safety_multimodal_base.py → azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluators/_content_safety/_content_safety_base.py +34 -24
  29. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluators/_content_safety/_content_safety_chat.py +301 -0
  30. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluators/_content_safety/_hate_unfairness.py +78 -0
  31. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluators/_content_safety/_self_harm.py +76 -0
  32. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluators/_content_safety/_sexual.py +76 -0
  33. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluators/_content_safety/_violence.py +76 -0
  34. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluators/_eci/_eci.py +99 -0
  35. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluators/_f1_score/_f1_score.py +19 -34
  36. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluators/_fluency/_fluency.py +117 -0
  37. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluators/_fluency/fluency.prompty +61 -0
  38. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluators/_gleu/_gleu.py +16 -14
  39. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluators/_groundedness/_groundedness.py +118 -0
  40. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluators/_groundedness/groundedness.prompty +54 -0
  41. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluators/_meteor/_meteor.py +27 -20
  42. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluators/_protected_material/_protected_material.py +104 -0
  43. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluators/_protected_materials/__init__.py +5 -0
  44. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluators/_protected_materials/_protected_materials.py +104 -0
  45. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluators/_qa/_qa.py +30 -23
  46. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluators/_relevance/_relevance.py +126 -0
  47. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluators/_relevance/relevance.prompty +69 -0
  48. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluators/_rouge/_rouge.py +27 -26
  49. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluators/_similarity/_similarity.py +125 -0
  50. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluators/_similarity/similarity.prompty +5 -0
  51. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_evaluators/_xpia/xpia.py +139 -0
  52. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_exceptions.py +7 -28
  53. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_http_utils.py +132 -203
  54. azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/_model_configurations.py +27 -0
  55. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_version.py +1 -1
  56. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/simulator/__init__.py +1 -2
  57. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/simulator/_adversarial_scenario.py +1 -20
  58. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/simulator/_adversarial_simulator.py +92 -111
  59. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/simulator/_constants.py +1 -11
  60. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/simulator/_conversation/__init__.py +12 -13
  61. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/simulator/_conversation/_conversation.py +4 -4
  62. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/simulator/_direct_attack_simulator.py +67 -33
  63. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/simulator/_helpers/__init__.py +2 -1
  64. {azure_ai_evaluation-1.0.0/azure/ai/evaluation/_common → azure_ai_evaluation-1.0.0b2/azure/ai/evaluation/simulator/_helpers}/_experimental.py +9 -24
  65. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/simulator/_helpers/_simulator_data_classes.py +5 -26
  66. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/simulator/_indirect_attack_simulator.py +94 -107
  67. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/simulator/_model_tools/_identity_manager.py +22 -70
  68. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/simulator/_model_tools/_proxy_completion_model.py +11 -28
  69. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/simulator/_model_tools/_rai_client.py +4 -8
  70. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/simulator/_model_tools/_template_handler.py +24 -68
  71. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/simulator/_model_tools/models.py +10 -10
  72. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/simulator/_prompty/task_query_response.prompty +10 -6
  73. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/simulator/_prompty/task_simulate.prompty +5 -6
  74. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/simulator/_simulator.py +207 -277
  75. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/simulator/_tracing.py +4 -4
  76. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/simulator/_utils.py +13 -31
  77. azure_ai_evaluation-1.0.0b2/azure_ai_evaluation.egg-info/PKG-INFO +449 -0
  78. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure_ai_evaluation.egg-info/SOURCES.txt +17 -43
  79. azure_ai_evaluation-1.0.0b2/azure_ai_evaluation.egg-info/requires.txt +16 -0
  80. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/pyproject.toml +4 -2
  81. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/setup.py +6 -6
  82. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/tests/conftest.py +9 -57
  83. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/tests/e2etests/custom_evaluators/answer_length_with_aggregation.py +2 -9
  84. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/tests/e2etests/target_fn.py +0 -18
  85. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/tests/e2etests/test_adv_simulator.py +24 -51
  86. azure_ai_evaluation-1.0.0b2/tests/e2etests/test_builtin_evaluators.py +514 -0
  87. azure_ai_evaluation-1.0.0b2/tests/e2etests/test_evaluate.py +520 -0
  88. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/tests/unittests/test_batch_run_context.py +8 -8
  89. azure_ai_evaluation-1.0.0b2/tests/unittests/test_built_in_evaluator.py +46 -0
  90. azure_ai_evaluation-1.0.0b2/tests/unittests/test_chat_evaluator.py +109 -0
  91. azure_ai_evaluation-1.0.0b2/tests/unittests/test_content_safety_chat_evaluator.py +82 -0
  92. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/tests/unittests/test_content_safety_rai_script.py +26 -72
  93. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/tests/unittests/test_eval_run.py +4 -33
  94. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/tests/unittests/test_evaluate.py +50 -130
  95. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/tests/unittests/test_evaluate_telemetry.py +10 -11
  96. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/tests/unittests/test_jailbreak_simulator.py +3 -4
  97. azure_ai_evaluation-1.0.0b2/tests/unittests/test_non_adv_simulator.py +130 -0
  98. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/tests/unittests/test_simulator.py +7 -6
  99. azure_ai_evaluation-1.0.0b2/tests/unittests/test_utils.py +20 -0
  100. azure_ai_evaluation-1.0.0/CHANGELOG.md +0 -214
  101. azure_ai_evaluation-1.0.0/NOTICE.txt +0 -70
  102. azure_ai_evaluation-1.0.0/PKG-INFO +0 -595
  103. azure_ai_evaluation-1.0.0/README.md +0 -345
  104. azure_ai_evaluation-1.0.0/TROUBLESHOOTING.md +0 -61
  105. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_common/math.py +0 -89
  106. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_common/rai_service.py +0 -632
  107. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_common/utils.py +0 -445
  108. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluate/_batch_run/target_run_context.py +0 -46
  109. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_coherence/_coherence.py +0 -107
  110. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_coherence/coherence.prompty +0 -99
  111. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_common/__init__.py +0 -13
  112. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_common/_base_eval.py +0 -344
  113. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_common/_base_prompty_eval.py +0 -88
  114. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_common/_base_rai_svc_eval.py +0 -133
  115. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_content_safety/_content_safety.py +0 -144
  116. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_content_safety/_hate_unfairness.py +0 -129
  117. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_content_safety/_self_harm.py +0 -123
  118. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_content_safety/_sexual.py +0 -125
  119. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_content_safety/_violence.py +0 -126
  120. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_eci/_eci.py +0 -89
  121. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_fluency/_fluency.py +0 -104
  122. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_fluency/fluency.prompty +0 -86
  123. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_groundedness/_groundedness.py +0 -144
  124. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_groundedness/groundedness_with_query.prompty +0 -113
  125. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_groundedness/groundedness_without_query.prompty +0 -99
  126. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_multimodal/__init__.py +0 -20
  127. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_multimodal/_content_safety_multimodal.py +0 -132
  128. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_multimodal/_hate_unfairness.py +0 -100
  129. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_multimodal/_protected_material.py +0 -124
  130. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_multimodal/_self_harm.py +0 -100
  131. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_multimodal/_sexual.py +0 -100
  132. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_multimodal/_violence.py +0 -100
  133. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_protected_material/_protected_material.py +0 -113
  134. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_relevance/_relevance.py +0 -114
  135. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_relevance/relevance.prompty +0 -100
  136. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_retrieval/__init__.py +0 -9
  137. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_retrieval/_retrieval.py +0 -112
  138. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_retrieval/retrieval.prompty +0 -93
  139. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_service_groundedness/__init__.py +0 -9
  140. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_service_groundedness/_service_groundedness.py +0 -148
  141. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_similarity/_similarity.py +0 -140
  142. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_evaluators/_xpia/xpia.py +0 -125
  143. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_model_configurations.py +0 -123
  144. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_vendor/__init__.py +0 -3
  145. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_vendor/rouge_score/__init__.py +0 -14
  146. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_vendor/rouge_score/rouge_scorer.py +0 -328
  147. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_vendor/rouge_score/scoring.py +0 -63
  148. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_vendor/rouge_score/tokenize.py +0 -63
  149. azure_ai_evaluation-1.0.0/azure/ai/evaluation/_vendor/rouge_score/tokenizers.py +0 -53
  150. azure_ai_evaluation-1.0.0/azure/ai/evaluation/simulator/_data_sources/__init__.py +0 -3
  151. azure_ai_evaluation-1.0.0/azure/ai/evaluation/simulator/_data_sources/grounding.json +0 -1150
  152. azure_ai_evaluation-1.0.0/azure_ai_evaluation.egg-info/PKG-INFO +0 -595
  153. azure_ai_evaluation-1.0.0/azure_ai_evaluation.egg-info/requires.txt +0 -10
  154. azure_ai_evaluation-1.0.0/samples/README.md +0 -57
  155. azure_ai_evaluation-1.0.0/samples/data/evaluate_test_data.jsonl +0 -3
  156. azure_ai_evaluation-1.0.0/samples/evaluation_samples_common.py +0 -60
  157. azure_ai_evaluation-1.0.0/samples/evaluation_samples_evaluate.py +0 -395
  158. azure_ai_evaluation-1.0.0/samples/evaluation_samples_simulate.py +0 -249
  159. azure_ai_evaluation-1.0.0/tests/__pf_service_isolation.py +0 -28
  160. azure_ai_evaluation-1.0.0/tests/e2etests/test_builtin_evaluators.py +0 -997
  161. azure_ai_evaluation-1.0.0/tests/e2etests/test_evaluate.py +0 -926
  162. azure_ai_evaluation-1.0.0/tests/e2etests/test_sim_and_eval.py +0 -129
  163. azure_ai_evaluation-1.0.0/tests/unittests/test_built_in_evaluator.py +0 -128
  164. azure_ai_evaluation-1.0.0/tests/unittests/test_non_adv_simulator.py +0 -362
  165. azure_ai_evaluation-1.0.0/tests/unittests/test_utils.py +0 -258
  166. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/__init__.py +0 -0
  167. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/__init__.py +0 -0
  168. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_common/__init__.py +0 -0
  169. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluate/__init__.py +0 -0
  170. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluators/__init__.py +0 -0
  171. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluators/_bleu/__init__.py +0 -0
  172. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluators/_coherence/__init__.py +0 -0
  173. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluators/_eci/__init__.py +0 -0
  174. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluators/_f1_score/__init__.py +0 -0
  175. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluators/_fluency/__init__.py +0 -0
  176. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluators/_gleu/__init__.py +0 -0
  177. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluators/_groundedness/__init__.py +0 -0
  178. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluators/_meteor/__init__.py +0 -0
  179. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluators/_protected_material/__init__.py +0 -0
  180. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluators/_qa/__init__.py +0 -0
  181. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluators/_relevance/__init__.py +0 -0
  182. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluators/_rouge/__init__.py +0 -0
  183. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluators/_similarity/__init__.py +0 -0
  184. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_evaluators/_xpia/__init__.py +0 -0
  185. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/_user_agent.py +0 -0
  186. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/py.typed +0 -0
  187. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/simulator/_conversation/constants.py +0 -0
  188. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/simulator/_helpers/_language_suffix_mapping.py +0 -0
  189. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/simulator/_model_tools/__init__.py +0 -0
  190. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure/ai/evaluation/simulator/_prompty/__init__.py +0 -0
  191. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure_ai_evaluation.egg-info/dependency_links.txt +0 -0
  192. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure_ai_evaluation.egg-info/not-zip-safe +0 -0
  193. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/azure_ai_evaluation.egg-info/top_level.txt +0 -0
  194. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/setup.cfg +0 -0
  195. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/tests/__init__.py +0 -0
  196. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/tests/__openai_patcher.py +0 -0
  197. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/tests/e2etests/__init__.py +0 -0
  198. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/tests/e2etests/test_metrics_upload.py +1 -1
  199. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/tests/unittests/test_content_safety_defect_rate.py +1 -1
  200. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/tests/unittests/test_evaluators/apology_dag/apology.py +0 -0
  201. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/tests/unittests/test_evaluators/test_inputs_evaluators.py +0 -0
  202. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/tests/unittests/test_save_eval.py +0 -0
  203. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/tests/unittests/test_synthetic_callback_conv_bot.py +0 -0
  204. {azure_ai_evaluation-1.0.0 → azure_ai_evaluation-1.0.0b2}/tests/unittests/test_synthetic_conversation_bot.py +1 -1
@@ -0,0 +1,23 @@
+ # Release History
+
+ ## 1.0.0b2 (2024-09-24)
+
+ ### Breaking Changes
+
+ - `data` and `evaluators` are now required keywords in `evaluate`.
+
+ ## 1.0.0b1 (2024-09-20)
+
+ ### Breaking Changes
+
+ - The `synthetic` namespace has been renamed to `simulator`, and sub-namespaces under this module have been removed.
+ - The `evaluate` and `evaluators` namespaces have been removed, and everything previously exposed in those modules has been added to the root namespace `azure.ai.evaluation`.
+ - The parameter name `project_scope` in content safety evaluators has been renamed to `azure_ai_project` for consistency with the evaluate API and simulators.
+ - Model configuration classes are now of type `TypedDict` and are exposed in the `azure.ai.evaluation` module instead of coming from `promptflow.core`.
+ - The parameter names `question` and `answer` in built-in evaluators have been updated to the more generic `query` and `response`.
+
+ ### Features Added
+
+ - First preview.
+ - This package is a port of `promptflow-evals`. New features will be added only to this package moving forward.
+ - Added a `TypedDict` for `AzureAIProject` that allows for better IntelliSense and type checking when passing in project information.
@@ -4,5 +4,3 @@ include azure/__init__.py
  include azure/ai/__init__.py
  include azure/ai/evaluation/py.typed
  recursive-include azure/ai/evaluation *.prompty
- include azure/ai/evaluation/simulator/_data_sources/grounding.json
- recursive-include samples *
@@ -0,0 +1,449 @@
+ Metadata-Version: 2.1
+ Name: azure-ai-evaluation
+ Version: 1.0.0b2
+ Summary: Microsoft Azure Evaluation Library for Python
+ Home-page: https://github.com/Azure/azure-sdk-for-python
+ Author: Microsoft Corporation
+ Author-email: azuresdkengsysadmins@microsoft.com
+ License: MIT License
+ Project-URL: Bug Reports, https://github.com/Azure/azure-sdk-for-python/issues
+ Project-URL: Source, https://github.com/Azure/azure-sdk-for-python
+ Keywords: azure,azure sdk
+ Classifier: Development Status :: 4 - Beta
+ Classifier: Programming Language :: Python
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3 :: Only
+ Classifier: Programming Language :: Python :: 3.8
+ Classifier: Programming Language :: Python :: 3.9
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Classifier: License :: OSI Approved :: MIT License
+ Classifier: Operating System :: OS Independent
+ Requires-Python: >=3.8
+ Description-Content-Type: text/markdown
+ Requires-Dist: promptflow-devkit>=1.15.0
+ Requires-Dist: promptflow-core>=1.15.0
+ Requires-Dist: numpy>=1.23.2; python_version < "3.12"
+ Requires-Dist: numpy>=1.26.4; python_version >= "3.12"
+ Requires-Dist: pyjwt>=2.8.0
+ Requires-Dist: azure-identity>=1.12.0
+ Requires-Dist: azure-core>=1.30.2
+ Requires-Dist: nltk>=3.9.1
+ Requires-Dist: rouge-score>=0.1.2
+ Provides-Extra: pf-azure
+ Requires-Dist: promptflow-azure<2.0.0,>=1.15.0; extra == "pf-azure"
+
+ # Azure AI Evaluation client library for Python
+
+ We are excited to introduce the public preview of the Azure AI Evaluation SDK.
+
+ [Source code][source_code]
+ | [Package (PyPI)][evaluation_pypi]
+ | [API reference documentation][evaluation_ref_docs]
+ | [Product documentation][product_documentation]
+ | [Samples][evaluation_samples]
+
+ This package has been tested with Python 3.8, 3.9, 3.10, 3.11, and 3.12.
+
+ For a more complete set of Azure libraries, see https://aka.ms/azsdk/python/all
+
+ ## Getting started
+
+ ### Prerequisites
+
+ - Python 3.8 or later is required to use this package.
+
+ ### Install the package
+
+ Install the Azure AI Evaluation library for Python with [pip][pip_link]:
+
+ ```bash
+ pip install azure-ai-evaluation
+ ```
+
+ ## Key concepts
+
+ Evaluators are custom or prebuilt classes or functions that measure the quality of the outputs from language models.
+
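As an editorial illustration of the "custom classes or functions" part of that definition, the sketch below shows a callable class that returns a dictionary of scores, the same shape the built-in evaluators produce in the examples that follow. The class name and metric are hypothetical, invented for this sketch.

```python
class KeywordEvaluator:
    """Toy custom evaluator: checks whether a response mentions a required keyword."""

    def __init__(self, keyword: str):
        self.keyword = keyword

    def __call__(self, *, response: str, **kwargs):
        # Evaluators return a dict mapping metric names to values.
        return {"keyword_present": float(self.keyword.lower() in response.lower())}


keyword_eval = KeywordEvaluator(keyword="waterproof")
print(keyword_eval(response="The Alpine Explorer Tent is the most waterproof."))
# {'keyword_present': 1.0}
```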
+ ## Examples
+
+ ### Evaluators
+
+ Users can create evaluator runs on the local machine as shown in the example below:
+
+ ```python
+ import os
+ from pprint import pprint
+
+ from azure.ai.evaluation import evaluate, RelevanceEvaluator, ViolenceEvaluator
+
+
+ def response_length(response, **kwargs):
+     return {"value": len(response)}
+
+
+ if __name__ == "__main__":
+     # Built-in evaluators
+     # Initialize Azure OpenAI Model Configuration
+     model_config = {
+         "azure_endpoint": os.environ.get("AZURE_OPENAI_ENDPOINT"),
+         "api_key": os.environ.get("AZURE_OPENAI_KEY"),
+         "azure_deployment": os.environ.get("AZURE_OPENAI_DEPLOYMENT"),
+     }
+
+     # Initializing Relevance Evaluator
+     relevance_eval = RelevanceEvaluator(model_config)
+
+     # Running Relevance Evaluator on a single input row
+     relevance_score = relevance_eval(
+         response="The Alpine Explorer Tent is the most waterproof.",
+         context="From our product list,"
+         " the alpine explorer tent is the most waterproof."
+         " The Adventure Dining Table has higher weight.",
+         query="Which tent is the most waterproof?",
+     )
+
+     pprint(relevance_score)
+     # {'gpt_relevance': 5.0}
+
+     # Content Safety Evaluator
+
+     # Initialize Project Scope
+     azure_ai_project = {
+         "subscription_id": <subscription_id>,
+         "resource_group_name": <resource_group_name>,
+         "project_name": <project_name>
+     }
+
+     violence_eval = ViolenceEvaluator(azure_ai_project)
+     violence_score = violence_eval(query="What is the capital of France?", response="Paris.")
+     pprint(violence_score)
+     # {'violence': 'Very low',
+     #  'violence_reason': "The system's response is a straightforward factual response "
+     #                     'to a geography question. There is no violent content or '
+     #                     'language present.',
+     #  'violence_score': 0}
+
+     # Code-based evaluator
+     response_length("The Alpine Explorer Tent is the most waterproof.")
+     # {'value': 48}
+
+     # Using multiple evaluators together via the `evaluate` API
+
+     result = evaluate(
+         data="evaluate_test_data.jsonl",
+         evaluators={
+             "response_length": response_length,
+             "violence": violence_eval,
+         },
+     )
+
+     pprint(result)
+ ```
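A hedged aside on the data file used above: `evaluate` reads JSONL where each line supplies the inputs the chosen evaluators expect (`response` for `response_length`, `query` and `response` for the violence evaluator). The sketch below, with made-up rows, shows one way such a file could be produced; it is illustrative rather than a required format.

```python
import json

rows = [
    {"query": "Which tent is the most waterproof?",
     "response": "The Alpine Explorer Tent is the most waterproof."},
    {"query": "What is the capital of France?",
     "response": "Paris."},
]

# Write one JSON object per line, the shape `evaluate(data=...)` consumes.
with open("evaluate_test_data.jsonl", "w") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")
```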
+ ### Simulator
+
+
+ Simulators allow users to generate synthetic data using their application. The simulator expects the user to provide a callback method that invokes
+ their AI application.
+
+ #### Simulating with a Prompty
+
+ ```yaml
+ ---
+ name: ApplicationPrompty
+ description: Simulates an application
+ model:
+   api: chat
+   configuration:
+     type: azure_openai
+     azure_deployment: ${env:AZURE_DEPLOYMENT}
+     api_key: ${env:AZURE_OPENAI_API_KEY}
+     azure_endpoint: ${env:AZURE_OPENAI_ENDPOINT}
+   parameters:
+     temperature: 0.0
+     top_p: 1.0
+     presence_penalty: 0
+     frequency_penalty: 0
+     response_format:
+       type: text
+
+ inputs:
+   conversation_history:
+     type: dict
+
+ ---
+ system:
+ You are a helpful assistant and you're helping with the user's query. Keep the conversation engaging and interesting.
+
+ Output with a string that continues the conversation, responding to the latest message from the user, given the conversation history:
+ {{ conversation_history }}
+
+ ```
+ Application code:
+
+ ```python
+ import json
+ import asyncio
+ from typing import Any, Dict, List, Optional
+ from azure.ai.evaluation.simulator import Simulator
+ from promptflow.client import load_flow
+ from azure.identity import DefaultAzureCredential
+ import os
+
+ azure_ai_project = {
+     "subscription_id": os.environ.get("AZURE_SUBSCRIPTION_ID"),
+     "resource_group_name": os.environ.get("RESOURCE_GROUP"),
+     "project_name": os.environ.get("PROJECT_NAME")
+ }
+
+ import wikipedia
+ wiki_search_term = "Leonardo da vinci"
+ wiki_title = wikipedia.search(wiki_search_term)[0]
+ wiki_page = wikipedia.page(wiki_title)
+ text = wiki_page.summary[:1000]
+
+ def method_to_invoke_application_prompty(query: str, messages_list: List[Dict], context: Optional[Dict[str, Any]] = None):
+     try:
+         current_dir = os.path.dirname(__file__)
+         prompty_path = os.path.join(current_dir, "application.prompty")
+         _flow = load_flow(source=prompty_path, model={
+             "configuration": azure_ai_project
+         })
+         response = _flow(
+             query=query,
+             context=context,
+             conversation_history=messages_list
+         )
+         return response
+     except Exception:
+         print("Something went wrong invoking the prompty")
+         return "something went wrong"
+
+ async def callback(
+     messages: List[Dict],
+     stream: bool = False,
+     session_state: Any = None,  # noqa: ANN401
+     context: Optional[Dict[str, Any]] = None,
+ ) -> dict:
+     messages_list = messages["messages"]
+     # get the last message
+     latest_message = messages_list[-1]
+     query = latest_message["content"]
+     context = None
+     # call your endpoint or AI application here
+     response = method_to_invoke_application_prompty(query, messages_list, context)
+     # we are formatting the response to follow the OpenAI chat protocol format
+     formatted_response = {
+         "content": response,
+         "role": "assistant",
+         "context": {
+             "citations": None,
+         },
+     }
+     messages["messages"].append(formatted_response)
+     return {"messages": messages["messages"], "stream": stream, "session_state": session_state, "context": context}
+
+
+
+ async def main():
+     simulator = Simulator(azure_ai_project=azure_ai_project, credential=DefaultAzureCredential())
+     outputs = await simulator(
+         target=callback,
+         text=text,
+         num_queries=2,
+         max_conversation_turns=4,
+         user_persona=[
+             f"I am a student and I want to learn more about {wiki_search_term}",
+             f"I am a teacher and I want to teach my students about {wiki_search_term}"
+         ],
+     )
+     print(json.dumps(outputs))
+
+ if __name__ == "__main__":
+     os.environ["AZURE_SUBSCRIPTION_ID"] = ""
+     os.environ["RESOURCE_GROUP"] = ""
+     os.environ["PROJECT_NAME"] = ""
+     os.environ["AZURE_OPENAI_API_KEY"] = ""
+     os.environ["AZURE_OPENAI_ENDPOINT"] = ""
+     os.environ["AZURE_DEPLOYMENT"] = ""
+     asyncio.run(main())
+     print("done!")
+ ```
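A hedged follow-up, not part of the original README: since the sample simply prints `json.dumps(outputs)`, the same serialized output can be written to disk so the simulated conversations can be inspected later or reused as an evaluation dataset. The file name here is illustrative.

```python
# Assumes `outputs` is the value returned by `simulator(...)` in the snippet above.
with open("simulator_outputs.json", "w") as f:
    f.write(json.dumps(outputs))
```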
+
+ #### Adversarial Simulator
+
+ ```python
+ from azure.ai.evaluation.simulator import AdversarialSimulator, AdversarialScenario, DirectAttackSimulator
+ from azure.identity import DefaultAzureCredential
+ from typing import Any, Dict, List, Optional
+ import asyncio
+
+
+ azure_ai_project = {
+     "subscription_id": <subscription_id>,
+     "resource_group_name": <resource_group_name>,
+     "project_name": <project_name>
+ }
+
+ async def callback(
+     messages: List[Dict],
+     stream: bool = False,
+     session_state: Any = None,
+     context: Optional[Dict[str, Any]] = None
+ ) -> dict:
+     messages_list = messages["messages"]
+     # get the last message
+     latest_message = messages_list[-1]
+     query = latest_message["content"]
+     context = None
+     if 'file_content' in messages["template_parameters"]:
+         query += messages["template_parameters"]['file_content']
+     # the next few lines explain how to use AsyncAzureOpenAI's chat.completions
+     # to respond to the simulator. You should replace it with a call to your model/endpoint/application.
+     # make sure you pass the `query` and format the response as shown below
+     from openai import AsyncAzureOpenAI
+     oai_client = AsyncAzureOpenAI(
+         api_key=<api_key>,
+         azure_endpoint=<endpoint>,
+         api_version="2023-12-01-preview",
+     )
+     try:
+         response_from_oai_chat_completions = await oai_client.chat.completions.create(messages=[{"content": query, "role": "user"}], model="gpt-4", max_tokens=300)
+     except Exception as e:
+         print(f"Error: {e}")
+         # to continue the conversation, return the messages; otherwise you can fail the adversarial simulation with an exception
+         message = {
+             "content": "Something went wrong. Check the exception e for more details.",
+             "role": "assistant",
+             "context": None,
+         }
+         messages["messages"].append(message)
+         return {
+             "messages": messages["messages"],
+             "stream": stream,
+             "session_state": session_state
+         }
+     response_result = response_from_oai_chat_completions.choices[0].message.content
+     formatted_response = {
+         "content": response_result,
+         "role": "assistant",
+         "context": {},
+     }
+     messages["messages"].append(formatted_response)
+     return {
+         "messages": messages["messages"],
+         "stream": stream,
+         "session_state": session_state,
+         "context": context
+     }
+
+ ```
+
+ #### Adversarial QA
+
+ ```python
+ scenario = AdversarialScenario.ADVERSARIAL_QA
+ simulator = AdversarialSimulator(azure_ai_project=azure_ai_project, credential=DefaultAzureCredential())
+
+ outputs = asyncio.run(
+     simulator(
+         scenario=scenario,
+         max_conversation_turns=1,
+         max_simulation_results=3,
+         target=callback
+     )
+ )
+
+ print(outputs.to_eval_qa_json_lines())
+ ```
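Since `to_eval_qa_json_lines()` returns the simulated question/answer pairs as a string of JSON lines, a natural hedged follow-up (not in the original README) is to persist them so they can later be passed to `evaluate(data=...)`. The file name below is illustrative.

```python
# Assumes `outputs` is the result returned by the Adversarial QA snippet above.
with open("adversarial_qa_outputs.jsonl", "w") as f:
    f.write(outputs.to_eval_qa_json_lines())
```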
+ #### Direct Attack Simulator
+
+ ```python
+ scenario = AdversarialScenario.ADVERSARIAL_QA
+ simulator = DirectAttackSimulator(azure_ai_project=azure_ai_project, credential=DefaultAzureCredential())
+
+ outputs = asyncio.run(
+     simulator(
+         scenario=scenario,
+         max_conversation_turns=1,
+         max_simulation_results=2,
+         target=callback
+     )
+ )
+
+ print(outputs)
+ ```
+ ## Troubleshooting
+
+ ### General
+
+ Azure AI Evaluation clients raise exceptions defined in [Azure Core][azure_core_readme].
+
+ ### Logging
+
+ This library uses the standard
+ [logging][python_logging] library for logging.
+ Basic information about HTTP sessions (URLs, headers, etc.) is logged at INFO
+ level.
+
+ Detailed DEBUG level logging, including request/response bodies and unredacted
+ headers, can be enabled on a client with the `logging_enable` argument.
+
+ See full SDK logging documentation with examples [here][sdk_logging_docs].
+
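As a general illustration (standard Azure SDK logging setup rather than anything specific to this package), log output can be routed through the stdlib `logging` module like this; the handler and level choices are only examples.

```python
import logging
import sys

# Send Azure SDK log output, including HTTP session information, to stdout.
logger = logging.getLogger("azure")
logger.setLevel(logging.DEBUG)
logger.addHandler(logging.StreamHandler(stream=sys.stdout))
```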
+ ## Next steps
+
+ - View our [samples][evaluation_samples].
+ - View our [documentation][product_documentation].
+
+ ## Contributing
+
+ This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit [cla.microsoft.com][cla].
+
+ When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
+
+ This project has adopted the [Microsoft Open Source Code of Conduct][code_of_conduct]. For more information see the [Code of Conduct FAQ][coc_faq] or contact [opencode@microsoft.com][coc_contact] with any additional questions or comments.
+
+ <!-- LINKS -->
+
+ [source_code]: https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/evaluation/azure-ai-evaluation
+ [evaluation_pypi]: https://pypi.org/project/azure-ai-evaluation/
+ [evaluation_ref_docs]: https://learn.microsoft.com/python/api/azure-ai-evaluation/azure.ai.evaluation?view=azure-python-preview
+ [evaluation_samples]: https://github.com/Azure-Samples/azureai-samples/tree/main/scenarios
+ [product_documentation]: https://learn.microsoft.com/azure/ai-studio/how-to/develop/evaluate-sdk
+ [python_logging]: https://docs.python.org/3/library/logging.html
+ [sdk_logging_docs]: https://docs.microsoft.com/azure/developer/python/azure-sdk-logging
+ [azure_core_readme]: https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/core/azure-core/README.md
+ [pip_link]: https://pypi.org/project/pip/
+ [azure_core_ref_docs]: https://aka.ms/azsdk-python-core-policies
+ [azure_core]: https://github.com/Azure/azure-sdk-for-python/blob/main/sdk/core/azure-core/README.md
+ [azure_identity]: https://github.com/Azure/azure-sdk-for-python/tree/main/sdk/identity/azure-identity
+ [cla]: https://cla.microsoft.com
+ [code_of_conduct]: https://opensource.microsoft.com/codeofconduct/
+ [coc_faq]: https://opensource.microsoft.com/codeofconduct/faq/
+ [coc_contact]: mailto:opencode@microsoft.com
+
+
+ # Release History
+
+ ## 1.0.0b2 (2024-09-24)
+
+ ### Breaking Changes
+
+ - `data` and `evaluators` are now required keywords in `evaluate`.
+
+ ## 1.0.0b1 (2024-09-20)
+
+ ### Breaking Changes
+
+ - The `synthetic` namespace has been renamed to `simulator`, and sub-namespaces under this module have been removed.
+ - The `evaluate` and `evaluators` namespaces have been removed, and everything previously exposed in those modules has been added to the root namespace `azure.ai.evaluation`.
+ - The parameter name `project_scope` in content safety evaluators has been renamed to `azure_ai_project` for consistency with the evaluate API and simulators.
+ - Model configuration classes are now of type `TypedDict` and are exposed in the `azure.ai.evaluation` module instead of coming from `promptflow.core`.
+ - The parameter names `question` and `answer` in built-in evaluators have been updated to the more generic `query` and `response`.
+
+ ### Features Added
+
+ - First preview.
+ - This package is a port of `promptflow-evals`. New features will be added only to this package moving forward.
+ - Added a `TypedDict` for `AzureAIProject` that allows for better IntelliSense and type checking when passing in project information.