@microsoft/m365-copilot-eval 1.5.0-preview.1 → 1.7.0-preview.1
- package/README.md +19 -1
- package/package.json +4 -3
- package/schema/CHANGELOG.md +7 -0
- package/schema/v1/eval-document.schema.json +144 -333
- package/schema/v1/examples/invalid/error-result-with-score.json +16 -0
- package/schema/v1/examples/invalid/missing-error-on-error.json +13 -0
- package/schema/v1/examples/valid/multi-turn-output.json +2 -0
- package/schema/v1/examples/valid/scenarios-with-mixed-errors.json +239 -0
- package/schema/version.json +1 -1
- package/src/clients/cli/api_clients/A2A/a2a_client.py +57 -10
- package/src/clients/cli/auth/auth_handler.py +21 -1
- package/src/clients/cli/common.py +8 -14
- package/src/clients/cli/error_messages.py +91 -0
- package/src/clients/cli/evaluation_runner.py +108 -97
- package/src/clients/cli/evaluator_resolver.py +8 -33
- package/src/clients/cli/generate_report.py +125 -96
- package/src/clients/cli/main.py +2 -1
- package/src/clients/cli/readme.md +1 -1
- package/src/clients/cli/result_writer.py +129 -110
- package/src/clients/cli/status_derivation.py +91 -0
- package/src/clients/node-js/bin/runevals.js +31 -9
- package/src/clients/node-js/config/default.js +1 -1
- package/src/clients/node-js/lib/env-loader.js +20 -13
- package/src/clients/node-js/lib/python-runtime.js +137 -65
- package/src/clients/node-js/lib/venv-manager.js +3 -2
- package/src/clients/node-js/lib/version-check.js +268 -0
package/README.md
CHANGED
@@ -24,6 +24,7 @@ A CLI for evaluating M365 Copilot agents. Send prompts to your agent, get respon
 - **M365 Copilot License** for your tenant
 - **M365 Copilot Agent** deployed to your tenant (can be created with [M365 Agents Toolkit](https://learn.microsoft.com/en-us/microsoft-365/developer/overview-m365-agents-toolkit) or any other method)
 - **Node.js 24.12.0+** (check: `node --version`)
+- **Python 3.13.x** is downloaded automatically. If the download fails (e.g., network restrictions), set `PYTHON_PATH` to a local Python 3.13.x installation (see [Troubleshooting](#-troubleshooting))
 - **Environment file** with your credentials and agent ID (see [Environment Setup](#-environment-setup) below)
 - **Your Tenant ID** - get your tenant id using the instructions [here](https://learn.microsoft.com/en-us/azure/azure-portal/get-subscription-tenant-id)
 - Admin approval to run WORKIQ Client App for your tenant [here](https://github.com/microsoft/work-iq/blob/main/ADMIN-INSTRUCTIONS.md)
@@ -61,6 +62,7 @@ ATK projects already check in `.env.local` with agent configuration. **Do not pu
 # .env.local (checked in — no secrets!)
 # Already present from ATK:
 M365_TITLE_ID="T_your-title-id-here" # Auto-generated by ATK
+TEAMS_APP_TENANT_ID="your-tenant-id" # Auto-generated by ATK
 ```
 
 ```bash
@@ -69,7 +71,6 @@ AZURE_AI_OPENAI_ENDPOINT="<your-azure-openai-endpoint>"
 AZURE_AI_API_KEY="<your-api-key-from-azure-portal>"
 AZURE_AI_API_VERSION="2024-12-01-preview" # default
 AZURE_AI_MODEL_NAME="gpt-4o-mini" # recommended
-TENANT_ID="<your-tenant-id>"
 ```
 
 Add `.env.local.user` to your `.gitignore`:
@@ -109,6 +110,9 @@ Now that you know what's needed, here's how to get the required values:
 
 Your Azure Active Directory (AAD) tenant ID.
 
+- If you have created your agent using Agents Toolkit, the tool automatically reads `TEAMS_APP_TENANT_ID` from `.env.local` and uses it as the tenant ID. No additional configuration is needed.
+- For non-ATK projects, set `TENANT_ID` in your env file.
+
 **How to obtain:**
 
 1. Go to [Azure Portal](https://portal.azure.com)
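The tenant ID lookup described in this hunk can be read as a simple precedence rule. The sketch below is illustrative only: the helper name `resolve_tenant_id` is hypothetical, and preferring `TEAMS_APP_TENANT_ID` over `TENANT_ID` when both are set is an assumption. The README states only that ATK projects use `TEAMS_APP_TENANT_ID` and that non-ATK projects set `TENANT_ID`.

```python
from typing import Mapping, Optional


def resolve_tenant_id(env: Mapping[str, str]) -> Optional[str]:
    """Return the tenant ID from an env mapping, or None if neither key is set.

    Hypothetical helper; the precedence (ATK value first) is an assumption.
    """
    for key in ("TEAMS_APP_TENANT_ID", "TENANT_ID"):
        # Treat missing and whitespace-only values the same way.
        value = env.get(key, "").strip()
        if value:
            return value
    return None
```

Passing a plain mapping rather than reading `os.environ` directly keeps the rule easy to check against the merged contents of `.env.local` and `.env.local.user`.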
@@ -460,6 +464,20 @@ runevals cache-dir
 chmod -R u+w $(runevals cache-dir)
 ```
 
+### Custom Python Runtime (PYTHON_PATH)
+
+If the automatic Python download fails (e.g., network restrictions, unsupported platform), provide your own Python installation:
+
+```bash
+# Windows
+set PYTHON_PATH=C:\Python313\python.exe
+
+# macOS/Linux
+export PYTHON_PATH=/usr/local/bin/python3.13
+```
+
+Python 3.13.x is the tested version. If a different version is found, you'll be prompted to confirm before proceeding. In CI/CD, a version mismatch fails automatically.
+
 ## 📚 Advanced Documentation
 
 - **[CI/CD Integration](./CICD_CACHE_GUIDE.md)** - GitHub Actions, Azure DevOps caching
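The version rule in the new `PYTHON_PATH` section (3.13.x accepted, other versions prompt interactively, a mismatch fails in CI/CD) might be sketched as follows. This is not the package's implementation, which appears to live in `version-check.js`. The names `parse_python_version` and `decide`, the `is_ci` flag, and treating unparseable output as a failure are all assumptions made for illustration.

```python
import re
from typing import Optional, Tuple

# "Python 3.13.x is the tested version" per the README text in this diff.
TESTED_MAJOR_MINOR = (3, 13)


def parse_python_version(output: str) -> Optional[Tuple[int, int, int]]:
    """Parse text such as 'Python 3.13.1' (the output of `python --version`)."""
    match = re.search(r"Python (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        return None
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def decide(version_output: str, is_ci: bool) -> str:
    """Return 'ok', 'prompt', or 'fail' for an interpreter and environment."""
    version = parse_python_version(version_output)
    if version is None:
        # Assumption: output that cannot be parsed is treated as a failure.
        return "fail"
    if version[:2] == TESTED_MAJOR_MINOR:
        return "ok"
    # Mismatch: fail without asking in CI/CD, otherwise ask the user.
    return "fail" if is_ci else "prompt"
```

For example, `decide("Python 3.12.4", is_ci=False)` yields `"prompt"`, while the same interpreter with `is_ci=True` yields `"fail"`.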
package/package.json
CHANGED
@@ -1,9 +1,9 @@
 {
 "name": "@microsoft/m365-copilot-eval",
-"version": "1.
+"version": "1.7.0-preview.1",
 "minCliVersion": "1.0.1-preview.1",
 "description": "Zero-config Node.js wrapper for M365 Copilot Agent Evaluations CLI (Python-based Azure AI Evaluation SDK)",
-"publishDate": "2026-
+"publishDate": "2026-05-14",
 "main": "src/clients/node-js/lib/index.js",
 "type": "module",
 "bin": {
@@ -80,8 +80,9 @@
 "README.md",
 "LICENSE"
 ],
+"homepage": "https://github.com/microsoft/m365-copilot-eval",
 "repository": {
 "type": "git",
-"url": "https://github.com/microsoft/
+"url": "https://github.com/microsoft/m365-copilot-eval.git"
 }
 }
package/schema/CHANGELOG.md
CHANGED
@@ -5,6 +5,13 @@ All notable changes to the eval document schema will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [1.3.0](https://github.com/microsoft/M365-Copilot-Agent-Evals/compare/schema-v1.2.0...schema-v1.3.0) (2026-04-30)
+
+
+### Features
+
+* Added similarity evaluator for compatibility with MCS Evals. ([#228](https://github.com/microsoft/M365-Copilot-Agent-Evals/issues/228)) ([0fe8315](https://github.com/microsoft/M365-Copilot-Agent-Evals/commit/0fe8315abc8e0422d1ac9117fe9f29195f29044f))
+
 ## [1.2.0](https://github.com/microsoft/M365-Copilot-Agent-Evals/compare/schema-v1.1.0...schema-v1.2.0) (2026-04-22)
 
 