wedata3-ml-runtime 0.0.1__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- wedata3_ml_runtime-0.0.1/PKG-INFO +120 -0
- wedata3_ml_runtime-0.0.1/README.md +96 -0
- wedata3_ml_runtime-0.0.1/pyproject.toml +60 -0
- wedata3_ml_runtime-0.0.1/setup.cfg +4 -0
- wedata3_ml_runtime-0.0.1/src/wedata3_ml_runtime.egg-info/PKG-INFO +120 -0
- wedata3_ml_runtime-0.0.1/src/wedata3_ml_runtime.egg-info/SOURCES.txt +13 -0
- wedata3_ml_runtime-0.0.1/src/wedata3_ml_runtime.egg-info/dependency_links.txt +1 -0
- wedata3_ml_runtime-0.0.1/src/wedata3_ml_runtime.egg-info/requires.txt +6 -0
- wedata3_ml_runtime-0.0.1/src/wedata3_ml_runtime.egg-info/top_level.txt +1 -0
- wedata3_ml_runtime-0.0.1/src/wedata_ml_runtime/__init__.py +121 -0
- wedata3_ml_runtime-0.0.1/src/wedata_ml_runtime/client.py +538 -0
- wedata3_ml_runtime-0.0.1/src/wedata_ml_runtime/common/__init__.py +0 -0
- wedata3_ml_runtime-0.0.1/src/wedata_ml_runtime/common/base_client.py +14 -0
- wedata3_ml_runtime-0.0.1/src/wedata_ml_runtime/experiment.py +365 -0
- wedata3_ml_runtime-0.0.1/tests/test_refresh_sys_pkg.py +218 -0
|
@@ -0,0 +1,120 @@
Metadata-Version: 2.4
Name: wedata3-ml-runtime
Version: 0.0.1
Summary: WeData 3.0 machine learning Notebook Kernel runtime initialization library, providing deep MLflow bridging, a Feast gRPC proxy, signed Tencent Cloud OpenAPI calls, and platform-level multi-tenant isolation.
Author-email: WeData Team <wedata@tencent.com>
License: MIT
Project-URL: Homepage, https://wedata.tencent.com
Project-URL: Documentation, https://wedata.tencent.com/docs
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: feast==0.49.0
Requires-Dist: grpcio>=1.71.0
Requires-Dist: mlflow<3.11.0,>=2.0.0
Requires-Dist: pydantic>=2.10.6
Requires-Dist: wedata-mlflow-header-plugin>=0.1.2
Requires-Dist: mlflow-tclake-plugin>=3.0.15

# wedata3-ml-runtime

Runtime initialization library for WeData 3.0 machine learning Notebook Kernels.

## Positioning

This package is the **Kernel startup adaptation layer** that runs inside DLC compute containers in the WeData 3.0 Notebook scenario. It leaves user code unchanged; instead, it monkey-patches the open-source `mlflow` / `feast` libraries so that they carry WeData platform semantics (multi-tenant isolation, CAM authentication, auditing, gateway proxying).

> Predecessor: `wedata-pre-code` (supports both 2.0 and 3.0)
>
> This package carries only the WeData 3.0 logic. WeData 2.0 continues to use `wedata-pre-code`, located at `../wedata-pre-execute/` (now in maintenance mode).
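
To make the monkey-patching idea concrete, here is a minimal sketch of how a method on a third-party class can be replaced at runtime without touching user code. `TrackingClient`, `with_workspace_tag`, and `install_patches` are hypothetical stand-ins invented for this illustration; they are not the real `MlflowClient` or this package's actual code. Only the tag key `wedata.workspace` comes from the README.

```python
import functools


class TrackingClient:
    """Stand-in for a third-party client class (NOT the real MlflowClient)."""

    def create_experiment(self, name, tags=None):
        return {"name": name, "tags": dict(tags or {})}


def with_workspace_tag(workspace_id):
    """Build a decorator that adds a workspace tag to every call (hypothetical helper)."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(self, name, tags=None):
            merged = dict(tags or {})
            # Keep a tag the caller set explicitly; otherwise inject the platform value.
            merged.setdefault("wedata.workspace", workspace_id)
            return func(self, name, tags=merged)

        return wrapper

    return decorator


def install_patches(workspace_id):
    """Replace the class attribute once; calling this again is a no-op."""
    original = TrackingClient.create_experiment
    if getattr(original, "_patched", False):
        return
    patched = with_workspace_tag(workspace_id)(original)
    patched._patched = True
    TrackingClient.create_experiment = patched


install_patches("ws-demo")
# User code stays unchanged, yet the call now carries the injected tag.
result = TrackingClient().create_experiment("churn-model")
```

Because the patch is applied to the class rather than to an instance, every client created afterwards picks it up, which is why this kind of adapter has to run at Kernel startup, before user code executes.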

## Core capabilities

1. **Package-install marking**: works together with the packages preinstalled in the DLC image → refreshes the system package snapshot `/opt/spark/pip_list.json` at runtime, eliminating duplicate-install warnings
2. **Environment variable injection**: 20+ variables, including `WEDATA_WORKSPACE_ID` / `MLFLOW_TRACKING_URI` / `TENCENTCLOUD_SECRET_ID/KEY/TOKEN` / `KERNEL_WEDATA_*`
3. **MLflow client rewrite**:
   - `create/get/search/rename/delete experiment` are routed through the Tencent Cloud OpenAPI (hand-rolled TC3-HMAC-SHA256 signing)
   - every write operation on `MlflowClient` is wrapped in a decorator that validates workspace_id
4. **Feast gRPC proxy**: adds `X-Target-Service-IP/PORT` request headers to `RemoteRegistry` so that calls go through the reverse proxy
5. **Automatic tag injection**: all mlflow objects are automatically tagged with `wedata.workspace / wedata.datascience.type / mlflow.user`
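
The TC3-HMAC-SHA256 signing mentioned in item 3 can be sketched as follows. This follows the publicly documented Tencent Cloud API 3.0 signature steps (canonical request, string to sign, derived signing key, `Authorization` header). It is a sketch under assumptions, not this package's implementation: the function name, the `wedata` service name, the host, and the credentials are all placeholders.

```python
import hashlib
import hmac
import json
from datetime import datetime, timezone


def _hmac_sha256(key: bytes, msg: str) -> bytes:
    return hmac.new(key, msg.encode("utf-8"), hashlib.sha256).digest()


def tc3_authorization(secret_id, secret_key, service, host, payload, timestamp):
    """Build the Authorization header value for a JSON POST to "/" (illustrative helper)."""
    algorithm = "TC3-HMAC-SHA256"
    date = datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime("%Y-%m-%d")

    # Step 1: canonical request (method, URI, query string, headers, signed headers, body hash).
    canonical_headers = f"content-type:application/json; charset=utf-8\nhost:{host}\n"
    signed_headers = "content-type;host"
    hashed_payload = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    canonical_request = "\n".join(
        ["POST", "/", "", canonical_headers, signed_headers, hashed_payload]
    )

    # Step 2: string to sign.
    credential_scope = f"{date}/{service}/tc3_request"
    hashed_request = hashlib.sha256(canonical_request.encode("utf-8")).hexdigest()
    string_to_sign = "\n".join([algorithm, str(timestamp), credential_scope, hashed_request])

    # Step 3: derive the signing key (date -> service -> "tc3_request"), then sign.
    secret_date = _hmac_sha256(("TC3" + secret_key).encode("utf-8"), date)
    secret_service = _hmac_sha256(secret_date, service)
    secret_signing = _hmac_sha256(secret_service, "tc3_request")
    signature = hmac.new(
        secret_signing, string_to_sign.encode("utf-8"), hashlib.sha256
    ).hexdigest()

    # Step 4: assemble the header value.
    return (
        f"{algorithm} Credential={secret_id}/{credential_scope}, "
        f"SignedHeaders={signed_headers}, Signature={signature}"
    )


# Placeholder credentials, host, and payload, for illustration only.
authorization = tc3_authorization(
    secret_id="AKID-demo",
    secret_key="demo-secret",
    service="wedata",
    host="wedata.tencentcloudapi.com",
    payload=json.dumps({"WorkspaceId": "ws-demo"}),
    timestamp=1700000000,
)
```

With temporary credentials such as the CAM triple this package receives, the request would also carry the session token in the `X-TC-Token` header; that part is omitted here.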

## Usage

```python
%pip install wedata3-ml-runtime
from wedata_ml_runtime.client import Wedata3PreCodeClient

client = Wedata3PreCodeClient(
    workspace_id="...",
    base_url="...",
    region="...",
    ap_region_id=1,
    mlflow_gateway_url="...",
    feast_gateway_url="...",
    mlflow_proxy_ip="...",
    mlflow_proxy_port="...",
    feast_proxy_ip="...",
    feast_proxy_port="...",
    kernel_task_name="...",
    kernel_task_id="...",
    cloud_sdk_secret_id="...",
    cloud_sdk_secret_key="...",
    cloud_sdk_secret_token="...",
    qcloud_uin="...",
    qcloud_subuin="...",
)
client.init()
```

## Required parameters

| Parameter | Description |
|------|------|
| `workspace_id` | Workspace ID |
| `base_url` | Base URL of the WeData console |
| `mlflow_gateway_url` | MLflow Serverless gateway address |
| `feast_gateway_url` | Feast Serverless gateway address |
| `mlflow_proxy_ip` / `mlflow_proxy_port` | MLflow forwarding address |
| `feast_proxy_ip` / `feast_proxy_port` | Feast forwarding address |
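
If the constructor arguments are assembled from a dictionary (for example, from injected Kernel settings), a pre-flight check for the required parameters listed above might look like the sketch below. `REQUIRED_PARAMS` and `missing_required` are hypothetical helpers written for this illustration; whether and how the client itself validates its arguments is not shown in this README.

```python
# Names taken from the required-parameters table above.
REQUIRED_PARAMS = (
    "workspace_id",
    "base_url",
    "mlflow_gateway_url",
    "feast_gateway_url",
    "mlflow_proxy_ip",
    "mlflow_proxy_port",
    "feast_proxy_ip",
    "feast_proxy_port",
)


def missing_required(params: dict) -> list:
    """Return the required parameter names that are absent or empty in ``params``."""
    return [name for name in REQUIRED_PARAMS if not params.get(name)]


# Example settings with two values left out on purpose.
settings = {
    "workspace_id": "ws-demo",
    "base_url": "https://example.invalid",
    "mlflow_gateway_url": "https://mlflow.example.invalid",
    "feast_gateway_url": "https://feast.example.invalid",
    "mlflow_proxy_ip": "10.0.0.1",
    "mlflow_proxy_port": "5000",
}
missing = missing_required(settings)
```

Failing early with the list of missing names tends to give a clearer error than letting a `None` gateway address surface later inside a patched call.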

## Optional parameters

| Parameter | Description |
|------|------|
| `region` / `ap_region_id` | Region identifiers |
| `kernel_task_name` / `kernel_task_id` | Task identity |
| `kernel_submit_form_workflow` | Workflow flag |
| `cloud_sdk_secret_id` / `cloud_sdk_secret_key` / `cloud_sdk_secret_token` | CAM temporary credential triple (expires after 12 hours) |
| `cloud_sdk_env` | SDK environment (`dev` / `test` / `pre`) |
| `cloud_sdk_user_id` | Test account ID (used when `cloud_sdk_env=test`) |
| `qcloud_uin` / `qcloud_subuin` | Tencent Cloud root account / sub-account UIN |

## Build and release

```bash
bash build.sh
```

Script contents: `rm -rf dist/ build/` → `uv build` → `twine upload dist/*`

## Migrating from wedata-pre-code

| Aspect | Old package `wedata-pre-code` | New package `wedata3-ml-runtime` |
|------|----------------------|------------------------|
| PyPI name | `wedata-pre-code` | `wedata3-ml-runtime` |
| Python module | `wedata_pre_code.wedata3.client` | `wedata_ml_runtime.client` |
| Client class | `Wedata3PreCodeClient` | `Wedata3PreCodeClient` (kept compatible) |
| extras | `pip install "wedata-pre-code[wedata-3]"` | No extras needed; covered by the main dependencies |
| Platforms covered | 2.0 + 3.0 | **3.0 only** |

**How to migrate**: change `%pip install wedata-pre-code[wedata-3]==X.Y.Z` to `%pip install wedata3-ml-runtime==X.Y.Z`, and change `from wedata_pre_code.wedata3.client import Wedata3PreCodeClient` to `from wedata_ml_runtime.client import Wedata3PreCodeClient`.
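
During a transition period, a notebook that must run on images carrying either package could resolve the client class with an import fallback. The sketch below is one possible approach, not something the package provides; `load_first_available` is a hypothetical helper, and only the two module paths and the class name come from the table above.

```python
import importlib


def load_first_available(module_names, attribute):
    """Return ``attribute`` from the first importable module in ``module_names``."""
    for module_name in module_names:
        try:
            module = importlib.import_module(module_name)
        except ImportError:
            continue  # This candidate is not installed; try the next one.
        return getattr(module, attribute)
    raise ImportError(
        f"none of {list(module_names)} could be imported; "
        "install wedata3-ml-runtime (or the legacy wedata-pre-code[wedata-3])"
    )


# Prefer the new module path and fall back to the legacy one.
CLIENT_MODULES = ("wedata_ml_runtime.client", "wedata_pre_code.wedata3.client")

# Usage inside a notebook (raises ImportError if neither package is installed):
# Wedata3PreCodeClient = load_first_available(CLIENT_MODULES, "Wedata3PreCodeClient")
```

Since the class name is unchanged between the two packages, the rest of the notebook does not need to know which one was found.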

## Related design documents

- `application/science/doc/plan/wedata-pre-code-image-baseline-plan.md`: DLC image baseline + runtime override plan
@@ -0,0 +1,60 @@
[project]
name = "wedata3-ml-runtime"
version = "0.0.1"
description = "WeData 3.0 machine learning Notebook Kernel runtime initialization library, providing deep MLflow bridging, a Feast gRPC proxy, signed Tencent Cloud OpenAPI calls, and platform-level multi-tenant isolation."
authors = [
    {name = "WeData Team", email = "wedata@tencent.com"}
]
readme = "README.md"
license = {text = "MIT"}
classifiers = [
    "Development Status :: 4 - Beta",
    "Intended Audience :: Developers",
    "License :: OSI Approved :: MIT License",
    "Programming Language :: Python :: 3",
    "Programming Language :: Python :: 3.10",
    "Programming Language :: Python :: 3.11",
    "Topic :: Scientific/Engineering :: Artificial Intelligence",
]
requires-python = ">=3.10"
# Flattened dependencies: the [wedata-3] extras from the wedata-pre-code era are merged
# directly into the main dependencies, because wedata3-ml-runtime serves only WeData 3.0
# and does not need to switch dependency sets per platform.
dependencies = [
    "feast==0.49.0",
    "grpcio>=1.71.0",
    "mlflow>=2.0.0,<3.11.0",
    "pydantic>=2.10.6",
    "wedata-mlflow-header-plugin>=0.1.2",
    "mlflow-tclake-plugin>=3.0.15",
]

[project.urls]
Homepage = "https://wedata.tencent.com"
Documentation = "https://wedata.tencent.com/docs"

[build-system]
# Use setuptools as the build backend:
# the PyPI package name (wedata3-ml-runtime) and the Python package directory / import
# name (wedata_ml_runtime) are deliberately different. uv_build assumes by default that
# the two match, which makes the build fail; setuptools packages the project correctly
# through the explicit `[tool.setuptools.packages.find]` configuration below.
requires = ["setuptools>=68", "wheel"]
build-backend = "setuptools.build_meta"

[tool.setuptools]
# Explicitly declare the src layout: the package directory wedata_ml_runtime lives under src/
package-dir = {"" = "src"}

[tool.setuptools.packages.find]
where = ["src"]
include = ["wedata_ml_runtime*"]

[dependency-groups]
dev = [
    "pytest",
    "python-dotenv",
]

[tool.pytest.ini_options]
markers = [
    "integration: integration tests that require a real Tencent Cloud WeData 3.0 test environment",
]
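
The name mismatch described in the `[build-system]` comment can be illustrated with a small sketch. It assumes that a build backend deriving the module name from the project name normalizes the name and swaps dashes for underscores; `default_module_name` is a hypothetical helper written for this illustration, not a uv or setuptools API.

```python
import re


def default_module_name(project_name: str) -> str:
    """Derive the import name a name-based backend would look for (assumed rule)."""
    # Normalize runs of "-", "_" and "." to a single dash and lowercase the result,
    # then turn the dashes into underscores to get a valid identifier.
    normalized = re.sub(r"[-_.]+", "-", project_name).lower()
    return normalized.replace("-", "_")


expected = default_module_name("wedata3-ml-runtime")  # "wedata3_ml_runtime"
actual = "wedata_ml_runtime"  # the directory that really exists under src/
names_match = expected == actual  # False: hence the explicit packages.find config
```

Under that assumption the derived name keeps the `3` that the real package directory drops, which is consistent with the comment's explanation for choosing setuptools with an explicit `include` pattern.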
@@ -0,0 +1,13 @@
README.md
pyproject.toml
src/wedata3_ml_runtime.egg-info/PKG-INFO
src/wedata3_ml_runtime.egg-info/SOURCES.txt
src/wedata3_ml_runtime.egg-info/dependency_links.txt
src/wedata3_ml_runtime.egg-info/requires.txt
src/wedata3_ml_runtime.egg-info/top_level.txt
src/wedata_ml_runtime/__init__.py
src/wedata_ml_runtime/client.py
src/wedata_ml_runtime/experiment.py
src/wedata_ml_runtime/common/__init__.py
src/wedata_ml_runtime/common/base_client.py
tests/test_refresh_sys_pkg.py
@@ -0,0 +1 @@

@@ -0,0 +1 @@
wedata_ml_runtime
@@ -0,0 +1,121 @@
"""WeData ML Runtime.

WeData 3.0 Notebook Kernel runtime initialization library. Provides deep
bridging with MLflow / Feast and platform-level multi-tenant isolation for
Machine Learning experiments.

History: this package is the WeData 3.0 portion split out of
`wedata-pre-code` (which originally served both 2.0 and 3.0). Only 3.0 logic
is carried here.
"""

# ---------------------------------------------------------------------------
# MLflow 3.x compatibility shim
# ---------------------------------------------------------------------------
# Background: some third-party MLflow plugins (mlflow-tclake-plugin /
# wedata-mlflow-header-plugin, etc.) still import symbols from older mlflow
# versions at module load time. Under mlflow>=3 these symbols have been renamed
# or removed, so the Notebook pre-execution script fails to initialize with an
# ``ImportError``.
#
# This shim must take effect as soon as ``wedata_ml_runtime`` is imported
# (before the third-party plugin registration chain triggered by ``import mlflow``
# in client), backfilling aliases / injecting stubs so that later
# ``from mlflow.xxx import YYY`` statements resolve.
#
# Symbols covered:
#   1. ``mlflow.types.llm.ChatResponse``
#      Old mlflow 2.x name, renamed to ``ChatCompletionResponse`` in 3.x.
#   2. ``mlflow.utils.databricks_utils.is_in_databricks_serverless``
#      An early private/internal mlflow symbol, removed in 3.x. The WeData/DLC
#      runtime is not Databricks Serverless, so a stub that always returns
#      ``False`` is a sufficient fallback.
#   3. ``mlflow.utils.model_utils.RECORD_ENV_VAR_ALLOWLIST``
#      A ``set`` introduced in newer mlflow versions that marks the allowlist of
#      environment variables permitted to be recorded in model metadata. It is
#      missing in older mlflow versions (such as the old package still used under
#      Python 3.11 in production), so an eager import by a third-party plugin
#      raises ``ImportError`` directly. An empty ``set()`` is used as the fallback
#      here; semantically this means "record no additional environment
#      variables", which is safe and side-effect free for the WeData/DLC scenario.
#   4. ``mlflow.models.display_utils`` (entire module missing)
#      A submodule added only in newer mlflow versions, used to render the HTML
#      display components for Agent Eval / ModelInfo in Databricks Notebooks.
#      Older mlflow versions do not have this submodule at all, so a third-party
#      plugin's ``import mlflow.models.display_utils`` raises
#      ``ModuleNotFoundError``. A fake module is injected into ``sys.modules``
#      here, with a noop ``maybe_render_agent_eval_recipe``, so that both the
#      ``import`` and ``from ... import ...`` forms succeed; real rendering is
#      not needed outside Databricks anyway.
#
# Related: the 2026-05 fix for the Notebook pre-execution script failing because
# third-party plugins import old symbols / old modules, with
# ``ImportError: cannot import name 'ChatResponse' / 'is_in_databricks_serverless'
# / 'RECORD_ENV_VAR_ALLOWLIST'`` and
# ``ModuleNotFoundError: No module named 'mlflow.models.display_utils'``.
try:  # pragma: no cover - defensive fallback; a failure must not affect the main flow
    import mlflow.types.llm as _mlflow_llm  # type: ignore

    if not hasattr(_mlflow_llm, "ChatResponse"):
        _fallback = getattr(_mlflow_llm, "ChatCompletionResponse", None)
        if _fallback is not None:
            _mlflow_llm.ChatResponse = _fallback  # type: ignore[attr-defined]
except Exception:  # noqa: BLE001 - a shim error must not block runtime initialization
    pass

try:  # pragma: no cover
    import mlflow.utils.databricks_utils as _mlflow_dbx  # type: ignore

    if not hasattr(_mlflow_dbx, "is_in_databricks_serverless"):
        def _is_in_databricks_serverless_stub(*_args, **_kwargs) -> bool:  # noqa: D401
            """The WeData/DLC runtime is never Databricks Serverless; return False."""
            return False

        _mlflow_dbx.is_in_databricks_serverless = _is_in_databricks_serverless_stub  # type: ignore[attr-defined]
except Exception:  # noqa: BLE001
    pass

try:  # pragma: no cover
    import mlflow.utils.model_utils as _mlflow_model_utils  # type: ignore

    if not hasattr(_mlflow_model_utils, "RECORD_ENV_VAR_ALLOWLIST"):
        # Empty set: do not write extra environment variables into model metadata;
        # safe and side-effect free for WeData/DLC
        _mlflow_model_utils.RECORD_ENV_VAR_ALLOWLIST = set()  # type: ignore[attr-defined]
except Exception:  # noqa: BLE001
    pass

try:  # pragma: no cover
    # Entire module missing: older mlflow versions have no ``mlflow.models.display_utils``
    # submodule, so a fake module must be injected into sys.modules to keep third-party
    # plugins' ``import mlflow.models.display_utils`` /
    # ``from ... import maybe_render_agent_eval_recipe`` from raising
    # ``ModuleNotFoundError``.
    import importlib as _importlib
    import sys as _sys
    import types as _types

    _DISPLAY_UTILS_FQN = "mlflow.models.display_utils"
    _need_stub_display_utils = False
    try:
        _importlib.import_module(_DISPLAY_UTILS_FQN)
    except Exception:  # noqa: BLE001 - treat any import error as "module unavailable" and inject the stub
        _need_stub_display_utils = True

    if _need_stub_display_utils:
        _stub = _types.ModuleType(_DISPLAY_UTILS_FQN)

        def _maybe_render_agent_eval_recipe(*_args, **_kwargs):  # noqa: D401
            """Noop: no agent eval recipe rendering is needed outside a Databricks Notebook."""
            return None

        _stub.maybe_render_agent_eval_recipe = _maybe_render_agent_eval_recipe  # type: ignore[attr-defined]
        _stub.__all__ = ["maybe_render_agent_eval_recipe"]  # type: ignore[attr-defined]
        _sys.modules[_DISPLAY_UTILS_FQN] = _stub

        # Also attach the fake module to its parent so ``mlflow.models.display_utils`` resolves
        try:
            import mlflow.models as _mlflow_models  # type: ignore

            if not hasattr(_mlflow_models, "display_utils"):
                _mlflow_models.display_utils = _stub  # type: ignore[attr-defined]
        except Exception:  # noqa: BLE001
            pass
except Exception:  # noqa: BLE001
    pass

from wedata_ml_runtime.client import Wedata3PreCodeClient

__all__ = ["Wedata3PreCodeClient"]

__version__ = "0.1.8"
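
The stub-module technique used in the shim above can be tried in isolation. The sketch below generalizes it into a hypothetical helper, `ensure_module`, and exercises it against a made-up package name (`demo_pkg`) so that it runs without mlflow installed; neither name exists in this package.

```python
import importlib
import sys
import types


def ensure_module(fqn: str, **attrs) -> types.ModuleType:
    """Return the real module if it imports; otherwise register a stub under ``fqn``."""
    try:
        return importlib.import_module(fqn)
    except Exception:  # Any import failure is treated as "module unavailable".
        stub = types.ModuleType(fqn)
        for name, value in attrs.items():
            setattr(stub, name, value)
        sys.modules[fqn] = stub
        # Mirror the stub onto the parent module so attribute access works too.
        parent_name, _, child = fqn.rpartition(".")
        parent = sys.modules.get(parent_name)
        if parent is not None and not hasattr(parent, child):
            setattr(parent, child, stub)
        return stub


# ``demo_pkg`` is a made-up parent module that lacks the submodule we want.
sys.modules.setdefault("demo_pkg", types.ModuleType("demo_pkg"))
stub = ensure_module("demo_pkg.display_utils", render=lambda *args, **kwargs: None)

# Both import styles now succeed even though no such file exists on disk.
import demo_pkg.display_utils
from demo_pkg.display_utils import render
```

The import system consults `sys.modules` before searching for files, which is why registering the stub is enough for later `import` statements in third-party plugins to succeed, provided the shim runs first.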