PyPI - zyworkflow - Versions diffs - 0.0.1__py3-none-any.whl - Mend

zyworkflow 0.0.1__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (19)

zyworkflow/__init__.py +0 -0
zyworkflow/api_server.py +630 -0
zyworkflow/data/__init__.py +0 -0
zyworkflow/data/collection.py +1241 -0
zyworkflow/data/process.py +72 -0
zyworkflow/doc/api.md +461 -0
zyworkflow/example/__init__.py +0 -0
zyworkflow/example/train_client.py +301 -0
zyworkflow/example/train_client_example.py +43 -0
zyworkflow/policy/__init__.py +0 -0
zyworkflow/policy/train_pick_policy.py +834 -0
zyworkflow/utils/__init__.py +0 -0
zyworkflow/utils/logger_config.py +50 -0
zyworkflow/utils/pose.py +131 -0
zyworkflow/utils/utils.py +264 -0
zyworkflow-0.0.1.dist-info/METADATA +11 -0
zyworkflow-0.0.1.dist-info/RECORD +19 -0
zyworkflow-0.0.1.dist-info/WHEEL +5 -0
zyworkflow-0.0.1.dist-info/top_level.txt +1 -0

zyworkflow/data/process.py ADDED

@@ -0,0 +1,72 @@
+import os
+import traceback
+import pandas as pd
+from zyworkflow.utils.logger_config import setup_data_collection_logger
+logger = setup_data_collection_logger()
+def process_dataset(dataset_root, record_count):
+    """Post-process trajectory traj_<record_count> under dataset_root."""
+    # Build the name before the try block so the except handler can always use it.
+    traj = f"traj_{record_count:03d}"
+    try:
+        logger.info(f"Starting post-processing of trajectory {traj}")
+        traj_path = os.path.join(dataset_root, traj)
+        old_img_dir = os.path.join(traj_path, traj)
+        new_img_dir = os.path.join(traj_path, "images")
+        old_txt = os.path.join(traj_path, f"{traj}.txt")
+        new_csv = os.path.join(traj_path, "actions.csv")
+        # Stage 1: rename the image directory and convert the txt log to CSV.
+        if os.path.isdir(old_img_dir) and not os.path.exists(new_img_dir):
+            os.rename(old_img_dir, new_img_dir)
+        if os.path.isfile(old_txt) and not os.path.exists(new_csv):
+            df = pd.read_csv(old_txt)
+            if 'Image_Filename' in df.columns:
+                # Keep only the rows that reference an image file.
+                filtered_df = df[df['Image_Filename'].notna() & (df['Image_Filename'].astype(str).str.strip() != '')]
+                if not filtered_df.empty:
+                    filtered_df.to_csv(new_csv, index=False)
+                    logger.info(f"CSV file created with {len(filtered_df)} rows")
+                else:
+                    logger.info("No rows match the filter; CSV file not created")
+            else:
+                logger.info("No Image_Filename column in the file; CSV file not created")
+        logger.info("Stage 1 complete: renamed images & txt → csv")
+        # Stage 2: normalize column names and add the success_flag column.
+        rename_map = {
+            "j1(rad)": "j1",
+            "j2(rad)": "j2",
+            "j3(rad)": "j3",
+            "j4(rad)": "j4",
+            "j5(rad)": "j5",
+            "j6(rad)": "j6",
+            "Gripper_Set": "Gripper_Set_Position(‰)"
+        }
+        csv_path = os.path.join(traj_path, "actions.csv")
+        if not os.path.isfile(csv_path):
+            logger.error(f"actions.csv does not exist for trajectory {traj}")
+            return
+        df = pd.read_csv(csv_path)
+        df.rename(columns=rename_map, inplace=True)
+        # Only the final row of the trajectory is marked as successful.
+        df["success_flag"] = 0
+        if len(df) > 0:
+            df.loc[df.index[-1], "success_flag"] = 1
+        df.to_csv(csv_path, index=False)
+        logger.info("Stage 2 complete: columns renamed & success_flag added")
+        logger.info(f"Post-processing of trajectory {traj} complete")
+    except Exception as e:
+        logger.error(f"Post-processing of trajectory {traj} failed: {e}\n{traceback.format_exc()}")
+        return
+if __name__ == "__main__":
+    dataset_root = "/home/user/8T/caizewu/code/lnn/dataset/trajectory_data_pick_update"
+    # record_count selects the trajectory; 1 maps to traj_001 (example value).
+    process_dataset(dataset_root, record_count=1)
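
Review note: the second stage of `process_dataset` (column renaming plus `success_flag` on the last row) can be illustrated without touching the file system. The following is a minimal sketch, assuming the same `rename_map` as the diff; it uses the standard-library `csv` module instead of pandas, and `relabel_actions` is a hypothetical helper name that does not exist in the package.

```python
import csv
import io

# Column renames copied from rename_map in process.py.
RENAME_MAP = {
    **{f"j{i}(rad)": f"j{i}" for i in range(1, 7)},
    "Gripper_Set": "Gripper_Set_Position(‰)",
}


def relabel_actions(csv_text):
    """Rename columns and flag only the last data row as successful."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = [RENAME_MAP.get(name, name) for name in reader.fieldnames or []]
    rows = [
        {RENAME_MAP.get(key, key): value for key, value in row.items()}
        for row in reader
    ]
    # Every row starts as 0; the final row of the trajectory becomes 1.
    for row in rows:
        row["success_flag"] = 0
    if rows:
        rows[-1]["success_flag"] = 1
    if "success_flag" not in fieldnames:
        fieldnames.append("success_flag")

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

Passing the text of a two-row log returns the same rows with `j1(rad)` renamed to `j1`, `Gripper_Set` renamed to `Gripper_Set_Position(‰)`, and a trailing `success_flag` column holding `0` then `1`.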

zyworkflow/doc/api.md ADDED

@@ -0,0 +1,461 @@
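+# ai-workflow API Reference
+> Based on `api_server.py` (a FastAPI implementation). All endpoints return `application/json` by default.
+---
+## Basic information
+- Service address: `http://<host>:8003`
+- Online Swagger UI: `http://<host>:8003/docs`
+All errors share one response format (FastAPI `HTTPException`):
+```json
+{ "detail": "description of the error cause" }
+```
+---
+## Data directory conventions
+By default, all of this project's data and training artifacts are stored under `/workspace`.
+### 1) Data collection directory
+- **Collection root directory**: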
+```text
+/workspace/dataset/<dataset_id>/<ability_id>/
+```
+- **Each collected trajectory** creates a `traj_XXX/` directory (XXX is a 3-digit sequence number that increments from 001):
+```text
+/workspace/dataset/<dataset_id>/<ability_id>/
+  traj_001/
+    actions.csv
+    images/
+      0.000000.png
+      0.050000.png
+      ...
+  traj_002/
+    actions.csv
+    images/
+      ...
+```
+- Key fields in `actions.csv`:
+```text
+Time(s), X(m),Y(m),Z(m), Rx(rad),Ry(rad),Rz(rad), j1..j6, Gripper_Set_Position(‰), Gripper_Real, SKU, Image_Filename, success_flag
+```
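+- `Image_Filename` refers to an image file belonging to the same trajectory; in the layout above, these files are stored in `traj_XXX/images/`. The file name is the `Time(s)` value formatted as a float with 6 decimal places, followed by `.png`.
+### 2) Training data directory
+A training request makes the server assemble the training data root directory: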
+```text
+root_dir = /workspace/dataset/<dataset_id>/<ability_id>/
+```
+It then passes this `root_dir` to `policy/train_pick_policy.py` as `--root_dir`.
+The training script scans `root_dir` for all `traj_*` directories (see `SingleViewRobotTrajectoryDataset.__init__`), so the training directory is expected to look like this:
+```text
+/workspace/dataset/<dataset_id>/<ability_id>/
+  traj_001/
+    actions.csv
+    images/
+      *.png
+  traj_002/
+  ...
+```
+### 3) Training artifact directories (logs and checkpoints)
+- **Default training log path** (can be overridden with `TrainRequest.log_path`):
+```text
+/workspace/logs/<task_id>/<ability_id>/<model_id>/training_log.txt
+```
+- **Default checkpoint directory** (can be overridden with `TrainRequest.ckpt_dir`):
+```text
+/workspace/checkpoints/<task_id>/<ability_id>/<model_id>/
+```
+- Checkpoint file name:
+```text
+epoch_<N>.pth
+```
+where `<N>` is the 1-based epoch number.
+### 4) Test/inference model path
+A test task makes the server assemble the full path of the model file:
+```text
+model_path = /workspace/checkpoints/<task_id>/<ability_id>/<model_id>/<model_name>
+```
+Typical `model_name` values are therefore:
+```text
+epoch_1.pth
+epoch_10.pth
+...
+```
+---
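
Review note: the directory conventions above are all simple string templates, so they can be captured in a few helpers. This is a sketch under stated assumptions: the function names and the `WORKSPACE` constant are hypothetical and are not taken from `api_server.py`; only the path shapes come from the document.

```python
from pathlib import PurePosixPath

# The doc states that data and training artifacts default to /workspace.
WORKSPACE = PurePosixPath("/workspace")


def traj_dir(dataset_id, ability_id, index):
    """Trajectory directory; the index is zero-padded to 3 digits."""
    return WORKSPACE / "dataset" / dataset_id / ability_id / f"traj_{index:03d}"


def image_filename(time_s):
    """Image name: the Time(s) value with 6 decimal places plus .png."""
    return f"{time_s:.6f}.png"


def default_log_path(task_id, ability_id, model_id):
    """Default training log location when log_path is not supplied."""
    return WORKSPACE / "logs" / task_id / ability_id / model_id / "training_log.txt"


def checkpoint_path(task_id, ability_id, model_id, epoch):
    """Checkpoint file for a 1-based epoch number."""
    if epoch < 1:
        raise ValueError("epoch is 1-based")
    return WORKSPACE / "checkpoints" / task_id / ability_id / model_id / f"epoch_{epoch}.pth"
```

For example, `checkpoint_path("t1", "a1", "m1", 10)` yields the same path the server would assemble for a test request with `model_name` set to `epoch_10.pth`.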
+## Contents
+- [Service information `GET /`](#service-information-get-)
+- [Start training `POST /train`](#start-training-post-train)
+- [Query training status `GET /train/status/{task_id}/{ability_id}/{model_id}`](#query-training-status-get-trainstatustask_idability_idmodel_id)
+- [Stop training `POST /train/stop/{task_id}/{ability_id}/{model_id}`](#stop-training-post-trainstoptask_idability_idmodel_id)
+- [Submit data collection `POST /data/collection`](#submit-data-collection-post-datacollection)
+- [Start testing `POST /test`](#start-testing-post-test)
+- [Query test status `GET /test/status/{task_id}/{ability_id}/{model_id}/{model_name}`](#query-test-status-get-teststatustask_idability_idmodel_idmodel_name)
+- [Stop test (emergency stop) `POST /test/stop/{task_id}/{ability_id}/{model_id}/{model_name}`](#stop-test-emergency-stop-post-teststoptask_idability_idmodel_idmodel_name)
+---
+## Service information `GET /`
+Returns basic information about the service.
+### Request parameters
+None
+### Response example
+```json
+{
+  "message": "BNN 训练和测试服务",  // "BNN training and testing service"
+  "version": "1.0"
+}
+```
+### CURL example
+```bash
+curl -X GET "http://127.0.0.1:8003/"
+```
+---
+## Start training `POST /train`
+Runs the training script (`policy/train_pick_policy.py`) in the background as a separate process.
+### Request body (`TrainRequest`)
+| Field | Type | Required | Default | Description |
+|------|------|------|--------|------|
+| task_id | string | ✔ | - | Task ID |
+| dataset_id | string | ✔ | - | Dataset ID (the same concept as on the collection side) |
+| model_id | string | ✔ | - | Model version ID |
+| ability_id | string | ✔ | - | Atomic action ID |
+| algo_type | string | ✔ | - | Algorithm type (currently only `bnn` is supported) |
+| action_type | string | ✔ | - | Action type (`pick`/`place`) |
+| batch_size | int | ✖ | 48 | Batch size |
+| seq_len | int | ✖ | 4 | Image sequence length |
+| action_chunk | int | ✖ | 8 | Number of predicted steps |
+| lr | float | ✖ | 1e-4 | Learning rate |
+| num_epochs | int | ✖ | 500 | Number of training epochs |
+| start_epoch | int | ✖ | 0 | Starting epoch (for resuming training) |
+| lambda_joints | float | ✖ | 10.0 | Weight of the joints loss |
+| lambda_grip | float | ✖ | 5.0 | Weight of the gripper loss |
+| lambda_success | float | ✖ | 2.0 | Weight of the success loss |
+| log_path | string | ✖ | null | Log file path (generated by the default rule if omitted) |
+| ckpt_dir | string | ✖ | null | Checkpoint directory (generated by the default rule if omitted) |
+| success_mode | string | ✖ | within_horizon | Success-rate evaluation mode: `within_horizon`/`terminal_only` |
+| report_url | string | ✖ | null | Callback URL for training progress reports |
+### Training progress callback (`report_url`)
+After each completed epoch, the training script POSTs JSON to `report_url` (see `policy/train_pick_policy.py`) in this format:
+```json
+{
+  "task_name": "<task_id>-<ability_id>-<model_id>",
+  "epoch": 12,  // training epoch
+  "duration_sec": 37.42,  // time taken by this epoch, in seconds
+  "avg_loss": 0.0123,  // average loss for this epoch
+  "j_err": 0.045,  // mean joint-angle error for this epoch
+  "msg": "Ep 12 Saved. Time: ...",  // full message for this epoch
+  "is_finished": false,  // whether training has finished
+  "model_path": "/workspace/checkpoints/<task_id>/<ability_id>/<model_id>/epoch_12.pth"  // checkpoint path
+}
+```
+### Success response (`TrainResponse`)
+```json
+{
+  "status": "started",
+  "message": "已下发训练任务",  // "Training task dispatched"
+  "task_name": "<task_id>-<ability_id>-<model_id>"
+}
+```
+### CURL example
+```bash
+curl -X POST "http://127.0.0.1:8003/train" \
+  -H "Content-Type: application/json" \
+  -d '{
+        "task_id": "t1",
+        "dataset_id": "d1",
+        "model_id": "m1",
+        "ability_id": "a1",
+        "algo_type": "bnn",
+        "action_type": "pick",
+        "num_epochs": 20,
+        "batch_size": 64,
+        "lr": 0.0005,
+        "report_url": "http://127.0.0.1:9000/report"
+      }'
+```
+---
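
Review note: the same `POST /train` call can be prepared from Python with only the standard library. This is a sketch, not the package's own client (`example/train_client.py` is not shown in this diff): `build_train_request` and `train_task_name` are hypothetical names, and the base URL assumes a server on the documented port 8003 of the local machine.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8003"  # assumption: local server on the documented port

# Fields marked as required in the TrainRequest table.
REQUIRED_FIELDS = (
    "task_id", "dataset_id", "model_id", "ability_id", "algo_type", "action_type",
)


def build_train_request(base_url=BASE_URL, **fields):
    """Build (but do not send) a POST /train request."""
    missing = [name for name in REQUIRED_FIELDS if name not in fields]
    if missing:
        raise ValueError(f"missing required fields: {missing}")
    return urllib.request.Request(
        f"{base_url}/train",
        data=json.dumps(fields).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def train_task_name(task_id, ability_id, model_id):
    """Task name in the format shown by the TrainResponse example."""
    return f"{task_id}-{ability_id}-{model_id}"
```

Sending is then a matter of passing the returned object to `urllib.request.urlopen`. Optional fields that are left out fall back to the server-side defaults listed in the table.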
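+## Query training status `GET /train/status/{task_id}/{ability_id}/{model_id}`
+Returns the live status of the specified training task.
+### Possible `status` values
+The training status is kept in server memory in `training_status[task_name]` and is updated automatically when the training process exits.
+- **`running`**: the training process has started and is running.
+- **`stopping`**: a stop request has been received and the server is terminating the training process group.
+- **`stopped`**: training was stopped by the user.
+- **`completed`**: the training process exited normally (return code = 0).
+- **`failed`**: the training process exited abnormally (return code != 0).
+Notes:
+- A **404** is returned when the training task does not exist.
+### Response example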
+```json
+{
+  "status": "running",
+  "message": "Process started (PID=12345)"
+}
+```
+### CURL example
+```bash
+curl -X GET "http://127.0.0.1:8003/train/status/t1/a1/m1"
+```
+---
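
Review note: a client that waits for training to end has to treat `stopped`, `completed` and `failed` as terminal and keep polling through `running` and `stopping`. The loop below is a sketch of that logic only. `fetch_status` is a hypothetical callable that is assumed to perform the `GET /train/status/...` request and return the parsed JSON body; the polling interval and the `max_polls` limit are illustrative choices, not values from the service.

```python
import time

# Statuses after which the documented state machine no longer changes.
TERMINAL_STATUSES = {"stopped", "completed", "failed"}


def wait_for_training(fetch_status, interval_s=5.0, max_polls=None, sleep=time.sleep):
    """Poll until the task reaches a terminal status; return the last response."""
    polls = 0
    while True:
        state = fetch_status()
        polls += 1
        if state.get("status") in TERMINAL_STATUSES:
            return state
        if max_polls is not None and polls >= max_polls:
            raise TimeoutError(f"still {state.get('status')!r} after {polls} polls")
        sleep(interval_s)
```

Because the status lives only in server memory, a caller may also want to handle a 404 inside `fetch_status`, for instance when the task name is unknown to the running server.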
+## Stop training `POST /train/stop/{task_id}/{ability_id}/{model_id}`
+Sends SIGTERM to the training process group.
+### Response example
+```json
+{
+  "status": "stopped",
+  "message": "训练已被用户停止。"  // "Training was stopped by the user."
+}
+```
+### CURL example
+```bash
+curl -X POST "http://127.0.0.1:8003/train/stop/t1/a1/m1"
+```
+---
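+## Submit data collection `POST /data/collection`
+Submits a collection task. The server runs it asynchronously and keeps the task status in memory.
+### Request body (`TaskRequest`)
+| Field | Type | Required | Default | Description |
+|------|------|------|--------|------|
+| sku | string | ✔ | - | Object number (SKU) |
+| ability_id | string | ✔ | - | Atomic action ID |
+| dataset_id | string | ✔ | - | Dataset ID |
+| algo_type | string | ✔ | - | Algorithm type (currently only `bnn` is supported) |
+| action_type | string | ✔ | - | Action type (matches the name registered for `create("<algo>-<action>")`) |
+| init_pose | list[float] | ✔ | - | Initial joint pose |
+| speed | int | ✖ | 40 | Motion speed |
+| sampling_rate | int | ✖ | 20 | Sampling rate in Hz |
+| callback_url | string | ✖ | null | Callback URL invoked when the task finishes |
+### Success response (`TaskResponse`)
+```json
+{
+  "task_name": "<dataset_id>-<ability_id>",
+  "status": "pending",
+  "message": "数据采集任务已提交",  // "Data collection task submitted"
+  "result": null
+}
+```
+### Collection task callback (`callback_url`)
+When a collection task ends (success or failure) and `callback_url` is non-empty, the server POSTs this JSON: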
+```json
+{
+  "code": 0,
+  "status": "completed",
+  "message": "Execution completed",
+  "sku": "SKU123",
+  "task_name": "d1-a1",
+  "dataset_id": "d1",
+  "ability_id": "a1",
+  "traj_path": "/workspace/dataset/d1/a1/traj_001"
+}
+```
+- `code == 0` indicates success.
+- `traj_path` is the directory of the trajectory recorded by this run.
+### CURL example
+```bash
+curl -X POST "http://127.0.0.1:8003/data/collection" \
+  -H "Content-Type: application/json" \
+  -d '{
+        "sku": "SKU123",
+        "ability_id": "a1",
+        "dataset_id": "d1",
+        "algo_type": "bnn",
+        "action_type": "pick",
+        "init_pose": [0,0,0,0,0,0],
+        "sampling_rate": 20,
+        "callback_url": "http://127.0.0.1:9000/collect_cb"
+      }'
+```
+---
+## Start testing `POST /test`
+The server pulls images from the camera (`get_image(rgb_image_url)` in `utils/utils.py`) and runs multi-step inference. Results are written to the service's in-memory state; if `callback_url` is provided, the server calls it when the test ends.
+### Request body (`TestRequest`)
+| Field | Type | Required | Default | Description |
+|------|------|------|--------|------|
+| task_id | string | ✔ | - | Task ID |
+| ability_id | string | ✔ | - | Atomic action ID |
+| model_id | string | ✔ | - | Model version ID |
+| model_name | string | ✔ | - | Model file name (for example `epoch_10.pth`) |
+| algo_type | string | ✔ | - | Algorithm type (currently only `bnn` is supported) |
+| action_type | string | ✔ | - | Action type (`pick`/`place`) |
+| seq_len | int | ✖ | 4 | Sequence length |
+| action_chunk | int | ✖ | 8 | Action chunk size |
+| step | int | ✖ | 200 | Number of inference steps |
+| callback_url | string | ✖ | null | Callback URL invoked when the test ends |
+### Success response
+```json
+{
+  "code": 0,
+  "status": "started",
+  "message": "测试任务已提交",  // "Test task submitted"
+  "task_name": "<task_id>-<ability_id>-<model_id>-<model_name>"
+}
+```
+### CURL example
+```bash
+curl -X POST "http://127.0.0.1:8003/test" \
+  -H "Content-Type: application/json" \
+  -d '{
+        "task_id": "t1",
+        "ability_id": "a1",
+        "model_id": "m1",
+        "model_name": "epoch_10.pth",
+        "algo_type": "bnn",
+        "action_type": "pick",
+        "step": 200
+      }'
+```
+---
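+## Query test status `GET /test/status/{task_id}/{ability_id}/{model_id}/{model_name}`
+Returns the test task status and progress data held in server memory.
+### Possible `status` values
+The test status is kept in server memory in `test_tasks[task_name]`.
+- **`starting`**: the test task has been submitted, but the background task has not yet entered its main loop.
+- **`running`**: the test task is in progress (`current_step/joints/gripper/success` are updated continuously).
+- **`completed`**: the test finished.
+- **`failed`**: the test failed (`message` usually contains the cause of the exception).
+- **`stopped`**: the test task was stopped or emergency-stopped (the background task detected the stop flag after `POST /test/stop/...`).
+Notes:
+- A **404** is returned when the test task does not exist.
+- `stop_requested` is a boolean that indicates whether a stop request has been received.
+- `current_step` starts at 0 and increments.
+### Response example
+```json
+{
+  "status": "running",
+  "message": "测试任务执行中...",  // "Test task in progress..."
+  "stop_requested": false,
+  "current_step": 12,
+  "joints": [[...], ...],
+  "gripper": [...],
+  "success": [...]
+}
+```
+### CURL example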
+```bash
+curl -X GET "http://127.0.0.1:8003/test/status/t1/a1/m1/epoch_10.pth"
+```
+---
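+## Stop test (emergency stop) `POST /test/stop/{task_id}/{ability_id}/{model_id}/{model_name}`
+First calls the robot arm emergency stop `post_arm_stop()`, then sets `stop_requested=true`; the background loop exits once it detects the flag.
+### Response example
+```json
+{ "code": 0, "message": "急停指令已发送" }  // "Emergency stop command sent"
+```
+### CURL example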
+```bash
+curl -X POST "http://127.0.0.1:8003/test/stop/t1/a1/m1/epoch_10.pth"
+```
+---

zyworkflow/example/__init__.py ADDED

File without changes