PyPI - torch-rechub - Versions diffs - 0.0.4__tar.gz → 0.0.6__tar.gz - Mend

torch-rechub 0.0.4.tar.gz → 0.0.6.tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (263)

{torch_rechub-0.0.4 → torch_rechub-0.0.6}/.github/release.yml RENAMED

@@ -8,6 +8,7 @@ changelog:
       - ignore-for-release
       - duplicate
       - invalid
+      - wontfix
     authors:
       - dependabot
       - dependabot[bot]
@@ -16,30 +17,22 @@ changelog:
     - title: "✨ 新特性 / Features"
       labels:
         - enhancement
-        - feature
-        - feat
     - title: "🐛 Bug 修复 / Bug Fixes"
       labels:
         - bug
-        - fix
-        - bugfix
     - title: "⚡ 性能优化 / Performance"
       labels:
         - performance
-        - perf
     - title: "📝 文档更新 / Documentation"
       labels:
         - documentation
-        - docs
-    - title: "🔧 维护更新 / Maintenance"
+    - title: "🔧 模型更新 / Models"
       labels:
-        - maintenance
-        - chore
-        - refactor
+        - model
     - title: "📦 依赖更新 / Dependencies"
       labels:
@@ -47,5 +40,4 @@ changelog:
     - title: "🔄 其他变更 / Other Changes"
       labels:
-        - "*"
+        - "*"

{torch_rechub-0.0.4 → torch_rechub-0.0.6}/.github/workflows/ci.yml RENAMED

@@ -1,8 +1,9 @@
 # ===================================================================
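 # CI/CD pipeline configuration: code quality checks, tests, build, release
 # ===================================================================
-# This workflow is triggered when code files change and runs the full CI/CD pipeline
-# Changes to the docs directory and to markdown files are excluded
+# Triggers:
+# - push/pull_request: run the full CI checks (lint, test, security, build)
+# - release: run only the publish flow (skip checks that have already run)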
 name: CI/CD Pipeline
@@ -37,11 +38,13 @@ env:
 jobs:
   # ===================================================================
-  # Code quality checks
+  # Code quality checks (run only on push/PR; skipped on release)
   # ===================================================================
   lint:
     name: Code Quality Checks
     runs-on: ubuntu-latest
+    # Skip release events, because the code was already checked when it was merged
+    if: github.event_name != 'release'
     steps:
       - name: Checkout code
@@ -88,11 +91,13 @@ jobs:
   # ===================================================================
   # Full test suite (Python 3.9) - run all tests and the coverage report
+  # (run only on push/PR; skipped on release)
   # ===================================================================
   test:
     name: Full Test Suite (Python 3.9)
     runs-on: ${{ matrix.os }}
     needs: lint
+    if: github.event_name != 'release'
     strategy:
       fail-fast: false
@@ -152,11 +157,13 @@ jobs:
   # ===================================================================
   # Dependency compatibility check (Python 3.10+) - only verify that dependencies install successfully
+  # (run only on push/PR; skipped on release)
   # ===================================================================
   compatibility:
     name: Dependency Check (Python ${{ matrix.python-version }})
     runs-on: ubuntu-latest
     needs: lint
+    if: github.event_name != 'release'
     strategy:
       fail-fast: false
@@ -186,12 +193,13 @@ jobs:
           python -c "import onnx; import onnxruntime; print('ONNX dependencies OK')"
   # ===================================================================
-  # Security checks
+  # Security checks (run only on push/PR; skipped on release)
   # ===================================================================
   security:
     name: Security Scan
     runs-on: ubuntu-latest
     needs: lint
+    if: github.event_name != 'release'
     steps:
       - name: Checkout code
@@ -220,12 +228,13 @@ jobs:
           path: bandit-report.json
   # ===================================================================
-  # Build check
+  # Build check (run only on push/PR; skipped on release)
   # ===================================================================
   build:
     name: Build Package
     runs-on: ubuntu-latest
     needs: [test, compatibility, security]
+    if: github.event_name != 'release'
     steps:
       - name: Checkout code
@@ -256,18 +265,23 @@ jobs:
           path: dist/
   # ===================================================================
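-  # Automatic publishing to PyPI (using uv)
-  # Function: sync the version number from the GitHub Release, update CHANGELOG, publish to PyPI
+  # Automatic publishing to PyPI and GitHub Release (using uv)
+  # Functions:
+  # - Sync the version number from the GitHub Release
+  # - Update CHANGELOG.md
+  # - Build and publish to PyPI
+  # - Upload build artifacts to the GitHub Release page
+  # Note: this job runs only on release events and does not depend on other jobs (the code was already checked at merge time)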
   # ===================================================================
   publish:
-    name: Publish to PyPI
+    name: Publish to PyPI & GitHub Release
     runs-on: ubuntu-latest
-    needs: build
+    # No longer depends on the build job; runs directly (code quality was verified when the PR was merged)
     if: github.event_name == 'release' && github.event.action == 'published'
     environment: pypi
     permissions:
       id-token: write   # Required for trusted publishing
-      contents: write   # Required for pushing changes back to repo
+      contents: write   # Required for pushing changes and uploading release assets
     steps:
       - name: Checkout code
@@ -334,7 +348,7 @@ jobs:
           fi
       - name: Install uv
-        uses: astral-sh/setup-uv@v4
+        uses: astral-sh/setup-uv@v7
         with:
           version: "latest"
@@ -342,14 +356,28 @@ jobs:
         run: uv python install ${{ env.PYTHON_VERSION }}
       - name: Build package with uv
+        id: build
         run: |
           uv build
           echo "✅ Package built successfully"
           ls -la dist/
+          # Output the build artifact filenames for use by later steps
+          echo "WHEEL_FILE=$(ls dist/*.whl)" >> $GITHUB_OUTPUT
+          echo "SDIST_FILE=$(ls dist/*.tar.gz)" >> $GITHUB_OUTPUT
       - name: Publish to PyPI
         env:
           UV_PUBLISH_TOKEN: ${{ secrets.PYPI_API_TOKEN }}
         run: |
           uv publish
-          echo "🚀 Published to PyPI successfully!"
+          echo "🚀 Published to PyPI successfully!"
+      - name: Upload release assets to GitHub Release
+        uses: softprops/action-gh-release@v2
+        with:
+          files: |
+            dist/*.whl
+            dist/*.tar.gz
+          fail_on_unmatched_files: true
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

{torch_rechub-0.0.4 → torch_rechub-0.0.6}/.github/workflows/deploy.yml RENAMED

@@ -7,6 +7,7 @@ on:
     paths:
       - 'docs/**'
       - 'package.json'
+      - 'CHANGELOG.md'
       - '.github/workflows/deploy.yml'
 jobs:
@@ -27,6 +28,13 @@ jobs:
       - name: Install dependencies
         run: npm ci
+      - name: Sync CHANGELOG to docs
+        run: |
+          # Copy CHANGELOG.md into the Chinese and English docs directories
+          cp CHANGELOG.md docs/zh/community/changelog.md
+          cp CHANGELOG.md docs/en/community/changelog.md
+          echo "✅ CHANGELOG.md synced to docs directories"
       - name: Build VitePress site
         run: npm run docs:build
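
Reviewer note: the new "Sync CHANGELOG to docs" step is two `cp` calls. For anyone reproducing it locally without a shell, a minimal Python sketch of the same copy follows. The function name `sync_changelog` is hypothetical, and unlike plain `cp` this version creates missing target directories, which is an assumption added for convenience.

```python
import shutil
import tempfile
from pathlib import Path

# Target paths taken from the deploy.yml diff above.
DOC_TARGETS = (
    "docs/zh/community/changelog.md",
    "docs/en/community/changelog.md",
)


def sync_changelog(repo_root):
    """Copy CHANGELOG.md from the repo root into both docs trees."""
    root = Path(repo_root)
    source = root / "CHANGELOG.md"
    written = []
    for rel in DOC_TARGETS:
        target = root / rel
        # Assumption: create the directory if it is missing (plain `cp` would fail).
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(source, target)
        written.append(target)
    return written


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "CHANGELOG.md").write_text("## [0.0.6] - 2025-12-11\n", encoding="utf-8")
    copies = sync_changelog(tmp)
    print([p.relative_to(tmp).as_posix() for p in copies])
```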

{torch_rechub-0.0.4 → torch_rechub-0.0.6}/CHANGELOG.md RENAMED

@@ -7,6 +7,41 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ---
+## [0.0.6] - 2025-12-11
+<!-- Release notes generated using configuration in .github/release.yml at main -->
+## What's Changed
+### ✨ 新特性 / Features
+* FEATURE: Support Streaming Parquet Dataset by @ywuenthought in https://github.com/datawhalechina/torch-rechub/pull/143
+* Docs & tracking polish: logger docstrings, README refresh, dependency tweak by @1985312383 in https://github.com/datawhalechina/torch-rechub/pull/146
+### 📝 文档更新 / Documentation
+* Refactor Chinese documentation structure by @1985312383 in https://github.com/datawhalechina/torch-rechub/pull/145
+## New Contributors
+* @ywuenthought made their first contribution in https://github.com/datawhalechina/torch-rechub/pull/143
+**Full Changelog**: https://github.com/datawhalechina/torch-rechub/compare/v0.0.5...v0.0.6
+---
+## [0.0.5] - 2025-12-05
+<!-- Release notes generated using configuration in .github/release.yml at main -->
+## What's Changed
+### ✨ 新特性 / Features
+* Add torchview to Support Model Visualization && Update CI/CD and release workflows by @1985312383 in https://github.com/datawhalechina/torch-rechub/pull/141
+**Full Changelog**: https://github.com/datawhalechina/torch-rechub/compare/v0.0.4...v0.0.5
+---
 ## [0.0.4] - 2025-12-04
 <!-- Release notes generated using configuration in .github/release.yml at main -->

{torch_rechub-0.0.4 → torch_rechub-0.0.6}/CONTRIBUTING.md RENAMED

@@ -143,7 +143,7 @@ def test_deepfm_forward():
 - Include code examples
 - Provide clear step-by-step instructions
 - Keep both English and Chinese versions synchronized
-- Follow Google-style docstrings for Python code
+- Follow scikit-learn style docstrings (NumPy/SciPy convention) for Python code
 ### Docstring Example

{torch_rechub-0.0.4 → torch_rechub-0.0.6}/PKG-INFO RENAMED

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: torch-rechub
-Version: 0.0.4
+Version: 0.0.6
 Summary: A Pytorch Toolbox for Recommendation Models, Easy-to-use and Easy-to-extend.
 Project-URL: Homepage, https://github.com/datawhalechina/torch-rechub
 Project-URL: Documentation, https://www.torch-rechub.com
@@ -28,19 +28,29 @@ Requires-Dist: scikit-learn>=0.24.0
 Requires-Dist: torch>=1.10.0
 Requires-Dist: tqdm>=4.60.0
 Requires-Dist: transformers>=4.46.3
+Provides-Extra: bigdata
+Requires-Dist: pyarrow~=21.0; extra == 'bigdata'
 Provides-Extra: dev
 Requires-Dist: bandit>=1.7.0; extra == 'dev'
 Requires-Dist: flake8>=3.8.0; extra == 'dev'
 Requires-Dist: isort==5.13.2; extra == 'dev'
 Requires-Dist: mypy>=0.800; extra == 'dev'
 Requires-Dist: pre-commit>=2.20.0; extra == 'dev'
+Requires-Dist: pyarrow-stubs>=20.0; extra == 'dev'
 Requires-Dist: pytest-cov>=2.0; extra == 'dev'
 Requires-Dist: pytest>=6.0; extra == 'dev'
 Requires-Dist: toml>=0.10.2; extra == 'dev'
 Requires-Dist: yapf==0.43.0; extra == 'dev'
 Provides-Extra: onnx
-Requires-Dist: onnx>=1.12.0; extra == 'onnx'
-Requires-Dist: onnxruntime>=1.12.0; extra == 'onnx'
+Requires-Dist: onnx>=1.14.0; extra == 'onnx'
+Requires-Dist: onnxruntime>=1.14.0; extra == 'onnx'
+Provides-Extra: tracking
+Requires-Dist: swanlab>=0.1.0; extra == 'tracking'
+Requires-Dist: tensorboardx>=2.5; extra == 'tracking'
+Requires-Dist: wandb>=0.13.0; extra == 'tracking'
+Provides-Extra: visualization
+Requires-Dist: graphviz>=0.20; extra == 'visualization'
+Requires-Dist: torchview>=0.2.6; extra == 'visualization'
 Description-Content-Type: text/markdown
 # 🔥 Torch-RecHub - A lightweight, efficient, and easy-to-use PyTorch recommender system framework
@@ -69,13 +79,13 @@ Description-Content-Type: text/markdown
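 ## 🎯 Why Torch-RecHub?
-| Feature | Torch-RecHub | Other frameworks |
-|------|-------------|---------|
-| Lines of code | **10 lines** for training + evaluation + deployment | 100+ lines |
-| Model coverage | **30+** mainstream models | Limited |
-| Generative recommendation | ✅ HSTU/HLLM (Meta 2024) | ❌ |
-| One-click ONNX export | ✅ Built-in support | Manual adaptation required |
-| Learning curve | Very low | Steep |
+| Feature                   | Torch-RecHub                                        | Other frameworks           |
+| ------------------------- | --------------------------------------------------- | -------------------------- |
+| Lines of code             | **10 lines** for training + evaluation + deployment | 100+ lines                 |
+| Model coverage            | **30+** mainstream models                           | Limited                    |
+| Generative recommendation | ✅ HSTU/HLLM (Meta 2024)                            | ❌                         |
+| One-click ONNX export     | ✅ Built-in support                                 | Manual adaptation required |
+| Learning curve            | Very low                                            | Steep                      |
 ## ✨ Features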
@@ -86,7 +96,8 @@ Description-Content-Type: text/markdown
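 * **Easy configuration:** Adjust experiment settings via config files or command-line arguments.
 * **Reproducibility:** Designed to ensure reproducible experimental results.
 * **ONNX export:** Export trained models to ONNX format for deployment to production.
-* **Other features:** For example, negative sampling and multi-task learning.
+* **Cross-engine data processing:** PySpark-based data processing and conversion is now supported, which makes it easier to fit into big-data pipelines.
+* **Experiment visualization & tracking:** Built-in unified integration of three visualization/tracking tools: WandB, SwanLab, and TensorBoardX.
 ## 📖 Table of Contents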
@@ -205,52 +216,52 @@ torch-rechub/             # Root directory
 ### Ranking Models - 13
-| Model | Paper | Description |
-|------|------|------|
-| **DeepFM** | [IJCAI 2017](https://arxiv.org/abs/1703.04247) | Joint FM + Deep training |
-| **Wide&Deep** | [DLRS 2016](https://arxiv.org/abs/1606.07792) | Combines memorization and generalization |
-| **DCN** | [KDD 2017](https://arxiv.org/abs/1708.05123) | Explicit feature-crossing network |
-| **DCN-v2** | [WWW 2021](https://arxiv.org/abs/2008.13535) | Improved cross network |
-| **DIN** | [KDD 2018](https://arxiv.org/abs/1706.06978) | Attention captures user interest |
-| **DIEN** | [AAAI 2019](https://arxiv.org/abs/1809.03672) | Interest evolution modeling |
-| **BST** | [DLP-KDD 2019](https://arxiv.org/abs/1905.06874) | Transformer sequence modeling |
-| **AFM** | [IJCAI 2017](https://arxiv.org/abs/1708.04617) | Attentional factorization machine |
-| **AutoInt** | [CIKM 2019](https://arxiv.org/abs/1810.11921) | Automatic feature interaction learning |
-| **FiBiNET** | [RecSys 2019](https://arxiv.org/abs/1905.09433) | Feature importance + bilinear interaction |
-| **DeepFFM** | [RecSys 2019](https://arxiv.org/abs/1611.00144) | Field-aware factorization machine |
-| **EDCN** | [KDD 2021](https://arxiv.org/abs/2106.03032) | Enhanced cross network |
+| Model         | Paper                                            | Description                               |
+| ------------- | ------------------------------------------------ | ----------------------------------------- |
+| **DeepFM**    | [IJCAI 2017](https://arxiv.org/abs/1703.04247)   | Joint FM + Deep training                  |
+| **Wide&Deep** | [DLRS 2016](https://arxiv.org/abs/1606.07792)    | Combines memorization and generalization  |
+| **DCN**       | [KDD 2017](https://arxiv.org/abs/1708.05123)     | Explicit feature-crossing network         |
+| **DCN-v2**    | [WWW 2021](https://arxiv.org/abs/2008.13535)     | Improved cross network                    |
+| **DIN**       | [KDD 2018](https://arxiv.org/abs/1706.06978)     | Attention captures user interest          |
+| **DIEN**      | [AAAI 2019](https://arxiv.org/abs/1809.03672)    | Interest evolution modeling               |
+| **BST**       | [DLP-KDD 2019](https://arxiv.org/abs/1905.06874) | Transformer sequence modeling             |
+| **AFM**       | [IJCAI 2017](https://arxiv.org/abs/1708.04617)   | Attentional factorization machine         |
+| **AutoInt**   | [CIKM 2019](https://arxiv.org/abs/1810.11921)    | Automatic feature interaction learning    |
+| **FiBiNET**   | [RecSys 2019](https://arxiv.org/abs/1905.09433)  | Feature importance + bilinear interaction |
+| **DeepFFM**   | [RecSys 2019](https://arxiv.org/abs/1611.00144)  | Field-aware factorization machine         |
+| **EDCN**      | [KDD 2021](https://arxiv.org/abs/2106.03032)     | Enhanced cross network                    |
 ### Matching Models - 12
-| Model | Paper | Description |
-|------|------|------|
-| **DSSM** | [CIKM 2013](https://posenhuang.github.io/papers/cikm2013_DSSM_fullversion.pdf) | Classic two-tower retrieval model |
-| **YoutubeDNN** | [RecSys 2016](https://dl.acm.org/doi/10.1145/2959100.2959190) | YouTube deep retrieval |
-| **YoutubeSBC** | [RecSys 2019](https://dl.acm.org/doi/10.1145/3298689.3346997) | Sampling-bias-corrected variant |
-| **MIND** | [CIKM 2019](https://arxiv.org/abs/1904.08030) | Multi-interest dynamic routing |
-| **SINE** | [WSDM 2021](https://arxiv.org/abs/2103.06920) | Sparse interest network |
-| **GRU4Rec** | [ICLR 2016](https://arxiv.org/abs/1511.06939) | GRU sequential recommendation |
-| **SASRec** | [ICDM 2018](https://arxiv.org/abs/1808.09781) | Self-attentive sequential recommendation |
-| **NARM** | [CIKM 2017](https://arxiv.org/abs/1711.04725) | Neural attentive session recommendation |
-| **STAMP** | [KDD 2018](https://dl.acm.org/doi/10.1145/3219819.3219895) | Short-term attention/memory priority |
-| **ComiRec** | [KDD 2020](https://arxiv.org/abs/2005.09347) | Controllable multi-interest recommendation |
+| Model          | Paper                                                                          | Description                                |
+| -------------- | ------------------------------------------------------------------------------ | ------------------------------------------ |
+| **DSSM**       | [CIKM 2013](https://posenhuang.github.io/papers/cikm2013_DSSM_fullversion.pdf) | Classic two-tower retrieval model          |
+| **YoutubeDNN** | [RecSys 2016](https://dl.acm.org/doi/10.1145/2959100.2959190)                  | YouTube deep retrieval                     |
+| **YoutubeSBC** | [RecSys 2019](https://dl.acm.org/doi/10.1145/3298689.3346997)                  | Sampling-bias-corrected variant            |
+| **MIND**       | [CIKM 2019](https://arxiv.org/abs/1904.08030)                                  | Multi-interest dynamic routing             |
+| **SINE**       | [WSDM 2021](https://arxiv.org/abs/2103.06920)                                  | Sparse interest network                    |
+| **GRU4Rec**    | [ICLR 2016](https://arxiv.org/abs/1511.06939)                                  | GRU sequential recommendation              |
+| **SASRec**     | [ICDM 2018](https://arxiv.org/abs/1808.09781)                                  | Self-attentive sequential recommendation   |
+| **NARM**       | [CIKM 2017](https://arxiv.org/abs/1711.04725)                                  | Neural attentive session recommendation    |
+| **STAMP**      | [KDD 2018](https://dl.acm.org/doi/10.1145/3219819.3219895)                     | Short-term attention/memory priority       |
+| **ComiRec**    | [KDD 2020](https://arxiv.org/abs/2005.09347)                                   | Controllable multi-interest recommendation |
 ### Multi-Task Models - 5
-| Model | Paper | Description |
-|------|------|------|
-| **ESMM** | [SIGIR 2018](https://arxiv.org/abs/1804.07931) | Entire-space multi-task modeling |
-| **MMoE** | [KDD 2018](https://dl.acm.org/doi/10.1145/3219819.3220007) | Multi-gate mixture-of-experts |
-| **PLE** | [RecSys 2020](https://dl.acm.org/doi/10.1145/3383313.3412236) | Progressive layered extraction |
-| **AITM** | [KDD 2021](https://arxiv.org/abs/2105.08489) | Adaptive information transfer |
-| **SharedBottom** | - | Classic shared-bottom multi-task model |
+| Model            | Paper                                                         | Description                            |
+| ---------------- | ------------------------------------------------------------- | -------------------------------------- |
+| **ESMM**         | [SIGIR 2018](https://arxiv.org/abs/1804.07931)                | Entire-space multi-task modeling       |
+| **MMoE**         | [KDD 2018](https://dl.acm.org/doi/10.1145/3219819.3220007)    | Multi-gate mixture-of-experts          |
+| **PLE**          | [RecSys 2020](https://dl.acm.org/doi/10.1145/3383313.3412236) | Progressive layered extraction         |
+| **AITM**         | [KDD 2021](https://arxiv.org/abs/2105.08489)                  | Adaptive information transfer          |
+| **SharedBottom** | -                                                             | Classic shared-bottom multi-task model |
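 ### Generative Recommendation - 2
-| Model | Paper | Description |
-|------|------|------|
+| Model    | Paper                                         | Description                                                                                       |
+| -------- | --------------------------------------------- | ------------------------------------------------------------------------------------------------- |
 | **HSTU** | [Meta 2024](https://arxiv.org/abs/2402.17152) | Hierarchical Sequential Transduction Units, powering Meta's trillion-parameter recommender system |
-| **HLLM** | [2024](https://arxiv.org/abs/2409.12740) | Hierarchical LLM recommendation, incorporating LLM semantic understanding |
+| **HLLM** | [2024](https://arxiv.org/abs/2409.12740)      | Hierarchical LLM recommendation, incorporating LLM semantic understanding                         |
 ## 📊 Supported Datasets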
@@ -338,11 +349,19 @@ model = DSSM(user_features, item_features, temperature=0.02,
 match_trainer = MatchTrainer(model)
 match_trainer.fit(train_dl)
 match_trainer.export_onnx("dssm.onnx")
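-# For two-tower models, the user tower and the item tower can be exported separately:
+# For two-tower models, the user tower and the item tower can be exported separately: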
 # match_trainer.export_onnx("user_tower.onnx", mode="user")
 # match_trainer.export_onnx("dssm_item.onnx", tower="item")
 ```
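+### Model Visualization
+```python
+# Visualize the model architecture (requires: pip install torch-rechub[visualization])
+graph = ctr_trainer.visualization(depth=4)  # Generate the computation graph
+ctr_trainer.visualization(save_path="model.pdf", dpi=300)  # Save as a high-resolution PDF
+```
 ## 👨‍💻‍ Contributors
 Thanks to all the contributors!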
@@ -388,4 +407,4 @@ match_trainer.export_onnx("dssm.onnx")
 ---
-*Last updated: [2025-12-04]*
+*Last updated: [2025-12-11]*
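
Reviewer note: this release adds three optional extras (`bigdata`, `tracking`, `visualization`) and raises the `onnx` floors. To see at a glance what each extra pulls in, the sketch below groups `Requires-Dist` lines by their `extra == '...'` marker. It is a simplified illustration: the regex handles only the marker shape seen in this PKG-INFO, the helper name `group_by_extra` and the `core` bucket label are hypothetical, and the sample lines are a subset copied from the 0.0.6 side of the diff.

```python
import re
from collections import defaultdict

# Subset of the 0.0.6 metadata lines shown in the PKG-INFO diff above.
PKG_INFO_LINES = """\
Requires-Dist: torch>=1.10.0
Requires-Dist: pyarrow~=21.0; extra == 'bigdata'
Requires-Dist: onnx>=1.14.0; extra == 'onnx'
Requires-Dist: onnxruntime>=1.14.0; extra == 'onnx'
Requires-Dist: swanlab>=0.1.0; extra == 'tracking'
Requires-Dist: tensorboardx>=2.5; extra == 'tracking'
Requires-Dist: wandb>=0.13.0; extra == 'tracking'
Requires-Dist: graphviz>=0.20; extra == 'visualization'
Requires-Dist: torchview>=0.2.6; extra == 'visualization'
"""

# Simplified pattern: a requirement, optionally followed by `; extra == 'name'`.
_REQ_RE = re.compile(
    r"^Requires-Dist:\s*(?P<req>[^;]+?)\s*"
    r"(?:;\s*extra\s*==\s*'(?P<extra>[^']+)')?\s*$"
)


def group_by_extra(text):
    """Map each extra name (or 'core' when there is no marker) to its requirements."""
    groups = defaultdict(list)
    for line in text.splitlines():
        match = _REQ_RE.match(line)
        if match:
            groups[match.group("extra") or "core"].append(match.group("req"))
    return dict(groups)


extras = group_by_extra(PKG_INFO_LINES)
for name, reqs in extras.items():
    print(f"{name}: {', '.join(reqs)}")
```

For real metadata with arbitrary environment markers, a proper requirement parser is the safer choice; the regex here is only meant to make the grouping in this diff visible.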

{torch_rechub-0.0.4 → torch_rechub-0.0.6}/README.md RENAMED

@@ -24,13 +24,13 @@
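 ## 🎯 Why Torch-RecHub?
-| Feature | Torch-RecHub | Other frameworks |
-|------|-------------|---------|
-| Lines of code | **10 lines** for training + evaluation + deployment | 100+ lines |
-| Model coverage | **30+** mainstream models | Limited |
-| Generative recommendation | ✅ HSTU/HLLM (Meta 2024) | ❌ |
-| One-click ONNX export | ✅ Built-in support | Manual adaptation required |
-| Learning curve | Very low | Steep |
+| Feature                   | Torch-RecHub                                        | Other frameworks           |
+| ------------------------- | --------------------------------------------------- | -------------------------- |
+| Lines of code             | **10 lines** for training + evaluation + deployment | 100+ lines                 |
+| Model coverage            | **30+** mainstream models                           | Limited                    |
+| Generative recommendation | ✅ HSTU/HLLM (Meta 2024)                            | ❌                         |
+| One-click ONNX export     | ✅ Built-in support                                 | Manual adaptation required |
+| Learning curve            | Very low                                            | Steep                      |
 ## ✨ Features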
@@ -41,7 +41,8 @@
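 * **Easy configuration:** Adjust experiment settings via config files or command-line arguments.
 * **Reproducibility:** Designed to ensure reproducible experimental results.
 * **ONNX export:** Export trained models to ONNX format for deployment to production.
-* **Other features:** For example, negative sampling and multi-task learning.
+* **Cross-engine data processing:** PySpark-based data processing and conversion is now supported, which makes it easier to fit into big-data pipelines.
+* **Experiment visualization & tracking:** Built-in unified integration of three visualization/tracking tools: WandB, SwanLab, and TensorBoardX.
 ## 📖 Table of Contents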
@@ -160,52 +161,52 @@ torch-rechub/             # Root directory
 ### Ranking Models - 13
-| Model | Paper | Description |
-|------|------|------|
-| **DeepFM** | [IJCAI 2017](https://arxiv.org/abs/1703.04247) | Joint FM + Deep training |
-| **Wide&Deep** | [DLRS 2016](https://arxiv.org/abs/1606.07792) | Combines memorization and generalization |
-| **DCN** | [KDD 2017](https://arxiv.org/abs/1708.05123) | Explicit feature-crossing network |
-| **DCN-v2** | [WWW 2021](https://arxiv.org/abs/2008.13535) | Improved cross network |
-| **DIN** | [KDD 2018](https://arxiv.org/abs/1706.06978) | Attention captures user interest |
-| **DIEN** | [AAAI 2019](https://arxiv.org/abs/1809.03672) | Interest evolution modeling |
-| **BST** | [DLP-KDD 2019](https://arxiv.org/abs/1905.06874) | Transformer sequence modeling |
-| **AFM** | [IJCAI 2017](https://arxiv.org/abs/1708.04617) | Attentional factorization machine |
-| **AutoInt** | [CIKM 2019](https://arxiv.org/abs/1810.11921) | Automatic feature interaction learning |
-| **FiBiNET** | [RecSys 2019](https://arxiv.org/abs/1905.09433) | Feature importance + bilinear interaction |
-| **DeepFFM** | [RecSys 2019](https://arxiv.org/abs/1611.00144) | Field-aware factorization machine |
-| **EDCN** | [KDD 2021](https://arxiv.org/abs/2106.03032) | Enhanced cross network |
+| Model         | Paper                                            | Description                               |
+| ------------- | ------------------------------------------------ | ----------------------------------------- |
+| **DeepFM**    | [IJCAI 2017](https://arxiv.org/abs/1703.04247)   | Joint FM + Deep training                  |
+| **Wide&Deep** | [DLRS 2016](https://arxiv.org/abs/1606.07792)    | Combines memorization and generalization  |
+| **DCN**       | [KDD 2017](https://arxiv.org/abs/1708.05123)     | Explicit feature-crossing network         |
+| **DCN-v2**    | [WWW 2021](https://arxiv.org/abs/2008.13535)     | Improved cross network                    |
+| **DIN**       | [KDD 2018](https://arxiv.org/abs/1706.06978)     | Attention captures user interest          |
+| **DIEN**      | [AAAI 2019](https://arxiv.org/abs/1809.03672)    | Interest evolution modeling               |
+| **BST**       | [DLP-KDD 2019](https://arxiv.org/abs/1905.06874) | Transformer sequence modeling             |
+| **AFM**       | [IJCAI 2017](https://arxiv.org/abs/1708.04617)   | Attentional factorization machine         |
+| **AutoInt**   | [CIKM 2019](https://arxiv.org/abs/1810.11921)    | Automatic feature interaction learning    |
+| **FiBiNET**   | [RecSys 2019](https://arxiv.org/abs/1905.09433)  | Feature importance + bilinear interaction |
+| **DeepFFM**   | [RecSys 2019](https://arxiv.org/abs/1611.00144)  | Field-aware factorization machine         |
+| **EDCN**      | [KDD 2021](https://arxiv.org/abs/2106.03032)     | Enhanced cross network                    |
 ### Matching Models - 12
-| Model | Paper | Description |
-|------|------|------|
-| **DSSM** | [CIKM 2013](https://posenhuang.github.io/papers/cikm2013_DSSM_fullversion.pdf) | Classic two-tower retrieval model |
-| **YoutubeDNN** | [RecSys 2016](https://dl.acm.org/doi/10.1145/2959100.2959190) | YouTube deep retrieval |
-| **YoutubeSBC** | [RecSys 2019](https://dl.acm.org/doi/10.1145/3298689.3346997) | Sampling-bias-corrected variant |
-| **MIND** | [CIKM 2019](https://arxiv.org/abs/1904.08030) | Multi-interest dynamic routing |
-| **SINE** | [WSDM 2021](https://arxiv.org/abs/2103.06920) | Sparse interest network |
-| **GRU4Rec** | [ICLR 2016](https://arxiv.org/abs/1511.06939) | GRU sequential recommendation |
-| **SASRec** | [ICDM 2018](https://arxiv.org/abs/1808.09781) | Self-attentive sequential recommendation |
-| **NARM** | [CIKM 2017](https://arxiv.org/abs/1711.04725) | Neural attentive session recommendation |
-| **STAMP** | [KDD 2018](https://dl.acm.org/doi/10.1145/3219819.3219895) | Short-term attention/memory priority |
-| **ComiRec** | [KDD 2020](https://arxiv.org/abs/2005.09347) | Controllable multi-interest recommendation |
+| Model          | Paper                                                                          | Description                                |
+| -------------- | ------------------------------------------------------------------------------ | ------------------------------------------ |
+| **DSSM**       | [CIKM 2013](https://posenhuang.github.io/papers/cikm2013_DSSM_fullversion.pdf) | Classic two-tower retrieval model          |
+| **YoutubeDNN** | [RecSys 2016](https://dl.acm.org/doi/10.1145/2959100.2959190)                  | YouTube deep retrieval                     |
+| **YoutubeSBC** | [RecSys 2019](https://dl.acm.org/doi/10.1145/3298689.3346997)                  | Sampling-bias-corrected variant            |
+| **MIND**       | [CIKM 2019](https://arxiv.org/abs/1904.08030)                                  | Multi-interest dynamic routing             |
+| **SINE**       | [WSDM 2021](https://arxiv.org/abs/2103.06920)                                  | Sparse interest network                    |
+| **GRU4Rec**    | [ICLR 2016](https://arxiv.org/abs/1511.06939)                                  | GRU sequential recommendation              |
+| **SASRec**     | [ICDM 2018](https://arxiv.org/abs/1808.09781)                                  | Self-attentive sequential recommendation   |
+| **NARM**       | [CIKM 2017](https://arxiv.org/abs/1711.04725)                                  | Neural attentive session recommendation    |
+| **STAMP**      | [KDD 2018](https://dl.acm.org/doi/10.1145/3219819.3219895)                     | Short-term attention/memory priority       |
+| **ComiRec**    | [KDD 2020](https://arxiv.org/abs/2005.09347)                                   | Controllable multi-interest recommendation |
 ### Multi-Task Models - 5
-| Model | Paper | Description |
-|------|------|------|
-| **ESMM** | [SIGIR 2018](https://arxiv.org/abs/1804.07931) | Entire-space multi-task modeling |
-| **MMoE** | [KDD 2018](https://dl.acm.org/doi/10.1145/3219819.3220007) | Multi-gate mixture-of-experts |
-| **PLE** | [RecSys 2020](https://dl.acm.org/doi/10.1145/3383313.3412236) | Progressive layered extraction |
-| **AITM** | [KDD 2021](https://arxiv.org/abs/2105.08489) | Adaptive information transfer |
-| **SharedBottom** | - | Classic shared-bottom multi-task model |
+| Model            | Paper                                                         | Description                            |
+| ---------------- | ------------------------------------------------------------- | -------------------------------------- |
+| **ESMM**         | [SIGIR 2018](https://arxiv.org/abs/1804.07931)                | Entire-space multi-task modeling       |
+| **MMoE**         | [KDD 2018](https://dl.acm.org/doi/10.1145/3219819.3220007)    | Multi-gate mixture-of-experts          |
+| **PLE**          | [RecSys 2020](https://dl.acm.org/doi/10.1145/3383313.3412236) | Progressive layered extraction         |
+| **AITM**         | [KDD 2021](https://arxiv.org/abs/2105.08489)                  | Adaptive information transfer          |
+| **SharedBottom** | -                                                             | Classic shared-bottom multi-task model |
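 ### Generative Recommendation - 2
-| Model | Paper | Description |
-|------|------|------|
+| Model    | Paper                                         | Description                                                                                       |
+| -------- | --------------------------------------------- | ------------------------------------------------------------------------------------------------- |
 | **HSTU** | [Meta 2024](https://arxiv.org/abs/2402.17152) | Hierarchical Sequential Transduction Units, powering Meta's trillion-parameter recommender system |
-| **HLLM** | [2024](https://arxiv.org/abs/2409.12740) | Hierarchical LLM recommendation, incorporating LLM semantic understanding |
+| **HLLM** | [2024](https://arxiv.org/abs/2409.12740)      | Hierarchical LLM recommendation, incorporating LLM semantic understanding                         |
 ## 📊 Supported Datasets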
@@ -293,11 +294,19 @@ model = DSSM(user_features, item_features, temperature=0.02,
 match_trainer = MatchTrainer(model)
 match_trainer.fit(train_dl)
 match_trainer.export_onnx("dssm.onnx")
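-# For two-tower models, the user tower and the item tower can be exported separately:
+# For two-tower models, the user tower and the item tower can be exported separately: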
 # match_trainer.export_onnx("user_tower.onnx", mode="user")
 # match_trainer.export_onnx("dssm_item.onnx", tower="item")
 ```
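+### Model Visualization
+```python
+# Visualize the model architecture (requires: pip install torch-rechub[visualization])
+graph = ctr_trainer.visualization(depth=4)  # Generate the computation graph
+ctr_trainer.visualization(save_path="model.pdf", dpi=300)  # Save as a high-resolution PDF
+```
 ## 👨‍💻‍ Contributors
 Thanks to all the contributors!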
@@ -343,4 +352,4 @@ match_trainer.export_onnx("dssm.onnx")
 ---
-*Last updated: [2025-12-04]*
+*Last updated: [2025-12-11]*

{torch_rechub-0.0.4 → torch_rechub-0.0.6}/README_en.md RENAMED

@@ -41,6 +41,8 @@ English | [简体中文](README.md)
 * **Easy Configuration:** Adjust experiment settings via config files or command-line arguments.
 * **Reproducibility:** Designed to ensure reproducible experimental results.
 * **ONNX Export:** Export trained models to ONNX format for production deployment.
+* **Cross-engine data processing:** PySpark-based data processing and conversion supported for large-scale pipelines.
+* **Experiment visualization & tracking:** Unified integration of WandB, SwanLab, and TensorBoardX.
 * **Additional Features:** Negative sampling, multi-task learning, etc.
 ## 📖 Table of Contents
@@ -342,4 +344,4 @@ If you use this framework in your research or work, please consider citing:
 ---
-*Last updated: [2025-12-04]*
+*Last updated: [2025-12-11]*
