torch-rechub 0.0.4__py3-none-any.whl → 0.0.6__py3-none-any.whl

@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: torch-rechub
- Version: 0.0.4
+ Version: 0.0.6
  Summary: A Pytorch Toolbox for Recommendation Models, Easy-to-use and Easy-to-extend.
  Project-URL: Homepage, https://github.com/datawhalechina/torch-rechub
  Project-URL: Documentation, https://www.torch-rechub.com
@@ -28,19 +28,29 @@ Requires-Dist: scikit-learn>=0.24.0
  Requires-Dist: torch>=1.10.0
  Requires-Dist: tqdm>=4.60.0
  Requires-Dist: transformers>=4.46.3
+ Provides-Extra: bigdata
+ Requires-Dist: pyarrow~=21.0; extra == 'bigdata'
  Provides-Extra: dev
  Requires-Dist: bandit>=1.7.0; extra == 'dev'
  Requires-Dist: flake8>=3.8.0; extra == 'dev'
  Requires-Dist: isort==5.13.2; extra == 'dev'
  Requires-Dist: mypy>=0.800; extra == 'dev'
  Requires-Dist: pre-commit>=2.20.0; extra == 'dev'
+ Requires-Dist: pyarrow-stubs>=20.0; extra == 'dev'
  Requires-Dist: pytest-cov>=2.0; extra == 'dev'
  Requires-Dist: pytest>=6.0; extra == 'dev'
  Requires-Dist: toml>=0.10.2; extra == 'dev'
  Requires-Dist: yapf==0.43.0; extra == 'dev'
  Provides-Extra: onnx
- Requires-Dist: onnx>=1.12.0; extra == 'onnx'
- Requires-Dist: onnxruntime>=1.12.0; extra == 'onnx'
+ Requires-Dist: onnx>=1.14.0; extra == 'onnx'
+ Requires-Dist: onnxruntime>=1.14.0; extra == 'onnx'
+ Provides-Extra: tracking
+ Requires-Dist: swanlab>=0.1.0; extra == 'tracking'
+ Requires-Dist: tensorboardx>=2.5; extra == 'tracking'
+ Requires-Dist: wandb>=0.13.0; extra == 'tracking'
+ Provides-Extra: visualization
+ Requires-Dist: graphviz>=0.20; extra == 'visualization'
+ Requires-Dist: torchview>=0.2.6; extra == 'visualization'
  Description-Content-Type: text/markdown
 
  # 🔥 Torch-RecHub - A lightweight, efficient, and easy-to-use PyTorch recommender system framework
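The hunk above adds three extras (`bigdata`, `tracking`, `visualization`) and raises the `onnx` floor. As a rough sketch of how such a change could be inspected, the snippet below groups `Requires-Dist` entries by extra using only the standard library. The helper name `requirements_by_extra` and the embedded `METADATA_SNIPPET` are illustrative assumptions, and the marker handling is deliberately simplified: it only recognizes markers of the form `extra == '...'`, as they appear in this diff.

```python
from collections import defaultdict
from email.parser import Parser

# Excerpt of the 0.0.6 METADATA lines shown in the hunk above.
METADATA_SNIPPET = """\
Metadata-Version: 2.4
Name: torch-rechub
Version: 0.0.6
Provides-Extra: bigdata
Requires-Dist: pyarrow~=21.0; extra == 'bigdata'
Provides-Extra: onnx
Requires-Dist: onnx>=1.14.0; extra == 'onnx'
Requires-Dist: onnxruntime>=1.14.0; extra == 'onnx'
Provides-Extra: tracking
Requires-Dist: swanlab>=0.1.0; extra == 'tracking'
Requires-Dist: tensorboardx>=2.5; extra == 'tracking'
Requires-Dist: wandb>=0.13.0; extra == 'tracking'
Provides-Extra: visualization
Requires-Dist: graphviz>=0.20; extra == 'visualization'
Requires-Dist: torchview>=0.2.6; extra == 'visualization'
"""


def requirements_by_extra(metadata_text):
    """Group Requires-Dist entries by extra name (hypothetical helper).

    Core metadata uses RFC 822 style headers, so the stdlib email parser
    can read it. Requirements without an `extra == ...` marker are filed
    under the empty string.
    """
    msg = Parser().parsestr(metadata_text)
    grouped = defaultdict(list)
    for line in msg.get_all("Requires-Dist", []):
        requirement, _, marker = line.partition(";")
        marker = marker.strip()
        extra = ""
        # Simplification: only handles a bare `extra == 'name'` marker.
        if marker.startswith("extra =="):
            extra = marker.split("==", 1)[1].strip().strip("'\"")
        grouped[extra].append(requirement.strip())
    return dict(grouped)


extras = requirements_by_extra(METADATA_SNIPPET)
for name in sorted(extras):
    print(name, extras[name])
```

Note that `pyarrow~=21.0` is a compatible-release specifier, equivalent to `>=21.0, ==21.*`, so the `bigdata` extra pins pyarrow to the 21.x series. For real-world metadata with compound markers, a dedicated requirement parser would be a safer choice than the string splitting used here.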
@@ -69,13 +79,13 @@ Description-Content-Type: text/markdown
 
  ## 🎯 Why Torch-RecHub?
 
- | Feature | Torch-RecHub | Other frameworks |
- |------|-------------|---------|
- | Lines of code | **10 lines** for training + evaluation + deployment | 100+ lines |
- | Model coverage | **30+** mainstream models | Limited |
- | Generative recommendation | ✅ HSTU/HLLM (Meta 2024) | ❌ |
- | One-click ONNX export | ✅ Built-in support | Requires manual adaptation |
- | Learning curve | Very gentle | Steep |
+ | Feature | Torch-RecHub | Other frameworks |
+ | ------------- | --------------------------- | ---------- |
+ | Lines of code | **10 lines** for training + evaluation + deployment | 100+ lines |
+ | Model coverage | **30+** mainstream models | Limited |
+ | Generative recommendation | ✅ HSTU/HLLM (Meta 2024) | ❌ |
+ | One-click ONNX export | ✅ Built-in support | Requires manual adaptation |
+ | Learning curve | Very gentle | Steep |
 
  ## ✨ Features
 
@@ -86,7 +96,8 @@ Description-Content-Type: text/markdown
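  * **Easy to configure:** Adjust experiment settings easily through configuration files or command-line arguments.
  * **Reproducibility:** Designed to ensure that experimental results are reproducible.
  * **ONNX export:** Supports exporting trained models to ONNX format for easier deployment to production.
- * **Other features:** For example, support for negative sampling, multi-task learning, and more.
+ * **Cross-engine data processing:** Now supports PySpark-based data processing and transformation, making it easier to adopt in big-data pipelines.
+ * **Experiment visualization and tracking:** Built-in unified integration of three visualization/tracking tools: WandB, SwanLab, and TensorBoardX.
 
  ## 📖 Table of Contents
 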
@@ -205,52 +216,52 @@ torch-rechub/ # Root directory
 
  ### Ranking Models - 13 models
 
- | Model | Paper | Description |
- |------|------|------|
- | **DeepFM** | [IJCAI 2017](https://arxiv.org/abs/1703.04247) | Joint FM + Deep training |
- | **Wide&Deep** | [DLRS 2016](https://arxiv.org/abs/1606.07792) | Combines memorization and generalization |
- | **DCN** | [KDD 2017](https://arxiv.org/abs/1708.05123) | Explicit feature-crossing network |
- | **DCN-v2** | [WWW 2021](https://arxiv.org/abs/2008.13535) | Improved cross network |
- | **DIN** | [KDD 2018](https://arxiv.org/abs/1706.06978) | Attention mechanism to capture user interests |
- | **DIEN** | [AAAI 2019](https://arxiv.org/abs/1809.03672) | Interest evolution modeling |
- | **BST** | [DLP-KDD 2019](https://arxiv.org/abs/1905.06874) | Transformer sequence modeling |
- | **AFM** | [IJCAI 2017](https://arxiv.org/abs/1708.04617) | Attentional factorization machine |
- | **AutoInt** | [CIKM 2019](https://arxiv.org/abs/1810.11921) | Automatic feature interaction learning |
- | **FiBiNET** | [RecSys 2019](https://arxiv.org/abs/1905.09433) | Feature importance + bilinear interaction |
- | **DeepFFM** | [RecSys 2019](https://arxiv.org/abs/1611.00144) | Field-aware factorization machine |
- | **EDCN** | [KDD 2021](https://arxiv.org/abs/2106.03032) | Enhanced cross network |
+ | Model | Paper | Description |
+ | ------------- | ------------------------------------------------ | ----------------------- |
+ | **DeepFM** | [IJCAI 2017](https://arxiv.org/abs/1703.04247) | Joint FM + Deep training |
+ | **Wide&Deep** | [DLRS 2016](https://arxiv.org/abs/1606.07792) | Combines memorization and generalization |
+ | **DCN** | [KDD 2017](https://arxiv.org/abs/1708.05123) | Explicit feature-crossing network |
+ | **DCN-v2** | [WWW 2021](https://arxiv.org/abs/2008.13535) | Improved cross network |
+ | **DIN** | [KDD 2018](https://arxiv.org/abs/1706.06978) | Attention mechanism to capture user interests |
+ | **DIEN** | [AAAI 2019](https://arxiv.org/abs/1809.03672) | Interest evolution modeling |
+ | **BST** | [DLP-KDD 2019](https://arxiv.org/abs/1905.06874) | Transformer sequence modeling |
+ | **AFM** | [IJCAI 2017](https://arxiv.org/abs/1708.04617) | Attentional factorization machine |
+ | **AutoInt** | [CIKM 2019](https://arxiv.org/abs/1810.11921) | Automatic feature interaction learning |
+ | **FiBiNET** | [RecSys 2019](https://arxiv.org/abs/1905.09433) | Feature importance + bilinear interaction |
+ | **DeepFFM** | [RecSys 2019](https://arxiv.org/abs/1611.00144) | Field-aware factorization machine |
+ | **EDCN** | [KDD 2021](https://arxiv.org/abs/2106.03032) | Enhanced cross network |
 
  ### Matching Models - 12 models
 
- | Model | Paper | Description |
- |------|------|------|
- | **DSSM** | [CIKM 2013](https://posenhuang.github.io/papers/cikm2013_DSSM_fullversion.pdf) | Classic two-tower retrieval model |
- | **YoutubeDNN** | [RecSys 2016](https://dl.acm.org/doi/10.1145/2959100.2959190) | YouTube deep retrieval |
- | **YoutubeSBC** | [RecSys 2019](https://dl.acm.org/doi/10.1145/3298689.3346997) | Sampling-bias-corrected version |
- | **MIND** | [CIKM 2019](https://arxiv.org/abs/1904.08030) | Multi-interest dynamic routing |
- | **SINE** | [WSDM 2021](https://arxiv.org/abs/2103.06920) | Sparse interest network |
- | **GRU4Rec** | [ICLR 2016](https://arxiv.org/abs/1511.06939) | GRU sequential recommendation |
- | **SASRec** | [ICDM 2018](https://arxiv.org/abs/1808.09781) | Self-attentive sequential recommendation |
- | **NARM** | [CIKM 2017](https://arxiv.org/abs/1711.04725) | Neural attentive session-based recommendation |
- | **STAMP** | [KDD 2018](https://dl.acm.org/doi/10.1145/3219819.3219895) | Short-term attention/memory priority |
- | **ComiRec** | [KDD 2020](https://arxiv.org/abs/2005.09347) | Controllable multi-interest recommendation |
+ | Model | Paper | Description |
+ | -------------- | ------------------------------------------------------------------------------ | ------------------ |
+ | **DSSM** | [CIKM 2013](https://posenhuang.github.io/papers/cikm2013_DSSM_fullversion.pdf) | Classic two-tower retrieval model |
+ | **YoutubeDNN** | [RecSys 2016](https://dl.acm.org/doi/10.1145/2959100.2959190) | YouTube deep retrieval |
+ | **YoutubeSBC** | [RecSys 2019](https://dl.acm.org/doi/10.1145/3298689.3346997) | Sampling-bias-corrected version |
+ | **MIND** | [CIKM 2019](https://arxiv.org/abs/1904.08030) | Multi-interest dynamic routing |
+ | **SINE** | [WSDM 2021](https://arxiv.org/abs/2103.06920) | Sparse interest network |
+ | **GRU4Rec** | [ICLR 2016](https://arxiv.org/abs/1511.06939) | GRU sequential recommendation |
+ | **SASRec** | [ICDM 2018](https://arxiv.org/abs/1808.09781) | Self-attentive sequential recommendation |
+ | **NARM** | [CIKM 2017](https://arxiv.org/abs/1711.04725) | Neural attentive session-based recommendation |
+ | **STAMP** | [KDD 2018](https://dl.acm.org/doi/10.1145/3219819.3219895) | Short-term attention/memory priority |
+ | **ComiRec** | [KDD 2020](https://arxiv.org/abs/2005.09347) | Controllable multi-interest recommendation |
 
  ### Multi-Task Models - 5 models
 
- | Model | Paper | Description |
- |------|------|------|
- | **ESMM** | [SIGIR 2018](https://arxiv.org/abs/1804.07931) | Entire-space multi-task modeling |
- | **MMoE** | [KDD 2018](https://dl.acm.org/doi/10.1145/3219819.3220007) | Multi-gate mixture-of-experts |
- | **PLE** | [RecSys 2020](https://dl.acm.org/doi/10.1145/3383313.3412236) | Progressive layered extraction |
- | **AITM** | [KDD 2021](https://arxiv.org/abs/2105.08489) | Adaptive information transfer |
- | **SharedBottom** | - | Classic shared-bottom multi-task model |
+ | Model | Paper | Description |
+ | ---------------- | ------------------------------------------------------------- | ------------------ |
+ | **ESMM** | [SIGIR 2018](https://arxiv.org/abs/1804.07931) | Entire-space multi-task modeling |
+ | **MMoE** | [KDD 2018](https://dl.acm.org/doi/10.1145/3219819.3220007) | Multi-gate mixture-of-experts |
+ | **PLE** | [RecSys 2020](https://dl.acm.org/doi/10.1145/3383313.3412236) | Progressive layered extraction |
+ | **AITM** | [KDD 2021](https://arxiv.org/abs/2105.08489) | Adaptive information transfer |
+ | **SharedBottom** | - | Classic shared-bottom multi-task model |
 
  ### Generative Recommendation - 2 models
 
- | Model | Paper | Description |
- |------|------|------|
+ | Model | Paper | Description |
+ | -------- | --------------------------------------------- | -------------------------------------------- |
  | **HSTU** | [Meta 2024](https://arxiv.org/abs/2402.17152) | Hierarchical sequential transduction unit, powering Meta's trillion-parameter recommender systems |
- | **HLLM** | [2024](https://arxiv.org/abs/2409.12740) | Hierarchical large language model for recommendation, incorporating LLM semantic understanding |
+ | **HLLM** | [2024](https://arxiv.org/abs/2409.12740) | Hierarchical large language model for recommendation, incorporating LLM semantic understanding |
 
  ## 📊 Supported Datasets
 
@@ -338,11 +349,19 @@ model = DSSM(user_features, item_features, temperature=0.02,
  match_trainer = MatchTrainer(model)
  match_trainer.fit(train_dl)
  match_trainer.export_onnx("dssm.onnx")
- # Two-tower models can export the user tower and item tower separately:
+ # Two-tower models can export the user tower and item tower separately:
  # match_trainer.export_onnx("user_tower.onnx", mode="user")
  # match_trainer.export_onnx("dssm_item.onnx", tower="item")
  ```
 
+ ### Model Visualization
+
+ ```python
+ # Visualize the model architecture (requires: pip install torch-rechub[visualization])
+ graph = ctr_trainer.visualization(depth=4) # generate the computation graph
+ ctr_trainer.visualization(save_path="model.pdf", dpi=300) # save as a high-resolution PDF
+ ```
+
  ## 👨‍💻‍ Contributors
 
  Thanks to all contributors!
@@ -388,4 +407,4 @@ match_trainer.export_onnx("dssm.onnx")
 
  ---
 
- *Last updated: [2025-12-04]*
+ *Last updated: [2025-12-11]*
@@ -8,6 +8,10 @@ torch_rechub/basic/layers.py,sha256=URWk78dlffMOAhDVDhOhugcr4nmwEa192AI1diktC-4,
  torch_rechub/basic/loss_func.py,sha256=6bjljqpiuUP6O8-wUbGd8FSvflY5Dp_DV_57OuQVMz4,7969
  torch_rechub/basic/metaoptimizer.py,sha256=y-oT4MV3vXnSQ5Zd_ZEHP1KClITEi3kbZa6RKjlkYw8,3093
  torch_rechub/basic/metric.py,sha256=9JsaJJGvT6VRvsLoM2Y171CZxESsjYTofD3qnMI-bPM,8443
+ torch_rechub/basic/tracking.py,sha256=7-aoyKJxyqb8GobpjRjFsgPYWsBDOV44BYOC_vMoCto,6608
+ torch_rechub/data/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+ torch_rechub/data/convert.py,sha256=clGFEbDSDpdZBvscWatfjtuXMZUzgy1kiEAg4w_q7VM,2241
+ torch_rechub/data/dataset.py,sha256=fDDQ5N3x99KPfy0Ux4LRQbFlWbLg_dvKTO1WUEbEN04,4111
  torch_rechub/models/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
  torch_rechub/models/generative/__init__.py,sha256=TsCdVIhOcalQwqKZKjEuNbHKyIjyclapKGNwYfFR7TM,135
  torch_rechub/models/generative/hllm.py,sha256=6Vrp5Bh0fTFHCn7C-3EqzOyc7UunOyEY9TzAKGHrW-8,9669
@@ -45,18 +49,20 @@ torch_rechub/models/ranking/edcn.py,sha256=6f_S8I6Ir16kCIU54R4EfumWfUFOND5KDKUPH
  torch_rechub/models/ranking/fibinet.py,sha256=fmEJ9WkO8Mn0RtK_8aRHlnQFh_jMBPO0zODoHZPWmDA,2234
  torch_rechub/models/ranking/widedeep.py,sha256=eciRvWRBHLlctabLLS5NB7k3MnqrWXCBdpflOU6jMB0,1636
  torch_rechub/trainers/__init__.py,sha256=NSa2DqgfE1HGDyj40YgrbtUrfBHBxNBpw57XtaAB_jE,148
- torch_rechub/trainers/ctr_trainer.py,sha256=RDUXkn7GwLzs3f0kWZwGDNCpqiMeGXo7R6ezFeZdPg8,9075
- torch_rechub/trainers/match_trainer.py,sha256=xox5eaPKjSgErJQpbSr29sbyGs1p2sFaKEjxACE6uMI,11276
+ torch_rechub/trainers/ctr_trainer.py,sha256=e0xS-W48BOixN0ogksWOcVJNKFiO3g2oNA_hlHytRqk,14138
+ torch_rechub/trainers/match_trainer.py,sha256=atkO-gfDuTk6lh-WvaJOh5kgn6HPzbQQN42Rvz8kyXY,16327
  torch_rechub/trainers/matching.md,sha256=vIBQ3UMmVpUpyk38rrkelFwm_wXVXqMOuqzYZ4M8bzw,30
- torch_rechub/trainers/mtl_trainer.py,sha256=tC4c2KIc-H8Wvj4qCzcW6TyfMLRPJyfQvTaN0dDePFg,12598
- torch_rechub/trainers/seq_trainer.py,sha256=lXKRx7XbZ3iJuqp_f05vw_jkn8X5j8HmH6Nr-typiIU,12043
+ torch_rechub/trainers/mtl_trainer.py,sha256=n3T-ctWACSyl0awBQixOlZUQ8I5cfGyZzgKV09EF8hw,18293
+ torch_rechub/trainers/seq_trainer.py,sha256=pyY70kAjTWdKrnAYZynql1PPNtveYDLMB_1hbpCHa48,19217
  torch_rechub/utils/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
  torch_rechub/utils/data.py,sha256=vzLAAVt6dujg_vbGhQewiJc0l6JzwzdcM_9EjoOz898,19882
  torch_rechub/utils/hstu_utils.py,sha256=qLON_pJDC-kDyQn1PoN_HaHi5xTNCwZPgJeV51Z61Lc,6207
  torch_rechub/utils/match.py,sha256=l9qDwJGHPP9gOQTMYoqGVdWrlhDx1F1-8UnQwDWrEyk,18143
+ torch_rechub/utils/model_utils.py,sha256=VLhSbTpupxrFyyY3NzMQ32PPmo5YHm1T96u9KDlwiWE,8450
  torch_rechub/utils/mtl.py,sha256=AxU05ezizCuLdbPuCg1ZXE0WAStzuxaS5Sc3nwMCBpI,5737
- torch_rechub/utils/onnx_export.py,sha256=uRcAD4uZ3eIQbM-DPhdc0bkaPaslNsOYny6BOeLVBfU,13660
- torch_rechub-0.0.4.dist-info/METADATA,sha256=SNm71v_YOfculnc13p266bD_8yLo0U_16F_aJQPDvYo,16149
- torch_rechub-0.0.4.dist-info/WHEEL,sha256=qtCwoSJWgHk21S1Kb4ihdzI2rlJ1ZKaIurTj_ngOhyQ,87
- torch_rechub-0.0.4.dist-info/licenses/LICENSE,sha256=V7ietiX9G_84HtgEbxDgxClniqXGm2t5q8WM4AHGTu0,1066
- torch_rechub-0.0.4.dist-info/RECORD,,
+ torch_rechub/utils/onnx_export.py,sha256=LRHyZaR9zZJyg6xtuqQHWmusWq-yEvw9EhlmoEwcqsg,8364
+ torch_rechub/utils/visualization.py,sha256=Djv8W5SkCk3P2dol5VXf0_eanIhxDwRd7fzNOQY4uiU,9506
+ torch_rechub-0.0.6.dist-info/METADATA,sha256=OihjWb0yCI1bmTEoCYAC6pI6cCgl5KS5uSrAGZwv7yY,18470
+ torch_rechub-0.0.6.dist-info/WHEEL,sha256=qtCwoSJWgHk21S1Kb4ihdzI2rlJ1ZKaIurTj_ngOhyQ,87
+ torch_rechub-0.0.6.dist-info/licenses/LICENSE,sha256=V7ietiX9G_84HtgEbxDgxClniqXGm2t5q8WM4AHGTu0,1066
+ torch_rechub-0.0.6.dist-info/RECORD,,
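Each RECORD line above has the form `path,algorithm=digest,size`, where the digest is the URL-safe base64 encoding of the file hash with trailing `=` padding removed. The sketch below shows one way to parse such lines and recompute a digest with the standard library. The function names `record_digest` and `parse_record` are illustrative, not part of any packaging tool, and the two-line `RECORD_SNIPPET` is copied from the hunk above.

```python
import base64
import csv
import hashlib
import io


def record_digest(data: bytes) -> str:
    """Return the sha256 digest of `data` in wheel RECORD encoding."""
    digest = hashlib.sha256(data).digest()
    # URL-safe base64 with the trailing '=' padding stripped.
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def parse_record(text):
    """Map each path to (algorithm, digest, size); RECORD is CSV."""
    entries = {}
    for path, hash_field, size in csv.reader(io.StringIO(text)):
        algo, _, value = hash_field.partition("=")
        # The RECORD file lists itself with empty hash and size fields.
        entries[path] = (algo, value, int(size) if size else None)
    return entries


RECORD_SNIPPET = """\
torch_rechub/data/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
torch_rechub-0.0.6.dist-info/RECORD,,
"""

entries = parse_record(RECORD_SNIPPET)
algo, expected, size = entries["torch_rechub/data/__init__.py"]

# The new data/__init__.py is listed with size 0, so hashing empty bytes
# should reproduce its recorded digest.
matches = record_digest(b"") == expected
print(algo, size, matches)
```

The same digest appears on every zero-length `__init__.py` in the listing, which is a quick way to confirm that those files are empty in both versions.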