nextrec-0.2.7-py3-none-any.whl → nextrec-0.3.2-py3-none-any.whl

Files changed (41)
  1. nextrec/__version__.py +1 -1
  2. nextrec/basic/activation.py +4 -8
  3. nextrec/basic/callback.py +1 -1
  4. nextrec/basic/features.py +33 -25
  5. nextrec/basic/layers.py +164 -601
  6. nextrec/basic/loggers.py +4 -5
  7. nextrec/basic/metrics.py +39 -115
  8. nextrec/basic/model.py +257 -177
  9. nextrec/basic/session.py +1 -5
  10. nextrec/data/__init__.py +12 -0
  11. nextrec/data/data_utils.py +3 -27
  12. nextrec/data/dataloader.py +26 -34
  13. nextrec/data/preprocessor.py +2 -1
  14. nextrec/loss/listwise.py +6 -4
  15. nextrec/loss/loss_utils.py +10 -6
  16. nextrec/loss/pairwise.py +5 -3
  17. nextrec/loss/pointwise.py +7 -13
  18. nextrec/models/generative/__init__.py +5 -0
  19. nextrec/models/generative/hstu.py +399 -0
  20. nextrec/models/match/mind.py +110 -1
  21. nextrec/models/multi_task/esmm.py +46 -27
  22. nextrec/models/multi_task/mmoe.py +48 -30
  23. nextrec/models/multi_task/ple.py +156 -141
  24. nextrec/models/multi_task/poso.py +413 -0
  25. nextrec/models/multi_task/share_bottom.py +43 -26
  26. nextrec/models/ranking/__init__.py +2 -0
  27. nextrec/models/ranking/dcn.py +20 -1
  28. nextrec/models/ranking/dcn_v2.py +84 -0
  29. nextrec/models/ranking/deepfm.py +44 -18
  30. nextrec/models/ranking/dien.py +130 -27
  31. nextrec/models/ranking/masknet.py +13 -67
  32. nextrec/models/ranking/widedeep.py +39 -18
  33. nextrec/models/ranking/xdeepfm.py +34 -1
  34. nextrec/utils/common.py +26 -1
  35. nextrec/utils/optimizer.py +7 -3
  36. nextrec-0.3.2.dist-info/METADATA +312 -0
  37. nextrec-0.3.2.dist-info/RECORD +57 -0
  38. nextrec-0.2.7.dist-info/METADATA +0 -281
  39. nextrec-0.2.7.dist-info/RECORD +0 -54
  40. {nextrec-0.2.7.dist-info → nextrec-0.3.2.dist-info}/WHEEL +0 -0
  41. {nextrec-0.2.7.dist-info → nextrec-0.3.2.dist-info}/licenses/LICENSE +0 -0
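Several entries in the list above are `.dist-info/METADATA` files. Their header block uses email-style `Key: value` lines, so the standard library's `email.parser` can read it. The sketch below is illustrative only: the `SAMPLE` text is a hand-picked subset of headers from the 0.3.2 METADATA in this diff, and `split_requirement` is a hypothetical helper, not part of any packaging tool.

```python
from email.parser import Parser

# A few headers copied from the 0.3.2 METADATA in this diff (subset, for illustration).
SAMPLE = """\
Metadata-Version: 2.4
Name: nextrec
Version: 0.3.2
Requires-Python: >=3.10
Requires-Dist: numpy<2.0,>=1.21; sys_platform == 'linux' and python_version < '3.12'
Requires-Dist: torch>=2.0.0
Requires-Dist: pytest>=7.4.0; extra == 'dev'
Description-Content-Type: text/markdown

README body starts here.
"""


def split_requirement(line: str) -> tuple[str, str]:
    """Split 'spec; marker' into (spec, marker); marker is '' when absent."""
    spec, _, marker = line.partition(";")
    return spec.strip(), marker.strip()


msg = Parser().parsestr(SAMPLE)

# Repeated headers such as Requires-Dist are returned as a list by get_all().
requirements = [split_requirement(r) for r in msg.get_all("Requires-Dist", [])]

print(msg["Name"], msg["Version"])
for spec, marker in requirements:
    print(f"{spec:<20} {marker or '(unconditional)'}")
```

This only separates each requirement from its environment marker; it does not evaluate markers against the running interpreter, which needs a dedicated packaging library.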
@@ -0,0 +1,312 @@
1
+ Metadata-Version: 2.4
2
+ Name: nextrec
3
+ Version: 0.3.2
4
+ Summary: A comprehensive recommendation library with match, ranking, and multi-task learning models
5
+ Project-URL: Homepage, https://github.com/zerolovesea/NextRec
6
+ Project-URL: Repository, https://github.com/zerolovesea/NextRec
7
+ Project-URL: Documentation, https://github.com/zerolovesea/NextRec/blob/main/README.md
8
+ Project-URL: Issues, https://github.com/zerolovesea/NextRec/issues
9
+ Author-email: zerolovesea <zyaztec@gmail.com>
10
+ License-File: LICENSE
11
+ Keywords: ctr,deep-learning,match,pytorch,ranking,recommendation
12
+ Classifier: Development Status :: 3 - Alpha
13
+ Classifier: Intended Audience :: Developers
14
+ Classifier: Intended Audience :: Science/Research
15
+ Classifier: License :: OSI Approved :: Apache Software License
16
+ Classifier: Programming Language :: Python :: 3
17
+ Classifier: Programming Language :: Python :: 3.10
18
+ Classifier: Programming Language :: Python :: 3.11
19
+ Classifier: Programming Language :: Python :: 3.12
20
+ Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
21
+ Requires-Python: >=3.10
22
+ Requires-Dist: numpy<2.0,>=1.21; sys_platform == 'linux' and python_version < '3.12'
23
+ Requires-Dist: numpy<3.0,>=1.26; sys_platform == 'linux' and python_version >= '3.12'
24
+ Requires-Dist: numpy>=1.23.0; sys_platform == 'win32'
25
+ Requires-Dist: numpy>=1.24.0; sys_platform == 'darwin'
26
+ Requires-Dist: pandas<2.0,>=1.5; sys_platform == 'linux' and python_version < '3.12'
27
+ Requires-Dist: pandas<2.3.0,>=2.1.0; sys_platform == 'win32'
28
+ Requires-Dist: pandas>=2.0.0; sys_platform == 'darwin'
29
+ Requires-Dist: pandas>=2.1.0; sys_platform == 'linux' and python_version >= '3.12'
30
+ Requires-Dist: pyarrow<13.0.0,>=10.0.0; sys_platform == 'linux' and python_version < '3.12'
31
+ Requires-Dist: pyarrow<15.0.0,>=12.0.0; sys_platform == 'win32'
32
+ Requires-Dist: pyarrow>=12.0.0; sys_platform == 'darwin'
33
+ Requires-Dist: pyarrow>=16.0.0; sys_platform == 'linux' and python_version >= '3.12'
34
+ Requires-Dist: scikit-learn<2.0,>=1.2; sys_platform == 'linux' and python_version < '3.12'
35
+ Requires-Dist: scikit-learn>=1.3.0; sys_platform == 'darwin'
36
+ Requires-Dist: scikit-learn>=1.3.0; sys_platform == 'linux' and python_version >= '3.12'
37
+ Requires-Dist: scikit-learn>=1.3.0; sys_platform == 'win32'
38
+ Requires-Dist: scipy<1.12,>=1.8; sys_platform == 'linux' and python_version < '3.12'
39
+ Requires-Dist: scipy>=1.10.0; sys_platform == 'darwin'
40
+ Requires-Dist: scipy>=1.10.0; sys_platform == 'win32'
41
+ Requires-Dist: scipy>=1.11.0; sys_platform == 'linux' and python_version >= '3.12'
42
+ Requires-Dist: torch>=2.0.0
43
+ Requires-Dist: torchvision>=0.15.0
44
+ Requires-Dist: tqdm>=4.65.0
45
+ Provides-Extra: dev
46
+ Requires-Dist: jupyter>=1.0.0; extra == 'dev'
47
+ Requires-Dist: matplotlib>=3.7.0; extra == 'dev'
48
+ Requires-Dist: pytest-cov>=4.1.0; extra == 'dev'
49
+ Requires-Dist: pytest-html>=3.2.0; extra == 'dev'
50
+ Requires-Dist: pytest-mock>=3.11.0; extra == 'dev'
51
+ Requires-Dist: pytest-timeout>=2.1.0; extra == 'dev'
52
+ Requires-Dist: pytest-xdist>=3.3.0; extra == 'dev'
53
+ Requires-Dist: pytest>=7.4.0; extra == 'dev'
54
+ Requires-Dist: seaborn>=0.12.0; extra == 'dev'
55
+ Description-Content-Type: text/markdown
56
+
57
+ <p align="center">
58
+ <img align="center" src="asserts/logo.png" width="40%">
59
+ </p>
60
+
61
+ <div align="center">
62
+
63
+ ![Python](https://img.shields.io/badge/Python-3.10+-blue.svg)
64
+ ![PyTorch](https://img.shields.io/badge/PyTorch-1.10+-ee4c2c.svg)
65
+ ![License](https://img.shields.io/badge/License-Apache%202.0-green.svg)
66
+ ![Version](https://img.shields.io/badge/Version-0.3.2-orange.svg)
67
+
68
+ English | [中文文档](README_zh.md)
69
+
70
+ **A Unified, Efficient, and Scalable Recommendation System Framework**
71
+
72
+ </div>
73
+
74
+ ## Introduction
75
+
76
+ NextRec is a modern recommendation framework built on PyTorch, delivering a unified experience for modeling, training, and evaluation. It follows a modular design with rich model implementations, data-processing utilities, and engineering-ready training components. NextRec focuses on large-scale industrial recall scenarios on Spark clusters, training on massive offline parquet features.
77
+
78
+ ## Why NextRec
79
+
80
+ - **Unified feature engineering & data pipeline**: Dense/Sparse/Sequence feature definitions, persistent DataProcessor, and batch-optimized RecDataLoader, matching offline feature training/inference in industrial big-data settings.
81
+ - **Multi-scenario coverage**: Ranking (CTR/CVR), retrieval, multi-task learning, and more marketing/rec models, with a continuously expanding model zoo.
82
+ - **Developer-friendly experience**: Stream processing/training/inference for csv/parquet/pathlike data, plus GPU/MPS acceleration and visualization support.
83
+ - **Efficient training & evaluation**: Standardized engine with optimizers, LR schedulers, early stopping, checkpoints, and detailed logging out of the box.
84
+
85
+ ## Architecture
86
+
87
+ NextRec adopts a modular and low-coupling engineering design, enabling full-pipeline reusability and scalability across data processing → model construction → training & evaluation → inference & deployment. Its core components include: a Feature-Spec-driven Embedding architecture, the BaseModel abstraction, a set of independent reusable Layers, a unified DataLoader for both training and inference, and a ready-to-use Model Zoo.
88
+
89
+ ![NextRec Architecture](asserts/nextrec_diagram_en.png)
90
+
91
+ > The project borrows ideas from excellent open-source rec libraries. Early layers referenced [torch-rechub](https://github.com/datawhalechina/torch-rechub) but have been replaced with in-house implementations. torch-rechub remains mature in architecture and models; the author contributed a bit there—feel free to check it out.
92
+
93
+ ---
94
+
95
+ ## Installation
96
+
97
+ You can quickly install the latest NextRec via `pip install nextrec`; Python 3.10+ is required.
98
+
99
+ ## Tutorials
100
+
101
+ See `tutorials/` for examples covering ranking, retrieval, multi-task learning, and data processing:
102
+
103
+ - [movielen_ranking_deepfm.py](/tutorials/movielen_ranking_deepfm.py) — DeepFM training on MovieLens 100k
104
+ - [example_ranking_din.py](/tutorials/example_ranking_din.py) — DIN training on the e-commerce dataset
105
+ - [example_multitask.py](/tutorials/example_multitask.py) — ESMM multi-task training on the e-commerce dataset
106
+ - [movielen_match_dssm.py](/tutorials/movielen_match_dssm.py) — DSSM retrieval on MovieLens 100k
107
+
108
+ To dive deeper, Jupyter notebooks are available:
109
+
110
+ - [Hands on the NextRec framework](/tutorials/notebooks/en/Hands%20on%20nextrec.ipynb)
111
+ - [Using the data processor for preprocessing](/tutorials/notebooks/en/Hands%20on%20dataprocessor.ipynb)
112
+
113
+ > Current version [0.3.2]: the matching module is not fully polished yet and may have compatibility issues or unexpected errors. Please raise an issue if you run into problems.
114
+
115
+ ## 5-Minute Quick Start
116
+
117
+ We provide a detailed quick start and paired datasets to help you learn the framework. In `datasets/` you’ll find an e-commerce sample dataset like this:
118
+
119
+ | user_id | item_id | dense_0 | dense_1 | dense_2 | dense_3 | dense_4 | dense_5 | dense_6 | dense_7 | sparse_0 | sparse_1 | sparse_2 | sparse_3 | sparse_4 | sparse_5 | sparse_6 | sparse_7 | sparse_8 | sparse_9 | sequence_0 | sequence_1 | label |
120
+ |--------|---------|-------------|-------------|-------------|------------|-------------|-------------|-------------|-------------|----------|----------|----------|----------|----------|----------|----------|----------|----------|----------|-----------------------------------------------------------|-----------------------------------------------------------|-------|
121
+ | 1 | 7817 | 0.14704075 | 0.31020382 | 0.77780896 | 0.944897 | 0.62315375 | 0.57124174 | 0.77009535 | 0.3211029 | 315 | 260 | 379 | 146 | 168 | 161 | 138 | 88 | 5 | 312 | [170,175,97,338,105,353,272,546,175,545,463,128,0,0,0] | [368,414,820,405,548,63,327,0,0,0,0,0,0,0,0] | 0 |
122
+ | 1 | 3579 | 0.77811223 | 0.80359334 | 0.5185201 | 0.91091245 | 0.043562356 | 0.82142705 | 0.8803686 | 0.33748195 | 149 | 229 | 442 | 6 | 167 | 252 | 25 | 402 | 7 | 168 | [179,48,61,551,284,165,344,151,0,0,0,0,0,0,0] | [814,0,0,0,0,0,0,0,0,0,0,0,0,0,0] | 1 |
123
+
124
+ Below is a short example showing how to train a DIN model. DIN (Deep Interest Network) is a CTR prediction model published at KDD 2018. You can also run `python tutorials/example_ranking_din.py` directly.
125
+
126
+ After training, detailed logs are available under `nextrec_logs/din_tutorial`.
127
+
128
+ ```python
129
+ import pandas as pd
130
+
131
+ from nextrec.models.ranking.din import DIN
132
+ from nextrec.basic.features import DenseFeature, SparseFeature, SequenceFeature
133
+
134
+ df = pd.read_csv('dataset/ranking_task.csv')
135
+
136
+ for col in [c for c in df.columns if 'sequence' in c]: # csv loads lists as text; convert them back to objects
137
+     df[col] = df[col].apply(lambda x: eval(x) if isinstance(x, str) else x)
138
+
139
+ # Define feature columns
140
+ dense_features = [DenseFeature(name=f'dense_{i}', input_dim=1) for i in range(8)]
141
+
142
+ sparse_features = [SparseFeature(name='user_id', embedding_name='user_emb', vocab_size=int(df['user_id'].max() + 1), embedding_dim=32), SparseFeature(name='item_id', embedding_name='item_emb', vocab_size=int(df['item_id'].max() + 1), embedding_dim=32),]
143
+
144
+ sparse_features.extend([SparseFeature(name=f'sparse_{i}', embedding_name=f'sparse_{i}_emb', vocab_size=int(df[f'sparse_{i}'].max() + 1), embedding_dim=32) for i in range(10)])
145
+
146
+ sequence_features = [
147
+ SequenceFeature(name='sequence_0', vocab_size=int(df['sequence_0'].apply(lambda x: max(x)).max() + 1), embedding_dim=32, padding_idx=0, embedding_name='item_emb'),
148
+ SequenceFeature(name='sequence_1', vocab_size=int(df['sequence_1'].apply(lambda x: max(x)).max() + 1), embedding_dim=16, padding_idx=0, embedding_name='sparse_0_emb'),]
149
+
150
+ mlp_params = {
151
+ "dims": [256, 128, 64],
152
+ "activation": "relu",
153
+ "dropout": 0.3,
154
+ }
155
+
156
+ model = DIN(
157
+ dense_features=dense_features,
158
+ sparse_features=sparse_features,
159
+ sequence_features=sequence_features,
160
+ mlp_params=mlp_params,
161
+ attention_hidden_units=[80, 40],
162
+ attention_activation='sigmoid',
163
+ attention_use_softmax=True,
164
+ target=['label'], # target variable
165
+ device='mps',
166
+ embedding_l1_reg=1e-6,
167
+ embedding_l2_reg=1e-5,
168
+ dense_l1_reg=1e-5,
169
+ dense_l2_reg=1e-4,
170
+ session_id="din_tutorial", # experiment id for logs
171
+ )
172
+
173
+ # Compile model with optimizer and loss
174
+ model.compile(
175
+ optimizer = "adam",
176
+ optimizer_params = {"lr": 1e-3, "weight_decay": 1e-5},
177
+ loss = "focal",
178
+ loss_params={"gamma": 2.0, "alpha": 0.25},
179
+ )
180
+
181
+ model.fit(
182
+ train_data=df,
183
+ metrics=['auc', 'gauc', 'logloss'], # metrics to track
184
+ epochs=3,
185
+ batch_size=512,
186
+ shuffle=True,
187
+ user_id_column='user_id' # used for GAUC
188
+ )
189
+
190
+ # Evaluate after training
191
+ metrics = model.evaluate(
192
+ df,
193
+ metrics=['auc', 'gauc', 'logloss'],
194
+ batch_size=512,
195
+ user_id_column='user_id'
196
+ )
197
+ ```
198
+
199
+ ---
200
+
201
+ ## Supported Models
202
+
203
+ ### Ranking Models
204
+
205
+ | Model | Paper | Venue / Year | Status |
206
+ |-------|-------|------|--------|
207
+ | [FM](nextrec/models/ranking/fm.py) | Factorization Machines | ICDM 2010 | Supported |
208
+ | [AFM](nextrec/models/ranking/afm.py) | Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks | IJCAI 2017 | Supported |
209
+ | [DeepFM](nextrec/models/ranking/deepfm.py) | DeepFM: A Factorization-Machine based Neural Network for CTR Prediction | IJCAI 2017 | Supported |
210
+ | [Wide&Deep](nextrec/models/ranking/widedeep.py) | Wide & Deep Learning for Recommender Systems | DLRS 2016 | Supported |
211
+ | [xDeepFM](nextrec/models/ranking/xdeepfm.py) | xDeepFM: Combining Explicit and Implicit Feature Interactions | KDD 2018 | Supported |
212
+ | [FiBiNET](nextrec/models/ranking/fibinet.py) | FiBiNET: Combining Feature Importance and Bilinear Feature Interaction for CTR Prediction | RecSys 2019 | Supported |
213
+ | [PNN](nextrec/models/ranking/pnn.py) | Product-based Neural Networks for User Response Prediction | ICDM 2016 | Supported |
214
+ | [AutoInt](nextrec/models/ranking/autoint.py) | AutoInt: Automatic Feature Interaction Learning | CIKM 2019 | Supported |
215
+ | [DCN](nextrec/models/ranking/dcn.py) | Deep & Cross Network for Ad Click Predictions | ADKDD 2017 | Supported |
216
+ | [DCN v2](nextrec/models/ranking/dcn_v2.py) | DCN V2: Improved Deep & Cross Network and Practical Lessons for Web-scale Learning to Rank Systems | WWW 2021 | In Progress |
217
+ | [DIN](nextrec/models/ranking/din.py) | Deep Interest Network for CTR Prediction | KDD 2018 | Supported |
218
+ | [DIEN](nextrec/models/ranking/dien.py) | Deep Interest Evolution Network | AAAI 2019 | Supported |
219
+ | [MaskNet](nextrec/models/ranking/masknet.py) | MaskNet: Feature-wise Gating Blocks for High-dimensional Sparse Recommendation Data | 2020 | Supported |
220
+
221
+ ### Retrieval Models
222
+
223
+ | Model | Paper | Venue / Year | Status |
224
+ |-------|-------|------|--------|
225
+ | [DSSM](nextrec/models/match/dssm.py) | Learning Deep Structured Semantic Models | CIKM 2013 | Supported |
226
+ | [DSSM v2](nextrec/models/match/dssm_v2.py) | DSSM with pairwise BPR-style optimization | - | Supported |
227
+ | [YouTube DNN](nextrec/models/match/youtube_dnn.py) | Deep Neural Networks for YouTube Recommendations | RecSys 2016 | Supported |
228
+ | [MIND](nextrec/models/match/mind.py) | Multi-Interest Network with Dynamic Routing | CIKM 2019 | Supported |
229
+ | [SDM](nextrec/models/match/sdm.py) | Sequential Deep Matching Model | - | Supported |
230
+
231
+ ### Multi-task Models
232
+
233
+ | Model | Paper | Venue / Year | Status |
234
+ |-------|-------|------|--------|
235
+ | [MMOE](nextrec/models/multi_task/mmoe.py) | Modeling Task Relationships in Multi-task Learning | KDD 2018 | Supported |
236
+ | [PLE](nextrec/models/multi_task/ple.py) | Progressive Layered Extraction | RecSys 2020 | Supported |
237
+ | [ESMM](nextrec/models/multi_task/esmm.py) | Entire Space Multi-task Model | SIGIR 2018 | Supported |
238
+ | [ShareBottom](nextrec/models/multi_task/share_bottom.py) | Multitask Learning | - | Supported |
239
+ | [POSO](nextrec/models/multi_task/poso.py) | POSO: Personalized Cold-start Modules for Large-scale Recommender Systems | 2021 | Supported |
240
+ | [POSO-IFLYTEK](nextrec/models/multi_task/poso_iflytek.py) | POSO with PLE-style gating for sequential marketing tasks | - | Supported |
241
+
242
+ ### Generative Models
243
+
244
+ | Model | Paper | Venue / Year | Status |
245
+ |-------|-------|------|--------|
246
+ | [TIGER](nextrec/models/generative/tiger.py) | Recommender Systems with Generative Retrieval | NeurIPS 2023 | In Progress |
247
+ | [HSTU](nextrec/models/generative/hstu.py) | Hierarchical Sequential Transduction Units | - | In Progress |
248
+
249
+ ---
250
+
251
+ ## Contributing
252
+
253
+ We welcome contributions of any form!
254
+
255
+ ### How to Contribute
256
+
257
+ 1. Fork the repository
258
+ 2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
259
+ 3. Commit your changes (`git commit -m 'Add AmazingFeature'`)
260
+ 4. Push your branch (`git push origin feature/AmazingFeature`)
261
+ 5. Open a Pull Request
262
+
263
+ > Before submitting a PR, please run tests using `pytest test/ -v` or `python -m pytest` to ensure everything passes.
264
+
265
+ ### Code Style
266
+
267
+ - Follow PEP8
268
+ - Provide unit tests for new functionality
269
+ - Update documentation accordingly
270
+
271
+ ### Reporting Issues
272
+
273
+ When submitting issues on GitHub, please include:
274
+
275
+ - Description of the problem
276
+ - Reproduction steps
277
+ - Expected behavior
278
+ - Actual behavior
279
+ - Environment info (Python version, PyTorch version, etc.)
280
+
281
+ ---
282
+
283
+ ## License
284
+
285
+ This project is licensed under the [Apache 2.0 License](./LICENSE).
286
+
287
+ ---
288
+
289
+ ## Contact
290
+
291
+ - **GitHub Issues**: [Submit an issue](https://github.com/zerolovesea/NextRec/issues)
292
+ - **Email**: zyaztec@gmail.com
293
+
294
+ ---
295
+
296
+ ## Acknowledgements
297
+
298
+ NextRec is inspired by the following great open-source projects:
299
+
300
+ - [torch-rechub](https://github.com/datawhalechina/torch-rechub) — Flexible, easy-to-extend recommendation framework
301
+ - [FuxiCTR](https://github.com/reczoo/FuxiCTR) — Configurable, tunable, and reproducible CTR library
302
+ - [RecBole](https://github.com/RUCAIBox/RecBole) — Unified, comprehensive, and efficient recommendation library
303
+
304
+ Special thanks to all open-source contributors!
305
+
306
+ ---
307
+
308
+ <div align="center">
309
+
310
+ **[Back to Top](#nextrec)**
311
+
312
+ </div>
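The quick-start snippet in the README above turns stringified lists from the CSV back into Python lists with `eval`. A more defensive variant, sketched here as a suggestion rather than as the project's own code, uses `ast.literal_eval`, which only accepts Python literals and refuses arbitrary expressions. `parse_sequence` is a hypothetical helper name; the sample cell is taken from the `sequence_1` column of the sample table.

```python
import ast


def parse_sequence(value):
    """Turn a CSV cell such as '[170,175,0]' into a list.

    Non-string inputs (for example, lists that are already parsed)
    are returned unchanged.
    """
    if not isinstance(value, str):
        return value
    parsed = ast.literal_eval(value)  # literals only; no function calls or names
    if not isinstance(parsed, list):
        raise ValueError(f"expected a list literal, got {type(parsed).__name__}")
    return parsed


cell = "[368,414,820,405,548,63,327,0,0,0,0,0,0,0,0]"
seq = parse_sequence(cell)
print(len(seq), max(seq))
```

With pandas, the same helper could be applied per column, for instance `df[col] = df[col].apply(parse_sequence)`. Malformed cells raise `ValueError` or `SyntaxError` instead of being executed.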
@@ -0,0 +1,57 @@
1
+ nextrec/__init__.py,sha256=CvocnY2uBp0cjNkhrT6ogw0q2bN9s1GNp754FLO-7lo,1117
2
+ nextrec/__version__.py,sha256=vNiWJ14r_cw5t_7UDqDQIVZvladKFGyHH2avsLpN7Vg,22
3
+ nextrec/basic/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
4
+ nextrec/basic/activation.py,sha256=1qs9pq4hT3BUxIiYdYs57axMCm4-JyOBFQ6x7xkHTwM,2849
5
+ nextrec/basic/callback.py,sha256=wwh0I2kKYyywCB-sG9eQXShlpXFJIo75qApJmnI5p6c,1036
6
+ nextrec/basic/features.py,sha256=JtB63jqOIL7zZ5zoTgvEM4fEoqexMz0SMTmowTURk1I,4626
7
+ nextrec/basic/layers.py,sha256=zIa8QsPkOOovjrMAUC94SfhSVTS4R_CXySBr5KAk6i4,24686
8
+ nextrec/basic/loggers.py,sha256=VNed0LagpoPSUl2itW8hHT-BSqJHTlQY5pVxIVmm6AE,3733
9
+ nextrec/basic/metrics.py,sha256=YFOaUexHJncc6sPbw2LF2sBnFp-3PLMrjR3aQbBDpGs,20891
10
+ nextrec/basic/model.py,sha256=Doq5KOYrUHavpSa8RkHbT98ZhbFGRpRsA_9K1A5gU9c,73453
11
+ nextrec/basic/session.py,sha256=oaATn-nzbJ9A6SGbMut9xLV_NSh9_1KmVDeNauS06Ps,4767
12
+ nextrec/data/__init__.py,sha256=COaTyiARV7hEQTT3e74uyCBGmHFQ9rhe6g6Shc-Ualw,1064
13
+ nextrec/data/data_utils.py,sha256=H-isIrs2FPyLSTe7IiFUkn6SQKfO0BkGKmj43C9yLGY,7602
14
+ nextrec/data/dataloader.py,sha256=ySNTts03P8I1vq53HwsP0cg9QdkA0SGyazNJnEA5vfs,14668
15
+ nextrec/data/preprocessor.py,sha256=MhQofbOcZLQCwsi335NTwDWsjQ0QbPIuzbzC0-ijAn4,41731
16
+ nextrec/loss/__init__.py,sha256=mO5t417BneZ8Ysa51GyjDaffjWyjzFgPXIQrrggasaQ,827
17
+ nextrec/loss/listwise.py,sha256=gxDbO1td5IeS28jKzdE35o1KAYBRdCYoMzyZzfNLhc0,5689
18
+ nextrec/loss/loss_utils.py,sha256=uZ4m9ChLr-UgIc5Yxm1LjwXDDepApQ-Fas8njweZ9qg,2641
19
+ nextrec/loss/pairwise.py,sha256=MN_3Pk6Nj8KCkmUqGT5cmyx1_nQa3TIx_kxXT_HB58c,3396
20
+ nextrec/loss/pointwise.py,sha256=shgdRJwTV7vAnVxHSffOJU4TPQeKyrwudQ8y-R10nYM,7144
21
+ nextrec/models/generative/__init__.py,sha256=vo8-DloD74cKc1moSH-4GYG99w8Yi8YPGPxh8XDJPoc,50
22
+ nextrec/models/generative/hstu.py,sha256=qTS05XQBjgC5K34A07DSgIITMs1-ADZ8KVb-HEyNh9w,16369
23
+ nextrec/models/generative/tiger.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
24
+ nextrec/models/match/__init__.py,sha256=ASZB5abqKPhDbk8NErNNNa0DHuWpsVxvUtyEn5XMx6Y,215
25
+ nextrec/models/match/dssm.py,sha256=e0hUqNLJVwTRVz4F4EiO8KLOOprKRBDtI4ID6Y1Tc60,8232
26
+ nextrec/models/match/dssm_v2.py,sha256=ywtqTy3YN9ke_7kzcDp7Fhtldw9RJz6yfewxALJb6Z0,7189
27
+ nextrec/models/match/mind.py,sha256=nDzy1owhXtci1_3yWddbnXIc4X5hsg2333uRt1jExZE,14888
28
+ nextrec/models/match/sdm.py,sha256=96yfMQ6arP6JRhAkDTGEjlBiTteznMykrDV_3jqvvVk,10920
29
+ nextrec/models/match/youtube_dnn.py,sha256=pnrz9LYu65Fj4neOriFF45B5k2-yYiiREtQICxxYXZ0,7546
30
+ nextrec/models/multi_task/esmm.py,sha256=27eKtcDV7u-_89_h6aoEmTGhzGwpux3JVeHHbv8aQWE,6443
31
+ nextrec/models/multi_task/mmoe.py,sha256=RDbwr66kO1vlgfREdRhUgsBYkblzJ2a_-p2oayqxRkE,7804
32
+ nextrec/models/multi_task/ple.py,sha256=TCJOlgfetpueJa8LosEttOf43JPXXTsZh8t9PBoP4ek,11950
33
+ nextrec/models/multi_task/poso.py,sha256=FIdbKRfNJJRlUMkSnrIjQkOLvNOT_x03oeUyPWbVh8I,16653
34
+ nextrec/models/multi_task/share_bottom.py,sha256=3oJCQxVL2iIfba4pRiERaxmOp4d4cICtkOxLeoMqfgw,5921
35
+ nextrec/models/ranking/__init__.py,sha256=AY806x-2BtltQdlR4wu23-keL9YUe3An92OJshS4t9Y,472
36
+ nextrec/models/ranking/afm.py,sha256=r9m1nEnc0m5d4pMtOxRMqOaXaBNCEkjJBFB-5wSHeFA,4540
37
+ nextrec/models/ranking/autoint.py,sha256=xKX-w7lkGHkTYgbAB4r-pqOfkOAUia7av4gvT38X6Lk,7772
38
+ nextrec/models/ranking/dcn.py,sha256=30qvToJftZG7UCoS84Lf8GCqipjFmpZWMQgMWSx9cwQ,4897
39
+ nextrec/models/ranking/dcn_v2.py,sha256=ivHwLRxi4VcNzh9DWQQ227Gw5dhyRZ5LezuqkAdD89o,3630
40
+ nextrec/models/ranking/deepfm.py,sha256=oBifQnbwz2OhVG6XWX5k_PyOA-lbFhYdqDEm0XyuEds,4991
41
+ nextrec/models/ranking/dien.py,sha256=mn_po2D1O3zdyvesQo0PXX6s2-TxhlVCxGtYX3jEq8g,12742
42
+ nextrec/models/ranking/din.py,sha256=j5tkT5k91CbsMlMr5vJOySrcY2_rFGxmEgJJ0McW7-Q,7196
43
+ nextrec/models/ranking/fibinet.py,sha256=X6CbQbritvq5jql_Tvs4bn_tRla2zpWPplftZv8k6f0,4853
44
+ nextrec/models/ranking/fm.py,sha256=3Qx_Fgowegr6UPQtEeTmHtOrbWzkvqH94ZTjOqRLu-E,2961
45
+ nextrec/models/ranking/masknet.py,sha256=IE8WZIl7gy282p66qSxaFaWXurPjPaqJh7hCNeOKCoQ,11506
46
+ nextrec/models/ranking/pnn.py,sha256=5RxIKdxD0XcGq-b_QDdwGRwk6b_5BQjyMvCw3Ibv2Kk,4957
47
+ nextrec/models/ranking/widedeep.py,sha256=6lZUDScOGnUJe3j4X0JPh1LSLTBERL54hZrQDysu_oU,5035
48
+ nextrec/models/ranking/xdeepfm.py,sha256=inrJUfmvQAT-EubH9vY_inCirBggB18Kj-Pp8lHB2CA,5682
49
+ nextrec/utils/__init__.py,sha256=A3mH6M-DmDBWQ1stIIaTsNzvUy_AKaUWtRmrzU5R3FE,429
50
+ nextrec/utils/common.py,sha256=YTlJkFCvIH5ExiOvg5pNPdRLUQ-h60BX4xTliaXKDsE,1217
51
+ nextrec/utils/embedding.py,sha256=yxYSdFx0cJITh3Gf-K4SdhwRtKGcI0jOsyBgZ0NLa_c,465
52
+ nextrec/utils/initializer.py,sha256=ffYOs5QuIns_d_-5e40iNtg6s1ftgREJN-ueq_NbDQE,1647
53
+ nextrec/utils/optimizer.py,sha256=EUjAGFPeyou_Cv-_2HRvjzut8y_qpAQudc8L2T0k8zw,2706
54
+ nextrec-0.3.2.dist-info/METADATA,sha256=4RFzGjoOmLQUS1wIyJ6edrJgJNZkH9wcxOrQxSLln4w,16319
55
+ nextrec-0.3.2.dist-info/WHEEL,sha256=WLgqFyCfm_KASv4WHyYy0P3pM_m7J5L9k2skdKLirC8,87
56
+ nextrec-0.3.2.dist-info/licenses/LICENSE,sha256=2fQfVKeafywkni7MYHyClC6RGGC3laLTXCNBx-ubtp0,1064
57
+ nextrec-0.3.2.dist-info/RECORD,,
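Each RECORD line above has the form `path,sha256=<digest>,<size>`, where the digest is the SHA-256 hash encoded as URL-safe base64 with the trailing `=` padding removed. The sketch below shows how one such line could be checked against file contents. The function names are hypothetical, and the parser assumes paths contain no quoted commas. The sample line is the `nextrec/basic/__init__.py` entry, whose size of 0 means it should match the hash of empty bytes.

```python
import base64
import hashlib


def record_digest(data: bytes) -> str:
    """SHA-256 of data, URL-safe base64 encoded, without '=' padding."""
    digest = hashlib.sha256(data).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def check_record_line(line: str, data: bytes) -> bool:
    """Return True if data matches the hash and size fields of a RECORD line.

    Assumes a line with all three fields present; the RECORD file's own
    entry ('...RECORD,,') has empty hash and size and is not handled here.
    """
    _path, hash_field, size = line.rsplit(",", 2)
    algo, _, expected = hash_field.partition("=")
    return (
        algo == "sha256"
        and expected == record_digest(data)
        and int(size) == len(data)
    )


line = "nextrec/basic/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0"
print(check_record_line(line, b""))
```

The same digest and size appear on the `nextrec/models/generative/tiger.py` entry, which indicates that file is empty in this release.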
@@ -1,281 +0,0 @@
1
- Metadata-Version: 2.4
2
- Name: nextrec
3
- Version: 0.2.7
4
- Summary: A comprehensive recommendation library with match, ranking, and multi-task learning models
5
- Project-URL: Homepage, https://github.com/zerolovesea/NextRec
6
- Project-URL: Repository, https://github.com/zerolovesea/NextRec
7
- Project-URL: Documentation, https://github.com/zerolovesea/NextRec/blob/main/README.md
8
- Project-URL: Issues, https://github.com/zerolovesea/NextRec/issues
9
- Author-email: zerolovesea <zyaztec@gmail.com>
10
- License-File: LICENSE
11
- Keywords: ctr,deep-learning,match,pytorch,ranking,recommendation
12
- Classifier: Development Status :: 3 - Alpha
13
- Classifier: Intended Audience :: Developers
14
- Classifier: Intended Audience :: Science/Research
15
- Classifier: License :: OSI Approved :: Apache Software License
16
- Classifier: Programming Language :: Python :: 3
17
- Classifier: Programming Language :: Python :: 3.10
18
- Classifier: Programming Language :: Python :: 3.11
19
- Classifier: Programming Language :: Python :: 3.12
20
- Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
21
- Requires-Python: >=3.10
22
- Requires-Dist: numpy<2.0,>=1.21; sys_platform == 'linux' and python_version < '3.12'
23
- Requires-Dist: numpy<3.0,>=1.26; sys_platform == 'linux' and python_version >= '3.12'
24
- Requires-Dist: numpy>=1.23.0; sys_platform == 'win32'
25
- Requires-Dist: numpy>=1.24.0; sys_platform == 'darwin'
26
- Requires-Dist: pandas<2.0,>=1.5; sys_platform == 'linux' and python_version < '3.12'
27
- Requires-Dist: pandas<2.3.0,>=2.1.0; sys_platform == 'win32'
28
- Requires-Dist: pandas>=2.0.0; sys_platform == 'darwin'
29
- Requires-Dist: pandas>=2.1.0; sys_platform == 'linux' and python_version >= '3.12'
30
- Requires-Dist: pyarrow<13.0.0,>=10.0.0; sys_platform == 'linux' and python_version < '3.12'
31
- Requires-Dist: pyarrow<15.0.0,>=12.0.0; sys_platform == 'win32'
32
- Requires-Dist: pyarrow>=12.0.0; sys_platform == 'darwin'
33
- Requires-Dist: pyarrow>=16.0.0; sys_platform == 'linux' and python_version >= '3.12'
34
- Requires-Dist: scikit-learn<2.0,>=1.2; sys_platform == 'linux' and python_version < '3.12'
35
- Requires-Dist: scikit-learn>=1.3.0; sys_platform == 'darwin'
36
- Requires-Dist: scikit-learn>=1.3.0; sys_platform == 'linux' and python_version >= '3.12'
37
- Requires-Dist: scikit-learn>=1.3.0; sys_platform == 'win32'
38
- Requires-Dist: scipy<1.12,>=1.8; sys_platform == 'linux' and python_version < '3.12'
39
- Requires-Dist: scipy>=1.10.0; sys_platform == 'darwin'
40
- Requires-Dist: scipy>=1.10.0; sys_platform == 'win32'
41
- Requires-Dist: scipy>=1.11.0; sys_platform == 'linux' and python_version >= '3.12'
42
- Requires-Dist: torch>=2.0.0
43
- Requires-Dist: torchvision>=0.15.0
44
- Requires-Dist: tqdm>=4.65.0
45
- Provides-Extra: dev
46
- Requires-Dist: jupyter>=1.0.0; extra == 'dev'
47
- Requires-Dist: matplotlib>=3.7.0; extra == 'dev'
48
- Requires-Dist: pytest-cov>=4.1.0; extra == 'dev'
49
- Requires-Dist: pytest-html>=3.2.0; extra == 'dev'
50
- Requires-Dist: pytest-mock>=3.11.0; extra == 'dev'
51
- Requires-Dist: pytest-timeout>=2.1.0; extra == 'dev'
52
- Requires-Dist: pytest-xdist>=3.3.0; extra == 'dev'
53
- Requires-Dist: pytest>=7.4.0; extra == 'dev'
54
- Requires-Dist: seaborn>=0.12.0; extra == 'dev'
55
- Description-Content-Type: text/markdown
56
-
57
- # NextRec
58
-
59
- <div align="center">
60
-
61
- ![Python](https://img.shields.io/badge/Python-3.10+-blue.svg)
62
- ![PyTorch](https://img.shields.io/badge/PyTorch-1.10+-ee4c2c.svg)
63
- ![License](https://img.shields.io/badge/License-Apache%202.0-green.svg)
64
- ![Version](https://img.shields.io/badge/Version-0.2.7-orange.svg)
65
-
66
- English | [中文版](README_zh.md)
67
-
68
- **A Unified, Efficient, and Scalable Recommendation System Framework**
69
-
70
- </div>
71
-
72
- ## Introduction
73
-
74
- NextRec is a modern recommendation system framework built on PyTorch, providing a unified modeling, training, and evaluation experience for researchers and engineering teams. The framework adopts a modular design with rich built-in model implementations, data-processing tools, and production-ready training components, enabling quick coverage of multiple recommendation scenarios.
75
-
76
- > This project draws on several open-source recommendation libraries, with the general layers referencing the mature implementations in [torch-rechub](https://github.com/datawhalechina/torch-rechub). This part of the code is still at an early stage and is being gradually replaced with our own implementations. If you find any bugs, please report them in the issue tracker. Contributions are welcome.
77
-
78
- ### Key Features
79
-
80
- - **Multi-scenario Recommendation**: Supports ranking (CTR/CVR), retrieval, multi-task learning, and generative recommendation models such as TIGER and HSTU — with more models continuously added.
81
- - **Unified Feature Engineering & Data Pipeline**: Provides Dense/Sparse/Sequence feature definitions, persistent DataProcessor, and optimized RecDataLoader, forming a complete “Define → Process → Load” workflow.
82
- - **Efficient Training & Evaluation**: A standardized training engine with optimizers, LR schedulers, early stopping, checkpoints, and logging — ready out-of-the-box.
83
- - **Developer-friendly Engineering Experience**: Modular and extensible design, full tutorial support, GPU/MPS acceleration, and visualization tools.
84
-
85
- ---
86
-
87
- ## Installation
88
-
89
- ```bash
90
- # release version
91
- pip install nextrec
92
-
93
- # pre-release version
94
- pip install -i https://test.pypi.org/simple/ nextrec
95
- ```
96
- ---
97
-
98
- ## 5-Minute Quick Start
99
-
100
- The following example demonstrates a full DeepFM training & inference pipeline using the MovieLens dataset:
101
-
102
- ```python
103
- import pandas as pd
104
-
105
- from nextrec.models.ranking.deepfm import DeepFM
106
- from nextrec.basic.features import DenseFeature, SparseFeature, SequenceFeature
107
-
108
- df = pd.read_csv("dataset/movielens_100k.csv")
109
-
110
- target = 'label'
111
- dense_features = [DenseFeature('age')]
112
- sparse_features = [
113
- SparseFeature('user_id', vocab_size=df['user_id'].max()+1, embedding_dim=4),
114
- SparseFeature('item_id', vocab_size=df['item_id'].max()+1, embedding_dim=4),
115
- ]
116
-
117
- sparse_features.append(SparseFeature('gender', vocab_size=df['gender'].max()+1, embedding_dim=4))
118
- sparse_features.append(SparseFeature('occupation', vocab_size=df['occupation'].max()+1, embedding_dim=4))
119
-
120
- model = DeepFM(
121
- dense_features=dense_features,
122
- sparse_features=sparse_features,
123
- mlp_params={"dims": [256, 128], "activation": "relu", "dropout": 0.5},
124
- target=target,
125
- device='cpu',
126
- session_id="deepfm_with_processor",
127
- embedding_l1_reg=1e-6,
128
- dense_l1_reg=1e-5,
129
- embedding_l2_reg=1e-5,
130
- dense_l2_reg=1e-4,
131
- )
132
-
133
- model.compile(optimizer="adam", optimizer_params={"lr": 1e-3, "weight_decay": 1e-5}, loss="bce")
134
- model.fit(train_data=df, metrics=['auc', 'recall', 'precision'], epochs=10, batch_size=512, shuffle=True, verbose=1)
135
- preds = model.predict(df)
136
- print(f'preds: {preds}')
137
- ```
138
-
- ### More Tutorials
-
- The `tutorials/` directory provides examples for ranking, retrieval, multi-task learning, and data processing:
-
- - `movielen_match_dssm.py` — DSSM retrieval on MovieLens 100k
- - `movielen_ranking_deepfm.py` — DeepFM ranking on MovieLens 100k
- - `example_ranking_din.py` — DIN (Deep Interest Network) example
- - `example_match_dssm.py` — DSSM retrieval example
- - `example_multitask.py` — Multi-task learning example
-
- ---
-
- ## Data Processing Example
-
- NextRec offers a unified interface for preprocessing sparse and sequence features:
-
- ```python
- import pandas as pd
- from nextrec.data.preprocessor import DataProcessor
-
- df = pd.read_csv("dataset/movielens_100k.csv")
-
- processor = DataProcessor()
- processor.add_sparse_feature('movie_title', encode_method='hash', hash_size=1000)
- processor.fit(df)
-
- df = processor.transform(df, return_dict=False)
-
- print("\nSample training data:")
- print(df.head())
- ```
-
- ---
-
- ## Supported Models
-
- ### Ranking Models
-
- | Model | Paper | Year | Status |
- |-------|-------|------|--------|
- | **FM** | Factorization Machines | ICDM 2010 | Supported |
- | **AFM** | Attentional Factorization Machines: Learning the Weight of Feature Interactions via Attention Networks | IJCAI 2017 | Supported |
- | **DeepFM** | DeepFM: A Factorization-Machine based Neural Network for CTR Prediction | IJCAI 2017 | Supported |
- | **Wide&Deep** | Wide & Deep Learning for Recommender Systems | DLRS 2016 | Supported |
- | **xDeepFM** | xDeepFM: Combining Explicit and Implicit Feature Interactions | KDD 2018 | Supported |
- | **FiBiNET** | FiBiNET: Combining Feature Importance and Bilinear Feature Interaction for CTR Prediction | RecSys 2019 | Supported |
- | **PNN** | Product-based Neural Networks for User Response Prediction | ICDM 2016 | Supported |
- | **AutoInt** | AutoInt: Automatic Feature Interaction Learning | CIKM 2019 | Supported |
- | **DCN** | Deep & Cross Network for Ad Click Predictions | ADKDD 2017 | Supported |
- | **DIN** | Deep Interest Network for CTR Prediction | KDD 2018 | Supported |
- | **DIEN** | Deep Interest Evolution Network | AAAI 2019 | Supported |
- | **MaskNet** | MaskNet: Feature-wise Gating Blocks for High-dimensional Sparse Recommendation Data | 2020 | Supported |
-
- ### Retrieval Models
-
- | Model | Paper | Year | Status |
- |-------|-------|------|--------|
- | **DSSM** | Learning Deep Structured Semantic Models | CIKM 2013 | Supported |
- | **DSSM v2** | DSSM with pairwise BPR-style optimization | - | Supported |
- | **YouTube DNN** | Deep Neural Networks for YouTube Recommendations | RecSys 2016 | Supported |
- | **MIND** | Multi-Interest Network with Dynamic Routing | CIKM 2019 | Supported |
- | **SDM** | Sequential Deep Matching Model | - | Supported |
-
- ### Multi-task Models
-
- | Model | Paper | Year | Status |
- |-------|-------|------|--------|
- | **MMOE** | Modeling Task Relationships in Multi-task Learning | KDD 2018 | Supported |
- | **PLE** | Progressive Layered Extraction | RecSys 2020 | Supported |
- | **ESMM** | Entire Space Multi-task Model | SIGIR 2018 | Supported |
- | **ShareBottom** | Multitask Learning | - | Supported |
-
- ### Generative Models
-
- | Model | Paper | Year | Status |
- |-------|-------|------|--------|
- | **TIGER** | Recommender Systems with Generative Retrieval | NeurIPS 2023 | In Progress |
- | **HSTU** | Hierarchical Sequential Transduction Units | - | In Progress |
-
- ---
-
- ## Contributing
-
- We welcome contributions of any form!
-
- ### How to Contribute
-
- 1. Fork the repository
- 2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
- 3. Commit your changes (`git commit -m 'Add AmazingFeature'`)
- 4. Push your branch (`git push origin feature/AmazingFeature`)
- 5. Open a Pull Request
-
- > Before submitting a PR, please run tests using `pytest test/ -v` or `python -m pytest` to ensure everything passes.
-
- ### Code Style
-
- - Follow PEP8
- - Provide unit tests for new functionality
- - Update documentation accordingly
-
- ### Reporting Issues
-
- When submitting issues on GitHub, please include:
-
- - Description of the problem
- - Reproduction steps
- - Expected behavior
- - Actual behavior
- - Environment info (Python version, PyTorch version, etc.)
-
- ---
-
- ## License
-
- This project is licensed under the [Apache 2.0 License](./LICENSE).
-
- ---
-
- ## Contact
-
- - **GitHub Issues**: Submit issues on GitHub
- - **Email**: zyaztec@gmail.com
-
- ---
-
- ## Acknowledgements
-
- NextRec is inspired by the following great open-source projects:
-
- - **torch-rechub** - A Lighting Pytorch Framework for Recommendation Models, Easy-to-use and Easy-to-extend.
- - **FuxiCTR** — Configurable and reproducible CTR prediction library
- - **RecBole** — Unified and efficient recommendation library
-
- Special thanks to all open-source contributors!
-
- ---
-
- <div align="center">
-
- **[Back to Top](#nextrec)**
-
- </div>