pyg-nightly 2.7.0.dev20250919__py3-none-any.whl → 2.7.0.dev20250921__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Potentially problematic release: this version of pyg-nightly might be problematic.

@@ -1,15 +1,14 @@
 Metadata-Version: 2.4
 Name: pyg-nightly
-Version: 2.7.0.dev20250919
+Version: 2.7.0.dev20250921
 Summary: Graph Neural Network Library for PyTorch
 Keywords: deep-learning,pytorch,geometric-deep-learning,graph-neural-networks,graph-convolutional-networks
 Author-email: Matthias Fey <matthias@pyg.org>
-Requires-Python: >=3.9
+Requires-Python: >=3.10
 Description-Content-Type: text/markdown
 License-Expression: MIT
 Classifier: Development Status :: 5 - Production/Stable
 Classifier: Programming Language :: Python
-Classifier: Programming Language :: Python :: 3.9
 Classifier: Programming Language :: Python :: 3.10
 Classifier: Programming Language :: Python :: 3.11
 Classifier: Programming Language :: Python :: 3.12
@@ -45,7 +44,6 @@ Requires-Dist: networkx ; extra == "full"
 Requires-Dist: numba<0.60.0 ; extra == "full"
 Requires-Dist: opt_einsum ; extra == "full"
 Requires-Dist: pandas ; extra == "full"
-Requires-Dist: pgmpy ; extra == "full"
 Requires-Dist: pynndescent ; extra == "full"
 Requires-Dist: pytorch-memlab ; extra == "full"
 Requires-Dist: rdflib ; extra == "full"
@@ -91,14 +89,21 @@ Provides-Extra: test

 ______________________________________________________________________

+<div align="center">
+
 [![PyPI Version][pypi-image]][pypi-url]
-[![Testing Status][testing-image]][testing-url]
-[![Linting Status][linting-image]][linting-url]
-[![Docs Status][docs-image]][docs-url]
-[![Contributing][contributing-image]][contributing-url]
+[![PyPI Download][pypi-download-image]][pypi-download-url]
 [![Slack][slack-image]][slack-url]
+[![Contributing][contributing-image]][contributing-url]
+
+**[Documentation](https://pytorch-geometric.readthedocs.io)** |
+**[PyG 1.0 Paper](https://arxiv.org/abs/1903.02428)** |
+**[PyG 2.0 Paper](https://arxiv.org/abs/2507.16991)** |
+**[Colab Notebooks](https://pytorch-geometric.readthedocs.io/en/latest/get_started/colabs.html)** |
+**[External Resources](https://pytorch-geometric.readthedocs.io/en/latest/external/resources.html)** |
+**[OGB Examples](https://github.com/snap-stanford/ogb/tree/master/examples)**

-**[Documentation](https://pytorch-geometric.readthedocs.io)** | **[PyG 1.0 Paper](https://arxiv.org/abs/1903.02428)** | **[PyG 2.0 Paper](https://arxiv.org/abs/2507.16991)** | **[Colab Notebooks](https://pytorch-geometric.readthedocs.io/en/latest/get_started/colabs.html)** | **[External Resources](https://pytorch-geometric.readthedocs.io/en/latest/external/resources.html)** | **[OGB Examples](https://github.com/snap-stanford/ogb/tree/master/examples)**
+</div>

 **PyG** *(PyTorch Geometric)* is a library built upon [PyTorch](https://pytorch.org/) to easily write and train Graph Neural Networks (GNNs) for a wide range of applications related to structured data.

@@ -421,7 +426,7 @@ These approaches have been implemented in PyG, and can benefit from the above GN

 ## Installation

-PyG is available for Python 3.9 to Python 3.13.
+PyG is available for Python 3.10 to Python 3.13.

 From **PyG 2.3** onwards, you can install and use PyG **without any external library** required except for PyTorch.
 For this, simply run
@@ -546,16 +551,12 @@ If you notice anything unexpected, please open an [issue](https://github.com/pyg
 If you have any questions or are missing a specific feature, feel free [to discuss them with us](https://github.com/pyg-team/pytorch_geometric/discussions).
 We are motivated to constantly make PyG even better.

-[contributing-image]: https://img.shields.io/badge/contributions-welcome-brightgreen.svg?style=flat
+[contributing-image]: https://img.shields.io/badge/contributions-welcome-brightgreen.svg?style=flat&color=4B26A4
 [contributing-url]: https://github.com/pyg-team/pytorch_geometric/blob/master/.github/CONTRIBUTING.md
-[docs-image]: https://readthedocs.org/projects/pytorch-geometric/badge/?version=latest
-[docs-url]: https://pytorch-geometric.readthedocs.io/en/latest
-[linting-image]: https://github.com/pyg-team/pytorch_geometric/actions/workflows/linting.yml/badge.svg
-[linting-url]: https://github.com/pyg-team/pytorch_geometric/actions/workflows/linting.yml
-[pypi-image]: https://badge.fury.io/py/torch-geometric.svg
+[pypi-download-image]: https://img.shields.io/pypi/dm/torch_geometric?color=4B26A4
+[pypi-download-url]: https://pepy.tech/projects/torch_geometric
+[pypi-image]: https://img.shields.io/pypi/pyversions/torch-geometric?color=4B26A4
 [pypi-url]: https://pypi.python.org/pypi/torch-geometric
-[slack-image]: https://img.shields.io/badge/slack-pyg-brightgreen
+[slack-image]: https://img.shields.io/badge/slack-join-white.svg?logo=slack&color=4B26A4
 [slack-url]: https://data.pyg.org/slack.html
-[testing-image]: https://github.com/pyg-team/pytorch_geometric/actions/workflows/testing.yml/badge.svg
-[testing-url]: https://github.com/pyg-team/pytorch_geometric/actions/workflows/testing.yml

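The metadata hunks above raise the Python floor from 3.9 to 3.10, and the README diff states support up to Python 3.13. A minimal runtime guard that mirrors those bounds — the helper name is illustrative, not part of the package:

```python
import sys

def check_supported(version_info=sys.version_info) -> bool:
    # Mirrors `Requires-Python: >=3.10` from the new METADATA and the
    # "Python 3.10 to Python 3.13" range stated in the README diff.
    return (3, 10) <= tuple(version_info[:2]) <= (3, 13)

print(check_supported((3, 9, 18)))   # 3.9 interpreters fall outside the range
print(check_supported((3, 12, 1)))
```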
@@ -1,9 +1,9 @@
-torch_geometric/__init__.py,sha256=OLGPhTHC1wmAq6rg69s1sbJUFuptQaVPvG0ggAYNOlM,2292
+torch_geometric/__init__.py,sha256=pOD9vZjRGqoUzFw5_-UNQVsobbiDjoaicP9Jw5mj9jE,2292
 torch_geometric/_compile.py,sha256=9yqMTBKatZPr40WavJz9FjNi7pQj8YZAZOyZmmRGXgc,1351
 torch_geometric/_onnx.py,sha256=ODB_8cwFUiwBUjngXn6-K5HHb7IDul7DDXuuGX7vj_0,8178
 torch_geometric/backend.py,sha256=lVaf7aLoVaB3M-UcByUJ1G4T4FOK6LXAg0CF4W3E8jo,1575
 torch_geometric/config_mixin.py,sha256=hOTJu5LLVrEAZ6Pjt4ScLDLKv9aHbfAzF_3ufwKgO4I,4301
-torch_geometric/config_store.py,sha256=zdMzlgBpUmBkPovpYQh5fMNwTZLDq2OneqX47QEx7zk,16818
+torch_geometric/config_store.py,sha256=Lj5pY_fFamF6pr5XhaKcYXZ3ecUTFNHWEprnBwaNbso,16801
 torch_geometric/debug.py,sha256=cLyH9OaL2v7POyW-80b19w-ctA7a_5EZsS4aUF1wc2U,1295
 torch_geometric/deprecation.py,sha256=gN65uX23c3miRPOpQzxcRS_QDUpD3B-qVKD6y6GX8Yw,872
 torch_geometric/device.py,sha256=tU5-_lBNVbVHl_kUmWPwiG5mQ1pyapwMF4JkmtNN3MM,1224
@@ -19,7 +19,7 @@ torch_geometric/logging.py,sha256=HmHHLiCcM64k-6UYNOSfXPIeSGNAyiGGcn8cD8tlyuQ,85
 torch_geometric/resolver.py,sha256=fn-_6mCpI2xv7eDZnIFcYrHOn0IrwbkWFLDb9laQrWI,1270
 torch_geometric/seed.py,sha256=MJLbVwpb9i8mK3oi32sS__Cq-dRq_afTeoOL_HoA9ko,372
 torch_geometric/template.py,sha256=rqjDWgcSAgTCiV4bkOjWRPaO4PpUdC_RXigzxxBqAu8,1060
-torch_geometric/typing.py,sha256=PenHvJZ2ZQ0V6t5hYypcZq8mQ47upgSlVTqHPy6QvLc,15704
+torch_geometric/typing.py,sha256=Kj0R-aw81aY6JE1pbsG_7Jf830tect31uzibDFsjSPc,15684
 torch_geometric/warnings.py,sha256=SB9dWGovX_KKcxqsOrdTDvSb_j0NoB5vPGnK2vg0jVw,727
 torch_geometric/contrib/__init__.py,sha256=To_lDofgM7sogKEYOICIXei7Njuk-Vkfm-OFhPIdaLo,366
 torch_geometric/contrib/datasets/__init__.py,sha256=lrGnWsEiJf5zsBRmshGZZFN_uYR2ezDjbj9n9nCpvtk,23
@@ -36,7 +36,7 @@ torch_geometric/data/collate.py,sha256=tOUvttXoEo-bOvJx_qMivJq2JqOsB9iDdjovtiyys
 torch_geometric/data/data.py,sha256=-E6el1knNgSJyapV8KUk2aRRHOfvwEvjUFfe_BapLfc,47490
 torch_geometric/data/database.py,sha256=K3KLefYVfsBN9HRItgFZNkbUIllfDt4ueauBFxk3Rxk,23106
 torch_geometric/data/datapipes.py,sha256=9_Cq3j_7LIF4plQFzbLaqyy0LcpKdAic6yiKgMqSX9A,3083
-torch_geometric/data/dataset.py,sha256=AaJH0N9eZgvxX0ljyTH8cXutKJ0AGFAyE-H4Sw9D51w,16834
+torch_geometric/data/dataset.py,sha256=-B9lXmpJ5-yhLZ-38PzKaaDeFZ1y-hfsTn-bVrnzrYA,16886
 torch_geometric/data/download.py,sha256=kcesTu6jlgmCeePpOxDQOnVhxB_GuZ9iu9ds72KEORc,1889
 torch_geometric/data/extract.py,sha256=DMG8_6ps4O5xKfkb7j1gUBX_jlWpFdmz6OLY2jBSEx4,2339
 torch_geometric/data/feature_store.py,sha256=pl2pJL25wqzEZnNZbW8c8Ee_yH0DnE2AK8TioTWZV-g,20045
@@ -118,7 +118,7 @@ torch_geometric/datasets/medshapenet.py,sha256=eCBCXKpueweCwDSf_Q4_MwVA3IbJd04FS
 torch_geometric/datasets/mixhop_synthetic_dataset.py,sha256=4NNvTHUvvV6pcqQCyVDS5XhppXUeF2H9GTfFoc49eyU,3951
 torch_geometric/datasets/mnist_superpixels.py,sha256=o2ArbZ0_OE0u8VCaHmWwvngESlOFr9oM9dSEP_tjAS4,3340
 torch_geometric/datasets/modelnet.py,sha256=rqR-e75lC8PS_IX7VlNbo2Az9IWfqMNvDp8rmQCp-LE,5357
-torch_geometric/datasets/molecule_gpt_dataset.py,sha256=d6V8_Qy1Y2nt9hFhn7Re1omFw5Tf_uhv13QM0Vg76eg,19091
+torch_geometric/datasets/molecule_gpt_dataset.py,sha256=DUzX1h7a0qsIhUVzLyNb5qarxZYFrxBr4bsQJ7vsJrk,19096
 torch_geometric/datasets/molecule_net.py,sha256=pMzaJzd-LbBncZ0VoC87HfA8d1F4NwCWTb5YKvLM890,7404
 torch_geometric/datasets/movie_lens.py,sha256=M4Bu0Xus8IkW8GYzjxPxSdPXNbcCCx9cu6cncxBvLx8,4033
 torch_geometric/datasets/movie_lens_100k.py,sha256=eTpBAteM3jqTEtiwLxmhVj4r8JvftvPx8Hvs-3ZIHlU,6057
@@ -151,7 +151,7 @@ torch_geometric/datasets/shapenet.py,sha256=tn3HiQQAr6lxHrqxfOVaAtl40guwFYTXWCbS
 torch_geometric/datasets/shrec2016.py,sha256=cTLhctbqE0EUEvKddJFhPzDb1oLKXOth4O_WzsWtyMk,6323
 torch_geometric/datasets/snap_dataset.py,sha256=deJvB6cpIQ3bu_pcWoqgEo1-Kl_NcFi7ZSUci645X0U,9481
 torch_geometric/datasets/suite_sparse.py,sha256=eqjH4vAUq872qdk3YdLkZSwlu6r7HHpTgK0vEVGmY1s,3278
-torch_geometric/datasets/tag_dataset.py,sha256=qTnwr2N1tbWYeLGbItfv70UxQ3n1rKesjeVU3kcOCP8,14757
+torch_geometric/datasets/tag_dataset.py,sha256=jslijGCh37ip2YkrQLyvbk-1QRJ3yqFpmzuQSxckXrE,19402
 torch_geometric/datasets/taobao.py,sha256=CUcZpbWsNTasevflO8zqP0YvENy89P7wpKS4MHaDJ6Q,4170
 torch_geometric/datasets/teeth3ds.py,sha256=hZvhcq9lsQENNFr5hk50w2T3CgxE_tlnQfrCgN6uIDQ,9919
 torch_geometric/datasets/tosca.py,sha256=nUSF8NQT1GlkwWQLshjWmr8xORsvRHzzIqhUyDCvABc,4632
@@ -181,7 +181,7 @@ torch_geometric/datasets/motif_generator/grid.py,sha256=Pcv3r80nfzqpBvRTAJT_J0Dp
 torch_geometric/datasets/motif_generator/house.py,sha256=C_E2EgeqXEB9CHKRd9V5Jji4lFPpJb_3c41vYSk9Gcs,814
 torch_geometric/datasets/utils/__init__.py,sha256=At_dId4MdpzAkDS_7Mc6I7XlkThbL0AbVzHC_92lcjA,182
 torch_geometric/datasets/utils/cheatsheet.py,sha256=M55Bj64cjMVqDNoIq1shUVeU2ngoxpEjhdtyqw7Sd_k,1835
-torch_geometric/distributed/__init__.py,sha256=NNCGXbDTAW5xoJgSr-PK0VYEnT8UCI7SoZXc16fjuxQ,589
+torch_geometric/distributed/__init__.py,sha256=FUsRoS28c7XfnwlA9yF4m2xZWos5TdpQSuY3Bm48vz8,1108
 torch_geometric/distributed/dist_context.py,sha256=n34e2HU-TxmK6DrOpb5lWZu_xg1To1IFrXH4ueF_Jhg,418
 torch_geometric/distributed/dist_link_neighbor_loader.py,sha256=wM9heZmStrPSW7eo9qWusKdI_lVkDkLlda8ILBqC2c8,4933
 torch_geometric/distributed/dist_loader.py,sha256=Gjvl5Ck8YrFN6YmCWEFWVqLEwI1hog-rWj2Sk_zqYC0,6504
@@ -308,10 +308,10 @@ torch_geometric/loader/prefetch.py,sha256=z30TIcu3_6ZubllUOwNLunlq4RyQdFj36vPE5Q
 torch_geometric/loader/random_node_loader.py,sha256=rCmRXYv70SPxBo-Oh049eFEWEZDV7FmlRPzmjcoirXQ,2196
 torch_geometric/loader/shadow.py,sha256=_hCspYf9SlJYX0lqEjxFec9e9t1iMScNThOoWR1wQGM,4173
 torch_geometric/loader/temporal_dataloader.py,sha256=Z7L_rYdl6SYBQXAgtr18FVcmfMH9kP1fBWrc2W63g2c,2250
-torch_geometric/loader/utils.py,sha256=3hzKzIgB52QIZu7Jdn4JeXZaegIJinIQfIUP9DrUWUQ,14903
+torch_geometric/loader/utils.py,sha256=DgGHK6kNu7ZZIZuaT0Ya_4rUctBMMKyBBSdHhuU389w,14903
 torch_geometric/loader/zip_loader.py,sha256=3lt10fD15Rxm1WhWzypswGzCEwUz4h8OLCD1nE15yNg,3843
 torch_geometric/metrics/__init__.py,sha256=3krvDobW6vV5yHTjq2S2pmOXxNfysNG26muq7z48e94,699
-torch_geometric/metrics/link_pred.py,sha256=1_hE3KiRqAdZLI6QuUbjgyFC__mTyFu_RimM3bD8wRw,31678
+torch_geometric/metrics/link_pred.py,sha256=bacmFGn7rm0iF2wOJdAW-iTZ04bOuiS-7ur2K-MZKlA,31684
 torch_geometric/nn/__init__.py,sha256=tTEKDy4vpjPNKyG1Vg9GIx7dVFJuQtBoh2M19ascGpo,880
 torch_geometric/nn/data_parallel.py,sha256=YiybTWoSFyfSzlXAamZ_-y1f7B6tvDEFHOuy_AyJz9Q,4761
 torch_geometric/nn/encoding.py,sha256=82fpwyOx0-STFSAJ5AzG0p2WFC9u1M4KgmKIql8hSLc,3634
@@ -453,7 +453,7 @@ torch_geometric/nn/models/__init__.py,sha256=7eRlAR93pltDfLEcsTJaaK67mVFezAIjg0V
 torch_geometric/nn/models/attentive_fp.py,sha256=1z3iTV2O5W9tqHFAdno8FeBFeXmuG-TDZk4lwwVh3Ac,6634
 torch_geometric/nn/models/attract_repel.py,sha256=h9OyogT0NY0xiT0DkpJHMxH6ZUmo8R-CmwZdKEwq8Ek,5277
 torch_geometric/nn/models/autoencoder.py,sha256=nGje-zty78Y3hxOJ9o0_6QziJjOvBlknk6z0_fDQwQU,10770
-torch_geometric/nn/models/basic_gnn.py,sha256=PGa0RUMyvrNy_5yRI2jX_zwPsmZXwOQWfsWvxOiHsSk,31225
+torch_geometric/nn/models/basic_gnn.py,sha256=tp7qbHKn_uO1CBaEiW79zaBDAD-fR88E8ffJpdDYr9w,31261
 torch_geometric/nn/models/captum.py,sha256=vPN85_HDMTNcw-rKXAtYY-vT2SbHdf4CFtkseqYsnHg,3972
 torch_geometric/nn/models/correct_and_smooth.py,sha256=wmq-US2r4ocd0a661R8YeDiBeVtILOjdN-4swIth9BQ,6827
 torch_geometric/nn/models/deep_graph_infomax.py,sha256=yXSZ4mCrq4Dcvl1muzkxEWH4Lo525J4cYuAXpGs55IY,4137
@@ -654,7 +654,7 @@ torch_geometric/utils/undirected.py,sha256=H_nfpI0_WluOG6VfjPyldvcjL4w5USAKWu2x5
 torch_geometric/visualization/__init__.py,sha256=b-HnVesXjyJ_L1N-DnjiRiRVf7lhwKaBQF_2i5YMVSU,208
 torch_geometric/visualization/graph.py,sha256=mfZHXYfiU-CWMtfawYc80IxVwVmtK9hbIkSKhM_j7oI,14311
 torch_geometric/visualization/influence.py,sha256=CWMvuNA_Nf1sfbJmQgn58yS4OFpeKXeZPe7kEuvkUBw,477
-pyg_nightly-2.7.0.dev20250919.dist-info/licenses/LICENSE,sha256=ic-27cMJc1kWoMEYncz3Ya3Ur2Bi3bNLWib2DT763-o,1067
-pyg_nightly-2.7.0.dev20250919.dist-info/WHEEL,sha256=G2gURzTEtmeR8nrdXUJfNiB3VYVxigPQ-bEQujpNiNs,82
-pyg_nightly-2.7.0.dev20250919.dist-info/METADATA,sha256=IfaNYkgI-HE5ar5wS1k5XG9esWAh643uI1uvCOX7ChY,64145
-pyg_nightly-2.7.0.dev20250919.dist-info/RECORD,,
+pyg_nightly-2.7.0.dev20250921.dist-info/licenses/LICENSE,sha256=ic-27cMJc1kWoMEYncz3Ya3Ur2Bi3bNLWib2DT763-o,1067
+pyg_nightly-2.7.0.dev20250921.dist-info/WHEEL,sha256=G2gURzTEtmeR8nrdXUJfNiB3VYVxigPQ-bEQujpNiNs,82
+pyg_nightly-2.7.0.dev20250921.dist-info/METADATA,sha256=A3uqoYW6Tuh_CSAyaqwO_WS0CY4l-r6_kQZAu_PYo_U,63680
+pyg_nightly-2.7.0.dev20250921.dist-info/RECORD,,
@@ -31,7 +31,7 @@ from .lazy_loader import LazyLoader
 contrib = LazyLoader('contrib', globals(), 'torch_geometric.contrib')
 graphgym = LazyLoader('graphgym', globals(), 'torch_geometric.graphgym')

-__version__ = '2.7.0.dev20250919'
+__version__ = '2.7.0.dev20250921'

 __all__ = [
     'Index',
@@ -168,7 +168,7 @@ def map_annotation(
    assert origin is not None
    args = tuple(map_annotation(a, mapping) for a in args)
    if type(annotation).__name__ == 'GenericAlias':
-       # If annotated with `list[...]` or `dict[...]` (>= Python 3.10):
+       # If annotated with `list[...]` or `dict[...]`:
        annotation = origin[args]
    else:
        # If annotated with `typing.List[...]` or `typing.Dict[...]`:
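For context, the branch touched above rebuilds a builtin generic alias from its origin and (possibly remapped) arguments via `origin[args]`. A standalone sketch of that mechanism — not PyG's actual `map_annotation`:

```python
import typing

def rebuild(annotation):
    """Rebuild a builtin generic alias such as `list[int]` from its
    origin and args, mirroring the diff's `origin[args]` step."""
    origin = typing.get_origin(annotation)  # e.g. list, dict
    args = typing.get_args(annotation)      # e.g. (int,) or (str, int)
    # Subscripting with a tuple restores the original alias:
    return origin[args]

print(rebuild(list[int]))        # list[int]
print(rebuild(dict[str, int]))   # dict[str, int]
```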
@@ -236,8 +236,8 @@ class Dataset(torch.utils.data.Dataset):

    def _process(self):
        f = osp.join(self.processed_dir, 'pre_transform.pt')
-       if osp.exists(f) and torch.load(f, weights_only=False) != _repr(
-               self.pre_transform):
+       if not self.force_reload and osp.exists(f) and torch.load(
+               f, weights_only=False) != _repr(self.pre_transform):
            warnings.warn(
                "The `pre_transform` argument differs from the one used in "
                "the pre-processed version of this dataset. If you want to "
@@ -246,8 +246,8 @@ class Dataset(torch.utils.data.Dataset):
                stacklevel=2)

        f = osp.join(self.processed_dir, 'pre_filter.pt')
-       if osp.exists(f) and torch.load(f, weights_only=False) != _repr(
-               self.pre_filter):
+       if not self.force_reload and osp.exists(f) and torch.load(
+               f, weights_only=False) != _repr(self.pre_filter):
            warnings.warn(
                "The `pre_filter` argument differs from the one used in "
                "the pre-processed version of this dataset. If you want to "
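The two hunks above add a `not self.force_reload` short-circuit, so a forced reprocess no longer warns about a changed `pre_transform`/`pre_filter`. A stripped-down sketch of the guard, with file existence and `torch.load` replaced by plain values (names hypothetical):

```python
import warnings

def maybe_warn(force_reload: bool, cached_repr, current_repr) -> bool:
    """Return True iff the mismatch warning fires, mirroring
    `not self.force_reload and osp.exists(f) and loaded != _repr(...)`."""
    cache_exists = cached_repr is not None  # stands in for osp.exists(f)
    if not force_reload and cache_exists and cached_repr != current_repr:
        warnings.warn(
            "The `pre_transform` argument differs from the cached one",
            stacklevel=2)
        return True
    return False

print(maybe_warn(False, 'NormalizeFeatures()', 'ToUndirected()'))  # warns
print(maybe_warn(True, 'NormalizeFeatures()', 'ToUndirected()'))   # skipped
```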
@@ -174,7 +174,7 @@ def extract_name(
 class MoleculeGPTDataset(InMemoryDataset):
    r"""The dataset from the `"MoleculeGPT: Instruction Following Large
    Language Models for Molecular Property Prediction"
-   <https://ai4d3.github.io/papers/34.pdf>`_ paper.
+   <https://ai4d3.github.io/2023/papers/34.pdf>`_ paper.

    Args:
        root (str): Root directory where the dataset should be saved.
@@ -1,3 +1,4 @@
+import csv
 import os
 import os.path as osp
 from collections.abc import Sequence
@@ -10,6 +11,7 @@ from tqdm import tqdm

 from torch_geometric.data import InMemoryDataset, download_google_url
 from torch_geometric.data.data import BaseData
+from torch_geometric.io import fs

 try:
     from pandas import DataFrame, read_csv
@@ -22,14 +24,16 @@ IndexType = Union[slice, Tensor, np.ndarray, Sequence]

 class TAGDataset(InMemoryDataset):
     r"""The Text Attributed Graph datasets from the
-    `"Learning on Large-scale Text-attributed Graphs via Variational Inference
-    " <https://arxiv.org/abs/2210.14709>`_ paper.
+    `"Learning on Large-scale Text-attributed Graphs via Variational Inference"
+    <https://arxiv.org/abs/2210.14709>`_ paper and `"Harnessing Explanations:
+    LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation
+    Learning" <https://arxiv.org/abs/2305.19523>`_ paper.
     This dataset is aiming on transform `ogbn products`, `ogbn arxiv`
     into Text Attributed Graph that each node in graph is associate with a
-    raw text, that dataset can be adapt to DataLoader (for LM training) and
-    NeighborLoader(for GNN training). In addition, this class can be use as a
-    wrapper class by convert a InMemoryDataset with Tokenizer and text into
-    Text Attributed Graph.
+    raw text, LLM prediction and explanation, that dataset can be adapt to
+    DataLoader (for LM training) and NeighborLoader(for GNN training).
+    In addition, this class can be use as a wrapper class by convert a
+    InMemoryDataset with Tokenizer and text into Text Attributed Graph.

     Args:
         root (str): Root directory where the dataset should be saved.
@@ -51,22 +55,35 @@ class TAGDataset(InMemoryDataset):
            or not, default: False
        force_reload (bool): default: False
    .. note::
-       See `example/llm_plus_gnn/glem.py` for example usage
+       See `example/llm/glem.py` for example usage
    """
    raw_text_id = {
        'ogbn-arxiv': '1g3OOVhRyiyKv13LY6gbp8GLITocOUr_3',
        'ogbn-products': '1I-S176-W4Bm1iPDjQv3hYwQBtxE0v8mt'
    }

-   def __init__(self, root: str, dataset: InMemoryDataset,
-                tokenizer_name: str, text: Optional[List[str]] = None,
-                split_idx: Optional[Dict[str, Tensor]] = None,
-                tokenize_batch_size: int = 256, token_on_disk: bool = False,
-                text_on_disk: bool = False,
-                force_reload: bool = False) -> None:
+   llm_prediction_url = 'https://github.com/XiaoxinHe/TAPE/raw/main/gpt_preds'
+
+   llm_explanation_id = {
+       'ogbn-arxiv': '1o8n2xRen-N_elF9NQpIca0iCHJgEJbRQ',
+   }
+
+   def __init__(
+       self,
+       root: str,
+       dataset: InMemoryDataset,
+       tokenizer_name: str,
+       text: Optional[List[str]] = None,
+       split_idx: Optional[Dict[str, Tensor]] = None,
+       tokenize_batch_size: int = 256,
+       token_on_disk: bool = False,
+       text_on_disk: bool = False,
+       force_reload: bool = False,
+   ) -> None:
        # list the vars you want to pass in before run download & process
        self.name = dataset.name
        self.text = text
+       self.llm_prediction_topk = 5
        self.tokenizer_name = tokenizer_name
        from transformers import AutoTokenizer
        self.tokenizer = AutoTokenizer.from_pretrained(tokenizer_name)
@@ -93,8 +110,9 @@ class TAGDataset(InMemoryDataset):
                "is_gold mask, please pass splited index "
                "in format of dictionaty with 'train', 'valid' "
                "'test' index tensor to 'split_idx'")
-       if text is not None and text_on_disk:
-           self.save_node_text(text)
+       if text_on_disk:
+           if text is not None:
+               self.save_node_text(text)
        self.text_on_disk = text_on_disk
        # init will call download and process
        super().__init__(self.root, transform=None, pre_transform=None,
@@ -119,6 +137,10 @@ class TAGDataset(InMemoryDataset):
        self.token_on_disk = token_on_disk
        self.tokenize_batch_size = tokenize_batch_size
        self._token = self.tokenize_graph(self.tokenize_batch_size)
+       self._llm_explanation_token = self.tokenize_graph(
+           self.tokenize_batch_size, text_type='llm_explanation')
+       self._all_token = self.tokenize_graph(self.tokenize_batch_size,
+                                             text_type='all')
        self.__num_classes__ = dataset.num_classes

    @property
@@ -146,6 +168,19 @@ class TAGDataset(InMemoryDataset):
            self._token = self.tokenize_graph()
        return self._token

+   @property
+   def llm_explanation_token(self) -> Dict[str, Tensor]:
+       if self._llm_explanation_token is None:  # lazy load
+           self._llm_explanation_token = self.tokenize_graph(
+               text_type='llm_explanation')
+       return self._llm_explanation_token
+
+   @property
+   def all_token(self) -> Dict[str, Tensor]:
+       if self._all_token is None:  # lazy load
+           self._all_token = self.tokenize_graph(text_type='all')
+       return self._all_token
+
    # load is_gold after init
    @property
    def is_gold(self) -> Tensor:
@@ -194,10 +229,17 @@ class TAGDataset(InMemoryDataset):
            folder=f'{self.root}/raw',
            filename='node-text.csv.gz',
            log=True)
-       text_df = read_csv(raw_text_path)
-       self.text = list(text_df['text'])
+       self.text = list(read_csv(raw_text_path)['text'])
+       print('downloading llm explanations')
+       llm_explanation_path = download_google_url(
+           id=self.llm_explanation_id[self.name], folder=f'{self.root}/raw',
+           filename='node-gpt-response.csv.gz', log=True)
+       self.llm_explanation = list(read_csv(llm_explanation_path)['text'])
+       print('downloading llm predictions')
+       fs.cp(f'{self.llm_prediction_url}/{self.name}.csv', self.raw_dir)

    def process(self) -> None:
+       # process Title and Abstraction
        if osp.exists(osp.join(self.root, 'raw', 'node-text.csv.gz')):
            text_df = read_csv(osp.join(self.root, 'raw', 'node-text.csv.gz'))
            self.text = list(text_df['text'])
@@ -212,6 +254,42 @@ class TAGDataset(InMemoryDataset):
                "The raw text of each node is not specified"
                "Please pass in 'text' when convert your dataset "
                "to Text Attribute Graph Dataset")
+       # process LLM explanation and prediction
+       llm_explanation_path = f'{self.raw_dir}/node-gpt-response.csv.gz'
+       llm_prediction_path = f'{self.raw_dir}/{self.name}.csv'
+       if osp.exists(llm_explanation_path) and osp.exists(
+               llm_prediction_path):
+           # load LLM explanation
+           self.llm_explanation = list(read_csv(llm_explanation_path)['text'])
+           # load LLM prediction
+           preds = []
+           with open(llm_prediction_path) as file:
+               reader = csv.reader(file)
+               for row in reader:
+                   inner_list = []
+                   for value in row:
+                       inner_list.append(int(value))
+                   preds.append(inner_list)
+
+           pl = torch.zeros(len(preds), self.llm_prediction_topk,
+                            dtype=torch.long)
+           for i, pred in enumerate(preds):
+               pl[i][:len(pred)] = torch.tensor(
+                   pred[:self.llm_prediction_topk], dtype=torch.long) + 1
+       elif self.name in self.llm_explanation_id:
+           self.download()
+       else:
+           print(
+               'The dataset is not ogbn-arxiv,'
+               'please pass in your llm explanation list to `llm_explanation`'
+               'and llm prediction list to `llm_prediction`')
+       if self.llm_explanation is None or pl is None:
+           raise ValueError(
+               "The TAGDataset only have ogbn-arxiv LLM explanations"
+               "and predictions in default. The llm explanation and"
+               "prediction of each node is not specified."
+               "Please pass in 'llm_explanation' and 'llm_prediction' when"
+               "convert your dataset to Text Attribute Graph Dataset")

    def save_node_text(self, text: List[str]) -> None:
        node_text_path = osp.join(self.root, 'raw', 'node-text.csv.gz')
@@ -224,22 +302,39 @@ class TAGDataset(InMemoryDataset):
        text_df.to_csv(osp.join(node_text_path), compression='gzip',
                       index=False)

-   def tokenize_graph(self, batch_size: int = 256) -> Dict[str, Tensor]:
+   def tokenize_graph(self, batch_size: int = 256,
+                      text_type: str = 'raw_text') -> Dict[str, Tensor]:
        r"""Tokenizing the text associate with each node, running in cpu.

        Args:
            batch_size (Optional[int]): batch size of list of text for
                generating emebdding
+           text_type (Optional[str]): type of text
        Returns:
            Dict[str, torch.Tensor]: tokenized graph
        """
+       assert text_type in ['raw_text', 'llm_explanation', 'all']
+       if text_type == 'raw_text':
+           _text = self.text
+       elif text_type == 'llm_explanation':
+           _text = self.llm_explanation
+       elif text_type == 'all':
+           if self.text is None or self.llm_explanation is None:
+               raise ValueError("The TAGDataset need text and llm explanation"
+                                "for tokenizing all text")
+           _text = [
+               f'{raw_txt} Explanation: {exp_txt}'
+               for raw_txt, exp_txt in zip(self.text, self.llm_explanation)
+           ]
+
        data_len = 0
-       if self.text is not None:
-           data_len = len(self.text)
+       if _text is not None:
+           data_len = len(_text)
        else:
            raise ValueError("The TAGDataset need text for tokenization")
        token_keys = ['input_ids', 'token_type_ids', 'attention_mask']
-       path = os.path.join(self.processed_dir, 'token', self.tokenizer_name)
+       path = os.path.join(self.processed_dir, 'token', text_type,
+                           self.tokenizer_name)
        # Check if the .pt files already exist
        token_files_exist = any(
            os.path.exists(os.path.join(path, f'{k}.pt')) for k in token_keys)
@@ -256,12 +351,12 @@ class TAGDataset(InMemoryDataset):
        all_encoded_token = {k: [] for k in token_keys}
        pbar = tqdm(total=data_len)

-       pbar.set_description('Tokenizing Text Attributed Graph')
+       pbar.set_description(f'Tokenizing Text Attributed Graph {text_type}')
        for i in range(0, data_len, batch_size):
            end_index = min(data_len, i + batch_size)
-           token = self.tokenizer(self.text[i:min(i + batch_size, data_len)],
-                                  padding='max_length', truncation=True,
-                                  max_length=512, return_tensors="pt")
+           token = self.tokenizer(_text[i:end_index], padding='max_length',
+                                  truncation=True, max_length=512,
+                                  return_tensors="pt")
            for k in token.keys():
                all_encoded_token[k].append(token[k])
            pbar.update(end_index - i)
@@ -289,10 +384,18 @@ class TAGDataset(InMemoryDataset):

        Args:
            tag_dataset (TAGDataset): the parent dataset
+           text_type (str): type of text
        """
-       def __init__(self, tag_dataset: 'TAGDataset') -> None:
+       def __init__(self, tag_dataset: 'TAGDataset',
+                    text_type: str = 'raw_text') -> None:
+           assert text_type in ['raw_text', 'llm_explanation', 'all']
            self.tag_dataset = tag_dataset
-           self.token = tag_dataset.token
+           if text_type == 'raw_text':
+               self.token = tag_dataset.token
+           elif text_type == 'llm_explanation':
+               self.token = tag_dataset.llm_explanation_token
+           elif text_type == 'all':
+               self.token = tag_dataset.all_token
            assert tag_dataset._data is not None
            self._data = tag_dataset._data
@@ -312,7 +415,8 @@ class TAGDataset(InMemoryDataset):

        # for LM training
        def __getitem__(
-           self, node_id: IndexType
+           self,
+           node_id: IndexType,
        ) -> Dict[str, Union[Tensor, Dict[str, Tensor]]]:
            r"""This function will override the function in
            torch.utils.data.Dataset, and will be called when you
@@ -343,8 +447,8 @@ class TAGDataset(InMemoryDataset):
    def __repr__(self) -> str:
        return f'{self.__class__.__name__}()'

-   def to_text_dataset(self) -> TextDataset:
+   def to_text_dataset(self, text_type: str = 'raw_text') -> TextDataset:
        r"""Factory Build text dataset from Text Attributed Graph Dataset
        each data point is node's associated text token.
        """
-       return TAGDataset.TextDataset(self)
+       return TAGDataset.TextDataset(self, text_type)
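The new `'all'` text type above feeds the tokenizer a concatenation of each node's raw text and its LLM explanation. The concatenation step in isolation (tokenizer and dataset machinery omitted; helper name is ours, not PyG's):

```python
def combine_texts(raw_texts, explanations):
    """Build the `'all'` tokenizer inputs as in the diff:
    '<raw text> Explanation: <LLM explanation>' per node."""
    if raw_texts is None or explanations is None:
        raise ValueError("need both raw text and LLM explanations")
    return [
        f'{raw_txt} Explanation: {exp_txt}'
        for raw_txt, exp_txt in zip(raw_texts, explanations)
    ]

print(combine_texts(['Paper on GNNs'], ['It studies message passing.']))
```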
@@ -1,3 +1,5 @@
+from warnings import warn
+
 from .dist_context import DistContext
 from .local_feature_store import LocalFeatureStore
 from .local_graph_store import LocalGraphStore
@@ -7,6 +9,17 @@ from .dist_loader import DistLoader
 from .dist_neighbor_loader import DistNeighborLoader
 from .dist_link_neighbor_loader import DistLinkNeighborLoader

+warn(
+    "`torch_geometric.distributed` has been deprecated since 2.7.0 and will "
+    "no longer be maintained. For distributed training, refer to our "
+    "tutorials on distributed training at "
+    "https://pytorch-geometric.readthedocs.io/en/latest/tutorial/distributed.html "  # noqa: E501
+    "or cuGraph examples at "
+    "https://github.com/rapidsai/cugraph-gnn/tree/main/python/cugraph-pyg/cugraph_pyg/examples",  # noqa: E501
+    stacklevel=2,
+    category=DeprecationWarning,
+)
+
 __all__ = classes = [
     'DistContext',
     'LocalFeatureStore',
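The hunk above makes importing `torch_geometric.distributed` emit a module-level `DeprecationWarning`. Note that Python ignores `DeprecationWarning` by default outside `__main__`, so callers often need a filter to see it. A small sketch of the pattern and of surfacing the warning:

```python
import warnings

def deprecated_module_import():
    # Stand-in for the warning emitted at the top of
    # `torch_geometric.distributed.__init__` in the diff:
    warnings.warn(
        "`torch_geometric.distributed` has been deprecated since 2.7.0",
        category=DeprecationWarning,
        stacklevel=2,
    )

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # surface DeprecationWarning
    deprecated_module_import()

print(len(caught), caught[0].category.__name__)
```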
@@ -256,14 +256,6 @@ def filter_custom_hetero_store(
    # Construct a new `HeteroData` object:
    data = custom_cls() if custom_cls is not None else HeteroData()

-   # Filter edge storage:
-   # TODO support edge attributes
-   for attr in graph_store.get_all_edge_attrs():
-       key = attr.edge_type
-       if key in row_dict and key in col_dict:
-           edge_index = torch.stack([row_dict[key], col_dict[key]], dim=0)
-           data[attr.edge_type].edge_index = edge_index
-
    # Filter node storage:
    required_attrs = []
    for attr in feature_store.get_all_tensor_attrs():
@@ -280,6 +272,14 @@ def filter_custom_hetero_store(
    for i, attr in enumerate(required_attrs):
        data[attr.group_name][attr.attr_name] = tensors[i]

+   # Filter edge storage:
+   # TODO support edge attributes
+   for attr in graph_store.get_all_edge_attrs():
+       key = attr.edge_type
+       if key in row_dict and key in col_dict:
+           edge_index = torch.stack([row_dict[key], col_dict[key]], dim=0)
+           data[attr.edge_type].edge_index = edge_index
+
    return data

@@ -53,7 +53,7 @@ class LinkPredMetricData:

        # Flatten both prediction and ground-truth indices, and determine
        # overlaps afterwards via `torch.searchsorted`.
-       max_index = max(  # type: ignore
+       max_index = max(
            self.pred_index_mat.max()
            if self.pred_index_mat.numel() > 0 else 0,
            self.edge_label_index[1].max()
@@ -820,9 +820,10 @@ class LinkPredPersonalization(_LinkPredMetric):
        right = pred[col.cpu()].to(device)

        # Use offset to work around applying `isin` along a specific dim:
-       i = max(left.max(), right.max()) + 1  # type: ignore
-       i = torch.arange(0, i * row.size(0), i, device=device).view(-1, 1)
-       isin = torch.isin(left + i, right + i)
+       i = max(int(left.max()), int(right.max())) + 1
+       idx = torch.arange(0, i * row.size(0), i, device=device)
+       idx = idx.view(-1, 1)
+       isin = torch.isin(left + idx, right + idx)

        # Compute personalization via average inverse cosine similarity:
        cos = isin.sum(dim=-1) / pred.size(1)
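The personalization hunk above relies on an offset trick: shifting row `r` by `r * i`, where `i` exceeds every index value, keeps matches from crossing row boundaries, so a single flat `isin` acts row-wise. A pure-Python sketch of the invariant the offsets preserve (no torch dependency; `torch.isin` plays the role of the flat membership test):

```python
def rowwise_isin(left, right):
    """Reference semantics: entry (r, c) is True iff left[r][c]
    occurs anywhere in right[r] (same row only)."""
    return [[v in set(right_row) for v in left_row]
            for left_row, right_row in zip(left, right)]

def rowwise_isin_offset(left, right):
    """Offset trick from the diff: with i larger than every value,
    v + r * i collides only within the same row r, so one flat
    membership test over all shifted values suffices."""
    i = max(max(max(row) for row in left),
            max(max(row) for row in right)) + 1
    flat_right = {v + r * i for r, row in enumerate(right) for v in row}
    return [[v + r * i in flat_right for v in row]
            for r, row in enumerate(left)]

left = [[1, 2], [1, 3]]
right = [[2, 9], [9, 1]]
print(rowwise_isin_offset(left, right))  # [[False, True], [True, False]]
assert rowwise_isin(left, right) == rowwise_isin_offset(left, right)
```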
@@ -415,7 +415,8 @@ class GCN(BasicGNN):
            (default: :obj:`None`)
        jk (str, optional): The Jumping Knowledge mode. If specified, the model
            will additionally apply a final linear transformation to transform
-           node embeddings to the expected output feature dimensionality.
+           node embeddings to the expected output feature dimensionality,
+           while default will not.
            (:obj:`None`, :obj:`"last"`, :obj:`"cat"`, :obj:`"max"`,
            :obj:`"lstm"`). (default: :obj:`None`)
        **kwargs (optional): Additional arguments of
torch_geometric/typing.py CHANGED
@@ -3,17 +3,12 @@ import os
 import sys
 import typing
 import warnings
-from typing import Any, Dict, List, Optional, Set, Tuple, Union
+from typing import Any, Dict, List, Optional, Set, Tuple, TypeAlias, Union

 import numpy as np
 import torch
 from torch import Tensor

-try:
-    from typing import TypeAlias  # type: ignore
-except ImportError:
-    from typing_extensions import TypeAlias
-
 WITH_PT20 = int(torch.__version__.split('.')[0]) >= 2
 WITH_PT21 = WITH_PT20 and int(torch.__version__.split('.')[1]) >= 1
 WITH_PT22 = WITH_PT20 and int(torch.__version__.split('.')[1]) >= 2
@@ -98,7 +93,7 @@ except Exception as e:
    WITH_CUDA_HASH_MAP = False

 if WITH_CPU_HASH_MAP:
-    CPUHashMap: TypeAlias = torch.classes.pyg.CPUHashMap
+    CPUHashMap: TypeAlias = torch.classes.pyg.CPUHashMap  # type: ignore[name-defined] # noqa: E501
 else:

    class CPUHashMap:  # type: ignore
@@ -110,7 +105,7 @@ else:


 if WITH_CUDA_HASH_MAP:
-    CUDAHashMap: TypeAlias = torch.classes.pyg.CUDAHashMap
+    CUDAHashMap: TypeAlias = torch.classes.pyg.CUDAHashMap  # type: ignore[name-defined] # noqa: E501
 else:

    class CUDAHashMap:  # type: ignore