floydnet 0.1.2__tar.gz → 1.1.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {floydnet-0.1.2 → floydnet-1.1.0}/.gitignore +2 -1
- {floydnet-0.1.2 → floydnet-1.1.0}/CHANGELOG.md +4 -0
- floydnet-1.1.0/CITATION.cff +17 -0
- {floydnet-0.1.2 → floydnet-1.1.0}/PKG-INFO +16 -13
- {floydnet-0.1.2 → floydnet-1.1.0}/README.md +15 -12
- {floydnet-0.1.2 → floydnet-1.1.0}/example/README.md +20 -2
- {floydnet-0.1.2 → floydnet-1.1.0}/pyproject.toml +1 -1
- {floydnet-0.1.2 → floydnet-1.1.0}/src/floydnet/functional.py +15 -1
- floydnet-0.1.2/CITATION.cff +0 -10
- {floydnet-0.1.2 → floydnet-1.1.0}/LICENSE +0 -0
- {floydnet-0.1.2 → floydnet-1.1.0}/src/floydnet/__init__.py +0 -0
- {floydnet-0.1.2 → floydnet-1.1.0}/src/floydnet/transformer.py +0 -0
{floydnet-0.1.2 → floydnet-1.1.0}/CHANGELOG.md

@@ -2,6 +2,10 @@
 
 All notable changes to this project will be documented in this file.
 
+## [1.1.0] - 2026-02-05
+- Added `softmax_cap` parameter to `pivotal_attention3` for improved numerical stability.
+- Added LRGB example script.
+
 ## [1.0.0] - 2026-01-25
 - Full release with training and evaluation scripts for Graph Count, BREC, and TSP.
 - Added `pivotal_attention3` functional API for 3-Floyd attention.
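The `softmax_cap` entry above refers to a tanh-based logit cap; the exact transform appears in the `src/floydnet/functional.py` diff further down. As a quick illustration (a standalone sketch, not part of the package; `soft_cap` is a name introduced only for this example), the cap bounds attention logits to (-cap, cap) while leaving small logits almost unchanged:

```python
import torch

def soft_cap(scores: torch.Tensor, cap: float) -> torch.Tensor:
    """Tanh-based logit cap: output bounded to (-cap, cap), ~identity near zero."""
    return cap * torch.tanh(scores / cap)

scores = torch.tensor([0.5, 5.0, 50.0, 5000.0])
print(soft_cap(scores, cap=30.0))
# Small logits pass through nearly unchanged; large ones saturate near the cap (~30),
# which keeps the subsequent softmax numerically well-behaved.
```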
floydnet-1.1.0/CITATION.cff ADDED

@@ -0,0 +1,17 @@
+cff-version: 1.2.0
+title: "FloydNet"
+message: "If you use FloydNet in your research, please cite the associated paper."
+type: software
+authors:
+  - family-names: Yu
+    given-names: Jingcheng
+  - family-names: Zeng
+    given-names: Mingliang
+  - family-names: Ye
+    given-names: Qiwei
+version: "1.0.0"
+license: Apache-2.0
+repository-code: "https://github.com/ocx-lab/FloydNet"
+doi: "10.48550/arXiv.2601.19094"
+url: "https://arxiv.org/abs/2601.19094"
+date-released: 2026-01-27
{floydnet-0.1.2 → floydnet-1.1.0}/PKG-INFO

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: floydnet
-Version: 0.1.2
+Version: 1.1.0
 Summary: Floyd Multi-Head Attention: a drop-in variant of PyTorch MHA with module and function APIs
 Project-URL: Homepage, https://github.com/ocx-lab/FloydNet
 Project-URL: Repository, https://github.com/ocx-lab/FloydNet
@@ -235,7 +235,7 @@ Description-Content-Type: text/markdown
 [](https://www.python.org/)
 [](https://pytorch.org/)
 
-Official implementation of
+Official implementation of [FloydNet](https://arxiv.org/pdf/2601.19094).
 
 
 
@@ -247,13 +247,13 @@ This repository serves two audiences:
 
 ## Introduction
 
-FloydNet is the official PyTorch implementation
+FloydNet is the official PyTorch implementation.
 The repository provides:
 
 1. **Reusable components**: a drop-in attention/Transformer-block interface intended for integration into existing projects.
 2. **Reproduction code**: end-to-end training/evaluation pipelines to reproduce the benchmarks reported in the paper.
 
-For algorithmic details, hyperparameter choices, and analysis, please refer to the paper
+For algorithmic details, hyperparameter choices, and analysis, please refer to the [paper](https://arxiv.org/pdf/2601.19094).
 
 ---
 
@@ -360,9 +360,9 @@ uv pip install -e .
 
 ## Changelog (latest)
 
--
-- Added
-
+- Added `softmax_cap` parameter to `pivotal_attention3` for improved numerical stability.
+- Added LRGB example script.
+
 
 The full changelog is in [CHANGELOG.md](CHANGELOG.md).
 
@@ -371,12 +371,15 @@ The full changelog is in [CHANGELOG.md](CHANGELOG.md).
 If you use this code in your research, please cite the paper:
 
 ```bibtex
-@
-
-
-
-
-
+@misc{yu2026floydnetlearningparadigmglobal,
+      title={FloydNet: A Learning Paradigm for Global Relational Reasoning},
+      author={Jingcheng Yu and Mingliang Zeng and Qiwei Ye},
+      year={2026},
+      eprint={2601.19094},
+      archivePrefix={arXiv},
+      primaryClass={cs.LG},
+      url={https://arxiv.org/abs/2601.19094},
+
 }
 ```
 
{floydnet-0.1.2 → floydnet-1.1.0}/README.md

@@ -3,7 +3,7 @@
 [](https://www.python.org/)
 [](https://pytorch.org/)
 
-Official implementation of
+Official implementation of [FloydNet](https://arxiv.org/pdf/2601.19094).
 
 
 
@@ -15,13 +15,13 @@ This repository serves two audiences:
 
 ## Introduction
 
-FloydNet is the official PyTorch implementation
+FloydNet is the official PyTorch implementation.
 The repository provides:
 
 1. **Reusable components**: a drop-in attention/Transformer-block interface intended for integration into existing projects.
 2. **Reproduction code**: end-to-end training/evaluation pipelines to reproduce the benchmarks reported in the paper.
 
-For algorithmic details, hyperparameter choices, and analysis, please refer to the paper
+For algorithmic details, hyperparameter choices, and analysis, please refer to the [paper](https://arxiv.org/pdf/2601.19094).
 
 ---
 
@@ -128,9 +128,9 @@ uv pip install -e .
 
 ## Changelog (latest)
 
--
-- Added
-
+- Added `softmax_cap` parameter to `pivotal_attention3` for improved numerical stability.
+- Added LRGB example script.
+
 
 The full changelog is in [CHANGELOG.md](CHANGELOG.md).
 
@@ -139,12 +139,15 @@ The full changelog is in [CHANGELOG.md](CHANGELOG.md).
 If you use this code in your research, please cite the paper:
 
 ```bibtex
-@
-
-
-
-
-
+@misc{yu2026floydnetlearningparadigmglobal,
+      title={FloydNet: A Learning Paradigm for Global Relational Reasoning},
+      author={Jingcheng Yu and Mingliang Zeng and Qiwei Ye},
+      year={2026},
+      eprint={2601.19094},
+      archivePrefix={arXiv},
+      primaryClass={cs.LG},
+      url={https://arxiv.org/abs/2601.19094},
+
 }
 ```
 
{floydnet-0.1.2 → floydnet-1.1.0}/example/README.md

@@ -1,10 +1,11 @@
 ### Benchmarks
 
-The paper reports results on **
+The paper reports results on **four benchmarks**:
 
 - Graph Count
 - BREC
 - TSP
+- LRGB
 
 ## 🚀 Key Results
 
@@ -134,4 +135,21 @@ torchrun \
     --wandb_name TSP_exp
 ```
 
----
+---
+
+### LRGB
+
+The LRGB benchmark and dataset construction follow:
+https://github.com/vijaydwivedi75/lrgb
+
+#### PCQM-Contact
+
+```bash
+source .venv/bin/activate
+cd example
+torchrun \
+    --nproc_per_node=8 \
+    -m LRGB.run \
+    --name pcqm-contact \
+    --wandb_name LRGB_pcqm-contact
+```
{floydnet-0.1.2 → floydnet-1.1.0}/pyproject.toml

@@ -4,7 +4,7 @@ build-backend = "hatchling.build"
 
 [project]
 name = "floydnet"
-version = "0.1.2"
+version = "1.1.0"
 description = "Floyd Multi-Head Attention: a drop-in variant of PyTorch MHA with module and function APIs"
 readme = "README.md"
 requires-python = ">=3.9"
{floydnet-0.1.2 → floydnet-1.1.0}/src/floydnet/functional.py

@@ -31,6 +31,7 @@ def pivotal_attention(
     dropout: float = 0.0,
     scale: Optional[float] = None,
     inf: float = 1e9,
+    softmax_cap: float = -1,
 ) -> torch.Tensor:
     """Pivotal attention as described in "FLOYDNET: A LEARNING PARADIGM FOR GLOBAL RELATIONAL REASONING".
 
@@ -47,6 +48,9 @@ def pivotal_attention(
         dropout: Dropout probability applied to attention weights (only effective if > 0).
         scale: Optional custom scaling factor. If None, defaults to 1/sqrt(2*D).
         inf: Value to use for -infinity in masks.
+        softmax_cap: If > 0, applies a tanh-based logit cap before softmax.
+            Note: when using a non-boolean (additive) attn_mask, ensure its magnitude/semantics remain compatible
+            with capping (e.g., very large negative values used to approximate -inf can interact with logit shaping).
 
     Returns:
         Tensor of shape (B, H, L_i, L_k, D)
@@ -65,6 +69,9 @@ def pivotal_attention(
     attn_scores = torch.einsum("bhikd,bhijd->bhikj", q_ik, k_ij) \
         + torch.einsum("bhikd,bhjkd->bhikj", q_ik, k_jk)
 
+    if softmax_cap > 0:
+        attn_scores = softmax_cap * torch.tanh(attn_scores / softmax_cap)
+
     if attn_mask is not None:
         if attn_mask.dtype == torch.bool:
             attn_scores = attn_scores.masked_fill(attn_mask, -inf)
@@ -93,6 +100,7 @@ def pivotal_attention3(
     dropout: float = 0.0,
     scale: Optional[float] = None,
     inf: float = 1e9,
+    softmax_cap: float = -1,
 ) -> torch.Tensor:
     """3-Pivotal attention as described in "FLOYDNET: A LEARNING PARADIGM FOR GLOBAL RELATIONAL REASONING".
 
@@ -111,9 +119,12 @@ def pivotal_attention3(
         dropout: Dropout probability applied to attention weights (only effective if > 0).
         scale: Optional custom scaling factor. If None, defaults to 1/sqrt(3*D).
        inf: Value to use for -infinity in masks.
+        softmax_cap: If > 0, applies a tanh-based logit cap before softmax.
+            Note: when using a non-boolean (additive) attn_mask, ensure its magnitude/semantics remain compatible
+            with capping (e.g., very large negative values used to approximate -inf can interact with logit shaping).
 
     Returns:
-        Tensor of shape (B, H, L_i,
+        Tensor of shape (B, H, L_i, L_j, L_k, D)
     """
     assert all([t.dim() == 6 for t in [q_ijk, k_pjk, k_ipk, k_ijp, v_pjk, v_ipk, v_ijp]]), "All inputs must be 6D tensors"
     B, H, L_i, L_j, L_k, D = q_ijk.shape
@@ -130,6 +141,9 @@ def pivotal_attention3(
     attn_scores = torch.einsum("bhijkd,bhpjkd->bhijkp", q_ijk, k_pjk) \
         + torch.einsum("bhijkd,bhipkd->bhijkp", q_ijk, k_ipk) \
        + torch.einsum("bhijkd,bhijpd->bhijkp", q_ijk, k_ijp)
+
+    if softmax_cap > 0:
+        attn_scores = softmax_cap * torch.tanh(attn_scores / softmax_cap)
 
     if attn_mask is not None:
         if attn_mask.dtype == torch.bool:
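The docstring note added above cautions that additive (non-boolean) attention masks can interact with the logit cap. Below is a minimal standalone sketch of that interaction, assuming nothing beyond the ordering visible in the diff (cap the raw scores first, then apply the mask); the variable names are illustrative, not part of the package API:

```python
import torch

cap = 5.0
scores = torch.tensor([2.0, 40.0, -3.0])
additive_mask = torch.tensor([0.0, -1e9, 0.0])   # second position should be masked out

# Ordering shown in the diff: cap the raw scores, then apply the mask.
cap_then_mask = cap * torch.tanh(scores / cap) + additive_mask
print(cap_then_mask)  # second entry stays ~ -1e9, i.e. still effectively -inf

# If the large negative "mask" were instead folded into the scores before capping,
# tanh would flatten it to roughly -cap and it would no longer act as -inf.
mask_then_cap = cap * torch.tanh((scores + additive_mask) / cap)
print(mask_then_cap)  # second entry becomes ~ -5.0
```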
floydnet-0.1.2/CITATION.cff DELETED

@@ -1,10 +0,0 @@
-cff-version: 1.2.0
-title: "Floyd Multi-Head Attention"
-authors:
-  - family-names: YourSurname
-    given-names: YourName
-    orcid: "0000-0000-0000-0000"
-version: "0.1.0"
-license: MIT
-repository-code: "https://github.com/yourname/floyd-net"
-message: "If you use this software, please cite it as below."
{floydnet-0.1.2 → floydnet-1.1.0}/LICENSE: File without changes
{floydnet-0.1.2 → floydnet-1.1.0}/src/floydnet/__init__.py: File without changes
{floydnet-0.1.2 → floydnet-1.1.0}/src/floydnet/transformer.py: File without changes