x-transformers 2.1.36__tar.gz → 2.2.0__tar.gz

This diff shows the changes between publicly released versions of the package as they appear in its public registry, and is provided for informational purposes only.
Files changed (61)
  1. {x_transformers-2.1.36 → x_transformers-2.2.0}/PKG-INFO +12 -1
  2. {x_transformers-2.1.36 → x_transformers-2.2.0}/README.md +11 -0
  3. {x_transformers-2.1.36 → x_transformers-2.2.0}/pyproject.toml +1 -1
  4. {x_transformers-2.1.36 → x_transformers-2.2.0}/tests/test_x_transformers.py +22 -0
  5. {x_transformers-2.1.36 → x_transformers-2.2.0}/x_transformers/__init__.py +2 -0
  6. x_transformers-2.2.0/x_transformers/entropy_based_tokenizer.py +91 -0
  7. {x_transformers-2.1.36 → x_transformers-2.2.0}/x_transformers/x_transformers.py +5 -2
  8. {x_transformers-2.1.36 → x_transformers-2.2.0}/.github/FUNDING.yml +0 -0
  9. {x_transformers-2.1.36 → x_transformers-2.2.0}/.github/workflows/python-publish.yml +0 -0
  10. {x_transformers-2.1.36 → x_transformers-2.2.0}/.github/workflows/python-test.yaml +0 -0
  11. {x_transformers-2.1.36 → x_transformers-2.2.0}/.gitignore +0 -0
  12. {x_transformers-2.1.36 → x_transformers-2.2.0}/LICENSE +0 -0
  13. {x_transformers-2.1.36 → x_transformers-2.2.0}/data/README.md +0 -0
  14. {x_transformers-2.1.36 → x_transformers-2.2.0}/data/enwik8.gz +0 -0
  15. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/all-attention.png +0 -0
  16. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/attention-on-attention.png +0 -0
  17. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/cosine-sim-attention.png +0 -0
  18. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/deepnorm.png +0 -0
  19. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/dynamic-pos-bias-linear.png +0 -0
  20. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/dynamic-pos-bias-log.png +0 -0
  21. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/dynamic-pos-bias-sinusoidal.png +0 -0
  22. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/dynamic-pos-bias.png +0 -0
  23. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/enhanced-recurrence.png +0 -0
  24. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/fcm.png +0 -0
  25. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/ffglu.png +0 -0
  26. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/flash-attention.png +0 -0
  27. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/gate_values.png +0 -0
  28. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/gating.png +0 -0
  29. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/length-extrapolation-scale.png +0 -0
  30. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/macaron-1.png +0 -0
  31. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/macaron-2.png +0 -0
  32. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/memory-transformer.png +0 -0
  33. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/normformer.png +0 -0
  34. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/pia.png +0 -0
  35. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/qknorm-analysis.png +0 -0
  36. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/resi_dual.png +0 -0
  37. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/residual_attn.png +0 -0
  38. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/rezero.png +0 -0
  39. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/rotary.png +0 -0
  40. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/sandwich-2.png +0 -0
  41. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/sandwich.png +0 -0
  42. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/sandwich_norm.png +0 -0
  43. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/scalenorm.png +0 -0
  44. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/talking-heads.png +0 -0
  45. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/topk-attention.png +0 -0
  46. {x_transformers-2.1.36 → x_transformers-2.2.0}/images/xval.png +0 -0
  47. {x_transformers-2.1.36 → x_transformers-2.2.0}/train_belief_state.py +0 -0
  48. {x_transformers-2.1.36 → x_transformers-2.2.0}/train_copy.py +0 -0
  49. {x_transformers-2.1.36 → x_transformers-2.2.0}/train_enwik8.py +0 -0
  50. {x_transformers-2.1.36 → x_transformers-2.2.0}/train_length_extrapolate.py +0 -0
  51. {x_transformers-2.1.36 → x_transformers-2.2.0}/train_parity.py +0 -0
  52. {x_transformers-2.1.36 → x_transformers-2.2.0}/x_transformers/attend.py +0 -0
  53. {x_transformers-2.1.36 → x_transformers-2.2.0}/x_transformers/autoregressive_wrapper.py +0 -0
  54. {x_transformers-2.1.36 → x_transformers-2.2.0}/x_transformers/belief_state_wrapper.py +0 -0
  55. {x_transformers-2.1.36 → x_transformers-2.2.0}/x_transformers/continuous.py +0 -0
  56. {x_transformers-2.1.36 → x_transformers-2.2.0}/x_transformers/dpo.py +0 -0
  57. {x_transformers-2.1.36 → x_transformers-2.2.0}/x_transformers/multi_input.py +0 -0
  58. {x_transformers-2.1.36 → x_transformers-2.2.0}/x_transformers/neo_mlp.py +0 -0
  59. {x_transformers-2.1.36 → x_transformers-2.2.0}/x_transformers/nonautoregressive_wrapper.py +0 -0
  60. {x_transformers-2.1.36 → x_transformers-2.2.0}/x_transformers/xl_autoregressive_wrapper.py +0 -0
  61. {x_transformers-2.1.36 → x_transformers-2.2.0}/x_transformers/xval.py +0 -0

{x_transformers-2.1.36 → x_transformers-2.2.0}/PKG-INFO

@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: x-transformers
-Version: 2.1.36
+Version: 2.2.0
 Summary: X-Transformers
 Project-URL: Homepage, https://pypi.org/project/x-transformers/
 Project-URL: Repository, https://github.com/lucidrains/x-transformers

@@ -2464,4 +2464,15 @@ ids_out, num_out, is_number_mask = model.generate(start_ids, start_nums, 17)
 }
 ```
 
+```bibtex
+@article{Pagnoni2024ByteLT,
+    title = {Byte Latent Transformer: Patches Scale Better Than Tokens},
+    author = {Artidoro Pagnoni and Ram Pasunuru and Pedro Rodriguez and John Nguyen and Benjamin Muller and Margaret Li and Chunting Zhou and Lili Yu and Jason Weston and Luke S. Zettlemoyer and Gargi Ghosh and Mike Lewis and Ari Holtzman and Srinivasan Iyer},
+    journal = {ArXiv},
+    year = {2024},
+    volume = {abs/2412.09871},
+    url = {https://api.semanticscholar.org/CorpusID:274762821}
+}
+```
+
 *solve intelligence... then use that to solve everything else.* - Demis Hassabis

{x_transformers-2.1.36 → x_transformers-2.2.0}/README.md

@@ -2416,4 +2416,15 @@ ids_out, num_out, is_number_mask = model.generate(start_ids, start_nums, 17)
 }
 ```
 
+```bibtex
+@article{Pagnoni2024ByteLT,
+    title = {Byte Latent Transformer: Patches Scale Better Than Tokens},
+    author = {Artidoro Pagnoni and Ram Pasunuru and Pedro Rodriguez and John Nguyen and Benjamin Muller and Margaret Li and Chunting Zhou and Lili Yu and Jason Weston and Luke S. Zettlemoyer and Gargi Ghosh and Mike Lewis and Ari Holtzman and Srinivasan Iyer},
+    journal = {ArXiv},
+    year = {2024},
+    volume = {abs/2412.09871},
+    url = {https://api.semanticscholar.org/CorpusID:274762821}
+}
+```
+
 *solve intelligence... then use that to solve everything else.* - Demis Hassabis

{x_transformers-2.1.36 → x_transformers-2.2.0}/pyproject.toml

@@ -1,6 +1,6 @@
 [project]
 name = "x-transformers"
-version = "2.1.36"
+version = "2.2.0"
 description = "X-Transformers"
 authors = [
     { name = "Phil Wang", email = "lucidrains@gmail.com" }

{x_transformers-2.1.36 → x_transformers-2.2.0}/tests/test_x_transformers.py

@@ -768,3 +768,25 @@ def test_dynamic_tanh():
     x = torch.randint(0, 20000, (2, 1024))
 
     model(x)
+
+def test_entropy_based_tokenizer():
+    from x_transformers.entropy_based_tokenizer import EntropyBasedTokenizer
+
+    model = TransformerWrapper(
+        num_tokens = 20000,
+        max_seq_len = 1024,
+        attn_layers = Decoder(
+            dim = 128,
+            depth = 6,
+            heads = 8,
+            attn_dim_head = 64,
+        )
+    )
+
+    tokenizer = EntropyBasedTokenizer(model, entropy_threshold = 9.738)
+
+    seq = torch.randint(0, 20000, (2, 1024))
+
+    segmented_seq = tokenizer(seq, return_segmented_seq = True)
+
+    assert len(segmented_seq) == seq.shape[0]

{x_transformers-2.1.36 → x_transformers-2.2.0}/x_transformers/__init__.py

@@ -37,3 +37,5 @@ from x_transformers.dpo import (
 from x_transformers.neo_mlp import (
     NeoMLP
 )
+
+from x_transformers.entropy_based_tokenizer import EntropyBasedTokenizer
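With this re-export, the tokenizer is importable directly from the package root:

```python
from x_transformers import EntropyBasedTokenizer
```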

x_transformers-2.2.0/x_transformers/entropy_based_tokenizer.py (new file)

@@ -0,0 +1,91 @@
+import torch
+import torch.nn.functional as F
+from torch.nn import Module
+from torch.nn.utils.rnn import pad_sequence
+
+from x_transformers.x_transformers import Decoder, TransformerWrapper
+
+from einops import repeat, rearrange
+
+# helper functions
+
+def exists(v):
+    return v is not None
+
+def default(v, d):
+    return v if exists(v) else d
+
+# entropy based tokenizer applied in byte-latent transformer paper
+# they use a simple entropy threshold for segmenting a string into variable sized tokens
+
+# https://arxiv.org/abs/2412.09871
+
+class EntropyBasedTokenizer(Module):
+    def __init__(
+        self,
+        decoder: TransformerWrapper,
+        entropy_threshold = 1.5
+    ):
+        super().__init__()
+        assert isinstance(decoder.attn_layers, Decoder)
+
+        self.decoder = decoder
+        self.entropy_threshold = entropy_threshold
+
+    @torch.no_grad()
+    def forward(
+        self,
+        seq,
+        return_segmented_seq = False
+    ):
+        self.decoder.eval()
+
+        batch, seq_len, device = *seq.shape, seq.device
+
+        _, intermediates = self.decoder(seq, return_logit_entropies = True)
+
+        entropies = intermediates.logit_entropies
+
+        over_thres_mask = entropies >= self.entropy_threshold
+
+        arange = torch.arange(seq_len, device = device) + 1
+        arange = repeat(arange, 'n -> b n', b = batch)
+
+        # get a tensor of Int['b num_tokens'] with the token lengths, zero padded
+
+        boundaries = over_thres_mask.clone()
+        boundaries[..., -1] = True # last token is always a boundary
+
+        num_tokens = boundaries.sum(dim = -1) # number of tokens
+
+        boundaries = arange[boundaries].split(num_tokens.tolist())
+
+        # get the token lengths
+
+        token_lengths = []
+
+        for one_boundary in boundaries:
+            padded_boundary = F.pad(one_boundary, (1, 0), value = 0.)
+            one_token_lengths = padded_boundary[1:] - padded_boundary[:-1]
+
+            token_lengths.append(one_token_lengths)
+
+        token_lengths = pad_sequence(token_lengths, batch_first = True)
+
+        # early return
+
+        if not return_segmented_seq:
+            return token_lengths
+
+        # segment the sequence based on the token lengths
+
+        segmented_seq = []
+
+        for one_seq, one_token_length in zip(seq, token_lengths):
+
+            one_token_length = one_token_length[one_token_length > 0]
+
+            splitted_seq = one_seq.split(one_token_length.tolist())
+            segmented_seq.append(splitted_seq)
+
+        return segmented_seq
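Taken together with the added test, a minimal usage sketch of the new module might look like the following; the decoder configuration and the entropy_threshold value are illustrative, not prescriptive.

```python
# Minimal usage sketch of the new EntropyBasedTokenizer, mirroring the added test;
# the decoder configuration and entropy_threshold here are illustrative.
import torch
from x_transformers import TransformerWrapper, Decoder, EntropyBasedTokenizer

model = TransformerWrapper(
    num_tokens = 20000,
    max_seq_len = 1024,
    attn_layers = Decoder(
        dim = 128,
        depth = 6,
        heads = 8,
        attn_dim_head = 64
    )
)

tokenizer = EntropyBasedTokenizer(model, entropy_threshold = 1.5)

seq = torch.randint(0, 20000, (2, 1024))

# zero-padded token lengths per batch element
token_lengths = tokenizer(seq)

# or the original sequences split into variable-length segments
segmented_seq = tokenizer(seq, return_segmented_seq = True)

assert len(segmented_seq) == seq.shape[0]
```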

{x_transformers-2.1.36 → x_transformers-2.2.0}/x_transformers/x_transformers.py

@@ -2909,6 +2909,7 @@ class TransformerWrapper(Module):
         return_embeddings = False,
         return_logits_and_embeddings = False,
         return_intermediates = False,
+        return_embeddings_and_intermediates = False,
         return_logit_entropies = False,
         mask = None,
         return_mems = False,

@@ -2940,8 +2941,8 @@ class TransformerWrapper(Module):
 
         b, n, device, num_mems, has_memory_tokens, emb_frac_gradient, orig_mask = x.shape[0], x.shape[1], x.device, self.num_memory_tokens, self.num_memory_tokens > 0, self.emb_frac_gradient, mask
 
-        return_hiddens = return_mems | return_attn | return_intermediates | return_attn_z_loss
-        return_embeddings = return_embeddings | (not exists(self.to_logits))
+        return_hiddens = return_mems | return_attn | return_intermediates | return_attn_z_loss | return_embeddings_and_intermediates
+        return_embeddings = return_embeddings | (not exists(self.to_logits)) | return_embeddings_and_intermediates
 
         # absolute positional embedding
 

@@ -3131,6 +3132,8 @@ class TransformerWrapper(Module):
 
         if return_logits_and_embeddings:
             out = (logits, x)
+        elif return_embeddings_and_intermediates:
+            out = (x, intermediates)
         elif return_embeddings:
             out = x
         else:
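A short sketch of how the new return_embeddings_and_intermediates flag could be used, based on the change above; the small model configuration is illustrative only.

```python
# Sketch of the new return_embeddings_and_intermediates flag; model config is illustrative.
import torch
from x_transformers import TransformerWrapper, Decoder

model = TransformerWrapper(
    num_tokens = 256,
    max_seq_len = 512,
    attn_layers = Decoder(dim = 64, depth = 2, heads = 4)
)

x = torch.randint(0, 256, (1, 512))

# when the flag is set, forward returns the final embeddings (pre-logits)
# together with the intermediates, instead of logits
embeddings, intermediates = model(x, return_embeddings_and_intermediates = True)
```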