x-transformers 1.28.5__tar.gz → 1.29.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- {x_transformers-1.28.5/x_transformers.egg-info → x_transformers-1.29.0}/PKG-INFO +1 -1
- {x_transformers-1.28.5 → x_transformers-1.29.0}/README.md +49 -0
- {x_transformers-1.28.5 → x_transformers-1.29.0}/setup.py +1 -1
- {x_transformers-1.28.5 → x_transformers-1.29.0}/x_transformers/x_transformers.py +7 -1
- {x_transformers-1.28.5 → x_transformers-1.29.0/x_transformers.egg-info}/PKG-INFO +1 -1
- {x_transformers-1.28.5 → x_transformers-1.29.0}/LICENSE +0 -0
- {x_transformers-1.28.5 → x_transformers-1.29.0}/setup.cfg +0 -0
- {x_transformers-1.28.5 → x_transformers-1.29.0}/x_transformers/__init__.py +0 -0
- {x_transformers-1.28.5 → x_transformers-1.29.0}/x_transformers/attend.py +0 -0
- {x_transformers-1.28.5 → x_transformers-1.29.0}/x_transformers/autoregressive_wrapper.py +0 -0
- {x_transformers-1.28.5 → x_transformers-1.29.0}/x_transformers/continuous.py +0 -0
- {x_transformers-1.28.5 → x_transformers-1.29.0}/x_transformers/dpo.py +0 -0
- {x_transformers-1.28.5 → x_transformers-1.29.0}/x_transformers/nonautoregressive_wrapper.py +0 -0
- {x_transformers-1.28.5 → x_transformers-1.29.0}/x_transformers/xl_autoregressive_wrapper.py +0 -0
- {x_transformers-1.28.5 → x_transformers-1.29.0}/x_transformers/xval.py +0 -0
- {x_transformers-1.28.5 → x_transformers-1.29.0}/x_transformers.egg-info/SOURCES.txt +0 -0
- {x_transformers-1.28.5 → x_transformers-1.29.0}/x_transformers.egg-info/dependency_links.txt +0 -0
- {x_transformers-1.28.5 → x_transformers-1.29.0}/x_transformers.egg-info/requires.txt +0 -0
- {x_transformers-1.28.5 → x_transformers-1.29.0}/x_transformers.egg-info/top_level.txt +0 -0
@@ -674,6 +674,55 @@ model = TransformerWrapper(
 )
 ```
 
+### Weight-tied Layers
+
+In the early days of the Cambrian explosion of BERT, a paper explored tying the weights across all of the layers; the model was named <a href="https://arxiv.org/abs/1909.11942">ALBERT</a>. You can use it by setting `weight_tie_layers = True`.
+
+```python
+import torch
+from x_transformers import TransformerWrapper, Encoder
+
+model = TransformerWrapper(
+    num_tokens = 20000,
+    max_seq_len = 1024,
+    attn_layers = Encoder(
+        dim = 512,
+        depth = 12,
+        weight_tie_layers = True   # set this to True to weight tie all the layers
+    )
+)
+```
+
+If you wish to do something more sophisticated, say 3 layers, with each layer applied recurrently 4 times before moving on to the next, that is possible as well.
+
+```python
+import torch
+from x_transformers import TransformerWrapper, Decoder
+
+model = TransformerWrapper(
+    num_tokens = 20000,
+    max_seq_len = 1024,
+    attn_layers = Decoder(
+        dim = 512,
+        custom_layers = (
+            'a', 'f',       # 3 sets of attention and feedforward
+            'a', 'f',
+            'a', 'f'
+        ),
+        layers_execute_order = (
+            *((0, 1) * 4),  # each block executed 4 times before passing on to the next, but you can probably imagine some more interesting configurations...
+            *((2, 3) * 4),
+            *((4, 5) * 4),
+        )
+    )
+)
+
+x = torch.randint(0, 256, (1, 1024))
+
+model(x) # (1, 1024, 20000)
+```
+
 ### Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View
 
 <img src="./images/macaron-1.png"></img>
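A quick way to sanity-check the weight tying shown in the README addition above is to compare unique parameter counts between a tied and an untied encoder. This is only a sketch: the `count_params` helper is hypothetical, and it assumes (as ALBERT-style tying implies) that the tied model reuses one attention/feedforward block rather than allocating twelve copies.

```python
import torch
from x_transformers import TransformerWrapper, Encoder

def count_params(module: torch.nn.Module) -> int:
    # hypothetical helper: Module.parameters() yields each tensor once,
    # so weight-tied (shared) parameters are only counted a single time
    return sum(p.numel() for p in module.parameters())

untied = TransformerWrapper(
    num_tokens = 20000,
    max_seq_len = 1024,
    attn_layers = Encoder(dim = 512, depth = 12)
)

tied = TransformerWrapper(
    num_tokens = 20000,
    max_seq_len = 1024,
    attn_layers = Encoder(dim = 512, depth = 12, weight_tie_layers = True)
)

# the tied model should report roughly one layer's worth of transformer weights
print(count_params(untied), count_params(tied))
```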
@@ -1001,7 +1001,7 @@ class AttentionLayers(Module):
     def __init__(
         self,
         dim,
-        depth,
+        depth = None,
         heads = 8,
         causal = False,
         cross_attend = False,
@@ -1054,6 +1054,8 @@ class AttentionLayers(Module):
         attn_kwargs, kwargs = groupby_prefix_and_trim('attn_', kwargs)
         cross_attn_kwargs, kwargs = groupby_prefix_and_trim('cross_attn_', kwargs)
 
+        assert len(kwargs) == 0, f'unrecognized kwargs passed in {kwargs.keys()}'
+
         dim_head = attn_kwargs.get('dim_head', DEFAULT_DIM_HEAD)
 
         self.dim = dim
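The new `assert len(kwargs) == 0` guard means that keyword arguments the attention layers do not recognize now fail loudly instead of being silently dropped. A minimal sketch of that behavior, using `weight_tie` as a deliberately misspelled (hypothetical) version of `weight_tie_layers`:

```python
from x_transformers import TransformerWrapper, Encoder

try:
    model = TransformerWrapper(
        num_tokens = 20000,
        max_seq_len = 1024,
        attn_layers = Encoder(
            dim = 512,
            depth = 6,
            weight_tie = True   # hypothetical typo for `weight_tie_layers`
        )
    )
except AssertionError as err:
    # expected to read something like: unrecognized kwargs passed in dict_keys(['weight_tie'])
    print(err)
```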
@@ -1138,9 +1140,12 @@ class AttentionLayers(Module):
 
         # setup weight tying, which is a special case of `layer_execute_order`
 
+        assert not (exists(layers_execute_order) and exists(custom_layers) and exists(depth)), 'depth should not be passed in if using custom layers and custom layer execution order'
+
         assert not (weight_tie_layers and any([*map(exists, (custom_layers, par_ratio, sandwich_coef))]))
 
         if weight_tie_layers:
+            assert exists(depth), 'depth must be passed in with `weight_tie_layers` = True'
             assert not exists(layers_execute_order)
             layers_execute_order = tuple(range(len(default_block))) * depth
             depth = 1
@@ -1164,6 +1169,7 @@ class AttentionLayers(Module):
             assert sandwich_coef > 0 and sandwich_coef <= depth, 'sandwich coefficient should be less than the depth'
             layer_types = ('a',) * sandwich_coef + default_block * (depth - sandwich_coef) + ('f',) * sandwich_coef
         else:
+            assert exists(depth), '`depth` must be passed in for `Decoder` or `Encoder`'
             layer_types = default_block * depth
 
         self.layer_types = layer_types
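Taken together, these `AttentionLayers.__init__` changes make `depth` optional when `custom_layers` and `layers_execute_order` describe the stack explicitly (as in the new README example), while still requiring it, now with explicit error messages, for plain `Encoder`/`Decoder` use and for `weight_tie_layers = True`. A rough sketch of both sides of that contract, reusing the arguments from the README example:

```python
from x_transformers import TransformerWrapper, Decoder, Encoder

# depth can now be omitted when the layer structure is spelled out explicitly
model = TransformerWrapper(
    num_tokens = 20000,
    max_seq_len = 1024,
    attn_layers = Decoder(
        dim = 512,
        custom_layers = ('a', 'f', 'a', 'f', 'a', 'f'),
        layers_execute_order = (*((0, 1) * 4), *((2, 3) * 4), *((4, 5) * 4))
    )
)

# a plain Encoder without depth should now trip the new assertion
try:
    Encoder(dim = 512)
except AssertionError as err:
    print(err)   # `depth` must be passed in for `Decoder` or `Encoder`

# and weight tying still needs depth to know how many times to repeat the block
try:
    Encoder(dim = 512, weight_tie_layers = True)
except AssertionError as err:
    print(err)   # depth must be passed in with `weight_tie_layers` = True
```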