returnn 1.20240620.105009__tar.gz → 1.20240621.130142__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Potentially problematic release: this version of returnn might be problematic.
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/CHANGELOG.md +142 -9
- {returnn-1.20240620.105009/returnn.egg-info → returnn-1.20240621.130142}/PKG-INFO +1 -1
- returnn-1.20240621.130142/_setup_info_generated.py +2 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/datasets/distrib_files.py +1 -1
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/datasets/util/vocabulary.py +41 -5
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/util/bpe.py +91 -53
- {returnn-1.20240620.105009 → returnn-1.20240621.130142/returnn.egg-info}/PKG-INFO +1 -1
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_Util.py +132 -2
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/bpe-create-lexicon.py +1 -1
- returnn-1.20240620.105009/_setup_info_generated.py +0 -2
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/.editorconfig +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/.gitignore +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/.gitmodules +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/.kateconfig +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/CODEOWNERS +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/CONTRIBUTING.md +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/LICENSE +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/MANIFEST.in +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/README.rst +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/12AX.cluster_map +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/_setup_returnn_env.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-fwd.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-horovod-mpi.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-horovod-mpi.py.sh +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-horovod-mpi.sh +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-hyper-param-tuning.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-iter-dataset.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-list-devices.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-lua-torch-layer.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-pretrain.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-record-and-push-to-webserver.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-returnn-as-framework.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-rf-pt-benchmark.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-rf.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-rhn-enwik8.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-sprint-interface.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-tf-att-copy.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-tf-attention.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-tf-chunking-blstm.12ax.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-tf-contribrnn-lstm.12ax.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-tf-enc-dec.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-tf-hard-att-copy.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-tf-lstm-benchmark.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-tf-maxgradnorm-lstm.12ax.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-tf-native-lstm-lowmem.12ax.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-tf-native-lstm.12ax.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-tf-native-lstm2.12ax.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-tf-native-lstm2.12ax.tuned.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-tf-neural-transducer.12ax.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-tf-rec-explicit-lstm.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-tf-rec-explicit-rnn.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-tf-rec-self-att.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-tf-search-compiled-graph.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-tf-vanilla-lstm.12ax.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-timit-lstm-ctc.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-torch.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo-upd-mult-model.lstm.12ax.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/demo.sh +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/mdlstm/IAM/IAM_lines/a01-000u-00.png +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/mdlstm/IAM/IAM_lines/a01-007-04.png +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/mdlstm/IAM/IAM_lines/a01-007-06.png +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/mdlstm/IAM/README.txt +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/mdlstm/IAM/chars.txt +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/mdlstm/IAM/config_demo +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/mdlstm/IAM/config_fwd +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/mdlstm/IAM/config_real +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/mdlstm/IAM/create_IAM_dataset.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/mdlstm/IAM/decode.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/mdlstm/IAM/features/raw/demo.h5 +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/mdlstm/IAM/go.sh +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/mdlstm/IAM/lines.txt +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/mdlstm/IAM/split/eval.txt +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/mdlstm/IAM/split/train.txt +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/mdlstm/IAM/split/valid.txt +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/mdlstm/README.md +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/mdlstm/artificial/create_test_h5.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/mdlstm/artificial/forwardconfig +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/mdlstm/artificial/go.sh +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/mdlstm/artificial/trainconfig +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/mdlstm/artificial_rgb/create_test_h5.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/mdlstm/artificial_rgb/forwardconfig +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/mdlstm/artificial_rgb/go.sh +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/demos/mdlstm/artificial_rgb/trainconfig +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/pyproject.toml +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/requirements.txt +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/__main__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/__old_mod_loader__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/__setup__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/config.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/datasets/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/datasets/audio.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/datasets/basic.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/datasets/bundle_file.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/datasets/cached.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/datasets/cached2.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/datasets/generating.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/datasets/hdf.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/datasets/lm.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/datasets/map.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/datasets/meta.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/datasets/multi_proc.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/datasets/normalization_data.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/datasets/numpy_dump.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/datasets/raw_wav.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/datasets/sprint.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/datasets/stereo.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/datasets/util/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/datasets/util/feature_extraction.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/datasets/util/strings.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/engine/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/engine/base.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/engine/batch.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/__main__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/warp-rna/.git +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/warp-rna/.gitignore +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/warp-rna/LICENSE +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/warp-rna/README.md +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/warp-rna/aligner.gif +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/warp-rna/check.png +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/warp-rna/core.cu +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/warp-rna/core.h +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/warp-rna/core_cpu.cpp +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/warp-rna/pytorch_binding/LICENSE +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/warp-rna/pytorch_binding/MANIFEST.in +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/warp-rna/pytorch_binding/README.md +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/warp-rna/pytorch_binding/binding.cpp +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/warp-rna/pytorch_binding/core.cu +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/warp-rna/pytorch_binding/core.h +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/warp-rna/pytorch_binding/requirements.txt +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/warp-rna/pytorch_binding/setup.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/warp-rna/pytorch_binding/warp_rna/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/warp-rna/pytorch_binding/warp_rna/test.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/warp-rna/ref_rna.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/warp-rna/tensorflow_binding/setup.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/warp-rna/tensorflow_binding/src/warp_rna_op.cc +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/warp-rna/tensorflow_binding/src/warp_rna_op_kernel_tmpl.h +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/warp-rna/tensorflow_binding/warp_rna/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/WarpRna/warp-rna/test.cpp +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/graph_editor/README.md +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/graph_editor/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/graph_editor/edit.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/graph_editor/reroute.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/graph_editor/select.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/graph_editor/subgraph.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/graph_editor/transform.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/extern/graph_editor/util.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/forward_iface.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/_backend.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/_native/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/_native/backend.cpp +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/_native/backend.hpp +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/_native/module.cpp +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/_native/module.hpp +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/_native/py_utils.hpp +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/_native/tensor_ops.cpp +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/_native/tensor_ops.hpp +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/_numpy_backend.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/_random_journal.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/_utils.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/array_.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/attention.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/audio/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/audio/mel.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/audio/specaugment.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/backend.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/build_from_dict.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/cond.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/const.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/container.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/control_flow_ctx.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/conv.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/decoder/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/decoder/transformer.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/device.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/dims.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/dropout.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/dtype.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/encoder/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/encoder/base.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/encoder/conformer.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/gradient.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/graph.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/hooks.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/init.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/label_smoothing.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/linear.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/loop.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/loss.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/math_.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/matmul.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/module.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/normalization.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/parameter.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/piecewise_linear.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/rand.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/rec.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/reduce.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/run_ctx.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/signal.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/state.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/stepwise_scheduler.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/tensor_array.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/frontend/types.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/import_/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/import_/common.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/import_/git.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/import_/import_.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/learning_rate_control.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/log.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/native_op.cpp +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/native_op.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/pretrain.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/sprint/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/sprint/cache.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/sprint/control.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/sprint/error_signals.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/sprint/extern_interface.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/sprint/interface.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tensor/README.md +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tensor/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tensor/_dim_extra.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tensor/_tensor_extra.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tensor/_tensor_mixin_base.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tensor/_tensor_op_overloads.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tensor/control_flow_ctx.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tensor/dim.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tensor/marked_dim.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tensor/tensor.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tensor/tensor_dict.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tensor/utils.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/compat.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/data_pipeline.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/distributed.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/engine.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/frontend_layers/README.md +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/frontend_layers/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/frontend_layers/_backend.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/frontend_layers/_utils.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/frontend_layers/cond.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/frontend_layers/config_entry_points.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/frontend_layers/debug_eager_mode.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/frontend_layers/dims.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/frontend_layers/layer.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/frontend_layers/loop.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/frontend_layers/make_layer.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/frontend_layers/masked_computation.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/frontend_layers/parameter_assign.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/frontend_layers/prev_tensor_ref.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/frontend_low_level/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/frontend_low_level/_backend.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/horovod.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/hyper_param_tuning.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/layers/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/layers/base.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/layers/basic.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/layers/rec.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/layers/segmental_model.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/layers/signal_processing.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/layers/variable.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/native_op.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/network.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/sprint.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/updater.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/util/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/util/basic.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/util/data.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/util/gradient_checkpoint.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/util/ken_lm.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/tf/util/open_fst.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/torch/README.md +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/torch/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/torch/data/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/torch/data/extern_data.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/torch/data/pipeline.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/torch/data/queued_data_iter.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/torch/data/returnn_dataset_wrapper.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/torch/data/tensor_utils.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/torch/distributed.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/torch/engine.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/torch/frontend/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/torch/frontend/_backend.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/torch/frontend/_rand.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/torch/frontend/bridge.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/torch/frontend/raw_ops.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/torch/updater.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/torch/util/README.md +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/torch/util/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/torch/util/diagnose_gpu.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/torch/util/scaled_gradient.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/util/__init__.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/util/basic.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/util/better_exchook.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/util/debug.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/util/debug_helpers.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/util/file_cache.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/util/fsa.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/util/literal_py_to_pickle.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/util/math.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/util/multi_proc_non_daemonic_spawn.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/util/native_code_compiler.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/util/pprint.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/util/py-to-pickle.cpp +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/util/py_compat.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/util/py_ext_mod_compiler.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/util/result_with_reason.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/util/sig_proc.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/util/task_system.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/util/train_proc_manager.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn/util/watch_memory.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn.egg-info/SOURCES.txt +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn.egg-info/dependency_links.txt +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/returnn.egg-info/top_level.txt +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/rnn.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/setup.cfg +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/setup.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/DummySprintExec.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/PyCharm-inspection-profile.xml +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/PyCharm.idea/.gitignore +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/PyCharm.idea/.name +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/PyCharm.idea/codeStyleSettings.xml +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/PyCharm.idea/codeStyles/Project.xml +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/PyCharm.idea/codeStyles/codeStyleConfig.xml +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/PyCharm.idea/inspectionProfiles/Project_Default.xml +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/PyCharm.idea/inspectionProfiles/profiles_settings.xml +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/PyCharm.idea/misc.xml +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/PyCharm.idea/modules.xml +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/PyCharm.idea/returnn.iml +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/PyCharm.idea/scopes/scope_settings.xml +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/_set_num_threads1.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/_setup_returnn_env.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/_setup_test_env.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/bpe-unicode-demo.codes +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/bpe-unicode-demo.vocab +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/lexicon_opt.fst +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/lexicon_opt.isyms +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/lexicon_opt.jpg +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/lexicon_opt.osyms +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/lint_common.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/pycharm-inspect.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/pylint.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/returnn-as-framework.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/rf_utils.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/spelling.dic +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_Config.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_Dataset.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_Fsa.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_GeneratingDataset.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_HDFDataset.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_LearningRateControl.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_Log.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_MultiProcDataset.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_Pretrain.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_ResNet.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_SprintDataset.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_SprintInterface.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_TFEngine.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_TFNativeOp.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_TFNetworkLayer.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_TFNetworkRecLayer.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_TFNetworkSigProcLayer.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_TFUpdater.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_TFUtil.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_TF_determinism.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_TaskSystem.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_TaskSystem_SharedMem.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_TranslationDataset.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_demos.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_fork_exec.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_hdf_dump.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_rf_array.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_rf_attention.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_rf_base.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_rf_cond.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_rf_const.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_rf_container.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_rf_conv.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_rf_encoder_conformer.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_rf_gradient.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_rf_label_smoothing.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_rf_loop.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_rf_math.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_rf_normalization.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_rf_piecewise_linear.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_rf_rec.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_rf_reduce.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_rf_signal.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_tensor.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_tools.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_torch_dataset.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_torch_engine.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_torch_frontend.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tests/test_torch_internal_frontend.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/_setup_returnn_env.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/analyze-dataset-batches.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/bliss-collect-seq-lens.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/bliss-dump-text.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/bliss-get-segment-names.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/bliss-to-ogg-zip.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/calculate-word-error-rate.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/cleanup-old-models.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/collect-orth-symbols.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/collect-words.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/compile_native_op.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/compile_tf_graph.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/debug-dump-search-scores.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/debug-plot-search-scores.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/dump-dataset-raw-strings.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/dump-dataset.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/dump-forward-stats.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/dump-forward.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/dump-network-json.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/dump-pickle.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/extract_state_tying_from_dataset.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/get-attention-weights.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/get-best-model-epoch.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/hdf_dump.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/hdf_dump_translation_dataset.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/import-blocks-mt-model.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/import-t2t-mt-model.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/lattice_rescorer/.gitignore +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/lattice_rescorer/Makefile +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/lattice_rescorer/README.md +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/lattice_rescorer/example/README.md +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/lattice_rescorer/example/libs_list +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/lattice_rescorer/example/network.040/i600_m600_m600.sgd_b16_lr0_cl2.newbobabs.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/lattice_rescorer/example/network.040/i600_m600_m600.sgd_b16_lr0_cl2.newbobabs.keep_over_epoch.lstm2.config +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/lattice_rescorer/example/rescore_lattice.sh +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/lattice_rescorer/example/state_vars_list +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/lattice_rescorer/example/tensor_names_list +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/lattice_rescorer/file.h +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/lattice_rescorer/htklatticerescorer.cc +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/lattice_rescorer/htklatticerescorer.h +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/lattice_rescorer/main.cc +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/lattice_rescorer/rescorer.h +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/lattice_rescorer/vocabulary.cc +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/lattice_rescorer/vocabulary.h +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/tf_avg_checkpoints.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/tf_inspect_checkpoint.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/tf_inspect_summary_log.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/torch_avg_checkpoints.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/torch_export_to_onnx.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/torch_inspect_checkpoint.py +0 -0
- {returnn-1.20240620.105009 → returnn-1.20240621.130142}/tools/torch_inspect_checkpoint_and_opt.py +0 -0
CHANGELOG.md (+142 -9):

@@ -6,6 +6,139 @@ or any changes which could potentially break or change the behavior of existing
 This is intentionally kept short. For a full change log, just see the Git log.


+## 2024-06-07: `VariableDataset`
+
+Custom subdataset per subepoch, based on a user-provided function.
+Can be useful for an advanced training pipeline
+where you create some HDFs on-the-fly (e.g. for alignments)
+and then want to load them in later epochs.
+
+## 2024-05-28: [`DistributeFilesDataset`](https://github.com/rwth-i6/returnn/blob/master/returnn/datasets/distrib_files.py) ([PR #1521](https://github.com/rwth-i6/returnn/pull/1521), [issue #1519](https://github.com/rwth-i6/returnn/issues/1519))
+
+`DistributeFilesDataset` together with `FileCache`
+allows training on very large datasets which do not fit on the local disk.
+`DistributeFilesDataset` operates on a list of files,
+and for each sub-epoch it will only select a subset of the files,
+and `FileCache` will cache the files locally.
+
+It was also specifically designed with distributed training in mind.
+The distributed random_seed_offset method can be used,
+but sharding is also supported ([PR #1538](https://github.com/rwth-i6/returnn/pull/1538)).
+
+## 2023-11-09: `LearningRateControl` saves more meta info in learning-rate-file
+
+Like effective learning rate (after `dynamic_learning_rate`),
+training step, GPU, RETURNN version, etc.
+(Was extended a bit over time.)
+
+## 2024-01-09: [Train proc manager](https://github.com/rwth-i6/returnn/blob/master/returnn/util/train_proc_manager.py)
+
+Auto restart RETURNN on crashes under certain conditions
+(e.g. it must have trained at least one epoch successfully since the most recent restart).
+
+## 2023-12-30: PyTorch handle OOM for forwarding, auto-split batch
+
+This went through several iterations of approaches,
+stumbling through a number of CPython and PyTorch bugs,
+e.g. [CPython #113939](https://github.com/python/cpython/issues/113939),
+[PyTorch #18853](https://github.com/pytorch/pytorch/issues/18853),
+[PyTorch #27600](https://github.com/pytorch/pytorch/issues/27600).
+
+## 2023-12-23: PyTorch distributed training with param averaging
+
+In the `torch_distributed` config dict: Set `"reduce_type": "param"` and `"param_sync_step": ...`.
+
+## 2023-10-24: [`watch_memory`](https://github.com/rwth-i6/returnn/blob/master/returnn/util/watch_memory.py): watches memory of all procs
+
+## 2023-10-03: RETURNN frontend (RF) native helpers ([PR #1403](https://github.com/rwth-i6/returnn/pull/1403))
+
+## 2023-06-09: PyTorch distributed training ([PR #1335](https://github.com/rwth-i6/returnn/pull/1335), [issue #1332](https://github.com/rwth-i6/returnn/issues/1332))
+
+`torch_distributed` config setting.
+Using the official PyTorch `DistributedDataParallel`, i.e. synchronized accumulated gradients.
+Each worker uses a different `random_seed_offset` for the dataset.
+
+## 2023-05-15: PyTorch automatic mixed precision (AMP) support ([PR #1322](https://github.com/rwth-i6/returnn/pull/1322))
+
+## 2023-04-03: PyTorch `preload_from_files` support ([PR #1292](https://github.com/rwth-i6/returnn/pull/1292))
+
+## 2023-03-26: [`MultiProcDataset`](https://github.com/rwth-i6/returnn/blob/master/returnn/datasets/multi_proc.py)
+
+## 2023-02-24: Make [`Tensor` and `Dim`](https://returnn.readthedocs.io/en/latest/getting_started/data.html) backend independent ([PR #1261](https://github.com/rwth-i6/returnn/pull/1261), [issue #1165](https://github.com/rwth-i6/returnn/issues/1165))
+
+* Rename `Data` to `Tensor`, `DimensionTag` to `Dim`.
+* Before, in our `Tensor`, the `placeholder` (now `raw_tensor`) was either None (as a template)
+  or a TensorFlow tensor (`tf.Tensor`).
+  Now it can support any raw tensor type.
+* Now `Tensor` and `Dim` are moved to `returnn.tensor`.
+
+## 2023-02-20: RETURNN frontend (RF) ([issue #1120](https://github.com/rwth-i6/returnn/issues/1120), [issue #1264](https://github.com/rwth-i6/returnn/issues/1264))
+
+Modern alternative to the network dictionary to define models.
+Using Python code to define the network,
+very similar to how it is done in PyTorch or Keras or Flax.
+
+This evolved from [`returnn_common.nn`](https://github.com/rwth-i6/returnn_common/tree/main/nn) ([example](https://github.com/rwth-i6/returnn_common/wiki/RETURNN-example-config)),
+which already provided a very similar API.
+But now, we build it such that we support multiple backends.
+Specifically, the currently supported (or planned) backends:
+
+* PyTorch (fully supported)
+* RETURNN network dictionary (TensorFlow) (fully supported)
+  (copied the `returnn_common.nn` code)
+  (this might be deprecated in the future)
+* TensorFlow (directly) (mostly supported)
+* NumPy (partially supported)
+* JAX (planned)
+
+## 2023-02-03: Use `black`, drop Python 2 support ([PR #1255](https://github.com/rwth-i6/returnn/pull/1255), [issue #487](https://github.com/rwth-i6/returnn/issues/487), [issue #1158](https://github.com/rwth-i6/returnn/issues/1158))
+
+## 2022-10-24: Remove Theano backend ([PR #1164](https://github.com/rwth-i6/returnn/pull/1164))
+
+## 2022-09-12: PyTorch backend started ([issue #1120](https://github.com/rwth-i6/returnn/issues/1120))
+
+This evolved over time.
+It was planned from the beginning
+to support pure PyTorch models defined by the user
+but also RETURNN frontend (RF) models.
+
+## 2022-04-24: TF eager execution initial support
+
+## 2022-02-11: TF loss auto-flatten optimization ([PR #906](https://github.com/rwth-i6/returnn/pull/906))
+
+## 2021-09-12: TF generalized attention, `CumConcatLayer` ([PR #589](https://github.com/rwth-i6/returnn/pull/589), [issue #391](https://github.com/rwth-i6/returnn/issues/391))
+
+Generalizes `SelfAttentionLayer` to allow for more custom variants.
+The difficulties were to support this when being inside a `RecLayer`
+and then when the optimization would move it outside the loop.
+For this, we introduced `CumConcatLayer`.
+[Example config for decoder self-attention](https://github.com/rwth-i6/returnn/issues/391#issuecomment-917517032).
+This can also be used just in the encoder, i.e. outside a `RecLayer` anyway,
+via `ReinterpretDataLayer` to create a new dim tag.
+[Example config for encoder self-attention](https://github.com/rwth-i6/returnn/issues/391#issuecomment-919873563).
+
+## 2021-08-25: Explicit `Data` dimension tags ([PR #579](https://github.com/rwth-i6/returnn/pull/579))
+
+`Data` (later called `Tensor`) has `dim_tags` (later called `dims`)
+to describe the full shape, i.e. the dims of each axis.
+These are `DimensionTag` (later `Dim`) objects.
+Before this change, we already had dim tags but only for dynamic dims.
+Now they are consistently used for all dims.
+This makes everything more consistent and more in line
+with other named tensors / named dimensions frameworks.
+
+## 2021-06-11: [Behavior versions](https://returnn.readthedocs.io/en/latest/configuration_reference/behavior_version.html) ([PR #534](https://github.com/rwth-i6/returnn/pull/534), [issue #508](https://github.com/rwth-i6/returnn/issues/508))
+
+Setting `behavior_version` in the config controls the behavior of RETURNN
+and allows updating bad/buggy/broken behavior without changing behavior for existing setups.
+
+## 2021-06-04: Start of [`returnn_common.nn`](https://github.com/rwth-i6/returnn_common/tree/main/nn)
+
+Allows defining the RETURNN network dictionary using a more modern Python API,
+very similar to PyTorch or Keras.
+[Example](https://github.com/rwth-i6/returnn_common/wiki/RETURNN-example-config).
+Note that this later got merged into RETURNN frontend (RF).
+
 ## 2021-03-18: Subnetwork sub layer can be independent ([#473](https://github.com/rwth-i6/returnn/pull/473))

 This has an effect on recurrent subnetworks.
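The 2023-12-23 entry above names the two relevant config keys but gives no concrete values. A minimal config sketch, assuming an arbitrary sync interval (the step count is illustrative, not a recommendation from the changelog):

```python
# Sketch of the RETURNN config dict for distributed training with parameter
# averaging, per the 2023-12-23 changelog entry above. The step count is an
# arbitrary example value.
torch_distributed = {
    "reduce_type": "param",   # average parameters across workers
    "param_sync_step": 100,   # synchronize every 100 training steps
}
```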
@@ -103,25 +236,25 @@ This will show the same information as before, but much more compact,
 and also in addition the dimension tags (`DimensionTag`),
 which also got improved in many further cases.

-## 2019-08-07:
+## 2019-08-07: Overlay nets (`extra_nets`)

 You can have e.g. multiple additional networks which redefine
 existing layers (they would automatically share params),
 which can use different flags (e.g. enable the search flag).

-## 2019-07:
+## 2019-07: Multiple stochastic (latent) variables

 It was designed to support this from the very beginning,
 but the implementation was never fully finished for this.
 Now examples like hard attention work.

-## 2019-05:
+## 2019-05: Better support for RETURNN as a framework

 `pip install returnn`, and then `import returnn`.

-## 2019-03-29:
+## 2019-03-29: Remove hard Theano dependency

-## 2019-03-24 and ongoing:
+## 2019-03-24 and ongoing: Automatic linter checks

 Currently pylint and PyCharm inspection checks automatically run in Travis.
 Both have some false positives, but so far the PyCharm inspections seems much more sane.
@@ -180,16 +313,16 @@ and then later `SplitBatchTimeLayer` to get the time-axis back, it was likely in

 ## 2019-01-30: video: RETURNN overview

-## 2018-08:
+## 2018-08: Multi-GPU support via [Horovod](https://github.com/horovod/horovod)

-## 2017-05:
+## 2017-05: Flexible `RecLayer`, encoder-decoder attention, beam search (Albert Zeyer)

-## 2016-12:
+## 2016-12: Start on [TensorFlow](https://www.tensorflow.org/) support (Albert Zeyer)

 Initial working support already finished within that month.
 TF 0.12.0.

-## 2015-07:
+## 2015-07: Fast CUDA LSTM kernel (Paul Voigtlaender)
 ## 2015-03: `SprintDataset`, interface to [RASR](https://www-i6.informatik.rwth-aachen.de/rwth-asr/) (Albert Zeyer)
 ## 2015-01: Albert Zeyer joined
 ## ~2013-2014 (?): Patrick Doetsch started the project (Theano)
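The `DistributeFilesDataset` changelog entry above stays at the prose level. A rough config sketch follows; the option names `files`, `get_sub_epoch_dataset`, and `partition_epoch` are assumptions based on PR #1521, `distrib_shard_files` is the option handled in the hunk below, and all paths and numbers are placeholders:

```python
# Rough sketch, not the authoritative API: "files", "get_sub_epoch_dataset" and
# "partition_epoch" are assumed option names; "distrib_shard_files" is the option
# touched by the hunk below. Paths and numbers are placeholders.
def get_sub_epoch_dataset(files_subepoch):
    """Build the dataset dict for one sub-epoch from the selected subset of files."""
    return {"class": "HDFDataset", "files": files_subepoch}

train = {
    "class": "DistributeFilesDataset",
    "files": ["data/part-%03d.hdf" % i for i in range(100)],  # full (large) file list
    "get_sub_epoch_dataset": get_sub_epoch_dataset,
    "partition_epoch": 20,        # split the file list over 20 sub-epochs
    "distrib_shard_files": True,  # shard files across distributed workers (PR #1538)
}
```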
returnn/datasets/distrib_files.py (+1 -1):

@@ -161,7 +161,7 @@ class DistributeFilesDataset(CachedDataset2):
         self._file_sizes: Optional[Dict[str, int]] = None  # key -> size. for equal distribution across sub epochs
         self._data_keys: Optional[List[str]] = None
         self._num_seqs: Optional[int] = None
-        self._shard_index, self._num_shards = _get_rank_and_size() if distrib_shard_files else 0, 1
+        self._shard_index, self._num_shards = _get_rank_and_size() if distrib_shard_files else (0, 1)

         self._file_cache: Optional[_FileCacheProc] = None
         self._workers: Dict[int, _WorkerProcParent] = {}  # epoch -> worker
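The one-line change above fixes an operator-precedence bug: the conditional expression binds tighter than the trailing `, 1`, so without parentheses the right-hand side always parses as the pair `((... if ... else 0), 1)`, and with sharding enabled `_shard_index` would receive the whole `(rank, size)` tuple. A standalone illustration in plain Python (not RETURNN code):

```python
# Plain-Python illustration of the precedence issue fixed in the hunk above.
def get_rank_and_size():
    return 3, 8  # e.g. this worker is rank 3 of 8

flag = True

# Old form: parses as ((get_rank_and_size() if flag else 0), 1).
a, b = get_rank_and_size() if flag else 0, 1
print(a, b)  # -> (3, 8) 1   (a is a tuple, b is always 1)

# Fixed form: both branches of the conditional yield a pair, which is then unpacked.
a, b = get_rank_and_size() if flag else (0, 1)
print(a, b)  # -> 3 8
```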
@@ -19,6 +19,7 @@ import sys
 import numpy

 from returnn.log import log
+from returnn.util.basic import NotSpecified


 class Vocabulary(object):
@@ -327,14 +328,48 @@ class SamplingBytePairEncoding(Vocabulary):
     This will encode the text on-the-fly with BPE.
     """

-    def __init__(
+    def __init__(
+        self,
+        vocab_file: str,
+        breadth_prob: float,
+        seq_postfix: Optional[List[int]] = None,
+        label_postfix_merge_symbol: Optional[str] = NotSpecified,
+        word_prefix_symbol: Optional[str] = NotSpecified,
+        **kwargs,
+    ):
         """
-        :param
-        :param
-        :param
+        :param vocab_file:
+        :param breadth_prob:
+        :param seq_postfix: labels will be added to the seq in self.get_seq
+        :param label_postfix_merge_symbol: If given, will use this as label postfix merge symbol,
+            i.e. when this occurs at the end of a label, it is supposed to be merged with the next label,
+            i.e. the space between them is removed and is not a word boundary.
+            If None, will not use any postfix merge symbol.
+            If not specified, and also word_prefix_symbol is not specified, will use "@@" by default here,
+            the standard from subword-nmt, and our original behavior.
+        :param word_prefix_symbol: If given, every new word starts with this symbol.
+            This also implies that there are no spaces between words
+            and this symbol is a placeholder for the space.
+            If None, will not use this logic.
+            For SentencePiece, you usually would use "▁" here.
         """
         super(SamplingBytePairEncoding, self).__init__(vocab_file=vocab_file, seq_postfix=seq_postfix, **kwargs)
-        from returnn.util.bpe import SamplingBytePairEncoder
+        from returnn.util.bpe import SamplingBytePairEncoder, BpePostMergeSymbol, BpeOpts
+
+        if label_postfix_merge_symbol is NotSpecified and word_prefix_symbol is NotSpecified:
+            label_postfix_merge_symbol = BpePostMergeSymbol
+            word_prefix_symbol = None
+        else:
+            if label_postfix_merge_symbol is NotSpecified:
+                label_postfix_merge_symbol = None
+            if word_prefix_symbol is NotSpecified:
+                word_prefix_symbol = None
+            if word_prefix_symbol is not None:
+                # I'm not sure if this makes sense otherwise...
+                assert label_postfix_merge_symbol is None, (
+                    f"{self}: word_prefix_symbol {word_prefix_symbol},"
+                    f" label_postfix_merge_symbol {label_postfix_merge_symbol}"
+                )

         self.rnd = numpy.random.RandomState(0)
         self.bpe = SamplingBytePairEncoder(
@@ -342,6 +377,7 @@ class SamplingBytePairEncoding(Vocabulary):
             breadth_prob=breadth_prob,
             rnd=self.rnd,
             unknown_label=self.id_to_label(self.unknown_label_id) if self.unknown_label_id is not None else None,
+            opts=BpeOpts(label_postfix_merge_symbol=label_postfix_merge_symbol, word_prefix_symbol=word_prefix_symbol),
         )

     def set_random_seed(self, seed):
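The new `label_postfix_merge_symbol` / `word_prefix_symbol` parameters let `SamplingBytePairEncoding` handle both subword-nmt-style ("@@" postfix) and SentencePiece-style ("▁" prefix) vocabularies: leaving both unspecified keeps the previous "@@" behavior, and setting a word prefix symbol requires the postfix merge symbol to be None (the assert above). A hedged sketch of constructing the vocab directly; the vocab file names are placeholders, not from this release:

# Hedged sketch of the extended SamplingBytePairEncoding options; vocab file names are placeholders.
from returnn.datasets.util.vocabulary import SamplingBytePairEncoding

# Unchanged default: subword-nmt convention, labels ending in "@@" merge with the next label.
vocab_nmt = SamplingBytePairEncoding(vocab_file="bpe.vocab", breadth_prob=0.1, unknown_label=None)

# SentencePiece-style labels: every new word starts with "▁", no "@@" merge symbol.
vocab_spm = SamplingBytePairEncoding(
    vocab_file="spm.vocab", breadth_prob=0.1, unknown_label=None, word_prefix_symbol="▁"
)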
@@ -4,6 +4,7 @@ Provide basic Byte-Pair-Encoding (BPE) utilities.

 from __future__ import annotations
 from typing import Optional, List, Dict, Callable
+from dataclasses import dataclass
 import re
 import numpy

@@ -227,70 +228,69 @@ class StandardBytePairEncoder:
         return output


+@dataclass
+class BpeOpts:
+    """
+    Options, should allow for both subword-nmt BPE and SentencePiece BPE/Unigram.
+    """
+
+    label_postfix_merge_symbol: Optional[str] = None  # eg. "@@"
+    word_prefix_symbol: Optional[str] = None  # eg. "▁"
+
+
 class PrefixTree:
     """
     Prefix tree / trie.
     This class represents both a single node and the tree.
     """

-    def __init__(self, prefix: str = "",
+    def __init__(self, *, prefix: str = "", opts: BpeOpts):
         """
-        :param prefix:
-        :param root:
+        :param prefix: if this is not the root, the prefix to get here
         """
         self.prefix = prefix
         self.arcs: Dict[str, PrefixTree] = {}  # single char (or BpePostMergeSymbol) -> sub tree
-        self.finished = False  #
+        self.finished = False  # label finished here
         self.bpe_finished = False  # partial word finished here with BpePostMergeSymbol at end
-        self.
-        self.root = root
+        self.opts = opts

-    def add(self, postfix: str
+    def add(self, postfix: str) -> PrefixTree:
         """
         :param postfix:
-        :param root:
         """
-        if not root:
-            if self.is_root:
-                root = self
-            else:
-                assert self.root
-                root = self.root
         self_ = self
+        opts = self.opts
+        label_postfix_merge_symbol = self.opts.label_postfix_merge_symbol
         while True:
-            if
+            if self_ is self and opts.word_prefix_symbol and postfix.startswith(opts.word_prefix_symbol):
+                arc = opts.word_prefix_symbol
+            elif postfix == label_postfix_merge_symbol:
                 arc = postfix
-                postfix_ = ""
             else:
                 arc = postfix[:1]
-
+            postfix_ = postfix[len(arc) :]
             if arc in self_.arcs:
                 child = self_.arcs[arc]
             else:
-                child = PrefixTree(
+                child = PrefixTree(prefix=self_.prefix + arc, opts=opts)
                 self_.arcs[arc] = child
-            if arc ==
+            if arc == label_postfix_merge_symbol and not postfix_:
                 self_.bpe_finished = True
-            if postfix_:
-                self_ = child
-                postfix = postfix_
-            else:
+            if not postfix_:
                 child.finished = True
                 return child
+            self_ = child
+            postfix = postfix_


+@dataclass
 class Hyp:
     """
     Represents a hypothesis in the search.
     """

-
-
-        :param list[str] bpe_sym_history:
-        :param PrefixTree cur_node:
-        """
-        self.bpe_sym_history = bpe_sym_history
-        self.cur_node = cur_node
+    bpe_sym_history: List[str]
+    cur_node: PrefixTree


 class CharSyncSearch:
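The `PrefixTree` now carries its `BpeOpts`: a word-prefix symbol becomes a single arc at the root, while a label ending in the postfix merge symbol still marks its parent node as `bpe_finished`. A small hedged sketch of the new API with a made-up toy vocab:

# Toy sketch of the PrefixTree/BpeOpts API from the diff above; the labels are made up.
from returnn.util.bpe import PrefixTree, BpeOpts, BpePostMergeSymbol

# subword-nmt convention: "he@@" is a word-internal piece that merges with the next label.
tree = PrefixTree(opts=BpeOpts(label_postfix_merge_symbol=BpePostMergeSymbol))
for label in ["he@@", "llo", "hello"]:
    tree.add(label)
assert tree.arcs["h"].arcs["e"].bpe_finished  # "he@@" ends here; the word continues

# SentencePiece convention: "▁" starts each new word and is stored as one arc at the root.
spm_tree = PrefixTree(opts=BpeOpts(word_prefix_symbol="▁"))
for label in ["▁he", "llo", "▁hello"]:
    spm_tree.add(label)
assert "▁" in spm_tree.arcs and spm_tree.arcs["▁"].arcs["h"].arcs["e"].finished  # "▁he"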
@@ -305,9 +305,15 @@ class CharSyncSearch:
         :param word_pos:
         """
         self.bpe = bpe
+        self.opts = bpe.opts
         self.word = word
         self.word_pos = word_pos
-        self.hyps: List[Hyp] = [
+        self.hyps: List[Hyp] = [
+            Hyp(
+                bpe_sym_history=[],
+                cur_node=bpe if self.opts.word_prefix_symbol is None else bpe.arcs[self.opts.word_prefix_symbol],
+            )
+        ]
         self.final_bpe_seqs: Optional[List[List[str]]] = None

     def _get_finished(self):
@@ -329,10 +335,17 @@ class CharSyncSearch:
                 if next_node:
                     new_hyps.append(
                         Hyp(
-                            bpe_sym_history=hyp.bpe_sym_history
+                            bpe_sym_history=hyp.bpe_sym_history
+                            + [hyp.cur_node.prefix + self.opts.label_postfix_merge_symbol],
                             cur_node=next_node,
                         )
                     )
+            if self.opts.word_prefix_symbol is not None and hyp.cur_node.finished:
+                next_node = self.bpe.arcs.get(char)
+                if next_node:
+                    new_hyps.append(
+                        Hyp(bpe_sym_history=hyp.bpe_sym_history + [hyp.cur_node.prefix], cur_node=next_node)
+                    )
             next_node = hyp.cur_node.arcs.get(char)
             if next_node:
                 new_hyps.append(Hyp(bpe_sym_history=hyp.bpe_sym_history, cur_node=next_node))
@@ -349,20 +362,15 @@ class CharSyncSearch:
         return self.final_bpe_seqs


+@dataclass
 class HypInPos:
     """
     Represents a hypothesis in the search.
     """

-
-
-
-        :param PrefixTree cur_node:
-        :param int pos:
-        """
-        self.bpe_sym_history = bpe_sym_history
-        self.cur_node = cur_node
-        self.pos = pos
+    bpe_sym_history: List[str]
+    cur_node: PrefixTree
+    pos: int


 class DepthFirstSearch:
@@ -377,11 +385,18 @@ class DepthFirstSearch:
         :param sampler:
         """
         self.bpe = bpe
+        self.opts = bpe.opts
         self.word = word
         self.sampler = sampler
         self.hyps: List[HypInPos] = []
         self.final_bpe_seq: Optional[List[str]] = None
-        self._add_hyp(
+        self._add_hyp(
+            HypInPos(
+                bpe_sym_history=[],
+                cur_node=bpe if self.opts.word_prefix_symbol is None else bpe.arcs[self.opts.word_prefix_symbol],
+                pos=0,
+            )
+        )

     def _add_hyp(self, hyp: HypInPos):
         if hyp.pos >= len(self.word):
@@ -402,16 +417,26 @@ class DepthFirstSearch:
                 if next_node:
                     new_hyps.append(
                         HypInPos(
-                            bpe_sym_history=hyp.bpe_sym_history
+                            bpe_sym_history=hyp.bpe_sym_history
+                            + [hyp.cur_node.prefix + self.opts.label_postfix_merge_symbol],
                             cur_node=next_node,
                             pos=hyp.pos + 1,
                         )
                     )
+            if self.opts.word_prefix_symbol is not None and hyp.cur_node.finished:
+                next_node = self.bpe.arcs.get(char)
+                if next_node:
+                    new_hyps.append(
+                        HypInPos(
+                            bpe_sym_history=hyp.bpe_sym_history + [hyp.cur_node.prefix], cur_node=next_node, pos=hyp.pos + 1
+                        )
+                    )
             next_node = hyp.cur_node.arcs.get(char)
             if next_node:
                 new_hyps.append(HypInPos(bpe_sym_history=hyp.bpe_sym_history, cur_node=next_node, pos=hyp.pos + 1))

             # Note that the order we check them will make this a depth-first or breadth-first search.
+            # We pop(-1) from the hyps list. So the last entry we add here will be the next one we expand.
             if self.sampler and self.sampler():
                 new_hyps = list(reversed(new_hyps))
             for hyp in new_hyps:
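The added comment documents the search-order mechanics: hypotheses live on a stack that is expanded with `pop(-1)`, so the last candidate appended is the next one explored, and the sampler (driven by `breadth_prob`, see the `SamplingBytePairEncoder` docstring below) decides per step whether to reverse the candidates and thus which alternative split is tried first. A tiny generic illustration of that LIFO behavior, not RETURNN code:

# Generic illustration of the pop(-1) ordering described in the added comment; not RETURNN code.
hyps = []  # used as a stack of open hypotheses
new_hyps = ["candidate_a", "candidate_b"]  # alternatives found while expanding one hypothesis

reverse = False  # in the real code the sampler decides this, with probability breadth_prob
if reverse:
    new_hyps = list(reversed(new_hyps))
hyps.extend(new_hyps)

next_hyp = hyps.pop(-1)
print(next_hyp)  # "candidate_b": the last entry added is expanded next; reversing would pick "candidate_a"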
@@ -431,13 +456,22 @@ class SamplingBytePairEncoder:
     Will randomly sample from any possible BPE split.
     """

-    def __init__(
-
-
-        :
-
-        :
-        :
+    def __init__(
+        self,
+        *,
+        labels: List[str],
+        breadth_prob: float,
+        rnd: numpy.random.RandomState,
+        unknown_label: Optional[str] = None,
+        opts: BpeOpts,
+    ):
+        """
+        :param labels: vocab
+        :param breadth_prob: 1.0 will lead to breadth-first search, 0.0 to depth-first search.
+            other values are stochastic.
+        :param rnd:
+        :param unknown_label:
+        :param opts:
         """
         self.labels = labels
         self.unknown_label = unknown_label
@@ -445,9 +479,10 @@ class SamplingBytePairEncoder:
             assert unknown_label in self.labels
         self.breadth_prob = breadth_prob
         self.rnd = rnd
+        self.opts = opts

         # build prefix tree
-        bpe = PrefixTree()
+        bpe = PrefixTree(opts=opts)
         for bpe_sym in labels:
             bpe.add(bpe_sym)
         self._bpe_prefix_tree = bpe
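`SamplingBytePairEncoder` now takes keyword-only arguments and a required `BpeOpts`, exactly as `SamplingBytePairEncoding` above and `_demo()` below wire it up. A hedged sketch of direct construction with a made-up vocab:

# Hedged sketch of the new keyword-only SamplingBytePairEncoder constructor; the vocab is made up.
import numpy
from returnn.util.bpe import SamplingBytePairEncoder, BpeOpts, BpePostMergeSymbol

labels = ["he@@", "llo", "hell@@", "o", "hello"]
bpe = SamplingBytePairEncoder(
    labels=labels,
    breadth_prob=0.5,  # 1.0 behaves breadth-first, 0.0 depth-first; values in between are stochastic
    rnd=numpy.random.RandomState(42),
    unknown_label=None,
    opts=BpeOpts(label_postfix_merge_symbol=BpePostMergeSymbol),
)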
@@ -514,9 +549,10 @@ def _demo():

     vocab = Vocabulary(vocab_file=args.vocab, unknown_label=None)
     rnd = numpy.random.RandomState(args.seed)
+    opts = BpeOpts(label_postfix_merge_symbol=BpePostMergeSymbol)

     if args.input:
-        bpe_prefix_tree = PrefixTree()
+        bpe_prefix_tree = PrefixTree(opts=opts)
         for bpe_sym in vocab.labels:
             bpe_prefix_tree.add(bpe_sym)

@@ -533,7 +569,9 @@ def _demo():
             print("%s: %s" % (word, " ".join(greedy)))
         return

-    bpe = SamplingBytePairEncoder(
+    bpe = SamplingBytePairEncoder(
+        labels=vocab.labels, breadth_prob=args.breadth_prob, rnd=rnd, unknown_label=args.unk, opts=opts
+    )
     print("Reading from stdin:")
     while True:
         try: