tab2seq 0.1.2__tar.gz → 0.1.5__tar.gz

Files changed (44)
  1. {tab2seq-0.1.2/src/tab2seq.egg-info → tab2seq-0.1.5}/PKG-INFO +168 -110
  2. tab2seq-0.1.5/README.md +315 -0
  3. {tab2seq-0.1.2 → tab2seq-0.1.5}/pyproject.toml +8 -11
  4. tab2seq-0.1.5/src/tab2seq/__init__.py +18 -0
  5. tab2seq-0.1.5/src/tab2seq/cli.py +71 -0
  6. tab2seq-0.1.5/src/tab2seq/cohort/__init__.py +6 -0
  7. tab2seq-0.1.5/src/tab2seq/cohort/config.py +104 -0
  8. tab2seq-0.1.5/src/tab2seq/cohort/core.py +461 -0
  9. tab2seq-0.1.5/src/tab2seq/config.py +58 -0
  10. tab2seq-0.1.5/src/tab2seq/datasets/__init__.py +16 -0
  11. tab2seq-0.1.5/src/tab2seq/datasets/builder.py +706 -0
  12. tab2seq-0.1.5/src/tab2seq/datasets/config.py +59 -0
  13. {tab2seq-0.1.2 → tab2seq-0.1.5}/src/tab2seq/datasets/synthetic.py +16 -0
  14. tab2seq-0.1.5/src/tab2seq/loader.py +65 -0
  15. tab2seq-0.1.5/src/tab2seq/processor.py +52 -0
  16. {tab2seq-0.1.2 → tab2seq-0.1.5}/src/tab2seq/source/config.py +21 -1
  17. {tab2seq-0.1.2 → tab2seq-0.1.5}/src/tab2seq/source/core.py +2 -4
  18. tab2seq-0.1.5/src/tab2seq/tokenization/__init__.py +7 -0
  19. tab2seq-0.1.5/src/tab2seq/tokenization/config.py +25 -0
  20. tab2seq-0.1.5/src/tab2seq/tokenization/tokenizer.py +139 -0
  21. tab2seq-0.1.5/src/tab2seq/tokenization/vocabulary.py +359 -0
  22. {tab2seq-0.1.2 → tab2seq-0.1.5/src/tab2seq.egg-info}/PKG-INFO +168 -110
  23. tab2seq-0.1.5/src/tab2seq.egg-info/SOURCES.txt +38 -0
  24. {tab2seq-0.1.2 → tab2seq-0.1.5}/src/tab2seq.egg-info/requires.txt +1 -4
  25. tab2seq-0.1.5/tests/test_cli.py +153 -0
  26. tab2seq-0.1.5/tests/test_cohort.py +283 -0
  27. tab2seq-0.1.5/tests/test_config.py +86 -0
  28. tab2seq-0.1.5/tests/test_event_dataset_builder.py +225 -0
  29. tab2seq-0.1.5/tests/test_loader.py +102 -0
  30. tab2seq-0.1.5/tests/test_processor.py +113 -0
  31. tab2seq-0.1.5/tests/test_tokenizer.py +179 -0
  32. tab2seq-0.1.5/tests/test_vocabulary.py +159 -0
  33. tab2seq-0.1.2/README.md +0 -254
  34. tab2seq-0.1.2/src/tab2seq/__init__.py +0 -9
  35. tab2seq-0.1.2/src/tab2seq/datasets/__init__.py +0 -5
  36. tab2seq-0.1.2/src/tab2seq.egg-info/SOURCES.txt +0 -17
  37. {tab2seq-0.1.2 → tab2seq-0.1.5}/LICENSE +0 -0
  38. {tab2seq-0.1.2 → tab2seq-0.1.5}/setup.cfg +0 -0
  39. {tab2seq-0.1.2 → tab2seq-0.1.5}/src/tab2seq/source/__init__.py +0 -0
  40. {tab2seq-0.1.2 → tab2seq-0.1.5}/src/tab2seq/source/collection.py +0 -0
  41. {tab2seq-0.1.2 → tab2seq-0.1.5}/src/tab2seq.egg-info/dependency_links.txt +0 -0
  42. {tab2seq-0.1.2 → tab2seq-0.1.5}/src/tab2seq.egg-info/top_level.txt +0 -0
  43. {tab2seq-0.1.2 → tab2seq-0.1.5}/tests/test_datasets.py +0 -0
  44. {tab2seq-0.1.2 → tab2seq-0.1.5}/tests/test_source.py +0 -0
@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: tab2seq
- Version: 0.1.2
+ Version: 0.1.5
  Summary: Transform tabular event data into sequences ready for Transformer and Sequential models: Life2Vec, BEHRT and more.
  Author-email: Germans Savcisens <germans@savcisens.com>
  License: MIT
@@ -9,7 +9,7 @@ Project-URL: Documentation, https://tab2seq.readthedocs.io
  Project-URL: Repository, https://github.com/carlomarxdk/tab2seq
  Project-URL: Issues, https://github.com/carlomarxdk/tab2seq/issues
  Keywords: tokenization,data preprocessing,tabular data,transformer models,sequential models,life2vec
- Classifier: Development Status :: 3 - Alpha
+ Classifier: Development Status :: 4 - Beta
  Classifier: Intended Audience :: Science/Research
  Classifier: License :: OSI Approved :: MIT License
  Classifier: Programming Language :: Python :: 3
@@ -36,13 +36,10 @@ Requires-Dist: ruff>=0.15.0; extra == "dev"
  Requires-Dist: mypy>=1.19.0; extra == "dev"
  Requires-Dist: types-PyYAML>=6.0.0; extra == "dev"
  Provides-Extra: docs
- Requires-Dist: mkdocs>=1.6.1; extra == "docs"
- Requires-Dist: mkdocs-material>=9.7.1; extra == "docs"
- Requires-Dist: mkdocstrings>=1.0.2; extra == "docs"
+ Requires-Dist: zensical; extra == "docs"
  Requires-Dist: mkdocstrings-python>=2.0.0; extra == "docs"
  Requires-Dist: mkdocs-gen-files>=0.6.0; extra == "docs"
  Requires-Dist: mkdocs-literate-nav>=0.6.2; extra == "docs"
- Requires-Dist: mkdocs-section-index>=0.3.10; extra == "docs"
  Requires-Dist: mkdocs-bibtex>=4.4.0; extra == "docs"
  Provides-Extra: all
  Requires-Dist: tab2seq[dev,docs]; extra == "all"
@@ -54,86 +51,84 @@ Dynamic: license-file
  [![PyPI - Python Version](https://img.shields.io/pypi/pyversions/tab2seq)](https://pypi.org/project/tab2seq/)
  [![PyPI - Status](https://img.shields.io/pypi/status/tab2seq)](https://pypi.org/project/tab2seq/)
  [![GitHub License](https://img.shields.io/github/license/carlomarxdk/tab2seq)](https://github.com/carlomarxdk/tab2seq/blob/main/LICENSE)
+ [![DOI](https://zenodo.org/badge/1163020308.svg)](https://doi.org/10.5281/zenodo.18752504)
 
- **tab2seq** adapts the Life2Vec data processing pipeline to make it easy to work with multi-source tabular event data for sequential modeling projects. Transform registry data, EHR records, and other event-based datasets into formats ready for Transformer and sequential deep learning models.
+ **tab2seq** adapts the Life2Vec data processing pipeline to make it easy to work with multi-source tabular event data for sequential modeling projects. Transform registry data, EHR records, and other event-based datasets into tokenized sequences ready for Transformer and sequential deep learning models.
+ The package reimplements the data-preprocessing steps of the [life2vec](https://github.com/SocialComplexityLab/life2vec) and [life2vec-light](https://github.com/carlomarxdk/life2vec-light) repos.
 
- > [!WARNING]
- > This is an alpha package. In the beta version, it will reimplement all the data-preprocessing steps of the [life2vec](https://github.com/SocialComplexityLab/life2vec) and [life2vec-light](https://github.com/carlomarxdk/life2vec-light) repos. See [TODOs](#todos) to see what is implemented at this point.
+ > [!INFO]
+ > This is a **BETA** version of the package.
 
  ## About
 
  This package extracts and generalizes the data processing patterns from the [Life2Vec](https://github.com/SocialComplexityLab/life2vec) project, making them reusable for similar research projects that need to:
 
  - Work with multiple longitudinal data sources (registries, databases)
- - Define and filter cohorts based on complex criteria
+ - Define and filter cohorts based on inclusion criteria
+ - Create deterministic train/val/test splits with static context
+ - Fit a vocabulary on training data only (no leakage)
+ - Produce tokenized, model-ready event sequences with time features
  - Generate realistic synthetic data for development and testing
- - Process large-scale tabular event data efficiently
 
  Whether you're working with healthcare data, financial records, or any time-stamped event data, tab2seq provides the building blocks for preparing data for Life2Vec-style sequential models.
 
+ ## Pipeline Overview
+
+ ```
+ Sources → Cohort → Vocabulary → EventDataset → Model-ready Parquet
+ ```
+
+ 1. **Sources** – Define one `SourceConfig` per event table (health visits, labour records, income, etc.). Each config declares which columns are categorical, continuous, or timestamps.
+ 2. **Cohort** – Unite sources into a single entity universe, apply inclusion criteria, and split into train/val/test with deterministic seeds.
+ 3. **Vocabulary** – Fit token mappings and continuous-feature bin edges on the *train split only* to prevent leakage.
+ 4. **EventDataset** – Build tokenized event rows per split, derive relative-date features (e.g. age), and persist to Parquet with metadata.
+
  ## Features
 
  - **Multi-Source Data Management**: Handle multiple data sources (registries) with unified schema
+ - **Cohort Construction**: Entity-level inclusion criteria across sources, deterministic splits, static-attribute propagation
+ - **Train-Only Vocabulary**: Token and bin-edge fitting restricted to training entities
+ - **Tokenized Event Datasets**: Vectorized token-ID encoding, relative-date features, Parquet persistence
+ - **Entity Record Access**: Iterator, random sample, and stateful `next()` retrieval patterns for downstream training loops
  - **Type-Safe Configuration**: Pydantic-based configuration with YAML support
  - **Synthetic Data Generation**: Generate realistic dummy registry data for testing and exploration
  - **Memory-Efficient Loading**: Chunked iteration and lazy loading with Polars
- - **Schema Validation**: Automatic validation of entity IDs, timestamps, and column types
- - **Cross-Source Operations**: Unified access and operations across multiple data sources
 
  ## Installation
 
  ```bash
- # Basic installation
  pip install tab2seq
  ```
 
  ## Quick Start
 
- ### Working with a Single Source
+ The full pipeline from raw data to model-ready sequences in five steps.
+
+ ### 1. Generate Synthetic Data
 
  ```python
- from tab2seq.source import (
-     Source,
-     SourceConfig,
-     SourceCollection,
-     CategoricalColConfig,
-     ContinuousColConfig,
-     TimestampColConfig
- )
+ from tab2seq.datasets import generate_synthetic_data
+ import polars as pl
 
- config = SourceConfig(
-     name="health",
-     filepath="synthetic_data/health.parquet",
-     id_col="entity_id",
-     categorical_cols=[
-         CategoricalColConfig(col_name="diagnosis", prefix="DIAG"),
-         CategoricalColConfig(col_name="procedure", prefix="PROC"),
-         CategoricalColConfig(col_name="department", prefix="DEPT"),
-     ],
-     continuous_cols=[
-         ContinuousColConfig(col_name="cost", prefix="COST", n_bins=20, strategy="quantile"),
-         ContinuousColConfig(col_name="length_of_stay", prefix="LOS", n_bins=20, strategy="quantile"),
-     ],
-     output_format="parquet",
-     timestamp_cols=[
-         TimestampColConfig(col_name="date", is_primary=True, drop_na=True)
-     ]
+ data_paths = generate_synthetic_data(
+     output_dir="synthetic_data",
+     n_entities=10_000,
+     seed=742,
+     registries=["health", "labour"],
  )
-
- source = Source(config=config)
-
- # Process and tokenize the columns
- print("Number of unique IDs:", len(source.get_entity_ids()))
- lf_health = source.process(cache=True)
- lf_health.head()
+ pl.read_parquet(data_paths["health"]).head()
  ```
 
- ### Working with Multiple Sources
+ ### 2. Define Sources
+
+ Each `Source` describes one event table: its file path, ID column, timestamp, and feature columns.
 
  ```python
- from tab2seq.source import SourceCollection, SourceConfig, CategoricalColConfig, ContinuousColConfig, TimestampColConfig
+ from tab2seq.source import (
+     Source, SourceCollection, SourceConfig,
+     CategoricalColConfig, ContinuousColConfig, TimestampColConfig,
+ )
 
- # Define your data sources
  configs = [
      SourceConfig(
          name="health",
@@ -145,13 +140,12 @@ configs = [
              CategoricalColConfig(col_name="department", prefix="DEPT"),
          ],
          continuous_cols=[
-             ContinuousColConfig(col_name="cost", prefix="COST", n_bins=20, strategy="quantile"),
-             ContinuousColConfig(col_name="length_of_stay", prefix="LOS", n_bins=20, strategy="quantile"),
+             ContinuousColConfig(col_name="cost", prefix="COST", n_bins=20),
+             ContinuousColConfig(col_name="length_of_stay", prefix="LOS", n_bins=10),
          ],
-         output_format="parquet",
          timestamp_cols=[
-             TimestampColConfig(col_name="date", is_primary=True, drop_na=True)
-         ]
+             TimestampColConfig(col_name="date", is_primary=True, drop_na=True),
+         ],
      ),
      SourceConfig(
          name="labour",
@@ -161,99 +155,162 @@ configs = [
              CategoricalColConfig(col_name="status", prefix="STATUS"),
              CategoricalColConfig(col_name="occupation", prefix="OCC"),
              CategoricalColConfig(col_name="residence_region", prefix="REGION"),
+             CategoricalColConfig(col_name="native_language", prefix="LANG", static=True),
          ],
          continuous_cols=[
-             ContinuousColConfig(col_name="weekly_hours", prefix="WEEKLY_HOURS")
+             ContinuousColConfig(col_name="weekly_hours", prefix="WEEKLY_HOURS", n_bins=10),
          ],
-         output_format="parquet",
          timestamp_cols=[
              TimestampColConfig(col_name="date", is_primary=True, drop_na=True),
-             TimestampColConfig(col_name="birthday", is_primary=False, drop_na=True),
+             TimestampColConfig(col_name="birthday", static=True, drop_na=True),
          ],
      ),
  ]
 
- # Create a source collection
  collection = SourceCollection.from_configs(configs)
 
- # Access individual sources
- health = collection["health"]
- df = health.read_all()
-
- # Or iterate over all sources
  for source in collection:
      print(f"{source.name}: {len(source.get_entity_ids())} entities")
-
- # Cross-source operations
- all_entity_ids = collection.get_all_entity_ids()
  ```
 
- ### Generating Synthetic Data
+ > Columns marked `static=True` are carried through to the cohort split table as entity-level attributes (e.g. birthday, native language).
+
+ ### 3. Build a Cohort
+
+ A `Cohort` resolves one consistent entity universe across all sources, applies inclusion criteria, and generates deterministic train/val/test splits.
 
  ```python
- from tab2seq.datasets import generate_synthetic_data
- import polars as pl
+ from tab2seq.cohort import Cohort, CohortConfig, EntityInclusionCriteria
+
+ criteria = [
+     EntityInclusionCriteria(source_name="health", required=False),
+     EntityInclusionCriteria(source_name="labour", required=True, min_events=1),
+ ]
+
+ cohort = Cohort(
+     name="my_cohort",
+     sources=collection,
+     inclusion_criteria=criteria,
+     cache_dir="data/cohorts",
+ )
 
- # Generate synthetic registry data
- data_paths = generate_synthetic_data(output_dir="synthetic_data",
-                                      n_entities=10000,
-                                      seed=742,
-                                      registries=["health", "labour", "survey", "income"],
-                                      file_format="parquet")
+ entities_df = cohort.build_entities_table(force_recompute=True)
+ print(f"Cohort size: {len(cohort)} entities")
 
- lf_health = pl.read_parquet(data_paths["health"])
- lf_health.head()
+ split_cfg = CohortConfig(train_frac=0.7, val_frac=0.15, test_frac=0.15, seed=42)
+ split_df = cohort.build_or_load_splits(split_cfg, force_recompute=True)
+ split_df.head()
  ```
 
- ## Architecture
+ The split table contains one row per entity with the split label and all static columns.
 
- > [!warning]
- > Work in progress!
+ ### 4. Fit a Vocabulary (Train Only)
 
- **Available Registries:**
+ The vocabulary maps categorical values to token strings and bins continuous features—fitted exclusively on training entities to prevent leakage.
 
- - **health**: Medical events with diagnoses (ICD codes), procedures, departments, costs, and length of stay
- - **income**: Yearly income records with income type, sector, and amounts
- - **labour**: Quarterly labour status with occupation, employment status, and residence
- - **survey**: Periodic survey responses with education level, marital status, and satisfaction scores
+ ```python
+ from tab2seq.config import TokenizerConfig
+ from tab2seq.tokenization import Vocabulary
+
+ tok_cfg = TokenizerConfig()
+ tok_cfg.vocabulary.min_token_count = 1
+ tok_cfg.vocabulary.max_vocab_size = 50_000
+
+ vocab = Vocabulary(tok_cfg.vocabulary)
+ vocab_df = vocab.fit_from_cohort_train(
+     cohort=cohort,
+     split_config=split_cfg,
+     force_recompute=True,
+ )
+ print(f"Vocabulary size: {vocab_df.height}")
+ ```
 
- All synthetic data includes realistic temporal patterns, missing data, and correlations between fields to mimic real-world registry data.
+ ### 5. Build Tokenized Event Datasets
 
- ## Use Cases
+ `EventDataset` produces one row per event with integer token IDs, time features, and optional derived columns.
 
- - **Healthcare Research**: Transform electronic health records (EHR) into sequences for predictive modeling
- - **Registry Data Processing**: Work with multiple event-based registries (health, income, labour, surveys)
- - **Sequential Modeling**: Prepare multi-source data for Life2Vec, BEHRT, or other transformer-based models
- - **Data Pipeline Development**: Use synthetic data to develop and test processing pipelines before working with sensitive real data
- - **Multi-Source Analysis**: Combine and analyze data from multiple longitudinal sources with unified tooling
+ ```python
+ from tab2seq.datasets import EventDataset, EventDatasetConfig, RelativeDateRule
+
+ dataset_cfg = EventDatasetConfig(
+     reference_date="1970-01-01",
+     threshold_date="2021-01-01",
+     include_after_threshold=True,
+     include_token_str=True,
+     relative_date_features=[
+         RelativeDateRule(
+             source_static_column="labour__birthday",
+             output_column="age_years",
+             unit="years",
+         ),
+     ],
+ )
 
- ## Development
+ dataset = EventDataset(
+     cohort=cohort,
+     vocabulary=vocab,
+     split_config=split_cfg,
+     dataset_config=dataset_cfg,
+ )
 
- ```bash
- # Install development dependencies
- pip install -e ".[dev]"
+ # Inspect one split in memory
+ train_events = dataset.build_split("train", force_recompute_splits=True)
+ print(train_events.select(
+     ["entity_id", "source_name", "primary_timestamp", "token_ids", "age_years"]
+ ).head(5))
 
- # Run tests
- pytest
+ # Persist all splits + static table + metadata to Parquet
+ artifacts = dataset.write_parquet(force_recompute_splits=True)
+ print(artifacts.split_paths)
+ ```
 
- # Run tests with coverage
- pytest --cov=tab2seq --cov-report=html
+ ### Retrieving Entity Records
 
- # Format code
- black src/tab2seq tests
+ Three patterns for feeding records into a training loop:
 
- # Lint code
- ruff check src/tab2seq tests
+ ```python
+ # Full iterator sweep
+ for record in dataset.iter_entity_records(split="train", shuffle=True, seed=42):
+     # record = {"entity_id": ..., "split": ..., "static": {...}, "events": [...]}
+     pass
+
+ # Random sample
+ record = dataset.sample_entity_record(split="train", seed=7)
+
+ # Stateful next() — remembers position across calls
+ record = dataset.next_entity_record(split="train", shuffle=True, seed=0, reset=True)
+ while record is not None:
+     record = dataset.next_entity_record(split="train", shuffle=True, seed=0)
  ```
 
+ ## Synthetic Registries
+
+ `generate_synthetic_data` / `generate_synthetic_collections` create four registry-style tables with realistic temporal patterns, missing data, and cross-field correlations:
+
+ | Registry | Key columns |
+ |----------|------------|
+ | **health** | diagnosis, procedure, department, cost, length_of_stay |
+ | **income** | income_type, sector, income_amount |
+ | **labour** | status, occupation, weekly_hours, residence_region, birthday |
+ | **survey** | education_level, marital_status, self_rated_health, satisfaction_score |
+
+ ## Use Cases
+
+ - **Healthcare Research**: Transform electronic health records (EHR) into sequences for predictive modeling
+ - **Registry Data Processing**: Work with multiple event-based registries (health, income, labour, surveys)
+ - **Sequential Modeling**: Prepare multi-source data for Life2Vec, BEHRT, or other transformer-based models
+ - **Data Pipeline Development**: Use synthetic data to develop and test processing pipelines before working with sensitive real data
+
+
  ## TODOs
 
  - [x] Synthetic Datasets
  - [x] `Source` implementation
- - [ ] `Cohort` implementation
- - [ ] `Cohort` and data splits
- - [ ] `Tokenization` implementation
- - [ ] `Vocabulary` implementation
+ - [x] `Cohort` implementation
+ - [x] `Cohort` and data splits
+ - [x] `Tokenization` implementation
+ - [x] `Vocabulary` implementation
+ - [x] `EventDataset` builder
  - [x] Caching and chunking
  - [ ] Documentation
 
@@ -296,9 +353,10 @@ Contributions are welcome! Please open an issue or submit a pull request on [Git
 
  ## License
 
- MIT License - see [LICENSE](LICENSE) file for details.
+ MIT License: see [LICENSE](LICENSE) file for details.
 
  ## Support
 
  - 🐛 Issues: [GitHub Issues](https://github.com/carlomarxdk/tab2seq/issues)
  - 💬 Discussions: [GitHub Discussions](https://github.com/carlomarxdk/tab2seq/discussions)
+
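
The README text added in 0.1.5 stresses two ideas: deterministic train/val/test splits and fitting bin edges on the train split only. The sketch below illustrates both with the Python standard library. It is not tab2seq's implementation and does not call its API; `split_entities`, `fit_bin_edges`, `to_token`, and the toy `costs` data are hypothetical names introduced here for illustration.

```python
import random
from bisect import bisect_right
from statistics import quantiles


def split_entities(entity_ids, train_frac=0.7, val_frac=0.15, seed=42):
    """Assign each entity to train/val/test; the same seed gives the same split."""
    ids = sorted(entity_ids)          # sort first so input order does not matter
    random.Random(seed).shuffle(ids)  # local RNG, global random state untouched
    n_train = round(len(ids) * train_frac)
    n_val = round(len(ids) * val_frac)
    labels = {}
    for i, eid in enumerate(ids):
        if i < n_train:
            labels[eid] = "train"
        elif i < n_train + n_val:
            labels[eid] = "val"
        else:
            labels[eid] = "test"
    return labels


def fit_bin_edges(train_values, n_bins=4):
    """Quantile cut points computed from training values only (n_bins - 1 edges)."""
    return quantiles(train_values, n=n_bins)


def to_token(value, edges, prefix):
    """Map a continuous value to a token such as 'COST_2' using the fitted edges."""
    return f"{prefix}_{bisect_right(edges, value)}"


# Hypothetical toy data: one cost value per entity.
costs = {f"e{i}": float(i * 10) for i in range(20)}

labels = split_entities(costs.keys())
train_costs = [v for eid, v in costs.items() if labels[eid] == "train"]

# Edges come from the train split; val/test values are only transformed.
edges = fit_bin_edges(train_costs, n_bins=4)
tokens = {eid: to_token(v, edges, "COST") for eid, v in costs.items()}
```

Sorting the IDs before shuffling with a seeded local generator is one common way to make the split independent of input order; the package may well use a different mechanism.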