lyndon-words 0.2.0__tar.gz
- lyndon_words-0.2.0/.github/ISSUE_TEMPLATE/bug_report.md +23 -0
- lyndon_words-0.2.0/.github/ISSUE_TEMPLATE/feature_request.md +14 -0
- lyndon_words-0.2.0/.github/workflows/ci.yml +22 -0
- lyndon_words-0.2.0/.gitignore +18 -0
- lyndon_words-0.2.0/CHANGELOG.md +36 -0
- lyndon_words-0.2.0/CLAUDE.md +50 -0
- lyndon_words-0.2.0/CODE_OF_CONDUCT.md +37 -0
- lyndon_words-0.2.0/CONTRIBUTING.md +33 -0
- lyndon_words-0.2.0/LICENSE +21 -0
- lyndon_words-0.2.0/PKG-INFO +184 -0
- lyndon_words-0.2.0/README.md +134 -0
- lyndon_words-0.2.0/SECURITY.md +18 -0
- lyndon_words-0.2.0/assets/logo.png +0 -0
- lyndon_words-0.2.0/docs/architecture.md +96 -0
- lyndon_words-0.2.0/docs/charter.md +34 -0
- lyndon_words-0.2.0/docs/logo-prompt.md +7 -0
- lyndon_words-0.2.0/examples/de_bruijn_example.py +30 -0
- lyndon_words-0.2.0/examples/enumerate_necklaces.py +35 -0
- lyndon_words-0.2.0/pyproject.toml +56 -0
- lyndon_words-0.2.0/src/lyndon_words/__init__.py +16 -0
- lyndon_words-0.2.0/src/lyndon_words/debruijn.py +81 -0
- lyndon_words-0.2.0/src/lyndon_words/lyndon.py +131 -0
- lyndon_words-0.2.0/src/lyndon_words/lyndon_array.py +77 -0
- lyndon_words-0.2.0/src/lyndon_words/necklaces.py +178 -0
- lyndon_words-0.2.0/src/lyndon_words/py.typed +0 -0
- lyndon_words-0.2.0/tests/__init__.py +0 -0
- lyndon_words-0.2.0/tests/test_debruijn.py +74 -0
- lyndon_words-0.2.0/tests/test_lyndon.py +109 -0
- lyndon_words-0.2.0/tests/test_lyndon_array.py +252 -0
- lyndon_words-0.2.0/tests/test_necklaces.py +147 -0
|
@@ -0,0 +1,23 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: Bug report
|
|
3
|
+
about: Report incorrect output or an unexpected error
|
|
4
|
+
labels: bug
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
**Function called**
|
|
8
|
+
<!-- e.g. factorize("banana") or de_bruijn(alphabet_size=2, length=3) -->
|
|
9
|
+
|
|
10
|
+
**Input**
|
|
11
|
+
<!-- The word, alphabet_size, and length you passed -->
|
|
12
|
+
|
|
13
|
+
**Expected output**
|
|
14
|
+
<!-- What you expected -->
|
|
15
|
+
|
|
16
|
+
**Actual output**
|
|
17
|
+
<!-- What you got -->
|
|
18
|
+
|
|
19
|
+
**Python version**
|
|
20
|
+
<!-- e.g. 3.11.4 -->
|
|
21
|
+
|
|
22
|
+
**lyndon-words version**
|
|
23
|
+
<!-- e.g. 0.1.0 -->
|
|
@@ -0,0 +1,14 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: Feature request
|
|
3
|
+
about: Suggest a new algorithm or combinatorics-on-words primitive
|
|
4
|
+
labels: enhancement
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
**What would you like?**
|
|
8
|
+
<!-- Describe the feature -->
|
|
9
|
+
|
|
10
|
+
**Reference**
|
|
11
|
+
<!-- If it's a published algorithm or definition, please cite the source -->
|
|
12
|
+
|
|
13
|
+
**Example**
|
|
14
|
+
<!-- Show a concrete input/output example -->
|
|
@@ -0,0 +1,22 @@
|
|
|
1
|
+
name: CI
|
|
2
|
+
|
|
3
|
+
on:
|
|
4
|
+
push:
|
|
5
|
+
branches: [main]
|
|
6
|
+
pull_request:
|
|
7
|
+
|
|
8
|
+
jobs:
|
|
9
|
+
test:
|
|
10
|
+
runs-on: ubuntu-latest
|
|
11
|
+
strategy:
|
|
12
|
+
matrix:
|
|
13
|
+
python: ["3.10", "3.11", "3.12", "3.13"]
|
|
14
|
+
steps:
|
|
15
|
+
- uses: actions/checkout@v4
|
|
16
|
+
- uses: actions/setup-python@v5
|
|
17
|
+
with:
|
|
18
|
+
python-version: ${{ matrix.python }}
|
|
19
|
+
- run: pip install -e ".[dev]"
|
|
20
|
+
- run: ruff check .
|
|
21
|
+
- run: mypy src
|
|
22
|
+
- run: pytest -q
|
|
@@ -0,0 +1,36 @@
|
|
|
1
|
+
# Changelog
|
|
2
|
+
|
|
3
|
+
All notable changes to this project are documented here. The format follows
|
|
4
|
+
[Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and the project
|
|
5
|
+
adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
|
|
6
|
+
|
|
7
|
+
## [Unreleased]
|
|
8
|
+
|
|
9
|
+
### Planned
|
|
10
|
+
- PyPI release (pending new-project quota reset)
|
|
11
|
+
|
|
12
|
+
## [0.2.0] - 2026-06-17
|
|
13
|
+
|
|
14
|
+
### Added
|
|
15
|
+
- `lyndon_array(word)`: for each index `i` of the input, returns the length of
|
|
16
|
+
the longest Lyndon word that is a prefix of the suffix `word[i:]`. The result
|
|
17
|
+
has the same length as the input; every entry satisfies `1 <= result[i] <= n - i`.
|
|
18
|
+
- The first entry `lyndon_array(word)[0]` equals the length of the first factor
|
|
19
|
+
in the Chen-Fox-Lyndon factorization of `word`.
|
|
20
|
+
|
|
21
|
+
### Notes
|
|
22
|
+
- PyPI publish is queued behind the new-project quota limit. Install from source
|
|
23
|
+
or from the GitHub release until the quota resets.
|
|
24
|
+
|
|
25
|
+
## [0.1.0] - 2026-06-17
|
|
26
|
+
|
|
27
|
+
### Added
|
|
28
|
+
- `is_lyndon(word)`: test whether a word is a Lyndon word, via Duval's algorithm.
|
|
29
|
+
- `factorize(word)`: Duval's O(n) Chen-Fox-Lyndon factorization into non-increasing
|
|
30
|
+
Lyndon factors.
|
|
31
|
+
- `enumerate_necklaces(*, alphabet_size, length)`: FKM enumeration of rotation-class
|
|
32
|
+
representatives in lexicographic order.
|
|
33
|
+
- `enumerate_bracelets(*, alphabet_size, length)`: enumeration of dihedral-class
|
|
34
|
+
(rotation and reflection) representatives in lexicographic order.
|
|
35
|
+
- `de_bruijn(*, alphabet_size, length)`: De Bruijn sequence B(k, n) via the FKM
|
|
36
|
+
construction.
|
|
@@ -0,0 +1,50 @@
|
|
|
1
|
+
# lyndon-words
|
|
2
|
+
|
|
3
|
+
Pure-Python combinatorics on words: Lyndon words, Duval factorization, necklace and
|
|
4
|
+
bracelet enumeration, De Bruijn sequences. Zero runtime dependencies.
|
|
5
|
+
|
|
6
|
+
## Commands
|
|
7
|
+
|
|
8
|
+
- Create env and install: `uv venv && uv pip install -e ".[dev]"`
|
|
9
|
+
- Test: `uv run pytest -q`
|
|
10
|
+
- Lint: `uv run ruff check .` (format with `uv run ruff format .`)
|
|
11
|
+
- Types: `uv run mypy src`
|
|
12
|
+
- Build: `uv build` (then `uv run --with twine twine check dist/*` before publishing)
|
|
13
|
+
|
|
14
|
+
## Architecture
|
|
15
|
+
|
|
16
|
+
`src/lyndon_words/`:
|
|
17
|
+
- `lyndon.py`: `is_lyndon` and `factorize` (Duval's Chen-Fox-Lyndon factorization).
|
|
18
|
+
- `necklaces.py`: `enumerate_necklaces`, `enumerate_bracelets`, `_euler_phi`, `_divisors`.
|
|
19
|
+
- `debruijn.py`: `de_bruijn` (FKM construction).
- `lyndon_array.py`: `lyndon_array` (Lyndon array of a word).
|
|
20
|
+
- `__init__.py`: public surface.
|
|
21
|
+
|
|
22
|
+
See `docs/architecture.md` for precise definitions and references (Duval 1983, FKM).
|
|
23
|
+
|
|
24
|
+
## Conventions
|
|
25
|
+
|
|
26
|
+
- Words over an integer alphabet are `tuple[int, ...]`. `is_lyndon` and `factorize`
|
|
27
|
+
accept any comparable `Sequence` (str, list, tuple).
|
|
28
|
+
- Enumeration and generation functions take keyword-only `alphabet_size` and `length`
|
|
29
|
+
with no default values. The positional `word` argument is fine for word predicates.
|
|
30
|
+
- Pure functions, strict typing, zero runtime dependencies (standard library only).
|
|
31
|
+
- Validate inputs and raise clear ValueError messages.
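The keyword-only and validation conventions above can be sketched as follows. This is a minimal illustration, not code from the package: `enumerate_words` is a hypothetical function, and the error messages are assumptions.

```python
from collections.abc import Iterator
from itertools import product


def enumerate_words(*, alphabet_size: int, length: int) -> Iterator[tuple[int, ...]]:
    """Hypothetical example: all words of a given length over {0, ..., alphabet_size - 1}."""
    # The bare `*` makes both parameters keyword-only, and neither has a default.
    if alphabet_size < 1:
        raise ValueError(f"alphabet_size must be >= 1, got {alphabet_size}")
    if length < 1:
        raise ValueError(f"length must be >= 1, got {length}")
    # Returning an iterator (instead of using `yield`) keeps validation eager:
    # the ValueError is raised at call time, not on the first `next()`.
    return product(range(alphabet_size), repeat=length)
```

Calling `enumerate_words(2, 2)` raises `TypeError` because positional arguments are rejected, which is the point of the convention: call sites must spell out which number is the alphabet size and which is the length.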
|
|
32
|
+
|
|
33
|
+
## Testing rules
|
|
34
|
+
|
|
35
|
+
- Golden values for Lyndon tests and small enumerations.
|
|
36
|
+
- Necklace counts cross-checked against the cycle-index formula and a brute-force scan.
|
|
37
|
+
- Bracelet and De Bruijn results cross-checked against brute force.
|
|
38
|
+
- Factorization invariants: each factor is Lyndon, non-increasing, concatenates to input.
|
|
39
|
+
- Bug fixes start with a failing test.
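The factorization invariants listed above could be checked with a helper along these lines. This is a sketch under assumptions: `check_factorization` and its inner `is_lyndon` are illustrative names, not the project's test code.

```python
def check_factorization(word, factors) -> bool:
    """Return True iff `factors` satisfies the three Chen-Fox-Lyndon invariants for `word`."""

    def is_lyndon(w) -> bool:
        # Definition-based test: non-empty and strictly below every proper rotation.
        return len(w) > 0 and all(w < w[r:] + w[:r] for r in range(1, len(w)))

    # Invariant 1: every factor is a Lyndon word.
    each_is_lyndon = all(is_lyndon(f) for f in factors)
    # Invariant 2: factors are non-increasing from left to right.
    non_increasing = all(a >= b for a, b in zip(factors, factors[1:]))
    # Invariant 3: the factors concatenate back to the input.
    joined = word[:0]  # empty value of the same type as the input (str, tuple, list)
    for f in factors:
        joined = joined + f
    return each_is_lyndon and non_increasing and joined == word
```

For example, `check_factorization("banana", ["b", "an", "an", "a"])` holds, while a candidate whose factors increase from left to right is rejected even if every factor is Lyndon.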
|
|
40
|
+
|
|
41
|
+
## Release
|
|
42
|
+
|
|
43
|
+
- Semantic versioning; update CHANGELOG.md and __version__.
|
|
44
|
+
- Gates: `uv run pytest && uv run ruff check . && uv run mypy src && uv build && uv run --with twine twine check dist/*`.
|
|
45
|
+
- Do NOT publish to PyPI (pending quota reset). Tag vX.Y.Z and GitHub release.
|
|
46
|
+
|
|
47
|
+
## Style
|
|
48
|
+
|
|
49
|
+
- No em dash characters in docs, comments, or commit messages.
|
|
50
|
+
- Comments explain non-obvious reasoning only.
|
|
@@ -0,0 +1,37 @@
|
|
|
1
|
+
# Code of Conduct
|
|
2
|
+
|
|
3
|
+
## Our pledge
|
|
4
|
+
|
|
5
|
+
We as members, contributors, and maintainers pledge to make participation in our
|
|
6
|
+
community a harassment-free experience for everyone, regardless of age, body
|
|
7
|
+
size, visible or invisible disability, ethnicity, sex characteristics, gender
|
|
8
|
+
identity and expression, level of experience, education, socio-economic status,
|
|
9
|
+
nationality, personal appearance, race, religion, or sexual identity and
|
|
10
|
+
orientation.
|
|
11
|
+
|
|
12
|
+
## Our standards
|
|
13
|
+
|
|
14
|
+
Examples of behavior that contributes to a positive environment:
|
|
15
|
+
|
|
16
|
+
- Showing empathy and kindness toward other people.
|
|
17
|
+
- Being respectful of differing opinions, viewpoints, and experiences.
|
|
18
|
+
- Giving and gracefully accepting constructive feedback.
|
|
19
|
+
- Focusing on what is best for the community.
|
|
20
|
+
|
|
21
|
+
Examples of unacceptable behavior:
|
|
22
|
+
|
|
23
|
+
- Harassment, insulting or derogatory comments, and personal or political attacks.
|
|
24
|
+
- Public or private harassment.
|
|
25
|
+
- Publishing others' private information without explicit permission.
|
|
26
|
+
- Other conduct which could reasonably be considered inappropriate.
|
|
27
|
+
|
|
28
|
+
## Enforcement
|
|
29
|
+
|
|
30
|
+
Instances of abusive, harassing, or otherwise unacceptable behavior may be
|
|
31
|
+
reported to the maintainer at amaar2cool@gmail.com. All complaints will be
|
|
32
|
+
reviewed and investigated promptly and fairly.
|
|
33
|
+
|
|
34
|
+
## Attribution
|
|
35
|
+
|
|
36
|
+
This Code of Conduct is adapted from the [Contributor Covenant](https://www.contributor-covenant.org),
|
|
37
|
+
version 2.1.
|
|
@@ -0,0 +1,33 @@
|
|
|
1
|
+
# Contributing to lyndon-words
|
|
2
|
+
|
|
3
|
+
Thanks for your interest. This project values correctness, precise definitions, and zero
|
|
4
|
+
runtime dependencies.
|
|
5
|
+
|
|
6
|
+
## Development
|
|
7
|
+
|
|
8
|
+
```sh
|
|
9
|
+
uv venv
|
|
10
|
+
uv pip install -e ".[dev]"
|
|
11
|
+
uv run pytest -q
|
|
12
|
+
uv run ruff check .
|
|
13
|
+
uv run mypy src
|
|
14
|
+
```
|
|
15
|
+
|
|
16
|
+
A standard virtual environment with `pip install -e ".[dev]"` works the same way.
|
|
17
|
+
|
|
18
|
+
## Guidelines
|
|
19
|
+
|
|
20
|
+
- No runtime dependencies. The standard library is enough.
|
|
21
|
+
- Functions are pure. Enumeration and generation functions use keyword-only parameters
|
|
22
|
+
with no default values.
|
|
23
|
+
- Every function needs exact-value tests plus structural invariants, cross-checked against
|
|
24
|
+
a brute-force reference where practical.
|
|
25
|
+
- A bug fix starts with a failing test.
|
|
26
|
+
- Run `uv run ruff format .` before committing.
|
|
27
|
+
- Commit messages follow `type(scope): description`.
|
|
28
|
+
- No em dash characters in code, comments, or commit messages.
|
|
29
|
+
|
|
30
|
+
## Reporting issues
|
|
31
|
+
|
|
32
|
+
Open an issue with the word or parameters, the function called, and what you expected
|
|
33
|
+
versus what you observed.
|
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
MIT License
|
|
2
|
+
|
|
3
|
+
Copyright (c) 2026 Amaar Chughtai
|
|
4
|
+
|
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
|
7
|
+
in the Software without restriction, including without limitation the rights
|
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
|
10
|
+
furnished to do so, subject to the following conditions:
|
|
11
|
+
|
|
12
|
+
The above copyright notice and this permission notice shall be included in all
|
|
13
|
+
copies or substantial portions of the Software.
|
|
14
|
+
|
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
|
21
|
+
SOFTWARE.
|
|
@@ -0,0 +1,184 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: lyndon-words
|
|
3
|
+
Version: 0.2.0
|
|
4
|
+
Summary: Lyndon words, Duval factorization, necklace and bracelet enumeration, and De Bruijn sequences in pure Python.
|
|
5
|
+
Project-URL: Homepage, https://github.com/amaar-mc/lyndon-words
|
|
6
|
+
Project-URL: Repository, https://github.com/amaar-mc/lyndon-words
|
|
7
|
+
Project-URL: Issues, https://github.com/amaar-mc/lyndon-words/issues
|
|
8
|
+
Project-URL: Changelog, https://github.com/amaar-mc/lyndon-words/blob/main/CHANGELOG.md
|
|
9
|
+
Author: Amaar Chughtai
|
|
10
|
+
License: MIT License
|
|
11
|
+
|
|
12
|
+
Copyright (c) 2026 Amaar Chughtai
|
|
13
|
+
|
|
14
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
15
|
+
of this software and associated documentation files (the "Software"), to deal
|
|
16
|
+
in the Software without restriction, including without limitation the rights
|
|
17
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
18
|
+
copies of the Software, and to permit persons to whom the Software is
|
|
19
|
+
furnished to do so, subject to the following conditions:
|
|
20
|
+
|
|
21
|
+
The above copyright notice and this permission notice shall be included in all
|
|
22
|
+
copies or substantial portions of the Software.
|
|
23
|
+
|
|
24
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
25
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
26
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
27
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
28
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
29
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
|
30
|
+
SOFTWARE.
|
|
31
|
+
License-File: LICENSE
|
|
32
|
+
Keywords: combinatorics,combinatorics-on-words,de-bruijn,lyndon-words,necklaces,strings
|
|
33
|
+
Classifier: Development Status :: 3 - Alpha
|
|
34
|
+
Classifier: Intended Audience :: Science/Research
|
|
35
|
+
Classifier: License :: OSI Approved :: MIT License
|
|
36
|
+
Classifier: Programming Language :: Python :: 3
|
|
37
|
+
Classifier: Programming Language :: Python :: 3.10
|
|
38
|
+
Classifier: Programming Language :: Python :: 3.11
|
|
39
|
+
Classifier: Programming Language :: Python :: 3.12
|
|
40
|
+
Classifier: Programming Language :: Python :: 3.13
|
|
41
|
+
Classifier: Topic :: Scientific/Engineering :: Mathematics
|
|
42
|
+
Classifier: Typing :: Typed
|
|
43
|
+
Requires-Python: >=3.10
|
|
44
|
+
Provides-Extra: dev
|
|
45
|
+
Requires-Dist: hypothesis>=6; extra == 'dev'
|
|
46
|
+
Requires-Dist: mypy>=1.10; extra == 'dev'
|
|
47
|
+
Requires-Dist: pytest>=8; extra == 'dev'
|
|
48
|
+
Requires-Dist: ruff>=0.4; extra == 'dev'
|
|
49
|
+
Description-Content-Type: text/markdown
|
|
50
|
+
|
|
51
|
+
# lyndon-words
|
|
52
|
+
|
|
53
|
+
<p align="center">
|
|
54
|
+
<img src="assets/logo.png" alt="lyndon-words logo" width="160">
|
|
55
|
+
</p>
|
|
56
|
+
|
|
57
|
+
[](https://github.com/amaar-mc/lyndon-words/actions/workflows/ci.yml)
|
|
58
|
+
[](./LICENSE)
|
|
59
|
+
|
|
60
|
+
Combinatorics on words in pure Python with zero dependencies: Lyndon-word testing, Duval's
|
|
61
|
+
Chen-Fox-Lyndon factorization, necklace and bracelet enumeration, and De Bruijn sequences.
|
|
62
|
+
|
|
63
|
+
## What is this?
|
|
64
|
+
|
|
65
|
+
A **Lyndon word** is a non-empty word that is strictly smaller than all of its proper rotations.
|
|
66
|
+
Lyndon words are the building blocks of combinatorics on words: every word factors uniquely
|
|
67
|
+
into a non-increasing concatenation of Lyndon words (the Chen-Fox-Lyndon theorem), and
|
|
68
|
+
Lyndon words index the necklaces (rotation classes) and the De Bruijn sequences over an
|
|
69
|
+
alphabet.
|
|
70
|
+
|
|
71
|
+
This library implements the classic algorithms directly, with no dependencies and strict
|
|
72
|
+
typing.
|
|
73
|
+
|
|
74
|
+
## Install
|
|
75
|
+
|
|
76
|
+
```sh
|
|
77
|
+
pip install lyndon-words
|
|
78
|
+
```
|
|
79
|
+
|
|
80
|
+
> PyPI release pending. Install from source:
|
|
81
|
+
> ```sh
|
|
82
|
+
> git clone https://github.com/amaar-mc/lyndon-words
|
|
83
|
+
> cd lyndon-words
|
|
84
|
+
> pip install -e .
|
|
85
|
+
> ```
|
|
86
|
+
|
|
87
|
+
## Quick start
|
|
88
|
+
|
|
89
|
+
```python
|
|
90
|
+
from lyndon_words import is_lyndon, factorize, enumerate_necklaces, enumerate_bracelets, de_bruijn, lyndon_array
|
|
91
|
+
|
|
92
|
+
# Lyndon-word test (accepts str, list, or tuple)
|
|
93
|
+
is_lyndon("aab") # True
|
|
94
|
+
is_lyndon("aba") # False
|
|
95
|
+
is_lyndon((0, 0, 1)) # True
|
|
96
|
+
|
|
97
|
+
# Duval's Chen-Fox-Lyndon factorization (non-increasing Lyndon factors)
|
|
98
|
+
factorize("banana") # ['b', 'an', 'an', 'a']
|
|
99
|
+
factorize("aababab") # ['aababab'] (the whole word is already Lyndon)
|
|
100
|
+
|
|
101
|
+
# Necklaces: rotation-class representatives over {0, 1} of length 3
|
|
102
|
+
list(enumerate_necklaces(alphabet_size=2, length=3))
|
|
103
|
+
# [(0, 0, 0), (0, 0, 1), (0, 1, 1), (1, 1, 1)]
|
|
104
|
+
|
|
105
|
+
# Bracelets: rotation + reflection classes over {0, 1} of length 4
|
|
106
|
+
list(enumerate_bracelets(alphabet_size=2, length=4))
|
|
107
|
+
# [(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 1, 1), (0, 1, 0, 1), (0, 1, 1, 1), (1, 1, 1, 1)]
|
|
108
|
+
|
|
109
|
+
# De Bruijn sequence B(2, 3): every length-3 binary word appears once cyclically
|
|
110
|
+
de_bruijn(alphabet_size=2, length=3)
|
|
111
|
+
# (0, 0, 0, 1, 0, 1, 1, 1)
|
|
112
|
+
|
|
113
|
+
# Lyndon array: longest Lyndon prefix length at each position
|
|
114
|
+
lyndon_array("banana") # [1, 2, 1, 2, 1, 1]
|
|
115
|
+
lyndon_array("abcd") # [4, 3, 2, 1] (entire increasing suffix is Lyndon)
|
|
116
|
+
lyndon_array("aaaa") # [1, 1, 1, 1] (all equal, no run > 1 is Lyndon)
|
|
117
|
+
lyndon_array("0010011") # [7, 2, 1, 4, 3, 1, 1]
|
|
118
|
+
```
|
|
119
|
+
|
|
120
|
+
## Word representation
|
|
121
|
+
|
|
122
|
+
Words over an integer alphabet are represented as `tuple[int, ...]`, for example
|
|
123
|
+
`(0, 0, 1)` for a 3-symbol word over `{0, 1}`. This is the canonical type, and every
|
|
124
|
+
enumeration and generation function (`enumerate_necklaces`, `enumerate_bracelets`,
|
|
125
|
+
`de_bruijn`) yields or returns tuples of ints.
|
|
126
|
+
|
|
127
|
+
The word functions `is_lyndon` and `factorize` are generic over any comparable
|
|
128
|
+
`Sequence`, so `str`, `list`, and `tuple` all work. The factors returned by `factorize`
|
|
129
|
+
have the same type as the input: pass a `str` and you get a list of `str` factors, pass a
|
|
130
|
+
`tuple` and you get a list of `tuple` factors. String examples like `"aab"` behave exactly
|
|
131
|
+
like integer tuples because string comparison is lexicographic.
|
|
132
|
+
|
|
133
|
+
## API
|
|
134
|
+
|
|
135
|
+
Enumeration and generation functions take keyword-only `alphabet_size` and `length`.
|
|
136
|
+
|
|
137
|
+
| Function | Description |
|
|
138
|
+
|---|---|
|
|
139
|
+
| `is_lyndon(word)` | True iff `word` is non-empty and strictly smaller than all proper rotations |
|
|
140
|
+
| `factorize(word)` | Duval's O(n) Chen-Fox-Lyndon factorization into non-increasing Lyndon factors |
|
|
141
|
+
| `lyndon_array(word)` | For each index `i`, the length of the longest Lyndon prefix of `word[i:]` |
|
|
142
|
+
| `enumerate_necklaces(*, alphabet_size, length)` | FKM enumeration of rotation classes, lexicographic order |
|
|
143
|
+
| `enumerate_bracelets(*, alphabet_size, length)` | Enumeration of rotation + reflection classes, lexicographic order |
|
|
144
|
+
| `de_bruijn(*, alphabet_size, length)` | De Bruijn sequence B(alphabet_size, length) via FKM |
|
|
145
|
+
|
|
146
|
+
`alphabet_size >= 1` and `length >= 1` are required; otherwise a `ValueError` is raised.
|
|
147
|
+
|
|
148
|
+
## Definitions
|
|
149
|
+
|
|
150
|
+
**Lyndon word.** A non-empty word strictly smaller, in lexicographic order, than every one
|
|
151
|
+
of its proper rotations (equivalently, every proper suffix). Lyndon words are aperiodic.
|
|
152
|
+
|
|
153
|
+
**Chen-Fox-Lyndon factorization.** Every non-empty word factors uniquely into Lyndon words
|
|
154
|
+
`l_1 >= l_2 >= ... >= l_m`. Duval's algorithm computes this in O(n) time and O(1) extra
|
|
155
|
+
space.
|
|
156
|
+
|
|
157
|
+
**Necklace.** A rotation-equivalence class of length-`n` words. The representative is the
|
|
158
|
+
lexicographically least rotation. The count is
|
|
159
|
+
`(1/n) * sum over d dividing n of phi(d) * k^(n/d)`.
|
|
160
|
+
|
|
161
|
+
**Bracelet.** An equivalence class under rotation and reflection (the dihedral group). The
|
|
162
|
+
representative is the lexicographically least element of the orbit.
|
|
163
|
+
|
|
164
|
+
**Lyndon array.** For a word `s` of length `n`, the Lyndon array is the integer array
|
|
165
|
+
`L` where `L[i]` is the length of the longest Lyndon word that is a prefix of `s[i:]`.
|
|
166
|
+
Every entry satisfies `1 <= L[i] <= n - i`. The first entry `L[0]` equals the length of
|
|
167
|
+
the first (leftmost) factor in the Chen-Fox-Lyndon factorization of `s`: that factor is
|
|
168
|
+
the unique longest Lyndon prefix of the entire word.
|
|
169
|
+
|
|
170
|
+
**De Bruijn sequence.** A cyclic sequence `B(k, n)` of length `k^n` in which every
|
|
171
|
+
length-`n` word appears exactly once as a contiguous cyclic subword. The FKM construction
|
|
172
|
+
concatenates, in lexicographic order, every Lyndon word whose length divides `n`.
|
|
173
|
+
|
|
174
|
+
## References
|
|
175
|
+
|
|
176
|
+
- Duval, J.-P. (1983). Factorizing words over an ordered alphabet. Journal of Algorithms.
|
|
177
|
+
- Chen, K.-T., Fox, R. H., Lyndon, R. C. (1958). Free differential calculus IV. Annals of Mathematics.
|
|
178
|
+
- Fredricksen, H., Maiorana, J. (1978). Necklaces of beads in k colors and k-ary de Bruijn sequences. Discrete Mathematics.
|
|
179
|
+
- Fredricksen, H., Kessler, I. J. (1986). An algorithm for generating necklaces of beads in two colors. Discrete Mathematics.
|
|
180
|
+
- Ruskey, F. (2003). Combinatorial Generation. University of Victoria.
|
|
181
|
+
|
|
182
|
+
## License
|
|
183
|
+
|
|
184
|
+
MIT. Copyright (c) 2026 Amaar Chughtai.
|
|
@@ -0,0 +1,134 @@
|
|
|
1
|
+
# lyndon-words
|
|
2
|
+
|
|
3
|
+
<p align="center">
|
|
4
|
+
<img src="assets/logo.png" alt="lyndon-words logo" width="160">
|
|
5
|
+
</p>
|
|
6
|
+
|
|
7
|
+
[](https://github.com/amaar-mc/lyndon-words/actions/workflows/ci.yml)
|
|
8
|
+
[](./LICENSE)
|
|
9
|
+
|
|
10
|
+
Combinatorics on words in pure Python with zero dependencies: Lyndon-word testing, Duval's
|
|
11
|
+
Chen-Fox-Lyndon factorization, necklace and bracelet enumeration, and De Bruijn sequences.
|
|
12
|
+
|
|
13
|
+
## What is this?
|
|
14
|
+
|
|
15
|
+
A **Lyndon word** is a non-empty word that is strictly smaller than all of its proper rotations.
|
|
16
|
+
Lyndon words are the building blocks of combinatorics on words: every word factors uniquely
|
|
17
|
+
into a non-increasing concatenation of Lyndon words (the Chen-Fox-Lyndon theorem), and
|
|
18
|
+
Lyndon words index the necklaces (rotation classes) and the De Bruijn sequences over an
|
|
19
|
+
alphabet.
|
|
20
|
+
|
|
21
|
+
This library implements the classic algorithms directly, with no dependencies and strict
|
|
22
|
+
typing.
|
|
23
|
+
|
|
24
|
+
## Install
|
|
25
|
+
|
|
26
|
+
```sh
|
|
27
|
+
pip install lyndon-words
|
|
28
|
+
```
|
|
29
|
+
|
|
30
|
+
> PyPI release pending. Install from source:
|
|
31
|
+
> ```sh
|
|
32
|
+
> git clone https://github.com/amaar-mc/lyndon-words
|
|
33
|
+
> cd lyndon-words
|
|
34
|
+
> pip install -e .
|
|
35
|
+
> ```
|
|
36
|
+
|
|
37
|
+
## Quick start
|
|
38
|
+
|
|
39
|
+
```python
|
|
40
|
+
from lyndon_words import is_lyndon, factorize, enumerate_necklaces, enumerate_bracelets, de_bruijn, lyndon_array
|
|
41
|
+
|
|
42
|
+
# Lyndon-word test (accepts str, list, or tuple)
|
|
43
|
+
is_lyndon("aab") # True
|
|
44
|
+
is_lyndon("aba") # False
|
|
45
|
+
is_lyndon((0, 0, 1)) # True
|
|
46
|
+
|
|
47
|
+
# Duval's Chen-Fox-Lyndon factorization (non-increasing Lyndon factors)
|
|
48
|
+
factorize("banana") # ['b', 'an', 'an', 'a']
|
|
49
|
+
factorize("aababab") # ['aababab'] (the whole word is already Lyndon)
|
|
50
|
+
|
|
51
|
+
# Necklaces: rotation-class representatives over {0, 1} of length 3
|
|
52
|
+
list(enumerate_necklaces(alphabet_size=2, length=3))
|
|
53
|
+
# [(0, 0, 0), (0, 0, 1), (0, 1, 1), (1, 1, 1)]
|
|
54
|
+
|
|
55
|
+
# Bracelets: rotation + reflection classes over {0, 1} of length 4
|
|
56
|
+
list(enumerate_bracelets(alphabet_size=2, length=4))
|
|
57
|
+
# [(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 1, 1), (0, 1, 0, 1), (0, 1, 1, 1), (1, 1, 1, 1)]
|
|
58
|
+
|
|
59
|
+
# De Bruijn sequence B(2, 3): every length-3 binary word appears once cyclically
|
|
60
|
+
de_bruijn(alphabet_size=2, length=3)
|
|
61
|
+
# (0, 0, 0, 1, 0, 1, 1, 1)
|
|
62
|
+
|
|
63
|
+
# Lyndon array: longest Lyndon prefix length at each position
|
|
64
|
+
lyndon_array("banana") # [1, 2, 1, 2, 1, 1]
|
|
65
|
+
lyndon_array("abcd") # [4, 3, 2, 1] (entire increasing suffix is Lyndon)
|
|
66
|
+
lyndon_array("aaaa") # [1, 1, 1, 1] (all equal, no run > 1 is Lyndon)
|
|
67
|
+
lyndon_array("0010011") # [7, 2, 1, 4, 3, 1, 1]
|
|
68
|
+
```
|
|
69
|
+
|
|
70
|
+
## Word representation
|
|
71
|
+
|
|
72
|
+
Words over an integer alphabet are represented as `tuple[int, ...]`, for example
|
|
73
|
+
`(0, 0, 1)` for a 3-symbol word over `{0, 1}`. This is the canonical type, and every
|
|
74
|
+
enumeration and generation function (`enumerate_necklaces`, `enumerate_bracelets`,
|
|
75
|
+
`de_bruijn`) yields or returns tuples of ints.
|
|
76
|
+
|
|
77
|
+
The word functions `is_lyndon` and `factorize` are generic over any comparable
|
|
78
|
+
`Sequence`, so `str`, `list`, and `tuple` all work. The factors returned by `factorize`
|
|
79
|
+
have the same type as the input: pass a `str` and you get a list of `str` factors, pass a
|
|
80
|
+
`tuple` and you get a list of `tuple` factors. String examples like `"aab"` behave exactly
|
|
81
|
+
like integer tuples because string comparison is lexicographic.
|
|
82
|
+
|
|
83
|
+
## API
|
|
84
|
+
|
|
85
|
+
Enumeration and generation functions take keyword-only `alphabet_size` and `length`.
|
|
86
|
+
|
|
87
|
+
| Function | Description |
|
|
88
|
+
|---|---|
|
|
89
|
+
| `is_lyndon(word)` | True iff `word` is non-empty and strictly smaller than all proper rotations |
|
|
90
|
+
| `factorize(word)` | Duval's O(n) Chen-Fox-Lyndon factorization into non-increasing Lyndon factors |
|
|
91
|
+
| `lyndon_array(word)` | For each index `i`, the length of the longest Lyndon prefix of `word[i:]` |
|
|
92
|
+
| `enumerate_necklaces(*, alphabet_size, length)` | FKM enumeration of rotation classes, lexicographic order |
|
|
93
|
+
| `enumerate_bracelets(*, alphabet_size, length)` | Enumeration of rotation + reflection classes, lexicographic order |
|
|
94
|
+
| `de_bruijn(*, alphabet_size, length)` | De Bruijn sequence B(alphabet_size, length) via FKM |
|
|
95
|
+
|
|
96
|
+
`alphabet_size >= 1` and `length >= 1` are required; otherwise a `ValueError` is raised.
|
|
97
|
+
|
|
98
|
+
## Definitions
|
|
99
|
+
|
|
100
|
+
**Lyndon word.** A non-empty word strictly smaller, in lexicographic order, than every one
|
|
101
|
+
of its proper rotations (equivalently, every proper suffix). Lyndon words are aperiodic.
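The definition translates almost word for word into a quadratic-time check. The sketch below is a brute-force reference, not the library's implementation (which the changelog says uses Duval's algorithm); the name `is_lyndon_bruteforce` is an assumption.

```python
from collections.abc import Sequence


def is_lyndon_bruteforce(word: Sequence) -> bool:
    """True iff `word` is non-empty and strictly smaller than every proper rotation."""
    n = len(word)
    if n == 0:
        return False
    # Rotating by r moves the first r symbols to the end; r = 0 is the word itself
    # and is skipped, so a single-symbol word is trivially Lyndon.
    return all(word < word[r:] + word[:r] for r in range(1, n))
```

A periodic word such as `"abab"` fails because one of its proper rotations equals the word itself, so the strict inequality cannot hold; this is why Lyndon words are aperiodic.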
|
|
102
|
+
|
|
103
|
+
**Chen-Fox-Lyndon factorization.** Every non-empty word factors uniquely into Lyndon words
|
|
104
|
+
`l_1 >= l_2 >= ... >= l_m`. Duval's algorithm computes this in O(n) time and O(1) extra
|
|
105
|
+
space.
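For readers who want to see the shape of Duval's algorithm, here is a minimal sketch of the textbook version. It is not copied from `lyndon.py`, and the name `duval_factorize` is an assumption.

```python
def duval_factorize(word):
    """Chen-Fox-Lyndon factorization of `word` (str, list, or tuple) by Duval's algorithm."""
    n = len(word)
    factors = []
    i = 0  # start of the part of the word not yet factored
    while i < n:
        j, k = i + 1, i  # j scans ahead; k trails it inside the current candidate
        while j < n and word[k] <= word[j]:
            if word[k] < word[j]:
                k = i  # the candidate Lyndon word grows to cover word[i:j + 1]
            else:
                k += 1  # still repeating the current candidate
            j += 1
        # word[i:j] is now a repetition of a Lyndon word of length j - k, possibly
        # followed by a proper prefix of it; emit each full copy.
        while i <= k:
            factors.append(word[i : i + j - k])
            i += j - k
    return factors
```

Because the factors are produced by slicing, they have the same type as the input: `duval_factorize("banana")` gives `['b', 'an', 'an', 'a']`, and `duval_factorize((1, 0, 0, 1))` gives `[(1,), (0, 0, 1)]`.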
|
|
106
|
+
|
|
107
|
+
**Necklace.** A rotation-equivalence class of length-`n` words. The representative is the
|
|
108
|
+
lexicographically least rotation. The count is
|
|
109
|
+
`(1/n) * sum over d dividing n of phi(d) * k^(n/d)`.
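The counting formula can be evaluated directly and compared with a brute-force scan over all words. The helper names below (`euler_phi`, `necklace_count`, `necklace_count_bruteforce`) are illustrative assumptions and are not the package's private helpers.

```python
from itertools import product
from math import gcd


def euler_phi(m: int) -> int:
    """Euler's totient: how many integers in 1..m are coprime to m."""
    return sum(1 for a in range(1, m + 1) if gcd(a, m) == 1)


def necklace_count(k: int, n: int) -> int:
    """Number of k-ary necklaces of length n: (1/n) * sum_{d | n} phi(d) * k^(n/d)."""
    total = sum(euler_phi(d) * k ** (n // d) for d in range(1, n + 1) if n % d == 0)
    return total // n  # the sum is always divisible by n


def necklace_count_bruteforce(k: int, n: int) -> int:
    """Count rotation classes by collecting the least rotation of every word."""
    least_rotations = {
        min(w[r:] + w[:r] for r in range(n)) for w in product(range(k), repeat=n)
    }
    return len(least_rotations)
```

For `k = 2` and `n = 3` the formula gives `(1 * 8 + 2 * 2) / 3 = 4`, matching the four binary necklaces of length 3 shown in the quick start.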
|
|
110
|
+
|
|
111
|
+
**Bracelet.** An equivalence class under rotation and reflection (the dihedral group). The
|
|
112
|
+
representative is the lexicographically least element of the orbit.
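A brute-force reference for this definition builds the full dihedral orbit of each word and keeps its least element. This is a sketch for small inputs only (it visits all `k^n` words), and the function names are assumptions rather than the library's API.

```python
from itertools import product


def bracelet_representative(word: tuple[int, ...]) -> tuple[int, ...]:
    """Least element of the orbit of `word` under rotation and reflection."""
    n = len(word)
    reflected = word[::-1]
    # The orbit is every rotation of the word plus every rotation of its reversal.
    orbit = [w[r:] + w[:r] for w in (word, reflected) for r in range(n)]
    return min(orbit)


def bracelets_bruteforce(alphabet_size: int, length: int) -> list[tuple[int, ...]]:
    """All bracelet representatives in lexicographic order, by exhaustive scan."""
    reps = {
        bracelet_representative(w)
        for w in product(range(alphabet_size), repeat=length)
    }
    return sorted(reps)
```

Reflection only merges classes when a necklace differs from its mirror image. Over three symbols, `(0, 1, 2)` and `(0, 2, 1)` are distinct necklaces but the same bracelet, so there are 11 ternary necklaces of length 3 and only 10 bracelets.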
|
|
113
|
+
|
|
114
|
+
**Lyndon array.** For a word `s` of length `n`, the Lyndon array is the integer array
|
|
115
|
+
`L` where `L[i]` is the length of the longest Lyndon word that is a prefix of `s[i:]`.
|
|
116
|
+
Every entry satisfies `1 <= L[i] <= n - i`. The first entry `L[0]` equals the length of
|
|
117
|
+
the first (leftmost) factor in the Chen-Fox-Lyndon factorization of `s`: that factor is
|
|
118
|
+
the unique longest Lyndon prefix of the entire word.
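The definition can be turned into a slow but transparent reference that tries every prefix of every suffix. The sketch below is an assumption-laden illustration (the name `lyndon_array_bruteforce` is hypothetical) and makes no claim about how `lyndon_array.py` computes the result.

```python
def lyndon_array_bruteforce(word):
    """L[i] = length of the longest Lyndon word that is a prefix of word[i:]."""

    def is_lyndon(w) -> bool:
        # Non-empty and strictly smaller than every proper rotation.
        return len(w) > 0 and all(w < w[r:] + w[:r] for r in range(1, len(w)))

    n = len(word)
    result = []
    for i in range(n):
        best = 1  # a single symbol is always a Lyndon word
        for length in range(2, n - i + 1):
            if is_lyndon(word[i : i + length]):
                best = length
        result.append(best)
    return result
```

This does far more work than necessary, so it is useful only as a cross-check on short words, for example `lyndon_array_bruteforce("banana") == [1, 2, 1, 2, 1, 1]`.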
|
|
119
|
+
|
|
120
|
+
**De Bruijn sequence.** A cyclic sequence `B(k, n)` of length `k^n` in which every
|
|
121
|
+
length-`n` word appears exactly once as a contiguous cyclic subword. The FKM construction
|
|
122
|
+
concatenates, in lexicographic order, every Lyndon word whose length divides `n`.
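The FKM statement can be checked on small cases by following it literally: collect the Lyndon words whose length divides `n`, sort them, and concatenate. The sketch below finds the Lyndon words by exhaustive search, so it is exponentially slower than a real FKM generator; the names `de_bruijn_sketch` and `is_de_bruijn` are assumptions, not the package's API.

```python
from itertools import product


def is_lyndon(word) -> bool:
    """Non-empty and strictly smaller than every proper rotation."""
    return len(word) > 0 and all(word < word[r:] + word[:r] for r in range(1, len(word)))


def de_bruijn_sketch(k: int, n: int) -> tuple[int, ...]:
    """Concatenate, in lexicographic order, all k-ary Lyndon words whose length divides n."""
    lyndon_words = [
        w
        for d in range(1, n + 1)
        if n % d == 0
        for w in product(range(k), repeat=d)
        if is_lyndon(w)
    ]
    # Tuple comparison is lexicographic, with a proper prefix sorting first.
    return tuple(symbol for w in sorted(lyndon_words) for symbol in w)


def is_de_bruijn(seq: tuple[int, ...], k: int, n: int) -> bool:
    """True iff the k^n cyclic windows of length n in `seq` are pairwise distinct."""
    doubled = seq + seq[: n - 1]  # unroll the cycle so windows can wrap around
    windows = {doubled[i : i + n] for i in range(len(seq))}
    return len(seq) == k**n and len(windows) == k**n
```

For `k = 2` and `n = 3` the Lyndon words are `(0,)`, `(0, 0, 1)`, `(0, 1, 1)`, and `(1,)`, and their concatenation is `(0, 0, 0, 1, 0, 1, 1, 1)`, the same sequence shown in the quick start.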
|
|
123
|
+
|
|
124
|
+
## References
|
|
125
|
+
|
|
126
|
+
- Duval, J.-P. (1983). Factorizing words over an ordered alphabet. Journal of Algorithms.
|
|
127
|
+
- Chen, K.-T., Fox, R. H., Lyndon, R. C. (1958). Free differential calculus IV. Annals of Mathematics.
|
|
128
|
+
- Fredricksen, H., Maiorana, J. (1978). Necklaces of beads in k colors and k-ary de Bruijn sequences. Discrete Mathematics.
|
|
129
|
+
- Fredricksen, H., Kessler, I. J. (1986). An algorithm for generating necklaces of beads in two colors. Discrete Mathematics.
|
|
130
|
+
- Ruskey, F. (2003). Combinatorial Generation. University of Victoria.
|
|
131
|
+
|
|
132
|
+
## License
|
|
133
|
+
|
|
134
|
+
MIT. Copyright (c) 2026 Amaar Chughtai.
|
|
@@ -0,0 +1,18 @@
|
|
|
1
|
+
# Security Policy
|
|
2
|
+
|
|
3
|
+
## Scope
|
|
4
|
+
|
|
5
|
+
`lyndon-words` is a pure computation library with no runtime dependencies, no network
|
|
6
|
+
access, and no file system access. The attack surface is limited to incorrect results from
|
|
7
|
+
malformed input, which the library guards against with explicit validation.
|
|
8
|
+
|
|
9
|
+
## Reporting a vulnerability
|
|
10
|
+
|
|
11
|
+
If you find a security issue, please email amaar2cool@gmail.com with details and steps to
|
|
12
|
+
reproduce. Please do not open a public issue for security reports. You can expect an
|
|
13
|
+
initial response within a few days.
|
|
14
|
+
|
|
15
|
+
## Supported versions
|
|
16
|
+
|
|
17
|
+
The latest published minor version receives fixes. Pre-1.0 releases may introduce breaking
|
|
18
|
+
changes in minor versions, as allowed by semantic versioning.
|
|
Binary file
|