logic-loom 0.3.0__tar.gz
- logic_loom-0.3.0/LICENSE +21 -0
- logic_loom-0.3.0/PKG-INFO +504 -0
- logic_loom-0.3.0/README.md +480 -0
- logic_loom-0.3.0/logic_loom/__init__.py +67 -0
- logic_loom-0.3.0/logic_loom/__main__.py +4 -0
- logic_loom-0.3.0/logic_loom/analysis.py +94 -0
- logic_loom-0.3.0/logic_loom/cli.py +88 -0
- logic_loom-0.3.0/logic_loom/codegen.py +189 -0
- logic_loom-0.3.0/logic_loom/compiler.py +135 -0
- logic_loom-0.3.0/logic_loom/cost.py +166 -0
- logic_loom-0.3.0/logic_loom/effects.py +88 -0
- logic_loom-0.3.0/logic_loom/egraph.py +155 -0
- logic_loom-0.3.0/logic_loom/expr.py +138 -0
- logic_loom-0.3.0/logic_loom/parser.py +167 -0
- logic_loom-0.3.0/logic_loom/rules.py +181 -0
- logic_loom-0.3.0/logic_loom/saturate.py +210 -0
- logic_loom-0.3.0/logic_loom/viz.py +71 -0
- logic_loom-0.3.0/logic_loom.egg-info/PKG-INFO +504 -0
- logic_loom-0.3.0/logic_loom.egg-info/SOURCES.txt +31 -0
- logic_loom-0.3.0/logic_loom.egg-info/dependency_links.txt +1 -0
- logic_loom-0.3.0/logic_loom.egg-info/entry_points.txt +2 -0
- logic_loom-0.3.0/logic_loom.egg-info/requires.txt +3 -0
- logic_loom-0.3.0/logic_loom.egg-info/top_level.txt +1 -0
- logic_loom-0.3.0/pyproject.toml +43 -0
- logic_loom-0.3.0/setup.cfg +4 -0
- logic_loom-0.3.0/tests/test_analysis.py +37 -0
- logic_loom-0.3.0/tests/test_codegen.py +48 -0
- logic_loom-0.3.0/tests/test_cost_model.py +40 -0
- logic_loom-0.3.0/tests/test_effects.py +34 -0
- logic_loom-0.3.0/tests/test_equivalence.py +42 -0
- logic_loom-0.3.0/tests/test_extended.py +53 -0
- logic_loom-0.3.0/tests/test_optimize.py +53 -0
- logic_loom-0.3.0/tests/test_parser.py +32 -0
logic_loom-0.3.0/LICENSE
ADDED
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
MIT License
|
|
2
|
+
|
|
3
|
+
Copyright (c) 2026 Elian Alfonso Lopez Preciado
|
|
4
|
+
|
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
|
7
|
+
in the Software without restriction, including without limitation the rights
|
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
|
10
|
+
furnished to do so, subject to the following conditions:
|
|
11
|
+
|
|
12
|
+
The above copyright notice and this permission notice shall be included in all
|
|
13
|
+
copies or substantial portions of the Software.
|
|
14
|
+
|
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
|
21
|
+
SOFTWARE.
|
logic_loom-0.3.0/PKG-INFO
ADDED
|
@@ -0,0 +1,504 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: logic-loom
|
|
3
|
+
Version: 0.3.0
|
|
4
|
+
Summary: A compiler that understands mathematics: it optimizes expressions by reasoning about algebra, using equality saturation over an e-graph.
|
|
5
|
+
Author: Elian Alfonso Lopez Preciado
|
|
6
|
+
License: MIT
|
|
7
|
+
Project-URL: Homepage, https://github.com/elianalfonsolopezpreciado/Logic-Loom
|
|
8
|
+
Project-URL: Repository, https://github.com/elianalfonsolopezpreciado/Logic-Loom
|
|
9
|
+
Project-URL: Documentation, https://github.com/elianalfonsolopezpreciado/Logic-Loom#readme
|
|
10
|
+
Project-URL: Issues, https://github.com/elianalfonsolopezpreciado/Logic-Loom/issues
|
|
11
|
+
Keywords: compiler,optimization,algebra,e-graph,equality-saturation,symbolic,rewriting
|
|
12
|
+
Classifier: Development Status :: 4 - Beta
|
|
13
|
+
Classifier: Intended Audience :: Science/Research
|
|
14
|
+
Classifier: License :: OSI Approved :: MIT License
|
|
15
|
+
Classifier: Programming Language :: Python :: 3
|
|
16
|
+
Classifier: Topic :: Scientific/Engineering :: Mathematics
|
|
17
|
+
Classifier: Topic :: Software Development :: Compilers
|
|
18
|
+
Requires-Python: >=3.9
|
|
19
|
+
Description-Content-Type: text/markdown
|
|
20
|
+
License-File: LICENSE
|
|
21
|
+
Provides-Extra: dev
|
|
22
|
+
Requires-Dist: pytest>=7; extra == "dev"
|
|
23
|
+
Dynamic: license-file
|
|
24
|
+
|
|
25
|
+
<div align="center">
|
|
26
|
+
|
|
27
|
+
# Logic-Loom
|
|
28
|
+
|
|
29
|
+
### A compiler that understands *mathematics*, not just instructions.
|
|
30
|
+
|
|
31
|
+
Most optimizers shuffle instructions. Logic-Loom reasons about algebra. It
|
|
32
|
+
discovers that `a*b + a*c` **is** `a*(b + c)`, finds Horner's scheme on its
|
|
33
|
+
own, cancels `a*(b+c) - a*b` down to `a*c`, optimizes for a chosen hardware
|
|
34
|
+
target, tracks the domain assumptions it relies on, refuses to disturb side
|
|
35
|
+
effects, and emits the result as C, Rust, JavaScript or LLVM IR.
|
|
36
|
+
|
|
37
|
+
[Quick start](#quick-start) ·
|
|
38
|
+
[Showcase](#showcase) ·
|
|
39
|
+
[How it works](#how-it-works) ·
|
|
40
|
+
[Cost models](#hardware-aware-cost-models) ·
|
|
41
|
+
[Domain safety](#domain-safety-and-assumptions) ·
|
|
42
|
+
[Side effects](#side-effect-awareness) ·
|
|
43
|
+
[Code generation](#code-generation) ·
|
|
44
|
+
[Correctness](#correctness) ·
|
|
45
|
+
[Limitations](#limitations) ·
|
|
46
|
+
[Roadmap](#roadmap)
|
|
47
|
+
|
|
48
|
+
</div>
|
|
49
|
+
|
|
50
|
+
---
|
|
51
|
+
|
|
52
|
+
## The idea
|
|
53
|
+
|
|
54
|
+
A traditional compiler optimizes by pattern-matching *instructions*: replace a
|
|
55
|
+
multiply-by-two with a shift, fold two constants, peephole away a redundant
|
|
56
|
+
move. It reads code like a clerk with a checklist.
|
|
57
|
+
|
|
58
|
+
Logic-Loom reads code like a **mathematician**. Given
|
|
59
|
+
|
|
60
|
+
```
|
|
61
|
+
a*b + a*c
|
|
62
|
+
```
|
|
63
|
+
|
|
64
|
+
it does not ask *"which instruction is cheaper?"* It recognizes the
|
|
65
|
+
**distributive law** and rewrites the algorithm:
|
|
66
|
+
|
|
67
|
+
```
|
|
68
|
+
a*(b + c) # one multiply instead of two
|
|
69
|
+
```
|
|
70
|
+
|
|
71
|
+
It does this not with a hand-written branch for every case, but by knowing a
|
|
72
|
+
handful of algebraic *laws* and exploring all of their consequences at once,
|
|
73
|
+
then picking the cheapest equivalent form under a configurable cost model. The
|
|
74
|
+
same engine that factors a sum also discovers Horner's scheme for polynomials
|
|
75
|
+
and cancels terms that destroy each other; none of it was special-cased.
|
|
76
|
+
|
|
77
|
+
> **The technique:** *equality saturation* over an *e-graph*, the same idea
|
|
78
|
+
> behind [`egg`](https://egraphs-good.github.io/) and the
|
|
79
|
+
> [Herbie](https://herbie.uwplse.org/) floating-point optimizer. See
|
|
80
|
+
> [How it works](#how-it-works).
|
|
81
|
+
|
|
82
|
+
---
|
|
83
|
+
|
|
84
|
+
## Quick start
|
|
85
|
+
|
|
86
|
+
No dependencies. Pure Python (3.9+).
|
|
87
|
+
|
|
88
|
+
```bash
|
|
89
|
+
pip install logic-loom
|
|
90
|
+
```
|
|
91
|
+
|
|
92
|
+
Then use it from the command line:
|
|
93
|
+
|
|
94
|
+
```bash
|
|
95
|
+
# optimize an expression
|
|
96
|
+
logic-loom "a*b + a*c"
|
|
97
|
+
# a * b + a * c => a * (b + c)
|
|
98
|
+
|
|
99
|
+
# the module form works too
|
|
100
|
+
python -m logic_loom "a*b + a*c"
|
|
101
|
+
|
|
102
|
+
# optimize for a hardware target and emit LLVM IR
|
|
103
|
+
python -m logic_loom --profile gpu --target llvm "a/b + c/b"
|
|
104
|
+
|
|
105
|
+
# track the domain assumptions a simplification relies on
|
|
106
|
+
python -m logic_loom --extended --explain "sqrt(x)*sqrt(x)"
|
|
107
|
+
# sqrt(x) * sqrt(x) => x
|
|
108
|
+
# assumes (for soundness): ?x >= 0
|
|
109
|
+
|
|
110
|
+
# respect side effects: do not collapse two random draws into zero
|
|
111
|
+
python -m logic_loom --impure rand "rand(s) - rand(s)"
|
|
112
|
+
# rand(s) - rand(s) == rand(s) - rand(s)
|
|
113
|
+
```
|
|
114
|
+
|
|
115
|
+
From Python:
|
|
116
|
+
|
|
117
|
+
```python
|
|
118
|
+
from logic_loom import optimize, to_code
|
|
119
|
+
|
|
120
|
+
r = optimize("a*x*x + b*x + c")
|
|
121
|
+
print(r.optimized) # x * (a * x + b) + c <- Horner's scheme
|
|
122
|
+
print(r.speedup) # 1.32
|
|
123
|
+
print(to_code(r.optimized, "c"))
|
|
124
|
+
```
|
|
125
|
+
|
|
126
|
+
Or run from a source checkout (and try the full showcase):
|
|
127
|
+
|
|
128
|
+
```bash
|
|
129
|
+
git clone https://github.com/elianalfonsolopezpreciado/Logic-Loom.git
|
|
130
|
+
cd Logic-Loom
|
|
131
|
+
pip install -e .
|
|
132
|
+
python examples/demo.py
|
|
133
|
+
```
|
|
134
|
+
|
|
135
|
+
---
|
|
136
|
+
|
|
137
|
+
## Showcase
|
|
138
|
+
|
|
139
|
+
Every row below is produced by the **same** engine and the **same** rule set;
|
|
140
|
+
nothing is special-cased. `cost` is the weighted operation count under the
|
|
141
|
+
default model.
|
|
142
|
+
|
|
143
|
+
| Input | Logic-Loom output | What it figured out | Cost |
|
|
144
|
+
|---|---|---|---|
|
|
145
|
+
| `a*b + a*c` | `a * (b + c)` | distributive law / factoring | 5.4 -> 3.3 |
|
|
146
|
+
| `p*q + p*r + p*s` | `p * (q + (r + s))` | factor a term shared by three products | 8.6 -> 4.4 |
|
|
147
|
+
| `a*x*x + b*x + c` | `x * (a*x + b) + c` | **Horner's scheme, discovered** | 8.6 -> 6.5 |
|
|
148
|
+
| `a*(b + c) - a*b` | `a * c` | expand, then cancel `a*b` | 6.5 -> 2.2 |
|
|
149
|
+
| `2*3 + 4*x*0 + a*1` | `6 + a` | constant folding + identities | 10.7 -> 1.2 |
|
|
150
|
+
| `2*x + 3*x` | `x * 5` | combine like terms | 5.4 -> 2.2 |
|
|
151
|
+
| `x + 0 - x + 5` | `5` | self-inverse vanishes | 3.4 -> 0.1 |
|
|
152
|
+
| `x/x + y - y` | `1` | division and subtraction cancel | 6.4 -> 0.1 |
|
|
153
|
+
| `(a+b)/c + (a-b)/c` | `(a + a) / c` | combine over a denominator | 11.6 -> 5.3 |
|
|
154
|
+
|
|
155
|
+
Run `python examples/demo.py` to reproduce all of these with live statistics.
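
The `Cost` column can be checked by hand. The sketch below assumes weights of 1 for `+` and `-`, 2 for `*`, 4 for `/`, and 0.1 per leaf; these values are inferred from the rows of the table, not read from `logic_loom/cost.py`, and the tuple encoding and the `cost` helper are hypothetical.

```python
# Hypothetical weights inferred from the Showcase table (not taken from cost.py).
WEIGHTS = {"+": 1.0, "-": 1.0, "*": 2.0, "/": 4.0}
LEAF = 0.1  # assumed cost of a variable or constant


def cost(term):
    """Weighted operation count of a term encoded as nested (op, lhs, rhs) tuples."""
    if not isinstance(term, tuple):
        return LEAF
    op, *args = term
    return WEIGHTS[op] + sum(cost(a) for a in args)


before = ("+", ("*", "a", "b"), ("*", "a", "c"))   # a*b + a*c
after = ("*", "a", ("+", "b", "c"))                # a*(b + c)
print(round(cost(before), 1), "->", round(cost(after), 1))   # 5.4 -> 3.3
```

Under the same assumed weights, the Horner row works out to 8.6 -> 6.5, which matches the table.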
|
|
156
|
+
|
|
157
|
+
---
|
|
158
|
+
|
|
159
|
+
## How it works
|
|
160
|
+
|
|
161
|
+
The key insight is that Logic-Loom never commits to a single rewrite. A greedy
|
|
162
|
+
compiler that applies `factor` too early can miss a better form that needed
|
|
163
|
+
`distribute` first. Logic-Loom sidesteps this **phase-ordering problem**
|
|
164
|
+
entirely by keeping *every* equivalent form alive simultaneously.
|
|
165
|
+
|
|
166
|
+
```mermaid
|
|
167
|
+
flowchart LR
|
|
168
|
+
A["a*b + a*c<br/>(text)"] --> B[Parse]
|
|
169
|
+
B --> P[Static analysis:<br/>prune rules, size limits]
|
|
170
|
+
P --> C["e-graph<br/>(one term)"]
|
|
171
|
+
C --> D{Equality<br/>saturation}
|
|
172
|
+
D -->|apply every law,<br/>both directions| C
|
|
173
|
+
D -->|nothing new learned| E[Extract cheapest<br/>under cost model]
|
|
174
|
+
E --> F["a*(b + c)<br/>(optimal form)"]
|
|
175
|
+
```
|
|
176
|
+
|
|
177
|
+
**1. The e-graph.** An *e-graph* stores a large set of equivalent expressions
|
|
178
|
+
compactly. Terms known to be equal are grouped into an *e-class*; an *e-node*
|
|
179
|
+
is an operator applied to e-*classes* rather than to concrete terms. So a
|
|
180
|
+
single `+` node over the classes `{a*b}` and `{a*c}` already represents *every*
|
|
181
|
+
term those classes contain.
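
To make the vocabulary concrete, here is a toy e-graph. It is a sketch under stated assumptions, not the code in `logic_loom/egraph.py`: the class name `EGraph` and its methods are hypothetical, and the `rebuild` below is a deliberately simple whole-table pass rather than an efficient worklist.

```python
class EGraph:
    """Toy e-graph: union-find over e-class ids plus a hashcons of e-nodes."""

    def __init__(self):
        self.parent = []     # union-find parent pointers, one per e-class id
        self.hashcons = {}   # canonical e-node -> e-class id

    def find(self, cid):
        while self.parent[cid] != cid:
            self.parent[cid] = self.parent[self.parent[cid]]  # path halving
            cid = self.parent[cid]
        return cid

    def add(self, op, *children):
        # An e-node is an operator applied to e-class ids, not to concrete terms.
        node = (op, *(self.find(c) for c in children))
        if node not in self.hashcons:
            self.parent.append(len(self.parent))
            self.hashcons[node] = len(self.parent) - 1
        return self.find(self.hashcons[node])

    def union(self, x, y):
        x, y = self.find(x), self.find(y)
        if x != y:
            self.parent[y] = x
        return x

    def rebuild(self):
        """Re-canonicalize e-nodes and merge classes whose nodes became identical."""
        changed = True
        while changed:
            changed = False
            fresh = {}
            for (op, *kids), cid in self.hashcons.items():
                node = (op, *(self.find(k) for k in kids))
                cid = self.find(cid)
                if node in fresh and self.find(fresh[node]) != cid:
                    self.union(fresh[node], cid)   # congruence: same op, same classes
                    changed = True
                fresh[node] = self.find(cid)
            self.hashcons = fresh


g = EGraph()
a, b, c = g.add("a"), g.add("b"), g.add("c")
lhs = g.add("+", g.add("*", a, b), g.add("*", a, c))   # a*b + a*c
rhs = g.add("*", a, g.add("+", b, c))                  # a*(b + c)
neg_lhs, neg_rhs = g.add("neg", lhs), g.add("neg", rhs)
g.union(lhs, rhs)    # record the equality instead of replacing a term
g.rebuild()
print(g.find(lhs) == g.find(rhs), g.find(neg_lhs) == g.find(neg_rhs))
```

After the union, `rebuild` also merges the two `neg` nodes, because they now apply the same operator to the same e-class.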
|
|
182
|
+
|
|
183
|
+
**2. Equality saturation.** Algebraic laws are applied as *rewrites* that
|
|
184
|
+
**add** equalities instead of replacing terms:
|
|
185
|
+
|
|
186
|
+
```
|
|
187
|
+
distribute : ?a * (?b + ?c) == ?a*?b + ?a*?c
|
|
188
|
+
factor : ?a*?b + ?a*?c == ?a * (?b + ?c)
|
|
189
|
+
comm-add : ?a + ?b == ?b + ?a
|
|
190
|
+
assoc-mul : (?a*?b)*?c == ?a*(?b*?c)
|
|
191
|
+
self-mul : ?a * ?a == ?a ^ 2
|
|
192
|
+
... (see logic_loom/rules.py)
|
|
193
|
+
```
|
|
194
|
+
|
|
195
|
+
Because rewrites only *add* information, contradictory-looking rules
|
|
196
|
+
(`distribute` **and** `factor`) coexist without looping, and the result is
|
|
197
|
+
independent of the order rules fire in, provided saturation completes. The engine runs until the laws teach it
|
|
198
|
+
nothing new (the graph is **saturated**) or a resource limit is reached.
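
The "add, never replace" idea can be illustrated without an e-graph at all. The sketch below is a naive stand-in that enumerates whole terms in a Python set, so it shares nothing between forms and would not scale; the three laws are hard-coded as functions, and the weights (1 for `+`, 2 for `*`, 0.1 per leaf) are assumptions.

```python
def rewrites(t):
    """Yield every term reachable from t by one law applied at the root."""
    if not isinstance(t, tuple):
        return
    op, x, y = t
    if op in ("+", "*"):                                        # commutativity
        yield (op, y, x)
    if op == "*" and isinstance(y, tuple) and y[0] == "+":      # distribute
        yield ("+", ("*", x, y[1]), ("*", x, y[2]))
    if (op == "+" and isinstance(x, tuple) and isinstance(y, tuple)
            and x[0] == y[0] == "*" and x[1] == y[1]):          # factor
        yield ("*", x[1], ("+", x[2], y[2]))


def neighbours(t):
    """One law applied at the root or anywhere inside t."""
    yield from rewrites(t)
    if isinstance(t, tuple):
        op, x, y = t
        for x2 in neighbours(x):
            yield (op, x2, y)
        for y2 in neighbours(y):
            yield (op, x, y2)


def saturate(start, limit=10_000):
    """Grow the set of known-equal forms until nothing new appears (or the cap hits)."""
    seen, frontier = {start}, [start]
    while frontier and len(seen) < limit:
        new = {n for t in frontier for n in neighbours(t)} - seen
        seen |= new
        frontier = list(new)
    return seen


WEIGHTS = {"+": 1.0, "*": 2.0}   # assumed weights; 0.1 per leaf


def cost(t):
    return 0.1 if not isinstance(t, tuple) else WEIGHTS[t[0]] + cost(t[1]) + cost(t[2])


start = ("+", ("*", "a", "b"), ("*", "a", "c"))                 # a*b + a*c
forms = saturate(start)
# repr breaks cost ties so the choice does not depend on set iteration order
best = min(forms, key=lambda t: (round(cost(t), 6), repr(t)))
print(best)
```

`distribute` and `factor` undo each other, yet the loop terminates: a form that is already in `seen` is never added twice, so the frontier eventually empties.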
|
|
199
|
+
|
|
200
|
+
**3. Extraction.** The saturated graph now contains all discovered forms. A
|
|
201
|
+
[cost model](https://github.com/elianalfonsolopezpreciado/Logic-Loom/blob/main/logic_loom/cost.py) assigns each operator a weight, and a
|
|
202
|
+
fixed-point iteration picks the cheapest representative of each e-class. Extraction is
|
|
203
|
+
deterministic: the same input yields the same output regardless of hash seed.
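
Extraction can be sketched as a bottom-up fixed point over a hand-built e-graph. The dictionary layout, the `extract` helper, and the weights (1 for `+`, 2 for `*`, 0.1 per leaf) are assumptions made for illustration, not the representation used in `logic_loom/cost.py`.

```python
import math

WEIGHTS = {"+": 1.0, "*": 2.0}   # assumed operator weights
LEAF = 0.1                       # assumed cost of a leaf

# Hand-built e-graph: e-class id -> list of e-nodes (op, child class ids...).
egraph = {
    "a": [("a",)], "b": [("b",)], "c": [("c",)],
    "ab": [("*", "a", "b")],
    "ac": [("*", "a", "c")],
    "b+c": [("+", "b", "c")],
    "root": [("+", "ab", "ac"), ("*", "a", "b+c")],   # both forms proven equal
}


def extract(egraph, root):
    best = {cid: (math.inf, None) for cid in egraph}  # class -> (cost, chosen node)
    changed = True
    while changed:                                    # iterate to a fixed point
        changed = False
        for cid in sorted(egraph):                    # sorted: order is deterministic
            for node in egraph[cid]:
                op, *kids = node
                weight = WEIGHTS[op] if kids else LEAF
                total = weight + sum(best[k][0] for k in kids)
                if total < best[cid][0]:              # strict: first-found wins ties
                    best[cid] = (total, node)
                    changed = True

    def build(cid):
        op, *kids = best[cid][1]
        return op if not kids else (op, *(build(k) for k in kids))

    return best[root][0], build(root)


total, term = extract(egraph, "root")
print(round(total, 1), term)
```

Iterating classes in sorted order and accepting only strict improvements is one simple way to make the result independent of dictionary or hash ordering.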
|
|
204
|
+
|
|
205
|
+
**4. Smarter limits.** Two mechanisms keep the inherently explosive search
|
|
206
|
+
under control:
|
|
207
|
+
|
|
208
|
+
- **Static pruning and auto-sizing** (`logic_loom/analysis.py`) runs *before*
|
|
209
|
+
saturation. A fixed-point analysis computes which rules can ever fire given the
|
|
210
|
+
operators in the input, and drops the rest; it also sizes the resource limits
|
|
211
|
+
to the input's complexity. On a polynomial, every transcendental rule is
|
|
212
|
+
pruned automatically. This alone cut the test-suite runtime by roughly 6x.
|
|
213
|
+
- **A backoff scheduler** (after egg's `BackoffScheduler`) throttles rules that
|
|
214
|
+
match explosively at runtime: when a rule blows past its budget it is briefly
|
|
215
|
+
banned, and its budget then doubles, so a productive rule is delayed but never
|
|
216
|
+
silenced. A hard node cap remains as the final guarantee of termination.
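
The static-pruning step in the first bullet can be sketched as a reachability fixed point over operators. The rule table below is hypothetical (only `distribute`, `factor` and `self-mul` are named in this README; `pow-split`, `exp-prod` and `log-exp` are made up for the example), and each rule is reduced to the operators its left side needs and its right side introduces.

```python
# name -> (operators the left side needs, operators the right side introduces)
RULES = {
    "distribute": ({"*", "+"}, {"*", "+"}),
    "factor":     ({"*", "+"}, {"*", "+"}),
    "self-mul":   ({"*"}, {"^"}),
    "pow-split":  ({"^"}, {"*", "^"}),          # hypothetical rule
    "exp-prod":   ({"exp", "*"}, {"exp", "+"}),  # hypothetical rule
    "log-exp":    ({"log", "exp"}, set()),       # hypothetical rule
}


def prune(rules, input_ops):
    """Return the names of rules that can ever fire, given the input's operators."""
    reachable, live = set(input_ops), set()
    changed = True
    while changed:                     # repeat until no new rule becomes live
        changed = False
        for name, (needs, adds) in rules.items():
            if name not in live and needs <= reachable:
                live.add(name)
                reachable |= adds      # a live rule may enable further rules
                changed = True
    return live


print(sorted(prune(RULES, {"*", "+"})))
```

On a polynomial input, `self-mul` introduces `^` and so keeps `pow-split` alive, while the two rules that need `exp` or `log` are dropped before saturation starts.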
|
|
217
|
+
|
|
218
|
+
---
|
|
219
|
+
|
|
220
|
+
## Hardware-aware cost models
|
|
221
|
+
|
|
222
|
+
"Optimal" is not absolute; it depends on what you are optimizing for. On a GPU
|
|
223
|
+
a division can cost as much as eight multiplies and a transcendental call far
|
|
224
|
+
more; on a modern x86 core a fused multiply-add can issue a `*` and the `+` that consumes it as one instruction.
|
|
225
|
+
Logic-Loom makes the cost model a first-class input:
|
|
226
|
+
|
|
227
|
+
```bash
|
|
228
|
+
python -m logic_loom --profile gpu "a/b + c/b" # strongly prefers one divide
|
|
229
|
+
python -m logic_loom --profile x86 "..."
|
|
230
|
+
python -m logic_loom --profile arm "..."
|
|
231
|
+
```
|
|
232
|
+
|
|
233
|
+
Built-in profiles (`default`, `x86`, `arm`, `gpu`) encode *relative* latencies
|
|
234
|
+
in the spirit of published instruction tables. The model genuinely changes the
|
|
235
|
+
chosen form, not just the reported number:
|
|
236
|
+
|
|
237
|
+
```python
|
|
238
|
+
from logic_loom import optimize, CostModel
|
|
239
|
+
|
|
240
|
+
cheap_pow = CostModel("cheap-pow", {"+": 1, "-": 1, "*": 2, "/": 4, "^": 1})
|
|
241
|
+
optimize("x*x*x").optimized # x * (x * x) (default: powers are dear)
|
|
242
|
+
optimize("x*x*x", model=cheap_pow).optimized # x^3 (powers are cheap here)
|
|
243
|
+
```
|
|
244
|
+
|
|
245
|
+
You can load a profile measured with your own micro-benchmarks:
|
|
246
|
+
|
|
247
|
+
```python
|
|
248
|
+
CostModel.from_json("examples/profile.skylake.json")
|
|
249
|
+
```
|
|
250
|
+
|
|
251
|
+
---
|
|
252
|
+
|
|
253
|
+
## Domain safety and assumptions
|
|
254
|
+
|
|
255
|
+
Some algebraic laws are only valid on part of the real line: `x/x = 1` assumes
|
|
256
|
+
`x != 0`, `sqrt(x)*sqrt(x) = x` assumes `x >= 0`, `log(a*b) = log a + log b`
|
|
257
|
+
assumes both arguments are positive. Logic-Loom records the preconditions of
|
|
258
|
+
every rule, and reports exactly which assumptions a given result relied on:
|
|
259
|
+
|
|
260
|
+
```bash
|
|
261
|
+
python -m logic_loom --extended --explain "sqrt(x)*sqrt(x)"
|
|
262
|
+
# sqrt(x) * sqrt(x) => x
|
|
263
|
+
# assumes (for soundness): ?x >= 0
|
|
264
|
+
```
|
|
265
|
+
|
|
266
|
+
```python
|
|
267
|
+
r = optimize("x / x")
|
|
268
|
+
r.optimized # 1
|
|
269
|
+
r.assumptions # ['?a != 0']
|
|
270
|
+
```
|
|
271
|
+
|
|
272
|
+
The set is a conservative over-approximation (it lists the assumptions of every
|
|
273
|
+
rule that fired on the way to the result), which is the safe direction: it never
|
|
274
|
+
hides a precondition. Surfacing these is the foundation for emitting guarded
|
|
275
|
+
code or refusing an unsafe rewrite, rather than silently assuming validity.
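
The over-approximation described above amounts to a union over the rules that fired. A minimal sketch, assuming a hypothetical `Rule` record with an `assumes` tuple; the rule names are illustrative and the real bookkeeping in `logic_loom/rules.py` may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    assumes: tuple = ()   # preconditions, kept as human-readable strings


DIV_SELF = Rule("div-self", assumes=("?a != 0",))   # x/x = 1
SQRT_SQ = Rule("sqrt-sq", assumes=("?x >= 0",))     # sqrt(x)*sqrt(x) = x
COMM_ADD = Rule("comm-add")                         # valid everywhere


def assumptions(fired):
    """Union of the preconditions of every rule that fired, in first-seen order."""
    seen = []
    for r in fired:
        for a in r.assumes:
            if a not in seen:
                seen.append(a)
    return seen


print(assumptions([COMM_ADD, DIV_SELF, COMM_ADD, DIV_SELF]))   # ['?a != 0']
```

Because every fired rule contributes, the list may include a precondition that the final form did not strictly need, but it cannot omit one that it did.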
|
|
276
|
+
|
|
277
|
+
---
|
|
278
|
+
|
|
279
|
+
## Side-effect awareness
|
|
280
|
+
|
|
281
|
+
By default Logic-Loom assumes expressions are *pure*, so terms may be freely
|
|
282
|
+
duplicated, dropped or reordered. That is wrong when a subexpression has side
|
|
283
|
+
effects (reading input, drawing a random number, mutating state). Name the
|
|
284
|
+
impure functions and the engine taints every term that can contain such a call,
|
|
285
|
+
then refuses any rewrite that would change how often it runs or in what order:
|
|
286
|
+
|
|
287
|
+
```python
|
|
288
|
+
optimize("rand(s) - rand(s)") # 0 (pure assumption)
|
|
289
|
+
optimize("rand(s) - rand(s)", impure={"rand"}) # rand(s) - rand(s) (preserved)
|
|
290
|
+
optimize("rand(s) + rand(s)", impure={"rand"}) # rand(s) + rand(s) (not 2*rand)
|
|
291
|
+
```
|
|
292
|
+
|
|
293
|
+
A rewrite is rejected if it would duplicate, eliminate, or reorder a tainted
|
|
294
|
+
term. The check is deliberately conservative -- when in doubt it keeps the
|
|
295
|
+
effect -- which is exactly what soundness in the presence of side effects
|
|
296
|
+
requires. Pure factors *around* impure calls (where each call still runs
|
|
297
|
+
exactly once) are still optimized.
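
One conservative way to phrase such a check is to compare the impure calls on both sides of a rewrite. The helper names `impure_calls` and `is_safe` and the tuple encoding are hypothetical; this is a sketch of the idea, not the taint analysis in `logic_loom/effects.py`.

```python
def impure_calls(term, impure):
    """Impure calls in left-to-right evaluation order (arguments before the call)."""
    if not isinstance(term, tuple):
        return []
    op, *args = term
    calls = [c for a in args for c in impure_calls(a, impure)]
    if op in impure:
        calls.append(term)
    return calls


def is_safe(before, after, impure):
    """Accept a rewrite only if it keeps the same impure calls, count and order."""
    return impure_calls(before, impure) == impure_calls(after, impure)


rand = ("rand", "s")
print(is_safe(("-", rand, rand), 0, impure={"rand"}))               # False: drops both draws
print(is_safe(("+", rand, rand), ("*", 2, rand), impure={"rand"}))  # False: one draw instead of two
print(is_safe(("+", ("*", "a", rand), ("*", "a", "b")),
              ("*", "a", ("+", rand, "b")), impure={"rand"}))       # True: rand still runs once
```

With an empty `impure` set every rewrite passes, which corresponds to the default pure assumption.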
|
|
298
|
+
|
|
299
|
+
---
|
|
300
|
+
|
|
301
|
+
## Code generation
|
|
302
|
+
|
|
303
|
+
Optimizing an expression is only useful if you can run it. Logic-Loom emits the
|
|
304
|
+
optimized form as real source code in C, Rust, JavaScript, or **LLVM IR** so it
|
|
305
|
+
can plug into an existing toolchain:
|
|
306
|
+
|
|
307
|
+
```bash
|
|
308
|
+
python -m logic_loom --target c "a*x*x + b*x + c" # x * (a * x + b) + c
|
|
309
|
+
python -m logic_loom --target rust "x ^ 3" # (x).powf(3)
|
|
310
|
+
python -m logic_loom --target llvm "a*x + b"
|
|
311
|
+
```
|
|
312
|
+
|
|
313
|
+
```llvm
|
|
314
|
+
define double @f(double %a, double %b, double %x) {
|
|
315
|
+
entry:
|
|
316
|
+
%t1 = fmul double %a, %x
|
|
317
|
+
%t2 = fadd double %t1, %b
|
|
318
|
+
ret double %t2
|
|
319
|
+
}
|
|
320
|
+
```
|
|
321
|
+
|
|
322
|
+
The LLVM backend emits SSA IR with the right intrinsics
|
|
323
|
+
(`@llvm.pow.f64`, `@llvm.sqrt.f64`, ...) and their declarations, ready for
|
|
324
|
+
`opt`/`clang` to inline and vectorize. Power and function calls in the textual
|
|
325
|
+
targets are rendered idiomatically per language (`pow` / `Math.pow` / `.powf`).
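
For intuition, SSA emission for the four arithmetic operators fits in a short post-order walk. The function `to_llvm` and the tuple encoding are hypothetical, and the sketch handles only variables and binary `+ - * /` (no constants, powers or intrinsics), so it is far narrower than `logic_loom/codegen.py`.

```python
OPS = {"+": "fadd", "-": "fsub", "*": "fmul", "/": "fdiv"}


def variables(term):
    """Set of variable names appearing in a nested (op, lhs, rhs) term."""
    if isinstance(term, tuple):
        return set().union(*(variables(a) for a in term[1:]))
    return {term}


def to_llvm(term, name="f"):
    lines, counter = [], 0

    def emit(t):
        nonlocal counter
        if isinstance(t, str):
            return f"%{t}"                 # a variable is a function parameter
        op, lhs, rhs = t
        l, r = emit(lhs), emit(rhs)        # operands first (post-order)
        counter += 1                       # each result gets a fresh SSA name
        lines.append(f"  %t{counter} = {OPS[op]} double {l}, {r}")
        return f"%t{counter}"

    result = emit(term)
    params = ", ".join(f"double %{v}" for v in sorted(variables(term)))
    return "\n".join([f"define double @{name}({params}) {{", "entry:",
                      *lines, f"  ret double {result}", "}"])


print(to_llvm(("+", ("*", "a", "x"), "b")))   # a*x + b
```

For `a*x + b` this prints the same instructions as the IR listing above, with parameters in alphabetical order.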
|
|
326
|
+
|
|
327
|
+
---
|
|
328
|
+
|
|
329
|
+
## Extended domain: transcendental functions
|
|
330
|
+
|
|
331
|
+
By default Logic-Loom reasons over polynomial/rational arithmetic. The
|
|
332
|
+
`--extended` flag (or `rules=ALL_RULES`) adds identities for exponentials,
|
|
333
|
+
logarithms, square roots and trigonometry, each validated numerically in the
|
|
334
|
+
test-suite:
|
|
335
|
+
|
|
336
|
+
| Input | Output | Identity used |
|
|
337
|
+
|---|---|---|
|
|
338
|
+
| `exp(a) * exp(b)` | `exp(a + b)` | product of exponentials |
|
|
339
|
+
| `log(exp(x))` | `x` | log and exp are inverse |
|
|
340
|
+
| `sin(x)^2 + cos(x)^2` | `1` | Pythagorean identity |
|
|
341
|
+
| `sqrt(x) * sqrt(x)` | `x` | square root squared |
|
|
342
|
+
|
|
343
|
+
These assume the usual real domains, which is why they are opt-in and why their
|
|
344
|
+
[assumptions](#domain-safety-and-assumptions) are tracked.
|
|
345
|
+
|
|
346
|
+
---
|
|
347
|
+
|
|
348
|
+
## Visualize the e-graph
|
|
349
|
+
|
|
350
|
+
To see what saturation actually explores, export the e-graph to Graphviz:
|
|
351
|
+
|
|
352
|
+
```bash
|
|
353
|
+
python -m logic_loom --dot "a*b + a*c" > egraph.dot
|
|
354
|
+
dot -Tsvg egraph.dot -o egraph.svg
|
|
355
|
+
```
|
|
356
|
+
|
|
357
|
+
Each dashed box is an e-class (a set of forms proven equal); nodes inside it are
|
|
358
|
+
the different ways to build a value in that class; edges point from an operator
|
|
359
|
+
to the classes of its operands.
|
|
360
|
+
|
|
361
|
+
---
|
|
362
|
+
|
|
363
|
+
## Teach it new mathematics
|
|
364
|
+
|
|
365
|
+
Rules are one-liners, optionally annotated with domain assumptions:
|
|
366
|
+
|
|
367
|
+
```python
|
|
368
|
+
from logic_loom import optimize, rule, DEFAULT_RULES
|
|
369
|
+
|
|
370
|
+
power_of_two = rule("pow2", "?x ^ 2", "?x * ?x")
|
|
371
|
+
|
|
372
|
+
r = optimize("(a + b) ^ 2", rules=DEFAULT_RULES + [power_of_two])
|
|
373
|
+
print(r.optimized)
|
|
374
|
+
```
|
|
375
|
+
|
|
376
|
+
A rule is `rule(name, left_pattern, right_pattern, assumes=(...))`, where
|
|
377
|
+
`?name` marks a pattern variable. You describe a *theorem*, not a procedure;
|
|
378
|
+
Logic-Loom decides when and where it pays off.
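
What a pattern variable does can be shown with a small matcher over plain terms. This is a sketch: `match` and `substitute` are hypothetical helpers working on nested tuples, whereas the library matches against e-classes (e-matching) in `logic_loom/rules.py`.

```python
def match(pattern, term, env=None):
    """Return bindings that make pattern equal to term, or None if there are none."""
    env = dict(env or {})
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in env and env[pattern] != term:
            return None                    # a variable must bind the same term everywhere
        env[pattern] = term
        return env
    if isinstance(pattern, tuple):
        if not (isinstance(term, tuple) and len(term) == len(pattern)
                and term[0] == pattern[0]):
            return None
        for p, t in zip(pattern[1:], term[1:]):
            env = match(p, t, env)
            if env is None:
                return None
        return env
    return env if pattern == term else None   # literal leaf


def substitute(pattern, env):
    """Instantiate a pattern with the bindings found by match."""
    if isinstance(pattern, tuple):
        return (pattern[0], *(substitute(p, env) for p in pattern[1:]))
    return env.get(pattern, pattern)


lhs, rhs = ("^", "?x", 2), ("*", "?x", "?x")        # the pow2 rule from above
env = match(lhs, ("^", ("+", "a", "b"), 2))
print(substitute(rhs, env))
```

The repeated `?a` in a pattern such as `?a*?b + ?a*?c` is what restricts factoring to products that share a factor: the second occurrence must match the term the first one bound.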
|
|
379
|
+
|
|
380
|
+
---
|
|
381
|
+
|
|
382
|
+
## Correctness
|
|
383
|
+
|
|
384
|
+
A clever optimizer is worthless if it is ever *wrong*. Logic-Loom is backed by
|
|
385
|
+
**differential testing**: for each example the original and optimized
|
|
386
|
+
expressions are evaluated on hundreds of random inputs and asserted to agree to
|
|
387
|
+
floating-point tolerance.
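
A differential test of this kind needs very little machinery. The sketch below hand-writes the two forms of the Horner example as Python functions instead of calling the library; the helper `agree`, the sampling range and the tolerances are assumptions, not the values used in the project's test-suite.

```python
import math
import random


def original(a, b, c, x):
    return a * x * x + b * x + c


def optimized(a, b, c, x):            # Horner form reported in the Showcase
    return x * (a * x + b) + c


def agree(f, g, arity, trials=500, seed=0):
    """True if f and g agree to tolerance on `trials` random argument tuples."""
    rng = random.Random(seed)         # fixed seed keeps any failure reproducible
    for _ in range(trials):
        args = [rng.uniform(-10.0, 10.0) for _ in range(arity)]
        if not math.isclose(f(*args), g(*args), rel_tol=1e-9, abs_tol=1e-9):
            return False
    return True


print(agree(original, optimized, arity=4))
```

The absolute tolerance matters when the expression value is close to zero, where a purely relative comparison would reject harmless rounding differences.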
|
|
388
|
+
|
|
389
|
+
```bash
|
|
390
|
+
pip install pytest
|
|
391
|
+
pytest -q # 53 passed
|
|
392
|
+
```
|
|
393
|
+
|
|
394
|
+
The suite covers parsing, code generation (including LLVM IR), cost models,
|
|
395
|
+
static pruning, side-effect safety, every class of optimization, and -- most
|
|
396
|
+
importantly -- that **no rewrite changes what an expression computes**, up to floating-point tolerance on the sampled inputs.
|
|
397
|
+
|
|
398
|
+
---
|
|
399
|
+
|
|
400
|
+
## Architecture
|
|
401
|
+
|
|
402
|
+
A compact, readable codebase.
|
|
403
|
+
|
|
404
|
+
| File | Responsibility |
|
|
405
|
+
|---|---|
|
|
406
|
+
| [`logic_loom/expr.py`](https://github.com/elianalfonsolopezpreciado/Logic-Loom/blob/main/logic_loom/expr.py) | expression AST, pretty-printer, numeric evaluator |
|
|
407
|
+
| [`logic_loom/parser.py`](https://github.com/elianalfonsolopezpreciado/Logic-Loom/blob/main/logic_loom/parser.py) | Pratt parser (precedence, unary minus, calls, `?patvars`) |
|
|
408
|
+
| [`logic_loom/egraph.py`](https://github.com/elianalfonsolopezpreciado/Logic-Loom/blob/main/logic_loom/egraph.py) | the e-graph: union-find, hashcons, congruence `rebuild` |
|
|
409
|
+
| [`logic_loom/rules.py`](https://github.com/elianalfonsolopezpreciado/Logic-Loom/blob/main/logic_loom/rules.py) | rewrite rules (default + extended), assumptions, e-matching |
|
|
410
|
+
| [`logic_loom/analysis.py`](https://github.com/elianalfonsolopezpreciado/Logic-Loom/blob/main/logic_loom/analysis.py) | static rule pruning and automatic limit sizing |
|
|
411
|
+
| [`logic_loom/saturate.py`](https://github.com/elianalfonsolopezpreciado/Logic-Loom/blob/main/logic_loom/saturate.py) | saturation loop, constant folding, backoff scheduler |
|
|
412
|
+
| [`logic_loom/effects.py`](https://github.com/elianalfonsolopezpreciado/Logic-Loom/blob/main/logic_loom/effects.py) | side-effect taint analysis and rewrite safety |
|
|
413
|
+
| [`logic_loom/cost.py`](https://github.com/elianalfonsolopezpreciado/Logic-Loom/blob/main/logic_loom/cost.py) | cost models, hardware profiles, cheapest-term extraction |
|
|
414
|
+
| [`logic_loom/codegen.py`](https://github.com/elianalfonsolopezpreciado/Logic-Loom/blob/main/logic_loom/codegen.py) | emit C / Rust / JavaScript / LLVM IR |
|
|
415
|
+
| [`logic_loom/viz.py`](https://github.com/elianalfonsolopezpreciado/Logic-Loom/blob/main/logic_loom/viz.py) | Graphviz DOT export of the e-graph |
|
|
416
|
+
| [`logic_loom/compiler.py`](https://github.com/elianalfonsolopezpreciado/Logic-Loom/blob/main/logic_loom/compiler.py) | the high-level `optimize()` API |
|
|
417
|
+
| [`logic_loom/cli.py`](https://github.com/elianalfonsolopezpreciado/Logic-Loom/blob/main/logic_loom/cli.py) | the `python -m logic_loom` command line |
|
|
418
|
+
|
|
419
|
+
---
|
|
420
|
+
|
|
421
|
+
## Limitations
|
|
422
|
+
|
|
423
|
+
This is a focused, working demonstration of a powerful idea, not a
|
|
424
|
+
production-grade computer algebra system. The current boundaries are explicit:
|
|
425
|
+
|
|
426
|
+
- **Cost models are configurable but still illustrative.** Profiles for x86,
|
|
427
|
+
ARM and GPU exist and can be loaded from JSON, and the model demonstrably
|
|
428
|
+
changes the extracted form. The shipped numbers are *relative* estimates,
|
|
429
|
+
however, not values measured on a specific chip; calibrate with
|
|
430
|
+
micro-benchmarks before drawing performance conclusions.
|
|
431
|
+
|
|
432
|
+
- **Domain safety is tracked, not enforced.** Assumptions like `x != 0` and
|
|
433
|
+
`x >= 0` are recorded and reported per result, which is the right foundation.
|
|
434
|
+
The engine does not yet *prove* those conditions hold, emit runtime guards
|
|
435
|
+
automatically, or model floating-point rounding and NaN propagation; a sound
|
|
436
|
+
numerical pipeline would add those checks.
|
|
437
|
+
|
|
438
|
+
- **Combinatorial explosion is bounded, not eliminated.** Static pruning, a
|
|
439
|
+
backoff scheduler and a node cap keep the search tractable and guarantee
|
|
440
|
+
termination, but over many associative/commutative operators the graph can
|
|
441
|
+
still hit a resource limit. In that case the *globally* optimal form is not
|
|
442
|
+
guaranteed; Logic-Loom returns the best form found within the budget.
|
|
443
|
+
|
|
444
|
+
- **Toolchain integration is via code emission, not a compiler pass.** It emits
|
|
445
|
+
LLVM IR text that an existing toolchain can consume, but it is not yet an
|
|
446
|
+
in-tree LLVM optimization pass or a registered backend for a language front
|
|
447
|
+
end. Wiring it directly into a compiler's pass pipeline remains future work.
|
|
448
|
+
|
|
449
|
+
- **Side-effect handling is sound but coarse.** Impure calls are never
|
|
450
|
+
duplicated, dropped or reordered, which is safe, but the analysis is
|
|
451
|
+
whole-term tainting rather than a precise effect system; it does not model
|
|
452
|
+
distinct effect kinds, aliasing, or ordering constraints between *different*
|
|
453
|
+
impure calls beyond forbidding their reordering.
|
|
454
|
+
|
|
455
|
+
- **Not a full CAS.** This is deliberately a *demonstrator* of the idea, not a
|
|
456
|
+
complete symbolic-mathematics system; it does not solve equations, integrate,
|
|
457
|
+
or simplify across the full breadth a CAS would.
|
|
458
|
+
|
|
459
|
+
---
|
|
460
|
+
|
|
461
|
+
## Roadmap
|
|
462
|
+
|
|
463
|
+
Directions for turning the demonstrator into something broader. Items marked
|
|
464
|
+
**(done)** ship today; **(partial)** are started with a clear next step.
|
|
465
|
+
|
|
466
|
+
- **Realistic cost model (partial).** Hardware profiles and JSON loading exist;
|
|
467
|
+
next is calibrating them from real micro-benchmarks per architecture and
|
|
468
|
+
modeling fused operations (FMA) and vector throughput.
|
|
469
|
+
- **Domain guards and numerical safety (partial).** Assumptions are tracked and
|
|
470
|
+
reported; next is proving or emitting runtime guards for them and modeling
|
|
471
|
+
floating-point behavior (rounding, NaN/Inf) so transcendental rewrites are
|
|
472
|
+
sound under IEEE-754.
|
|
473
|
+
- **Scalability heuristics (partial).** Static pruning and a backoff scheduler
|
|
474
|
+
are in place; next is cost-aware scheduling and detecting AC subgraphs that
|
|
475
|
+
cannot improve, to prune the search further before saturating.
|
|
476
|
+
- **Code generation (done, expanding).** C, Rust, JavaScript and LLVM IR are
|
|
477
|
+
supported; more targets and full function/statement emission are natural
|
|
478
|
+
extensions.
|
|
479
|
+
- **e-graph visualization (done, expanding).** DOT export exists; an
|
|
480
|
+
interactive viewer that animates saturation round by round would help.
|
|
481
|
+
- **Integration with toolchains.** Build a real LLVM optimization pass (or an
|
|
482
|
+
IR transpiler wired into the pass pipeline) so C++/Rust builds can call
|
|
483
|
+
Logic-Loom directly, rather than pasting emitted IR.
|
|
484
|
+
- **Richer side-effect model.** Move from whole-term tainting to a precise
|
|
485
|
+
effect system that tracks distinct effects and ordering constraints, enabling
|
|
486
|
+
optimization that reorders independent effects safely.
|
|
487
|
+
|
|
488
|
+
---
|
|
489
|
+
|
|
490
|
+
## Further reading
|
|
491
|
+
|
|
492
|
+
- M. Willsey et al., *"egg: Fast and Extensible Equality Saturation,"* POPL 2021 - the modern reference for e-graphs and equality saturation.
|
|
493
|
+
- R. Tate et al., *"Equality Saturation: A New Approach to Optimization,"* POPL 2009.
|
|
494
|
+
- **Herbie** - equality saturation applied to floating-point accuracy: <https://herbie.uwplse.org/>
|
|
495
|
+
- **egg / egglog** - <https://egraphs-good.github.io/>
|
|
496
|
+
|
|
497
|
+
---
|
|
498
|
+
|
|
499
|
+
<div align="center">
|
|
500
|
+
|
|
501
|
+
Built as an exploration of what a compiler looks like when it thinks like a
|
|
502
|
+
mathematician. MIT licensed; see [LICENSE](https://github.com/elianalfonsolopezpreciado/Logic-Loom/blob/main/LICENSE).
|
|
503
|
+
|
|
504
|
+
</div>
|