crossing-1.0.0.tar.gz
- crossing-1.0.0/LICENSE +21 -0
- crossing-1.0.0/PKG-INFO +305 -0
- crossing-1.0.0/README.md +272 -0
- crossing-1.0.0/crossing.egg-info/PKG-INFO +305 -0
- crossing-1.0.0/crossing.egg-info/SOURCES.txt +13 -0
- crossing-1.0.0/crossing.egg-info/dependency_links.txt +1 -0
- crossing-1.0.0/crossing.egg-info/entry_points.txt +3 -0
- crossing-1.0.0/crossing.egg-info/requires.txt +10 -0
- crossing-1.0.0/crossing.egg-info/top_level.txt +4 -0
- crossing-1.0.0/crossing.py +870 -0
- crossing-1.0.0/pyproject.toml +45 -0
- crossing-1.0.0/report.py +577 -0
- crossing-1.0.0/scan.py +402 -0
- crossing-1.0.0/semantic_scan.py +1324 -0
- crossing-1.0.0/setup.cfg +4 -0
crossing-1.0.0/LICENSE
ADDED
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
MIT License
|
|
2
|
+
|
|
3
|
+
Copyright (c) 2026 Friday
|
|
4
|
+
|
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
|
7
|
+
in the Software without restriction, including without limitation the rights
|
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
|
10
|
+
furnished to do so, subject to the following conditions:
|
|
11
|
+
|
|
12
|
+
The above copyright notice and this permission notice shall be included in all
|
|
13
|
+
copies or substantial portions of the Software.
|
|
14
|
+
|
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
|
21
|
+
SOFTWARE.
|
crossing-1.0.0/PKG-INFO
ADDED
|
@@ -0,0 +1,305 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: crossing
|
|
3
|
+
Version: 1.0.0
|
|
4
|
+
Summary: Detect silent information loss at system boundaries — semantic exception analysis and round-trip data loss fuzzing for Python
|
|
5
|
+
Author-email: Friday <friday@fridayops.xyz>
|
|
6
|
+
License-Expression: MIT
|
|
7
|
+
Project-URL: Homepage, https://fridayops.xyz
|
|
8
|
+
Project-URL: Repository, https://github.com/worksbyfriday/crossing
|
|
9
|
+
Project-URL: Issues, https://github.com/worksbyfriday/crossing/issues
|
|
10
|
+
Project-URL: Documentation, https://fridayops.xyz/crossing/
|
|
11
|
+
Keywords: testing,exceptions,semantic-analysis,data-loss,boundaries,fuzzing,information-theory,ast
|
|
12
|
+
Classifier: Development Status :: 4 - Beta
|
|
13
|
+
Classifier: Intended Audience :: Developers
|
|
14
|
+
Classifier: Programming Language :: Python :: 3
|
|
15
|
+
Classifier: Programming Language :: Python :: 3.10
|
|
16
|
+
Classifier: Programming Language :: Python :: 3.11
|
|
17
|
+
Classifier: Programming Language :: Python :: 3.12
|
|
18
|
+
Classifier: Programming Language :: Python :: 3.13
|
|
19
|
+
Classifier: Topic :: Software Development :: Testing
|
|
20
|
+
Classifier: Topic :: Software Development :: Quality Assurance
|
|
21
|
+
Classifier: Topic :: Software Development :: Libraries :: Python Modules
|
|
22
|
+
Requires-Python: >=3.10
|
|
23
|
+
Description-Content-Type: text/markdown
|
|
24
|
+
License-File: LICENSE
|
|
25
|
+
Provides-Extra: yaml
|
|
26
|
+
Requires-Dist: pyyaml>=6.0; extra == "yaml"
|
|
27
|
+
Provides-Extra: toml
|
|
28
|
+
Requires-Dist: tomli_w>=1.0; extra == "toml"
|
|
29
|
+
Provides-Extra: all
|
|
30
|
+
Requires-Dist: pyyaml>=6.0; extra == "all"
|
|
31
|
+
Requires-Dist: tomli_w>=1.0; extra == "all"
|
|
32
|
+
Dynamic: license-file
|
|
33
|
+
|
|
34
|
+
# Crossing
|
|
35
|
+
|
|
36
|
+
Detect silent information loss at system boundaries in Python codebases.
|
|
37
|
+
|
|
38
|
+
## Two Tools
|
|
39
|
+
|
|
40
|
+
### 1. Semantic Scanner — Exception Pattern Analysis
|
|
41
|
+
|
|
42
|
+
Find where the same exception type carries different meanings depending on the code path, but handlers can't distinguish them.
|
|
43
|
+
|
|
44
|
+
```bash
|
|
45
|
+
# Basic scan
|
|
46
|
+
crossing-semantic /path/to/project
|
|
47
|
+
|
|
48
|
+
# With implicit raises (dict access, getattr, etc.)
|
|
49
|
+
crossing-semantic --implicit /path/to/project
|
|
50
|
+
|
|
51
|
+
# JSON output for tooling
|
|
52
|
+
crossing-semantic --format json /path/to/project
|
|
53
|
+
|
|
54
|
+
# CI mode: fail if elevated/high risk crossings found
|
|
55
|
+
crossing-semantic --ci --min-risk elevated /path/to/project
|
|
56
|
+
```
|
|
57
|
+
|
|
58
|
+
Example: a `KeyError` that means "config key missing" and a `KeyError` that means "factor-filtered to empty" arrive at the same `except KeyError` handler. The handler assumes one meaning. The bug is silent.
|
|
59
|
+
|
|
60
|
+
### 2. Data Loss Fuzzer — Round-Trip Testing
|
|
61
|
+
|
|
62
|
+
Test whether information survives boundary crossings: serialization, API calls, database writes, format conversions.
|
|
63
|
+
|
|
64
|
+
```python
|
|
65
|
+
import json
from crossing import Crossing, cross
|
|
66
|
+
|
|
67
|
+
c = Crossing(
|
|
68
|
+
encode=lambda d: json.dumps(d),
|
|
69
|
+
decode=lambda s: json.loads(s),
|
|
70
|
+
)
|
|
71
|
+
|
|
72
|
+
report = cross(c, samples=1000)
|
|
73
|
+
report.print() # shows what was lost, where, and how
|
|
74
|
+
```
|
|
75
|
+
|
|
76
|
+
This isn't fuzzing for crashes. It's fuzzing for **silent data loss** — the operation succeeds but the output is missing something the input had.
|
|
77
|
+
|
|
78
|
+
---
|
|
79
|
+
|
|
80
|
+
## Semantic Scanner
|
|
81
|
+
|
|
82
|
+
### What It Finds
|
|
83
|
+
|
|
84
|
+
- **Polymorphic exceptions**: Multiple `raise` sites for the same exception type, caught by handlers that don't distinguish between them
|
|
85
|
+
- **Cross-function crossings**: Exceptions raised in called functions, caught by handlers in the caller
|
|
86
|
+
- **Cross-file crossings**: Same pattern across module boundaries via import resolution
|
|
87
|
+
- **Implicit raises**: `dict[key]` -> `KeyError`, `getattr(obj, name)` -> `AttributeError`, `int(x)` -> `ValueError`
|
|
88
|
+
- **Inheritance crossings**: `except ValueError` catching subclass raises like `ValidationError`
|
|
89
|
+
- **Scope analysis**: Whether handlers catch exceptions from direct raises or from called functions
|
|
90
|
+
- **Message differentiation**: Risk downgraded when all raise sites pass distinct string messages
|
|
91
|
+
|
|
92
|
+
### Risk Levels
|
|
93
|
+
|
|
94
|
+
| Level | Meaning |
|
|
95
|
+
|-------|---------|
|
|
96
|
+
| **low** | Single raise site, or polymorphic with matching handler strategies |
|
|
97
|
+
| **medium** | Multiple raise sites with uniform handler treatment |
|
|
98
|
+
| **elevated** | Scope mismatches or cross-function reachability |
|
|
99
|
+
| **high** | Many raise sites, few handlers, mixed implicit/explicit |
|
|
100
|
+
|
|
101
|
+
### CLI Options
|
|
102
|
+
|
|
103
|
+
```
|
|
104
|
+
crossing-semantic [OPTIONS] PATH
|
|
105
|
+
|
|
106
|
+
Options:
|
|
107
|
+
--implicit Detect implicit raises (dict access, getattr, etc.)
|
|
108
|
+
--format FORMAT Output format: text (default), json, markdown
|
|
109
|
+
--min-risk LEVEL Minimum risk to report: low, medium, elevated, high
|
|
110
|
+
--exclude PATTERN Exclude directories (repeatable)
|
|
111
|
+
--ci Exit code 1 if elevated/high risk crossings found
|
|
112
|
+
```
|
|
113
|
+
|
|
114
|
+
### Example Output
|
|
115
|
+
|
|
116
|
+
```
|
|
117
|
+
============================================================
|
|
118
|
+
Semantic Crossing Scan: /path/to/tox
|
|
119
|
+
============================================================
|
|
120
|
+
Files scanned: 42
|
|
121
|
+
Exception raises: 87 (58 explicit, 29 implicit)
|
|
122
|
+
Exception handlers: 34
|
|
123
|
+
Semantic crossings: 12
|
|
124
|
+
Polymorphic (multi-raise): 8
|
|
125
|
+
Elevated risk: 3
|
|
126
|
+
|
|
127
|
+
--- KeyError: 3 raise sites, 14 handlers --- high risk ---
|
|
128
|
+
3 raise sites across different loaders (API, TOML, INI),
|
|
129
|
+
14 handlers catching without distinguishing source
|
|
130
|
+
============================================================
|
|
131
|
+
```
|
|
132
|
+
|
|
133
|
+
### Information-Theoretic Scoring
|
|
134
|
+
|
|
135
|
+
Each crossing reports quantitative metrics based on Shannon entropy:
|
|
136
|
+
|
|
137
|
+
| Metric | What it measures |
|
|
138
|
+
|--------|-----------------|
|
|
139
|
+
| **Semantic entropy** | Bits of information carried by the exception type at raise sites (log2 of distinct origins) |
|
|
140
|
+
| **Handler discrimination** | Bits preserved by handlers (re-raise = full, return/pass = zero) |
|
|
141
|
+
| **Information loss** | Bits destroyed: entropy minus discrimination |
|
|
142
|
+
| **Collapse ratio** | Normalized loss: 0% (no collapse) to 100% (total meaning erasure) |
|
|
143
|
+
|
|
144
|
+
```
|
|
145
|
+
--- AttributeError: 4 raise sites, 3 handlers — high risk ---
|
|
146
|
+
Information: 2.0 bits entropy, 0.3 bits lost, 83% collapse
|
|
147
|
+
```
|
|
148
|
+
|
|
149
|
+
In JSON output, each crossing includes an `information_theory` object, and the summary includes `total_information_loss_bits` and `mean_collapse_ratio` across all crossings.
|
|
150
|
+
|
|
151
|
+
### Real Bugs Found
|
|
152
|
+
|
|
153
|
+
The semantic scanner has identified real bugs in production codebases:
|
|
154
|
+
|
|
155
|
+
- **tox #3809**: `KeyError` meaning "factor-filtered to empty" caught by handler expecting "key doesn't exist"
|
|
156
|
+
- **Rich #3960**: Exception `__notes__` leaking across chained exceptions
|
|
157
|
+
- **pytest #14214**: Verbosity config not propagated across internal call boundary
|
|
158
|
+
|
|
159
|
+
---
|
|
160
|
+
|
|
161
|
+
## Data Loss Fuzzer
|
|
162
|
+
|
|
163
|
+
### Built-in Crossings
|
|
164
|
+
|
|
165
|
+
| Crossing | What it tests | Typical loss rate |
|
|
166
|
+
|----------|---------------|-------------------|
|
|
167
|
+
| `json_crossing()` | JSON with `default=str` | ~24% lossy, 34% crashes |
|
|
168
|
+
| `json_crossing_strict()` | JSON without fallback | ~6% lossy, 52% crashes |
|
|
169
|
+
| `pickle_crossing()` | Python pickle | 0% (lossless baseline) |
|
|
170
|
+
| `yaml_crossing()` | YAML safe_load | ~0% lossy, 49% crashes |
|
|
171
|
+
| `toml_crossing()` | TOML via tomllib/tomli_w | varies |
|
|
172
|
+
| `csv_crossing()` | CSV (everything becomes strings) | ~82% lossy |
|
|
173
|
+
| `env_file_crossing()` | .env files (KEY=VALUE) | ~83% lossy |
|
|
174
|
+
| `url_query_crossing()` | URL query string encoding | ~80% lossy |
|
|
175
|
+
|
|
176
|
+
### Custom Crossings
|
|
177
|
+
|
|
178
|
+
```python
|
|
179
|
+
from crossing import Crossing, cross
|
|
180
|
+
|
|
181
|
+
# Test your API serialization
|
|
182
|
+
c = Crossing(
|
|
183
|
+
encode=lambda d: my_api_serialize(d),
|
|
184
|
+
decode=lambda s: my_api_deserialize(s),
|
|
185
|
+
name="My API boundary",
|
|
186
|
+
)
|
|
187
|
+
report = cross(c, samples=1000)
|
|
188
|
+
report.print()
|
|
189
|
+
```
|
|
190
|
+
|
|
191
|
+
### Compose Pipelines
|
|
192
|
+
|
|
193
|
+
```python
|
|
194
|
+
from crossing import compose, json_crossing, string_truncation_crossing, cross
|
|
195
|
+
|
|
196
|
+
# Simulate: serialize -> store in VARCHAR(100) -> deserialize
|
|
197
|
+
pipeline = compose(
|
|
198
|
+
json_crossing(),
|
|
199
|
+
string_truncation_crossing(100),
|
|
200
|
+
)
|
|
201
|
+
report = cross(pipeline, samples=500)
|
|
202
|
+
```
|
|
203
|
+
|
|
204
|
+
### Codebase Scanning
|
|
205
|
+
|
|
206
|
+
```bash
|
|
207
|
+
python3 scan.py /path/to/project
|
|
208
|
+
```
|
|
209
|
+
|
|
210
|
+
Finds encode/decode pairs for: JSON, YAML, pickle, TOML, base64, URL encoding, CSV, struct, zlib, gzip.
|
|
211
|
+
|
|
212
|
+
---
|
|
213
|
+
|
|
214
|
+
## GitHub Action
|
|
215
|
+
|
|
216
|
+
Add Crossing to your CI pipeline:
|
|
217
|
+
|
|
218
|
+
```yaml
|
|
219
|
+
# .github/workflows/crossing.yml
|
|
220
|
+
name: Exception Analysis
|
|
221
|
+
on: [pull_request]
|
|
222
|
+
|
|
223
|
+
jobs:
|
|
224
|
+
crossing:
|
|
225
|
+
runs-on: ubuntu-latest
|
|
226
|
+
steps:
|
|
227
|
+
- uses: actions/checkout@v4
|
|
228
|
+
- uses: worksbyfriday/crossing@main
|
|
229
|
+
with:
|
|
230
|
+
path: 'src/'
|
|
231
|
+
fail-on-risk: 'elevated'
|
|
232
|
+
```
|
|
233
|
+
|
|
234
|
+
Inputs: `path`, `min-risk`, `format`, `implicit`, `exclude`, `fail-on-risk`.
|
|
235
|
+
|
|
236
|
+
---
|
|
237
|
+
|
|
238
|
+
## Benchmarks
|
|
239
|
+
|
|
240
|
+
Scanned 11 popular Python projects (Feb 2026):
|
|
241
|
+
|
|
242
|
+
| Project | Files | Crossings | High Risk | Info Loss |
|
|
243
|
+
|---|---|---|---|---|
|
|
244
|
+
| **pydantic** | **402** | **119** | **12** | **22.9 bits** |
|
|
245
|
+
| **sqlalchemy** | **661** | **103** | **16** | **79.8 bits** |
|
|
246
|
+
| django | 902 | 80 | 6 | — |
|
|
247
|
+
| aiohttp | 166 | 53 | 11 | 25.5 bits |
|
|
248
|
+
| click | 62 | 14 | 5 | 7.4 bits |
|
|
249
|
+
| celery | 161 | 12 | 3 | — |
|
|
250
|
+
| flask | 24 | 6 | 2 | — |
|
|
251
|
+
| requests | 18 | 5 | 2 | — |
|
|
252
|
+
| rich | 100 | 5 | 1 | — |
|
|
253
|
+
| astroid | 96 | 5 | 0 | — |
|
|
254
|
+
| **fastapi** | **47** | **0** | **0** | **0 bits** |
|
|
255
|
+
|
|
256
|
+
FastAPI reporting zero crossings is a useful sanity check: the scanner does not flag every codebase. Sample audit reports: [SQLAlchemy](examples/audit-sqlalchemy.md), [Django](examples/audit-django.md), [Celery](examples/audit-celery.md), [Flask](examples/audit-flask.md), [Requests](examples/audit-requests.md).
|
|
257
|
+
|
|
258
|
+
---
|
|
259
|
+
|
|
260
|
+
## API
|
|
261
|
+
|
|
262
|
+
Scan any installed Python package via HTTP:
|
|
263
|
+
|
|
264
|
+
```bash
|
|
265
|
+
curl https://api.fridayops.xyz/crossing/package/flask
|
|
266
|
+
```
|
|
267
|
+
|
|
268
|
+
Returns JSON with full crossing analysis, information theory metrics, and risk levels.
|
|
269
|
+
|
|
270
|
+
**Audit report** — full markdown report with findings, recommendations, and benchmarks:
|
|
271
|
+
|
|
272
|
+
```bash
|
|
273
|
+
curl https://api.fridayops.xyz/crossing/report/flask
|
|
274
|
+
```
|
|
275
|
+
|
|
276
|
+
**Badge** — embed in your README:
|
|
277
|
+
|
|
278
|
+
```markdown
|
|
279
|
+

|
|
280
|
+
```
|
|
281
|
+
|
|
282
|
+

|
|
283
|
+
|
|
284
|
+
All endpoints:
|
|
285
|
+
- `POST /crossing` — scan raw Python source
|
|
286
|
+
- `GET /crossing/package/{name}` — JSON scan results
|
|
287
|
+
- `GET /crossing/report/{name}` — full markdown audit report
|
|
288
|
+
- `GET /crossing/badge/{name}` — SVG badge
|
|
289
|
+
- `GET /crossing/benchmark` — comparison data from 17 projects
|
|
290
|
+
- `GET /crossing/packages` — list of example packages
|
|
291
|
+
- `GET /crossing/example` — demo snippet
|
|
292
|
+
|
|
293
|
+
---
|
|
294
|
+
|
|
295
|
+
## Install
|
|
296
|
+
|
|
297
|
+
```
|
|
298
|
+
pip install crossing
|
|
299
|
+
```
|
|
300
|
+
|
|
301
|
+
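Or copy the files directly; there are no required external dependencies (PyYAML and tomli_w are optional extras for the YAML and TOML crossings). Requires Python 3.10+.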
|
|
302
|
+
|
|
303
|
+
## License
|
|
304
|
+
|
|
305
|
+
MIT
|
crossing-1.0.0/README.md
ADDED
|
@@ -0,0 +1,272 @@
|
|
|
1
|
+
# Crossing
|
|
2
|
+
|
|
3
|
+
Detect silent information loss at system boundaries in Python codebases.
|
|
4
|
+
|
|
5
|
+
## Two Tools
|
|
6
|
+
|
|
7
|
+
### 1. Semantic Scanner — Exception Pattern Analysis
|
|
8
|
+
|
|
9
|
+
Find where the same exception type carries different meanings depending on the code path, but handlers can't distinguish them.
|
|
10
|
+
|
|
11
|
+
```bash
|
|
12
|
+
# Basic scan
|
|
13
|
+
crossing-semantic /path/to/project
|
|
14
|
+
|
|
15
|
+
# With implicit raises (dict access, getattr, etc.)
|
|
16
|
+
crossing-semantic --implicit /path/to/project
|
|
17
|
+
|
|
18
|
+
# JSON output for tooling
|
|
19
|
+
crossing-semantic --format json /path/to/project
|
|
20
|
+
|
|
21
|
+
# CI mode: fail if elevated/high risk crossings found
|
|
22
|
+
crossing-semantic --ci --min-risk elevated /path/to/project
|
|
23
|
+
```
|
|
24
|
+
|
|
25
|
+
Example: a `KeyError` that means "config key missing" and a `KeyError` that means "factor-filtered to empty" arrive at the same `except KeyError` handler. The handler assumes one meaning. The bug is silent.
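
The pattern described above can be reproduced in a few lines of plain Python. This is a minimal sketch with hypothetical names (`CONFIG`, `commands_for`, `commands_or_default`); it is not taken from tox or from Crossing itself, and it only illustrates how two meanings of `KeyError` reach one handler.

```python
# Hypothetical example: two unrelated failures share one exception type.
CONFIG = {"envs": {"py311": ["lint", "test"], "py312": []}}


def commands_for(env_name: str) -> list[str]:
    """Return the commands configured for an environment."""
    commands = CONFIG["envs"][env_name]  # implicit KeyError: unknown environment
    if not commands:
        # Explicit KeyError with a different meaning: the entry exists but is empty.
        raise KeyError(f"{env_name} filtered to empty")
    return commands


def commands_or_default(env_name: str) -> list[str]:
    try:
        return commands_for(env_name)
    except KeyError:
        # The handler assumes "unknown environment" and cannot tell the two apart.
        return ["default"]


print(commands_or_default("py399"))  # unknown environment
print(commands_or_default("py312"))  # known but empty, handled identically
```

Both calls return the same fallback, so the "filtered to empty" case is silently treated as "not configured".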
|
|
26
|
+
|
|
27
|
+
### 2. Data Loss Fuzzer — Round-Trip Testing
|
|
28
|
+
|
|
29
|
+
Test whether information survives boundary crossings: serialization, API calls, database writes, format conversions.
|
|
30
|
+
|
|
31
|
+
```python
|
|
32
|
+
import json
from crossing import Crossing, cross
|
|
33
|
+
|
|
34
|
+
c = Crossing(
|
|
35
|
+
encode=lambda d: json.dumps(d),
|
|
36
|
+
decode=lambda s: json.loads(s),
|
|
37
|
+
)
|
|
38
|
+
|
|
39
|
+
report = cross(c, samples=1000)
|
|
40
|
+
report.print() # shows what was lost, where, and how
|
|
41
|
+
```
|
|
42
|
+
|
|
43
|
+
This isn't fuzzing for crashes. It's fuzzing for **silent data loss** — the operation succeeds but the output is missing something the input had.
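
To make "the operation succeeds but the output is missing something" concrete, here is a small standard-library sketch. It does not use Crossing; it only shows the kind of loss a round-trip check looks for, using a JSON round trip as the boundary.

```python
import json

original = {"point": (1, 2), 7: "seven"}

# The round trip succeeds without raising anything...
restored = json.loads(json.dumps(original))

# ...but the tuple came back as a list and the integer key as a string.
print(restored)              # {'point': [1, 2], '7': 'seven'}
print(restored == original)  # False
```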
|
|
44
|
+
|
|
45
|
+
---
|
|
46
|
+
|
|
47
|
+
## Semantic Scanner
|
|
48
|
+
|
|
49
|
+
### What It Finds
|
|
50
|
+
|
|
51
|
+
- **Polymorphic exceptions**: Multiple `raise` sites for the same exception type, caught by handlers that don't distinguish between them
|
|
52
|
+
- **Cross-function crossings**: Exceptions raised in called functions, caught by handlers in the caller
|
|
53
|
+
- **Cross-file crossings**: Same pattern across module boundaries via import resolution
|
|
54
|
+
- **Implicit raises**: `dict[key]` -> `KeyError`, `getattr(obj, name)` -> `AttributeError`, `int(x)` -> `ValueError`
|
|
55
|
+
- **Inheritance crossings**: `except ValueError` catching subclass raises like `ValidationError`
|
|
56
|
+
- **Scope analysis**: Whether handlers catch exceptions from direct raises or from called functions
|
|
57
|
+
- **Message differentiation**: Risk downgraded when all raise sites pass distinct string messages
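
As a rough illustration of the first item in the list above, the sketch below counts explicit `raise` sites and `except` handlers per exception name with the standard `ast` module. It is an assumption about how such a check could start, not the scanner's actual implementation; `SOURCE`, `exception_name`, and `count_sites` are hypothetical names, and tuple handlers such as `except (A, B)` are skipped for brevity.

```python
import ast
from collections import Counter

SOURCE = '''
def load(cfg, key):
    if key not in cfg:
        raise KeyError(key)
    if not cfg[key]:
        raise KeyError("empty")
    return cfg[key]

def safe_load(cfg, key):
    try:
        return load(cfg, key)
    except KeyError:
        return None
'''


def exception_name(node):
    """Return the class name for `raise X`, `raise X(...)`, or `except X`."""
    if isinstance(node, ast.Call):
        node = node.func
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        return node.attr
    return None  # tuples and other expressions are ignored in this sketch


def count_sites(source):
    """Count explicit raise sites and handlers per exception name."""
    raises, handlers = Counter(), Counter()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Raise) and node.exc is not None:
            name = exception_name(node.exc)
            if name:
                raises[name] += 1
        elif isinstance(node, ast.ExceptHandler) and node.type is not None:
            name = exception_name(node.type)
            if name:
                handlers[name] += 1
    return raises, handlers


raises, handlers = count_sites(SOURCE)
# Exception types with several raise sites and at least one handler.
polymorphic = {name for name, n in raises.items() if n > 1 and handlers[name] > 0}
print(polymorphic)  # {'KeyError'}
```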
|
|
58
|
+
|
|
59
|
+
### Risk Levels
|
|
60
|
+
|
|
61
|
+
| Level | Meaning |
|
|
62
|
+
|-------|---------|
|
|
63
|
+
| **low** | Single raise site, or polymorphic with matching handler strategies |
|
|
64
|
+
| **medium** | Multiple raise sites with uniform handler treatment |
|
|
65
|
+
| **elevated** | Scope mismatches or cross-function reachability |
|
|
66
|
+
| **high** | Many raise sites, few handlers, mixed implicit/explicit |
|
|
67
|
+
|
|
68
|
+
### CLI Options
|
|
69
|
+
|
|
70
|
+
```
|
|
71
|
+
crossing-semantic [OPTIONS] PATH
|
|
72
|
+
|
|
73
|
+
Options:
|
|
74
|
+
--implicit Detect implicit raises (dict access, getattr, etc.)
|
|
75
|
+
--format FORMAT Output format: text (default), json, markdown
|
|
76
|
+
--min-risk LEVEL Minimum risk to report: low, medium, elevated, high
|
|
77
|
+
--exclude PATTERN Exclude directories (repeatable)
|
|
78
|
+
--ci Exit code 1 if elevated/high risk crossings found
|
|
79
|
+
```
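
The threshold semantics of `--min-risk` and `--ci` can be pictured as an ordered comparison. The sketch below assumes the ordering low < medium < elevated < high given in the option text; the helper names are hypothetical and this is not the tool's own code.

```python
# Ordering taken from the --min-risk option; the helper names are hypothetical.
RISK_ORDER = ["low", "medium", "elevated", "high"]


def at_least(risk: str, threshold: str) -> bool:
    """True when `risk` is at or above `threshold` in RISK_ORDER."""
    return RISK_ORDER.index(risk) >= RISK_ORDER.index(threshold)


def ci_exit_code(risks: list[str], threshold: str = "elevated") -> int:
    """Return 1 if any crossing reaches the threshold, else 0."""
    return 1 if any(at_least(r, threshold) for r in risks) else 0


print(ci_exit_code(["low", "medium"]))  # 0
print(ci_exit_code(["low", "high"]))    # 1
```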
|
|
80
|
+
|
|
81
|
+
### Example Output
|
|
82
|
+
|
|
83
|
+
```
|
|
84
|
+
============================================================
|
|
85
|
+
Semantic Crossing Scan: /path/to/tox
|
|
86
|
+
============================================================
|
|
87
|
+
Files scanned: 42
|
|
88
|
+
Exception raises: 87 (58 explicit, 29 implicit)
|
|
89
|
+
Exception handlers: 34
|
|
90
|
+
Semantic crossings: 12
|
|
91
|
+
Polymorphic (multi-raise): 8
|
|
92
|
+
Elevated risk: 3
|
|
93
|
+
|
|
94
|
+
--- KeyError: 3 raise sites, 14 handlers --- high risk ---
|
|
95
|
+
3 raise sites across different loaders (API, TOML, INI),
|
|
96
|
+
14 handlers catching without distinguishing source
|
|
97
|
+
============================================================
|
|
98
|
+
```
|
|
99
|
+
|
|
100
|
+
### Information-Theoretic Scoring
|
|
101
|
+
|
|
102
|
+
Each crossing reports quantitative metrics based on Shannon entropy:
|
|
103
|
+
|
|
104
|
+
| Metric | What it measures |
|
|
105
|
+
|--------|-----------------|
|
|
106
|
+
| **Semantic entropy** | Bits of information carried by the exception type at raise sites (log2 of distinct origins) |
|
|
107
|
+
| **Handler discrimination** | Bits preserved by handlers (re-raise = full, return/pass = zero) |
|
|
108
|
+
| **Information loss** | Bits destroyed: entropy minus discrimination |
|
|
109
|
+
| **Collapse ratio** | Normalized loss: 0% (no collapse) to 100% (total meaning erasure) |
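
The arithmetic behind the table can be worked through with a short sketch. It assumes that entropy is log2 of the number of distinct origins (as the table states) and that the collapse ratio is loss divided by entropy; how the tool assigns discrimination bits to a handler is not described here, so that value is passed in as an input. The function name is hypothetical.

```python
import math


def crossing_metrics(distinct_origins: int, discrimination_bits: float) -> dict:
    """Compute entropy, loss, and collapse ratio under the stated assumptions."""
    entropy = math.log2(distinct_origins)
    loss = max(entropy - discrimination_bits, 0.0)
    collapse = loss / entropy if entropy > 0 else 0.0
    return {"entropy_bits": entropy, "loss_bits": loss, "collapse_ratio": collapse}


print(crossing_metrics(4, 0.5))
# {'entropy_bits': 2.0, 'loss_bits': 1.5, 'collapse_ratio': 0.75}
```

With four distinct origins the exception type carries 2.0 bits; if handlers preserve 0.5 bits, 1.5 bits are lost, a 75% collapse.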
|
|
110
|
+
|
|
111
|
+
```
|
|
112
|
+
--- AttributeError: 4 raise sites, 3 handlers — high risk ---
|
|
113
|
+
Information: 2.0 bits entropy, 0.3 bits lost, 83% collapse
|
|
114
|
+
```
|
|
115
|
+
|
|
116
|
+
In JSON output, each crossing includes an `information_theory` object, and the summary includes `total_information_loss_bits` and `mean_collapse_ratio` across all crossings.
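
A downstream script could filter crossings on these fields. The sketch below is only a guess at the shape: the README names `information_theory`, `total_information_loss_bits`, and `mean_collapse_ratio`, while the `summary`, `crossings`, `exception`, and `collapse_ratio` keys are assumptions made for illustration and should be checked against real output.

```python
import json

# Hypothetical report: only `information_theory`, `total_information_loss_bits`
# and `mean_collapse_ratio` are named in the README; every other key is assumed.
REPORT_JSON = '''
{
  "summary": {"total_information_loss_bits": 3.5, "mean_collapse_ratio": 0.6},
  "crossings": [
    {"exception": "KeyError", "information_theory": {"collapse_ratio": 0.9}},
    {"exception": "ValueError", "information_theory": {"collapse_ratio": 0.3}}
  ]
}
'''


def worst_crossings(report: dict, max_collapse: float) -> list[str]:
    """List exception names whose (assumed) collapse ratio exceeds the limit."""
    return [
        c["exception"]
        for c in report["crossings"]
        if c["information_theory"]["collapse_ratio"] > max_collapse
    ]


report = json.loads(REPORT_JSON)
print(worst_crossings(report, 0.5))  # ['KeyError']
```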
|
|
117
|
+
|
|
118
|
+
### Real Bugs Found
|
|
119
|
+
|
|
120
|
+
The semantic scanner has identified real bugs in production codebases:
|
|
121
|
+
|
|
122
|
+
- **tox #3809**: `KeyError` meaning "factor-filtered to empty" caught by handler expecting "key doesn't exist"
|
|
123
|
+
- **Rich #3960**: Exception `__notes__` leaking across chained exceptions
|
|
124
|
+
- **pytest #14214**: Verbosity config not propagated across internal call boundary
|
|
125
|
+
|
|
126
|
+
---
|
|
127
|
+
|
|
128
|
+
## Data Loss Fuzzer
|
|
129
|
+
|
|
130
|
+
### Built-in Crossings
|
|
131
|
+
|
|
132
|
+
| Crossing | What it tests | Typical loss rate |
|
|
133
|
+
|----------|---------------|-------------------|
|
|
134
|
+
| `json_crossing()` | JSON with `default=str` | ~24% lossy, 34% crashes |
|
|
135
|
+
| `json_crossing_strict()` | JSON without fallback | ~6% lossy, 52% crashes |
|
|
136
|
+
| `pickle_crossing()` | Python pickle | 0% (lossless baseline) |
|
|
137
|
+
| `yaml_crossing()` | YAML safe_load | ~0% lossy, 49% crashes |
|
|
138
|
+
| `toml_crossing()` | TOML via tomllib/tomli_w | varies |
|
|
139
|
+
| `csv_crossing()` | CSV (everything becomes strings) | ~82% lossy |
|
|
140
|
+
| `env_file_crossing()` | .env files (KEY=VALUE) | ~83% lossy |
|
|
141
|
+
| `url_query_crossing()` | URL query string encoding | ~80% lossy |
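
The high loss rates for the text-based formats follow from type erasure: every value comes back as a string. The standard-library sketch below shows this for CSV and URL query strings; it does not call the built-in crossings and makes no claim about how they are implemented.

```python
import csv
import io
from urllib.parse import parse_qs, urlencode

row = [1, 2.5, True, None]

# CSV: write one row, then read it back.
buffer = io.StringIO(newline="")
csv.writer(buffer).writerow(row)
buffer.seek(0)
csv_restored = next(csv.reader(buffer))
print(csv_restored)  # ['1', '2.5', 'True', '']

# URL query string: encode a mapping, then parse it back.
params = {"page": 2, "active": False}
query_restored = parse_qs(urlencode(params))
print(query_restored)  # {'page': ['2'], 'active': ['False']}
```

In both cases nothing raises, yet the integer, float, boolean, and `None` values are no longer distinguishable from strings.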
|
|
142
|
+
|
|
143
|
+
### Custom Crossings
|
|
144
|
+
|
|
145
|
+
```python
|
|
146
|
+
from crossing import Crossing, cross
|
|
147
|
+
|
|
148
|
+
# Test your API serialization
|
|
149
|
+
c = Crossing(
|
|
150
|
+
encode=lambda d: my_api_serialize(d),
|
|
151
|
+
decode=lambda s: my_api_deserialize(s),
|
|
152
|
+
name="My API boundary",
|
|
153
|
+
)
|
|
154
|
+
report = cross(c, samples=1000)
|
|
155
|
+
report.print()
|
|
156
|
+
```
|
|
157
|
+
|
|
158
|
+
### Compose Pipelines
|
|
159
|
+
|
|
160
|
+
```python
|
|
161
|
+
from crossing import compose, json_crossing, string_truncation_crossing, cross
|
|
162
|
+
|
|
163
|
+
# Simulate: serialize -> store in VARCHAR(100) -> deserialize
|
|
164
|
+
pipeline = compose(
|
|
165
|
+
json_crossing(),
|
|
166
|
+
string_truncation_crossing(100),
|
|
167
|
+
)
|
|
168
|
+
report = cross(pipeline, samples=500)
|
|
169
|
+
```
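
What such a pipeline can reveal is easy to see with the standard library alone. This sketch assumes the column truncates silently by character count (some databases raise an error instead); `through_column` is a hypothetical helper, not part of Crossing.

```python
import json


def through_column(value, limit=100):
    """Serialize, truncate as a fixed-width column would, then deserialize."""
    stored = json.dumps(value)[:limit]
    try:
        return "ok", json.loads(stored)
    except json.JSONDecodeError:
        return "crash", None


print(through_column({"id": 1}))              # ('ok', {'id': 1})
print(through_column({"msg": "x" * 200})[0])  # crash

# A long integer is cut to a shorter, still valid, integer: silent loss.
status, restored = through_column(10**120)
print(status, restored == 10**120)            # ok False
```

The truncated string fails loudly, while the truncated integer parses cleanly into a different number, which is the silent case.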
|
|
170
|
+
|
|
171
|
+
### Codebase Scanning
|
|
172
|
+
|
|
173
|
+
```bash
|
|
174
|
+
python3 scan.py /path/to/project
|
|
175
|
+
```
|
|
176
|
+
|
|
177
|
+
Finds encode/decode pairs for: JSON, YAML, pickle, TOML, base64, URL encoding, CSV, struct, zlib, gzip.
|
|
178
|
+
|
|
179
|
+
---
|
|
180
|
+
|
|
181
|
+
## GitHub Action
|
|
182
|
+
|
|
183
|
+
Add Crossing to your CI pipeline:
|
|
184
|
+
|
|
185
|
+
```yaml
|
|
186
|
+
# .github/workflows/crossing.yml
|
|
187
|
+
name: Exception Analysis
|
|
188
|
+
on: [pull_request]
|
|
189
|
+
|
|
190
|
+
jobs:
|
|
191
|
+
crossing:
|
|
192
|
+
runs-on: ubuntu-latest
|
|
193
|
+
steps:
|
|
194
|
+
- uses: actions/checkout@v4
|
|
195
|
+
- uses: worksbyfriday/crossing@main
|
|
196
|
+
with:
|
|
197
|
+
path: 'src/'
|
|
198
|
+
fail-on-risk: 'elevated'
|
|
199
|
+
```
|
|
200
|
+
|
|
201
|
+
Inputs: `path`, `min-risk`, `format`, `implicit`, `exclude`, `fail-on-risk`.
|
|
202
|
+
|
|
203
|
+
---
|
|
204
|
+
|
|
205
|
+
## Benchmarks
|
|
206
|
+
|
|
207
|
+
Scanned 11 popular Python projects (Feb 2026):
|
|
208
|
+
|
|
209
|
+
| Project | Files | Crossings | High Risk | Info Loss |
|
|
210
|
+
|---|---|---|---|---|
|
|
211
|
+
| **pydantic** | **402** | **119** | **12** | **22.9 bits** |
|
|
212
|
+
| **sqlalchemy** | **661** | **103** | **16** | **79.8 bits** |
|
|
213
|
+
| django | 902 | 80 | 6 | — |
|
|
214
|
+
| aiohttp | 166 | 53 | 11 | 25.5 bits |
|
|
215
|
+
| click | 62 | 14 | 5 | 7.4 bits |
|
|
216
|
+
| celery | 161 | 12 | 3 | — |
|
|
217
|
+
| flask | 24 | 6 | 2 | — |
|
|
218
|
+
| requests | 18 | 5 | 2 | — |
|
|
219
|
+
| rich | 100 | 5 | 1 | — |
|
|
220
|
+
| astroid | 96 | 5 | 0 | — |
|
|
221
|
+
| **fastapi** | **47** | **0** | **0** | **0 bits** |
|
|
222
|
+
|
|
223
|
+
FastAPI reporting zero crossings is a useful sanity check: the scanner does not flag every codebase. Sample audit reports: [SQLAlchemy](examples/audit-sqlalchemy.md), [Django](examples/audit-django.md), [Celery](examples/audit-celery.md), [Flask](examples/audit-flask.md), [Requests](examples/audit-requests.md).
|
|
224
|
+
|
|
225
|
+
---
|
|
226
|
+
|
|
227
|
+
## API
|
|
228
|
+
|
|
229
|
+
Scan any installed Python package via HTTP:
|
|
230
|
+
|
|
231
|
+
```bash
|
|
232
|
+
curl https://api.fridayops.xyz/crossing/package/flask
|
|
233
|
+
```
|
|
234
|
+
|
|
235
|
+
Returns JSON with full crossing analysis, information theory metrics, and risk levels.
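
The same request can be made from Python with the standard library. This is a minimal sketch: the endpoint path comes from the `curl` example above, the helper names are hypothetical, and the response is assumed to be a JSON object whose exact fields are not documented here.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

BASE_URL = "https://api.fridayops.xyz/crossing"


def package_url(name: str) -> str:
    """Build the package scan URL, percent-encoding the package name."""
    return f"{BASE_URL}/package/{quote(name, safe='')}"


def fetch_package_scan(name: str, timeout: float = 30.0) -> dict:
    """Fetch and decode the scan result (performs a network request)."""
    with urlopen(package_url(name), timeout=timeout) as response:
        return json.load(response)


print(package_url("flask"))  # https://api.fridayops.xyz/crossing/package/flask
```

Calling `fetch_package_scan("flask")` would issue the request; it is left uncalled here so the snippet runs offline.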
|
|
236
|
+
|
|
237
|
+
**Audit report** — full markdown report with findings, recommendations, and benchmarks:
|
|
238
|
+
|
|
239
|
+
```bash
|
|
240
|
+
curl https://api.fridayops.xyz/crossing/report/flask
|
|
241
|
+
```
|
|
242
|
+
|
|
243
|
+
**Badge** — embed in your README:
|
|
244
|
+
|
|
245
|
+
```markdown
|
|
246
|
+

|
|
247
|
+
```
|
|
248
|
+
|
|
249
|
+

|
|
250
|
+
|
|
251
|
+
All endpoints:
|
|
252
|
+
- `POST /crossing` — scan raw Python source
|
|
253
|
+
- `GET /crossing/package/{name}` — JSON scan results
|
|
254
|
+
- `GET /crossing/report/{name}` — full markdown audit report
|
|
255
|
+
- `GET /crossing/badge/{name}` — SVG badge
|
|
256
|
+
- `GET /crossing/benchmark` — comparison data from 17 projects
|
|
257
|
+
- `GET /crossing/packages` — list of example packages
|
|
258
|
+
- `GET /crossing/example` — demo snippet
|
|
259
|
+
|
|
260
|
+
---
|
|
261
|
+
|
|
262
|
+
## Install
|
|
263
|
+
|
|
264
|
+
```
|
|
265
|
+
pip install crossing
|
|
266
|
+
```
|
|
267
|
+
|
|
268
|
+
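Or copy the files directly; there are no required external dependencies (PyYAML and tomli_w are optional extras for the YAML and TOML crossings). Requires Python 3.10+.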
|
|
269
|
+
|
|
270
|
+
## License
|
|
271
|
+
|
|
272
|
+
MIT
|