depyo 1.0.3 → 1.2.0

package/README.md CHANGED
@@ -1,116 +1,193 @@
1
1
  # depyo — Python bytecode decompiler in Node.js
2
2
 
3
- Depyo converts Python `.pyc` files (or archives of them) back to readable Python source. It aims for broad coverage (Python 1.0 through 3.14) and fast throughput, with fixtures for modern features (exception groups, pattern matching, walrus, f-strings, async, context managers).
3
+ Depyo converts Python `.pyc` files (or archives of them) back to readable Python source — right from Node.js, without a Python runtime. Coverage spans **Python 1.0 through 3.15** plus PyPy, with first-class support for modern features: match/case, walrus, f-strings, exception groups, async/await, type parameters, PEP 696 TypeVar defaults, and t-strings (PEP 750).
4
4
 
5
- ## Why depyo?
6
- - **Wide version coverage:** Opcode tables and expected outputs for Python 1.0–3.14, plus decompilation support for PyPy bytecode sets.
7
- - **Modern features:** WITH_EXCEPT_START/PREP_RERAISE_STAR, async/await, walrus, match/case, f-strings, type params, dict/set merges.
8
- - **Workflow friendly:** CLI options for asm dumps, raw spacing hints, raw `.pyc` preservation, and flattened output paths.
9
- - **Verification harness:** `run-fixtures.js` and `run-matrix.js` compare decompiled output against expected fixtures across versions.
5
+ ```bash
6
+ npx depyo my_script.pyc
7
+ # writes my_script.py next to the input
8
+ ```
9
+
10
+ ## What it's good for
11
+
12
+ - **Reverse engineering stripped Python.** You have a `.pyc` (maybe extracted from a PyInstaller binary, an Android APK's Kivy bundle, or an old archive) and no source. Depyo reconstructs the source, even for Python versions that `uncompyle6`/`decompyle3` no longer track.
13
+ - **Malware / threat analysis.** Quickly triage suspicious Python payloads without setting up a matching Python interpreter. Add `--asm` for a bytecode listing alongside the source.
14
+ - **Forensics on old codebases.** Resurrect Python 2.x (even 1.x) modules when the source is long gone.
15
+ - **CI-side audits.** Depyo is a pure Node.js CLI — drop it in any Node pipeline to spot-check compiled `.pyc` against expected sources, or to extract and diff shipped bytecode.
16
+ - **Learning tool.** Inspect how CPython lowers a given Python feature (comprehensions, pattern matching, exception groups) across versions. `--asm` is handy here.
17
+ - **Batch processing.** Feed a `.zip` of `.pyc` files and get back a mirrored tree of `.py` sources.
18
+
19
+ ## Why depyo (vs alternatives)
20
+
21
+ | Tool | Versions | Modern features¹ | Runtime | Throughput | Notes |
22
+ | --------------------- | --------------------- | ---------------- | ------- | ---------- | -------------------------------------------- |
23
+ | **depyo** | 1.0–3.15 + PyPy | Yes | Node.js | ~0.1 ms/file² | Modern opcodes land fast; no Python needed |
24
+ | uncompyle6/decompyle3 | 2.x–3.12 (stalled) | Partial | Python | slower | Development largely halted on 3.13+ |
25
+ | pycdc (C++) | 2.x–3.x (limited new) | Partial | native | fast | Rich history, but slow to adopt new opcodes |
26
+
27
+ ¹ match/case, walrus, f-strings, exception groups, async/await, type params.
28
+ ² Informal: `py314_exception_groups.pyc` × 50 in-process, Node 25, single thread (`--stats` on your machine for real numbers).
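
The per-file figure in footnote ² is an average: one input is decompiled repeatedly in-process and the total time is divided by the run count. A minimal sketch of that kind of measurement follows. The helper name `timeRuns` and the placeholder workload are assumptions for illustration and are not part of depyo; `--stats` remains the supported way to get real numbers.

```javascript
// Hypothetical helper: time `runs` in-process calls of `work` and report the average.
const { performance } = require('node:perf_hooks');

function timeRuns(work, runs = 50) {
  if (!Number.isInteger(runs) || runs <= 0) {
    throw new RangeError('runs must be a positive integer');
  }
  const start = performance.now();
  for (let i = 0; i < runs; i++) {
    work(); // stand-in for one decompile of the same buffer
  }
  const totalMs = performance.now() - start;
  return { runs, totalMs, perRunMs: totalMs / runs };
}

// Placeholder workload (not a real decompile).
const result = timeRuns(() => JSON.parse('{"a":[1,2,3]}'), 50);
console.log(
  `${result.runs} runs: ${result.totalMs.toFixed(3)} ms total, ` +
  `${result.perRunMs.toFixed(4)} ms per run`
);
```

Averages over so few runs are sensitive to JIT warm-up, so a real comparison would likely discard the first iterations.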
10
29
 
11
30
  ## Install
12
- - Global: `npm i -g depyo`
13
- - One-off: `npx depyo <file.pyc>`
14
31
 
15
- Node.js 20+ recommended (matches CI).
32
+ ```bash
33
+ npm i -g depyo # global CLI
34
+ npx depyo <file.pyc> # one-off, no install
35
+ ```
36
+
37
+ Node.js 20+ recommended (the version CI runs against).
16
38
 
17
39
  ## Quick start
18
40
 
19
41
  ```bash
20
- # Decompile a single .pyc
42
+ # Single .pyc: writes <name>.py next to it
21
43
  node depyo.js /path/to/file.pyc
22
44
 
23
- # Decompile a zip of .pyc files, emit asm and keep raw bytes
24
- node depyo.js --asm --raw my_archive.zip
45
+ # ZIP of .pyc files: output mirrors the archive structure
46
+ node depyo.js my_archive.zip
25
47
 
26
- # Write sources next to inputs (skip directory mirroring)
27
- node depyo.js --skip-path /path/to/file.pyc
48
+ # Also emit disassembly and preserve the raw .pyc
49
+ node depyo.js --asm --raw my_archive.zip
28
50
 
29
- # Dump to stdout instead of files
51
+ # Stream to stdout (no files written)
30
52
  node depyo.js --out /path/to/file.pyc
31
53
 
32
- # Marshal-only blob (no .pyc header)
33
- node depyo.js --marshal --py-version 3.11 /path/to/blob.bin
34
- node depyo.js --marshal /path/to/blob.bin
54
+ # Flatten outputs (drop mirrored directories)
55
+ node depyo.js --skip-path /path/to/file.pyc
35
56
 
36
- # Fast marshal scan (no decompile)
37
- node depyo.js --marshal-scan /path/to/blob.bin
57
+ # Headerless marshal blob (no .pyc magic)
58
+ node depyo.js --marshal --py-version 3.11 /path/to/blob.bin
59
+ node depyo.js --marshal /path/to/blob.bin # auto-scan
60
+ node depyo.js --marshal-scan /path/to/blob.bin # fast scan, no decompile
38
61
  ```
62
+
39
63
  Without `--py-version`, depyo scans supported versions (oldest → newest) and accepts the first clean output when all clean candidates agree. If outputs diverge (ambiguous), it stops and asks for `--py-version`. Use `--debug` to see scan results.
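
The scan rule described above can be sketched as follows. This is an illustration of the stated behaviour only, not depyo's implementation: `scanVersions` and the `tryDecompile` callback are hypothetical names, and "clean" is modelled simply as a non-null result.

```javascript
// Sketch of "scan oldest → newest, accept only if all clean candidates agree".
// `tryDecompile` is a hypothetical callback: it returns source text for a
// clean decompile, or null when the blob does not decode under that version.
function scanVersions(blob, versions, tryDecompile) {
  const clean = [];
  for (const version of versions) { // versions are ordered oldest → newest
    const source = tryDecompile(blob, version);
    if (source !== null) {
      clean.push({ version, source });
    }
  }
  if (clean.length === 0) {
    return { status: 'no-match', candidates: [] };
  }
  const candidates = clean.map((c) => c.version);
  const allAgree = clean.every((c) => c.source === clean[0].source);
  if (!allAgree) {
    // Ambiguous: the caller should ask for an explicit --py-version.
    return { status: 'ambiguous', candidates };
  }
  // All clean candidates agree, so the first (oldest) one is accepted.
  return { status: 'ok', version: clean[0].version, source: clean[0].source, candidates };
}

// Toy stand-in for a decompiler, for illustration only.
const fake = (outputs) => (_blob, version) => outputs[version] ?? null;
const versions = ['3.10', '3.11', '3.12'];

const agreed = scanVersions(null, versions, fake({ '3.11': 'x = 1\n', '3.12': 'x = 1\n' }));
const diverged = scanVersions(null, versions, fake({ '3.10': 'x = 1\n', '3.11': 'x = 2\n' }));
const none = scanVersions(null, versions, fake({}));
console.log(agreed.status, agreed.version);
console.log(diverged.status, diverged.candidates);
```

Returning the candidate list in every case mirrors what `--debug` is described as showing: which versions produced clean output, even when the scan cannot pick one.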
40
64
 
41
- ### CLI options
42
- - `--asm` emit `.pyasm` disassembly alongside source
43
- - `--raw` emit raw `.pyc` next to output
44
- - `--raw-spacing` preserve blank lines/comment gaps
45
- - `--dump` dump marshalled object tree
46
- - `--stats` print throughput stats
47
- - `--skip-source-gen` skip writing `.py` (use with `--asm/--dump`)
48
- - `--skip-path` flatten output paths (write next to input)
49
- - `--out` print source to stdout instead of files
50
- - `--marshal` treat input as raw marshalled data (no .pyc header, auto-scan versions)
51
- - `--marshal-scan` fast scan marshal blobs and print version candidates
52
- - `--py-version <x.y>` bytecode version hint (use with `--marshal`)
53
- - `--basedir <dir>` override output root (default: alongside input)
54
- - `--file-ext <ext>` change emitted extension (default `py`)
55
-
56
- ## Examples
57
- - Disassemble only (no source): `node depyo.js --skip-source-gen --asm file.pyc`
58
- - Keep raw + disassembly next to source: `node depyo.js --raw --asm path/to/file.pyc`
59
- - Flatten outputs (helpful for bulk zips): `node depyo.js --skip-path archive.zip`
65
+ ## Example
60
66
 
61
- ## Testing
62
- - Smoke per version:
63
- ```bash
64
- node scripts/run-fixtures.js --root test/bytecode_3.14 --pattern py314_with_except_star --fail-fast
65
- node scripts/run-fixtures.js --root test/bytecode_3.6 --pattern py36_fstrings --fail-fast
66
- ```
67
- - Matrix (all versions, optional PyPy):
68
- ```bash
69
- node scripts/run-matrix.js # full sweep
70
- node scripts/run-matrix.js --pattern py311_exception_groups --fail-fast
71
- ```
72
- - Marshal fixtures (headerless marshal blobs):
73
- ```bash
74
- node scripts/run-marshal-fixtures.js
75
- ```
76
- - Regenerate marshal fixtures:
77
- ```bash
78
- node scripts/generate-marshal-fixtures.js --clean
79
- ```
80
- - Modern fixtures are generated via `test/generate_modern_tests.py` (Python 3.8+ on PATH).
67
+ Input `greet.py`:
68
+
69
+ ```python
70
+ async def greet(names: list[str], *, greeting: str = "Hello") -> None:
71
+     seen = set()
72
+     for name in names:
73
+         if name in seen:
74
+             continue
75
+         seen.add(name)
76
+         print(f"{greeting}, {name}!")
77
+ ```
78
+
79
+ Compile (`python3.13 -c 'import py_compile; py_compile.compile("greet.py", "greet.pyc")'`) then:
80
+
81
+ ```bash
82
+ $ npx depyo --out greet.pyc
83
+ async def greet(names: list[str], *, greeting: str = "Hello") -> None:
84
+     seen = set()
85
+     for name in names:
86
+         if name in seen:
87
+             continue
88
+         seen.add(name)
89
+         print(f"{greeting}, {name}!")
90
+ ```
91
+
92
+ Pattern matching round-trips too:
93
+
94
+ ```python
95
+ match command.split():
96
+     case [action]:
97
+         run(action)
98
+     case [action, obj] if action in VERBS:
99
+         run(action, obj)
100
+     case _:
101
+         print("usage: ...")
102
+ ```
103
+
104
+ ## CLI options
105
+
106
+ | Option | Effect |
107
+ | ------------------------ | --------------------------------------------------------------- |
108
+ | `--asm` | Emit `.pyasm` disassembly alongside source |
109
+ | `--raw` | Copy raw `.pyc` next to output |
110
+ | `--raw-spacing` | Preserve blank-line / comment gaps |
111
+ | `--dump` | Dump the marshalled object tree |
112
+ | `--stats` | Print throughput stats |
113
+ | `--skip-source-gen` | Skip writing `.py` (useful with `--asm`/`--dump`) |
114
+ | `--skip-path` | Flatten output paths (write next to input) |
115
+ | `--out` | Print source to stdout instead of files |
116
+ | `--marshal` | Treat input as raw marshalled data (no `.pyc` header) |
117
+ | `--marshal-scan` | Fast scan marshal blobs; print candidate versions |
118
+ | `--py-version <x.y>` | Bytecode version hint (required for some headerless marshals) |
119
+ | `--basedir <dir>` | Override output root (default: alongside input) |
120
+ | `--file-ext <ext>` | Change emitted extension (default `py`) |
121
+
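How `--basedir`, `--skip-path`, and `--file-ext` might combine for a single archive entry can be sketched as below. This is an assumption drawn from the option descriptions in the table, not depyo's actual path logic; `outputPathFor` is a hypothetical helper and uses POSIX-style paths for clarity.

```javascript
const path = require('node:path');

// Hypothetical helper (not depyo's code): derive an output path for one
// archive entry from the documented options.
function outputPathFor(entryPath, { basedir = '.', skipPath = false, fileExt = 'py' } = {}) {
  const parsed = path.posix.parse(entryPath);
  const fileName = `${parsed.name}.${fileExt}`; // swap .pyc for the emitted extension
  const relDir = skipPath ? '' : parsed.dir;    // flattening drops the directories
  return path.posix.join(basedir, relDir, fileName);
}

const mirrored = outputPathFor('pkg/sub/mod.pyc', { basedir: 'out' });
const flattened = outputPathFor('pkg/sub/mod.pyc', { basedir: 'out', skipPath: true });
const renamed = outputPathFor('pkg/sub/mod.pyc', { basedir: 'out', fileExt: 'txt' });
console.log(mirrored, flattened, renamed);
```

Note that flattening a large archive this way can map two entries with the same file name onto one output path, so a real tool would need some collision policy.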
122
+ ## Programmatic API
123
+
124
+ ```js
125
+ const {PycReader} = require('depyo/lib/PycReader');
126
+ const {PycDecompiler} = require('depyo/lib/PycDecompiler');
127
+
128
+ const fs = require('fs');
129
+ const buffer = fs.readFileSync('greet.pyc');
130
+ const reader = new PycReader(buffer);
131
+ const obj = reader.ReadObject();
132
+
133
+ const decompiler = new PycDecompiler(obj);
134
+ const ast = decompiler.decompile();
135
+ console.log(ast.codeFragment().toString());
136
+ ```
81
137
 
82
138
  ## Support matrix
83
- - Python 1.0–3.14 opcode tables with expected fixtures.
84
- - Modern features: match/case, walrus, f-strings, exception groups, type params.
85
- - PyPy bytecode sets decompile; expected files are not yet part of CI.
86
- - Legacy CI smokes (1.x/2.7/3.0–3.6) are informational (`continue-on-error`); modern feature checks are blocking.
139
+
140
+ - **Python 1.0–3.15** opcode tables and expected fixtures.
141
+ - **Modern features:** match/case (guards, OR-patterns, bindings, wildcards), walrus, f-strings (nested, equals-sign debug), exception groups (`except*`), async comprehensions, type parameters, PEP 696 TypeVar defaults, PEP 750 t-strings.
142
+ - **PyPy** bytecode decompiles; expected fixtures not yet part of CI.
143
+ - **CI gates:** Modern feature checks are blocking; legacy 1.x / 2.7 / 3.0–3.6 smoke tests are blocking as well.
87
144
 
88
145
  ## Known limitations
89
- - **Inline comprehensions (Python 3.12+):** PEP 709 inlines list/dict/set comprehensions into parent code objects. Depyo currently reconstructs these as for-loops rather than comprehension expressions. Functions, classes, match/case, exception handling, and other constructs work correctly.
90
146
 
91
- ## Contributing / DX tips
147
+ - **Inline comprehensions (3.12+):** PEP 709 inlines list/dict/set comprehensions into the parent code object. Depyo currently reconstructs these as for-loops rather than comprehension expressions. Functions, classes, match/case, exception handling, and other constructs work correctly.
148
+ - **Comments / blank lines:** Lost in compilation and not recoverable. `--raw-spacing` can hint at original gaps using line-number attributes.
149
+ - **Source-level AST drift:** Some constructs are normalized by CPython before bytecode (e.g. `if not x: raise AssertionError` ↔ `assert x`). Depyo renders what the compiler produced.
150
+
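The idea behind spacing hints, using recorded line numbers to guess where blank lines or comments once sat, can be sketched as follows. The `{ line, text }` statement shape and the `withSpacingHints` name are assumptions for illustration, not depyo's internal representation, and the sketch ignores statements that span several source lines.

```javascript
// Hypothetical sketch: re-insert blank lines wherever consecutive statements'
// recorded line numbers leave a gap.
function withSpacingHints(statements) {
  const out = [];
  let prevLine = null;
  for (const { line, text } of statements) {
    if (prevLine !== null && line - prevLine > 1) {
      // One blank line per skipped source line (a comment or blank in the original).
      for (let i = 0; i < line - prevLine - 1; i++) {
        out.push('');
      }
    }
    out.push(text);
    prevLine = line;
  }
  return out.join('\n');
}

const rendered = withSpacingHints([
  { line: 1, text: 'import os' },
  { line: 4, text: 'x = 1' },
  { line: 5, text: 'y = 2' },
]);
console.log(rendered);
```

The gap only says that something occupied those lines; whether it was a comment or whitespace is not recoverable from the bytecode.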
151
+ ## Testing
152
+
153
+ ```bash
154
+ # Smoke per version
155
+ node scripts/run-fixtures.js --root test/bytecode_3.14 --pattern py314_with_except_star --fail-fast
156
+ node scripts/run-fixtures.js --root test/bytecode_3.6 --pattern py36_fstrings --fail-fast
157
+
158
+ # Full matrix
159
+ node scripts/run-matrix.js
160
+ node scripts/run-matrix.js --pattern py311_exception_groups --fail-fast
161
+
162
+ # Marshal-blob fixtures (headerless)
163
+ node scripts/run-marshal-fixtures.js
164
+
165
+ # Regenerate snapshot fixtures (destructive)
166
+ node scripts/generate-marshal-fixtures.js --clean
167
+
168
+ # Tier-1 oracle: parseability of every decompiled fixture
169
+ node scripts/check-parseable.js
170
+
171
+ # Tier-2 oracle: AST equivalence between source .py and decompiled .py
172
+ node scripts/check-ast-equivalence.js
173
+
174
+ # Sentinel leak gate (CI-critical)
175
+ node scripts/check-no-sentinels.js
176
+ ```
177
+
178
+ Modern fixtures are generated via `test/generate_modern_tests.py` (Python 3.8+ on PATH).
179
+
180
+ ## Contributing
181
+
92
182
  - Use `node scripts/run-fixtures.js --pattern <piece>` for fast repros.
93
183
  - For full coverage, `node scripts/run-matrix.js --fail-fast` (optionally add `--pattern`).
94
- - Enable `--raw-spacing` to inspect potential comment/blank-line gaps.
184
+ - `--raw-spacing` helps inspect potential comment/blank-line gaps.
95
185
  - `--stats` helps when profiling throughput.
96
186
 
97
- Comments and docs are in English; output mirrors the target Python version syntax.
187
+ Issues, repro `.pyc` files, and PRs welcome at https://github.com/skuznetsov/depyo.js/issues.
98
188
 
99
- ## Comparison snapshot (at a glance)
100
-
101
- | Project | Supported versions | Modern features (match, walrus, f-strings, exc groups) | Delivery | Expected fixtures | Notes |
102
- | ------------------ | --------------------------- | ------------------------------------------------------ | ------------ | ----------------- | ----------------------------------------- |
103
- | depyo | 1.0–3.14 (PyPy decompiles) | Yes | npm/npx, CLI | Yes (1.0–3.14) | Node.js CLI, asm/raw-spacing options |
104
- | uncompyle6/decompyle3 | 2.x–3.12+ (lag on 3.13/3.14) | Partial (depends on branch) | pip | Partial | Python-based, slower adoption of new ops |
105
- | pycdc (C++) | Mostly 2.x–3.x (limited new) | Partial | source build | No | Fast, but modern coverage limited |
189
+ Comments and docs are in English; output mirrors the target Python version syntax.
106
190
 
107
- ## Quick benchmark (informal)
108
- - Machine: local Node 25, single-thread.
109
- - Case: `py314_exception_groups.pyc` decompiled 50× in-process: ~5.3 ms total (≈0.1 ms per decompile).
110
- Use `node depyo.js --stats <file.pyc>` for your environment.
191
+ ## License
111
192
 
112
- ## Promotion ideas (OSS)
113
- - Announce on HN/Reddit (Show HN / r/Python) with npm/npx one-liners.
114
- - Add to awesome lists (`awesome-python`, `awesome-reverse-engineering`).
115
- - Provide asciinema/GIF of `npx depyo file.pyc` + `--asm`.
116
- - Encourage contributions via Issues/Discussions and `help wanted` labels.
193
+ MIT; see [LICENSE](LICENSE).