codeclone 1.2.0__py3-none-any.whl → 1.2.1__py3-none-any.whl

@@ -1,6 +1,6 @@
  Metadata-Version: 2.4
  Name: codeclone
- Version: 1.2.0
+ Version: 1.2.1
  Summary: AST and CFG-based code clone detector for Python focused on architectural duplication
  Author-email: Den Rozhnovskiy <pytelemonbot@mail.ru>
  Maintainer-email: Den Rozhnovskiy <pytelemonbot@mail.ru>
@@ -10,11 +10,10 @@ Project-URL: Repository, https://github.com/orenlab/codeclone
  Project-URL: Issues, https://github.com/orenlab/codeclone/issues
  Project-URL: Changelog, https://github.com/orenlab/codeclone/releases
  Project-URL: Documentation, https://github.com/orenlab/codeclone/tree/main/docs
- Keywords: python,ast,code-clone,duplication,static-analysis,ci,architecture
+ Keywords: python,ast,cfg,code-clone,duplication,static-analysis,architecture,control-flow,ci
  Classifier: Development Status :: 5 - Production/Stable
  Classifier: Intended Audience :: Developers
  Classifier: Topic :: Software Development :: Quality Assurance
- Classifier: Topic :: Software Development :: Code Generators
  Classifier: Topic :: Software Development :: Testing
  Classifier: Typing :: Typed
  Classifier: License :: OSI Approved :: MIT License
@@ -32,19 +31,22 @@ Requires-Dist: pygments>=2.19.2
  Requires-Dist: rich>=14.3.2
  Provides-Extra: dev
  Requires-Dist: pytest>=9.0.0; extra == "dev"
+ Requires-Dist: pytest-cov>=6.1.0; extra == "dev"
  Requires-Dist: build>=1.2.0; extra == "dev"
  Requires-Dist: twine>=5.0.0; extra == "dev"
  Requires-Dist: mypy>=1.19.1; extra == "dev"
+ Requires-Dist: ruff>=0.12.0; extra == "dev"
  Dynamic: license-file
 
  # CodeClone
 
  [![PyPI](https://img.shields.io/pypi/v/codeclone.svg)](https://pypi.org/project/codeclone/)
  [![Downloads](https://img.shields.io/pypi/dm/codeclone.svg)](https://pypi.org/project/codeclone/)
+ [![tests](https://github.com/orenlab/codeclone/actions/workflows/tests.yml/badge.svg?branch=main)](https://github.com/orenlab/codeclone/actions/workflows/tests.yml)
  [![Python](https://img.shields.io/pypi/pyversions/codeclone.svg)](https://pypi.org/project/codeclone/)
  [![License](https://img.shields.io/pypi/l/codeclone.svg)](LICENSE)
 
- **CodeClone** is a Python code clone detector based on **normalized AST and control-flow graphs (CFG)**.
+ **CodeClone** is a Python code clone detector based on **normalized Python AST and Control Flow Graphs (CFG)**.
  It helps teams discover architectural duplication and prevent new copy-paste from entering the codebase via CI.
 
  CodeClone is designed to help teams:
@@ -53,15 +55,16 @@ CodeClone is designed to help teams:
  - identify architectural hotspots,
  - prevent *new* duplication via CI and pre-commit hooks.
 
- Unlike token- or text-based tools, CodeClone operates on **normalized Python AST and CFG**, making it robust against renaming,
- formatting, and minor refactoring.
+ Unlike token- or text-based tools, CodeClone operates on **normalized Python AST and CFG**, making it robust against
+ renaming, formatting, and minor refactoring.
 
  ---
 
  ## Why CodeClone?
 
  Most existing tools detect *textual* duplication.
- CodeClone detects **structural and block-level duplication**, which usually signals missing abstractions or architectural drift.
+ CodeClone detects **structural and block-level duplication**, which usually signals missing abstractions or
+ architectural drift.
 
  Typical use cases:
 
@@ -79,11 +82,11 @@ Typical use cases:
  - Detects functions and methods with identical **control-flow structure**.
  - Based on **Control Flow Graph (CFG)** fingerprinting.
  - Robust to:
- - variable renaming,
- - constant changes,
- - attribute renaming,
- - formatting differences,
- - docstrings and type annotations.
+ - variable renaming,
+ - constant changes,
+ - attribute renaming,
+ - formatting differences,
+ - docstrings and type annotations.
  - Ideal for spotting architectural duplication across layers.
 
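The robustness properties listed in this hunk (renaming, constant changes, attribute renaming, docstrings, type annotations) can be illustrated with a small sketch. This is not CodeClone's implementation: the `_Normalizer` class, the placeholder strings, and the `fingerprint` function are hypothetical, and the sketch hashes the plain normalized AST rather than a CFG.

```python
import ast
import hashlib


class _Normalizer(ast.NodeTransformer):
    """Hypothetical normalizer: erases names, constants, annotations, docstrings."""

    def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.AST:
        node.name = "_FN_"
        node.returns = None  # drop the return annotation
        first = node.body[0] if node.body else None
        # Drop a leading docstring so documentation does not affect the hash.
        if (
            isinstance(first, ast.Expr)
            and isinstance(first.value, ast.Constant)
            and isinstance(first.value.value, str)
        ):
            node.body = node.body[1:] or [ast.Pass()]
        self.generic_visit(node)
        return node

    def visit_arg(self, node: ast.arg) -> ast.AST:
        node.arg = "_VAR_"
        node.annotation = None  # drop parameter annotations
        return node

    def visit_Name(self, node: ast.Name) -> ast.AST:
        node.id = "_VAR_"
        return node

    def visit_Attribute(self, node: ast.Attribute) -> ast.AST:
        self.generic_visit(node)
        node.attr = "_ATTR_"
        return node

    def visit_Constant(self, node: ast.Constant) -> ast.AST:
        node.value = "_CONST_"
        node.kind = None
        return node


def fingerprint(source: str) -> str:
    """Hash the normalized AST; line/column attributes are excluded from the dump."""
    tree = _Normalizer().visit(ast.parse(source))
    dump = ast.dump(tree, include_attributes=False)
    return hashlib.sha256(dump.encode("utf-8")).hexdigest()
```

Under these assumptions, two functions that differ only in identifiers, literals, attribute names, docstrings, or annotations hash to the same value, while a function with a different statement structure does not. Formatting differences disappear at parse time, because the AST carries no whitespace.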
  ### Block-level clone detection (Type-3-lite)
@@ -91,29 +94,29 @@ Typical use cases:
  - Detects repeated **statement blocks** inside larger functions.
  - Uses sliding windows over CFG-normalized statement sequences.
  - Targets:
- - validation blocks,
- - guard clauses,
- - repeated orchestration logic.
+ - validation blocks,
+ - guard clauses,
+ - repeated orchestration logic.
  - Carefully filtered to reduce noise:
- - no overlapping windows,
- - no clones inside the same function,
- - no `__init__` noise,
- - size and statement-count thresholds.
+ - no overlapping windows,
+ - no clones inside the same function,
+ - no `__init__` noise,
+ - size and statement-count thresholds.
 
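The sliding-window idea described in this hunk can be sketched roughly as below. This is an assumption-laden illustration, not the package's code: `statement_shapes`, `block_hashes`, `find_block_clones`, and the default window of 3 are hypothetical; the "shape" of a statement is a simple stand-in for CFG normalization; and only two of the listed filters (skipping `__init__`, ignoring repeats inside the same function) are modeled.

```python
import ast
import hashlib
from collections import defaultdict
from typing import Iterator


def statement_shapes(func: ast.FunctionDef) -> list[str]:
    """Reduce each top-level statement to the sequence of AST node type names."""
    return [
        " ".join(type(node).__name__ for node in ast.walk(stmt))
        for stmt in func.body
    ]


def block_hashes(shapes: list[str], window: int) -> Iterator[tuple[int, str]]:
    """Yield (start index, digest) for every window of consecutive statements."""
    for start in range(len(shapes) - window + 1):
        joined = "\n".join(shapes[start:start + window])
        yield start, hashlib.sha256(joined.encode("utf-8")).hexdigest()


def find_block_clones(source: str, window: int = 3) -> dict[str, list[tuple[str, int]]]:
    """Group windows that share a digest across *different* functions."""
    groups: dict[str, list[tuple[str, int]]] = defaultdict(list)
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.FunctionDef) or node.name == "__init__":
            continue  # filter: no `__init__` noise
        seen: set[str] = set()
        for start, digest in block_hashes(statement_shapes(node), window):
            if digest in seen:
                continue  # filter: no clones inside the same function
            seen.add(digest)
            groups[digest].append((node.name, node.body[start].lineno))
    return {digest: locs for digest, locs in groups.items() if len(locs) > 1}
```

Each returned group maps a digest to `(function name, starting line)` pairs. A fuller version would also suppress overlapping windows and apply size thresholds, as the list above describes.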
  ### Control-Flow Awareness (CFG v1)
 
  - Each function is converted into a **Control Flow Graph**.
  - CFG nodes contain normalized AST statements.
  - CFG edges represent structural control flow:
- - `if` / `else`
- - `for` / `async for` / `while`
- - `try` / `except` / `finally`
- - `with` / `async with`
- - `match` / `case` (Python 3.10+)
+ - `if` / `else`
+ - `for` / `async for` / `while`
+ - `try` / `except` / `finally`
+ - `with` / `async with`
+ - `match` / `case` (Python 3.10+)
  - Current CFG semantics (v1):
- - `break` and `continue` are treated as statements (no jump targets),
- - after-blocks are explicit and always present,
- - focus is on **structural similarity**, not precise runtime semantics.
+ - `break` and `continue` are treated as statements (no jump targets),
+ - after-blocks are explicit and always present,
+ - focus is on **structural similarity**, not precise runtime semantics.
 
  This design keeps clone detection **stable, deterministic, and low-noise**.
 
@@ -122,6 +125,7 @@ This design keeps clone detection **stable, deterministic, and low-noise**.
  - AST + CFG normalization instead of token matching.
  - Conservative defaults tuned for real-world Python projects.
  - Explicit thresholds for size and statement count.
+ - No probabilistic scoring or heuristic similarity thresholds.
  - Focus on *architectural duplication*, not micro-similarities.
 
  ### CI-friendly baseline mode
@@ -188,14 +192,26 @@ Commit the generated baseline file to the repository.
  ### 2. Use in CI
 
  ```bash
- codeclone . --fail-on-new
+ codeclone . --fail-on-new --no-progress
  ```
 
  Behavior:
 
- - existing clones are allowed,
- - build fails if *new* clones appear,
- - refactoring that removes duplication is always allowed.
+ - existing clones are allowed,
+ - the build fails if *new* clones appear,
+ - refactoring that removes duplication is always allowed.
+
+ `--fail-on-new` exits with a non-zero code when new clones are detected.
+
+ ### Python Version Consistency for Baseline Checks
+
+ Due to inherent differences in Python’s AST between interpreter versions, baseline
+ generation and verification must be performed using the same Python version.
+
+ This ensures deterministic and reproducible clone detection results.
+
+ CI checks therefore pin baseline verification to a single Python version, while the
+ test matrix continues to validate compatibility across Python 3.10–3.14.
 
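The baseline behavior and the same-interpreter requirement added in this hunk can be sketched as follows. Everything here is assumed for illustration: the JSON layout, the `python` and `clones` keys, the function names, and the exit code `1` are hypothetical and do not describe CodeClone's real baseline file (the README only states that the exit code is non-zero).

```python
import json
import sys
from pathlib import Path


def python_tag() -> str:
    """Major.minor version of the running interpreter, e.g. '3.12'."""
    return f"{sys.version_info.major}.{sys.version_info.minor}"


def save_baseline(path: str, fingerprints: set[str]) -> None:
    """Write known clone fingerprints together with the interpreter version."""
    payload = {"python": python_tag(), "clones": sorted(fingerprints)}
    Path(path).write_text(json.dumps(payload, indent=2), encoding="utf-8")


def new_clones(path: str, current: set[str]) -> set[str]:
    """Return fingerprints that are present now but absent from the baseline."""
    payload = json.loads(Path(path).read_text(encoding="utf-8"))
    if payload["python"] != python_tag():
        # AST shapes differ between interpreter versions, so the comparison
        # would not be meaningful; refuse instead of reporting false clones.
        raise RuntimeError(
            f"baseline built with Python {payload['python']}, "
            f"running {python_tag()}"
        )
    return set(current) - set(payload["clones"])


def check(path: str, current: set[str]) -> int:
    """Exit status in the spirit of --fail-on-new: 0 if nothing new, else 1."""
    return 1 if new_clones(path, current) else 0
```

Because the comparison is a set difference, removing duplication never fails the check, and only fingerprints missing from the baseline do.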
  ---
 
@@ -203,14 +219,14 @@ Behavior:
 
  ```yaml
  repos:
- - repo: local
+ - repo: local
  hooks:
- - id: codeclone
+ - id: codeclone
  name: CodeClone
  entry: codeclone
  language: python
- args: [".", "--fail-on-new"]
- types: [python]
+ args: [ ".", "--fail-on-new" ]
+ types: [ python ]
  ```
 
  ---
@@ -243,6 +259,7 @@ repos:
  6. Apply conservative filters to suppress noise.
 
  See the architectural overview:
+
  - [docs/architecture.md](docs/architecture.md)
 
  ---
@@ -255,6 +272,7 @@ to improve structural clone detection robustness.
  The CFG is a **structural abstraction**, not a runtime execution model.
 
  See full design and semantics:
+
  - [docs/cfg.md](docs/cfg.md)
 
  ---
@@ -0,0 +1,23 @@
+ codeclone/__init__.py,sha256=M4X23w41G4ph6fx29JG3qrkkyWaTSwvCe5O6jeJvFd0,364
+ codeclone/baseline.py,sha256=Y367RgcTlBwQxkQmm4gGwWft7yhp7u3BMeeWQsZgV6I,2533
+ codeclone/blockhash.py,sha256=0vavcw3nkg7IIIqqYRObVmzHSJSC3tbzmiVqqr21miA,594
+ codeclone/blocks.py,sha256=BAWb9iPvSWYggQr9SlWIpb-7lyvVvrW0dxjTEnKdm0o,1796
+ codeclone/cache.py,sha256=3Ky4ubZZsDvHTwqSRPQ9nPl4t4eCsUtiXCWeNBOZfIo,5493
+ codeclone/cfg.py,sha256=XJWb2fcKxa3f6ffRMrQroX6QzkPcByZPNwjhsebbx2g,8236
+ codeclone/cfg_model.py,sha256=D2KK4OZrVV-OCwFECw-Ej_u32uekNgj6pdTqCvZa5H0,1170
+ codeclone/cli.py,sha256=SjoTSBDNyrjlNhTMze_-Y1GvuZ1nB4wVa027y9XiOgc,20550
+ codeclone/errors.py,sha256=3QRu2IR8I-VgJLkT3B9DZ0fW4Sxi-EnBYVUeX3rfWhw,556
+ codeclone/extractor.py,sha256=LixKDPJpP2V1UIwW0vsFJm6BKr1NijD8BCGJB3I9lds,6946
+ codeclone/fingerprint.py,sha256=H0YY209sfGf02VeLlxHNDp7n6es0vLiMmq3TBWDm3SE,545
+ codeclone/html_report.py,sha256=OwVRt4o3DZ17RMDgzI5578bqbyHLqC0mnedmDlYWwag,16369
+ codeclone/normalize.py,sha256=BOTgJS5NmJxgBwHFxkqWqRNL1uk9rJtjdeQtSgFscS8,4461
+ codeclone/py.typed,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
+ codeclone/report.py,sha256=g8EiUksjrQXjJch8-KfJT-IsMkONNnuaj1rNOIfPDbE,2153
+ codeclone/scanner.py,sha256=44YbL9fiDauNgwrMbnLijoH3CMWInro4vBsIlBLQu-M,2669
+ codeclone/templates.py,sha256=UDEzE_X7n5eWU1CyXChxNhVPM6j3OvVBYDlWcyQaY7I,29375
+ codeclone-1.2.1.dist-info/licenses/LICENSE,sha256=ndXAbunvN-jCQjgCaoBOF5AN4FcRlAa0R7hK1lWDuBU,1073
+ codeclone-1.2.1.dist-info/METADATA,sha256=Xxe9hqv01xYFZpjmS0gmzAGzobMV5Cyb_MvvkzjL1Vs,8110
+ codeclone-1.2.1.dist-info/WHEEL,sha256=wUyA8OaulRlbfwMtmQsvNngGrxQHAvkKcvRmdizlJi0,92
+ codeclone-1.2.1.dist-info/entry_points.txt,sha256=_MI9DVTLOmv3OlxpyogdOmMAduiLVIdHeOlZ_KBsrIY,49
+ codeclone-1.2.1.dist-info/top_level.txt,sha256=4tQa_d-4Zle-qV9KmNDkWq0WHYgZsW9vdaeF30rNntg,10
+ codeclone-1.2.1.dist-info/RECORD,,
@@ -1,19 +0,0 @@
- codeclone/__init__.py,sha256=_MZuqgioYGn49qw6_OluedgcRvuu8IhAwVgHkM4Kj7s,364
- codeclone/baseline.py,sha256=tV-vgCpyQi-g1AlWJwuUFDeQwUZuhj4tX-BxFDv-LWo,1719
- codeclone/blockhash.py,sha256=jOswW9jqe9ww3y4r-gLTUZjMmn0CHXpU5qTtFndKQ10,594
- codeclone/blocks.py,sha256=Y75BSpzf-zyyeD9-vKnQfJ3QwIjYxMwiN9DbEqnbONg,1735
- codeclone/cache.py,sha256=DexslhZfxj79fnoPna8r_oBCqmsstN2ICRA_o6ZVpGo,1559
- codeclone/cfg.py,sha256=op503zRnew2ZIz5coK_HUaKt1VG1ucJBacXAZ1rBJYQ,11587
- codeclone/cli.py,sha256=c_61l3wSF2nbTZgd6LmvoUvejXatydPAPmJyRknHx78,13229
- codeclone/extractor.py,sha256=crCxgkK1n2fl5FUz6HtbaEZUH5CO8S0zqM0Xj0RSE6E,4558
- codeclone/fingerprint.py,sha256=H0YY209sfGf02VeLlxHNDp7n6es0vLiMmq3TBWDm3SE,545
- codeclone/html_report.py,sha256=e7gYxEHk5ezJtGUYIYsQlcxu_0fP3hmd9jj-zMWMfJY,25574
- codeclone/normalize.py,sha256=bvMoY3VDiZsnFiD50h5XUgwnMUKOyUEx6lJXCBiembg,4290
- codeclone/report.py,sha256=IwblTgZe4liTq3gHagnw6O4ZUkRqJ448XqzhLhJZoWM,1972
- codeclone/scanner.py,sha256=BdJFyaLv1xvimVkJyvvTN0FcG2RQZzRlTlHWi2fRQnU,1050
- codeclone-1.2.0.dist-info/licenses/LICENSE,sha256=ndXAbunvN-jCQjgCaoBOF5AN4FcRlAa0R7hK1lWDuBU,1073
- codeclone-1.2.0.dist-info/METADATA,sha256=RekVw29wHrLX0fHNLBYS9Ky6Ee6oFnntQxeCOcRbMTM,7255
- codeclone-1.2.0.dist-info/WHEEL,sha256=wUyA8OaulRlbfwMtmQsvNngGrxQHAvkKcvRmdizlJi0,92
- codeclone-1.2.0.dist-info/entry_points.txt,sha256=_MI9DVTLOmv3OlxpyogdOmMAduiLVIdHeOlZ_KBsrIY,49
- codeclone-1.2.0.dist-info/top_level.txt,sha256=4tQa_d-4Zle-qV9KmNDkWq0WHYgZsW9vdaeF30rNntg,10
- codeclone-1.2.0.dist-info/RECORD,,