code-maat-python 0.1.0__py3-none-any.whl
- code_maat_python/__init__.py +12 -0
- code_maat_python/__main__.py +5 -0
- code_maat_python/analyses/__init__.py +39 -0
- code_maat_python/analyses/age.py +101 -0
- code_maat_python/analyses/authors.py +60 -0
- code_maat_python/analyses/churn.py +353 -0
- code_maat_python/analyses/communication.py +151 -0
- code_maat_python/analyses/coupling.py +136 -0
- code_maat_python/analyses/effort.py +210 -0
- code_maat_python/analyses/entities.py +51 -0
- code_maat_python/analyses/revisions.py +56 -0
- code_maat_python/analyses/soc.py +90 -0
- code_maat_python/analyses/summary.py +61 -0
- code_maat_python/cli.py +822 -0
- code_maat_python/output/__init__.py +0 -0
- code_maat_python/parser.py +232 -0
- code_maat_python/pipeline.py +112 -0
- code_maat_python/transformers/__init__.py +0 -0
- code_maat_python/transformers/grouper.py +204 -0
- code_maat_python/transformers/team_mapper.py +132 -0
- code_maat_python/transformers/time_grouper.py +146 -0
- code_maat_python/utils/__init__.py +0 -0
- code_maat_python/utils/math.py +105 -0
- code_maat_python-0.1.0.dist-info/METADATA +545 -0
- code_maat_python-0.1.0.dist-info/RECORD +28 -0
- code_maat_python-0.1.0.dist-info/WHEEL +4 -0
- code_maat_python-0.1.0.dist-info/entry_points.txt +3 -0
- code_maat_python-0.1.0.dist-info/licenses/LICENSE +674 -0
|
@@ -0,0 +1,545 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: code-maat-python
|
|
3
|
+
Version: 0.1.0
|
|
4
|
+
Summary: Modern Python tool for mining and analyzing version control system data
|
|
5
|
+
License: GPL-3.0
|
|
6
|
+
License-File: LICENSE
|
|
7
|
+
Keywords: vcs,git,analysis,mining,coupling,churn
|
|
8
|
+
Author: Cameron Yick
|
|
9
|
+
Author-email: cameron.yick@gmail.com
|
|
10
|
+
Requires-Python: >=3.10,<4.0
|
|
11
|
+
Classifier: Development Status :: 3 - Alpha
|
|
12
|
+
Classifier: Intended Audience :: Developers
|
|
13
|
+
Classifier: License :: OSI Approved :: GNU General Public License v3 (GPLv3)
|
|
14
|
+
Classifier: Programming Language :: Python :: 3
|
|
15
|
+
Classifier: Programming Language :: Python :: 3.10
|
|
16
|
+
Classifier: Programming Language :: Python :: 3.11
|
|
17
|
+
Classifier: Programming Language :: Python :: 3.12
|
|
18
|
+
Classifier: Programming Language :: Python :: 3.13
|
|
19
|
+
Classifier: Programming Language :: Python :: 3.14
|
|
20
|
+
Classifier: Topic :: Software Development :: Version Control
|
|
21
|
+
Requires-Dist: click (>=8.1.0,<9.0.0)
|
|
22
|
+
Requires-Dist: pandas (>=2.0.0,<3.0.0)
|
|
23
|
+
Requires-Dist: python-dateutil (>=2.8.0,<3.0.0)
|
|
24
|
+
Project-URL: Homepage, https://github.com/hydrosquall/code-maat-python
|
|
25
|
+
Project-URL: Repository, https://github.com/hydrosquall/code-maat-python
|
|
26
|
+
Description-Content-Type: text/markdown
|
|
27
|
+
|
|
28
|
+
# code-maat-python
|
|
29
|
+
|
|
30
|
+
> **Discover hidden patterns in your codebase that predict bugs, reveal team dynamics, and guide better decisions.**
|
|
31
|
+
|
|
32
|
+
[License: GPL-3.0](https://opensource.org/licenses/GPL-3.0)
|
|
33
|
+
[Python 3.10+](https://www.python.org/downloads/)
|
|
34
|
+
|
|
35
|
+
---
|
|
36
|
+
|
|
37
|
+
## What is This?
|
|
38
|
+
|
|
39
|
+
**code-maat-python** is a tool that analyzes your Git history to answer questions like:
|
|
40
|
+
|
|
41
|
+
- **Which files have the most bugs?** (Hint: it's usually the files that change together most often)
|
|
42
|
+
- **Which developers need to talk to each other?** (They're editing the same code without knowing it)
|
|
43
|
+
- **Where is technical debt hiding?** (Old, untouched files that everyone's afraid to change)
|
|
44
|
+
- **Who really owns this code?** (Spoiler: it's not who you think)
|
|
45
|
+
- **Which files are the riskiest to change?** (High churn = high risk)
|
|
46
|
+
|
|
47
|
+
**You don't need to be a data scientist.** If you can run a Git command, you can use this tool.
|
|
48
|
+
|
|
49
|
+
---
|
|
50
|
+
|
|
51
|
+
## Why Should You Care?
|
|
52
|
+
|
|
53
|
+
### **Real-World Problem: The Hidden Coupling Bug**
|
|
54
|
+
|
|
55
|
+
Imagine this scenario:
|
|
56
|
+
- Sarah fixes a bug in `payment_processor.py`
|
|
57
|
+
- Tests pass. Code review approved. She deploys.
|
|
58
|
+
- **Production breaks.**
|
|
59
|
+
|
|
60
|
+
Why? Because `invoice_generator.py` also needed to change, but nobody knew these files were related. They're in different directories, never import each other, but they *always change together* because they share hidden business logic.
|
|
61
|
+
|
|
62
|
+
**code-maat-python finds these patterns automatically** by analyzing your Git history.
|
|
63
|
+
|
|
64
|
+
### **What You Get**
|
|
65
|
+
|
|
66
|
+
- **Predict defects**: Files that change together frequently have 2-10x more bugs ([Microsoft Research](https://dl.acm.org/doi/10.1145/1453101.1453106))
|
|
67
|
+
- **Improve code reviews**: Know which files need extra scrutiny
|
|
68
|
+
- **Optimize team structure**: Identify communication bottlenecks before they cause problems
|
|
69
|
+
- **Find technical debt**: Discover abandoned code, knowledge silos, and refactoring opportunities
|
|
70
|
+
- **Make data-driven decisions**: Replace gut feelings with actual evidence from your repository
|
|
71
|
+
|
|
72
|
+
---
|
|
73
|
+
|
|
74
|
+
## Quick Start: See It In Action
|
|
75
|
+
|
|
76
|
+
### Installation
|
|
77
|
+
|
|
78
|
+
```bash
|
|
79
|
+
# Using pip
|
|
80
|
+
pip install code-maat-python
|
|
81
|
+
|
|
82
|
+
# Using poetry (recommended for development)
|
|
83
|
+
poetry add code-maat-python
|
|
84
|
+
```
|
|
85
|
+
|
|
86
|
+
### Your First Analysis (2 minutes)
|
|
87
|
+
|
|
88
|
+
**Step 1: Generate a Git log**
|
|
89
|
+
|
|
90
|
+
```bash
|
|
91
|
+
cd your-project
|
|
92
|
+
git log --all -M -C --numstat --date=short --pretty=format:'--%h--%cd--%cn' > git.log
|
|
93
|
+
```
|
|
94
|
+
|
|
95
|
+
**Step 2: Find files that change together (coupling analysis)**
|
|
96
|
+
|
|
97
|
+
```bash
|
|
98
|
+
code-maat-python coupling git.log --min-coupling 50 --rows 10
|
|
99
|
+
```
|
|
100
|
+
|
|
101
|
+
**Output:**
|
|
102
|
+
```csv
|
|
103
|
+
entity,coupled,degree,average-revs
|
|
104
|
+
src/models/user.py,src/views/profile.py,87,45
|
|
105
|
+
src/api/auth.py,src/middleware/session.py,76,32
|
|
106
|
+
src/utils/validators.py,src/forms/registration.py,65,28
|
|
107
|
+
...
|
|
108
|
+
```
|
|
109
|
+
|
|
110
|
+
**What This Tells You:**
|
|
111
|
+
- `user.py` and `profile.py` change together 87% of the time
|
|
112
|
+
- `average-revs` is 45: the two files average 45 revisions each, so the percentage rests on a meaningful number of commits
|
|
113
|
+
- **Action**: These files should probably be reviewed together, tested together, and maybe even refactored into a single module
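
To make the `degree` and `average-revs` columns concrete, here is a minimal Python sketch of how such numbers can be derived from commit data. It assumes the definition used by the original Code Maat (commits shared by two files, divided by the average revision count of those files, as a percentage). The function name `coupling_degrees` and the input shape are hypothetical, and the package's actual implementation may differ.

```python
from collections import Counter
from itertools import combinations
from math import ceil


def coupling_degrees(commits):
    """Return {(file_a, file_b): (degree, average_revs)} for co-changing files.

    `commits` is an iterable of sets, one set of file paths per commit.
    """
    revisions = Counter()  # number of commits touching each file
    shared = Counter()     # number of commits touching both files of a pair
    for files in commits:
        revisions.update(files)
        shared.update(combinations(sorted(files), 2))

    result = {}
    for (a, b), together in shared.items():
        average = (revisions[a] + revisions[b]) / 2
        degree = int(100 * together / average)  # truncated percentage
        result[(a, b)] = (degree, ceil(average))
    return result


# Hypothetical history: four commits touching three files.
commits = [
    {"user.py", "profile.py"},
    {"user.py", "profile.py"},
    {"user.py", "profile.py", "auth.py"},
    {"user.py"},
]
for pair, (degree, average_revs) in sorted(coupling_degrees(commits).items()):
    print(pair, degree, average_revs)
```

A threshold such as `--min-coupling 50` would then amount to keeping only the pairs whose degree is at least 50.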
|
|
114
|
+
|
|
115
|
+
---
|
|
116
|
+
|
|
117
|
+
### One-Liner with UV (No Log File Needed)
|
|
118
|
+
|
|
119
|
+
Run an analysis directly, without creating an intermediate file, by using **process substitution**:
|
|
120
|
+
|
|
121
|
+
```bash
|
|
122
|
+
# Coupling analysis (last month)
|
|
123
|
+
uvx code-maat-python coupling <(git log --all -M -C --numstat --date=short --since="1 month ago" --pretty=format:'--%h--%cd--%cn') --min-coupling 50
|
|
124
|
+
|
|
125
|
+
# Hotspots (revisions)
|
|
126
|
+
uvx code-maat-python revisions <(git log --all -M -C --numstat --date=short --since="1 month ago" --pretty=format:'--%h--%cd--%cn') --rows 20
|
|
127
|
+
|
|
128
|
+
# Communication analysis
|
|
129
|
+
uvx code-maat-python communication <(git log --all -M -C --numstat --date=short --since="1 month ago" --pretty=format:'--%h--%cd--%cn') --min-shared 10
|
|
130
|
+
|
|
131
|
+
# For more history, adjust the time period:
|
|
132
|
+
# --since="3 months ago"
|
|
133
|
+
# --since="6 months ago"
|
|
134
|
+
# --since="1 year ago"
|
|
135
|
+
# Or omit --since to analyze entire history
|
|
136
|
+
```
|
|
137
|
+
|
|
138
|
+
**Note:**
|
|
139
|
+
- Process substitution `<(...)` works on macOS and Linux with bash/zsh. For Windows or portable scripts, create a temporary file instead.
|
|
140
|
+
- Default to `--since="1 month ago"` to keep the output manageable. Adjust the time period as needed.
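
For Windows or portable scripts, the temporary-file approach can be sketched in Python as follows. This is a minimal sketch, not part of the package: the helper names `git_log_command` and `run_analysis` are hypothetical, and it assumes both `git` and `code-maat-python` are on the `PATH`. The `git log` flags are the ones used throughout this README.

```python
import os
import subprocess
import tempfile

LOG_FORMAT = "--%h--%cd--%cn"


def git_log_command(since=None):
    """Build the `git log` invocation as an argument list (no shell quoting needed)."""
    command = [
        "git", "log", "--all", "-M", "-C", "--numstat",
        "--date=short", f"--pretty=format:{LOG_FORMAT}",
    ]
    if since:
        command.append(f"--since={since}")
    return command


def run_analysis(analysis, *options, since="1 month ago"):
    """Write the log to a temporary file, then run code-maat-python on it."""
    with tempfile.TemporaryDirectory() as workdir:
        log_path = os.path.join(workdir, "git.log")
        with open(log_path, "w", encoding="utf-8") as log_file:
            subprocess.run(git_log_command(since), stdout=log_file, check=True)
        # The temporary directory is removed once the analysis has finished.
        subprocess.run(["code-maat-python", analysis, log_path, *options], check=True)
```

Calling `run_analysis("coupling", "--min-coupling", "50")` from inside a repository would mirror the first one-liner above without relying on `<(...)`.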
|
|
141
|
+
|
|
142
|
+
---
|
|
143
|
+
|
|
144
|
+
### Shell Alias for Quick Analysis
|
|
145
|
+
|
|
146
|
+
Add this to your `~/.bashrc` or `~/.zshrc` for even faster analysis:
|
|
147
|
+
|
|
148
|
+
```bash
|
|
149
|
+
# Default: analyze last month
|
|
150
|
+
maat() {
|
|
151
|
+
local analysis="${1:?Usage: maat <analysis> [options]}"
|
|
152
|
+
shift
|
|
153
|
+
uvx code-maat-python "$analysis" <(git log --all -M -C --numstat --date=short --since="1 month ago" --pretty=format:'--%h--%cd--%cn') "$@"
|
|
154
|
+
}
|
|
155
|
+
|
|
156
|
+
# Optional: analyze custom time period
|
|
157
|
+
maat-since() {
|
|
158
|
+
local since="${1:?Usage: maat-since <time-period> <analysis> [options]}"
|
|
159
|
+
local analysis="${2:?}"
|
|
160
|
+
shift 2
|
|
161
|
+
uvx code-maat-python "$analysis" <(git log --all -M -C --numstat --date=short --since="$since" --pretty=format:'--%h--%cd--%cn') "$@"
|
|
162
|
+
}
|
|
163
|
+
```
|
|
164
|
+
|
|
165
|
+
Then use it like:
|
|
166
|
+
```bash
|
|
167
|
+
# Default (last month)
|
|
168
|
+
maat coupling --min-coupling 50
|
|
169
|
+
maat revisions --rows 20
|
|
170
|
+
maat communication --min-shared 10
|
|
171
|
+
|
|
172
|
+
# Custom time period
|
|
173
|
+
maat-since "3 months ago" coupling --min-coupling 50
|
|
174
|
+
maat-since "1 year ago" revisions --rows 20
|
|
175
|
+
```
|
|
176
|
+
|
|
177
|
+
---
|
|
178
|
+
|
|
179
|
+
## Installation
|
|
180
|
+
|
|
181
|
+
### Requirements
|
|
182
|
+
- Python 3.10 or higher
|
|
183
|
+
- Git (to generate logs)
|
|
184
|
+
|
|
185
|
+
### Install via pip
|
|
186
|
+
|
|
187
|
+
```bash
|
|
188
|
+
pip install code-maat-python
|
|
189
|
+
```
|
|
190
|
+
|
|
191
|
+
### Install from source
|
|
192
|
+
|
|
193
|
+
```bash
|
|
194
|
+
git clone https://github.com/hydrosquall/code-maat-python.git
|
|
195
|
+
cd code-maat-python
|
|
196
|
+
poetry install
|
|
197
|
+
```
|
|
198
|
+
|
|
199
|
+
### Verify installation
|
|
200
|
+
|
|
201
|
+
```bash
|
|
202
|
+
code-maat-python --help
|
|
203
|
+
```
|
|
204
|
+
|
|
205
|
+
---
|
|
206
|
+
|
|
207
|
+
## Available Analysis Commands
|
|
208
|
+
|
|
209
|
+
code-maat-python provides 17 different analysis commands organized into these categories:
|
|
210
|
+
|
|
211
|
+
### Code Quality & Risk
|
|
212
|
+
- **coupling** - Find files that change together (hidden dependencies)
|
|
213
|
+
- **soc** - Sum of coupling (quick hotspot detection)
|
|
214
|
+
- **entity-churn** - Identify high-change files (defect predictors)
|
|
215
|
+
- **age** - Find stale or stable code
|
|
216
|
+
|
|
217
|
+
### Team & Collaboration
|
|
218
|
+
- **communication** - Identify developers who need to coordinate
|
|
219
|
+
- **authors** - Knowledge distribution across files
|
|
220
|
+
- **entity-ownership** - Contribution breakdown per file
|
|
221
|
+
- **main-dev** - Primary contributor by lines added
|
|
222
|
+
- **main-dev-by-revs** - Primary contributor by commits
|
|
223
|
+
- **refactoring-main-dev** - Who cleans up code
|
|
224
|
+
|
|
225
|
+
### Activity & Effort
|
|
226
|
+
- **revisions** - Most frequently changed files
|
|
227
|
+
- **abs-churn** - Activity over time
|
|
228
|
+
- **author-churn** - Individual contribution patterns
|
|
229
|
+
- **entity-effort** - Effort distribution by commits
|
|
230
|
+
- **fragmentation** - Contributor spread analysis
|
|
231
|
+
|
|
232
|
+
### Overview
|
|
233
|
+
- **summary** - Repository health check
|
|
234
|
+
- **entities** - List all tracked files
|
|
235
|
+
|
|
236
|
+
For detailed command documentation, examples, and advanced options, see [REFERENCE.md](REFERENCE.md).
|
|
237
|
+
|
|
238
|
+
---
|
|
239
|
+
|
|
240
|
+
## Real-World Use Cases
|
|
241
|
+
|
|
242
|
+
### Use Case 1: Pre-Release Risk Assessment
|
|
243
|
+
|
|
244
|
+
**Problem**: You're about to ship a major release. Which files are most risky?
|
|
245
|
+
|
|
246
|
+
**Solution**:
|
|
247
|
+
```bash
|
|
248
|
+
# 1. Find high-churn files (frequently changed = higher risk)
|
|
249
|
+
code-maat-python entity-churn git.log --rows 20 > high-churn.csv
|
|
250
|
+
|
|
251
|
+
# 2. Find tightly coupled files (changes cascade)
|
|
252
|
+
code-maat-python coupling git.log --min-coupling 60 > coupling.csv
|
|
253
|
+
|
|
254
|
+
# 3. Find fragmented files (many authors = inconsistent)
|
|
255
|
+
code-maat-python fragmentation git.log --rows 20 > fragmented.csv
|
|
256
|
+
```
|
|
257
|
+
|
|
258
|
+
**Action**: Focus testing and code review on files appearing in all three reports.
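
Finding the files that appear in all three reports can be automated. The sketch below is one way to do it with the standard library. It assumes each report has an `entity` column (the coupling header shown earlier also has `coupled`); the churn and fragmentation column names in the demo data are assumptions, and the helper names are hypothetical.

```python
import csv
import io


def entities_in(report, columns=("entity",)):
    """Collect the file names found in the given columns of a CSV report."""
    names = set()
    for row in csv.DictReader(report):
        names.update(row[col] for col in columns if row.get(col))
    return names


def risky_files(churn, coupling, fragmentation):
    """Return files present in all three reports (each an open text file)."""
    return (
        entities_in(churn)
        & entities_in(coupling, columns=("entity", "coupled"))
        & entities_in(fragmentation)
    )


# Hypothetical report contents standing in for the three CSV files.
churn = io.StringIO("entity,added,deleted\nsrc/a.py,10,2\nsrc/b.py,5,1\n")
coupling = io.StringIO("entity,coupled,degree,average-revs\nsrc/a.py,src/c.py,80,12\n")
fragmentation = io.StringIO("entity,fractal-value\nsrc/a.py,0.7\nsrc/c.py,0.5\n")

overlap = risky_files(churn, coupling, fragmentation)
print(sorted(overlap))
```

With real data, the three `io.StringIO` objects would be replaced by `open("high-churn.csv")`, `open("coupling.csv")`, and `open("fragmented.csv")`.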
|
|
259
|
+
|
|
260
|
+
---
|
|
261
|
+
|
|
262
|
+
### Use Case 2: Team Communication Gaps
|
|
263
|
+
|
|
264
|
+
**Problem**: Your team keeps stepping on each other's toes, causing merge conflicts and duplicate work.
|
|
265
|
+
|
|
266
|
+
**Solution**:
|
|
267
|
+
```bash
|
|
268
|
+
# Show which developers work on the same code
|
|
269
|
+
code-maat-python communication git.log --min-shared 15
|
|
270
|
+
```
|
|
271
|
+
|
|
272
|
+
**Output**:
|
|
273
|
+
```csv
|
|
274
|
+
author,peer,shared,strength
|
|
275
|
+
Alice,Bob,23,34
|
|
276
|
+
Bob,Charlie,18,35
|
|
277
|
+
Alice,Charlie,12,27
|
|
278
|
+
```
|
|
279
|
+
|
|
280
|
+
**Action**: Alice and Bob should sync up regularly - they're working on overlapping code 34% of the time.
|
|
281
|
+
|
|
282
|
+
---
|
|
283
|
+
|
|
284
|
+
### Use Case 3: Architectural Analysis
|
|
285
|
+
|
|
286
|
+
**Problem**: You want to understand coupling at the **architecture level**, not individual files.
|
|
287
|
+
|
|
288
|
+
**Solution**: Use architectural grouping!
|
|
289
|
+
|
|
290
|
+
**Step 1**: Create `layers.txt`:
|
|
291
|
+
```
|
|
292
|
+
src/controllers => Controllers
|
|
293
|
+
src/models => Models
|
|
294
|
+
src/views => Views
|
|
295
|
+
src/utils => Utilities
|
|
296
|
+
```
|
|
297
|
+
|
|
298
|
+
**Step 2**: Analyze coupling by layer:
|
|
299
|
+
```bash
|
|
300
|
+
code-maat-python coupling git.log --group layers.txt --min-coupling 40
|
|
301
|
+
```
|
|
302
|
+
|
|
303
|
+
**Output**:
|
|
304
|
+
```csv
|
|
305
|
+
entity,coupled,degree,average-revs
|
|
306
|
+
Controllers,Models,78,234
|
|
307
|
+
Views,Controllers,65,198
|
|
308
|
+
Models,Utilities,42,156
|
|
309
|
+
```
|
|
310
|
+
|
|
311
|
+
**Insight**: Controllers and Models are tightly coupled (78%) - might indicate business logic leaking into controllers.
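
The idea behind grouping can be sketched in a few lines of Python: every file path is replaced by the name of its layer before the coupling is computed. This sketch assumes plain path-prefix matching on the `=>` format shown in `layers.txt` and assumes that files matching no prefix are ignored; the function names are hypothetical and the package's own matching rules may be richer.

```python
def load_groups(spec_text):
    """Parse 'path prefix => Layer name' lines into (prefix, name) pairs."""
    groups = []
    for line in spec_text.splitlines():
        if "=>" not in line:
            continue  # skip blank or malformed lines
        prefix, name = (part.strip() for part in line.split("=>", 1))
        groups.append((prefix, name))
    return groups


def to_layer(path, groups):
    """Return the layer for a file path, or None if no prefix matches."""
    for prefix, name in groups:
        # Match whole path components so 'src/models' does not match 'src/modelsx'.
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            return name
    return None


layers = """\
src/controllers => Controllers
src/models => Models
src/views => Views
src/utils => Utilities
"""
groups = load_groups(layers)
print(to_layer("src/models/user.py", groups))
print(to_layer("README.md", groups))
```

Once paths are mapped this way, a commit that touches `src/models/user.py` and `src/controllers/login.py` counts as a single co-change of `Models` and `Controllers`.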
|
|
312
|
+
|
|
313
|
+
---
|
|
314
|
+
|
|
315
|
+
### Use Case 4: Knowledge Transfer Planning
|
|
316
|
+
|
|
317
|
+
**Problem**: Sarah is leaving the team. Where will knowledge gaps be?
|
|
318
|
+
|
|
319
|
+
**Solution**:
|
|
320
|
+
```bash
|
|
321
|
+
# Find files where Sarah is the main developer
|
|
322
|
+
code-maat-python main-dev git.log | grep "Sarah"
|
|
323
|
+
|
|
324
|
+
# See who else has worked on those files
|
|
325
|
+
code-maat-python entity-ownership git.log > ownership.csv
|
|
326
|
+
```
|
|
327
|
+
|
|
328
|
+
**Action**: Focus knowledge transfer sessions on files where Sarah has >70% ownership and no backup developer.
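
The 70% rule can be applied programmatically. The following sketch assumes ownership is measured as an author's share of added lines per file and takes `(entity, author, added)` tuples as input; that input shape, the function name `knowledge_gaps`, and the sample rows are all hypothetical, so the columns of `ownership.csv` may need adapting.

```python
from collections import defaultdict


def knowledge_gaps(rows, author, threshold=0.7):
    """Return {entity: share} where `author` wrote more than `threshold` of added lines.

    `rows` is an iterable of (entity, author, added) tuples.
    """
    totals = defaultdict(int)     # added lines per file, all authors
    by_author = defaultdict(int)  # added lines per file, selected author only
    for entity, row_author, added in rows:
        totals[entity] += added
        if row_author == author:
            by_author[entity] += added
    return {
        entity: by_author[entity] / total
        for entity, total in totals.items()
        if total and by_author[entity] / total > threshold
    }


# Hypothetical ownership data.
rows = [
    ("billing.py", "Sarah", 90),
    ("billing.py", "Tom", 10),
    ("api.py", "Sarah", 40),
    ("api.py", "Tom", 60),
]
print(knowledge_gaps(rows, "Sarah"))
```

Files returned here are candidates for a handover session; checking whether a second contributor has a meaningful share would cover the "no backup developer" condition.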
|
|
329
|
+
|
|
330
|
+
---
|
|
331
|
+
|
|
332
|
+
## Advanced Features
|
|
333
|
+
|
|
334
|
+
### Architectural Grouping
|
|
335
|
+
|
|
336
|
+
Group files by architectural layers for high-level analysis:
|
|
337
|
+
|
|
338
|
+
```bash
|
|
339
|
+
code-maat-python coupling git.log --group layers.txt
|
|
340
|
+
```
|
|
341
|
+
|
|
342
|
+
### Team Mapping
|
|
343
|
+
|
|
344
|
+
Aggregate analysis by teams instead of individuals:
|
|
345
|
+
|
|
346
|
+
```bash
|
|
347
|
+
code-maat-python communication git.log --team-map-file teams.csv
|
|
348
|
+
```
|
|
349
|
+
|
|
350
|
+
### Output Control
|
|
351
|
+
|
|
352
|
+
```bash
|
|
353
|
+
# Limit to top N results
|
|
354
|
+
code-maat-python revisions git.log --rows 10
|
|
355
|
+
|
|
356
|
+
# Save to CSV
|
|
357
|
+
code-maat-python coupling git.log --output results.csv
|
|
358
|
+
```
|
|
359
|
+
|
|
360
|
+
For complete examples and advanced workflows, see [REFERENCE.md](REFERENCE.md).
|
|
361
|
+
|
|
362
|
+
---
|
|
363
|
+
|
|
364
|
+
## Understanding the Git Log Format
|
|
365
|
+
|
|
366
|
+
code-maat-python expects Git logs in this format:
|
|
367
|
+
|
|
368
|
+
```bash
|
|
369
|
+
git log --all -M -C --numstat --date=short --pretty=format:'--%h--%cd--%cn'
|
|
370
|
+
```
|
|
371
|
+
|
|
372
|
+
**What each flag does:**
|
|
373
|
+
- `--all`: Include all branches
|
|
374
|
+
- `-M -C`: Detect renames and copies
|
|
375
|
+
- `--numstat`: Show line changes per file
|
|
376
|
+
- `--date=short`: Use YYYY-MM-DD format
|
|
377
|
+
- `--pretty=format:'--%h--%cd--%cn'`: Commit header format (abbreviated hash--committer date--committer name)
|
|
378
|
+
|
|
379
|
+
**Sample output:**
|
|
380
|
+
```
|
|
381
|
+
--a1b2c3d--2023-06-15--Alice Smith
|
|
382
|
+
|
|
383
|
+
45 12 src/main.py
|
|
384
|
+
8 3 src/utils.py
|
|
385
|
+
--e4f5a6b--2023-06-16--Bob Jones
|
|
386
|
+
|
|
387
|
+
23 5 src/main.py
|
|
388
|
+
```
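
To show how this format maps onto structured data, here is a minimal parsing sketch in Python. It is not the package's parser: the names `HEADER` and `parse_git_log` are hypothetical, and renamed paths (such as those produced by `-M -C`) are passed through unchanged. Real `--numstat` output separates columns with tabs; splitting on any whitespace handles both that and the space-separated sample above.

```python
import re

# Header lines look like: --<hash>--<YYYY-MM-DD>--<committer name>
HEADER = re.compile(r"^--(?P<rev>\w+)--(?P<date>\d{4}-\d{2}-\d{2})--(?P<author>.+)$")


def parse_git_log(text):
    """Yield one dict per changed file: rev, date, author, entity, added, deleted."""
    current = None
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            current = match.groupdict()
            continue
        parts = line.split(None, 2)  # added, deleted, path (path may contain spaces)
        if current is None or len(parts) != 3:
            continue  # blank line or unrecognized content
        added, deleted, entity = parts
        yield {
            **current,
            "entity": entity,
            # numstat prints "-" for binary files; count those as 0 lines
            "added": int(added) if added.isdigit() else 0,
            "deleted": int(deleted) if deleted.isdigit() else 0,
        }


sample = """--a1b2c3d--2023-06-15--Alice Smith

45 12 src/main.py
8 3 src/utils.py
--e4f5a6b--2023-06-16--Bob Jones

23 5 src/main.py
"""
rows = list(parse_git_log(sample))
for row in rows:
    print(row["rev"], row["author"], row["entity"], row["added"], row["deleted"])
```

Each resulting row carries the commit metadata alongside the per-file line counts, which is the shape most of the analyses (revisions, churn, coupling) would need.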
|
|
389
|
+
|
|
390
|
+
---
|
|
391
|
+
|
|
392
|
+
## Tips & Best Practices
|
|
393
|
+
|
|
394
|
+
### DO
|
|
395
|
+
|
|
396
|
+
1. **Start with recent history**: Analyze last 3-6 months for current patterns
|
|
397
|
+
```bash
|
|
398
|
+
git log --since="3 months ago" ...
|
|
399
|
+
```
|
|
400
|
+
|
|
401
|
+
2. **Combine multiple analyses**: Cross-reference coupling + churn + fragmentation for best insights
|
|
402
|
+
|
|
403
|
+
3. **Use architectural grouping**: Get higher-level insights by grouping files into layers
|
|
404
|
+
|
|
405
|
+
4. **Filter noise**: Use `--min-coupling` and `--min-revs` to focus on significant patterns
|
|
406
|
+
|
|
407
|
+
5. **Share results**: Export to CSV and share with your team
|
|
408
|
+
|
|
409
|
+
### DON'T
|
|
410
|
+
|
|
411
|
+
1. **Don't analyze the entire history by default**: Recent patterns matter most (usually the last 6-12 months)
|
|
412
|
+
|
|
413
|
+
2. **Don't ignore context**: High coupling isn't always bad (e.g., tests + implementation)
|
|
414
|
+
|
|
415
|
+
3. **Don't make knee-jerk decisions**: Use insights to guide investigation, not as absolute truth
|
|
416
|
+
|
|
417
|
+
4. **Don't forget team context**: Talk to developers about what the data shows
|
|
418
|
+
|
|
419
|
+
---
|
|
420
|
+
|
|
421
|
+
## Common Questions
|
|
422
|
+
|
|
423
|
+
### **Q: How is this different from code complexity tools?**
|
|
424
|
+
|
|
425
|
+
**A**: Complexity tools (like SonarQube) analyze *what* the code is. code-maat-python analyzes *how* the code *changes over time*. They're complementary:
|
|
426
|
+
|
|
427
|
+
- **Complexity tools**: "This function is too long"
|
|
428
|
+
- **code-maat-python**: "These two files always change together, suggesting hidden coupling"
|
|
429
|
+
|
|
430
|
+
### **Q: Why analyze Git history instead of static code?**
|
|
431
|
+
|
|
432
|
+
**A**: Git history reveals:
|
|
433
|
+
- Hidden dependencies (files that change together but don't import each other)
|
|
434
|
+
- Team dynamics (who works on what, communication needs)
|
|
435
|
+
- Risk patterns (high churn correlates with defects)
|
|
436
|
+
- Knowledge distribution (who knows what)
|
|
437
|
+
|
|
438
|
+
Static analysis can't tell you any of this.
|
|
439
|
+
|
|
440
|
+
### **Q: How accurate are the predictions?**
|
|
441
|
+
|
|
442
|
+
**A**: Research shows:
|
|
443
|
+
- **Logical coupling** predicts 60-80% of defects ([Nagappan & Ball, 2007](https://dl.acm.org/doi/10.1145/1453101.1453106))
|
|
444
|
+
- **Code churn** is the #1 predictor of post-release defects
|
|
445
|
+
- **Communication gaps** correlate with coordination problems and bugs
|
|
446
|
+
|
|
447
|
+
This is evidence-based software engineering, not magic.
|
|
448
|
+
|
|
449
|
+
### **Q: Can I use this with other version control systems?**
|
|
450
|
+
|
|
451
|
+
**A**: Currently, only Git is supported. The log format is Git-specific.
|
|
452
|
+
|
|
453
|
+
### **Q: How big can my repository be?**
|
|
454
|
+
|
|
455
|
+
**A**: code-maat-python uses pandas for efficient processing. It handles:
|
|
456
|
+
- 10,000 commits: < 1 second
|
|
457
|
+
- 100,000 commits: < 10 seconds
|
|
458
|
+
- 1,000,000 commits: < 2 minutes
|
|
459
|
+
|
|
460
|
+
If you have a huge monorepo, filter by date or subdirectory.
|
|
461
|
+
|
|
462
|
+
---
|
|
463
|
+
|
|
464
|
+
## Further Reading & Research
|
|
465
|
+
|
|
466
|
+
### **Academic Foundation**
|
|
467
|
+
|
|
468
|
+
This tool is based on research in Mining Software Repositories (MSR):
|
|
469
|
+
|
|
470
|
+
- **Logical Coupling**: D'Ambros, M., Lanza, M., & Robbes, R. (2010). "An extensive comparison of bug prediction approaches."
|
|
471
|
+
- **Code Churn & Defects**: Meneely, A., Williams, L., Snipes, W., & Osborne, J. (2008). "Predicting failures with developer networks and social network analysis."
|
|
472
|
+
- **Communication Needs**: Cataldo, M., et al. (2006). "Identification of coordination requirements."
|
|
473
|
+
|
|
474
|
+
### **Books**
|
|
475
|
+
|
|
476
|
+
- **"Your Code as a Crime Scene"** by Adam Tornhill - Forensic analysis of code using Git history
|
|
477
|
+
- **"Software Design X-Rays"** by Adam Tornhill - Advanced techniques for analyzing codebases
|
|
478
|
+
|
|
479
|
+
### **Original Tool**
|
|
480
|
+
|
|
481
|
+
code-maat-python is a modern Python reimplementation of [Code Maat](https://github.com/adamtornhill/code-maat) by Adam Tornhill (Clojure).
|
|
482
|
+
|
|
483
|
+
**Why rewrite?**
|
|
484
|
+
- **Python ecosystem**: Easier to integrate with data science tools (pandas, matplotlib, Jupyter)
|
|
485
|
+
- **Modern CLI**: Better user experience with Click
|
|
486
|
+
- **Extensibility**: Easy to add custom analyses
|
|
487
|
+
- **Performance**: pandas is fast for data processing
|
|
488
|
+
|
|
489
|
+
---
|
|
490
|
+
|
|
491
|
+
## Contributing
|
|
492
|
+
|
|
493
|
+
We welcome contributions! See our [contributing guidelines](CONTRIBUTING.md) for details.
|
|
494
|
+
|
|
495
|
+
**Ideas for contributions:**
|
|
496
|
+
- New analysis types
|
|
497
|
+
- Visualization tools (matplotlib/plotly integration)
|
|
498
|
+
- Performance optimizations
|
|
499
|
+
- Documentation improvements
|
|
500
|
+
- Integration with CI/CD tools
|
|
501
|
+
|
|
502
|
+
---
|
|
503
|
+
|
|
504
|
+
## License
|
|
505
|
+
|
|
506
|
+
GPL-3.0 License - see [LICENSE](LICENSE) for details.
|
|
507
|
+
|
|
508
|
+
This project is inspired by and compatible with [Code Maat](https://github.com/adamtornhill/code-maat) by Adam Tornhill.
|
|
509
|
+
|
|
510
|
+
---
|
|
511
|
+
|
|
512
|
+
## Support & Community
|
|
513
|
+
|
|
514
|
+
- **Issues**: [GitHub Issues](https://github.com/hydrosquall/code-maat-python/issues)
|
|
515
|
+
- **Discussions**: [GitHub Discussions](https://github.com/hydrosquall/code-maat-python/discussions)
|
|
516
|
+
- **Repository**: [github.com/hydrosquall/code-maat-python](https://github.com/hydrosquall/code-maat-python)
|
|
517
|
+
|
|
518
|
+
---
|
|
519
|
+
|
|
520
|
+
## Quick Reference: All Commands
|
|
521
|
+
|
|
522
|
+
| Command | Purpose | Use Case |
|
|
523
|
+
|---------|---------|----------|
|
|
524
|
+
| `coupling` | Files that change together | Find hidden dependencies |
|
|
525
|
+
| `soc` | Sum of coupling | Quick hotspot detection |
|
|
526
|
+
| `entity-churn` | Code change frequency | Identify risky files |
|
|
527
|
+
| `age` | Time since last change | Find stale/stable code |
|
|
528
|
+
| `communication` | Developer collaboration needs | Improve team coordination |
|
|
529
|
+
| `authors` | Knowledge distribution | Find silos |
|
|
530
|
+
| `entity-ownership` | Contribution breakdown | Assign code owners |
|
|
531
|
+
| `main-dev` | Primary contributor (lines) | Find experts |
|
|
532
|
+
| `main-dev-by-revs` | Primary contributor (commits) | Find maintainers |
|
|
533
|
+
| `refactoring-main-dev` | Refactoring effort | Identify quality champions |
|
|
534
|
+
| `revisions` | Most changed files | Hotspot analysis |
|
|
535
|
+
| `abs-churn` | Activity over time | Track development phases |
|
|
536
|
+
| `author-churn` | Individual contributions | Review effort distribution |
|
|
537
|
+
| `entity-effort` | Effort by commits | Understand work distribution |
|
|
538
|
+
| `fragmentation` | Contributor spread | Find coordination issues |
|
|
539
|
+
| `summary` | Repository overview | Health check |
|
|
540
|
+
| `entities` | All files list | Scope understanding |
|
|
541
|
+
|
|
542
|
+
---
|
|
543
|
+
|
|
544
|
+
**Made by developers, for developers. Now go discover what your Git history has been trying to tell you!**
|
|
545
|
+
|
|
@@ -0,0 +1,28 @@
|
|
|
1
|
+
code_maat_python/__init__.py,sha256=J0ZQHXSrSBKFYGHCUt0pW38Cnrdo-H-PR2J67G0vMEs,266
|
|
2
|
+
code_maat_python/__main__.py,sha256=sAO9bTFAGk_T0o5fbwlg4ECsAGHDpd_7dYFcpjRNcn4,158
|
|
3
|
+
code_maat_python/analyses/__init__.py,sha256=R6raQtauygioo_1kA2M1wBu4b7fLJaom6ule0XBONpw,1032
|
|
4
|
+
code_maat_python/analyses/age.py,sha256=xYnOMmadEPew5_R_LS5q7jBFd-RUP931GOFcahcMufs,3711
|
|
5
|
+
code_maat_python/analyses/authors.py,sha256=jn-kMQqqDVU5acIxxUVa7jKfanJf1qTf8me35gDGuP8,1920
|
|
6
|
+
code_maat_python/analyses/churn.py,sha256=pwhAnbCQGtrVxgi0nR7h6UAhA7I8BcjV0jIvFTOmfcA,12062
|
|
7
|
+
code_maat_python/analyses/communication.py,sha256=tf5EsKTF2s3HhDCHFNdue9lp7bJ1uxdR8TF3NstVMIg,6195
|
|
8
|
+
code_maat_python/analyses/coupling.py,sha256=MBnxEDuXYWRzNnvl0Obh5YRVjyR4Y0FDtSwGnHs62W8,5478
|
|
9
|
+
code_maat_python/analyses/effort.py,sha256=VjUzn4guZnCs4HTz_ZAsikDO2H3BbsKVRP_IOJ-y23o,7573
|
|
10
|
+
code_maat_python/analyses/entities.py,sha256=LL_W2C0BPxUDLwpawn9tzDRHKttRA3_Uq-k1BW2kc_k,1587
|
|
11
|
+
code_maat_python/analyses/revisions.py,sha256=qvkZERseKzxkGKkfp3F6KQF9HBaBNnYZpJrgI0BID38,1909
|
|
12
|
+
code_maat_python/analyses/soc.py,sha256=0JVqNbmVCVVyLMXTh-UjKySorCavkKd8a-P0JJ_G0wk,3444
|
|
13
|
+
code_maat_python/analyses/summary.py,sha256=Fb8aV-PjG-WAgurnVR-tmdqa3xuMhOBcA1_Se3N8Jpg,1976
|
|
14
|
+
code_maat_python/cli.py,sha256=8qW7oQHZ7pyiTQNPuH3x26g_6LXwLN8y-pMWgdjJoYA,23753
|
|
15
|
+
code_maat_python/output/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
|
|
16
|
+
code_maat_python/parser.py,sha256=GWOByYiQHnCDKcknug2Ce30j89vI-Q2JkTRSzUTUCmA,6910
|
|
17
|
+
code_maat_python/pipeline.py,sha256=et5v11gVOwU_rGCNbNj0RlT0Wdwm12Xv44nhr_Z3KjA,2927
|
|
18
|
+
code_maat_python/transformers/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
|
|
19
|
+
code_maat_python/transformers/grouper.py,sha256=SwLPm-CaN8DfGfZbQhV_nRgS7YSlilk7tQzWIS1wkss,6549
|
|
20
|
+
code_maat_python/transformers/team_mapper.py,sha256=BkdVH1iH6vlZxoTXidFaU0Qm515U-z5pGSiiEyOodYs,4178
|
|
21
|
+
code_maat_python/transformers/time_grouper.py,sha256=E-p_kFuNQQiuU-NxuRLMSdJgdt7FyvH_10oqI1CT_Y0,6010
|
|
22
|
+
code_maat_python/utils/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
|
|
23
|
+
code_maat_python/utils/math.py,sha256=tWA6XcTePViqjxbyJnQqY28ieUcW4xRJ19iDtUdu-I0,2488
|
|
24
|
+
code_maat_python-0.1.0.dist-info/METADATA,sha256=srWtx7KzH-1P0cxe1-3QqT2Exm6UD_nT8yvVOMYzOKo,17020
|
|
25
|
+
code_maat_python-0.1.0.dist-info/WHEEL,sha256=kJCRJT_g0adfAJzTx2GUMmS80rTJIVHRCfG0DQgLq3o,88
|
|
26
|
+
code_maat_python-0.1.0.dist-info/entry_points.txt,sha256=S0RxeRIXvp2c3_IlgP1a0_Q30d-YA2ZW64fN_iRI8q0,62
|
|
27
|
+
code_maat_python-0.1.0.dist-info/licenses/LICENSE,sha256=OXLcl0T2SZ8Pmy2_dmlvKuetivmyPd5m1q-Gyd-zaYY,35149
|
|
28
|
+
code_maat_python-0.1.0.dist-info/RECORD,,
|