better-git-of-theseus 0.4.0__tar.gz → 0.4.2__tar.gz
- better_git_of_theseus-0.4.2/PKG-INFO +96 -0
- better_git_of_theseus-0.4.2/README.md +69 -0
- better_git_of_theseus-0.4.2/better_git_of_theseus.egg-info/PKG-INFO +96 -0
- better_git_of_theseus-0.4.2/better_git_of_theseus.egg-info/entry_points.txt +2 -0
- {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.2}/git_of_theseus/app.py +3 -3
- {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.2}/setup.py +2 -6
- better_git_of_theseus-0.4.0/PKG-INFO +0 -122
- better_git_of_theseus-0.4.0/README.md +0 -95
- better_git_of_theseus-0.4.0/better_git_of_theseus.egg-info/PKG-INFO +0 -122
- better_git_of_theseus-0.4.0/better_git_of_theseus.egg-info/entry_points.txt +0 -6
- {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.2}/LICENSE +0 -0
- {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.2}/better_git_of_theseus.egg-info/SOURCES.txt +0 -0
- {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.2}/better_git_of_theseus.egg-info/dependency_links.txt +0 -0
- {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.2}/better_git_of_theseus.egg-info/requires.txt +0 -0
- {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.2}/better_git_of_theseus.egg-info/top_level.txt +0 -0
- {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.2}/git_of_theseus/__init__.py +0 -0
- {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.2}/git_of_theseus/analyze.py +0 -0
- {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.2}/git_of_theseus/cmd.py +0 -0
- {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.2}/git_of_theseus/line_plot.py +0 -0
- {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.2}/git_of_theseus/plotly_plots.py +0 -0
- {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.2}/git_of_theseus/stack_plot.py +0 -0
- {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.2}/git_of_theseus/survival_plot.py +0 -0
- {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.2}/git_of_theseus/utils.py +0 -0
- {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.2}/setup.cfg +0 -0
|
@@ -0,0 +1,96 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: better-git-of-theseus
|
|
3
|
+
Version: 0.4.2
|
|
4
|
+
Summary: Plot stats on Git repositories with interactive Plotly charts
|
|
5
|
+
Home-page: https://github.com/onewesong/better-git-of-theseus
|
|
6
|
+
Author: Erik Bernhardsson
|
|
7
|
+
Author-email: mail@erikbern.com
|
|
8
|
+
Description-Content-Type: text/markdown
|
|
9
|
+
License-File: LICENSE
|
|
10
|
+
Requires-Dist: gitpython
|
|
11
|
+
Requires-Dist: numpy
|
|
12
|
+
Requires-Dist: tqdm
|
|
13
|
+
Requires-Dist: wcmatch
|
|
14
|
+
Requires-Dist: pygments
|
|
15
|
+
Requires-Dist: plotly
|
|
16
|
+
Requires-Dist: streamlit
|
|
17
|
+
Requires-Dist: python-dateutil
|
|
18
|
+
Requires-Dist: scipy
|
|
19
|
+
Dynamic: author
|
|
20
|
+
Dynamic: author-email
|
|
21
|
+
Dynamic: description
|
|
22
|
+
Dynamic: description-content-type
|
|
23
|
+
Dynamic: home-page
|
|
24
|
+
Dynamic: license-file
|
|
25
|
+
Dynamic: requires-dist
|
|
26
|
+
Dynamic: summary
|
|
27
|
+
|
|
28
|
+
<div align="center">
|
|
29
|
+
|
|
30
|
+
# Better Git of Theseus
|
|
31
|
+
|
|
32
|
+
[](https://pypi.python.org/pypi/better-git-of-theseus)
|
|
33
|
+
[](https://pypi.org/project/better-git-of-theseus/)
|
|
34
|
+
[](https://pypi.org/project/better-git-of-theseus/)
|
|
35
|
+
[](https://github.com/onewesong/better-git-of-theseus/blob/master/LICENSE)
|
|
36
|
+
[](https://deepwiki.com/onewesong/better-git-of-theseus)
|
|
37
|
+
|
|
38
|
+
[中文版](README_zh.md)
|
|
39
|
+
|
|
40
|
+
</div>
|
|
41
|
+
|
|
42
|
+
**Better Git of Theseus** is a modern refactor of the original [git-of-theseus](https://github.com/erikbern/git-of-theseus). It provides a fully interactive Web Dashboard powered by **Streamlit** and **Plotly**, making it easier than ever to visualize how your code evolves over time.
|
|
43
|
+
|
|
44
|
+
 *(Note: Charts are now fully interactive!)*
|
|
45
|
+
|
|
46
|
+
## Key Enhancements
|
|
47
|
+
|
|
48
|
+
- 🚀 **One-Click Visualization**: New `better-git-of-theseus` command automatically scans your project and launches a Web UI.
|
|
49
|
+
- 📊 **Interactive Charts**: Replaced static Matplotlib plots with Plotly. Support for zooming, panning, and detailed data hovers.
|
|
50
|
+
- 🧠 **In-Memory Processing**: Data flows directly in memory. No more mandatory intermediate `.json` files cluttering your repo.
|
|
51
|
+
- ⚡ **Smart Caching**: Leverages Streamlit's caching to make repeat analysis of large repos nearly instantaneous.
|
|
52
|
+
- 🎨 **Modern UI**: Adjust parameters (Cohort format, ignore rules, normalization, etc.) in real-time via the sidebar.
|
|
53
|
+
|
|
54
|
+
## Installation
|
|
55
|
+
|
|
56
|
+
Install via pip:
|
|
57
|
+
|
|
58
|
+
```bash
|
|
59
|
+
pip install better-git-of-theseus
|
|
60
|
+
```
|
|
61
|
+
|
|
62
|
+
## Quick Start
|
|
63
|
+
|
|
64
|
+
Run the following in any Git repository:
|
|
65
|
+
|
|
66
|
+
```bash
|
|
67
|
+
better-git-of-theseus
|
|
68
|
+
```
|
|
69
|
+
|
|
70
|
+
It will automatically open your browser to the interactive dashboard.
|
|
71
|
+
|
|
72
|
+
## Feature Highlights
|
|
73
|
+
|
|
74
|
+
### Cohort Formatting
|
|
75
|
+
|
|
76
|
+
Customize how commits are grouped by year, month, or week (based on Python strftime):
|
|
77
|
+
- `%Y`: Group by **Year** (Default)
|
|
78
|
+
- `%Y-%m`: Group by **Month**
|
|
79
|
+
- `%Y-W%W`: Group by **Week**
|
|
80
|
+
|
|
81
|
+
### Real-time Parameters
|
|
82
|
+
|
|
83
|
+
Adjust parameters like "Max Series", "Normalization", and "Exponential Fit" directly in the Web UI without re-running any commands.
|
|
84
|
+
|
|
85
|
+
## FAQ
|
|
86
|
+
|
|
87
|
+
- **Duplicate Authors?** Configure a [.mailmap](https://git-scm.com/docs/gitmailmap) file in your repo root to merge identities.
|
|
88
|
+
- **Performance?** First-time analysis of very large repos (like the Linux Kernel) may take time, but subsequent views are extremely fast due to caching.
|
|
89
|
+
|
|
90
|
+
## Credits
|
|
91
|
+
|
|
92
|
+
Special thanks to [Erik Bernhardsson](https://github.com/erikbern) for creating the original `git-of-theseus`.
|
|
93
|
+
|
|
94
|
+
## License
|
|
95
|
+
|
|
96
|
+
MIT
|
|
@@ -0,0 +1,69 @@
|
|
|
1
|
+
<div align="center">
|
|
2
|
+
|
|
3
|
+
# Better Git of Theseus
|
|
4
|
+
|
|
5
|
+
[](https://pypi.python.org/pypi/better-git-of-theseus)
|
|
6
|
+
[](https://pypi.org/project/better-git-of-theseus/)
|
|
7
|
+
[](https://pypi.org/project/better-git-of-theseus/)
|
|
8
|
+
[](https://github.com/onewesong/better-git-of-theseus/blob/master/LICENSE)
|
|
9
|
+
[](https://deepwiki.com/onewesong/better-git-of-theseus)
|
|
10
|
+
|
|
11
|
+
[中文版](README_zh.md)
|
|
12
|
+
|
|
13
|
+
</div>
|
|
14
|
+
|
|
15
|
+
**Better Git of Theseus** is a modern refactor of the original [git-of-theseus](https://github.com/erikbern/git-of-theseus). It provides a fully interactive Web Dashboard powered by **Streamlit** and **Plotly**, making it easier than ever to visualize how your code evolves over time.
|
|
16
|
+
|
|
17
|
+
 *(Note: Charts are now fully interactive!)*
|
|
18
|
+
|
|
19
|
+
## Key Enhancements
|
|
20
|
+
|
|
21
|
+
- 🚀 **One-Click Visualization**: New `better-git-of-theseus` command automatically scans your project and launches a Web UI.
|
|
22
|
+
- 📊 **Interactive Charts**: Replaced static Matplotlib plots with Plotly. Support for zooming, panning, and detailed data hovers.
|
|
23
|
+
- 🧠 **In-Memory Processing**: Data flows directly in memory. No more mandatory intermediate `.json` files cluttering your repo.
|
|
24
|
+
- ⚡ **Smart Caching**: Leverages Streamlit's caching to make repeat analysis of large repos nearly instantaneous.
|
|
25
|
+
- 🎨 **Modern UI**: Adjust parameters (Cohort format, ignore rules, normalization, etc.) in real-time via the sidebar.
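
The caching point above can be pictured with a standard-library analogy. This is only a sketch of memoization keyed on the analysis parameters: it does not use Streamlit, and the `analyze` function and its arguments are hypothetical stand-ins rather than the package's real API.

```python
from functools import lru_cache

# Records every time the (pretend) expensive analysis actually runs.
calls = []


@lru_cache(maxsize=None)
def analyze(repo_path, cohort_format="%Y"):
    """Hypothetical stand-in for a slow repository scan."""
    calls.append((repo_path, cohort_format))
    return {"repo": repo_path, "cohort_format": cohort_format}


first = analyze("/path/to/repo", "%Y")     # runs the scan
second = analyze("/path/to/repo", "%Y")    # same arguments: served from the cache
third = analyze("/path/to/repo", "%Y-%m")  # new arguments: runs the scan again

print(len(calls))
```

Repeating a call with identical parameters returns the stored result, while changing a parameter such as the cohort format triggers a fresh computation.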
|
|
26
|
+
|
|
27
|
+
## Installation
|
|
28
|
+
|
|
29
|
+
Install via pip:
|
|
30
|
+
|
|
31
|
+
```bash
|
|
32
|
+
pip install better-git-of-theseus
|
|
33
|
+
```
|
|
34
|
+
|
|
35
|
+
## Quick Start
|
|
36
|
+
|
|
37
|
+
Run the following in any Git repository:
|
|
38
|
+
|
|
39
|
+
```bash
|
|
40
|
+
better-git-of-theseus
|
|
41
|
+
```
|
|
42
|
+
|
|
43
|
+
It will automatically open your browser to the interactive dashboard.
|
|
44
|
+
|
|
45
|
+
## Feature Highlights
|
|
46
|
+
|
|
47
|
+
### Cohort Formatting
|
|
48
|
+
|
|
49
|
+
Customize how commits are grouped by year, month, or week (based on Python strftime):
|
|
50
|
+
- `%Y`: Group by **Year** (Default)
|
|
51
|
+
- `%Y-%m`: Group by **Month**
|
|
52
|
+
- `%Y-W%W`: Group by **Week**
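
As a rough illustration of how these format strings bucket dates, here is a sketch that uses only the Python standard library. It is not the package's own code: `cohort_label` is a hypothetical helper and the commit dates are made up.

```python
from collections import Counter
from datetime import datetime


def cohort_label(commit_time, cohort_format="%Y"):
    """Return the cohort a commit falls into for a given strftime format."""
    return commit_time.strftime(cohort_format)


# Made-up commit dates, used only for illustration.
commit_times = [
    datetime(2023, 12, 31),
    datetime(2024, 1, 1),
    datetime(2024, 3, 15),
]

for fmt in ("%Y", "%Y-%m", "%Y-W%W"):
    counts = Counter(cohort_label(t, fmt) for t in commit_times)
    print(fmt, dict(counts))
```

Note that `%W` counts weeks starting on Monday, and days before the first Monday of a year fall into week `00`.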
|
|
53
|
+
|
|
54
|
+
### Real-time Parameters
|
|
55
|
+
|
|
56
|
+
Adjust parameters like "Max Series", "Normalization", and "Exponential Fit" directly in the Web UI without re-running any commands.
|
|
57
|
+
|
|
58
|
+
## FAQ
|
|
59
|
+
|
|
60
|
+
- **Duplicate Authors?** Configure a [.mailmap](https://git-scm.com/docs/gitmailmap) file in your repo root to merge identities.
|
|
61
|
+
- **Performance?** First-time analysis of very large repos (like the Linux Kernel) may take time, but subsequent views are extremely fast due to caching.
|
|
62
|
+
|
|
63
|
+
## Credits
|
|
64
|
+
|
|
65
|
+
Special thanks to [Erik Bernhardsson](https://github.com/erikbern) for creating the original `git-of-theseus`.
|
|
66
|
+
|
|
67
|
+
## License
|
|
68
|
+
|
|
69
|
+
MIT
|
|
@@ -0,0 +1,96 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: better-git-of-theseus
|
|
3
|
+
Version: 0.4.2
|
|
4
|
+
Summary: Plot stats on Git repositories with interactive Plotly charts
|
|
5
|
+
Home-page: https://github.com/onewesong/better-git-of-theseus
|
|
6
|
+
Author: Erik Bernhardsson
|
|
7
|
+
Author-email: mail@erikbern.com
|
|
8
|
+
Description-Content-Type: text/markdown
|
|
9
|
+
License-File: LICENSE
|
|
10
|
+
Requires-Dist: gitpython
|
|
11
|
+
Requires-Dist: numpy
|
|
12
|
+
Requires-Dist: tqdm
|
|
13
|
+
Requires-Dist: wcmatch
|
|
14
|
+
Requires-Dist: pygments
|
|
15
|
+
Requires-Dist: plotly
|
|
16
|
+
Requires-Dist: streamlit
|
|
17
|
+
Requires-Dist: python-dateutil
|
|
18
|
+
Requires-Dist: scipy
|
|
19
|
+
Dynamic: author
|
|
20
|
+
Dynamic: author-email
|
|
21
|
+
Dynamic: description
|
|
22
|
+
Dynamic: description-content-type
|
|
23
|
+
Dynamic: home-page
|
|
24
|
+
Dynamic: license-file
|
|
25
|
+
Dynamic: requires-dist
|
|
26
|
+
Dynamic: summary
|
|
27
|
+
|
|
28
|
+
<div align="center">
|
|
29
|
+
|
|
30
|
+
# Better Git of Theseus
|
|
31
|
+
|
|
32
|
+
[](https://pypi.python.org/pypi/better-git-of-theseus)
|
|
33
|
+
[](https://pypi.org/project/better-git-of-theseus/)
|
|
34
|
+
[](https://pypi.org/project/better-git-of-theseus/)
|
|
35
|
+
[](https://github.com/onewesong/better-git-of-theseus/blob/master/LICENSE)
|
|
36
|
+
[](https://deepwiki.com/onewesong/better-git-of-theseus)
|
|
37
|
+
|
|
38
|
+
[中文版](README_zh.md)
|
|
39
|
+
|
|
40
|
+
</div>
|
|
41
|
+
|
|
42
|
+
**Better Git of Theseus** is a modern refactor of the original [git-of-theseus](https://github.com/erikbern/git-of-theseus). It provides a fully interactive Web Dashboard powered by **Streamlit** and **Plotly**, making it easier than ever to visualize how your code evolves over time.
|
|
43
|
+
|
|
44
|
+
 *(Note: Charts are now fully interactive!)*
|
|
45
|
+
|
|
46
|
+
## Key Enhancements
|
|
47
|
+
|
|
48
|
+
- 🚀 **One-Click Visualization**: New `better-git-of-theseus` command automatically scans your project and launches a Web UI.
|
|
49
|
+
- 📊 **Interactive Charts**: Replaced static Matplotlib plots with Plotly. Support for zooming, panning, and detailed data hovers.
|
|
50
|
+
- 🧠 **In-Memory Processing**: Data flows directly in memory. No more mandatory intermediate `.json` files cluttering your repo.
|
|
51
|
+
- ⚡ **Smart Caching**: Leverages Streamlit's caching to make repeat analysis of large repos nearly instantaneous.
|
|
52
|
+
- 🎨 **Modern UI**: Adjust parameters (Cohort format, ignore rules, normalization, etc.) in real-time via the sidebar.
|
|
53
|
+
|
|
54
|
+
## Installation
|
|
55
|
+
|
|
56
|
+
Install via pip:
|
|
57
|
+
|
|
58
|
+
```bash
|
|
59
|
+
pip install better-git-of-theseus
|
|
60
|
+
```
|
|
61
|
+
|
|
62
|
+
## Quick Start
|
|
63
|
+
|
|
64
|
+
Run the following in any Git repository:
|
|
65
|
+
|
|
66
|
+
```bash
|
|
67
|
+
better-git-of-theseus
|
|
68
|
+
```
|
|
69
|
+
|
|
70
|
+
It will automatically open your browser to the interactive dashboard.
|
|
71
|
+
|
|
72
|
+
## Feature Highlights
|
|
73
|
+
|
|
74
|
+
### Cohort Formatting
|
|
75
|
+
|
|
76
|
+
Customize how commits are grouped by year, month, or week (based on Python strftime):
|
|
77
|
+
- `%Y`: Group by **Year** (Default)
|
|
78
|
+
- `%Y-%m`: Group by **Month**
|
|
79
|
+
- `%Y-W%W`: Group by **Week**
|
|
80
|
+
|
|
81
|
+
### Real-time Parameters
|
|
82
|
+
|
|
83
|
+
Adjust parameters like "Max Series", "Normalization", and "Exponential Fit" directly in the Web UI without re-running any commands.
|
|
84
|
+
|
|
85
|
+
## FAQ
|
|
86
|
+
|
|
87
|
+
- **Duplicate Authors?** Configure a [.mailmap](https://git-scm.com/docs/gitmailmap) file in your repo root to merge identities.
|
|
88
|
+
- **Performance?** First-time analysis of very large repos (like the Linux Kernel) may take time, but subsequent views are extremely fast due to caching.
|
|
89
|
+
|
|
90
|
+
## Credits
|
|
91
|
+
|
|
92
|
+
Special thanks to [Erik Bernhardsson](https://github.com/erikbern) for creating the original `git-of-theseus`.
|
|
93
|
+
|
|
94
|
+
## License
|
|
95
|
+
|
|
96
|
+
MIT
|
|
{better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.2}/git_of_theseus/app.py
@@ -93,7 +93,7 @@ if st.session_state.analysis_results:
 data = results.get(data_key)
 if data:
     fig = plotly_stack_plot(data, normalize=normalize, max_n=max_n, title=project_name)
-    st.plotly_chart(fig,
+    st.plotly_chart(fig, width="stretch")
 else:
     st.warning(f"Data for {data_source_label} not found.")

@@ -110,7 +110,7 @@ if st.session_state.analysis_results:
 data_line = results.get(data_key_line)
 if data_line:
     fig = plotly_line_plot(data_line, normalize=normalize_line, max_n=max_n_line, title=project_name)
-    st.plotly_chart(fig,
+    st.plotly_chart(fig, width="stretch")
 else:
     st.warning(f"Data for {data_source_label_line} not found.")

@@ -125,7 +125,7 @@ if st.session_state.analysis_results:
 survival_data = results.get("survival")
 if survival_data:
     fig = plotly_survival_plot(survival_data, exp_fit=exp_fit, years=years, title=project_name)
-    st.plotly_chart(fig,
+    st.plotly_chart(fig, width="stretch")
 else:
     st.warning("Survival data not found.")

{better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.2}/setup.py
@@ -5,7 +5,7 @@ with open("README.md", "r", encoding="utf-8") as fh:

 setup(
     name="better-git-of-theseus",
-    version="0.4.
+    version="0.4.2",
     description="Plot stats on Git repositories with interactive Plotly charts",
     long_description=long_description,
     long_description_content_type="text/markdown",
@@ -27,11 +27,7 @@ setup(
     ],
     entry_points={
         "console_scripts": [
-            "git-of-theseus
-            "git-of-theseus-survival-plot=git_of_theseus:survival_plot_cmdline",
-            "git-of-theseus-stack-plot=git_of_theseus:stack_plot_cmdline",
-            "git-of-theseus-line-plot=git_of_theseus:line_plot_cmdline",
-            "git-of-theseus-visualize=git_of_theseus.cmd:main",
+            "better-git-of-theseus=git_of_theseus.cmd:main",
         ]
     },
 )
|
|
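
The `console_scripts` entries in the `setup.py` hunk follow the `name=module:attribute` form, where the name becomes the installed command and the target is the callable it invokes. A minimal sketch of how such a spec splits into its parts is below. The helper `parse_console_script` is hypothetical; it is not part of this package or of setuptools.

```python
def parse_console_script(spec):
    """Split 'name=module:attr' into (command name, module path, attribute)."""
    name, sep, target = spec.partition("=")
    if not sep:
        raise ValueError(f"missing '=' in entry point spec: {spec!r}")
    module, _, attr = target.strip().partition(":")
    return name.strip(), module.strip(), attr.strip()


# The single entry point that remains in 0.4.2, copied from the hunk.
command, module, attr = parse_console_script(
    "better-git-of-theseus=git_of_theseus.cmd:main"
)
print(command, module, attr)
```

Read this way, the change replaces the separate `git-of-theseus-*` commands with one `better-git-of-theseus` command that points at the same `git_of_theseus.cmd:main` target the old `git-of-theseus-visualize` command used.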
@@ -1,122 +0,0 @@
|
|
|
1
|
-
Metadata-Version: 2.4
|
|
2
|
-
Name: better-git-of-theseus
|
|
3
|
-
Version: 0.4.0
|
|
4
|
-
Summary: Plot stats on Git repositories with interactive Plotly charts
|
|
5
|
-
Home-page: https://github.com/onewesong/better-git-of-theseus
|
|
6
|
-
Author: Erik Bernhardsson
|
|
7
|
-
Author-email: mail@erikbern.com
|
|
8
|
-
Description-Content-Type: text/markdown
|
|
9
|
-
License-File: LICENSE
|
|
10
|
-
Requires-Dist: gitpython
|
|
11
|
-
Requires-Dist: numpy
|
|
12
|
-
Requires-Dist: tqdm
|
|
13
|
-
Requires-Dist: wcmatch
|
|
14
|
-
Requires-Dist: pygments
|
|
15
|
-
Requires-Dist: plotly
|
|
16
|
-
Requires-Dist: streamlit
|
|
17
|
-
Requires-Dist: python-dateutil
|
|
18
|
-
Requires-Dist: scipy
|
|
19
|
-
Dynamic: author
|
|
20
|
-
Dynamic: author-email
|
|
21
|
-
Dynamic: description
|
|
22
|
-
Dynamic: description-content-type
|
|
23
|
-
Dynamic: home-page
|
|
24
|
-
Dynamic: license-file
|
|
25
|
-
Dynamic: requires-dist
|
|
26
|
-
Dynamic: summary
|
|
27
|
-
|
|
28
|
-
[](https://pypi.python.org/pypi/git-of-theseus)
|
|
29
|
-
|
|
30
|
-
Some scripts to analyze Git repos. Produces cool looking graphs like this (running it on [git](https://github.com/git/git) itself):
|
|
31
|
-
|
|
32
|
-

|
|
33
|
-
|
|
34
|
-
Installing
|
|
35
|
-
----------
|
|
36
|
-
|
|
37
|
-
Run `pip install git-of-theseus`
|
|
38
|
-
|
|
39
|
-
Running
|
|
40
|
-
-------
|
|
41
|
-
|
|
42
|
-
First, you need to run `git-of-theseus-analyze <path to repo>` (see `git-of-theseus-analyze --help` for a bunch of config). This will analyze a repository and might take quite some time.
|
|
43
|
-
|
|
44
|
-
After that, you can generate plots! Some examples:
|
|
45
|
-
|
|
46
|
-
1. Run `git-of-theseus-stack-plot cohorts.json` will create a stack plot showing the total amount of code broken down into cohorts (what year the code was added)
|
|
47
|
-
1. Run `git-of-theseus-line-plot authors.json --normalize` will show a plot of the % of code contributed by the top 20 authors
|
|
48
|
-
1. Run `git-of-theseus-survival-plot survival.json`
|
|
49
|
-
|
|
50
|
-
You can run `--help` to see various options.
|
|
51
|
-
|
|
52
|
-
If you want to plot multiple repositories, have to run `git-of-theseus-analyze` separately for each project and store the data in separate directories using the `--outdir` flag. Then you can run `git-of-theseus-survival-plot <foo/survival.json> <bar/survival.json>` (optionally with the `--exp-fit` flag to fit an exponential decay)
|
|
53
|
-
|
|
54
|
-
Help
|
|
55
|
-
----
|
|
56
|
-
|
|
57
|
-
`AttributeError: Unknown property labels` – upgrade matplotlib if you are seeing this. `pip install matplotlib --upgrade`
|
|
58
|
-
|
|
59
|
-
Some pics
|
|
60
|
-
---------
|
|
61
|
-
|
|
62
|
-
Survival of a line of code in a set of interesting repos:
|
|
63
|
-
|
|
64
|
-

|
|
65
|
-
|
|
66
|
-
This curve is produced by the `git-of-theseus-survival-plot` script and shows the *percentage of lines in a commit that are still present after x years*. It aggregates it over all commits, no matter what point in time they were made. So for *x=0* it includes all commits, whereas for *x>0* not all commits are counted (because we would have to look into the future for some of them). The survival curves are estimated using [Kaplan-Meier](https://en.wikipedia.org/wiki/Kaplan%E2%80%93Meier_estimator).
|
|
67
|
-
|
|
68
|
-
You can also add an exponential fit:
|
|
69
|
-
|
|
70
|
-

|
|
71
|
-
|
|
72
|
-
Linux – stack plot:
|
|
73
|
-
|
|
74
|
-

|
|
75
|
-
|
|
76
|
-
This curve is produced by the `git-of-theseus-stack-plot` script and shows the total number of lines in a repo broken down into cohorts by the year the code was added.
|
|
77
|
-
|
|
78
|
-
Node – stack plot:
|
|
79
|
-
|
|
80
|
-

|
|
81
|
-
|
|
82
|
-
Rails – stack plot:
|
|
83
|
-
|
|
84
|
-

|
|
85
|
-
|
|
86
|
-
Tensorflow – stack plot:
|
|
87
|
-
|
|
88
|
-

|
|
89
|
-
|
|
90
|
-
Rust – stack plot:
|
|
91
|
-
|
|
92
|
-

|
|
93
|
-
|
|
94
|
-
Plotting other stuff
|
|
95
|
-
--------------------
|
|
96
|
-
|
|
97
|
-
`git-of-theseus-analyze` will write `exts.json`, `cohorts.json` and `authors.json`. You can run `git-of-theseus-stack-plot authors.json` to plot author statistics as well, or `git-of-theseus-stack-plot exts.json` to plot file extension statistics. For author statistics, you might want to create a [.mailmap](https://git-scm.com/docs/gitmailmap) file in the root directory of the repository to deduplicate authors. If you need to create a .mailmap file the following command can list the distinct author-email combinations in a repository:
|
|
98
|
-
|
|
99
|
-
Mac / Linux
|
|
100
|
-
|
|
101
|
-
```shell
|
|
102
|
-
git log --pretty=format:"%an %ae" | sort | uniq
|
|
103
|
-
```
|
|
104
|
-
|
|
105
|
-
Windows Powershell
|
|
106
|
-
|
|
107
|
-
```powershell
|
|
108
|
-
git log --pretty=format:"%an %ae" | Sort-Object | Select-Object -Unique
|
|
109
|
-
```
|
|
110
|
-
|
|
111
|
-
For instance, here's the author statistics for [Kubernetes](https://github.com/kubernetes/kubernetes):
|
|
112
|
-
|
|
113
|
-

|
|
114
|
-
|
|
115
|
-
You can also normalize it to 100%. Here's author statistics for Git:
|
|
116
|
-
|
|
117
|
-

|
|
118
|
-
|
|
119
|
-
Other stuff
|
|
120
|
-
-----------
|
|
121
|
-
|
|
122
|
-
[Markovtsev Vadim](https://twitter.com/tmarkhor) implemented a very similar analysis that claims to be 20%-6x faster than Git of Theseus. It's named [Hercules](https://github.com/src-d/hercules) and there's a great [blog post](https://web.archive.org/web/20180918135417/https://blog.sourced.tech/post/hercules.v4/) about all the complexity going into the analysis of Git history.
|
|
@@ -1,95 +0,0 @@
|
|
|
1
|
-
[](https://pypi.python.org/pypi/git-of-theseus)
|
|
2
|
-
|
|
3
|
-
Some scripts to analyze Git repos. Produces cool looking graphs like this (running it on [git](https://github.com/git/git) itself):
|
|
4
|
-
|
|
5
|
-

|
|
6
|
-
|
|
7
|
-
Installing
|
|
8
|
-
----------
|
|
9
|
-
|
|
10
|
-
Run `pip install git-of-theseus`
|
|
11
|
-
|
|
12
|
-
Running
|
|
13
|
-
-------
|
|
14
|
-
|
|
15
|
-
First, you need to run `git-of-theseus-analyze <path to repo>` (see `git-of-theseus-analyze --help` for a bunch of config). This will analyze a repository and might take quite some time.
|
|
16
|
-
|
|
17
|
-
After that, you can generate plots! Some examples:
|
|
18
|
-
|
|
19
|
-
1. Run `git-of-theseus-stack-plot cohorts.json` will create a stack plot showing the total amount of code broken down into cohorts (what year the code was added)
|
|
20
|
-
1. Run `git-of-theseus-line-plot authors.json --normalize` will show a plot of the % of code contributed by the top 20 authors
|
|
21
|
-
1. Run `git-of-theseus-survival-plot survival.json`
|
|
22
|
-
|
|
23
|
-
You can run `--help` to see various options.
|
|
24
|
-
|
|
25
|
-
If you want to plot multiple repositories, have to run `git-of-theseus-analyze` separately for each project and store the data in separate directories using the `--outdir` flag. Then you can run `git-of-theseus-survival-plot <foo/survival.json> <bar/survival.json>` (optionally with the `--exp-fit` flag to fit an exponential decay)
|
|
26
|
-
|
|
27
|
-
Help
|
|
28
|
-
----
|
|
29
|
-
|
|
30
|
-
`AttributeError: Unknown property labels` – upgrade matplotlib if you are seeing this. `pip install matplotlib --upgrade`
|
|
31
|
-
|
|
32
|
-
Some pics
|
|
33
|
-
---------
|
|
34
|
-
|
|
35
|
-
Survival of a line of code in a set of interesting repos:
|
|
36
|
-
|
|
37
|
-

|
|
38
|
-
|
|
39
|
-
This curve is produced by the `git-of-theseus-survival-plot` script and shows the *percentage of lines in a commit that are still present after x years*. It aggregates it over all commits, no matter what point in time they were made. So for *x=0* it includes all commits, whereas for *x>0* not all commits are counted (because we would have to look into the future for some of them). The survival curves are estimated using [Kaplan-Meier](https://en.wikipedia.org/wiki/Kaplan%E2%80%93Meier_estimator).
|
|
40
|
-
|
|
41
|
-
You can also add an exponential fit:
|
|
42
|
-
|
|
43
|
-

|
|
44
|
-
|
|
45
|
-
Linux – stack plot:
|
|
46
|
-
|
|
47
|
-

|
|
48
|
-
|
|
49
|
-
This curve is produced by the `git-of-theseus-stack-plot` script and shows the total number of lines in a repo broken down into cohorts by the year the code was added.
|
|
50
|
-
|
|
51
|
-
Node – stack plot:
|
|
52
|
-
|
|
53
|
-

|
|
54
|
-
|
|
55
|
-
Rails – stack plot:
|
|
56
|
-
|
|
57
|
-

|
|
58
|
-
|
|
59
|
-
Tensorflow – stack plot:
|
|
60
|
-
|
|
61
|
-

|
|
62
|
-
|
|
63
|
-
Rust – stack plot:
|
|
64
|
-
|
|
65
|
-

|
|
66
|
-
|
|
67
|
-
Plotting other stuff
|
|
68
|
-
--------------------
|
|
69
|
-
|
|
70
|
-
`git-of-theseus-analyze` will write `exts.json`, `cohorts.json` and `authors.json`. You can run `git-of-theseus-stack-plot authors.json` to plot author statistics as well, or `git-of-theseus-stack-plot exts.json` to plot file extension statistics. For author statistics, you might want to create a [.mailmap](https://git-scm.com/docs/gitmailmap) file in the root directory of the repository to deduplicate authors. If you need to create a .mailmap file the following command can list the distinct author-email combinations in a repository:
|
|
71
|
-
|
|
72
|
-
Mac / Linux
|
|
73
|
-
|
|
74
|
-
```shell
|
|
75
|
-
git log --pretty=format:"%an %ae" | sort | uniq
|
|
76
|
-
```
|
|
77
|
-
|
|
78
|
-
Windows Powershell
|
|
79
|
-
|
|
80
|
-
```powershell
|
|
81
|
-
git log --pretty=format:"%an %ae" | Sort-Object | Select-Object -Unique
|
|
82
|
-
```
|
|
83
|
-
|
|
84
|
-
For instance, here's the author statistics for [Kubernetes](https://github.com/kubernetes/kubernetes):
|
|
85
|
-
|
|
86
|
-

|
|
87
|
-
|
|
88
|
-
You can also normalize it to 100%. Here's author statistics for Git:
|
|
89
|
-
|
|
90
|
-

|
|
91
|
-
|
|
92
|
-
Other stuff
|
|
93
|
-
-----------
|
|
94
|
-
|
|
95
|
-
[Markovtsev Vadim](https://twitter.com/tmarkhor) implemented a very similar analysis that claims to be 20%-6x faster than Git of Theseus. It's named [Hercules](https://github.com/src-d/hercules) and there's a great [blog post](https://web.archive.org/web/20180918135417/https://blog.sourced.tech/post/hercules.v4/) about all the complexity going into the analysis of Git history.
|
|
@@ -1,122 +0,0 @@
|
|
|
1
|
-
Metadata-Version: 2.4
|
|
2
|
-
Name: better-git-of-theseus
|
|
3
|
-
Version: 0.4.0
|
|
4
|
-
Summary: Plot stats on Git repositories with interactive Plotly charts
|
|
5
|
-
Home-page: https://github.com/onewesong/better-git-of-theseus
|
|
6
|
-
Author: Erik Bernhardsson
|
|
7
|
-
Author-email: mail@erikbern.com
|
|
8
|
-
Description-Content-Type: text/markdown
|
|
9
|
-
License-File: LICENSE
|
|
10
|
-
Requires-Dist: gitpython
|
|
11
|
-
Requires-Dist: numpy
|
|
12
|
-
Requires-Dist: tqdm
|
|
13
|
-
Requires-Dist: wcmatch
|
|
14
|
-
Requires-Dist: pygments
|
|
15
|
-
Requires-Dist: plotly
|
|
16
|
-
Requires-Dist: streamlit
|
|
17
|
-
Requires-Dist: python-dateutil
|
|
18
|
-
Requires-Dist: scipy
|
|
19
|
-
Dynamic: author
|
|
20
|
-
Dynamic: author-email
|
|
21
|
-
Dynamic: description
|
|
22
|
-
Dynamic: description-content-type
|
|
23
|
-
Dynamic: home-page
|
|
24
|
-
Dynamic: license-file
|
|
25
|
-
Dynamic: requires-dist
|
|
26
|
-
Dynamic: summary
|
|
27
|
-
|
|
28
|
-
[](https://pypi.python.org/pypi/git-of-theseus)
|
|
29
|
-
|
|
30
|
-
Some scripts to analyze Git repos. Produces cool looking graphs like this (running it on [git](https://github.com/git/git) itself):
|
|
31
|
-
|
|
32
|
-

|
|
33
|
-
|
|
34
|
-
Installing
|
|
35
|
-
----------
|
|
36
|
-
|
|
37
|
-
Run `pip install git-of-theseus`
|
|
38
|
-
|
|
39
|
-
Running
|
|
40
|
-
-------
|
|
41
|
-
|
|
42
|
-
First, you need to run `git-of-theseus-analyze <path to repo>` (see `git-of-theseus-analyze --help` for a bunch of config). This will analyze a repository and might take quite some time.
|
|
43
|
-
|
|
44
|
-
After that, you can generate plots! Some examples:
|
|
45
|
-
|
|
46
|
-
1. Run `git-of-theseus-stack-plot cohorts.json` will create a stack plot showing the total amount of code broken down into cohorts (what year the code was added)
|
|
47
|
-
1. Run `git-of-theseus-line-plot authors.json --normalize` will show a plot of the % of code contributed by the top 20 authors
|
|
48
|
-
1. Run `git-of-theseus-survival-plot survival.json`
|
|
49
|
-
|
|
50
|
-
You can run `--help` to see various options.
|
|
51
|
-
|
|
52
|
-
If you want to plot multiple repositories, have to run `git-of-theseus-analyze` separately for each project and store the data in separate directories using the `--outdir` flag. Then you can run `git-of-theseus-survival-plot <foo/survival.json> <bar/survival.json>` (optionally with the `--exp-fit` flag to fit an exponential decay)
|
|
53
|
-
|
|
54
|
-
Help
|
|
55
|
-
----
|
|
56
|
-
|
|
57
|
-
`AttributeError: Unknown property labels` – upgrade matplotlib if you are seeing this. `pip install matplotlib --upgrade`
|
|
58
|
-
|
|
59
|
-
Some pics
|
|
60
|
-
---------
|
|
61
|
-
|
|
62
|
-
Survival of a line of code in a set of interesting repos:
|
|
63
|
-
|
|
64
|
-

|
|
65
|
-
|
|
66
|
-
This curve is produced by the `git-of-theseus-survival-plot` script and shows the *percentage of lines in a commit that are still present after x years*. It aggregates it over all commits, no matter what point in time they were made. So for *x=0* it includes all commits, whereas for *x>0* not all commits are counted (because we would have to look into the future for some of them). The survival curves are estimated using [Kaplan-Meier](https://en.wikipedia.org/wiki/Kaplan%E2%80%93Meier_estimator).
|
|
67
|
-
|
|
68
|
-
You can also add an exponential fit:
|
|
69
|
-
|
|
70
|
-

|
|
71
|
-
|
|
72
|
-
Linux – stack plot:
|
|
73
|
-
|
|
74
|
-

|
|
75
|
-
|
|
76
|
-
This curve is produced by the `git-of-theseus-stack-plot` script and shows the total number of lines in a repo broken down into cohorts by the year the code was added.
|
|
77
|
-
|
|
78
|
-
Node – stack plot:
|
|
79
|
-
|
|
80
|
-

|
|
81
|
-
|
|
82
|
-
Rails – stack plot:
|
|
83
|
-
|
|
84
|
-

|
|
85
|
-
|
|
86
|
-
Tensorflow – stack plot:
|
|
87
|
-
|
|
88
|
-

|
|
89
|
-
|
|
90
|
-
Rust – stack plot:
|
|
91
|
-
|
|
92
|
-

|
|
93
|
-
|
|
94
|
-
Plotting other stuff
|
|
95
|
-
--------------------
|
|
96
|
-
|
|
97
|
-
`git-of-theseus-analyze` will write `exts.json`, `cohorts.json` and `authors.json`. You can run `git-of-theseus-stack-plot authors.json` to plot author statistics as well, or `git-of-theseus-stack-plot exts.json` to plot file extension statistics. For author statistics, you might want to create a [.mailmap](https://git-scm.com/docs/gitmailmap) file in the root directory of the repository to deduplicate authors. If you need to create a .mailmap file the following command can list the distinct author-email combinations in a repository:
|
|
98
|
-
|
|
99
|
-
Mac / Linux
|
|
100
|
-
|
|
101
|
-
```shell
|
|
102
|
-
git log --pretty=format:"%an %ae" | sort | uniq
|
|
103
|
-
```
|
|
104
|
-
|
|
105
|
-
Windows Powershell
|
|
106
|
-
|
|
107
|
-
```powershell
|
|
108
|
-
git log --pretty=format:"%an %ae" | Sort-Object | Select-Object -Unique
|
|
109
|
-
```
|
|
110
|
-
|
|
111
|
-
For instance, here's the author statistics for [Kubernetes](https://github.com/kubernetes/kubernetes):
|
|
112
|
-
|
|
113
|
-

|
|
114
|
-
|
|
115
|
-
You can also normalize it to 100%. Here's author statistics for Git:
|
|
116
|
-
|
|
117
|
-

|
|
118
|
-
|
|
119
|
-
Other stuff
|
|
120
|
-
-----------
|
|
121
|
-
|
|
122
|
-
[Markovtsev Vadim](https://twitter.com/tmarkhor) implemented a very similar analysis that claims to be 20%-6x faster than Git of Theseus. It's named [Hercules](https://github.com/src-d/hercules) and there's a great [blog post](https://web.archive.org/web/20180918135417/https://blog.sourced.tech/post/hercules.v4/) about all the complexity going into the analysis of Git history.
|
|
better_git_of_theseus-0.4.0/better_git_of_theseus.egg-info/entry_points.txt
@@ -1,6 +0,0 @@
-[console_scripts]
-git-of-theseus-analyze = git_of_theseus.analyze:analyze_cmdline
-git-of-theseus-line-plot = git_of_theseus:line_plot_cmdline
-git-of-theseus-stack-plot = git_of_theseus:stack_plot_cmdline
-git-of-theseus-survival-plot = git_of_theseus:survival_plot_cmdline
-git-of-theseus-visualize = git_of_theseus.cmd:main
|
|
File without changes (14 files, each listed above with +0 -0)