better-git-of-theseus 0.4.0__tar.gz → 0.4.5__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (24)
  1. better_git_of_theseus-0.4.5/PKG-INFO +96 -0
  2. better_git_of_theseus-0.4.5/README.md +69 -0
  3. better_git_of_theseus-0.4.5/better_git_of_theseus.egg-info/PKG-INFO +96 -0
  4. better_git_of_theseus-0.4.5/better_git_of_theseus.egg-info/entry_points.txt +2 -0
  5. {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.5}/git_of_theseus/app.py +91 -11
  6. {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.5}/git_of_theseus/cmd.py +2 -3
  7. {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.5}/git_of_theseus/plotly_plots.py +33 -0
  8. {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.5}/setup.py +2 -6
  9. better_git_of_theseus-0.4.0/PKG-INFO +0 -122
  10. better_git_of_theseus-0.4.0/README.md +0 -95
  11. better_git_of_theseus-0.4.0/better_git_of_theseus.egg-info/PKG-INFO +0 -122
  12. better_git_of_theseus-0.4.0/better_git_of_theseus.egg-info/entry_points.txt +0 -6
  13. {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.5}/LICENSE +0 -0
  14. {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.5}/better_git_of_theseus.egg-info/SOURCES.txt +0 -0
  15. {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.5}/better_git_of_theseus.egg-info/dependency_links.txt +0 -0
  16. {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.5}/better_git_of_theseus.egg-info/requires.txt +0 -0
  17. {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.5}/better_git_of_theseus.egg-info/top_level.txt +0 -0
  18. {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.5}/git_of_theseus/__init__.py +0 -0
  19. {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.5}/git_of_theseus/analyze.py +0 -0
  20. {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.5}/git_of_theseus/line_plot.py +0 -0
  21. {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.5}/git_of_theseus/stack_plot.py +0 -0
  22. {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.5}/git_of_theseus/survival_plot.py +0 -0
  23. {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.5}/git_of_theseus/utils.py +0 -0
  24. {better_git_of_theseus-0.4.0 → better_git_of_theseus-0.4.5}/setup.cfg +0 -0
@@ -0,0 +1,96 @@
1
+ Metadata-Version: 2.4
2
+ Name: better-git-of-theseus
3
+ Version: 0.4.5
4
+ Summary: Plot stats on Git repositories with interactive Plotly charts
5
+ Home-page: https://github.com/onewesong/better-git-of-theseus
6
+ Author: Erik Bernhardsson
7
+ Author-email: mail@erikbern.com
8
+ Description-Content-Type: text/markdown
9
+ License-File: LICENSE
10
+ Requires-Dist: gitpython
11
+ Requires-Dist: numpy
12
+ Requires-Dist: tqdm
13
+ Requires-Dist: wcmatch
14
+ Requires-Dist: pygments
15
+ Requires-Dist: plotly
16
+ Requires-Dist: streamlit
17
+ Requires-Dist: python-dateutil
18
+ Requires-Dist: scipy
19
+ Dynamic: author
20
+ Dynamic: author-email
21
+ Dynamic: description
22
+ Dynamic: description-content-type
23
+ Dynamic: home-page
24
+ Dynamic: license-file
25
+ Dynamic: requires-dist
26
+ Dynamic: summary
27
+
28
+ <div align="center">
29
+
30
+ # Better Git of Theseus
31
+
32
+ [![pypi badge](https://img.shields.io/pypi/v/better-git-of-theseus.svg?style=flat)](https://pypi.python.org/pypi/better-git-of-theseus)
33
+ [![PyPI - Downloads](https://img.shields.io/pypi/dm/better-git-of-theseus)](https://pypi.org/project/better-git-of-theseus/)
34
+ [![PyPI - Python Version](https://img.shields.io/pypi/pyversions/better-git-of-theseus)](https://pypi.org/project/better-git-of-theseus/)
35
+ [![GitHub License](https://img.shields.io/github/license/onewesong/better-git-of-theseus)](https://github.com/onewesong/better-git-of-theseus/blob/master/LICENSE)
36
+ [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/onewesong/better-git-of-theseus)
37
+
38
+ [中文版](README_zh.md)
39
+
40
+ </div>
41
+
42
+ **Better Git of Theseus** is a modern refactor of the original [git-of-theseus](https://github.com/erikbern/git-of-theseus). It provides a fully interactive Web Dashboard powered by **Streamlit** and **Plotly**, making it easier than ever to visualize how your code evolves over time.
43
+
44
+ ![Git of Theseus Dashboard](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-git.png) *(Note: Charts are now fully interactive!)*
45
+
46
+ ## Key Enhancements
47
+
48
+ - 🚀 **One-Click Visualization**: New `better-git-of-theseus` command automatically scans your project and launches a Web UI.
49
+ - 📊 **Interactive Charts**: Replaced static Matplotlib plots with Plotly. Support for zooming, panning, and detailed data hovers.
50
+ - 🧠 **In-Memory Processing**: Data flows directly in memory. No more mandatory intermediate `.json` files cluttering your repo.
51
+ - ⚡ **Smart Caching**: Leverages Streamlit's caching to make repeat analysis of large repos nearly instantaneous.
52
+ - 🎨 **Modern UI**: Adjust parameters (Cohort format, ignore rules, normalization, etc.) in real-time via the sidebar.
53
+
54
+ ## Installation
55
+
56
+ Install via pip:
57
+
58
+ ```bash
59
+ pip install better-git-of-theseus
60
+ ```
61
+
62
+ ## Quick Start
63
+
64
+ Run the following in any Git repository:
65
+
66
+ ```bash
67
+ better-git-of-theseus
68
+ ```
69
+
70
+ It will automatically open your browser to the interactive dashboard.
71
+
72
+ ## Feature Highlights
73
+
74
+ ### Cohort Formatting
75
+
76
+ Customize how commits are grouped by year, month, or week (based on Python strftime):
77
+ - `%Y`: Group by **Year** (Default)
78
+ - `%Y-%m`: Group by **Month**
79
+ - `%Y-W%W`: Group by **Week**
80
+
81
+ ### Real-time Parameters
82
+
83
+ Adjust parameters like "Max Series", "Normalization", and "Exponential Fit" directly in the Web UI without re-running any commands.
84
+
85
+ ## FAQ
86
+
87
+ - **Duplicate Authors?** Configure a [.mailmap](https://git-scm.com/docs/gitmailmap) file in your repo root to merge identities.
88
+ - **Performance?** First-time analysis of very large repos (like the Linux Kernel) may take time, but subsequent views are extremely fast due to caching.
89
+
90
+ ## Credits
91
+
92
+ Special thanks to [Erik Bernhardsson](https://github.com/erikbern) for creating the original `git-of-theseus`.
93
+
94
+ ## License
95
+
96
+ MIT
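Note on the cohort formats listed in the README above: `%Y`, `%Y-%m` and `%Y-W%W` are ordinary Python `strftime` patterns. The following is a minimal sketch of how each pattern would bucket a commit date. The `cohort_label` helper and the sample date are hypothetical and are not part of the package.

```python
from datetime import datetime


def cohort_label(commit_time, fmt="%Y"):
    """Return the cohort bucket for a commit timestamp (hypothetical helper)."""
    return commit_time.strftime(fmt)


commit_time = datetime(2023, 3, 15)
print(cohort_label(commit_time))            # yearly bucket
print(cohort_label(commit_time, "%Y-%m"))   # monthly bucket
print(cohort_label(commit_time, "%Y-W%W"))  # weekly bucket; %W counts weeks starting on Monday
```

Finer patterns produce more cohorts, so a weekly format on a long-lived repository will likely yield many more series than the yearly default.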
@@ -0,0 +1,69 @@
1
+ <div align="center">
2
+
3
+ # Better Git of Theseus
4
+
5
+ [![pypi badge](https://img.shields.io/pypi/v/better-git-of-theseus.svg?style=flat)](https://pypi.python.org/pypi/better-git-of-theseus)
6
+ [![PyPI - Downloads](https://img.shields.io/pypi/dm/better-git-of-theseus)](https://pypi.org/project/better-git-of-theseus/)
7
+ [![PyPI - Python Version](https://img.shields.io/pypi/pyversions/better-git-of-theseus)](https://pypi.org/project/better-git-of-theseus/)
8
+ [![GitHub License](https://img.shields.io/github/license/onewesong/better-git-of-theseus)](https://github.com/onewesong/better-git-of-theseus/blob/master/LICENSE)
9
+ [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/onewesong/better-git-of-theseus)
10
+
11
+ [中文版](README_zh.md)
12
+
13
+ </div>
14
+
15
+ **Better Git of Theseus** is a modern refactor of the original [git-of-theseus](https://github.com/erikbern/git-of-theseus). It provides a fully interactive Web Dashboard powered by **Streamlit** and **Plotly**, making it easier than ever to visualize how your code evolves over time.
16
+
17
+ ![Git of Theseus Dashboard](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-git.png) *(Note: Charts are now fully interactive!)*
18
+
19
+ ## Key Enhancements
20
+
21
+ - 🚀 **One-Click Visualization**: New `better-git-of-theseus` command automatically scans your project and launches a Web UI.
22
+ - 📊 **Interactive Charts**: Replaced static Matplotlib plots with Plotly. Support for zooming, panning, and detailed data hovers.
23
+ - 🧠 **In-Memory Processing**: Data flows directly in memory. No more mandatory intermediate `.json` files cluttering your repo.
24
+ - ⚡ **Smart Caching**: Leverages Streamlit's caching to make repeat analysis of large repos nearly instantaneous.
25
+ - 🎨 **Modern UI**: Adjust parameters (Cohort format, ignore rules, normalization, etc.) in real-time via the sidebar.
26
+
27
+ ## Installation
28
+
29
+ Install via pip:
30
+
31
+ ```bash
32
+ pip install better-git-of-theseus
33
+ ```
34
+
35
+ ## Quick Start
36
+
37
+ Run the following in any Git repository:
38
+
39
+ ```bash
40
+ better-git-of-theseus
41
+ ```
42
+
43
+ It will automatically open your browser to the interactive dashboard.
44
+
45
+ ## Feature Highlights
46
+
47
+ ### Cohort Formatting
48
+
49
+ Customize how commits are grouped by year, month, or week (based on Python strftime):
50
+ - `%Y`: Group by **Year** (Default)
51
+ - `%Y-%m`: Group by **Month**
52
+ - `%Y-W%W`: Group by **Week**
53
+
54
+ ### Real-time Parameters
55
+
56
+ Adjust parameters like "Max Series", "Normalization", and "Exponential Fit" directly in the Web UI without re-running any commands.
57
+
58
+ ## FAQ
59
+
60
+ - **Duplicate Authors?** Configure a [.mailmap](https://git-scm.com/docs/gitmailmap) file in your repo root to merge identities.
61
+ - **Performance?** First-time analysis of very large repos (like the Linux Kernel) may take time, but subsequent views are extremely fast due to caching.
62
+
63
+ ## Credits
64
+
65
+ Special thanks to [Erik Bernhardsson](https://github.com/erikbern) for creating the original `git-of-theseus`.
66
+
67
+ ## License
68
+
69
+ MIT
@@ -0,0 +1,96 @@
1
+ Metadata-Version: 2.4
2
+ Name: better-git-of-theseus
3
+ Version: 0.4.5
4
+ Summary: Plot stats on Git repositories with interactive Plotly charts
5
+ Home-page: https://github.com/onewesong/better-git-of-theseus
6
+ Author: Erik Bernhardsson
7
+ Author-email: mail@erikbern.com
8
+ Description-Content-Type: text/markdown
9
+ License-File: LICENSE
10
+ Requires-Dist: gitpython
11
+ Requires-Dist: numpy
12
+ Requires-Dist: tqdm
13
+ Requires-Dist: wcmatch
14
+ Requires-Dist: pygments
15
+ Requires-Dist: plotly
16
+ Requires-Dist: streamlit
17
+ Requires-Dist: python-dateutil
18
+ Requires-Dist: scipy
19
+ Dynamic: author
20
+ Dynamic: author-email
21
+ Dynamic: description
22
+ Dynamic: description-content-type
23
+ Dynamic: home-page
24
+ Dynamic: license-file
25
+ Dynamic: requires-dist
26
+ Dynamic: summary
27
+
28
+ <div align="center">
29
+
30
+ # Better Git of Theseus
31
+
32
+ [![pypi badge](https://img.shields.io/pypi/v/better-git-of-theseus.svg?style=flat)](https://pypi.python.org/pypi/better-git-of-theseus)
33
+ [![PyPI - Downloads](https://img.shields.io/pypi/dm/better-git-of-theseus)](https://pypi.org/project/better-git-of-theseus/)
34
+ [![PyPI - Python Version](https://img.shields.io/pypi/pyversions/better-git-of-theseus)](https://pypi.org/project/better-git-of-theseus/)
35
+ [![GitHub License](https://img.shields.io/github/license/onewesong/better-git-of-theseus)](https://github.com/onewesong/better-git-of-theseus/blob/master/LICENSE)
36
+ [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/onewesong/better-git-of-theseus)
37
+
38
+ [中文版](README_zh.md)
39
+
40
+ </div>
41
+
42
+ **Better Git of Theseus** is a modern refactor of the original [git-of-theseus](https://github.com/erikbern/git-of-theseus). It provides a fully interactive Web Dashboard powered by **Streamlit** and **Plotly**, making it easier than ever to visualize how your code evolves over time.
43
+
44
+ ![Git of Theseus Dashboard](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-git.png) *(Note: Charts are now fully interactive!)*
45
+
46
+ ## Key Enhancements
47
+
48
+ - 🚀 **One-Click Visualization**: New `better-git-of-theseus` command automatically scans your project and launches a Web UI.
49
+ - 📊 **Interactive Charts**: Replaced static Matplotlib plots with Plotly. Support for zooming, panning, and detailed data hovers.
50
+ - 🧠 **In-Memory Processing**: Data flows directly in memory. No more mandatory intermediate `.json` files cluttering your repo.
51
+ - ⚡ **Smart Caching**: Leverages Streamlit's caching to make repeat analysis of large repos nearly instantaneous.
52
+ - 🎨 **Modern UI**: Adjust parameters (Cohort format, ignore rules, normalization, etc.) in real-time via the sidebar.
53
+
54
+ ## Installation
55
+
56
+ Install via pip:
57
+
58
+ ```bash
59
+ pip install better-git-of-theseus
60
+ ```
61
+
62
+ ## Quick Start
63
+
64
+ Run the following in any Git repository:
65
+
66
+ ```bash
67
+ better-git-of-theseus
68
+ ```
69
+
70
+ It will automatically open your browser to the interactive dashboard.
71
+
72
+ ## Feature Highlights
73
+
74
+ ### Cohort Formatting
75
+
76
+ Customize how commits are grouped by year, month, or week (based on Python strftime):
77
+ - `%Y`: Group by **Year** (Default)
78
+ - `%Y-%m`: Group by **Month**
79
+ - `%Y-W%W`: Group by **Week**
80
+
81
+ ### Real-time Parameters
82
+
83
+ Adjust parameters like "Max Series", "Normalization", and "Exponential Fit" directly in the Web UI without re-running any commands.
84
+
85
+ ## FAQ
86
+
87
+ - **Duplicate Authors?** Configure a [.mailmap](https://git-scm.com/docs/gitmailmap) file in your repo root to merge identities.
88
+ - **Performance?** First-time analysis of very large repos (like the Linux Kernel) may take time, but subsequent views are extremely fast due to caching.
89
+
90
+ ## Credits
91
+
92
+ Special thanks to [Erik Bernhardsson](https://github.com/erikbern) for creating the original `git-of-theseus`.
93
+
94
+ ## License
95
+
96
+ MIT
@@ -0,0 +1,2 @@
+[console_scripts]
+better-git-of-theseus = git_of_theseus.cmd:main
@@ -4,13 +4,26 @@ import tempfile
 import shutil
 try:
     from git_of_theseus.analyze import analyze
-    from git_of_theseus.plotly_plots import plotly_stack_plot, plotly_line_plot, plotly_survival_plot
+    from git_of_theseus.plotly_plots import plotly_stack_plot, plotly_line_plot, plotly_survival_plot, plotly_bar_plot
 except ImportError:
     from analyze import analyze
-    from plotly_plots import plotly_stack_plot, plotly_line_plot, plotly_survival_plot
+    from plotly_plots import plotly_stack_plot, plotly_line_plot, plotly_survival_plot, plotly_bar_plot
 
 st.set_page_config(page_title="Git of Theseus Dash", layout="wide")
 
+# GitHub Link in Sidebar
+st.sidebar.markdown(
+    """
+    <div style="display: flex; align-items: center; margin-bottom: 20px;">
+        <img src="https://github.githubassets.com/images/modules/logos_page/GitHub-Mark.png" width="30" style="margin-right: 10px;">
+        <a href="https://github.com/onewesong/better-git-of-theseus" target="_blank" style="text-decoration: none; color: inherit; font-weight: bold;">
+            better-git-of-theseus
+        </a>
+    </div>
+    """,
+    unsafe_allow_html=True
+)
+
 st.title("📊 Git of Theseus - Repository Analysis")
 
 import sys
@@ -18,12 +31,49 @@ import sys
 # Sidebar Configuration
 st.sidebar.header("Configuration")
 
+with st.sidebar.expander("📖 How to use", expanded=False):
+    st.markdown("""
+    **Better Git of Theseus** is a tool to analyze the evolution of Git repositories.
+
+    ### Plots Explained:
+    - **Stack Plot**: Shows code growth over time, broken down by cohort (when code was added).
+    - **Line Plot**: Shows trends across different dimensions (Author, Extension, etc.).
+    - **Distribution**: Shows the **current** distribution (Who contributed most, which file types are dominant).
+    - **Survival Plot**: Estimates how long a line of code typically lasts before being modified or deleted.
+
+    ### Tips:
+    - **Cohort Format**: `%Y` (Yearly) and `%Y-%m` (Monthly) are recommended.
+    - **Mailmap**: Use a `.mailmap` file in the repo root to resolve duplicate author names.
+    """)
+
 default_repo = "."
 if len(sys.argv) > 1:
     default_repo = sys.argv[1]
 
-repo_path = st.sidebar.text_input("Git Repository Path", value=default_repo)
-branch = st.sidebar.text_input("Branch", value="master")
+repo_path = default_repo
+# Path display removed as per user request
+
+# Fetch branches for the selectbox
+try:
+    import git
+    repo = git.Repo(repo_path)
+    # Get local branches
+    branches = [h.name for h in repo.heads]
+
+    # Try to determine the best default branch (active one, or master/main)
+    try:
+        current_active = repo.active_branch.name
+    except:
+        current_active = "master"
+
+    if current_active in branches:
+        branches.remove(current_active)
+
+    options = [current_active] + sorted(branches)
+    branch = st.sidebar.selectbox("Branch", options=options)
+except Exception as e:
+    # Fallback if git repo access fails
+    branch = st.sidebar.text_input("Branch", value="master")
 
 with st.sidebar.expander("Analysis Parameters"):
     cohortfm = st.text_input(
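Note on the branch selector added in the hunk above: the option list puts the active branch first and the remaining local branches after it in sorted order. A minimal sketch of that ordering as a pure function follows, with no GitPython or Streamlit involved. The `order_branches` name and its parameters are hypothetical.

```python
def order_branches(branches, active=None, fallback="master"):
    """Put the active branch first, then the remaining branches in sorted order.

    `active` is None when no active branch can be determined (for example a
    detached HEAD); the fallback name is used in that case, as in the diff.
    """
    current = active if active is not None else fallback
    rest = sorted(b for b in branches if b != current)
    return [current] + rest


print(order_branches(["dev", "main", "feature"], active="main"))
print(order_branches(["dev", "main"]))  # no active branch known
```

One consequence of this ordering, as far as the diff shows: when the active branch cannot be determined and the repository has no `master` branch, `master` is still offered as the first option.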
@@ -35,9 +85,23 @@ with st.sidebar.expander("Analysis Parameters"):
         "- `%Y-W%W`: Week (e.g., 2023-W01)\n"
         "- `%Y-%m-%d`: Day"
     )
-    interval = st.number_input("Interval (seconds)", value=7 * 24 * 60 * 60)
-    procs = st.number_input("Processes", value=2, min_value=1)
-    ignore = st.text_area("Ignore (comma separated)").split(",")
+    interval = st.number_input(
+        "Analysis Interval (seconds)",
+        value=7 * 24 * 60 * 60,
+        help="The time step between data points. Default is 604800s (7 days). Larger values are faster; smaller values result in smoother curves."
+    )
+    st.caption(f"Current resolution: {interval / 86400:.1f} days")
+
+    procs = st.number_input(
+        "Parallel Processes",
+        value=2,
+        min_value=1,
+        help="Number of concurrent processes. Increase to speed up analysis on multi-core CPUs, but note it increases RAM usage."
+    )
+    ignore = st.text_area(
+        "Ignore Patterns",
+        help="Glob patterns to ignore (comma separated), e.g.: 'tests/**, *.md'"
+    ).split(",")
     ignore = [i.strip() for i in ignore if i.strip()]
 
 @st.cache_data(show_spinner=False)
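Note on the parameter handling in the hunk above: the ignore field is split on commas and stripped of blanks, and the interval is shown in days by dividing by 86400. A minimal sketch of both steps outside Streamlit follows. The helper names `parse_ignore` and `resolution_caption` are hypothetical.

```python
def parse_ignore(text):
    """Split a comma-separated pattern string and drop empty entries."""
    return [part.strip() for part in text.split(",") if part.strip()]


def resolution_caption(interval_seconds):
    """Format the analysis interval in days, one decimal place."""
    return f"Current resolution: {interval_seconds / 86400:.1f} days"


print(parse_ignore("tests/**, *.md, ,"))
print(resolution_caption(7 * 24 * 60 * 60))
```

Because the split is on every comma, a glob that itself contains a comma (such as a brace expansion) would be broken into separate patterns under this scheme.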
@@ -71,7 +135,7 @@ if st.sidebar.button("🚀 Run Analysis") or (len(sys.argv) > 1 and st.session_s
 # Main View
 if st.session_state.analysis_results:
     results = st.session_state.analysis_results
-    tab1, tab2, tab3 = st.tabs(["Stack Plot", "Line Plot", "Survival Plot"])
+    tab1, tab2, tab3, tab4 = st.tabs(["Stack Plot", "Line Plot", "Distribution", "Survival Plot"])
 
     with tab1:
         st.header("Stack Plot")
@@ -93,7 +157,7 @@ if st.session_state.analysis_results:
             data = results.get(data_key)
             if data:
                 fig = plotly_stack_plot(data, normalize=normalize, max_n=max_n, title=project_name)
-                st.plotly_chart(fig, use_container_width=True)
+                st.plotly_chart(fig, width="stretch")
             else:
                 st.warning(f"Data for {data_source_label} not found.")
 
@@ -110,11 +174,27 @@ if st.session_state.analysis_results:
             data_line = results.get(data_key_line)
             if data_line:
                 fig = plotly_line_plot(data_line, normalize=normalize_line, max_n=max_n_line, title=project_name)
-                st.plotly_chart(fig, use_container_width=True)
+                st.plotly_chart(fig, width="stretch")
             else:
                 st.warning(f"Data for {data_source_label_line} not found.")
 
     with tab3:
+        st.header("Latest Distribution")
+        col1, col2 = st.columns([1, 3])
+        with col1:
+            data_source_label_bar = st.selectbox("Data Source", list(source_map.keys()), key="bar_source")
+            data_key_bar = source_map[data_source_label_bar]
+            max_n_bar = st.slider("Max Series", 5, 100, 30, key="bar_max_n")
+        with col2:
+            project_name = os.path.basename(os.path.abspath(repo_path))
+            data_bar = results.get(data_key_bar)
+            if data_bar:
+                fig = plotly_bar_plot(data_bar, max_n=max_n_bar, title=f"{project_name} - {data_source_label_bar}")
+                st.plotly_chart(fig, width="stretch")
+            else:
+                st.warning(f"Data for {data_source_label_bar} not found.")
+
+    with tab4:
         st.header("Survival Plot")
         col1, col2 = st.columns([1, 3])
         with col1:
@@ -125,7 +205,7 @@ if st.session_state.analysis_results:
             survival_data = results.get("survival")
             if survival_data:
                 fig = plotly_survival_plot(survival_data, exp_fit=exp_fit, years=years, title=project_name)
-                st.plotly_chart(fig, use_container_width=True)
+                st.plotly_chart(fig, width="stretch")
             else:
                 st.warning("Survival data not found.")
 
@@ -7,9 +7,8 @@ def main():
     cmd_dir = os.path.dirname(os.path.abspath(__file__))
     app_path = os.path.join(cmd_dir, "app.py")
 
-    # The first argument is the repo path, default to current directory
-    repo_path = sys.argv[1] if len(sys.argv) > 1 else os.getcwd()
-    repo_path = os.path.abspath(repo_path)
+    # Always use the current working directory
+    repo_path = os.path.abspath(os.getcwd())
 
     # Run streamlit
     # We pass the repo_path as an argument to the streamlit script
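Note on the launcher change above: the repository path is no longer taken from the command line and is always the current working directory, which is then handed to the Streamlit script as an argument. The actual launch call is outside this hunk, so the following is only a sketch of one way such a command could be assembled. It assumes the standard `streamlit run <script> -- <script args>` form, and the `build_streamlit_command` helper is hypothetical.

```python
import os
import sys


def build_streamlit_command(app_path, repo_path=None):
    """Build an argv list that launches a Streamlit script with one script argument."""
    repo_path = os.path.abspath(repo_path or os.getcwd())
    # Arguments after "--" are passed to the script itself and appear in its sys.argv[1:].
    return [sys.executable, "-m", "streamlit", "run", app_path, "--", repo_path]


print(build_streamlit_command("app.py"))
```

Under this assumption, `app.py` would see the repository path as `sys.argv[1]`, which matches the `default_repo = sys.argv[1]` lookup in the app.py hunk.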
@@ -240,4 +240,37 @@ def plotly_survival_plot(commit_history, exp_fit=False, years=5, title=None):
     )
 
 
+    return fig
+
+def plotly_bar_plot(data, max_n=20, title=None):
+    ts, y, labels = _process_stack_line_data(data, max_n, normalize=False)
+
+    # Get latest data point (current state)
+    latest_values = [row[-1] for row in y]
+
+    # Sort by value for better bar chart presentation
+    # (Though _process_stack_line_data already does some sorting, we want descending order)
+    indices = sorted(range(len(labels)), key=lambda i: latest_values[i], reverse=True)
+
+    sorted_labels = [labels[i] for i in indices]
+    sorted_values = [latest_values[i] for i in indices]
+
+    # Generate colors
+    colors = px.colors.qualitative.Plotly
+    if len(sorted_labels) > len(colors):
+        colors = px.colors.qualitative.Dark24
+
+    fig = go.Figure(go.Bar(
+        x=sorted_labels,
+        y=sorted_values,
+        marker_color=[colors[i % len(colors)] for i in range(len(sorted_labels))]
+    ))
+
+    fig.update_layout(
+        title=dict(text=f"{title} (Current Distribution)" if title else "Current Distribution", x=0.5),
+        yaxis=dict(title="Lines of Code"),
+        xaxis=dict(title=""),
+        margin=dict(l=20, r=20, t=50, b=100),
+    )
+
     return fig
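Note on `plotly_bar_plot` above: the "current distribution" is the last value of each series, sorted in descending order before plotting. A minimal sketch of that data step without Plotly follows. The `latest_distribution` helper and the sample data are hypothetical, and `series` stands in for the `y` rows returned by `_process_stack_line_data`.

```python
def latest_distribution(labels, series):
    """Take the last point of each series and sort labels by it, largest first."""
    latest = [row[-1] for row in series]
    order = sorted(range(len(labels)), key=lambda i: latest[i], reverse=True)
    return [labels[i] for i in order], [latest[i] for i in order]


labels = ["alice", "bob", "carol"]
series = [[1, 5], [2, 9], [3, 7]]  # one row per label, one column per time step
print(latest_distribution(labels, series))
```

Sorting indices rather than the values themselves keeps each label paired with its value, which is the same approach the diff takes with `indices`.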
@@ -5,7 +5,7 @@ with open("README.md", "r", encoding="utf-8") as fh:
 
 setup(
     name="better-git-of-theseus",
-    version="0.4.0",
+    version="0.4.5",
     description="Plot stats on Git repositories with interactive Plotly charts",
     long_description=long_description,
     long_description_content_type="text/markdown",
@@ -27,11 +27,7 @@ setup(
     ],
     entry_points={
         "console_scripts": [
-            "git-of-theseus-analyze=git_of_theseus.analyze:analyze_cmdline",
-            "git-of-theseus-survival-plot=git_of_theseus:survival_plot_cmdline",
-            "git-of-theseus-stack-plot=git_of_theseus:stack_plot_cmdline",
-            "git-of-theseus-line-plot=git_of_theseus:line_plot_cmdline",
-            "git-of-theseus-visualize=git_of_theseus.cmd:main",
+            "better-git-of-theseus=git_of_theseus.cmd:main",
         ]
     },
 )
@@ -1,122 +0,0 @@
1
- Metadata-Version: 2.4
2
- Name: better-git-of-theseus
3
- Version: 0.4.0
4
- Summary: Plot stats on Git repositories with interactive Plotly charts
5
- Home-page: https://github.com/onewesong/better-git-of-theseus
6
- Author: Erik Bernhardsson
7
- Author-email: mail@erikbern.com
8
- Description-Content-Type: text/markdown
9
- License-File: LICENSE
10
- Requires-Dist: gitpython
11
- Requires-Dist: numpy
12
- Requires-Dist: tqdm
13
- Requires-Dist: wcmatch
14
- Requires-Dist: pygments
15
- Requires-Dist: plotly
16
- Requires-Dist: streamlit
17
- Requires-Dist: python-dateutil
18
- Requires-Dist: scipy
19
- Dynamic: author
20
- Dynamic: author-email
21
- Dynamic: description
22
- Dynamic: description-content-type
23
- Dynamic: home-page
24
- Dynamic: license-file
25
- Dynamic: requires-dist
26
- Dynamic: summary
27
-
28
- [![pypi badge](https://img.shields.io/pypi/v/git-of-theseus.svg?style=flat)](https://pypi.python.org/pypi/git-of-theseus)
29
-
30
- Some scripts to analyze Git repos. Produces cool looking graphs like this (running it on [git](https://github.com/git/git) itself):
31
-
32
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-git.png)
33
-
34
- Installing
35
- ----------
36
-
37
- Run `pip install git-of-theseus`
38
-
39
- Running
40
- -------
41
-
42
- First, you need to run `git-of-theseus-analyze <path to repo>` (see `git-of-theseus-analyze --help` for a bunch of config). This will analyze a repository and might take quite some time.
43
-
44
- After that, you can generate plots! Some examples:
45
-
46
- 1. Run `git-of-theseus-stack-plot cohorts.json` will create a stack plot showing the total amount of code broken down into cohorts (what year the code was added)
47
- 1. Run `git-of-theseus-line-plot authors.json --normalize` will show a plot of the % of code contributed by the top 20 authors
48
- 1. Run `git-of-theseus-survival-plot survival.json`
49
-
50
- You can run `--help` to see various options.
51
-
52
- If you want to plot multiple repositories, have to run `git-of-theseus-analyze` separately for each project and store the data in separate directories using the `--outdir` flag. Then you can run `git-of-theseus-survival-plot <foo/survival.json> <bar/survival.json>` (optionally with the `--exp-fit` flag to fit an exponential decay)
53
-
54
- Help
55
- ----
56
-
57
- `AttributeError: Unknown property labels` – upgrade matplotlib if you are seeing this. `pip install matplotlib --upgrade`
58
-
59
- Some pics
60
- ---------
61
-
62
- Survival of a line of code in a set of interesting repos:
63
-
64
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-projects-survival.png)
65
-
66
- This curve is produced by the `git-of-theseus-survival-plot` script and shows the *percentage of lines in a commit that are still present after x years*. It aggregates it over all commits, no matter what point in time they were made. So for *x=0* it includes all commits, whereas for *x>0* not all commits are counted (because we would have to look into the future for some of them). The survival curves are estimated using [Kaplan-Meier](https://en.wikipedia.org/wiki/Kaplan%E2%80%93Meier_estimator).
67
-
68
- You can also add an exponential fit:
69
-
70
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-projects-survival-exp-fit.png)
71
-
72
- Linux – stack plot:
73
-
74
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-linux.png)
75
-
76
- This curve is produced by the `git-of-theseus-stack-plot` script and shows the total number of lines in a repo broken down into cohorts by the year the code was added.
77
-
78
- Node – stack plot:
79
-
80
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-node.png)
81
-
82
- Rails – stack plot:
83
-
84
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-rails.png)
85
-
86
- Tensorflow – stack plot:
87
-
88
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-tensorflow.png)
89
-
90
- Rust – stack plot:
91
-
92
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-rust.png)
93
-
94
- Plotting other stuff
95
- --------------------
96
-
97
- `git-of-theseus-analyze` will write `exts.json`, `cohorts.json` and `authors.json`. You can run `git-of-theseus-stack-plot authors.json` to plot author statistics as well, or `git-of-theseus-stack-plot exts.json` to plot file extension statistics. For author statistics, you might want to create a [.mailmap](https://git-scm.com/docs/gitmailmap) file in the root directory of the repository to deduplicate authors. If you need to create a .mailmap file the following command can list the distinct author-email combinations in a repository:
98
-
99
- Mac / Linux
100
-
101
- ```shell
102
- git log --pretty=format:"%an %ae" | sort | uniq
103
- ```
104
-
105
- Windows Powershell
106
-
107
- ```powershell
108
- git log --pretty=format:"%an %ae" | Sort-Object | Select-Object -Unique
109
- ```
110
-
111
- For instance, here's the author statistics for [Kubernetes](https://github.com/kubernetes/kubernetes):
112
-
113
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-kubernetes-authors.png)
114
-
115
- You can also normalize it to 100%. Here's author statistics for Git:
116
-
117
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-git-authors-normalized.png)
118
-
119
- Other stuff
120
- -----------
121
-
122
- [Markovtsev Vadim](https://twitter.com/tmarkhor) implemented a very similar analysis that claims to be 20%-6x faster than Git of Theseus. It's named [Hercules](https://github.com/src-d/hercules) and there's a great [blog post](https://web.archive.org/web/20180918135417/https://blog.sourced.tech/post/hercules.v4/) about all the complexity going into the analysis of Git history.
@@ -1,95 +0,0 @@
1
- [![pypi badge](https://img.shields.io/pypi/v/git-of-theseus.svg?style=flat)](https://pypi.python.org/pypi/git-of-theseus)
2
-
3
- Some scripts to analyze Git repos. Produces cool looking graphs like this (running it on [git](https://github.com/git/git) itself):
4
-
5
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-git.png)
6
-
7
- Installing
8
- ----------
9
-
10
- Run `pip install git-of-theseus`
11
-
12
- Running
13
- -------
14
-
15
- First, you need to run `git-of-theseus-analyze <path to repo>` (see `git-of-theseus-analyze --help` for a bunch of config). This will analyze a repository and might take quite some time.
16
-
17
- After that, you can generate plots! Some examples:
18
-
19
- 1. Run `git-of-theseus-stack-plot cohorts.json` will create a stack plot showing the total amount of code broken down into cohorts (what year the code was added)
20
- 1. Run `git-of-theseus-line-plot authors.json --normalize` will show a plot of the % of code contributed by the top 20 authors
21
- 1. Run `git-of-theseus-survival-plot survival.json`
22
-
23
- You can run `--help` to see various options.
24
-
25
- If you want to plot multiple repositories, have to run `git-of-theseus-analyze` separately for each project and store the data in separate directories using the `--outdir` flag. Then you can run `git-of-theseus-survival-plot <foo/survival.json> <bar/survival.json>` (optionally with the `--exp-fit` flag to fit an exponential decay)
26
-
27
- Help
28
- ----
29
-
30
- `AttributeError: Unknown property labels` – upgrade matplotlib if you are seeing this. `pip install matplotlib --upgrade`
31
-
32
- Some pics
33
- ---------
34
-
35
- Survival of a line of code in a set of interesting repos:
36
-
37
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-projects-survival.png)
38
-
39
- This curve is produced by the `git-of-theseus-survival-plot` script and shows the *percentage of lines in a commit that are still present after x years*. It aggregates it over all commits, no matter what point in time they were made. So for *x=0* it includes all commits, whereas for *x>0* not all commits are counted (because we would have to look into the future for some of them). The survival curves are estimated using [Kaplan-Meier](https://en.wikipedia.org/wiki/Kaplan%E2%80%93Meier_estimator).
-
- You can also add an exponential fit:
-
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-projects-survival-exp-fit.png)
-
- Linux – stack plot:
-
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-linux.png)
-
- This curve is produced by the `git-of-theseus-stack-plot` script and shows the total number of lines in a repo broken down into cohorts by the year the code was added.
-
- Node – stack plot:
-
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-node.png)
-
- Rails – stack plot:
-
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-rails.png)
-
- Tensorflow – stack plot:
-
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-tensorflow.png)
-
- Rust – stack plot:
-
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-rust.png)
-
- Plotting other stuff
- --------------------
-
- `git-of-theseus-analyze` will write `exts.json`, `cohorts.json` and `authors.json`. You can run `git-of-theseus-stack-plot authors.json` to plot author statistics as well, or `git-of-theseus-stack-plot exts.json` to plot file extension statistics. For author statistics, you might want to create a [.mailmap](https://git-scm.com/docs/gitmailmap) file in the root directory of the repository to deduplicate authors. If you need to create a .mailmap file the following command can list the distinct author-email combinations in a repository:
-
- Mac / Linux
-
- ```shell
- git log --pretty=format:"%an %ae" | sort | uniq
- ```
-
- Windows Powershell
-
- ```powershell
- git log --pretty=format:"%an %ae" | Sort-Object | Select-Object -Unique
- ```
-
- For instance, here's the author statistics for [Kubernetes](https://github.com/kubernetes/kubernetes):
-
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-kubernetes-authors.png)
-
- You can also normalize it to 100%. Here's author statistics for Git:
-
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-git-authors-normalized.png)
-
- Other stuff
- -----------
-
- [Markovtsev Vadim](https://twitter.com/tmarkhor) implemented a very similar analysis that claims to be 20%-6x faster than Git of Theseus. It's named [Hercules](https://github.com/src-d/hercules) and there's a great [blog post](https://web.archive.org/web/20180918135417/https://blog.sourced.tech/post/hercules.v4/) about all the complexity going into the analysis of Git history.
@@ -1,122 +0,0 @@
- Metadata-Version: 2.4
- Name: better-git-of-theseus
- Version: 0.4.0
- Summary: Plot stats on Git repositories with interactive Plotly charts
- Home-page: https://github.com/onewesong/better-git-of-theseus
- Author: Erik Bernhardsson
- Author-email: mail@erikbern.com
- Description-Content-Type: text/markdown
- License-File: LICENSE
- Requires-Dist: gitpython
- Requires-Dist: numpy
- Requires-Dist: tqdm
- Requires-Dist: wcmatch
- Requires-Dist: pygments
- Requires-Dist: plotly
- Requires-Dist: streamlit
- Requires-Dist: python-dateutil
- Requires-Dist: scipy
- Dynamic: author
- Dynamic: author-email
- Dynamic: description
- Dynamic: description-content-type
- Dynamic: home-page
- Dynamic: license-file
- Dynamic: requires-dist
- Dynamic: summary
-
- [![pypi badge](https://img.shields.io/pypi/v/git-of-theseus.svg?style=flat)](https://pypi.python.org/pypi/git-of-theseus)
-
- Some scripts to analyze Git repos. Produces cool looking graphs like this (running it on [git](https://github.com/git/git) itself):
-
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-git.png)
-
- Installing
- ----------
-
- Run `pip install git-of-theseus`
-
- Running
- -------
-
- First, you need to run `git-of-theseus-analyze <path to repo>` (see `git-of-theseus-analyze --help` for a bunch of config). This will analyze a repository and might take quite some time.
-
- After that, you can generate plots! Some examples:
-
- 1. Run `git-of-theseus-stack-plot cohorts.json` will create a stack plot showing the total amount of code broken down into cohorts (what year the code was added)
- 1. Run `git-of-theseus-line-plot authors.json --normalize` will show a plot of the % of code contributed by the top 20 authors
- 1. Run `git-of-theseus-survival-plot survival.json`
-
- You can run `--help` to see various options.
-
- If you want to plot multiple repositories, have to run `git-of-theseus-analyze` separately for each project and store the data in separate directories using the `--outdir` flag. Then you can run `git-of-theseus-survival-plot <foo/survival.json> <bar/survival.json>` (optionally with the `--exp-fit` flag to fit an exponential decay)
-
- Help
- ----
-
- `AttributeError: Unknown property labels` – upgrade matplotlib if you are seeing this. `pip install matplotlib --upgrade`
-
- Some pics
- ---------
-
- Survival of a line of code in a set of interesting repos:
-
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-projects-survival.png)
-
- This curve is produced by the `git-of-theseus-survival-plot` script and shows the *percentage of lines in a commit that are still present after x years*. It aggregates it over all commits, no matter what point in time they were made. So for *x=0* it includes all commits, whereas for *x>0* not all commits are counted (because we would have to look into the future for some of them). The survival curves are estimated using [Kaplan-Meier](https://en.wikipedia.org/wiki/Kaplan%E2%80%93Meier_estimator).
-
- You can also add an exponential fit:
-
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-projects-survival-exp-fit.png)
-
- Linux – stack plot:
-
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-linux.png)
-
- This curve is produced by the `git-of-theseus-stack-plot` script and shows the total number of lines in a repo broken down into cohorts by the year the code was added.
-
- Node – stack plot:
-
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-node.png)
-
- Rails – stack plot:
-
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-rails.png)
-
- Tensorflow – stack plot:
-
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-tensorflow.png)
-
- Rust – stack plot:
-
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-rust.png)
-
- Plotting other stuff
- --------------------
-
- `git-of-theseus-analyze` will write `exts.json`, `cohorts.json` and `authors.json`. You can run `git-of-theseus-stack-plot authors.json` to plot author statistics as well, or `git-of-theseus-stack-plot exts.json` to plot file extension statistics. For author statistics, you might want to create a [.mailmap](https://git-scm.com/docs/gitmailmap) file in the root directory of the repository to deduplicate authors. If you need to create a .mailmap file the following command can list the distinct author-email combinations in a repository:
-
- Mac / Linux
-
- ```shell
- git log --pretty=format:"%an %ae" | sort | uniq
- ```
-
- Windows Powershell
-
- ```powershell
- git log --pretty=format:"%an %ae" | Sort-Object | Select-Object -Unique
- ```
-
- For instance, here's the author statistics for [Kubernetes](https://github.com/kubernetes/kubernetes):
-
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-kubernetes-authors.png)
-
- You can also normalize it to 100%. Here's author statistics for Git:
-
- ![git](https://raw.githubusercontent.com/erikbern/git-of-theseus/master/pics/git-git-authors-normalized.png)
-
- Other stuff
- -----------
-
- [Markovtsev Vadim](https://twitter.com/tmarkhor) implemented a very similar analysis that claims to be 20%-6x faster than Git of Theseus. It's named [Hercules](https://github.com/src-d/hercules) and there's a great [blog post](https://web.archive.org/web/20180918135417/https://blog.sourced.tech/post/hercules.v4/) about all the complexity going into the analysis of Git history.
@@ -1,6 +0,0 @@
- [console_scripts]
- git-of-theseus-analyze = git_of_theseus.analyze:analyze_cmdline
- git-of-theseus-line-plot = git_of_theseus:line_plot_cmdline
- git-of-theseus-stack-plot = git_of_theseus:stack_plot_cmdline
- git-of-theseus-survival-plot = git_of_theseus:survival_plot_cmdline
- git-of-theseus-visualize = git_of_theseus.cmd:main