git_sme 0.1.0
- checksums.yaml +7 -0
- data/.gitignore +54 -0
- data/.rspec +2 -0
- data/.travis.yml +5 -0
- data/CODE_OF_CONDUCT.md +74 -0
- data/Gemfile +6 -0
- data/Gemfile.lock +41 -0
- data/LICENSE +674 -0
- data/README.md +78 -0
- data/Rakefile +6 -0
- data/bin/console +14 -0
- data/bin/setup +8 -0
- data/exe/git-sme +6 -0
- data/git_sme.gemspec +31 -0
- data/lib/git_sme.rb +9 -0
- data/lib/git_sme/analysis_presenter.rb +71 -0
- data/lib/git_sme/cache.rb +51 -0
- data/lib/git_sme/cli.rb +60 -0
- data/lib/git_sme/commit_analyzer.rb +125 -0
- data/lib/git_sme/commit_loader.rb +149 -0
- data/lib/git_sme/preferences.rb +3 -0
- data/lib/git_sme/version.rb +3 -0
- metadata +151 -0
data/README.md
ADDED
@@ -0,0 +1,78 @@
# Git SME

Git SME analyzes your git repository and identifies subject matter experts for any file
or directory that you would like to know more about. It does this by analyzing all commits made to
all files in your git repository over time and finding out who made the most changes to each file
and directory.

Commits are weighted so that recent commits count for more than older ones, which should mitigate
the effect a legacy coder would have on these reports.
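
To make the weighting concrete, here is a minimal Ruby sketch. It assumes each commit contributes its changed-line count scaled by the inverse cube root of its age in seconds. The names `recency_weight`, `changes`, and `age_seconds` are illustrative and not part of the gem's documented interface.

```ruby
# Hypothetical helper: scale a commit's change count by how old the commit is.
# Assumption: attenuation is age ** (-1/3). The float literal matters because
# Ruby's integer division would turn -1 / 3 into -1.
def recency_weight(changes, age_seconds)
  attenuation = age_seconds.positive? ? age_seconds**(-1.0 / 3) : 1.0
  changes * attenuation
end

one_day  = 24 * 60 * 60
one_year = 365 * one_day

recent = recency_weight(10, one_day)
old    = recency_weight(10, one_year)

puts format('10 changes a day ago:  %.4f', recent)
puts format('10 changes a year ago: %.4f', old)
```

Under this assumption, the same ten changes count roughly seven times less after a year than after a day, so long-departed authors fade from the rankings without disappearing entirely.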

## Installation

Install the gem for command-line usage in the appropriate version of Ruby:

    $ gem install git_sme

This will install the `git-sme` command, which should now be available from everywhere, assuming your
PATH is set up appropriately.

## Usage

Basic usage of `git-sme` is as follows:

    git-sme </path/to/repository> [flags]

This will throw an error if the path is not a git repository. As a rule, you don't have to point to
the `.git` folder in your checked out code because `git-sme` will know to look for that folder as a
child of the folder you **do** provide it.

`git-sme` will output a list of paths (files and directories) and the list of users who it thinks
are the subject matter experts on each of those paths. Users are listed in decreasing order of
expertise:

    $ bundle exec git-sme ~/rails/dinghy --file ~/rails/dinghy/cli/dinghy/preferences.rb ~/rails/dinghy/cli/dinghy/machine.rb
    Repository: /Users/sjaveed/rails/dinghy
    Analyzed: 317 (0/s) 100.00% Time: 00:00:00 |==============================================================================|

    /: brianp, ryan, brian, adrian.falleiro, sally, dev, markse, fgrehm, kallin.nagelberg, matt
    /cli: brianp, ryan, markse, sally, adrian.falleiro, fgrehm, matt, kallin.nagelberg, brian, dev
    /cli/dinghy: brianp, markse, ryan, sally, fgrehm, matt, adrian.falleiro, brian, aisipos, paul.moelders
    /cli/dinghy/machine.rb: brianp, markse, sally, brian, fgrehm, ryan, robertc
    /cli/dinghy/preferences.rb: brianp
    /cli/dinghy/machine: brianp, ryan, fgrehm
    /dinghy: brianp

Based on analysis of a checked out copy of dinghy, I can see that, for the files I'm interested in,
brianp would be a subject matter expert, but I'll probably find some useful information from ryan and
markse as well, since it looks like ryan has touched enough files in the `/cli/dinghy/machine`
directory.

### Flags

Flag | Description
-----|------------
`--branch <branch>` | The branch you want to analyze on the given repository. Defaults to 'master'.
`--user <username1 [username2 ...]>` | An optional list of users to whom you'd like to restrict the analysis. This allows you to see e.g. who might know more about a file given their history of working with it over time.
`--file </path/to/file [/path/to/other/file ...]>` | An optional list of files/directories for which you'd like analysis. Defaults to /. The analysis will also include all directories between a subdirectory and the root of the repository. All file paths are relative to the repository root.
`--cache` | Enabled by default. Caches all commits that the tool loads for a git repository. This allows you to e.g. `git pull` on a large repository and only incur the cost of loading the additional commits, while previously seen commits are loaded much more quickly from the cache.
`--no-cache` | Specify this if you do *not* want caching. You'll probably never need to use this.
`--results <count>` | The number of subject matter experts you'd like to see for each path. Defaults to 10.
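
The `--file` behaviour of including every directory between a path and the repository root can be sketched as follows. The `ancestor_paths` helper is hypothetical and only illustrates the idea; it assumes paths are given relative to the repository root without a leading slash.

```ruby
# Hypothetical helper: list the root, every intermediate directory, and the
# path itself for a repository-relative file name.
def ancestor_paths(filename)
  parts = filename.split('/')
  ['/'] + (1..parts.size).map { |n| "/#{parts.first(n).join('/')}" }
end

puts ancestor_paths('cli/dinghy/machine.rb')
```

This is why asking about a single file also reports experts for `/`, `/cli`, and `/cli/dinghy` in the sample output.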

## Development

After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.

To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).

## Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/sjaveed/git_sme. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.

## License

The gem is available as open source under the terms of the [GPLv3 License](https://www.gnu.org/licenses/gpl-3.0.en.html).

## Code of Conduct

Everyone interacting in the GitSme project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/sjaveed/git_sme/blob/master/CODE_OF_CONDUCT.md).
data/Rakefile
ADDED
data/bin/console
ADDED
@@ -0,0 +1,14 @@
#!/usr/bin/env ruby

require "bundler/setup"
require "git_sme"

# You can add fixtures and/or initialization code here to make experimenting
# with your gem easier. You can also use a different console, if you like.

# (If you use this, don't forget to add pry to your Gemfile!)
# require "pry"
# Pry.start

require "irb"
IRB.start(__FILE__)
data/bin/setup
ADDED
data/exe/git-sme
ADDED
data/git_sme.gemspec
ADDED
@@ -0,0 +1,31 @@
# coding: utf-8
lib = File.expand_path("../lib", __FILE__)
$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
require "git_sme/version"

Gem::Specification.new do |spec|
  spec.name = "git_sme"
  spec.version = GitSme::VERSION
  spec.authors = ["Shahbaz Javeed"]
  spec.email = ["sjaveed@gmail.com"]

  spec.summary = %q{Identify subject matter experts by analyzing your git repository}
  spec.description = %q{Analyze your git repository and determine subject matter experts by identifying everyone who has touched a file with preference given to recent touches}
  spec.homepage = "https://github.com/sjaveed/git_sme"
  spec.license = "MIT"

  spec.files = `git ls-files -z`.split("\x0").reject do |f|
    f.match(%r{^(test|spec|features)/})
  end
  spec.bindir = "exe"
  spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
  spec.require_paths = ["lib"]

  spec.add_development_dependency "bundler", "~> 1.15"
  spec.add_development_dependency "rake", "~> 10.0"
  spec.add_development_dependency "rspec", "~> 3.2"

  spec.add_dependency 'ruby-progressbar'
  spec.add_dependency 'rugged'
  spec.add_dependency 'thor'
end
data/lib/git_sme.rb
ADDED
data/lib/git_sme/analysis_presenter.rb
ADDED
@@ -0,0 +1,71 @@
module GitSme
  class AnalysisPresenter
    attr_reader :valid, :error_message

    alias_method :valid?, :valid

    def initialize(commit_analyzer, users = [], files = [])
      @commit_analyzer = commit_analyzer
      @users = users
      @files = files
      # With no filters at all, report on the repository root.
      @files = ['/'] unless @users.any? || @files.any?

      @valid = @commit_analyzer.valid?
      @error_message = @commit_analyzer.error_message
    end

    # Returns an array of single-entry hashes, each mapping a user or path to
    # its highest-scoring counterparts, best first.
    def get_relevant_analyses(results_to_show = 10)
      @commit_analyzer.analyze unless @commit_analyzer.analyzed?

      users_to_match = @users.any? ? get_matching_keys(@commit_analyzer.analysis[:by_user].keys, @users) : []
      files_to_match = @files.any? ? get_matching_keys(@commit_analyzer.analysis[:by_file].keys, @files) : []
      presentable_data = []

      if users_to_match.any? && files_to_match.any?
        users_to_match.each do |user|
          user_data = @commit_analyzer.analysis[:by_user][user].select { |k, _v| files_to_match.include?(k) }
          presentable_data << presentable_file_or_user({ user => user_data }, user, results_to_show: results_to_show)
        end

        puts

        files_to_match.each do |file|
          user_data = @commit_analyzer.analysis[:by_file][file].select { |k, _v| users_to_match.include?(k) }
          presentable_data << presentable_file_or_user({ file => user_data }, file)
        end
      elsif users_to_match.any?
        get_matching_keys(@commit_analyzer.analysis[:by_user].keys, users_to_match).each do |user|
          presentable_data << presentable_file_or_user(@commit_analyzer.analysis[:by_user], user)
        end
      elsif files_to_match.any?
        get_matching_keys(@commit_analyzer.analysis[:by_file].keys, files_to_match).each do |path|
          presentable_data << presentable_file_or_user(@commit_analyzer.analysis[:by_file], path)
        end
      end

      presentable_data.compact
    end

    private

    def presentable_file_or_user(data, key, results_to_show: 10)
      stats = data[key]
      info_to_show = sort_keys_by_value(stats).first(results_to_show)
      return if info_to_show.empty?

      {
        key => info_to_show
      }
    end

    # Keys ordered from highest to lowest score.
    def sort_keys_by_value(data)
      data.keys.sort_by { |k| data[k] }.reverse
    end

    def get_matching_keys(all_keys, keys_to_match)
      all_keys.select do |key|
        keys_to_match.any? { |matcher| matcher.match?(key) }
      end
    end
  end
end
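
The ranking step can be illustrated with a small standalone sketch. It assumes per-path statistics shaped like a hash of author name to weighted score, which is how `sort_keys_by_value` appears to consume them. The sample scores and the `top_experts` helper are made up for illustration.

```ruby
# Hypothetical per-path stats: author => weighted change score (made-up numbers).
stats = { 'brianp' => 42.5, 'ryan' => 17.0, 'markse' => 21.25, 'sally' => 3.5 }

# Sort authors by descending score and keep the first `limit` of them.
def top_experts(stats, limit = 10)
  stats.keys.sort_by { |author| -stats[author] }.first(limit)
end

puts top_experts(stats, 3).join(', ')
```

Negating the score sorts in descending order directly; when all scores are distinct this gives the same order as sorting ascending and reversing.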
data/lib/git_sme/cache.rb
ADDED
@@ -0,0 +1,51 @@
require 'fileutils'
require 'yaml'

require_relative 'preferences'

module GitSme
  class Cache
    def initialize(name, enabled: true, directory: 'cache', file_prefix: '', file_suffix: '')
      # Reject nil, empty, and whitespace-only names.
      raise "Invalid cache name: [#{name}]" if name.nil? || name.strip.empty?

      # Keep only letters and dashes so the name is safe to use in a filename.
      @name = name.gsub(/[^a-zA-Z-]/, '').strip
      @enabled = enabled
      @cache_directory = File.join(PREFERENCES_HOME, directory)
      @file_prefix = file_prefix
      @file_suffix = file_suffix

      FileUtils.mkdir_p(@cache_directory) unless File.exist?(@cache_directory)
    end

    # Returns the cached data, or an empty array when caching is disabled or
    # nothing has been cached yet.
    def load
      return [] unless @enabled && File.exist?(cache_filename)

      YAML.load(File.read(cache_filename))
    end

    def save(data)
      return unless @enabled

      File.open(cache_filename, 'w') { |f| f.write(YAML.dump(data)) }
    end

    private

    def prefix
      return '' if @file_prefix =~ /^\s*$/

      "#{@file_prefix}-"
    end

    def suffix
      return '' if @file_suffix =~ /^\s*$/

      "-#{@file_suffix}"
    end

    def cache_filename
      filename = @name

      File.join(@cache_directory, "#{prefix}#{filename}#{suffix}.yml")
    end
  end
end
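
A minimal sketch of the save/load round trip that `Cache` performs, using only the standard library and assuming Ruby 2.6 or newer. The temporary directory, the file name, and the sample data are assumptions for illustration. The sketch also swaps in `YAML.safe_load` with `Symbol` explicitly permitted, in place of the plain `YAML.load` call in the class.

```ruby
require 'yaml'
require 'tmpdir'

# Made-up analysis data shaped like a by_user / by_file summary.
analysis = {
  by_user: { 'alice' => { '/lib' => 2.5 } },
  by_file: { '/lib' => { 'alice' => 2.5 } }
}

restored = nil
Dir.mktmpdir do |dir|
  cache_file = File.join(dir, 'example-master-analysis.yml')

  # Save: serialize the hash to YAML on disk.
  File.write(cache_file, YAML.dump(analysis))

  # Load: symbol keys must be permitted explicitly when using safe_load.
  restored = YAML.safe_load(File.read(cache_file), permitted_classes: [Symbol], aliases: true)
end

puts restored == analysis
```

Restricting the permitted classes is a reasonable precaution for files that live in a user-writable preferences directory.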
data/lib/git_sme/cli.rb
ADDED
@@ -0,0 +1,60 @@
require 'thor'
require 'ruby-progressbar'
require 'git_sme'

module GitSme
  class CLI < Thor
    desc 'analyze <repository> [--branch <branch>] [--user <username>] [--file </path/to/file>] [--cache | --no-cache] [--results <count>]',
      'Analyze the repository and determine the subject matter experts for the given files, limiting them to the users provided if needed'

    method_option :branch, type: :string, default: 'master'
    method_option :user, type: :array, default: []
    method_option :file, type: :array, default: ['/']
    method_option :cache, type: :boolean, default: true
    method_option :results, type: :numeric, default: 10

    def analyze(repository)
      loader = GitSme::CommitLoader.new(repository, branch: options[:branch], enable_cache: options[:cache])
      unless loader.valid?
        puts "Error: #{loader.error_message}"
        return
      end

      puts "Repository: #{loader.repo.path.gsub('/.git/', '')}"

      loader_progress = ProgressBar.create(starting_at: 0, format: 'Loaded: %c (%R/s) %P%% %f |%B|')
      loader.load do |new_commit_count, processed_commit_count, all_commit_count|
        loader_progress.total = all_commit_count
        loader_progress.increment
      end

      analyzer = GitSme::CommitAnalyzer.new(loader, enable_cache: false)
      unless analyzer.valid?
        puts "Error: #{analyzer.error_message}"
        return
      end

      analyzer_progress = ProgressBar.create(starting_at: 0, total: loader.commits.size, format: 'Analyzed: %c (%R/s) %P%% %f |%B|')
      analyzer.analyze do |commit_count, total_commits|
        analyzer_progress.increment
      end

      presenter = AnalysisPresenter.new(analyzer, options[:user], options[:file])
      analyses = presenter.get_relevant_analyses(options[:results].to_i)

      puts

      if !analyses.empty?
        analyses.each do |result|
          result.each do |path, users|
            puts "#{path}: #{users.join(', ')}"
          end
        end
      else
        puts 'No data found!'
      end
    end

    default_task :analyze
  end
end
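
The output loop at the end of `analyze` can be tried in isolation. This sketch assumes the presenter returns an array of single-entry hashes mapping a path to an ordered list of user names, as the loop suggests. The sample data and the `format_analyses` helper are hypothetical.

```ruby
# Hypothetical presenter output: one { path => [users...] } hash per path.
analyses = [
  { '/' => %w[brianp ryan] },
  { '/cli' => %w[brianp markse] }
]

# Turn each path/users pair into a "path: user1, user2" line.
def format_analyses(analyses)
  return ['No data found!'] if analyses.empty?

  analyses.flat_map do |result|
    result.map { |path, users| "#{path}: #{users.join(', ')}" }
  end
end

puts format_analyses(analyses)
```

Returning the lines instead of printing them directly keeps the formatting easy to test without capturing standard output.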
data/lib/git_sme/commit_analyzer.rb
ADDED
@@ -0,0 +1,125 @@
module GitSme
  class CommitAnalyzer
    attr_reader :valid, :error_message, :analysis, :analyzed

    alias_method :valid?, :valid
    alias_method :analyzed?, :analyzed

    def initialize(commit_loader, enable_cache: true)
      # Honor the caller's choice instead of always enabling the cache.
      @enable_cache = enable_cache
      @commit_loader = commit_loader
      @analyzed = false
      @valid = @commit_loader.valid?
      @error_message = @commit_loader.error_message

      @analysis = {}
      @cache = GitSme::Cache.new(@commit_loader.repo.path.gsub('/.git/', ''),
        enabled: @enable_cache, file_suffix: "#{@commit_loader.branch}-analysis"
      )
    end

    def analyze(force: false)
      return unless valid?
      return if analyzed? && !force

      @commit_loader.load
      @analysis = @cache.load
      new_analysis = []

      if !@commit_loader.new_commits? || !@analysis.any?
        # No usable cached analysis (or nothing new): analyze every commit.
        if block_given?
          @analysis = analyze_new_commits(@commit_loader.commits) do |commit_count, total_commits|
            yield(commit_count, total_commits)
          end
        else
          @analysis = analyze_new_commits(@commit_loader.commits)
        end
      elsif @commit_loader.new_commits?
        # Analyze only the new commits and fold them into the cached totals.
        new_analysis = if block_given?
          analyze_new_commits(@commit_loader.new_commits) do |commit_count, total_commits|
            yield(commit_count, total_commits)
          end
        else
          analyze_new_commits(@commit_loader.new_commits)
        end

        summed_merge(@analysis[:by_user], new_analysis[:by_user])
        summed_merge(@analysis[:by_file], new_analysis[:by_file])
      end

      @cache.save(@analysis)
      @analyzed = true
    end

    private

    def analyze_new_commits(commits_to_process)
      user_stats = {}
      file_stats = {}
      now = Time.now.to_i
      commit_count = commits_to_process.size

      commits_to_process.each_with_index do |commit, current_commit_idx|
        author = commit[:author]
        time_delta = now - commit[:timestamp]

        commit[:file_changes].each do |filename, change_details|
          # Credit the file itself and every directory above it.
          all_affected_paths(filename).each do |path|
            change_value = weighted_value(change_details[:changes], time_delta)

            user_stats[author] = {} unless user_stats.key?(author)
            user_stats[author][path] = 0 unless user_stats[author].key?(path)

            file_stats[path] = {} unless file_stats.key?(path)
            file_stats[path][author] = 0 unless file_stats[path].key?(author)

            user_stats[author][path] += change_value
            file_stats[path][author] += change_value
          end
        end

        if block_given?
          yield(current_commit_idx, commit_count)
        end
      end

      {
        by_user: user_stats,
        by_file: file_stats
      }
    end

    # Adds the scores in new_data to cached_data in place.
    def summed_merge(cached_data, new_data)
      return if new_data.nil? || cached_data.nil?

      new_data.each do |key, value_hash|
        if cached_data.key?(key)
          value_hash.each do |value_key, value|
            if cached_data[key].key?(value_key)
              cached_data[key][value_key] += value
            else
              cached_data[key][value_key] = value
            end
          end
        else
          cached_data[key] = value_hash
        end
      end
    end

    def all_affected_paths(filename)
      ['/'] + filename.split('/').each_with_object([]) do |path_part, path_list|
        path_list << [path_list[-1], path_part].join('/')
      end
    end

    def weighted_value(value, time_delta)
      # Attenuate by the inverse cube root of the commit's age in seconds.
      # The float literal is required: -1/3 is integer division and yields -1.
      value_attenuation = time_delta > 0 ? time_delta ** (-1.0 / 3) : 1

      (value * value_attenuation).to_f
    end
  end
end
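
The incremental update merges scores from newly analyzed commits into cached totals. As a sketch of that idea, the nested sum can also be written with `Hash#merge` and a block. Unlike `summed_merge`, this hypothetical `merge_scores` helper returns a new hash instead of mutating the cached one, and the sample numbers are invented.

```ruby
# Hypothetical helper: combine two { outer => { inner => score } } hashes,
# adding scores that share both keys and keeping everything else.
def merge_scores(cached, fresh)
  cached.merge(fresh) do |_outer, cached_scores, fresh_scores|
    cached_scores.merge(fresh_scores) { |_inner, old, added| old + added }
  end
end

cached = { 'alice' => { '/lib' => 2.0, '/' => 2.0 } }
fresh  = { 'alice' => { '/lib' => 1.5 }, 'bob' => { '/' => 0.5 } }

merged = merge_scores(cached, fresh)
puts merged.inspect
```

The non-mutating form makes it easier to compare totals before and after a merge, at the cost of allocating a new hash on every update.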