bnet_scraper 0.0.1

@@ -0,0 +1,7 @@
1
+ *.gem
2
+ .bundle
3
+ Gemfile.lock
4
+ pkg/*
5
+ .rvmrc
6
+ doc/*
7
+ .yardoc/*
data/.rspec ADDED
@@ -0,0 +1,2 @@
1
+ --color
2
+ --format nested
@@ -0,0 +1,4 @@
1
+ notifications:
2
+ email:
3
+ recipients:
4
+ - anordman@majorleaguegaming.com
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source "http://rubygems.org"
2
+
3
+ # Specify your gem's dependencies in bnet_scraper.gemspec
4
+ gemspec
data/LICENSE ADDED
@@ -0,0 +1,20 @@
1
+ Copyright (c) 2012 Andrew Nordman
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,136 @@
1
+ # BnetScraper
2
+
3
+ BnetScraper is a Nokogiri-based scraper of Battle.net profile information. Currently this only includes Starcraft2.
4
+
5
+ # Installation
6
+
7
+ Run `gem install bnet_scraper` or add `gem 'bnet_scraper'` to your `Gemfile`.
8
+
9
+ # Usage
10
+
11
+ All of the scrapers take an options hash, and can be created by either passing a URL string for the profile URL or
12
+ passing the account information in the options hash. Thus, either of these two approaches works:
13
+
14
+ ``` ruby
15
+ BnetScraper::Starcraft2::ProfileScraper.new(url: 'http://us.battle.net/sc2/en/profile/12345/1/TestAccount/')
16
+ BnetScraper::Starcraft2::ProfileScraper.new(bnet_id: '12345', account: 'TestAccount', region: 'na')
17
+ ```
18
+
19
+ There are several scrapers that pull various information. They are:
20
+
21
+ * BnetScraper::Starcraft2::ProfileScraper - collects basic profile information and an array of league URLs
22
+ * BnetScraper::Starcraft2::LeagueScraper - collects data on a particular league for a particular Battle.net account
23
+ * BnetScraper::Starcraft2::AchievementScraper - collects achievement data for the account
24
+ * BnetScraper::Starcraft2::MatchHistoryScraper - collects the 25 most recent matches played on the account
25
+
26
+ All scrapers have a `#scrape` method that performs the scrape, stores the result, and returns it.
27
+ An additional `#output` method returns the stored result on later calls without re-scraping.
28
+
29
+ ## BnetScraper::Starcraft2::ProfileScraper
30
+
31
+ This pulls basic profile information for an account, as well as an array of league URLs. This is a good starting
32
+ point for league scraping as it provides the league URLs necessary to do supplemental scraping.
33
+
34
+ ``` ruby
35
+ scraper = BnetScraper::Starcraft2::ProfileScraper.new(url: 'http://us.battle.net/sc2/en/profile/2377239/1/Demon/')
36
+ scraper.scrape
37
+ # => {
38
+ bnet_id: '2377239',
39
+ account: 'Demon',
40
+ bnet_index: 1,
41
+ race: 'Protoss',
42
+ wins: '684',
43
+ achievement_points: '3630',
44
+ leagues: [
45
+ {
46
+ name: "1v1 Platinum Rank 95",
47
+ id: "96905",
48
+ href: "http://us.battle.net/sc2/en/profile/2377239/1/Demon/ladder/96905#current-rank"
49
+ }
50
+ ]
51
+ }
52
+ ```
53
+
54
+ ## BnetScraper::Starcraft2::LeagueScraper
55
+
56
+ This pulls information on a specific league for a specific account. It is best used either in conjunction with a
57
+ profile scrape that provides a URL, or if you happen to know the specific league\_id and can pass it as an option.
58
+
59
+ ``` ruby
60
+ scraper = BnetScraper::Starcraft2::LeagueScraper.new(league_id: '12345', account: 'Demon', bnet_id: '2377239')
61
+ scraper.scrape
62
+ # => {
63
+ season: '6',
64
+ name: 'Aleksander Pepper',
65
+ division: 'Diamond',
66
+ size: '4v4',
67
+ random: false,
68
+ bnet_id: '2377239',
69
+ account: 'Demon'
70
+ }
71
+ ```
72
+
73
+ ## BnetScraper::Starcraft2::AchievementScraper
74
+
75
+ This pulls achievement information for an account. Note that it currently returns only the overall achievements,
76
+ not the in-depth, by-category achievement information.
77
+
78
+ ``` ruby
79
+ scraper = BnetScraper::Starcraft2::AchievementScraper.new(url: 'http://us.battle.net/sc2/en/profile/2377239/1/Demon/')
80
+ scraper.scrape
81
+ # => {
82
+ recent: [
83
+ { title: 'Blink of an Eye', description: 'Complete round 24 in "Starcraft Master" without losing any stalkers', earned: '3/5/2012' },
84
+ { title: 'Whack-a-Roach', description: 'Complete round 9 in "Starcraft Master" in under 45 seconds', earned: '3/5/2012' },
85
+ { title: 'Safe Zone', description: 'Complete round 8 in "Starcraft Master" without losing any stalkers', earned: '3/5/2012' },
86
+ { title: 'Starcraft Master', description: 'Complete all 30 rounds in "Starcraft Master"', earned: '3/5/2012' },
87
+ { title: 'Starcraft Expert', description: 'Complete any 25 rounds in "Starcraft Master"', earned: '3/5/2012' },
88
+ { title: 'Starcraft Apprentice', description: 'Complete any 20 rounds in "Starcraft Master"', earned: '3/5/2012' }
89
+ ],
90
+ showcase: [
91
+ { title: 'Hot Shot', description: 'Finish a Qualification Round with an undefeated record.' },
92
+ { title: 'Starcraft Master', description: 'Complete all rounds in "Starcraft Master"' },
93
+ { title: 'Team Protoss 500', description: 'Win 500 team league matches as Protoss' },
94
+ { title: 'Night of the Living III', description: 'Survive 15 Infested Horde Attacks in the "Night 2 Die" mode of the "Left 2 Die" scenario.' },
95
+ { title: 'Team Top 100 Diamond', description: 'Finish a Season in Team Diamond Division' }
96
+
97
+ ],
98
+ progress: {
99
+ liberty_campaign: '1580',
100
+ exploration: '480',
101
+ custom_game: '330',
102
+ cooperative: '660',
103
+ quick_match: '170'
104
+ }
105
+ }
106
+ ```
107
+
108
+ ## BnetScraper::Starcraft2::MatchHistoryScraper
109
+
110
+ This pulls the 25 most recent matches played for an account. Note that this is only as up-to-date as battle.net is, and
111
+ will likely not be as fast as in-game.
112
+
113
+ ``` ruby
114
+ scraper = BnetScraper::Starcraft2::MatchHistoryScraper.new(url: 'http://us.battle.net/sc2/en/profile/2377239/1/Demon/')
115
+ scraper.scrape
116
+ # => {
117
+ wins: '15',
118
+ losses: '10',
119
+ matches: [
120
+ { map_name: 'Bx Monobattle - Sand Canyon (Fix)', outcome: :win, type: 'Custom', date: '3/12/2012' },
121
+ { map_name: 'Deadlock Ridge', outcome: :loss, type: '4v4', date: '3/12/2012' },
122
+ { map_name: 'District 10', outcome: :win, type: '4v4', date: '3/12/2012' },
123
+ # ...
124
+ ]
125
+ }
126
+ ```
127
+
128
+ # Contribute!
129
+
130
+ I would love to see contributions! Please send a pull request with a feature branch containing specs
131
+ (chances are excellent that I will break it if you do not) and I will take a look. Please do not change the version
132
+ as I tend to bundle multiple fixes together before releasing a new version anyway.
133
+
134
+ # Author
135
+
136
+ Written by [Andrew Nordman](http://github.com/cadwallion), see LICENSE for details.
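The `#scrape` / `#output` contract described in the README can be sketched roughly as follows. This is a minimal illustration only: `CachedScraperSketch` and its canned result are hypothetical stand-ins that replace the HTTP request so the example runs offline, and they are not part of the gem.

```ruby
# Hypothetical stand-in for a scraper; not part of bnet_scraper.
class CachedScraperSketch
  attr_reader :fetch_count

  def initialize
    @fetch_count = 0
    @result = nil
  end

  # Simulates a scrape: counts the "request", stores the result, returns it.
  def scrape
    @fetch_count += 1
    @result = { account: 'TestAccount', wins: '10' } # canned data, assumption
    output
  end

  # Returns the stored result without scraping again.
  def output
    @result
  end
end

scraper = CachedScraperSketch.new
scraper.scrape   # performs the (simulated) fetch
scraper.output   # reuses the stored result
```

Calling `#output` repeatedly leaves `fetch_count` unchanged, which is the point of keeping the two methods separate.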
@@ -0,0 +1,10 @@
1
+ require "bundler/gem_tasks"
2
+ require 'rspec/core/rake_task'
3
+
4
+ RSpec::Core::RakeTask.new(:spec) do |spec|
5
+ spec.pattern = 'spec/**/*_spec.rb'
6
+ spec.rspec_opts = ['--backtrace']
7
+ # spec.ruby_opts = ['-w']
8
+ end
9
+
10
+ task :default => :spec
@@ -0,0 +1,24 @@
1
+ $:.push File.join(File.dirname(__FILE__), '..', 'lib')
2
+
3
+ require 'benchmark'
4
+ require 'bnet_scraper'
5
+
6
+ BNET_ACCOUNTS = [
7
+ { bnet_id: 2377239, name: 'Demon', race: 'protoss' },
8
+ { bnet_id: 2035618, name: 'Mykaelos', race: 'random' },
9
+ { bnet_id: 1826063, name: 'Fyrefly', race: 'zerg' },
10
+ { bnet_id: 2539344, name: 'Cadwallion', race: 'terran' }
11
+ ]
12
+
13
+ def parse_random_account
14
+ account = BNET_ACCOUNTS.sample
15
+ BnetScraper::Starcraft2::ProfileScraper.new(bnet_id: account[:bnet_id], account: account[:name]).scrape
16
+ end
17
+
18
+ Benchmark.bmbm(7) do |x|
19
+ x.report('1') { parse_random_account }
20
+ x.report('5') { 5.times { parse_random_account } }
21
+ x.report('10') { 10.times { parse_random_account } }
22
+ x.report('25') { 25.times { parse_random_account } }
23
+ x.report('50') { 50.times { parse_random_account } }
24
+ end
@@ -0,0 +1,22 @@
1
+ # -*- encoding: utf-8 -*-
2
+ $:.push File.expand_path("../lib", __FILE__)
3
+
4
+ Gem::Specification.new do |s|
5
+ s.name = "bnet_scraper"
6
+ s.version = "0.0.1"
7
+ s.authors = ["Andrew Nordman"]
8
+ s.email = ["anordman@majorleaguegaming.com"]
9
+ s.homepage = "https://github.com/agoragames/bnet_scraper/"
10
+ s.summary = %q{Battle.net Profile Scraper}
11
+ s.description = %q{BnetScraper is a Nokogiri-based scraper of Battle.net profile information. Currently this only includes Starcraft2.}
12
+
13
+ s.files = `git ls-files`.split("\n")
14
+ s.test_files = `git ls-files -- {test,spec,features}/*`.split("\n")
15
+ s.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
16
+ s.require_paths = ["lib"]
17
+
18
+ s.add_runtime_dependency 'nokogiri'
19
+ s.add_development_dependency 'rake'
20
+ s.add_development_dependency 'rspec'
21
+ s.add_development_dependency 'fakeweb'
22
+ end
@@ -0,0 +1,7 @@
1
+ require 'bnet_scraper/starcraft2'
2
+ require 'net/http'
3
+ require 'open-uri'
4
+ require 'nokogiri'
5
+
6
+ module BnetScraper
7
+ end
@@ -0,0 +1,46 @@
1
+ require 'bnet_scraper/starcraft2/base_scraper'
2
+ require 'bnet_scraper/starcraft2/profile_scraper'
3
+ require 'bnet_scraper/starcraft2/league_scraper'
4
+ require 'bnet_scraper/starcraft2/achievement_scraper'
5
+ require 'bnet_scraper/starcraft2/match_history_scraper'
6
+
7
+ module BnetScraper
8
+ # This module contains everything about scraping Starcraft 2 Battle.net accounts.
9
+ # See `BnetScraper::Starcraft2::ProfileScraper` and `BnetScraper::Starcraft2::LeagueScraper`
10
+ # for more details
11
+ module Starcraft2
12
+ REGIONS = {
13
+ 'na' => { domain: 'us.battle.net', dir: 'en' },
14
+ 'eu' => { domain: 'eu.battle.net', dir: 'en' },
15
+ 'cn' => { domain: 'www.battlenet.com.cn', dir: 'zh' },
16
+ 'sea' => { domain: 'sea.battle.net', dir: 'en' },
17
+ 'fea' => { domain: 'tw.battle.net', dir: 'zh' }
18
+ }
19
+
20
+ # This is a convenience method that chains calls to ProfileScraper,
21
+ # followed by a scrape of each league returned in the `leagues` array
22
+ # in the profile_data. The end result is a fully scraped profile with
23
+ # profile and league data in a hash.
24
+ #
25
+ # See `BnetScraper::Starcraft2::ProfileScraper` for more information on
26
+ # the parameters being sent to `#full_profile_scrape`.
27
+ #
28
+ # @param bnet_id - Battle.net Account ID
29
+ # @param account - Battle.net Account Name
30
+ # @param region - Battle.net Account Region
31
+ # @return profile_data - Hash containing complete profile and league data
32
+ # scraped from the website
33
+ def self.full_profile_scrape bnet_id, account, region = 'na'
34
+ profile_scraper = ProfileScraper.new bnet_id: bnet_id, account: account, region: region
35
+ profile_output = profile_scraper.scrape
36
+
37
+ parsed_leagues = []
38
+ profile_output[:leagues].each do |league|
39
+ league_scraper = LeagueScraper.new url: league[:href]
40
+ parsed_leagues << league_scraper.scrape
41
+ end
42
+ profile_output[:leagues] = parsed_leagues
43
+ return profile_output
44
+ end
45
+ end
46
+ end
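The chaining done by `full_profile_scrape` (scrape the profile, then replace each league stub with a scraped league) can be sketched without network access. In this sketch `merge_league_data` and the `league_fetcher` lambda are hypothetical names; the lambda stands in for `LeagueScraper.new(url: href).scrape`.

```ruby
# Hypothetical helper illustrating the flow of full_profile_scrape.
# league_fetcher stands in for LeagueScraper.new(url: href).scrape.
def merge_league_data(profile_output, league_fetcher)
  merged = profile_output.dup # shallow copy; the input hash keeps its own :leagues
  merged[:leagues] = profile_output[:leagues].map do |league|
    league_fetcher.call(league[:href])
  end
  merged
end

profile = {
  account: 'Demon',
  leagues: [
    { name: '1v1 Platinum Rank 95', id: '96905',
      href: 'http://us.battle.net/sc2/en/profile/2377239/1/Demon/ladder/96905#current-rank' }
  ]
}

# Fake fetcher returning canned league data (assumption, no HTTP involved).
fake_fetcher = ->(href) { { account: 'Demon', division: 'Platinum', source: href } }
full = merge_league_data(profile, fake_fetcher)
```

Unlike the method in the diff, this sketch returns a new hash rather than overwriting `:leagues` on the scraped profile; that is a design choice of the sketch, not of the gem.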
@@ -0,0 +1,63 @@
1
+ module BnetScraper
2
+ module Starcraft2
3
+ class AchievementScraper < BaseScraper
4
+ attr_reader :recent, :progress, :showcase, :response
5
+
6
+ def scrape
7
+ get_response
8
+ scrape_recent
9
+ scrape_progress
10
+ scrape_showcase
11
+ output
12
+ end
13
+
14
+ def get_response
15
+ @response = Nokogiri::HTML(open(profile_url+"achievements/"))
16
+ end
17
+
18
+ def scrape_recent
19
+ @recent = []
20
+ 6.times do |num|
21
+ achievement = {}
22
+ div = response.css("#achv-recent-#{num}")
23
+ unless div.empty?
24
+ achievement[:title] = div.css("div > div").inner_text.strip
25
+ achievement[:description] = div.inner_text.gsub(achievement[:title], '').strip
26
+ achievement[:earned] = response.css("#recent-achievements span")[(num*3)+1].inner_text
27
+
28
+ @recent << achievement
29
+ end
30
+ end
31
+ @recent
32
+ end
33
+
34
+ def scrape_progress
35
+ progress_ach = response.css("#progress-module .achievements-progress:nth(2) span")
36
+ @progress = {
37
+ liberty_campaign: progress_ach[0].inner_text,
38
+ exploration: progress_ach[1].inner_text,
39
+ custom_game: progress_ach[2].inner_text,
40
+ cooperative: progress_ach[3].inner_text,
41
+ quick_match: progress_ach[4].inner_text,
42
+ }
43
+ end
44
+
45
+ def scrape_showcase
46
+ @showcase = response.css("#showcase-module .progress-tile").map do |achievement|
47
+ hsh = { title: achievement.css('.tooltip-title').inner_text.strip }
48
+ hsh[:description] = achievement.css('div').inner_text.gsub(hsh[:title], '').strip
49
+ hsh
50
+ end
51
+ @showcase
52
+ end
53
+
54
+ def output
55
+ {
56
+ recent: @recent,
57
+ progress: @progress,
58
+ showcase: @showcase
59
+ }
60
+ end
61
+ end
62
+ end
63
+ end
@@ -0,0 +1,53 @@
1
+ module BnetScraper
2
+ module Starcraft2
3
+ class BaseScraper
4
+ attr_reader :bnet_id, :account, :region, :bnet_index, :url
5
+
6
+ def initialize options = {}
7
+ if options[:url]
8
+ extracted_data = options[:url].match(/http:\/\/(.+)\/sc2\/(.+)\/profile\/(.+)\/(\d{1})\/(.[^\/]+)\//)
9
+ @region = REGIONS.key({ domain: extracted_data[1], dir: extracted_data[2] })
10
+ @bnet_id = extracted_data[3]
11
+ @bnet_index = extracted_data[4]
12
+ @account = extracted_data[5]
13
+ @url = options[:url]
14
+ elsif options[:bnet_id] && options[:account]
15
+ @bnet_id = options[:bnet_id]
16
+ @account = options[:account]
17
+ @region = options[:region] || 'na'
18
+ if options[:bnet_index]
19
+ @bnet_index = options[:bnet_index]
20
+ else
21
+ set_bnet_index
22
+ end
23
+ end
24
+ end
25
+
26
+ # set_bnet_index
27
+ #
28
+ # Because profile URLs have to have a specific bnet_index that is seemingly incalculable,
29
+ # we must ping both variants to determine the correct bnet_index. We then store that value.
30
+ def set_bnet_index
31
+ [1,2].each do |idx|
32
+ res = Net::HTTP.get_response URI profile_url idx
33
+ if res.is_a? Net::HTTPSuccess
34
+ @bnet_index = idx
35
+ return
36
+ end
37
+ end
38
+ end
39
+
40
+ def profile_url bnet_index = @bnet_index
41
+ "http://#{region_info[:domain]}/sc2/#{region_info[:dir]}/profile/#{bnet_id}/#{bnet_index}/#{account}/"
42
+ end
43
+
44
+ def region_info
45
+ REGIONS[region]
46
+ end
47
+
48
+ def scrape
49
+ raise NotImplementedError, "Abstract method #scrape called."
50
+ end
51
+ end
52
+ end
53
+ end
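The URL branch of `BaseScraper#initialize` splits a profile URL into region, ID, index and account. A standalone sketch of that step is below. `parse_profile_url`, `SKETCH_REGIONS` and `PROFILE_URL_PATTERN` are hypothetical names, the region table is a two-entry subset copied from the `REGIONS` constant, and the pattern is a slightly stricter variant of the one in the diff. Unlike the original, the sketch returns `nil` for a URL that does not match.

```ruby
# Subset of the REGIONS table, copied here so the sketch is self-contained.
SKETCH_REGIONS = {
  'na'  => { domain: 'us.battle.net',  dir: 'en' },
  'sea' => { domain: 'sea.battle.net', dir: 'en' }
}.freeze

# Stricter variant of the pattern in BaseScraper (assumption: numeric IDs).
PROFILE_URL_PATTERN = %r{\Ahttp://([^/]+)/sc2/([^/]+)/profile/(\d+)/(\d)/([^/]+)/}

# Returns a hash of the URL's components, or nil when the URL does not match.
def parse_profile_url(url)
  match = PROFILE_URL_PATTERN.match(url)
  return nil unless match

  domain, dir, bnet_id, bnet_index, account = match.captures
  {
    # Reverse lookup: find the region key whose value equals this domain/dir pair.
    region: SKETCH_REGIONS.key({ domain: domain, dir: dir }),
    bnet_id: bnet_id,
    bnet_index: bnet_index,
    account: account
  }
end

parse_profile_url('http://us.battle.net/sc2/en/profile/2377239/1/Demon/')
```

Note that every captured value, including `bnet_index`, is a string.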
@@ -0,0 +1,50 @@
1
+ module BnetScraper
2
+ module Starcraft2
3
+ # LeagueScraper
4
+ #
5
+ # Extracts information from an SC2 League URL and scrapes for information.
6
+ # Example:
7
+ # league_data = BnetScraper::Starcraft2::LeagueScraper.new(url: "http://us.battle.net/sc2/en/profile/2377239/1/Demon/ladder/96905").scrape
8
+ # league_data # => { bnet_id: '2377239', account: 'Demon', season: '6', size: '4v4', name: "Aleksander Pepper", division: "Diamond", random: false }
9
+ #
10
+ # @param [Hash] options - Either :url (the league URL on battle.net) or :league_id with :bnet_id and :account
11
+ # @return [Hash] league_data - Hash of data extracted
12
+ class LeagueScraper < BaseScraper
13
+ attr_reader :league_id, :season, :size, :random, :name, :division
14
+
15
+ def initialize options = {}
16
+ super(options)
17
+
18
+ if options[:url]
19
+ @league_id = options[:url].match(/http:\/\/.+\/sc2\/.+\/profile\/.+\/\d{1}\/.+\/ladder\/([^#\/]+)/).to_a[1]
20
+ else
21
+ @league_id = options[:league_id]
22
+ end
23
+ end
24
+
25
+ def scrape
26
+ @response = Nokogiri::HTML(open(@url || "#{profile_url}ladder/#{@league_id}"))
27
+ value = @response.css(".data-title .data-label h3").inner_text().strip
28
+ header_regex = /Season (\d{1}) - \s+(\dv\d)( Random)? (\w+)\s+Division (.+)/
29
+ header_values = value.match(header_regex).to_a
30
+ header_values.shift()
31
+ @season, @size, @random, @division, @name = header_values
32
+
33
+ @random = !@random.nil?
34
+ output
35
+ end
36
+
37
+ def output
38
+ {
39
+ season: @season,
40
+ size: @size,
41
+ name: @name,
42
+ division: @division,
43
+ random: @random,
44
+ bnet_id: @bnet_id,
45
+ account: @account
46
+ }
47
+ end
48
+ end
49
+ end
50
+ end
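The header parsing in `LeagueScraper#scrape` and the league ID extraction in its constructor can be tried offline with the sketch below. The header regex is copied from the diff. The sample header string is an assumption about what the page's `h3` text looks like, and `parse_league_header` and `extract_league_id` are hypothetical helper names.

```ruby
# Header regex copied from LeagueScraper#scrape.
LEAGUE_HEADER = /Season (\d{1}) - \s+(\dv\d)( Random)? (\w+)\s+Division (.+)/

# Captures the ladder ID and stops at '#' or '/', so a trailing
# "#current-rank" fragment is not swallowed into the ID.
LEAGUE_ID = %r{/ladder/([^#/]+)}

# Returns the parsed header fields, or nil when the text does not match.
def parse_league_header(text)
  match = LEAGUE_HEADER.match(text.strip)
  return nil unless match

  season, size, random, division, name = match.captures
  { season: season, size: size, random: !random.nil?, division: division, name: name }
end

# Returns the league ID as a string, or nil when the URL has no ladder segment.
def extract_league_id(url)
  url[LEAGUE_ID, 1]
end

# Assumed header text; the real page markup may differ.
parse_league_header('Season 6 -  4v4 Diamond Division Aleksander Pepper')
extract_league_id('http://us.battle.net/sc2/en/profile/2377239/1/Demon/ladder/96905#current-rank')
```

The `\d{1}` in the header regex only accepts single-digit seasons, so a header for season 10 or later would not match and the sketch would return `nil`.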