bnet_scraper 0.0.1

@@ -0,0 +1,7 @@
1
+ *.gem
2
+ .bundle
3
+ Gemfile.lock
4
+ pkg/*
5
+ .rvmrc
6
+ doc/*
7
+ .yardoc/*
data/.rspec ADDED
@@ -0,0 +1,2 @@
1
+ --color
2
+ --format nested
@@ -0,0 +1,4 @@
1
+ notifications:
2
+ email:
3
+ recipients:
4
+ - anordman@majorleaguegaming.com
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source "https://rubygems.org"
2
+
3
+ # Specify your gem's dependencies in bnet_scraper.gemspec
4
+ gemspec
data/LICENSE ADDED
@@ -0,0 +1,20 @@
1
+ Copyright (c) 2012 Andrew Nordman
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,136 @@
1
+ # BnetScraper
2
+
3
+ BnetScraper is a Nokogiri-based scraper of Battle.net profile information. Currently this only includes Starcraft2.
4
+
5
+ # Installation
6
+
7
+ Run `gem install bnet_scraper` or add `gem 'bnet_scraper'` to your `Gemfile`.
8
+
9
+ # Usage
10
+
11
+ All of the scrapers take an options hash, and can be created by either passing a URL string for the profile URL or
12
+ passing the account information in the options hash. Thus, either of these two approaches works:
13
+
14
+ ``` ruby
15
+ BnetScraper::Starcraft2::ProfileScraper.new(url: 'http://us.battle.net/sc2/en/profile/12345/1/TestAccount/')
16
+ BnetScraper::Starcraft2::ProfileScraper.new(bnet_id: '12345', account: 'TestAccount', region: 'na')
17
+ ```
18
+
19
+ There are several scrapers that pull various information. They are:
20
+
21
+ * BnetScraper::Starcraft2::ProfileScraper - collects basic profile information and an array of league URLs
22
+ * BnetScraper::Starcraft2::LeagueScraper - collects data on a particular league for a particular Battle.net account
23
+ * BnetScraper::Starcraft2::AchievementScraper - collects achievement data for the account.
24
+ * BnetScraper::Starcraft2::MatchHistoryScraper - collects the 25 most recent matches played on the account
25
+
26
+ All scrapers have a `#scrape` method that triggers the scraping, stores the results, and returns them.
27
+ An additional `#output` method returns the stored results on later calls without re-scraping.
28
+
29
+ ## BnetScraper::Starcraft2::ProfileScraper
30
+
31
+ This pulls basic profile information for an account, as well as an array of league URLs. This is a good starting
32
+ point for league scraping as it provides the league URLs necessary to do supplemental scraping.
33
+
34
+ ``` ruby
35
+ scraper = BnetScraper::Starcraft2::ProfileScraper.new(url: 'http://us.battle.net/sc2/en/profile/2377239/1/Demon/')
36
+ scraper.scrape
37
+ # => {
38
+ bnet_id: '2377239',
39
+ account: 'Demon',
40
+ bnet_index: 1,
41
+ race: 'Protoss',
42
+ wins: '684',
43
+ achievement_points: '3630',
44
+ leagues: [
45
+ {
46
+ name: "1v1 Platinum Rank 95",
47
+ id: "96905",
48
+ href: "http://us.battle.net/sc2/en/profile/2377239/1/Demon/ladder/96905#current-rank"
49
+ }
50
+ ]
51
+ }
52
+ ```
53
+
54
+ ## BnetScraper::Starcraft2::LeagueScraper
55
+
56
+ This pulls information on a specific league for a specific account. It is best used either in conjunction with a
57
+ profile scrape that provides a URL, or if you happen to know the specific league\_id and can pass it as an option.
58
+
59
+ ``` ruby
60
+ scraper = BnetScraper::Starcraft2::LeagueScraper.new(league_id: '12345', account: 'Demon', bnet_id: '2377239')
61
+ scraper.scrape
62
+ # => {
63
+ season: '6',
64
+ name: 'Aleksander Pepper',
65
+ division: 'Diamond',
66
+ size: '4v4',
67
+ random: false,
68
+ bnet_id: '2377239',
69
+ account: 'Demon'
70
+ }
71
+ ```
72
+
73
+ ## BnetScraper::Starcraft2::AchievementScraper
74
+
75
+ This pulls achievement information for an account. Note that it currently returns only the overall achievements,
76
+ not the in-depth, by-category achievement information.
77
+
78
+ ``` ruby
79
+ scraper = BnetScraper::Starcraft2::AchievementScraper.new(url: 'http://us.battle.net/sc2/en/profile/2377239/1/Demon/')
80
+ scraper.scrape
81
+ # => {
82
+ recent: [
83
+ { title: 'Blink of an Eye', description: 'Complete round 24 in "Starcraft Master" without losing any stalkers', earned: '3/5/2012' },
84
+ { title: 'Whack-a-Roach', description: 'Complete round 9 in "Starcraft Master" in under 45 seconds', earned: '3/5/2012' },
85
+ { title: 'Safe Zone', description: 'Complete round 8 in "Starcraft Master" without losing any stalkers', earned: '3/5/2012' },
86
+ { title: 'Starcraft Master', description: 'Complete all 30 rounds in "Starcraft Master"', earned: '3/5/2012' },
87
+ { title: 'Starcraft Expert', description: 'Complete any 25 rounds in "Starcraft Master"', earned: '3/5/2012' },
88
+ { title: 'Starcraft Apprentice', description: 'Complete any 20 rounds in "Starcraft Master"', earned: '3/5/2012' }
89
+ ],
90
+ showcase: [
91
+ { title: 'Hot Shot', description: 'Finish a Qualification Round with an undefeated record.' },
92
+ { title: 'Starcraft Master', description: 'Complete all rounds in "Starcraft Master"' },
93
+ { title: 'Team Protoss 500', description: 'Win 500 team league matches as Protoss' },
94
+ { title: 'Night of the Living III', description: 'Survive 15 Infested Horde Attacks in the "Night 2 Die" mode of the "Left 2 Die" scenario.' },
95
+ { title: 'Team Top 100 Diamond', description: 'Finish a Season in Team Diamond Division' }
96
+
97
+ ],
98
+ progress: {
99
+ liberty_campaign: '1580',
100
+ exploration: '480',
101
+ custom_game: '330',
102
+ cooperative: '660',
103
+ quick_match: '170'
104
+ }
105
+ }
106
+ ```
107
+
108
+ ## BnetScraper::Starcraft2::MatchHistoryScraper
109
+
110
+ This pulls the 25 most recent matches played for an account. Note that this is only as up-to-date as battle.net is, and
111
+ will likely lag behind the match history shown in-game.
112
+
113
+ ``` ruby
114
+ scraper = BnetScraper::Starcraft2::MatchHistoryScraper.new(url: 'http://us.battle.net/sc2/en/profile/2377239/1/Demon/')
115
+ scraper.scrape
116
+ # => {
117
+ wins: '15',
118
+ losses: '10',
119
+ matches: [
120
+ { map_name: 'Bx Monobattle - Sand Canyon (Fix)', outcome: :win, type: 'Custom', date: '3/12/2012' },
121
+ { map_name: 'Deadlock Ridge', outcome: :loss, type: '4v4', date: '3/12/2012' },
122
+ { map_name: 'District 10', outcome: :win, type: '4v4', date: '3/12/2012' },
123
+ # ...
124
+ ]
125
+ }
126
+ ```
127
+
128
+ # Contribute!
129
+
130
+ I would love to see contributions! Please send a pull request with a feature branch containing specs
131
+ (Chances are excellent I will break it if you do not) and I will take a look. Please do not change the version
132
+ as I tend to bundle multiple fixes together before releasing a new version anyway.
133
+
134
+ # Author
135
+
136
+ Written by [Andrew Nordman](http://github.com/cadwallion), see LICENSE for details.
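The `#scrape` / `#output` split described in the README's Usage section can be illustrated offline. The following is only a minimal sketch: `SketchScraper` and its `fetches` counter are hypothetical names invented here, and the stored value stands in for a real network fetch. It is not the gem's implementation.

```ruby
# Hypothetical stand-in for a scraper: #scrape does the (simulated) work and
# stores the result; #output only reads what was stored.
class SketchScraper
  attr_reader :fetches

  def initialize
    @fetches = 0
  end

  # Simulates one fetch, stores the value, and returns the assembled hash.
  def scrape
    @fetches += 1
    @wins = '684'
    output
  end

  # Returns the stored data without triggering another fetch.
  def output
    { wins: @wins }
  end
end

scraper = SketchScraper.new
scraper.scrape  # performs the simulated fetch
scraper.output  # reuses the stored data
```

Calling `output` before `scrape` in this sketch returns a hash with `nil` values, which is one reason to call `scrape` first.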
@@ -0,0 +1,10 @@
1
+ require "bundler/gem_tasks"
2
+ require 'rspec/core/rake_task'
3
+
4
+ RSpec::Core::RakeTask.new(:spec) do |spec|
5
+ spec.pattern = 'spec/**/*_spec.rb'
6
+ spec.rspec_opts = ['--backtrace']
7
+ # spec.ruby_opts = ['-w']
8
+ end
9
+
10
+ task :default => :spec
@@ -0,0 +1,24 @@
1
+ $:.push File.join(File.dirname(__FILE__), '..', 'lib')
2
+
3
+ require 'benchmark'
4
+ require 'bnet_scraper'
5
+
6
+ BNET_ACCOUNTS = [
7
+ { bnet_id: 2377239, name: 'Demon', race: 'protoss' },
8
+ { bnet_id: 2035618, name: 'Mykaelos', race: 'random' },
9
+ { bnet_id: 1826063, name: 'Fyrefly', race: 'zerg' },
10
+ { bnet_id: 2539344, name: 'Cadwallion', race: 'terran' }
11
+ ]
12
+
13
+ def parse_random_account
14
+ account = BNET_ACCOUNTS.sample
15
+ BnetScraper::Starcraft2::ProfileScraper.new(bnet_id: account[:bnet_id], account: account[:name]).scrape
16
+ end
17
+
18
+ Benchmark.bmbm(7) do |x|
19
+ x.report('1') { parse_random_account }
20
+ x.report('5') { 5.times { parse_random_account } }
21
+ x.report('10') { 10.times { parse_random_account } }
22
+ x.report('25') { 25.times { parse_random_account } }
23
+ x.report('50') { 50.times { parse_random_account } }
24
+ end
@@ -0,0 +1,22 @@
1
+ # -*- encoding: utf-8 -*-
2
+ $:.push File.expand_path("../lib", __FILE__)
3
+
4
+ Gem::Specification.new do |s|
5
+ s.name = "bnet_scraper"
6
+ s.version = "0.0.1"
7
+ s.authors = ["Andrew Nordman"]
8
+ s.email = ["anordman@majorleaguegaming.com"]
9
+ s.homepage = "https://github.com/agoragames/bnet_scraper/"
10
+ s.summary = %q{Battle.net Profile Scraper}
11
+ s.description = %q{BnetScraper is a Nokogiri-based scraper of Battle.net profile information. Currently this only includes Starcraft2.}
12
+
13
+ s.files = `git ls-files`.split("\n")
14
+ s.test_files = `git ls-files -- {test,spec,features}/*`.split("\n")
15
+ s.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
16
+ s.require_paths = ["lib"]
17
+
18
+ s.add_runtime_dependency 'nokogiri'
19
+ s.add_development_dependency 'rake'
20
+ s.add_development_dependency 'rspec'
21
+ s.add_development_dependency 'fakeweb'
22
+ end
@@ -0,0 +1,7 @@
1
+ require 'bnet_scraper/starcraft2'
2
+ require 'net/http'
3
+ require 'open-uri'
4
+ require 'nokogiri'
5
+
6
+ module BnetScraper
7
+ end
@@ -0,0 +1,46 @@
1
+ require 'bnet_scraper/starcraft2/base_scraper'
2
+ require 'bnet_scraper/starcraft2/profile_scraper'
3
+ require 'bnet_scraper/starcraft2/league_scraper'
4
+ require 'bnet_scraper/starcraft2/achievement_scraper'
5
+ require 'bnet_scraper/starcraft2/match_history_scraper'
6
+
7
+ module BnetScraper
8
+ # This module contains everything about scraping Starcraft 2 Battle.net accounts.
9
+ # See `BnetScraper::Starcraft2::ProfileScraper` and `BnetScraper::Starcraft2::LeagueScraper`
10
+ # for more details
11
+ module Starcraft2
12
+ REGIONS = {
13
+ 'na' => { domain: 'us.battle.net', dir: 'en' },
14
+ 'eu' => { domain: 'eu.battle.net', dir: 'en' },
15
+ 'cn' => { domain: 'www.battlenet.com.cn', dir: 'zh' },
16
+ 'sea' => { domain: 'sea.battle.net', dir: 'en' },
17
+ 'fea' => { domain: 'tw.battle.net', dir: 'zh' }
18
+ }
19
+
20
+ # This is a convenience method that chains calls to ProfileScraper,
21
+ # followed by a scrape of each league returned in the `leagues` array
22
+ # in the profile_data. The end result is a fully scraped profile with
23
+ # profile and league data in a hash.
24
+ #
25
+ # See `BnetScraper::Starcraft2::ProfileScraper` for more information on
26
+ # the parameters being sent to `#full_profile_scrape`.
27
+ #
28
+ # @param bnet_id - Battle.net Account ID
29
+ # @param account - Battle.net Account Name
30
+ # @param region - Battle.net Account Region
31
+ # @return profile_data - Hash containing complete profile and league data
32
+ # scraped from the website
33
+ def self.full_profile_scrape bnet_id, account, region = 'na'
34
+ profile_scraper = ProfileScraper.new bnet_id: bnet_id, account: account, region: region
35
+ profile_output = profile_scraper.scrape
36
+
37
+ parsed_leagues = []
38
+ profile_output[:leagues].each do |league|
39
+ league_scraper = LeagueScraper.new url: league[:href]
40
+ parsed_leagues << league_scraper.scrape
41
+ end
42
+ profile_output[:leagues] = parsed_leagues
43
+ return profile_output
44
+ end
45
+ end
46
+ end
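The chaining that `full_profile_scrape` performs (scrape the profile, then replace each league entry with the result of scraping its `href`) can be sketched without network access. This is a sketch under assumptions: `full_profile` and the two callables are hypothetical stand-ins for `ProfileScraper#scrape` and `LeagueScraper#scrape`, and the sample data is invented.

```ruby
# Hypothetical helper: both scrapers are injected as callables so the
# chaining logic can run offline.
def full_profile(profile_scrape, league_scrape)
  profile = profile_scrape.call
  # Replace each league stub (name/id/href) with the scraped league data.
  leagues = profile[:leagues].map { |league| league_scrape.call(league[:href]) }
  profile.merge(leagues: leagues)
end

# Invented sample data shaped like the README output.
fake_profile = lambda do
  { account: 'Demon',
    leagues: [{ name: '1v1 Platinum Rank 95', id: '96905',
                href: 'http://us.battle.net/sc2/en/profile/2377239/1/Demon/ladder/96905#current-rank' }] }
end
fake_league = ->(href) { { division: 'Platinum', source: href } }

full_profile(fake_profile, fake_league)
```

Unlike the method above, this sketch returns a new hash via `merge` rather than mutating the profile output in place; either choice yields the same final structure.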
@@ -0,0 +1,63 @@
1
+ module BnetScraper
2
+ module Starcraft2
3
+ class AchievementScraper < BaseScraper
4
+ attr_reader :recent, :progress, :showcase, :response
5
+
6
+ def scrape
7
+ get_response
8
+ scrape_recent
9
+ scrape_progress
10
+ scrape_showcase
11
+ output
12
+ end
13
+
14
+ def get_response
15
+ @response = Nokogiri::HTML(open(profile_url+"achievements/"))
16
+ end
17
+
18
+ def scrape_recent
19
+ @recent = []
20
+ 6.times do |num|
21
+ achievement = {}
22
+ div = response.css("#achv-recent-#{num}")
23
+ unless div.empty? # css returns a NodeSet, which is never nil
24
+ achievement[:title] = div.css("div > div").inner_text.strip
25
+ achievement[:description] = div.inner_text.gsub(achievement[:title], '').strip
26
+ achievement[:earned] = response.css("#recent-achievements span")[(num*3)+1].inner_text
27
+
28
+ @recent << achievement
29
+ end
30
+ end
31
+ @recent
32
+ end
33
+
34
+ def scrape_progress
35
+ progress_ach = response.css("#progress-module .achievements-progress:nth(2) span")
36
+ @progress = {
37
+ liberty_campaign: progress_ach[0].inner_text,
38
+ exploration: progress_ach[1].inner_text,
39
+ custom_game: progress_ach[2].inner_text,
40
+ cooperative: progress_ach[3].inner_text,
41
+ quick_match: progress_ach[4].inner_text,
42
+ }
43
+ end
44
+
45
+ def scrape_showcase
46
+ @showcase = response.css("#showcase-module .progress-tile").map do |achievement|
47
+ hsh = { title: achievement.css('.tooltip-title').inner_text.strip }
48
+ hsh[:description] = achievement.css('div').inner_text.gsub(hsh[:title], '').strip
49
+ hsh
50
+ end
51
+ @showcase
52
+ end
53
+
54
+ def output
55
+ {
56
+ recent: @recent,
57
+ progress: @progress,
58
+ showcase: @showcase
59
+ }
60
+ end
61
+ end
62
+ end
63
+ end
@@ -0,0 +1,53 @@
1
+ module BnetScraper
2
+ module Starcraft2
3
+ class BaseScraper
4
+ attr_reader :bnet_id, :account, :region, :bnet_index, :url
5
+
6
+ def initialize options = {}
7
+ if options[:url]
8
+ extracted_data = options[:url].match(/http:\/\/(.+)\/sc2\/(.+)\/profile\/(.+)\/(\d{1})\/(.[^\/]+)\//)
9
+ @region = REGIONS.key({ domain: extracted_data[1], dir: extracted_data[2] })
10
+ @bnet_id = extracted_data[3]
11
+ @bnet_index = extracted_data[4]
12
+ @account = extracted_data[5]
13
+ @url = options[:url]
14
+ elsif options[:bnet_id] && options[:account]
15
+ @bnet_id = options[:bnet_id]
16
+ @account = options[:account]
17
+ @region = options[:region] || 'na'
18
+ if options[:bnet_index]
19
+ @bnet_index = options[:bnet_index]
20
+ else
21
+ set_bnet_index
22
+ end
23
+ end
24
+ end
25
+
26
+ # set_bnet_index
27
+ #
28
+ # Because profile URLs have to have a specific bnet_index that is seemingly incalculable,
29
+ # we must ping both variants to determine the correct bnet_index. We then store that value.
30
+ def set_bnet_index
31
+ [1,2].each do |idx|
32
+ res = Net::HTTP.get_response URI profile_url idx
33
+ if res.is_a? Net::HTTPSuccess
34
+ @bnet_index = idx
35
+ return
36
+ end
37
+ end
38
+ end
39
+
40
+ def profile_url bnet_index = @bnet_index
41
+ "http://#{region_info[:domain]}/sc2/#{region_info[:dir]}/profile/#{bnet_id}/#{bnet_index}/#{account}/"
42
+ end
43
+
44
+ def region_info
45
+ REGIONS[region]
46
+ end
47
+
48
+ def scrape
49
+ raise NotImplementedError, "Abstract method #scrape called."
50
+ end
51
+ end
52
+ end
53
+ end
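The URL handling in `BaseScraper#initialize` and `#profile_url` is a round trip between a profile URL and its parts. The sketch below reproduces that logic as standalone functions so it can be tried in isolation. `parse_profile_url` and `build_profile_url` are hypothetical names, and `REGIONS` here is only a two-entry subset of the table in the `Starcraft2` module.

```ruby
# Subset of the REGIONS table, copied for illustration only.
REGIONS = {
  'na'  => { domain: 'us.battle.net', dir: 'en' },
  'sea' => { domain: 'sea.battle.net', dir: 'en' }
}.freeze

# Same pattern as in BaseScraper#initialize.
PROFILE_URL = /http:\/\/(.+)\/sc2\/(.+)\/profile\/(.+)\/(\d{1})\/(.[^\/]+)\//

# Hypothetical helper: splits a profile URL into its components,
# or returns nil when the URL does not match.
def parse_profile_url(url)
  match = url.match(PROFILE_URL)
  return nil unless match

  domain, dir, bnet_id, bnet_index, account = match.captures
  {
    region: REGIONS.key({ domain: domain, dir: dir }), # reverse lookup by value
    bnet_id: bnet_id,
    bnet_index: bnet_index,
    account: account
  }
end

# Hypothetical helper: the inverse, mirroring BaseScraper#profile_url.
def build_profile_url(region:, bnet_id:, bnet_index:, account:)
  info = REGIONS.fetch(region)
  "http://#{info[:domain]}/sc2/#{info[:dir]}/profile/#{bnet_id}/#{bnet_index}/#{account}/"
end

parts = parse_profile_url('http://us.battle.net/sc2/en/profile/2377239/1/Demon/')
build_profile_url(**parts)
```

Note that the parsed `bnet_index` is a String, whereas `set_bnet_index` stores an Integer; both interpolate to the same URL.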
@@ -0,0 +1,50 @@
1
+ module BnetScraper
2
+ module Starcraft2
3
+ # LeagueScraper
4
+ #
5
+ # Extracts information from an SC2 League URL and scrapes for information.
6
+ # Example:
7
+ # league_data = BnetScraper::Starcraft2::LeagueScraper.new(url: "http://us.battle.net/sc2/en/profile/2377239/1/Demon/ladder/96905").scrape
8
+ # league_data # => { bnet_id: '2377239', account: 'Demon', season: '6', size: '4v4', name: "Aleksander Pepper", division: "Diamond", random: false }
9
+ #
10
+ # @param [Hash] options - :url (the league URL on battle.net), or :league_id together with :bnet_id and :account
11
+ # @return [Hash] league_data - Hash of data extracted
12
+ class LeagueScraper < BaseScraper
13
+ attr_reader :league_id, :season, :size, :random, :name, :division
14
+
15
+ def initialize options = {}
16
+ super(options)
17
+
18
+ if options[:url]
19
+ @league_id = options[:url].match(/http:\/\/.+\/sc2\/.+\/profile\/.+\/\d{1}\/.+\/ladder\/([^#\/]+)/).to_a[1]
20
+ else
21
+ @league_id = options[:league_id]
22
+ end
23
+ end
24
+
25
+ def scrape
26
+ @response = Nokogiri::HTML(open(@url || "#{profile_url}ladder/#{@league_id}"))
27
+ value = @response.css(".data-title .data-label h3").inner_text().strip
28
+ header_regex = /Season (\d{1}) - \s+(\dv\d)( Random)? (\w+)\s+Division (.+)/
29
+ header_values = value.match(header_regex).to_a
30
+ header_values.shift()
31
+ @season, @size, @random, @division, @name = header_values
32
+
33
+ @random = !@random.nil?
34
+ output
35
+ end
36
+
37
+ def output
38
+ {
39
+ season: @season,
40
+ size: @size,
41
+ name: @name,
42
+ division: @division,
43
+ random: @random,
44
+ bnet_id: @bnet_id,
45
+ account: @account
46
+ }
47
+ end
48
+ end
49
+ end
50
+ end
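The header parsing in `LeagueScraper#scrape` can be exercised without fetching a page. In the sketch below, `HEADER` is the same pattern as `header_regex` above, while `parse_league_header` and the sample header text are assumptions made for illustration. The real page markup may differ, and the pattern requires extra whitespace after `Season N - `, which the sample supplies with a line break.

```ruby
# Same pattern as header_regex in LeagueScraper#scrape.
HEADER = /Season (\d{1}) - \s+(\dv\d)( Random)? (\w+)\s+Division (.+)/

# Hypothetical helper: turns the league header text into a hash,
# or returns nil when the text does not match.
def parse_league_header(text)
  match = text.strip.match(HEADER)
  return nil unless match

  season, size, random, division, name = match.captures
  {
    season: season,
    size: size,
    random: !random.nil?, # the optional " Random" group is nil when absent
    division: division,
    name: name
  }
end

# Invented sample text shaped to fit the pattern.
sample = "Season 6 - \n  4v4 Diamond\n  Division Aleksander Pepper"
parse_league_header(sample)
```

Because the season group is `\d{1}`, a header for season 10 or later would not match this pattern as written.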