espn_scraper 1.3.1

@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: 4e71f14160cc29d944f5860b4eaa3f983d353890
4
+ data.tar.gz: 75353026bf2aa879d8151a4c1455314e69af8c78
5
+ SHA512:
6
+ metadata.gz: 5f06578dd6ca1eeefa13910a35a03e84a4781499de5d478c924707db90a476780fb25a9276a6ede4d3b66e1135c31dd1b0c105ad3bd37cc213cd15ae050720f6
7
+ data.tar.gz: d2b1eda63a460fc734f7280ae171ce0e9260612d1513bc6507dda595d8d484d038a7dc3df213c5066fcfb51151799156cd81b51a07a75bd204cbec1d99c267e8
@@ -0,0 +1,2 @@
1
+ .ruby-version
2
+ Gemfile.lock
@@ -0,0 +1,4 @@
1
+ language: ruby
2
+ rvm:
3
+ - 2.0.0
4
+ - 1.9.3
data/Gemfile ADDED
@@ -0,0 +1,6 @@
1
+ source 'https://rubygems.org/'
2
+
3
+ gem 'rake', '~> 10.4.2'
4
+ gem 'httparty'
5
+ gem 'nokogiri'
6
+ gem 'minitest', '~> 5.6.0'
@@ -0,0 +1,141 @@
1
+ # ESPN Scraper
2
+
3
+ ESPN Scraper is a simple gem for scraping teams and scores from `ESPN`'s website. Please note that `ESPN` is not involved with this gem or me in any way. I chose `ESPN` because it is a leader in sports statistics and has a robust website.
4
+
5
+ ```ruby
6
+ ESPN.responding?
7
+ # => true
8
+ ```
9
+
10
+ Let's begin...
11
+
12
+ #### Supported leagues
13
+
14
+ The gem only supports the following leagues:
15
+
16
+ ```ruby
17
+ ESPN.leagues
18
+ # => [ "nfl", "mlb", "nba", "nhl", "ncf", "ncb" ]
19
+ ```
20
+
21
+ These are the NFL, MLB, NBA, NHL, NCAA D1 Football, and NCAA D1 Men's Basketball, respectively.
22
+
23
+ #### Scrape Divisions
24
+
25
+ You can get all the divisions in each league.
26
+
27
+ ```ruby
28
+ ESPN.get_divisions
29
+ # => {
30
+ # "nfl" => [
31
+ # { :name => "NFC East", :data_name => "nfc-east" },
32
+ # { :name => "NFC West", :data_name => "nfc-west" },
33
+ # ...
34
+ # ],
35
+ # "mlb" => ...
36
+ # }
37
+ ```
38
+
39
+ #### Scrape Conferences (NCAA D1 Men's Basketball only)
40
+
41
+ You can get all the conferences in NCAA D1 Men's Basketball.
42
+
43
+ ```ruby
44
+ ESPN.get_conferences_in_ncb
45
+ # => [{:name=>"America East", :data_name=>"1"},
46
+ # {:name=>"American", :data_name=>"62"},
47
+ # ...
48
+ # ]
49
+ ```
50
+
51
+ #### Scrape teams
52
+
53
+ You can get the teams in each league by acronym. It returns a hash keyed by division data name, where each value is an array of hashes, one per team in that division.
54
+
55
+ ```ruby
56
+ ESPN.get_teams_in('nba')
57
+ # => {
58
+ # "atlantic"=> [
59
+ # { :name => "Boston Celtics", :data_name => "bos" },
60
+ # { :name => "Brooklyn Nets", :data_name => "bkn" },
61
+ # { :name => "New York Knicks", :data_name => "ny" },
62
+ # { :name => "Philadelphia 76ers", :data_name => "phi" },
63
+ # { :name => "Toronto Raptors", :data_name => "tor" }
64
+ # ],
65
+ # "pacific" => ...
66
+ # }
67
+ ```
68
+
69
+ #### Scraping scores
70
+
71
+ All score requests return an array of hashes. Here's an example NFL score hash:
72
+
73
+ ```ruby
74
+ {
75
+ league: 'nfl',
76
+ game_date: #<DateTime: 2012-10-25>,
77
+ home_team: 'min',
78
+ home_score: 17,
79
+ away_team: 'tb',
80
+ away_score: 36
81
+ }
82
+ ```
83
+
84
+ You'll notice the teams are identified with the same `:data_name` as in an `ESPN.get_teams_in` request. One wrinkle with scraping scores is that football scores are requested by year and week, whereas baseball, basketball, and hockey scores are requested by date.
85
+
86
+ ###### weekly (football)
87
+
88
+ Pattern is `ESPN.get_<league>_scores(year, week)`. This is for `nfl` and `ncf`:
89
+
90
+ ```ruby
91
+ ESPN.get_nfl_scores(2012, 8)
92
+ ESPN.get_ncf_scores(2011, 3)
93
+ ```
94
+
95
+ ###### daily (baseball, basketball, hockey)
96
+
97
+ Pattern is `ESPN.get_<league>_scores(date)`. This is for `mlb`, `nba`, `nhl`, `ncb`:
98
+
99
+ ```ruby
100
+ ESPN.get_mlb_scores( Date.parse('Aug 13, 2012') )
101
+ ESPN.get_nba_scores( Date.parse('Dec 25, 2011') )
102
+ ESPN.get_nhl_scores( Date.parse('Feb 14, 2009') )
103
+ ESPN.get_ncb_scores( Date.parse('Mar 15, 2012') )
104
+ ```
105
+
106
+ ## Installing
107
+
108
+ Add the gem to your `Gemfile`:
109
+
110
+ ```ruby
111
+ gem 'espn_scraper', git: 'git://github.com/aj0strow/espn-scraper.git'
112
+ # or
113
+ gem 'espn_scraper', github: 'aj0strow/espn-scraper'
114
+ ```
115
+
116
+ ...and then require it. I personally use it in rake tasks of a Rails app.
117
+
118
+ ```ruby
119
+ require 'espn_scraper'
120
+ ```
121
+
122
+ ## Contributing
123
+
124
+ Please report back if something breaks on you!
125
+
126
+ Also, please let me know if any of the data names get outdated. For instance, a bunch of NFL data names were recently changed. You can make fixes temporarily with the following:
127
+
128
+ ```ruby
129
+ ESPN::DATA_NAME_FIXES['nfl']['gnb'] = 'gb'
130
+ ```
131
+
132
+ Future plans:
133
+ - Get start and end dates of a season
134
+
135
+ ### Thank You
136
+
137
+ * Dan Madere ([dgmdan](https://github.com/dgmdan))
138
+
139
+ ---
140
+
141
+ **MIT License**
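The README's `DATA_NAME_FIXES` override is applied per league to the `:home_team` and `:away_team` keys of each score hash. The following is a minimal standalone sketch of that lookup, assuming a plain hash of fixes is passed in; `apply_fixes` is a hypothetical helper for illustration, not part of the gem, and it returns new hashes instead of mutating the reports in place.

```ruby
# Hypothetical helper: rewrite outdated team data names for one league.
# `fixes` has the same shape as ESPN::DATA_NAME_FIXES: league => { old => new }.
def apply_fixes(scores, league, fixes)
  league_fixes = fixes.fetch(league, {})
  scores.map do |report|
    report.merge(
      league: league,
      # Fall back to the scraped name when no fix is registered.
      home_team: league_fixes.fetch(report[:home_team], report[:home_team]),
      away_team: league_fixes.fetch(report[:away_team], report[:away_team])
    )
  end
end

fixes = { 'nfl' => { 'gnb' => 'gb', 'tam' => 'tb' } }
scores = [{ home_team: 'gnb', home_score: 23, away_team: 'chi', away_score: 10 }]
p apply_fixes(scores, 'nfl', fixes)
```

Unknown names pass through unchanged, so registering a fix is only needed when a scraped name differs from the one `ESPN.get_teams_in` reports.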
@@ -0,0 +1,9 @@
1
+ require "bundler/gem_tasks"
2
+ require 'rake/testtask'
3
+
4
+ Rake::TestTask.new do |t|
5
+ t.libs << 'test'
6
+ t.test_files = FileList['test/espn_scraper_test/*_test.rb']
7
+ end
8
+
9
+ task default: :test
@@ -0,0 +1,19 @@
1
+ $:.push File.expand_path("../lib", __FILE__)
2
+ require 'espn_scraper/version'
3
+
4
+ Gem::Specification.new do |s|
5
+ s.name = 'espn_scraper'
6
+ s.version = ESPN::VERSION
7
+ s.date = '2013-04-07'
8
+ s.summary = 'ESPN Scraper'
9
+ s.description = 'a simple scraping api for espn stats and data'
10
+ s.authors = %w[ aj0strow ]
11
+ s.email = 'alexander.ostrow@gmail.com'
12
+ s.homepage = 'http://github.com/aj0strow/espn-scraper'
13
+
14
+ s.add_dependency 'httparty'
15
+ s.add_dependency 'nokogiri'
16
+
17
+ s.files = `git ls-files`.split("\n")
18
+ s.test_files = `git ls-files -- test`.split("\n")
19
+ end
@@ -0,0 +1,3 @@
1
+ %w[ boilerplate teams scores ].each do |file|
2
+ require "espn_scraper/#{ file }"
3
+ end
@@ -0,0 +1,50 @@
1
+ require 'httparty'
2
+ require 'nokogiri'
3
+
4
+ module ESPN
5
+ class << self
6
+
7
+ def responding?
8
+ HTTParty.get('http://espn.go.com/').code == 200
9
+ end
10
+
11
+ def down?
12
+ !responding?
13
+ end
14
+
15
+ # Ex: ESPN.url('scores')
16
+ #     ESPN.url('nba', 'teams')
17
+ def url(*path)
18
+ subdomain = (path.first == 'scores') ? path.shift : nil
19
+ domain = [subdomain, 'espn', 'go', 'com'].compact.join('.')
20
+ ['http:/', domain, *path].join('/')
21
+ end
22
+
23
+ # Returns Nokogiri HTML document
24
+ # Ex: ESPN.get('nba', 'teams')
25
+ def get(*path)
26
+ http_url = self.url(*path)
27
+ response = HTTParty.get(http_url)
28
+ if response.code == 200
29
+ Nokogiri::HTML(response.body)
30
+ else
31
+ raise ArgumentError, error_message(http_url, path)
32
+ end
33
+ end
34
+
35
+ def dasherize(str)
36
+ str.strip.downcase.gsub(/\s+/, '-')
37
+ end
38
+
39
+
40
+ private
41
+
42
+
43
+
44
+ def error_message(url, path)
45
+ "The url #{url} from the path #{path} did not return a valid page."
46
+ end
47
+
48
+ end
49
+ end
50
+
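The path handling in `ESPN.url` is easier to follow in isolation. The sketch below copies the same three lines into a standalone method so it can be run without network access; `build_espn_url` is a hypothetical name used only for illustration.

```ruby
# Standalone copy of the URL rule: a leading 'scores' segment becomes a
# subdomain, every other segment is appended to the path.
def build_espn_url(*path)
  subdomain = path.first == 'scores' ? path.shift : nil
  domain = [subdomain, 'espn', 'go', 'com'].compact.join('.')
  # 'http:/' plus the join separator yields the 'http://' prefix.
  ['http:/', domain, *path].join('/')
end

puts build_espn_url('scores')                      # http://scores.espn.go.com
puts build_espn_url('nba', 'teams')                # http://espn.go.com/nba/teams
puts build_espn_url('scores', 'nfl', 'scoreboard') # http://scores.espn.go.com/nfl/scoreboard
```

Note that calling the method with no arguments still returns `http://espn.go.com`, which is why an error message should be built from the URL that was actually requested.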
@@ -0,0 +1,311 @@
1
+ require 'uri'
2
+ require 'cgi'
3
+ require 'json'
4
+
5
+ module ESPN
6
+ SEASONS = {
7
+ preseason: 1,
8
+ regular_season: 2,
9
+ postseason: 3
10
+ }
11
+
12
+ mlb_ignores = %w(
13
+ florida-state u-of-south-florida georgetown fla.-southern northeastern boston-college
14
+ miami-florida florida-intl canada hanshin yomiuri sacramento springfield corpus-christi
15
+ round-rock carolina manatee-cc mexico cincinnati-(f) atlanta-(f) frisco toledo norfolk
16
+ fort-myers tampa-bay-(f) nl-all-stars al-all-stars
17
+ )
18
+
19
+ nba_ignores = %w( west-all-stars east-all-stars )
20
+
21
+ nhl_ignores = %w(
22
+ hc-sparta frolunda hc-slovan ev-zug jokerit-helsinki hamburg-freezers adler-mannheim
23
+ team-chara team-alfredsson
24
+ )
25
+
26
+ ncf_ignores = %w( paul-quinn san-diego-christian ferris-st notre-dame-college chaminade
27
+ w-new-mexico n-new-mexico tx-a&m-commerce nw-oklahoma-st )
28
+
29
+ IGNORED_TEAMS = (mlb_ignores + nhl_ignores + nba_ignores + ncf_ignores).inject({}) do |h, team|
30
+ h.merge team => false
31
+ end
32
+
33
+ DATA_NAME_EXCEPTIONS = {
34
+ 'nets' => 'bkn',
35
+ 'supersonics' => 'okc',
36
+ 'hornets' => 'no',
37
+
38
+ 'marlins' => 'mia'
39
+ }.merge(IGNORED_TEAMS)
40
+
41
+ DATA_NAME_FIXES = {
42
+ 'nfl' => {
43
+ 'nwe' => 'ne',
44
+ 'kan' => 'kc',
45
+ 'was' => 'wsh',
46
+ 'nor' => 'no',
47
+ 'gnb' => 'gb',
48
+ 'sfo' => 'sf',
49
+ 'tam' => 'tb',
50
+ 'sdg' => 'sd'
51
+ },
52
+ 'mlb' => {},
53
+ 'nba' => {},
54
+ 'nhl' => {},
55
+ 'ncf' => {},
56
+ 'ncb' => {}
57
+ }
58
+
59
+ # Example output:
60
+ # {
61
+ # league: "nfl",
62
+ # game_date: #<Date: 2013-01-05 ((2456298j,0s,0n),+0s,2299161j)>,
63
+ # home_team: "sea",
64
+ # home_score: 48,
65
+ # away_team: "min",
66
+ # away_score: 27
67
+ # }
68
+
69
+ class << self
70
+
71
+ def get_superbowl_score
72
+ markup = Scores.markup_from_superbowl
73
+ scores = Scores.home_away_parse(markup, false)
74
+ add_league_and_fixes(scores, 'nfl')
75
+ scores
76
+ end
77
+
78
+ def get_nfl_scores(year, week)
79
+ markup = Scores.markup_from_year_and_week('nfl', year, week)
80
+ scores = Scores.home_away_parse(markup)
81
+ add_league_and_fixes(scores, 'nfl')
82
+ scores
83
+ end
84
+
85
+ def get_mlb_scores(date)
86
+ markup = Scores.markup_from_date('mlb', date)
87
+ scores = Scores.home_away_parse(markup)
88
+ scores.each { |report| report[:league] = 'mlb' }
89
+ scores
90
+ end
91
+
92
+ def get_nba_scores(date)
93
+ markup = Scores.markup_from_date('nba', date)
94
+ scores = Scores.home_away_parse(markup)
95
+ scores.each { |report| report[:league] = 'nba' }
96
+ scores
97
+ end
98
+
99
+ def get_nhl_scores(date)
100
+ markup = Scores.markup_from_date('nhl', date)
101
+ scores = Scores.winner_loser_parse(markup, date)
102
+ scores.each { |report| report[:league] = 'nhl' }
103
+ scores
104
+ end
105
+
106
+ def get_ncf_scores(year, week)
107
+ markup = Scores.markup_from_year_and_week('college-football', year, week)
108
+ scores = Scores.ncf_parse(markup)
109
+ scores.each { |report| report[:league] = 'college-football' }
110
+ scores
111
+ end
112
+
113
+ alias_method :get_college_football_scores, :get_ncf_scores
114
+
115
+ def get_ncb_scores(date, conference_id=nil, final_only=true)
116
+ if conference_id
117
+ markup = Scores.markup_from_date_and_conference('ncb', date, conference_id)
118
+ else
119
+ markup = Scores.markup_from_date('ncb', date)
120
+ end
121
+
122
+ scores = Scores.home_away_parse(markup, final_only)
123
+
124
+ scores.each do |report|
125
+ report[:league] ||= 'mens-college-basketball'
126
+ report[:game_date] ||= date
127
+ end
128
+
129
+ scores
130
+ end
131
+
132
+ alias_method :get_college_basketball_scores, :get_ncb_scores
133
+
134
+ def get_ncb_abbreviations(date)
135
+ markup = Scores.markup_from_date('ncb', date)
136
+
137
+ teams = Scores.team_abbreviation_parse(markup)
138
+ teams
139
+ end
140
+
141
+ def add_league_and_fixes(scores, league)
142
+ scores.each do |report|
143
+ report[:league] = league
144
+ [:home_team, :away_team].each do |sym|
145
+ team = report[sym]
146
+ report[sym] = DATA_NAME_FIXES[league][team] || team
147
+ end
148
+ end
149
+ end
150
+ end
151
+
152
+
153
+
154
+ module Scores
155
+ class << self
156
+
157
+ # Get Markup
158
+
159
+ def markup_from_superbowl
160
+ ESPN.get 'scores', 'nfl', "scoreboard/_/group/80/year/#{Time.now.year - 1}/seasontype/3/week/5"
161
+ end
162
+
163
+ def markup_from_year_and_week(league, year, week)
164
+ ESPN.get 'scores', league, "scoreboard/_/group/80/year/#{year}/seasontype/2/week/#{week}"
165
+ end
166
+
167
+ def markup_from_date(league, date)
168
+ day = date.to_s.gsub(/[^\d]+/, '')
169
+ ESPN.get 'scores', league, "scoreboard?date=#{ day }"
170
+ end
171
+
172
+ def markup_from_date_and_conference(league, date, conference_id)
173
+ day = date.to_s.gsub(/[^\d]+/, '')
174
+ ESPN.get league, 'scoreboard', '_', 'group', conference_id.to_s, 'date', day
175
+ end
176
+
177
+ # parsing strategies
178
+
179
+ def home_away_parse(doc, final=true)
180
+ scores = []
181
+ games = []
182
+ espn_regex = /window\.espn\.scoreboardData \t= (\{.*?\});/
183
+ doc.xpath("//script").each do |script_section|
184
+ if script_section.content =~ espn_regex
185
+ espn_data = JSON.parse(espn_regex.match(script_section.content)[1])
186
+ games = espn_data['events']
187
+ break
188
+ end
189
+ end
190
+ games.each do |game|
191
+ # Game must be regular or postseason
192
+ next unless game['season']['type'] == SEASONS[:regular_season] || game['season']['type'] == SEASONS[:postseason]
193
+ score = {}
194
+ competition = game['competitions'].first
195
+
196
+ # Score must be final
197
+ if !final || competition['status']['type']['detail'] =~ /^Final/
198
+ competition['competitors'].each do |competitor|
199
+ if competitor['homeAway'] == 'home'
200
+ score[:home_team] = competitor['team']['abbreviation'].downcase
201
+ score[:home_score] = competitor['score'].to_i
202
+ else
203
+ score[:away_team] = competitor['team']['abbreviation'].downcase
204
+ score[:away_score] = competitor['score'].to_i
205
+ end
206
+ end
207
+ score[:game_date] = DateTime.parse(game['date'])
208
+ score[:status] = competition['status']['type']['detail']
209
+ scores << score
210
+ end
211
+ end
212
+ scores
213
+ end
214
+
215
+ def team_abbreviation_parse(doc)
216
+ games = []
217
+ teams = {}
218
+ espn_regex = /window\.espn\.scoreboardData \t= (\{.*?\});/
219
+ doc.xpath("//script").each do |script_section|
220
+ if script_section.content =~ espn_regex
221
+ espn_data = JSON.parse(espn_regex.match(script_section.content)[1])
222
+ games = espn_data['events']
223
+ break
224
+ end
225
+ end
226
+ games.each do |game|
227
+ competition = game['competitions'].first
228
+ competition['competitors'].each do |competitor|
229
+ teams[competitor['team']['displayName']] = competitor['team']['abbreviation'].downcase
230
+ end
231
+ end
232
+
233
+ teams
234
+ end
235
+
236
+ def ncf_parse(doc)
237
+ scores = []
238
+ games = []
239
+ espn_regex = /window\.espn\.scoreboardData \t= (\{.*?\});/
240
+ doc.xpath("//script").each do |script_section|
241
+ if script_section.content =~ espn_regex
242
+ espn_data = JSON.parse(espn_regex.match(script_section.content)[1])
243
+ games = espn_data['events']
244
+ break
245
+ end
246
+ end
247
+ games.each do |game|
248
+ score = { league: 'college-football' }
249
+ competition = game['competitions'].first
250
+ date = DateTime.parse(competition['startDate'])
251
+ date = date.new_offset('-06:00')
252
+ score[:game_date] = date.to_date
253
+ # Score must be final
254
+ if competition['status']['type']['detail'] =~ /^Final/
255
+ competition['competitors'].each do |competitor|
256
+ if competitor['homeAway'] == 'home'
257
+ score[:home_team] = competitor['team']['id'].downcase
258
+ score[:home_score] = competitor['score'].to_i
259
+ else
260
+ score[:away_team] = competitor['team']['id'].downcase
261
+ score[:away_score] = competitor['score'].to_i
262
+ end
263
+ end
264
+ scores << score
265
+ end
266
+ end
267
+ scores
268
+ end
269
+
270
+ def winner_loser_parse(doc, date)
271
+ doc.css('.mod-scorebox-final').map do |game|
272
+ game_info = { game_date: date }
273
+ teams = game.css('td.team-name:not([colspan])').map { |td| parse_data_name_from(td) }
274
+ game_info[:away_team], game_info[:home_team] = teams
275
+ scores = game.css('.team-score').map { |td| td.at_css('span').content.to_i }
276
+ game_info[:away_score], game_info[:home_score] = scores
277
+ game_info
278
+ end
279
+ end
280
+
281
+ # parsing helpers
282
+
283
+ def parse_data_name_from(container)
284
+ if container.at_css('a')
285
+ link = container.at_css('a')['href']
286
+ self.data_name_from(link)
287
+ else
288
+ if container.at_css('div')
289
+ name = container.at_css('div text()').content
290
+ elsif container.at_css('span')
291
+ name = container.at_css('span text()').content
292
+ else
293
+ name = container.at_css('text()').content
294
+ end
295
+ ESPN::DATA_NAME_EXCEPTIONS[ ESPN.dasherize(name) ]
296
+ end
297
+ end
298
+
299
+ def data_name_from(link)
300
+ encoded_link = URI::encode(link.strip)
301
+ query = URI::parse(encoded_link).query
302
+ if query
303
+ CGI::parse(query)['team'].first
304
+ else
305
+ link.split('/')[-2]
306
+ end
307
+ end
308
+
309
+ end
310
+ end
311
+ end
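The three JSON-based parsers in this file share one technique: find the script text that assigns `window.espn.scoreboardData`, capture the object literal with a regex, and map its `events` to score hashes. A minimal offline sketch of that flow follows. The sample payload is invented and contains only the keys the parser reads, and `sample_script_text` and `parse_embedded_scores` are hypothetical names, not part of the gem.

```ruby
require 'json'
require 'date'

# Invented stand-in for the contents of one <script> tag.
def sample_script_text
  payload = {
    'events' => [{
      'date' => '2012-10-26T00:20Z',
      'competitions' => [{
        'status' => { 'type' => { 'detail' => 'Final' } },
        'competitors' => [
          { 'homeAway' => 'home', 'score' => '17', 'team' => { 'abbreviation' => 'MIN' } },
          { 'homeAway' => 'away', 'score' => '36', 'team' => { 'abbreviation' => 'TB' } }
        ]
      }]
    }]
  }
  "window.espn.scoreboardData \t= #{JSON.generate(payload)};"
end

# Capture the embedded JSON object and turn final games into score hashes.
def parse_embedded_scores(script_text)
  match = /window\.espn\.scoreboardData \t= (\{.*?\});/.match(script_text)
  return [] unless match

  finals = JSON.parse(match[1])['events'].select do |game|
    game['competitions'].first['status']['type']['detail'] =~ /^Final/
  end
  finals.map do |game|
    score = { game_date: DateTime.parse(game['date']) }
    game['competitions'].first['competitors'].each do |competitor|
      side = competitor['homeAway'] == 'home' ? 'home' : 'away'
      score[:"#{side}_team"] = competitor['team']['abbreviation'].downcase
      score[:"#{side}_score"] = competitor['score'].to_i
    end
    score
  end
end

p parse_embedded_scores(sample_script_text)
```

Pulling the regex match and `JSON.parse` into one helper like this would also remove the block that is currently repeated in `home_away_parse`, `team_abbreviation_parse`, and `ncf_parse`.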
@@ -0,0 +1,78 @@
1
+ module ESPN
2
+ class << self
3
+
4
+ def leagues
5
+ @leagues || %w(nfl mlb nba nhl ncf ncb)
6
+ end
7
+
8
+ def leagues=(leagues)
9
+ @leagues = leagues
10
+ end
11
+
12
+ def get_divisions
13
+ divisions = {}
14
+ leagues.each do |league|
15
+ divisions[league] = get_divisions_in(league)
16
+ end
17
+ divisions
18
+ end
19
+
20
+ def get_divisions_in(league)
21
+ get_divs(league).map do |div|
22
+ name = parse_div_name(div)
23
+ { name: name, data_name: div_data_name(name) }
24
+ end
25
+ end
26
+
27
+ def get_conferences_in_ncb
28
+ get_ncb_conferences.map do |element|
29
+ name = element.content
30
+ data_name = $1 if element.children[0].attributes['href'].value =~ /confId=(\d+)/
31
+ { name: name, data_name: data_name }
32
+ end
33
+ end
34
+
35
+ def get_teams_in(league)
36
+ divisions = {}
37
+ get_divs(league.to_s.downcase).each do |division|
38
+ key = div_data_name parse_div_name(division)
39
+ divisions[key] = division.css('.mod-content li').map do |team|
40
+ team_elem = team.at_css('h5 a.bi')
41
+ team_name = team_elem.content
42
+ data_name, slug = team_elem['href'].split('/').last(2)
43
+
44
+ slug.sub! dasherize(team_name), ''
45
+ team_name.sub! /\(.*\)/, ''
46
+
47
+ name_adds = slug.split('-').reject(&:empty?).each(&:capitalize!)
48
+ name = name_adds.unshift(team_name).join(' ').strip.gsub(/\s+/, ' ')
49
+ { name: name, data_name: data_name }
50
+ end
51
+ end
52
+ divisions
53
+ end
54
+
55
+
56
+
57
+ private
58
+
59
+
60
+
61
+ def get_divs(league)
62
+ self.get(league, 'teams').css('.mod-teams-list-medium')
63
+ end
64
+
65
+ def get_ncb_conferences
66
+ self.get('ncb', 'conferences').css('.mod-content h5')
67
+ end
68
+
69
+ def parse_div_name(div)
70
+ div.at_css('.mod-header h4 text()').content
71
+ end
72
+
73
+ def div_data_name(div_name)
74
+ dasherize div_name.gsub(/division/i, '')
75
+ end
76
+
77
+ end
78
+ end
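The name handling in `get_teams_in` and `div_data_name` can be exercised without scraping. The sketch below reimplements those steps as standalone methods; the `href` values are assumed examples of the link shape the code expects (data name second to last, slug last), and `team_from_href` is a hypothetical helper name.

```ruby
# Same rule as ESPN.dasherize: trim, lowercase, whitespace runs become dashes.
def dasherize(str)
  str.strip.downcase.gsub(/\s+/, '-')
end

# Division data names drop the word "division" before dasherizing.
def div_data_name(div_name)
  dasherize(div_name.gsub(/division/i, ''))
end

# Hypothetical helper mirroring the per-team block in get_teams_in.
def team_from_href(team_name, href)
  data_name, slug = href.split('/').last(2)
  # Whatever the slug contains beyond the link text is appended to the name.
  leftover = slug.sub(dasherize(team_name), '')
  extras = leftover.split('-').reject(&:empty?).map(&:capitalize)
  name = [team_name.sub(/\(.*\)/, ''), *extras].join(' ').strip.gsub(/\s+/, ' ')
  { name: name, data_name: data_name }
end

puts div_data_name('Pacific Division') # pacific
puts div_data_name('NFC East')         # nfc-east
p team_from_href('Boston Celtics', '/nba/team/_/name/bos/boston-celtics')
p team_from_href('Rice', '/college-football/team/_/id/242/rice-owls')
```

The second call shows why the slug is consulted at all: when the link text is only the school name, the mascot is recovered from the remainder of the slug.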
@@ -0,0 +1,3 @@
1
+ module ESPN
2
+ VERSION = '1.3.1'
3
+ end
@@ -0,0 +1,35 @@
1
+ require 'test_helper'
2
+
3
+ class DivisionsTest < EspnTest
4
+
5
+ test 'nfl afc north' do
6
+ nfl = ESPN.get_divisions['nfl']
7
+ assert nfl.include?({ name: 'AFC North', data_name: 'afc-north' })
8
+ end
9
+
10
+ test 'mlb al west' do
11
+ mlb = ESPN.get_divisions['mlb']
12
+ assert mlb.include?({ name: 'AL West', data_name: 'al-west' })
13
+ end
14
+
15
+ test 'nba central' do
16
+ nba = ESPN.get_divisions['nba']
17
+ assert nba.include?({ name: 'Central', data_name: 'central' })
18
+ end
19
+
20
+ test 'nhl pacific division' do
21
+ nhl = ESPN.get_divisions['nhl']
22
+ assert nhl.include?({ name: 'Pacific Division', data_name: 'pacific' })
23
+ end
24
+
25
+ test 'ncf conference usa' do
26
+ ncf = ESPN.get_divisions['ncf']
27
+ assert ncf.include?({ name: 'Conference USA', data_name: 'conference-usa' })
28
+ end
29
+
30
+ test 'ncb' do
31
+ ncb = ESPN.get_divisions['ncb']
32
+ assert ncb.include?({ name: 'ACC', data_name: 'acc' })
33
+ end
34
+
35
+ end
@@ -0,0 +1,34 @@
1
+ require 'test_helper'
2
+
3
+ class BoilerplateTest < EspnTest
4
+
5
+ test 'espn is up' do
6
+ assert ESPN.responding?
7
+ assert !ESPN.down?
8
+ end
9
+
10
+ test 'paths are working' do
11
+ assert_equal 'http://scores.espn.go.com', ESPN.url('scores')
12
+ assert_equal 'http://espn.go.com/nba/teams', ESPN.url('nba', 'teams')
13
+ end
14
+
15
+ test 'error message works' do
16
+ assert_raises(ArgumentError) do
17
+ ESPN.get('bad-api-keyword')
18
+ end
19
+ end
20
+
21
+ test 'get pages is working' do
22
+ assert ESPN.get('scores')
23
+ end
24
+
25
+ test 'dasherize strings' do
26
+ assert_equal 'string-is-dashed', ESPN.send(:dasherize, 'String is dashed')
27
+ end
28
+
29
+ test 'leagues' do
30
+ leagues = 'nfl mlb nba nhl ncf ncb'.split
31
+ assert_equal leagues, ESPN.leagues
32
+ end
33
+
34
+ end
@@ -0,0 +1,40 @@
1
+ require 'test_helper'
2
+
3
+ class MlbTest < EspnTest
4
+
5
+ test 'mlb august 13th 2012 yankees beat rangers' do
6
+ starts_at = DateTime.parse('2012-08-13T23:00:00+00:00')
7
+ expected = {
8
+ league: 'mlb',
9
+ game_date: starts_at,
10
+ home_team: 'nyy',
11
+ home_score: 8,
12
+ away_team: 'tex',
13
+ away_score: 2
14
+ }
15
+ scores = ESPN.get_mlb_scores(starts_at.to_date)
16
+ assert scores.include?(expected), 'A known MLB final score cannot be found'
17
+ end
18
+
19
+ test 'mlb april 18th 2015 blue jays beat braves in extra innings' do
20
+ starts_at = DateTime.parse('2015-04-18T17:07:00+00:00')
21
+ expected = {
22
+ league: 'mlb',
23
+ game_date: starts_at,
24
+ home_team: 'tor',
25
+ home_score: 6,
26
+ away_team: 'atl',
27
+ away_score: 5
28
+ }
29
+ scores = ESPN.get_mlb_scores(starts_at.to_date)
30
+ assert scores.include?(expected), 'A known MLB final score cannot be found'
31
+ end
32
+
33
+ test 'random mlb days' do
34
+ random_days.each do |day|
35
+ scores = ESPN.get_mlb_scores(day)
36
+ assert all_names_present?(scores), "Error on #{day} for mlb"
37
+ end
38
+ end
39
+
40
+ end
@@ -0,0 +1,26 @@
1
+ require 'test_helper'
2
+
3
+ class NbaTest < EspnTest
4
+
5
+ test 'nba december 25th celtics beat nets' do
6
+ day = Date.parse('Dec 25, 2012')
7
+ expected = {
8
+ league: 'nba',
9
+ game_date: DateTime.parse('2012-12-25T17:00:00+00:00'),
10
+ home_team: 'bkn',
11
+ home_score: 76,
12
+ away_team: 'bos',
13
+ away_score: 93
14
+ }
15
+ scores = ESPN.get_nba_scores(day)
16
+ assert_equal expected, scores.first
17
+ end
18
+
19
+ test 'random nba days' do
20
+ random_days.each do |day|
21
+ scores = ESPN.get_nba_scores(day)
22
+ assert all_names_present?(scores), "Error on #{day} for nba"
23
+ end
24
+ end
25
+
26
+ end
@@ -0,0 +1,30 @@
1
+ require 'test_helper'
2
+
3
+ class NcbTest < EspnTest
4
+
5
+ test 'mens college basketball march 15th murray state beats colorado state' do
6
+ day = Date.parse('2012-03-15')
7
+ league = 'mens-college-basketball'
8
+ expected = {
9
+ league: league,
10
+ game_date: day,
11
+ home_team: 'murr',
12
+ home_score: 58,
13
+ away_team: 'csu',
14
+ away_score: 41
15
+ }
16
+ mountain_west_conf = ESPN.get_conferences_in_ncb.select {|c| c[:name] == 'Mountain West' }.first
17
+ scores = ESPN.get_college_basketball_scores(day, mountain_west_conf[:data_name])
18
+ assert_equal expected, scores.first
19
+ end
20
+
21
+
22
+ test 'random ncb dates' do
23
+ random_days.each do |day|
24
+ random_conf = ESPN.get_conferences_in_ncb.sample
25
+ scores = ESPN.get_college_basketball_scores(day, random_conf[:data_name])
26
+ assert all_names_present?(scores), "Error on #{day} for college basketball"
27
+ end
28
+ end
29
+
30
+ end
@@ -0,0 +1,26 @@
1
+ require 'test_helper'
2
+
3
+ class NcfTest < EspnTest
4
+
5
+ test 'college football 2012 week 9 regular season' do
6
+ starts_at = DateTime.parse('2012-10-23T00:00:00+00:00')
7
+ expected = {
8
+ league: 'college-football',
9
+ game_date: starts_at,
10
+ home_team: '309',
11
+ home_score: 27,
12
+ away_team: '2032',
13
+ away_score: 50
14
+ }
15
+ scores = ESPN.get_college_football_scores(2012, 9)
16
+ assert_equal expected, scores.first
17
+ end
18
+
19
+ test 'looking for a break' do
20
+ random_weeks.each do |week|
21
+ scores = ESPN.get_college_football_scores(2012, week)
22
+ assert all_names_present?(scores), "!!! error in week #{week}"
23
+ end
24
+ end
25
+
26
+ end
@@ -0,0 +1,46 @@
1
+ require 'test_helper'
2
+
3
+ class NflTest < EspnTest
4
+
5
+ test 'data names are fixed' do
6
+ scores = ESPN.get_nfl_scores(2012, 2)
7
+ assert scores.any?, 'scores parsing failed'
8
+ assert_equal 'gb', scores.first[:home_team]
9
+ end
10
+
11
+ test 'nfl 2012 week 8 regular season' do
12
+ starts_at = DateTime.parse('2012-10-26T00:20Z')
13
+ expected = {
14
+ league: 'nfl',
15
+ game_date: starts_at,
16
+ home_team: 'min',
17
+ home_score: 17,
18
+ away_team: 'tb',
19
+ away_score: 36
20
+ }
21
+ scores = ESPN.get_nfl_scores(2012, 8)
22
+ assert_equal expected, scores.first
23
+ end
24
+
25
+ test 'nfl 2012 week 7 regular season' do
26
+ starts_at = DateTime.parse('2012-10-23T00:30:00+00:00')
27
+ expected = {
28
+ league: 'nfl',
29
+ game_date: starts_at,
30
+ home_team: 'chi',
31
+ home_score: 13,
32
+ away_team: 'det',
33
+ away_score: 7
34
+ }
35
+ scores = ESPN.get_nfl_scores(2012, 7)
36
+ assert_equal expected, scores.last
37
+ end
38
+
39
+ test 'looking for a break' do
40
+ random_weeks.each do |week|
41
+ scores = ESPN.get_nfl_scores(2012, week)
42
+ assert all_names_present?(scores), "!!! error in week #{week}"
43
+ end
44
+ end
45
+
46
+ end
@@ -0,0 +1,26 @@
1
+ require 'test_helper'
2
+
3
+ class NhlTest < EspnTest
4
+
5
+ test 'nhl rangers beat bruins on valentines day' do
6
+ day = Date.parse('Feb 14, 2012')
7
+ expected = {
8
+ league: 'nhl',
9
+ game_date: day,
10
+ home_team: 'bos',
11
+ home_score: 0,
12
+ away_team: 'nyr',
13
+ away_score: 3
14
+ }
15
+ scores = ESPN.get_nhl_scores(day)
16
+ assert_equal expected, scores.first
17
+ end
18
+
19
+ test 'random nhl dates' do
20
+ random_days.each do |day|
21
+ scores = ESPN.get_nhl_scores(day)
22
+ assert all_names_present?(scores), "Error on #{day} for nhl"
23
+ end
24
+ end
25
+
26
+ end
@@ -0,0 +1,79 @@
1
+ require 'test_helper'
2
+
3
+ class TeamsTest < EspnTest
4
+
5
+ test 'scrape nfl teams' do
6
+ divisions = ESPN.get_teams_in('nfl')
7
+ assert_equal 8, divisions.count
8
+ divisions.each do |name, teams|
9
+ assert_equal 4, teams.count
10
+ end
11
+ teams = divisions.values.flatten
12
+ assert_equal 32, teams.map{ |h| h[:name] }.uniq.count
13
+ assert_equal 32, teams.map{ |h| h[:data_name] }.uniq.count
14
+ assert divisions['nfc-west'].include?({ name: 'Seattle Seahawks', data_name: 'sea' })
15
+ end
16
+
17
+ test 'scrape mlb teams' do
18
+ divisions = ESPN.get_teams_in('mlb')
19
+ assert_equal 6, divisions.count
20
+ divisions.each do |name, teams|
21
+ assert_equal 5, teams.count
22
+ end
23
+ teams = divisions.values.flatten
24
+ assert_equal 30, teams.map{ |h| h[:name] }.uniq.count
25
+ assert_equal 30, teams.map{ |h| h[:data_name] }.uniq.count
26
+ assert divisions['al-west'].include?({ name: 'Seattle Mariners', data_name: 'sea' })
27
+ end
28
+
29
+ test 'scrape nba teams' do
30
+ divisions = ESPN.get_teams_in('nba')
31
+ assert_equal 6, divisions.count
32
+ divisions.each do |name, teams|
33
+ assert_equal 5, teams.count
34
+ end
35
+ teams = divisions.values.flatten
36
+ assert_equal 30, teams.map{ |h| h[:name] }.uniq.count
37
+ assert_equal 30, teams.map{ |h| h[:data_name] }.uniq.count
38
+ assert divisions['atlantic'].include?({ name: 'Toronto Raptors', data_name: 'tor' })
39
+ end
40
+
41
+ test 'scrape nhl teams' do
42
+ divisions = ESPN.get_teams_in('nhl')
43
+ assert_equal 4, divisions.count
44
+ assert_equal 7, divisions['central'].count
45
+ assert_equal 8, divisions['atlantic'].count
46
+ teams = divisions.values.flatten
47
+ assert_equal 30, teams.map{ |h| h[:name] }.uniq.count
48
+ assert_equal 30, teams.map{ |h| h[:data_name] }.uniq.count
49
+ assert divisions['atlantic'].include?({ name: 'Montreal Canadiens', data_name: 'mtl' })
50
+ end
51
+
52
+ test 'scrape ncaa football teams' do
53
+ divisions = ESPN.get_teams_in('college-football')
54
+ assert_equal 25, divisions.count
55
+ assert_equal 12, divisions['pac-12'].count
56
+
57
+ assert divisions['conference-usa'].include?({ name: 'Rice Owls', data_name: '242' })
58
+ assert divisions['meac'].include?({ name: 'Bethune-Cookman Wildcats', data_name: '2065' })
59
+ assert divisions['northeast'].include?({ name: 'St Francis Red Flash', data_name: '2598' })
60
+ assert divisions['swac'].include?({ name: 'Alabama A&M Bulldogs', data_name: '2010' })
61
+ end
62
+
63
+ test 'scrape ncaa basketball teams' do
64
+ divisions = ESPN.get_teams_in('mens-college-basketball')
65
+ assert_equal 32, divisions.count
66
+ assert_equal 15, divisions['acc'].count
67
+ assert_equal 10, divisions['patriot-league'].count
68
+
69
+ assert divisions['southland'].include?({ name: 'Texas A&M-CC Islanders', data_name: '357' })
70
+ assert divisions['atlantic-10'].include?({ name: "Saint Joe's Saint Joseph's Hawks", data_name: '2603' })
71
+ end
72
+
73
+ test 'scrape ncaa basketball conferences' do
74
+ conferences = ESPN.get_conferences_in_ncb
75
+ assert_equal 32, conferences.count
76
+ assert conferences.include?({ name: 'Mountain West', data_name: '44' })
77
+ end
78
+
79
+ end
@@ -0,0 +1,36 @@
1
+ ERROR_CHECKS = 1
2
+
3
+ require 'minitest/autorun'
4
+ require 'espn_scraper'
5
+
6
+ class EspnTest < Minitest::Test
7
+
8
+ class << self
9
+
10
+ def test(test_name, &block)
11
+ define_method("test_: #{ test_name }", &block)
12
+ end
13
+
14
+ end
15
+
16
+ def all_names_present?(ary)
17
+ ary.map do |obj|
18
+ h, a = obj[:home_team], obj[:away_team]
19
+ test = h.nil? || (h && h.empty?) || a.nil? || (a && a.empty?)
20
+ puts h, a if test
21
+ test
22
+ end.count(true) == 0
23
+ end
24
+
25
+ def whole_year
26
+ Date.today.prev_year..Date.today
27
+ end
28
+
29
+ def random_days(amount = ERROR_CHECKS)
30
+ whole_year.to_a.sample(amount)
31
+ end
32
+
33
+ def random_weeks(amount = ERROR_CHECKS)
34
+ (1..17).to_a.sample(amount)
35
+ end
36
+ end
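The random-date helpers depend on a `Date` range that can be enumerated, and a Ruby range only yields elements when its start is not after its end. A minimal sketch of the sampling follows; the optional `today` parameter is an assumption added here so the example is deterministic, and it is not part of the test helper.

```ruby
require 'date'

# Ascending range covering the year that ends on `today`.
def whole_year(today = Date.today)
  today.prev_year..today
end

# Pick `amount` distinct days from that year.
def random_days(amount = 1, today = Date.today)
  whole_year(today).to_a.sample(amount)
end

today = Date.new(2015, 4, 18)
puts((today..today.prev_year).to_a.empty?) # true: a descending range has no elements
puts random_days(3, today).size            # 3
```

If the range were written with the later date first, `sample` would always return an empty array and every "random" test loop would pass without making a single request.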
metadata ADDED
@@ -0,0 +1,101 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: espn_scraper
3
+ version: !ruby/object:Gem::Version
4
+ version: 1.3.1
5
+ platform: ruby
6
+ authors:
7
+ - aj0strow
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2013-04-07 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: httparty
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - ">="
18
+ - !ruby/object:Gem::Version
19
+ version: '0'
20
+ type: :runtime
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - ">="
25
+ - !ruby/object:Gem::Version
26
+ version: '0'
27
+ - !ruby/object:Gem::Dependency
28
+ name: nokogiri
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - ">="
32
+ - !ruby/object:Gem::Version
33
+ version: '0'
34
+ type: :runtime
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - ">="
39
+ - !ruby/object:Gem::Version
40
+ version: '0'
41
+ description: a simple scraping api for espn stats and data
42
+ email: alexander.ostrow@gmail.com
43
+ executables: []
44
+ extensions: []
45
+ extra_rdoc_files: []
46
+ files:
47
+ - ".gitignore"
48
+ - ".travis.yml"
49
+ - Gemfile
50
+ - README.md
51
+ - Rakefile
52
+ - espn_scraper.gemspec
53
+ - lib/espn_scraper.rb
54
+ - lib/espn_scraper/boilerplate.rb
55
+ - lib/espn_scraper/scores.rb
56
+ - lib/espn_scraper/teams.rb
57
+ - lib/espn_scraper/version.rb
58
+ - test/espn_scraper_test/divisions_test.rb
59
+ - test/espn_scraper_test/espn_test.rb
60
+ - test/espn_scraper_test/mlb_test.rb
61
+ - test/espn_scraper_test/nba_test.rb
62
+ - test/espn_scraper_test/ncb_test.rb
63
+ - test/espn_scraper_test/ncf_test.rb
64
+ - test/espn_scraper_test/nfl_test.rb
65
+ - test/espn_scraper_test/nhl_test.rb
66
+ - test/espn_scraper_test/teams_test.rb
67
+ - test/test_helper.rb
68
+ homepage: http://github.com/aj0strow/espn-scraper
69
+ licenses: []
70
+ metadata: {}
71
+ post_install_message:
72
+ rdoc_options: []
73
+ require_paths:
74
+ - lib
75
+ required_ruby_version: !ruby/object:Gem::Requirement
76
+ requirements:
77
+ - - ">="
78
+ - !ruby/object:Gem::Version
79
+ version: '0'
80
+ required_rubygems_version: !ruby/object:Gem::Requirement
81
+ requirements:
82
+ - - ">="
83
+ - !ruby/object:Gem::Version
84
+ version: '0'
85
+ requirements: []
86
+ rubyforge_project:
87
+ rubygems_version: 2.5.2
88
+ signing_key:
89
+ specification_version: 4
90
+ summary: ESPN Scraper
91
+ test_files:
92
+ - test/espn_scraper_test/divisions_test.rb
93
+ - test/espn_scraper_test/espn_test.rb
94
+ - test/espn_scraper_test/mlb_test.rb
95
+ - test/espn_scraper_test/nba_test.rb
96
+ - test/espn_scraper_test/ncb_test.rb
97
+ - test/espn_scraper_test/ncf_test.rb
98
+ - test/espn_scraper_test/nfl_test.rb
99
+ - test/espn_scraper_test/nhl_test.rb
100
+ - test/espn_scraper_test/teams_test.rb
101
+ - test/test_helper.rb