espn_scraper 1.3.1

@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: 4e71f14160cc29d944f5860b4eaa3f983d353890
4
+ data.tar.gz: 75353026bf2aa879d8151a4c1455314e69af8c78
5
+ SHA512:
6
+ metadata.gz: 5f06578dd6ca1eeefa13910a35a03e84a4781499de5d478c924707db90a476780fb25a9276a6ede4d3b66e1135c31dd1b0c105ad3bd37cc213cd15ae050720f6
7
+ data.tar.gz: d2b1eda63a460fc734f7280ae171ce0e9260612d1513bc6507dda595d8d484d038a7dc3df213c5066fcfb51151799156cd81b51a07a75bd204cbec1d99c267e8
data/.gitignore ADDED
@@ -0,0 +1,2 @@
1
+ .ruby-version
2
+ Gemfile.lock
data/.travis.yml ADDED
@@ -0,0 +1,4 @@
1
+ language: ruby
2
+ rvm:
3
+ - 2.0.0
4
+ - 1.9.3
data/Gemfile ADDED
@@ -0,0 +1,6 @@
1
+ source 'https://rubygems.org/'
2
+
3
+ gem 'rake', '~> 10.4.2'
4
+ gem 'httparty'
5
+ gem 'nokogiri'
6
+ gem 'minitest', '~> 5.6.0'
data/README.md ADDED
@@ -0,0 +1,141 @@
1
+ # ESPN Scraper
2
+
3
+ ESPN Scraper is a simple gem for scraping teams and scores from `ESPN`'s website. Please note that `ESPN` is not involved with this gem or me in any way. I chose `ESPN` because it is a leader in sports statistics and has a robust website.
4
+
5
+ ```ruby
6
+ ESPN.responding?
7
+ # => true
8
+ ```
9
+
10
+ Let's begin...
11
+
12
+ #### Supported leagues
13
+
14
+ The gem only supports the following leagues:
15
+
16
+ ```ruby
17
+ ESPN.leagues
18
+ # => [ "nfl", "mlb", "nba", "nhl", "ncf", "ncb" ]
19
+ ```
20
+
21
+ These are the NFL, MLB, NBA, NHL, NCAA D1 Football, and NCAA D1 Men's Basketball, respectively.
22
+
23
+ #### Scrape Divisions
24
+
25
+ You can get all the divisions in each league.
26
+
27
+ ```ruby
28
+ ESPN.get_divisions
29
+ # => {
30
+ # "nfl" => [
31
+ # { :name => "NFC East", :data_name => "nfc-east" },
32
+ # { :name => "NFC West", :data_name => "nfc-west" },
33
+ # ...
34
+ # ],
35
+ # "mlb" => ...
36
+ # }
37
+ ```
38
+
39
+ #### Scrape Conferences (NCAA D1 Men's Basketball only)
40
+
41
+ You can get all the conferences in NCAA D1 Men's Basketball.
42
+
43
+ ```ruby
44
+ ESPN.get_conferences_in_ncb
45
+ # => [{:name=>"America East", :data_name=>"1"},
46
+ # {:name=>"American", :data_name=>"62"},
47
+ # ...
48
+ # ]
49
+ ```
50
+
51
+ #### Scrape teams
52
+
53
+ You can get the teams in a league by passing its acronym. The call returns a hash keyed by division data name, where each value is an array of hashes, one per team in that division.
54
+
55
+ ```ruby
56
+ ESPN.get_teams_in('nba')
57
+ # => {
58
+ # "atlantic"=> [
59
+ # { :name => "Boston Celtics", :data_name => "bos" },
60
+ # { :name => "Brooklyn Nets", :data_name => "bkn" },
61
+ # { :name => "New York Knicks", :data_name => "ny" },
62
+ # { :name => "Philadelphia 76ers", :data_name => "phi" },
63
+ # { :name => "Toronto Raptors", :data_name => "tor" }
64
+ # ],
65
+ # "pacific" => ...
66
+ # }
67
+ ```
68
+
69
+ #### Scraping scores
70
+
71
+ All score requests return an array of hashes. Here's an example NFL score hash:
72
+
73
+ ```ruby
74
+ {
75
+ league: 'nfl',
76
+ game_date: #<DateTime: 2012-10-25>,
77
+ home_team: 'min',
78
+ home_score: 17,
79
+ away_team: 'tb',
80
+ away_score: 36
81
+ }
82
+ ```
83
+
84
+ You'll notice the teams are identified with the same `:data_name` returned by an `ESPN.get_teams_in` request. One complication with scraping scores is that football is organized by year and week, while baseball, basketball, and hockey are organized by date.
85
+
86
+ ###### weekly (football)
87
+
88
+ Pattern is `ESPN.get_<league>_scores(year, week)`. This is for `nfl` and `ncf`:
89
+
90
+ ```ruby
91
+ ESPN.get_nfl_scores(2012, 8)
92
+ ESPN.get_ncf_scores(2011, 3)
93
+ ```
94
+
95
+ ###### daily (baseball, basketball, hockey)
96
+
97
+ Pattern is `ESPN.get_<league>_scores(date)`. This is for `mlb`, `nba`, `nhl`, `ncb`:
98
+
99
+ ```ruby
100
+ ESPN.get_mlb_scores( Date.parse('Aug 13, 2012') )
101
+ ESPN.get_nba_scores( Date.parse('Dec 25, 2011') )
102
+ ESPN.get_nhl_scores( Date.parse('Feb 14, 2009') )
103
+ ESPN.get_ncb_scores( Date.parse('Mar 15, 2012') )
104
+ ```
105
+
106
+ ## Installing
107
+
108
+ Add the gem to your `Gemfile`:
109
+
110
+ ```ruby
111
+ gem 'espn_scraper', git: 'git://github.com/aj0strow/espn-scraper.git'
112
+ # or
113
+ gem 'espn_scraper', github: 'aj0strow/espn-scraper'
114
+ ```
115
+
116
+ ...and then require it. I personally use it in the rake tasks of a Rails app.
117
+
118
+ ```ruby
119
+ require 'espn_scraper'
120
+ ```
121
+
122
+ ## Contributing
123
+
124
+ Please report back if something breaks on you!
125
+
126
+ Also, please let me know if any of the data names become outdated. For instance, a number of NFL data names were recently changed. You can apply temporary fixes with the following:
127
+
128
+ ```ruby
129
+ ESPN::DATA_NAME_FIXES['nfl']['gnb'] = 'gb'
130
+ ```
131
+
132
+ Future plans:
133
+ - Get start and end dates of a season
134
+
135
+ ### Thank You
136
+
137
+ * Dan Madere ([dgmdan](https://github.com/dgmdan))
138
+
139
+ ---
140
+
141
+ **MIT License**
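The `DATA_NAME_FIXES` override described in the README can be illustrated with a minimal, network-free sketch. `SAMPLE_FIXES` and `apply_fixes` are hypothetical names used only for this illustration, and the table holds just a small subset of the gem's entries. Unlike the gem's `add_league_and_fixes`, which updates each report in place, this version returns new hashes.

```ruby
# Hypothetical subset of a fix table: league => { outdated data name => current data name }.
SAMPLE_FIXES = { 'nfl' => { 'gnb' => 'gb', 'nwe' => 'ne' }, 'mlb' => {} }

# Returns copies of the score hashes with the league set and outdated team names replaced.
def apply_fixes(scores, league, fixes = SAMPLE_FIXES)
  league_fixes = fixes.fetch(league, {})
  scores.map do |report|
    report.merge(
      league: league,
      # Fall back to the original name when no fix is registered for it.
      home_team: league_fixes.fetch(report[:home_team], report[:home_team]),
      away_team: league_fixes.fetch(report[:away_team], report[:away_team])
    )
  end
end

p apply_fixes([{ home_team: 'gnb', away_team: 'chi' }], 'nfl')
```

Names without an entry in the table pass through unchanged, so an empty table for a league is a no-op.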
data/Rakefile ADDED
@@ -0,0 +1,9 @@
1
+ require "bundler/gem_tasks"
2
+ require 'rake/testtask'
3
+
4
+ Rake::TestTask.new do |t|
5
+ t.libs << 'test'
6
+ t.test_files = FileList['test/espn_scraper_test/*_test.rb']
7
+ end
8
+
9
+ task default: :test
data/espn_scraper.gemspec ADDED
@@ -0,0 +1,19 @@
1
+ $:.push File.expand_path("../lib", __FILE__)
2
+ require 'espn_scraper/version'
3
+
4
+ Gem::Specification.new do |s|
5
+ s.name = 'espn_scraper'
6
+ s.version = ESPN::VERSION
7
+ s.date = '2013-04-07'
8
+ s.summary = 'ESPN Scraper'
9
+ s.description = 'a simple scraping api for espn stats and data'
10
+ s.authors = %w[ aj0strow ]
11
+ s.email = 'alexander.ostrow@gmail.com'
12
+ s.homepage = 'http://github.com/aj0strow/espn-scraper'
13
+
14
+ s.add_dependency 'httparty'
15
+ s.add_dependency 'nokogiri'
16
+
17
+ s.files = `git ls-files`.split("\n")
18
+ s.test_files = `git ls-files -- test`.split("\n")
19
+ end
data/lib/espn_scraper.rb ADDED
@@ -0,0 +1,3 @@
1
+ %w[ boilerplate teams scores ].each do |file|
2
+ require "espn_scraper/#{ file }"
3
+ end
data/lib/espn_scraper/boilerplate.rb ADDED
@@ -0,0 +1,50 @@
1
+ require 'httparty'
2
+ require 'nokogiri'
3
+
4
+ module ESPN
5
+ class << self
6
+
7
+ def responding?
8
+ HTTParty.get('http://espn.go.com/').code == 200
9
+ end
10
+
11
+ def down?
12
+ !responding?
13
+ end
14
+
15
+ # Ex: ESPN.url('scores')
16
+ # ESPN.url('nba', 'teams')
17
+ def url(*path)
18
+ subdomain = (path.first == 'scores') ? path.shift : nil
19
+ domain = [subdomain, 'espn', 'go', 'com'].compact.join('.')
20
+ ['http:/', domain, *path].join('/')
21
+ end
22
+
23
+ # Returns Nokogiri HTML document
24
+ # Ex: ESPN.get('nba', 'teams')
25
+ def get(*path)
26
+ http_url = self.url(*path)
27
+ response = HTTParty.get(http_url)
28
+ if response.code == 200
29
+ Nokogiri::HTML(response.body)
30
+ else
31
+ raise ArgumentError, error_message(http_url, path)
32
+ end
33
+ end
34
+
35
+ def dasherize(str)
36
+ str.strip.downcase.gsub(/\s+/, '-')
37
+ end
38
+
39
+
40
+ private
41
+
42
+
43
+
44
+ def error_message(url, path)
45
+ "The url #{url} from the path #{path} did not return a valid page."
46
+ end
47
+
48
+ end
49
+ end
50
+
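The URL rule in `ESPN.url` and the `dasherize` helper can be exercised without any network access. The sketch below re-implements both as standalone methods; `build_url` and `dasherize_name` are illustrative names, not part of the gem. A leading `'scores'` segment is treated as a subdomain, and all remaining segments are joined as the path.

```ruby
# Illustrative stand-in for ESPN.url; build_url is a hypothetical name.
def build_url(*path)
  segments = path.dup
  # 'scores' as the first segment selects the scores subdomain instead of a path segment.
  subdomain = segments.first == 'scores' ? segments.shift : nil
  domain = [subdomain, 'espn', 'go', 'com'].compact.join('.')
  # Joining 'http:/' and the domain with '/' yields the 'http://' prefix.
  ['http:/', domain, *segments].join('/')
end

# Illustrative stand-in for ESPN.dasherize: trim, lowercase, collapse whitespace to dashes.
def dasherize_name(str)
  str.strip.downcase.gsub(/\s+/, '-')
end

puts build_url('scores')
puts build_url('nba', 'teams')
puts dasherize_name('String is dashed')
```

Note that the league comes before the page name in the path, which is why the teams page is requested as `('nba', 'teams')`.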
data/lib/espn_scraper/scores.rb ADDED
@@ -0,0 +1,311 @@
1
+ require 'uri'
2
+ require 'cgi'
3
+ require 'json'
4
+
5
+ module ESPN
6
+ SEASONS = {
7
+ preseason: 1,
8
+ regular_season: 2,
9
+ postseason: 3
10
+ }
11
+
12
+ mlb_ignores = %w(
13
+ florida-state u-of-south-florida georgetown fla.-southern northeastern boston-college
14
+ miami-florida florida-intl canada hanshin yomiuri sacramento springfield corpus-christi
15
+ round-rock carolina manatee-cc mexico cincinnati-(f) atlanta-(f) frisco toledo norfolk
16
+ fort-myers tampa-bay-(f) nl-all-stars al-all-stars
17
+ )
18
+
19
+ nba_ignores = %w( west-all-stars east-all-stars )
20
+
21
+ nhl_ignores = %w(
22
+ hc-sparta frolunda hc-slovan ev-zug jokerit-helsinki hamburg-freezers adler-mannheim
23
+ team-chara team-alfredsson
24
+ )
25
+
26
+ ncf_ignores = %w( paul-quinn san-diego-christian ferris-st notre-dame-college chaminade
27
+ w-new-mexico n-new-mexico tx-a&m-commerce nw-oklahoma-st )
28
+
29
+ IGNORED_TEAMS = (mlb_ignores + nhl_ignores + nba_ignores + ncf_ignores).inject({}) do |h, team|
30
+ h.merge team => false
31
+ end
32
+
33
+ DATA_NAME_EXCEPTIONS = {
34
+ 'nets' => 'bkn',
35
+ 'supersonics' => 'okc',
36
+ 'hornets' => 'no',
37
+
38
+ 'marlins' => 'mia'
39
+ }.merge(IGNORED_TEAMS)
40
+
41
+ DATA_NAME_FIXES = {
42
+ 'nfl' => {
43
+ 'nwe' => 'ne',
44
+ 'kan' => 'kc',
45
+ 'was' => 'wsh',
46
+ 'nor' => 'no',
47
+ 'gnb' => 'gb',
48
+ 'sfo' => 'sf',
49
+ 'tam' => 'tb',
50
+ 'sdg' => 'sd'
51
+ },
52
+ 'mlb' => {},
53
+ 'nba' => {},
54
+ 'nhl' => {},
55
+ 'ncf' => {},
56
+ 'ncb' => {}
57
+ }
58
+
59
+ # Example output:
60
+ # {
61
+ # league: "nfl",
62
+ # game_date: #<Date: 2013-01-05 ((2456298j,0s,0n),+0s,2299161j)>,
63
+ # home_team: "sea",
64
+ # home_score: 48,
65
+ # away_team: "min",
66
+ # away_score: 27
67
+ # }
68
+
69
+ class << self
70
+
71
+ def get_superbowl_score
72
+ markup = Scores.markup_from_superbowl
73
+ scores = Scores.home_away_parse(markup, false)
74
+ add_league_and_fixes(scores, 'nfl')
75
+ scores
76
+ end
77
+
78
+ def get_nfl_scores(year, week)
79
+ markup = Scores.markup_from_year_and_week('nfl', year, week)
80
+ scores = Scores.home_away_parse(markup)
81
+ add_league_and_fixes(scores, 'nfl')
82
+ scores
83
+ end
84
+
85
+ def get_mlb_scores(date)
86
+ markup = Scores.markup_from_date('mlb', date)
87
+ scores = Scores.home_away_parse(markup)
88
+ scores.each { |report| report[:league] = 'mlb' }
89
+ scores
90
+ end
91
+
92
+ def get_nba_scores(date)
93
+ markup = Scores.markup_from_date('nba', date)
94
+ scores = Scores.home_away_parse(markup)
95
+ scores.each { |report| report[:league] = 'nba' }
96
+ scores
97
+ end
98
+
99
+ def get_nhl_scores(date)
100
+ markup = Scores.markup_from_date('nhl', date)
101
+ scores = Scores.winner_loser_parse(markup, date)
102
+ scores.each { |report| report[:league] = 'nhl' }
103
+ scores
104
+ end
105
+
106
+ def get_ncf_scores(year, week)
107
+ markup = Scores.markup_from_year_and_week('college-football', year, week)
108
+ scores = Scores.ncf_parse(markup)
109
+ scores.each { |report| report[:league] = 'college-football' }
110
+ scores
111
+ end
112
+
113
+ alias_method :get_college_football_scores, :get_ncf_scores
114
+
115
+ def get_ncb_scores(date, conference_id=nil, final_only=true)
116
+ if conference_id
117
+ markup = Scores.markup_from_date_and_conference('ncb', date, conference_id)
118
+ else
119
+ markup = Scores.markup_from_date('ncb', date)
120
+ end
121
+
122
+ scores = Scores.home_away_parse(markup, final_only)
123
+
124
+ scores.each do |report|
125
+ report[:league] ||= 'mens-college-basketball'
126
+ report[:game_date] ||= date
127
+ end
128
+
129
+ scores
130
+ end
131
+
132
+ alias_method :get_college_basketball_scores, :get_ncb_scores
133
+
134
+ def get_ncb_abbreviations(date)
135
+ markup = Scores.markup_from_date('ncb', date)
136
+
137
+ teams = Scores.team_abbreviation_parse(markup)
138
+ teams
139
+ end
140
+
141
+ def add_league_and_fixes(scores, league)
142
+ scores.each do |report|
143
+ report[:league] = league
144
+ [:home_team, :away_team].each do |sym|
145
+ team = report[sym]
146
+ report[sym] = DATA_NAME_FIXES[league][team] || team
147
+ end
148
+ end
149
+ end
150
+ end
151
+
152
+
153
+
154
+ module Scores
155
+ class << self
156
+
157
+ # Get Markup
158
+
159
+ def markup_from_superbowl
160
+ ESPN.get 'scores', 'nfl', "scoreboard/_/group/80/year/#{Time.now.year - 1}/seasontype/3/week/5"
161
+ end
162
+
163
+ def markup_from_year_and_week(league, year, week)
164
+ ESPN.get 'scores', league, "scoreboard/_/group/80/year/#{year}/seasontype/2/week/#{week}"
165
+ end
166
+
167
+ def markup_from_date(league, date)
168
+ day = date.to_s.gsub(/[^\d]+/, '')
169
+ ESPN.get 'scores', league, "scoreboard?date=#{ day }"
170
+ end
171
+
172
+ def markup_from_date_and_conference(league, date, conference_id)
173
+ day = date.to_s.gsub(/[^\d]+/, '')
174
+ ESPN.get league, 'scoreboard', '_', 'group', conference_id.to_s, 'date', day
175
+ end
176
+
177
+ # parsing strategies
178
+
179
+ def home_away_parse(doc, final=true)
180
+ scores = []
181
+ games = []
182
+ espn_regex = /window\.espn\.scoreboardData \t= (\{.*?\});/
183
+ doc.xpath("//script").each do |script_section|
184
+ if script_section.content =~ espn_regex
185
+ espn_data = JSON.parse(espn_regex.match(script_section.content)[1])
186
+ games = espn_data['events']
187
+ break
188
+ end
189
+ end
190
+ games.each do |game|
191
+ # Game must be regular or postseason
192
+ next unless game['season']['type'] == SEASONS[:regular_season] || game['season']['type'] == SEASONS[:postseason]
193
+ score = {}
194
+ competition = game['competitions'].first
195
+
196
+ # Score must be final
197
+ if !final || competition['status']['type']['detail'] =~ /^Final/
198
+ competition['competitors'].each do |competitor|
199
+ if competitor['homeAway'] == 'home'
200
+ score[:home_team] = competitor['team']['abbreviation'].downcase
201
+ score[:home_score] = competitor['score'].to_i
202
+ else
203
+ score[:away_team] = competitor['team']['abbreviation'].downcase
204
+ score[:away_score] = competitor['score'].to_i
205
+ end
206
+ end
207
+ score[:game_date] = DateTime.parse(game['date'])
208
+ score[:status] = competition['status']['type']['detail']
209
+ scores << score
210
+ end
211
+ end
212
+ scores
213
+ end
214
+
215
+ def team_abbreviation_parse(doc)
216
+ games = []
217
+ teams = {}
218
+ espn_regex = /window\.espn\.scoreboardData \t= (\{.*?\});/
219
+ doc.xpath("//script").each do |script_section|
220
+ if script_section.content =~ espn_regex
221
+ espn_data = JSON.parse(espn_regex.match(script_section.content)[1])
222
+ games = espn_data['events']
223
+ break
224
+ end
225
+ end
226
+ games.each do |game|
227
+ competition = game['competitions'].first
228
+ competition['competitors'].each do |competitor|
229
+ teams[competitor['team']['displayName']] = competitor['team']['abbreviation'].downcase
230
+ end
231
+ end
232
+
233
+ teams
234
+ end
235
+
236
+ def ncf_parse(doc)
237
+ scores = []
238
+ games = []
239
+ espn_regex = /window\.espn\.scoreboardData \t= (\{.*?\});/
240
+ doc.xpath("//script").each do |script_section|
241
+ if script_section.content =~ espn_regex
242
+ espn_data = JSON.parse(espn_regex.match(script_section.content)[1])
243
+ games = espn_data['events']
244
+ break
245
+ end
246
+ end
247
+ games.each do |game|
248
+ score = { league: 'college-football' }
249
+ competition = game['competitions'].first
250
+ date = DateTime.parse(competition['startDate'])
251
+ date = date.new_offset('-06:00')
252
+ score[:game_date] = date.to_date
253
+ # Score must be final
254
+ if competition['status']['type']['detail'] =~ /^Final/
255
+ competition['competitors'].each do |competitor|
256
+ if competitor['homeAway'] == 'home'
257
+ score[:home_team] = competitor['team']['id'].downcase
258
+ score[:home_score] = competitor['score'].to_i
259
+ else
260
+ score[:away_team] = competitor['team']['id'].downcase
261
+ score[:away_score] = competitor['score'].to_i
262
+ end
263
+ end
264
+ scores << score
265
+ end
266
+ end
267
+ scores
268
+ end
269
+
270
+ def winner_loser_parse(doc, date)
271
+ doc.css('.mod-scorebox-final').map do |game|
272
+ game_info = { game_date: date }
273
+ teams = game.css('td.team-name:not([colspan])').map { |td| parse_data_name_from(td) }
274
+ game_info[:away_team], game_info[:home_team] = teams
275
+ scores = game.css('.team-score').map { |td| td.at_css('span').content.to_i }
276
+ game_info[:away_score], game_info[:home_score] = scores
277
+ game_info
278
+ end
279
+ end
280
+
281
+ # parsing helpers
282
+
283
+ def parse_data_name_from(container)
284
+ if container.at_css('a')
285
+ link = container.at_css('a')['href']
286
+ self.data_name_from(link)
287
+ else
288
+ if container.at_css('div')
289
+ name = container.at_css('div text()').content
290
+ elsif container.at_css('span')
291
+ name = container.at_css('span text()').content
292
+ else
293
+ name = container.at_css('text()').content
294
+ end
295
+ ESPN::DATA_NAME_EXCEPTIONS[ ESPN.dasherize(name) ]
296
+ end
297
+ end
298
+
299
+ def data_name_from(link)
300
+ encoded_link = URI::encode(link.strip)
301
+ query = URI::parse(encoded_link).query
302
+ if query
303
+ CGI::parse(query)['team'].first
304
+ else
305
+ link.split('/')[-2]
306
+ end
307
+ end
308
+
309
+ end
310
+ end
311
+ end
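The JSON-based parsers above share one technique: find the `window.espn.scoreboardData` assignment inside a script tag, parse the captured object as JSON, and keep only games whose status starts with "Final". The following is a minimal sketch of that flow on a hand-built string instead of a fetched page. `SAMPLE_PAYLOAD`, `SAMPLE_SCRIPT`, and `parse_final_scores` are hypothetical names, and the payload contains only the fields the sketch reads; real pages carry many more.

```ruby
require 'json'
require 'date'

# Same pattern the parsers use to locate the embedded scoreboard JSON.
SCOREBOARD_REGEX = /window\.espn\.scoreboardData \t= (\{.*?\});/

# Hypothetical, hand-built payload with only the fields read below.
SAMPLE_PAYLOAD = {
  'events' => [{
    'date' => '2012-10-26T00:20Z',
    'competitions' => [{
      'status' => { 'type' => { 'detail' => 'Final' } },
      'competitors' => [
        { 'homeAway' => 'home', 'score' => '17', 'team' => { 'abbreviation' => 'MIN' } },
        { 'homeAway' => 'away', 'score' => '36', 'team' => { 'abbreviation' => 'TB' } }
      ]
    }]
  }]
}
SAMPLE_SCRIPT = "window.espn.scoreboardData \t= #{JSON.generate(SAMPLE_PAYLOAD)};"

# Returns one hash per finished game, or an empty array if no scoreboard data is found.
def parse_final_scores(script_text)
  match = SCOREBOARD_REGEX.match(script_text)
  return [] unless match

  games = JSON.parse(match[1])['events']
  scores = games.map do |game|
    competition = game['competitions'].first
    # Skip games that are not final yet.
    next unless competition['status']['type']['detail'] =~ /^Final/

    score = { game_date: DateTime.parse(game['date']) }
    competition['competitors'].each do |competitor|
      side = competitor['homeAway'] == 'home' ? 'home' : 'away'
      score[:"#{side}_team"] = competitor['team']['abbreviation'].downcase
      score[:"#{side}_score"] = competitor['score'].to_i
    end
    score
  end
  scores.compact
end

p parse_final_scores(SAMPLE_SCRIPT)
```

The non-greedy capture stops at the first `};`, so this approach assumes that sequence does not occur inside the embedded JSON itself.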
data/lib/espn_scraper/teams.rb ADDED
@@ -0,0 +1,78 @@
1
+ module ESPN
2
+ class << self
3
+
4
+ def leagues
5
+ @leagues || %w(nfl mlb nba nhl ncf ncb)
6
+ end
7
+
8
+ def leagues=(leagues)
9
+ @leagues = leagues
10
+ end
11
+
12
+ def get_divisions
13
+ divisions = {}
14
+ leagues.each do |league|
15
+ divisions[league] = get_divisions_in(league)
16
+ end
17
+ divisions
18
+ end
19
+
20
+ def get_divisions_in(league)
21
+ get_divs(league).map do |div|
22
+ name = parse_div_name(div)
23
+ { name: name, data_name: div_data_name(name) }
24
+ end
25
+ end
26
+
27
+ def get_conferences_in_ncb
28
+ get_ncb_conferences.map do |element|
29
+ name = element.content
30
+ data_name = $1 if element.children[0].attributes['href'].value =~ /confId=(\d+)/
31
+ { name: name, data_name: data_name }
32
+ end
33
+ end
34
+
35
+ def get_teams_in(league)
36
+ divisions = {}
37
+ get_divs(league.to_s.downcase).each do |division|
38
+ key = div_data_name parse_div_name(division)
39
+ divisions[key] = division.css('.mod-content li').map do |team|
40
+ team_elem = team.at_css('h5 a.bi')
41
+ team_name = team_elem.content
42
+ data_name, slug = team_elem['href'].split('/').last(2)
43
+
44
+ slug.sub! dasherize(team_name), ''
45
+ team_name.sub! /\(.*\)/, ''
46
+
47
+ name_adds = slug.split('-').reject(&:empty?).each(&:capitalize!)
48
+ name = name_adds.unshift(team_name).join(' ').strip.gsub(/\s+/, ' ')
49
+ { name: name, data_name: data_name }
50
+ end
51
+ end
52
+ divisions
53
+ end
54
+
55
+
56
+
57
+ private
58
+
59
+
60
+
61
+ def get_divs(league)
62
+ self.get(league, 'teams').css('.mod-teams-list-medium')
63
+ end
64
+
65
+ def get_ncb_conferences
66
+ self.get('ncb', 'conferences').css('.mod-content h5')
67
+ end
68
+
69
+ def parse_div_name(div)
70
+ div.at_css('.mod-header h4 text()').content
71
+ end
72
+
73
+ def div_data_name(div_name)
74
+ dasherize div_name.gsub(/division/i, '')
75
+ end
76
+
77
+ end
78
+ end
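The name reconstruction in `get_teams_in` is easier to follow on plain strings. The sketch below takes a link's text and `href`, reads the data name and slug from the last two path segments, and appends any slug words that the link text does not already cover. `team_from_link` is a hypothetical helper and the sample URLs are assumed shapes, not captured from the site.

```ruby
def dasherize(str)
  str.strip.downcase.gsub(/\s+/, '-')
end

# link_text is the anchor's text, href its target; returns { name:, data_name: }.
def team_from_link(link_text, href)
  # The last two path segments are the data name and the slug.
  data_name, slug = href.split('/').last(2)
  # Remove the part of the slug already covered by the link text.
  leftover = slug.sub(dasherize(link_text), '')
  # Drop any parenthesised qualifier from the displayed name.
  base_name = link_text.sub(/\(.*\)/, '')
  extras = leftover.split('-').reject(&:empty?).map(&:capitalize)
  name = [base_name, *extras].join(' ').strip.gsub(/\s+/, ' ')
  { name: name, data_name: data_name }
end

p team_from_link('Rice', 'http://espn.go.com/college-football/team/_/id/242/rice-owls')
p team_from_link('Boston Celtics', 'http://espn.go.com/nba/team/_/name/bos/boston-celtics')
```

This version uses non-mutating `sub` and `map` rather than `sub!` and `each(&:capitalize!)`, so it also works on frozen strings.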
data/lib/espn_scraper/version.rb ADDED
@@ -0,0 +1,3 @@
1
+ module ESPN
2
+ VERSION = '1.3.1'
3
+ end
data/test/espn_scraper_test/divisions_test.rb ADDED
@@ -0,0 +1,35 @@
1
+ require 'test_helper'
2
+
3
+ class DivisionsTest < EspnTest
4
+
5
+ test 'nfl afc north' do
6
+ nfl = ESPN.get_divisions['nfl']
7
+ assert nfl.include?({ name: 'AFC North', data_name: 'afc-north' })
8
+ end
9
+
10
+ test 'mlb al west' do
11
+ mlb = ESPN.get_divisions['mlb']
12
+ assert mlb.include?({ name: 'AL West', data_name: 'al-west' })
13
+ end
14
+
15
+ test 'nba central' do
16
+ nba = ESPN.get_divisions['nba']
17
+ assert nba.include?({ name: 'Central', data_name: 'central' })
18
+ end
19
+
20
+ test 'nhl pacific division' do
21
+ nhl = ESPN.get_divisions['nhl']
22
+ assert nhl.include?({ name: 'Pacific Division', data_name: 'pacific' })
23
+ end
24
+
25
+ test 'ncf conference usa' do
26
+ ncf = ESPN.get_divisions['ncf']
27
+ assert ncf.include?({ name: 'Conference USA', data_name: 'conference-usa' })
28
+ end
29
+
30
+ test 'ncb' do
31
+ ncb = ESPN.get_divisions['ncb']
32
+ assert ncb.include?({ name: 'ACC', data_name: 'acc' })
33
+ end
34
+
35
+ end
data/test/espn_scraper_test/espn_test.rb ADDED
@@ -0,0 +1,34 @@
1
+ require 'test_helper'
2
+
3
+ class BoilerplateTest < EspnTest
4
+
5
+ test 'espn is up' do
6
+ assert ESPN.responding?
7
+ assert !ESPN.down?
8
+ end
9
+
10
+ test 'paths are working' do
11
+ assert_equal 'http://scores.espn.go.com', ESPN.url('scores')
12
+ assert_equal 'http://espn.go.com/nba/teams', ESPN.url('nba', 'teams')
13
+ end
14
+
15
+ test 'error message works' do
16
+ assert_raises(ArgumentError) do
17
+ ESPN.get('bad-api-keyword')
18
+ end
19
+ end
20
+
21
+ test 'get pages is working' do
22
+ assert ESPN.get('scores')
23
+ end
24
+
25
+ test 'dasherize strings' do
26
+ assert_equal 'string-is-dashed', ESPN.send(:dasherize, 'String is dashed')
27
+ end
28
+
29
+ test 'leagues' do
30
+ leagues = 'nfl mlb nba nhl ncf ncb'.split
31
+ assert_equal leagues, ESPN.leagues
32
+ end
33
+
34
+ end
data/test/espn_scraper_test/mlb_test.rb ADDED
@@ -0,0 +1,40 @@
1
+ require 'test_helper'
2
+
3
+ class MlbTest < EspnTest
4
+
5
+ test 'mlb august 13th 2012 yankees beat rangers' do
6
+ starts_at = DateTime.parse('2012-08-13T23:00:00+00:00')
7
+ expected = {
8
+ league: 'mlb',
9
+ game_date: starts_at,
10
+ home_team: 'nyy',
11
+ home_score: 8,
12
+ away_team: 'tex',
13
+ away_score: 2
14
+ }
15
+ scores = ESPN.get_mlb_scores(starts_at.to_date)
16
+ assert scores.include?(expected), 'A known MLB final score cannot be found'
17
+ end
18
+
19
+ test 'mlb april 18th 2015 blue jays beat braves in extra innings' do
20
+ starts_at = DateTime.parse('2015-04-18T17:07:00+00:00')
21
+ expected = {
22
+ league: 'mlb',
23
+ game_date: starts_at,
24
+ home_team: 'tor',
25
+ home_score: 6,
26
+ away_team: 'atl',
27
+ away_score: 5
28
+ }
29
+ scores = ESPN.get_mlb_scores(starts_at.to_date)
30
+ assert scores.include?(expected), 'A known MLB final score cannot be found'
31
+ end
32
+
33
+ test 'random mlb days' do
34
+ random_days.each do |day|
35
+ scores = ESPN.get_mlb_scores(day)
36
+ assert all_names_present?(scores), "Error on #{day} for mlb"
37
+ end
38
+ end
39
+
40
+ end
data/test/espn_scraper_test/nba_test.rb ADDED
@@ -0,0 +1,26 @@
1
+ require 'test_helper'
2
+
3
+ class NbaTest < EspnTest
4
+
5
+ test 'nba december 25th celtics beat nets' do
6
+ day = Date.parse('Dec 25, 2012')
7
+ expected = {
8
+ league: 'nba',
9
+ game_date: DateTime.parse('2012-12-25T17:00:00+00:00'),
10
+ home_team: 'bkn',
11
+ home_score: 76,
12
+ away_team: 'bos',
13
+ away_score: 93
14
+ }
15
+ scores = ESPN.get_nba_scores(day)
16
+ assert_equal expected, scores.first
17
+ end
18
+
19
+ test 'random nba days' do
20
+ random_days.each do |day|
21
+ scores = ESPN.get_nba_scores(day)
22
+ assert all_names_present?(scores), "Error on #{day} for nba"
23
+ end
24
+ end
25
+
26
+ end
data/test/espn_scraper_test/ncb_test.rb ADDED
@@ -0,0 +1,30 @@
1
+ require 'test_helper'
2
+
3
+ class NcbTest < EspnTest
4
+
5
+ test 'mens college basketball march 15th murray state beats colorado state' do
6
+ day = Date.parse('2012-03-15')
7
+ league = 'mens-college-basketball'
8
+ expected = {
9
+ league: league,
10
+ game_date: day,
11
+ home_team: 'murr',
12
+ home_score: 58,
13
+ away_team: 'csu',
14
+ away_score: 41
15
+ }
16
+ mountain_west_conf = ESPN.get_conferences_in_ncb.select {|c| c[:name] == 'Mountain West' }.first
17
+ scores = ESPN.get_college_basketball_scores(day, mountain_west_conf[:data_name])
18
+ assert_equal expected, scores.first
19
+ end
20
+
21
+
22
+ test 'random ncb dates' do
23
+ random_days.each do |day|
24
+ random_conf = ESPN.get_conferences_in_ncb.sample
25
+ scores = ESPN.get_college_basketball_scores(day, random_conf[:data_name])
26
+ assert all_names_present?(scores), "Error on #{day} for college basketball"
27
+ end
28
+ end
29
+
30
+ end
data/test/espn_scraper_test/ncf_test.rb ADDED
@@ -0,0 +1,26 @@
1
+ require 'test_helper'
2
+
3
+ class NcfTest < EspnTest
4
+
5
+ test 'college football 2012 week 9 regular season' do
6
+ starts_at = DateTime.parse('2012-10-23T00:00:00+00:00')
7
+ expected = {
8
+ league: 'college-football',
9
+ game_date: starts_at,
10
+ home_team: '309',
11
+ home_score: 27,
12
+ away_team: '2032',
13
+ away_score: 50
14
+ }
15
+ scores = ESPN.get_college_football_scores(2012, 9)
16
+ assert_equal expected, scores.first
17
+ end
18
+
19
+ test 'looking for a break' do
20
+ random_weeks.each do |week|
21
+ scores = ESPN.get_college_football_scores(2012, week)
22
+ assert all_names_present?(scores), "!!! error in week #{week}"
23
+ end
24
+ end
25
+
26
+ end
data/test/espn_scraper_test/nfl_test.rb ADDED
@@ -0,0 +1,46 @@
1
+ require 'test_helper'
2
+
3
+ class NflTest < EspnTest
4
+
5
+ test 'data names are fixed' do
6
+ scores = ESPN.get_nfl_scores(2012, 2)
7
+ assert scores.any?, 'scores parsing failed'
8
+ assert_equal 'gb', scores.first[:home_team]
9
+ end
10
+
11
+ test 'nfl 2012 week 8 regular season' do
12
+ starts_at = DateTime.parse('2012-10-26T00:20Z')
13
+ expected = {
14
+ league: 'nfl',
15
+ game_date: starts_at,
16
+ home_team: 'min',
17
+ home_score: 17,
18
+ away_team: 'tb',
19
+ away_score: 36
20
+ }
21
+ scores = ESPN.get_nfl_scores(2012, 8)
22
+ assert_equal expected, scores.first
23
+ end
24
+
25
+ test 'nfl 2012 week 7 regular season' do
26
+ starts_at = DateTime.parse('2012-10-23T00:30:00+00:00')
27
+ expected = {
28
+ league: 'nfl',
29
+ game_date: starts_at,
30
+ home_team: 'chi',
31
+ home_score: 13,
32
+ away_team: 'det',
33
+ away_score: 7
34
+ }
35
+ scores = ESPN.get_nfl_scores(2012, 7)
36
+ assert_equal expected, scores.last
37
+ end
38
+
39
+ test 'looking for a break' do
40
+ random_weeks.each do |week|
41
+ scores = ESPN.get_nfl_scores(2012, week)
42
+ assert all_names_present?(scores), "!!! error in week #{week}"
43
+ end
44
+ end
45
+
46
+ end
data/test/espn_scraper_test/nhl_test.rb ADDED
@@ -0,0 +1,26 @@
1
+ require 'test_helper'
2
+
3
+ class NhlTest < EspnTest
4
+
5
+ test 'nhl rangers beat bruins on valentines day' do
6
+ day = Date.parse('Feb 14, 2012')
7
+ expected = {
8
+ league: 'nhl',
9
+ game_date: day,
10
+ home_team: 'bos',
11
+ home_score: 0,
12
+ away_team: 'nyr',
13
+ away_score: 3
14
+ }
15
+ scores = ESPN.get_nhl_scores(day)
16
+ assert_equal expected, scores.first
17
+ end
18
+
19
+ test 'random nhl dates' do
20
+ random_days.each do |day|
21
+ scores = ESPN.get_nhl_scores(day)
22
+ assert all_names_present?(scores), "Error on #{day} for nhl"
23
+ end
24
+ end
25
+
26
+ end
data/test/espn_scraper_test/teams_test.rb ADDED
@@ -0,0 +1,79 @@
1
+ require 'test_helper'
2
+
3
+ class TeamsTest < EspnTest
4
+
5
+ test 'scrape nfl teams' do
6
+ divisions = ESPN.get_teams_in('nfl')
7
+ assert_equal 8, divisions.count
8
+ divisions.each do |name, teams|
9
+ assert_equal 4, teams.count
10
+ end
11
+ teams = divisions.values.flatten
12
+ assert_equal 32, teams.map{ |h| h[:name] }.uniq.count
13
+ assert_equal 32, teams.map{ |h| h[:data_name] }.uniq.count
14
+ assert divisions['nfc-west'].include?({ name: 'Seattle Seahawks', data_name: 'sea' })
15
+ end
16
+
17
+ test 'scrape mlb teams' do
18
+ divisions = ESPN.get_teams_in('mlb')
19
+ assert_equal 6, divisions.count
20
+ divisions.each do |name, teams|
21
+ assert_equal 5, teams.count
22
+ end
23
+ teams = divisions.values.flatten
24
+ assert_equal 30, teams.map{ |h| h[:name] }.uniq.count
25
+ assert_equal 30, teams.map{ |h| h[:data_name] }.uniq.count
26
+ assert divisions['al-west'].include?({ name: 'Seattle Mariners', data_name: 'sea' })
27
+ end
28
+
29
+ test 'scrape nba teams' do
30
+ divisions = ESPN.get_teams_in('nba')
31
+ assert_equal 6, divisions.count
32
+ divisions.each do |name, teams|
33
+ assert_equal 5, teams.count
34
+ end
35
+ teams = divisions.values.flatten
36
+ assert_equal 30, teams.map{ |h| h[:name] }.uniq.count
37
+ assert_equal 30, teams.map{ |h| h[:data_name] }.uniq.count
38
+ assert divisions['atlantic'].include?({ name: 'Toronto Raptors', data_name: 'tor' })
39
+ end
40
+
41
+ test 'scrape nhl teams' do
42
+ divisions = ESPN.get_teams_in('nhl')
43
+ assert_equal 4, divisions.count
44
+ assert_equal 7, divisions['central'].count
45
+ assert_equal 8, divisions['atlantic'].count
46
+ teams = divisions.values.flatten
47
+ assert_equal 30, teams.map{ |h| h[:name] }.uniq.count
48
+ assert_equal 30, teams.map{ |h| h[:data_name] }.uniq.count
49
+ assert divisions['atlantic'].include?({ name: 'Montreal Canadiens', data_name: 'mtl' })
50
+ end
51
+
52
+ test 'scrape ncaa football teams' do
53
+ divisions = ESPN.get_teams_in('college-football')
54
+ assert_equal 25, divisions.count
55
+ assert_equal 12, divisions['pac-12'].count
56
+
57
+ assert divisions['conference-usa'].include?({ name: 'Rice Owls', data_name: '242' })
58
+ assert divisions['meac'].include?({ name: 'Bethune-Cookman Wildcats', data_name: '2065' })
59
+ assert divisions['northeast'].include?({ name: 'St Francis Red Flash', data_name: '2598' })
60
+ assert divisions['swac'].include?({ name: 'Alabama A&M Bulldogs', data_name: '2010' })
61
+ end
62
+
63
+ test 'scrape ncaa basketball teams' do
64
+ divisions = ESPN.get_teams_in('mens-college-basketball')
65
+ assert_equal 32, divisions.count
66
+ assert_equal 15, divisions['acc'].count
67
+ assert_equal 10, divisions['patriot-league'].count
68
+
69
+ assert divisions['southland'].include?({ name: 'Texas A&M-CC Islanders', data_name: '357' })
70
+ assert divisions['atlantic-10'].include?({ name: "Saint Joe's Saint Joseph's Hawks", data_name: '2603' })
71
+ end
72
+
73
+ test 'scrape ncaa basketball conferences' do
74
+ conferences = ESPN.get_conferences_in_ncb
75
+ assert_equal 32, conferences.count
76
+ assert conferences.include?({ name: 'Mountain West', data_name: '44' })
77
+ end
78
+
79
+ end
data/test/test_helper.rb ADDED
@@ -0,0 +1,36 @@
1
+ ERROR_CHECKS = 1
2
+
3
+ require 'minitest/autorun'
4
+ require 'espn_scraper'
5
+
6
+ class EspnTest < Minitest::Test
7
+
8
+ class << self
9
+
10
+ def test(test_name, &block)
11
+ define_method("test_: #{ test_name }", &block)
12
+ end
13
+
14
+ end
15
+
16
+ def all_names_present?(ary)
17
+ ary.map do |obj|
18
+ h, a = obj[:home_team], obj[:away_team]
19
+ test = h.nil? || (h && h.empty?) || a.nil? || (a && a.empty?)
20
+ puts h, a if test
21
+ test
22
+ end.count(true) == 0
23
+ end
24
+
25
+ def whole_year
26
+ Date.today.prev_year..Date.today
27
+ end
28
+
29
+ def random_days(amount = ERROR_CHECKS)
30
+ whole_year.to_a.sample(amount)
31
+ end
32
+
33
+ def random_weeks(amount = ERROR_CHECKS)
34
+ (1..17).to_a.sample(amount)
35
+ end
36
+ end
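Sampling random days depends on how Ruby ranges iterate: a range only counts upward, so one whose start is later than its end yields no elements, and sampling from it returns an empty array. A small sketch of the difference, where `days_between` and `sample_days` are hypothetical helper names:

```ruby
require 'date'

# Expands an inclusive date range into an array; empty when first is later than last.
def days_between(first, last)
  (first..last).to_a
end

# Picks up to `amount` distinct days from the year ending today.
def sample_days(amount = 1)
  today = Date.today
  days_between(today.prev_year, today).sample(amount)
end

today = Date.today
puts days_between(today, today.prev_year).length # descending bounds
puts days_between(today.prev_year, today).length # ascending bounds
p sample_days(3)
```

With descending bounds, a loop over the sampled days never runs, so any test built on it passes without making a request.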
metadata ADDED
@@ -0,0 +1,101 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: espn_scraper
3
+ version: !ruby/object:Gem::Version
4
+ version: 1.3.1
5
+ platform: ruby
6
+ authors:
7
+ - aj0strow
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2013-04-07 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: httparty
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - ">="
18
+ - !ruby/object:Gem::Version
19
+ version: '0'
20
+ type: :runtime
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - ">="
25
+ - !ruby/object:Gem::Version
26
+ version: '0'
27
+ - !ruby/object:Gem::Dependency
28
+ name: nokogiri
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - ">="
32
+ - !ruby/object:Gem::Version
33
+ version: '0'
34
+ type: :runtime
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - ">="
39
+ - !ruby/object:Gem::Version
40
+ version: '0'
41
+ description: a simple scraping api for espn stats and data
42
+ email: alexander.ostrow@gmail.com
43
+ executables: []
44
+ extensions: []
45
+ extra_rdoc_files: []
46
+ files:
47
+ - ".gitignore"
48
+ - ".travis.yml"
49
+ - Gemfile
50
+ - README.md
51
+ - Rakefile
52
+ - espn_scraper.gemspec
53
+ - lib/espn_scraper.rb
54
+ - lib/espn_scraper/boilerplate.rb
55
+ - lib/espn_scraper/scores.rb
56
+ - lib/espn_scraper/teams.rb
57
+ - lib/espn_scraper/version.rb
58
+ - test/espn_scraper_test/divisions_test.rb
59
+ - test/espn_scraper_test/espn_test.rb
60
+ - test/espn_scraper_test/mlb_test.rb
61
+ - test/espn_scraper_test/nba_test.rb
62
+ - test/espn_scraper_test/ncb_test.rb
63
+ - test/espn_scraper_test/ncf_test.rb
64
+ - test/espn_scraper_test/nfl_test.rb
65
+ - test/espn_scraper_test/nhl_test.rb
66
+ - test/espn_scraper_test/teams_test.rb
67
+ - test/test_helper.rb
68
+ homepage: http://github.com/aj0strow/espn-scraper
69
+ licenses: []
70
+ metadata: {}
71
+ post_install_message:
72
+ rdoc_options: []
73
+ require_paths:
74
+ - lib
75
+ required_ruby_version: !ruby/object:Gem::Requirement
76
+ requirements:
77
+ - - ">="
78
+ - !ruby/object:Gem::Version
79
+ version: '0'
80
+ required_rubygems_version: !ruby/object:Gem::Requirement
81
+ requirements:
82
+ - - ">="
83
+ - !ruby/object:Gem::Version
84
+ version: '0'
85
+ requirements: []
86
+ rubyforge_project:
87
+ rubygems_version: 2.5.2
88
+ signing_key:
89
+ specification_version: 4
90
+ summary: ESPN Scraper
91
+ test_files:
92
+ - test/espn_scraper_test/divisions_test.rb
93
+ - test/espn_scraper_test/espn_test.rb
94
+ - test/espn_scraper_test/mlb_test.rb
95
+ - test/espn_scraper_test/nba_test.rb
96
+ - test/espn_scraper_test/ncb_test.rb
97
+ - test/espn_scraper_test/ncf_test.rb
98
+ - test/espn_scraper_test/nfl_test.rb
99
+ - test/espn_scraper_test/nhl_test.rb
100
+ - test/espn_scraper_test/teams_test.rb
101
+ - test/test_helper.rb