RubyGems - tf2r - Versions diffs - 0.0.1 → 0.1.0 - Mend

tf2r 0.0.1 → 0.1.0


Files changed (22)

checksums.yaml +4 -4
data/.gitignore +3 -0
data/.ruby-gemset +1 -1
data/.travis.yml +3 -0
data/CHANGELOG.md +0 -0
data/Gemfile.lock +38 -2
data/README.md +30 -2
data/Rakefile +8 -1
data/lib/tf2r/scraper.rb +262 -166
data/lib/tf2r/version.rb +1 -1
data/spec/raffles.html +320 -0
data/spec/scraper_spec.rb +210 -3
data/spec/spec_helper.rb +14 -1
data/spec/vcr/cassettes/raffles.yml +243 -0
data/spec/vcr/cassettes/scrape_raffle_for_creator.yml +213 -0
data/spec/vcr/cassettes/scrape_raffle_for_participants.yml +213 -0
data/spec/vcr/cassettes/scrape_raffle_for_raffle.yml +213 -0
data/spec/vcr/cassettes/scrape_ranks.yml +197 -0
data/spec/vcr/cassettes/scrape_user_not_found.yml +119 -0
data/spec/vcr/cassettes/scrape_user_real.yml +275 -0
data/tf2r.gemspec +17 -11
metadata +84 -10

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz: 143e9f53a0a365e8bf98ac186a8b81acc7a3f5a0
-  data.tar.gz: f417eab178af88208690c7f257389ae84bf49cf3
+  metadata.gz: cda160e8773382ca326bd42ad5f6469cbf63fd57
+  data.tar.gz: 3a43402a23424491b50559b27fbde5f2c72fd2a0
 SHA512:
-  metadata.gz: 3030e85ac97ab0a5726f8b7aa4f5b0e03bfacd540fcdc581d3f92428432580c50ad9cb9c7975031e68f0fd24e35816258283ca4329bc32697e1c4dffd7082776
-  data.tar.gz: 1cf2a0eeae95a3d7b83e248039e706ccbdb96c1cd59421550433a5c5848eff185abc65b5afece44c0254ab1781239485d36b2bf08860b5de73d02a3a22e17976
+  metadata.gz: 5a76b5c1c173ca4190a1c4e6197feacfe0a00d4304aa11d07d3de6e5e0f6c9eb9e8a5e335f4b93a594d4adbc50b150c6a9111fdd0f8f683ac9c571ae67cd632d
+  data.tar.gz: 5d1aa559bcb08c0b1479b99d56301987289732c15431056dfab4e458814bb39767a2e149577ec4e862fc3a5e129920e519472909ef44863e17a2f30b0cf472cf
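The digests above cover the two archives packed inside the `.gem` file. As a minimal sketch, assuming the gem has already been unpacked so that `metadata.gz` and `data.tar.gz` are available locally, the values could be recomputed with Ruby's standard `Digest` classes. The helper name `archive_digests` is hypothetical and not part of the gem.

```ruby
require 'digest/sha1'
require 'digest/sha2'

# Hypothetical helper (not part of tf2r): returns the SHA1 and SHA512 hex
# digests of one file, keyed like the sections of checksums.yaml.
def archive_digests(path)
  bytes = File.binread(path) # read raw bytes; the archives are binary
  {
    'SHA1'   => Digest::SHA1.hexdigest(bytes),
    'SHA512' => Digest::SHA512.hexdigest(bytes)
  }
end
```

For the 0.1.0 release, `archive_digests('data.tar.gz')['SHA1']` would be expected to equal the `data.tar.gz` value listed under `SHA1` in the added lines.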

data/.gitignore CHANGED

@@ -11,3 +11,6 @@
 *.o
 *.a
 mkmf.log
+cookies.txt
+tf2r-*.gem

data/.ruby-gemset CHANGED

@@ -1 +1 @@
-tf2r_scraper
+tf2r

data/.travis.yml ADDED

@@ -0,0 +1,3 @@
+language: ruby
+rvm:
+  - "2.1.2"

data/CHANGELOG.md ADDED

File without changes

data/Gemfile.lock CHANGED

@@ -1,14 +1,26 @@
 PATH
   remote: .
   specs:
-    tf2r (0.0.1)
+    tf2r (0.1.0)
       mechanize (~> 2.7)
 GEM
   remote: https://rubygems.org/
   specs:
+    addressable (2.3.6)
+    cane (2.6.2)
+      parallel
     coderay (1.1.0)
+    coveralls (0.7.0)
+      multi_json (~> 1.3)
+      rest-client
+      simplecov (>= 0.7)
+      term-ansicolor
+      thor
+    crack (0.4.2)
+      safe_yaml (~> 1.0.0)
     diff-lcs (1.2.5)
+    docile (1.1.5)
     domain_name (0.5.19)
       unf (>= 0.0.5, < 1.0.0)
     http-cookie (1.0.2)
@@ -25,16 +37,22 @@ GEM
     method_source (0.8.2)
     mime-types (2.3)
     mini_portile (0.6.0)
+    multi_json (1.10.1)
     net-http-digest_auth (1.4)
     net-http-persistent (2.9.4)
+    netrc (0.7.7)
     nokogiri (1.6.3.1)
       mini_portile (= 0.6.0)
     ntlm-http (0.1.1)
+    parallel (1.1.2)
     pry (0.10.0)
       coderay (~> 1.1.0)
       method_source (~> 0.8.1)
       slop (~> 3.4)
     rake (10.3.2)
+    rest-client (1.7.2)
+      mime-types (>= 1.16, < 3.0)
+      netrc (~> 0.7)
     rspec (3.0.0)
       rspec-core (~> 3.0.0)
       rspec-expectations (~> 3.0.0)
@@ -47,18 +65,36 @@ GEM
     rspec-mocks (3.0.3)
       rspec-support (~> 3.0.0)
     rspec-support (3.0.3)
+    safe_yaml (1.0.3)
+    simplecov (0.9.0)
+      docile (~> 1.1.0)
+      multi_json
+      simplecov-html (~> 0.8.0)
+    simplecov-html (0.8.0)
     slop (3.6.0)
+    term-ansicolor (1.3.0)
+      tins (~> 1.0)
+    thor (0.19.1)
+    tins (1.3.0)
     unf (0.1.4)
       unf_ext
     unf_ext (0.0.6)
+    vcr (2.9.2)
+    webmock (1.18.0)
+      addressable (>= 2.3.6)
+      crack (>= 0.3.2)
     webrobots (0.1.1)
 PLATFORMS
   ruby
 DEPENDENCIES
-  bundler (~> 1.6)
+  bundler (~> 1.3)
+  cane (~> 2.6)
+  coveralls (~> 0.7)
   pry (~> 0.10)
   rake (~> 10.0)
   rspec (~> 3.0)
   tf2r!
+  vcr (~> 2.9)
+  webmock (~> 1.18)
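The `DEPENDENCIES` hunk loosens the bundler constraint from `~> 1.6` to `~> 1.3`. As a rough illustration of what the pessimistic operator `~>` accepts, the sketch below uses RubyGems' own `Gem::Requirement` and `Gem::Version` classes. The helper name `satisfies?` and the sample version numbers are assumptions chosen for the example.

```ruby
require 'rubygems'

# Hypothetical helper: true when `version` satisfies the requirement string.
def satisfies?(requirement, version)
  Gem::Requirement.new(requirement).satisfied_by?(Gem::Version.new(version))
end

# "~> 1.6" means >= 1.6 and < 2.0, so an older 1.3.x bundler is rejected.
puts satisfies?('~> 1.6', '1.3.5')
# "~> 1.3" means >= 1.3 and < 2.0, so the same bundler is now accepted.
puts satisfies?('~> 1.3', '1.3.5')
# With three segments, "~> 1.0.0" means >= 1.0.0 and < 1.1.0.
puts satisfies?('~> 1.0.0', '1.0.3')
puts satisfies?('~> 1.0.0', '1.1.0')
```

The last two calls mirror the `safe_yaml (~> 1.0.0)` constraint that `crack` introduces in this lockfile.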

data/README.md CHANGED

@@ -1,8 +1,31 @@
-# TF2R [![Code Climate](https://codeclimate.com/github/justinkim/tf2r/badges/gpa.svg)](https://codeclimate.com/github/justinkim/tf2r)
+# TF2R - [tf2r.com][tf2r] interaction gem
+[tf2r]: http://tf2r.com
+[![Gem Version](http://img.shields.io/gem/v/tf2r.svg)][gem]
+[![Build Status](http://img.shields.io/travis/justinkim/tf2r.svg)][travis]
+[![Dependency Status](http://img.shields.io/gemnasium/justinkim/tf2r.svg)][gemnasium]
+[![Coverage Status](https://img.shields.io/coveralls/justinkim/tf2r.svg)][coveralls]
+[![Code Climate](http://img.shields.io/codeclimate/github/justinkim/tf2r.svg)][codeclimate]
+[codeclimate]:https://codeclimate.com/github/justinkim/tf2r
+[coveralls]: https://coveralls.io/r/justinkim/tf2r
+[gem]: http://badge.fury.io/rb/tf2r
+[gemnasium]: https://gemnasium.com/justinkim/tf2r
+[travis]: https://travis-ci.org/justinkim/tf2r
+GitHub: [https://github.com/justinkim/tf2r](https://github.com/justinkim/tf2r)
+Documentation: [http://www.rubydoc.info/github/justinkim/tf2r](http://www.rubydoc.info/github/justinkim/tf2r)
+Bugs: [https://github.com/justinkim/tf2r/issues](https://github.com/justinkim/tf2r/issues)
+## Description
 This gem provides a `TF2R::Scraper` that has the ability to scrape various pages on [TF2R](http://tf2r.com) into usable data.
-Yes, this gem is [semantically versioned](http://semver.org/)!
+Yes, this gem is [semantically versioned][semver]!
+[semver]: http://semver.org
 ## Installation
@@ -31,3 +54,8 @@ TODO: Write usage instructions here
 3. Commit your changes (`git commit -am 'Add some feature'`)
 4. Push to the branch (`git push origin my-new-feature`)
 5. Create a new Pull Request
+## License
+Released under the ISC license. See the [LICENSE][] for further details.
+[license]: LICENSE.md

data/Rakefile CHANGED

@@ -1,2 +1,9 @@
-require "bundler/gem_tasks"
+require 'bundler/gem_tasks'
+require 'rspec/core/rake_task'
+desc 'Run specs'
+RSpec::Core::RakeTask.new
+desc 'Default: run specs'
+task :default => :spec
+task :test => :spec

data/lib/tf2r/scraper.rb CHANGED

@@ -1,195 +1,291 @@
 module TF2R
+  # @author Justin Kim
   class Scraper
+    # Creates a Scraper. Pass values using the options hash.
+    #
+    #   :user_agent a String used for the User-Agent header
+    #   :cookies_txt a File containing cookies to load into the Mechanize agent
+    #
+    # @param options [Hash] options to create a Scraper with
+    # @option options [String] :user_agent a custom User-Agent header content
+    # @option options [File] :cookies_txt a cookies.txt to load the Mechanize
+    #   agent with
     def initialize(options)
       @mech = Mechanize.new { |mech|
         mech.user_agent = options[:user_agent] || "TF2R::Scraper #{VERSION}"
       }
-      @mech.cookie_jar.load(options[:cookies_txt], :cookiestxt) if options[:cookies_txt]
+      load_cookies(options[:cookies_txt]) if options[:cookies_txt]
     end
+    # Loads the Mechanize agent with cookies from a cookies.txt.
+    #
+    # Certain pages on TF2R require a session with a logged-in user.
+    # This requires a Netscape-style cookies.txt that contains a valid
+    # "session" cookie for "tf2r.com".
+    #
+    # @param cookies_txt [File] the cookies.txt file.
+    # @return [Mechanize::CookieJar] the CookieJar of the Mechanize agent.
+    def load_cookies(cookies_txt)
+      @mech.cookie_jar.load(cookies_txt, :cookiestxt)
+    end
+    # Fetches the page at the given URL.
+    #
+    # @param url [String] the desired URL.
+    # @return [Mechanize::Page] the page given by Mechanize.
     def fetch(url)
       @mech.get(url)
     end
-  end
-end
-__END__
-# This is the old Scraper from NervyPipe.
-class Scraper
-  def initialize(cookies_txt_path)
-    @cookies_txt_path = cookies_txt_path
-    @main = Mechanize.new { |agent|
-      # the User-Agent field in headers
-      agent.user_agent = 'Jenna Bot'
-    }
-    auth_cookies(@main)
-  end
-  def auth_cookies(mech)
-    # Before anything, load our auth cookies into the cookie jar
-    # This requires a Netscape-style cookies.txt to be in the working dir
+    # Scrapes TF2R for all active raffles.
     #
-    # cookies.txt must include at least a valid "session" cookie from tf2r.com
-    mech.cookie_jar.load_cookiestxt(@cookies_txt_path)
-  end
-  # Simply return the Mechanize::Page for a url
-  def fetch(url)
-    @main.get(url)
-  end
-  def run(type)
-    case type
-    when :raffle
-      scrape_raffle(@main.get 'http://tf2r.com/kblf84f.html')
-    when :user
-      scrape_user(@main.get 'http://tf2r.com/user/76561198061719848.html')
-    when :main
-      scrape_main_page
-    when :ranks
-      scrape_ranks
+    # See http://tf2r.com/raffles.html
+    #
+    # @example
+    #   s.scrape_main_page #=> ['http://tf2r.com/kold.html',
+    #                           'http://tf2r.com/knew.html',
+    #                           'http://tf2r.com/knewest.html']
+    #
+    # @return [Array<String>] links of all active raffles in chronological
+    #   order (oldest to newest by creation time).
+    def scrape_main_page
+      page = fetch('http://tf2r.com/raffles.html')
+      # All raffle links begin with 'tf2r.com/k'
+      raffle_links = page.links_with(href: /tf2r\.com\/k/)
+      raffle_links.map! { |x| x.uri.to_s }
+      raffle_links.reverse!
     end
-  end
-  def scrape_main_page
-    page = @main.get('http://tf2r.com/raffles.html')
-    # This regex matches all Mechanize::Page::Links on the main raffles page that are actual raffles
-    raffle_mech_links = page.links_with(href: /tf2r\.com\/k/)
-    # an array of strings, which are raffle links
-    raffle_links = raffle_mech_links.map { |x| x.uri.to_s }
-    # the array should have raffles from bottom-to-top, old-to-new
-    raffle_links.reverse!
-  end
-  def scrape_raffle_for_user(page)
-    # This is an array of all things Reag was nice enough to class "raffle_infomation"
-    # Reag made a typo, so the class really is "raffle_infomation"
-    raffle_infos = page.parser.css('.raffle_infomation')
-    # User information
-    steam_id = raffle_infos[2].css('a')[0].attributes['href'].text.split('/')[-1].split('.')[0].to_i
-    username = raffle_infos[2].css('a').text
-    avatar_link = raffle_infos[1].css('a')[0].css('img')[0].attributes['src'].text
-    # posrept will be nil if the Scraper's user has already voted on a user's rep in the raffle
-    posrepa = raffle_infos.css('.upvb').text.split
-    posrepa.delete('+')
-    posrep = posrepa[-1].to_i.to_s
-    negrepa = raffle_infos.css('.downvb').text.split
-    negrepa.delete('+')
-    negrep = negrepa[-1].to_i.to_s
-    colour = raffle_infos[2].css('a')[0].attributes['style'].value.split('#')[-1].split(';')[0].downcase.chomp
-    # The creator of the raffle, using above
-    userhash = {steam_id: steam_id, username: username, avatar_link: avatar_link, posrep: posrep, negrep: negrep, colour: colour}
-  end
-  def scrape_raffle_for_raffle(page)
-    # This is an array of all things Reag was nice enough to class "raffle_infomation"
-    # Reag made a typo, so the class really is "raffle_infomation"
-    raffle_infos = page.parser.css('.raffle_infomation')
-    # Raffle information
-    uri = page.uri # is a URI:HTTP
-    path = uri.path # is "/welcome.html" for "http://tf2r.com/welcome.html"
-    link_snippet = path.split('/')[1].split('.html')[0] # is 'kabc123' for 'http://tf2r.com/kabc123.html'
-    title = raffle_infos[0].text.split('Title: ')[-1]
-    # Lots of info in a single table
-    raffle_tds = raffle_infos[3].css('td')
-    description = raffle_tds[1].text
-    start_time_string = raffle_tds[9].text
-    start_time = DateTime.strptime(start_time_string, '%a, %d %b %Y %H:%M:%S %z').to_time
-    end_time_string = raffle_tds[11].text
-    end_time = DateTime.strptime(end_time_string, '%a, %d %b %Y %H:%M:%S %z').to_time
-    win_chance_pre_round = raffle_tds[5].text.to_f / 100 # also #winc
-    win_chance = win_chance_pre_round.round(5)
-    entries = raffle_tds[7].text # also #entry
-    # Entries looks like "42/123", as "current/max"
-    # Split by slash, multiple assignment to array with elements mapped to integers
-    # Equivalent to a = b[0].to_i; c = b[1].to_i
-    current_entries, max_entries = entries.split('/').map { |x| x.to_i }
-    is_done = end_time <= Time.now || current_entries == max_entries || page.parser.css('.welcome_font').text.include?('No winners') || page.parser.css('.welcome_font').text.include?('Winner(s):')
-    rafflehash = {link_snippet: link_snippet, title: title, description: description, start_time: start_time, end_time: end_time,
-      win_chance: win_chance, current_entries: current_entries, max_entries: max_entries, is_done: is_done}
-  end
-  def scrape_raffle_for_participants(page)
-    participants = []
-    participant_divs = page.parser.css('.pentry')
-    participant_divs.each do |participant|
-      steam_id = participant.css('a')[-1].attributes['href'].text.split('/')[-1].split('.')[0].to_i
-      username = participant.text
-      colour = participant.css('a')[-1].attributes['style'].text.split('#')[-1].split(';')[0].downcase.chomp
-      participants << {steam_id: steam_id, username: username, colour: colour}
+    # Scrapes a raffle page for information about the creator.
+    #
+    # @example
+    #   p = s.fetch('http://tf2r.com/kstzcbd.html')
+    #   s.scrape_raffle_for_creator(p) #=>
+    #   {:steam_id=>76561198061719848,
+    #    :username=>"Yulli",
+    #    :avatar_link=>
+    #    "http://media.steampowered.com/steamcommunity/public/images/avatars/bc/bc9dc4302d23f2e2f37f59c59f29c27dbc8cade6_full.jpg",
+    #    :posrep=>11458,
+    #    :negrep=>0,
+    #    :colour=>"70b01b"}
+    #
+    # @param page [Mechanize::Page] the raffle page.
+    # @return [Hash] a representation of a user, the raffle creator.
+    #   * :steam_id (+Fixnum+) — the creator's SteamID64.
+    #   * :username (+String+) — the creator's username.
+    #   * :avatar_link (+String+) — a link to the creator's avatar.
+    #   * :posrep (+Fixnum+) — the creator's positive rep.
+    #   * :negrep (+Fixnum+) — the creator's negative rep.
+    #   * :colour (+String+) — hex colour code of the creator's username.
+    def scrape_raffle_for_creator(page)
+      # Reag classed some things "raffle_infomation". That's spelled right.
+      infos = page.parser.css('.raffle_infomation')
+      # The main 'a' element, containing the creator's username.
+      user_anchor = infos[2].css('a')[0]
+      steam_id = extract_steam_id(user_anchor.attribute('href').to_s)
+      username = user_anchor.text
+      avatar_link = infos[1].css('img')[0].attribute('src').to_s
+      posrep = /(\d+)/.match(infos.css('.upvb').text)[1].to_i
+      negrep = /(\d+)/.match(infos.css('.downvb').text)[1].to_i
+      # The creator's username colour. Corresponds to rank.
+      colour = extract_hex_colour(user_anchor.attribute('style').to_s)
+      {steam_id: steam_id, username: username, avatar_link: avatar_link,
+       posrep: posrep, negrep: negrep, colour: colour}
     end
-    participants.uniq.reverse
-  end
-  def scrape_raffle(page, portions = :all)
-    userhash, rafflehash, participants = {}, {}, []
-    case portions
-    when :core
-      userhash = scrape_raffle_for_user(page)
-      rafflehash = scrape_raffle_for_raffle(page)
-    when :participants
-      participants = scrape_raffle_for_participants(page)
-    else
-      userhash = scrape_raffle_for_user(page)
-      rafflehash = scrape_raffle_for_raffle(page)
-      participants = scrape_raffle_for_participants(page)
+    # Scrapes a raffle page for information about the raffle.
+    #
+    # @example
+    #   p = s.fetch('http://tf2r.com/kstzcbd.html')
+    #   s.scrape_raffle_for_raffle(p) #=>
+    #   {:link_snippet=>"kstzcbd",
+    #    :title=>"Just one refined [1 hour]",
+    #    :description=>"Plain and simple.",
+    #    :start_time=>2012-10-29 09:51:45 -0400,
+    #    :end_time=>2012-10-29 09:53:01 -0400,
+    #    :win_chance=>0.1,
+    #    :current_entries=>10,
+    #    :max_entries=>10,
+    #    :is_done=>true}
+    #
+    # @param page [Mechanize::Page] the raffle page.
+    # @return [Hash] a representation of the raffle.
+    #   * :link_snippet (+String+) — the "raffle id" in the URL.
+    #   * :title (+String+) — the raffle's title.
+    #   * :description (+String+) — the raffle's "message".
+    #   * :start_time (+Time+) — the creation time of the raffle.
+    #   * :end_time (+Time+) — the projected/observed end time for the raffle.
+    #   * :win_chance (+Float+) — a participant's chance to win the raffle.
+    #   * :current_entries (+Fixnum+) — the current number of participants.
+    #   * :max_entries (+Fixnum+) — the maximum number of participants allowed.
+    #   * :is_done (+Boolean+) — whether new users can enter the raffle.
+    def scrape_raffle_for_raffle(page)
+      # Reag classed some things "raffle_infomation". That's spelled right.
+      infos = page.parser.css('.raffle_infomation')
+      # Elements of the main raffle info table.
+      raffle_tds = infos[3].css('td')
+      # 'kabc123' for 'http://tf2r.com/kabc123.html'
+      link_snippet = /\/(k.+)\.html/.match(page.uri.path)[1]
+      title = infos[0].text.split('Title: ')[-1]
+      description = raffle_tds[1].text
+      start_time = raffle_tds[9].attribute('data-rstart-unix').to_s
+      start_time = DateTime.strptime(start_time, '%s').to_time
+      end_time = raffle_tds[11].attribute('data-rsend-unix').to_s
+      end_time = DateTime.strptime(end_time, '%s').to_time
+      win_chance = /(.+)%/.match(infos.css('#winc').text)[1].to_f / 100
+      entries = /(\d+)\/(\d+)/.match(infos.css('#entry').text)
+      current_entries = entries[1].to_i
+      max_entries = entries[2].to_i
+      text = page.parser.css('.welcome_font').css('div')[3..-1].text
+      is_done = end_time < Time.now ||
+        current_entries == max_entries ||
+        page.parser.css('.welcome_font')[5..-1].text.downcase.include?('winner')
+      {link_snippet: link_snippet, title: title, description: description,
+       start_time: start_time, end_time: end_time, win_chance: win_chance,
+       current_entries: current_entries, max_entries: max_entries,
+       is_done: is_done}
     end
-    [userhash, rafflehash, participants]
-  end
+    # Scrapes a raffle page for all the participants.
+    #
+    # TODO: add an example
+    #
+    # @param page [Mechanize::Page] the raffle page.
+    # @return [Array] contains Hashes representing each of the participants,
+    # in chronological order (first entered to last).
+    #   * :steam_id (+Fixnum+) — the participant's SteamID64.
+    #   * :username (+String+) — the participant's username.
+    #   * :colour (+String+) — hex colour code of the participant's username.
+    def scrape_raffle_for_participants(page)
+      participants = []
+      participant_divs = page.parser.css('.pentry')
+      participant_divs.each do |participant|
+        user_anchor = participant.children[1]
+        steam_id = extract_steam_id(user_anchor.to_s)
+        username = participant.text
+        colour = extract_hex_colour(user_anchor.children[0].attribute('style'))
+        participants << {steam_id: steam_id, username: username, colour: colour}
+      end
+      participants.reverse!
+    end
-  def scrape_user(user_page)
-    if user_page.parser.css('.profile_info').empty?
-      username, avatar_link, posrep, negrep, colour = nil, nil, nil, nil, nil
-      steam_id = user_page.uri.path.split('/')[-1].split('.')[0].to_i
-    else
-      pp user_page.parser.css('.profile_info')
-      raffle_infos = user_page.parser.css('.raffle_infomation') # sic
+    # Scrapes a user page for information about the user.
+    #
+    # @example
+    #   p = s.fetch('http://tf2r.com/user/76561198061719848.html')
+    #   s.scrape_user(p) #=>
+    #   {:steam_id=>76561198061719848,
+    #    :username=>"Yulli",
+    #    :avatar_link=>
+    #     "http://media.steampowered.com/steamcommunity/public/images/avatars/bc/bc9dc4302d23f2e2f37f59c59f29c27dbc8cade6_full.jpg",
+    #     :posrep=>11459,
+    #     :negrep=>0,
+    #     :colour=>"70b01b"}
+    #
+    # @param user_page [Mechanize::Page] the user page.
+    # @return [Hash] a representation of the user.
+    #   * :steam_id (+Fixnum+) — the user's SteamID64.
+    #   * :username (+String+) — the user's username.
+    #   * :avatar_link (+String+) — a link to the user's avatar.
+    #   * :posrep (+Fixnum+) — the user's positive rep.
+    #   * :negrep (+Fixnum+) — the user's negative rep.
+    #   * :colour (+String+) — hex colour code of the user's username.
+    def scrape_user(user_page)
+      if user_page.parser.css('.profile_info').empty?
+        # TODO: Should raise an exception here
+        steam_id = extract_steam_id(user_page.uri.to_s)
+        username, avatar_link, posrep, negrep, colour = nil, nil, nil, nil, nil
+      else
+        infos = user_page.parser.css('.raffle_infomation') #sic
+        user_anchor = infos[2].css('a')[0]
+        steam_id = extract_steam_id(user_page.uri.to_s)
+        username = /TF2R Item Raffles - (.+)/.match(user_page.title)[1]
+        avatar_link = infos[0].css('img')[0].attribute('src').to_s
+        posrep = infos.css('.upvb').text.to_i
+        negrep = infos.css('.downvb').text.to_i
+        colour = extract_hex_colour(infos[1].css('a')[0].attribute('style').to_s)
+      end
+      {steam_id: steam_id, username: username, avatar_link: avatar_link,
+       posrep: posrep, negrep: negrep, colour: colour}
+    end
-      steam_id = user_page.uri.path.split('/')[-1].split('.')[0].to_i
-      username = user_page.parser.title.split('TF2R Item Raffles - ')[-1]
-      avatar_link = raffle_infos[0].css('img')[0].attributes['src'].text
+    # Scrapes the TF2R info page for available user ranks.
+    #
+    # See http://tf2r.com/info.html.
+    #
+    # @example
+    #   p = s.fetch('http://tf2r.com/info.html')
+    #   s.scrape_ranks(p) #=>
+    #   [{:colour=>"ebe2ca", :name=>"User", :description=>"Every new or existing user has this rank."},
+    #    {:colour=>"ffd700", :name=>"Trusted", :description=>"This rank can only be assigned on staff approval. Granted for 1,000~ Rep."},
+    #     ...]
+    #
+    # @param info_page [Mechanize::Page] the info page.
+    # @return [Array] contains Hashes representing each of the ranks.
+    #   * :name (+String+) — the rank's name.
+    #   * :description (+String+) — the rank's description.
+    #   * :colour (+String+) — the rank's hex colour code.
+    def scrape_ranks(info_page)
+      rank_divs = info_page.parser.css('#ranks').children
+      ranks = rank_divs.select { |div| div.children.size == 3 }
+      ranks.map { |div| extract_rank(div) }
+    end
-      posrep = raffle_infos.css('.upvb').text.to_i.to_s
-      negrep = raffle_infos.css('.downvb').text.to_i.to_s
+    private
-      colour = raffle_infos[1].css('a')[0].attributes['style'].value.split('#')[-1].split(';')[0].downcase.chomp
+    # Extracts a rank hash from a rank div.
+    # Only for use by #scrape_ranks.
+    #
+    # @param div [Nokogiri::XML::Element] a div containing the rank info.
+    # @return [Hash] a representation of a rank as outlined in #scrape_ranks.
+    def extract_rank(div)
+      name = div.children[0].text
+      description = div.children[2].text
+      colour = extract_hex_colour(div.children[0].attribute('style').to_s)
+      {name: name, description: description, colour: colour}
     end
-    userhash = {steam_id: steam_id, username: username, avatar_link: avatar_link, posrep: posrep, negrep: negrep, colour: colour}
-  end
+    # Extracts a SteamID64 from a TF2R user link.
+    #
+    # @example
+    #   extract_steam_id('http://tf2r.com/user/76561198061719848.html')
+    #     #=> 76561198061719848
+    #
+    # @param href [String] The full user profile link.
+    # @return [Fixnum] The Steam ID.
+    def extract_steam_id(href)
+      /http:\/\/tf2r\.com\/user\/(\d+)\.html/.match(href)[1].to_i
+    end
-  def scrape_ranks
-    # This scrapes the info page for the various ranks that exist
-    page = @main.get('http://tf2r.com/info.html')
-    ranks_div = page.parser.css('#ranks')
-    divs = ranks_div.css('div')
-    rank_divs = []
-    divs.each { |div|
-      rank_divs << div unless div.attributes['style'].nil? || !(div.attributes['style'].value.include? 'color')
-    }
-    colours = rank_divs.map {|div| div.attributes['style'].value.split('color:#')[-1].split(';')[0].downcase.chomp }
+    # Extracts a lowercase hex colour code.
+    #
+    # @example
+    #   extract_hex_colour('color:#70B01B;') #=> '70b01b'
+    #
+    # @param str [String] Any string containing a hex colour code.
+    # @return [String] The lowercase hex colour code.
+    def extract_hex_colour(str)
+      /#(\w+)\s*;/.match(str)[1].downcase
+    end
   end
 end
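The new private helpers index straight into the `MatchData` (`match(...)[1]`), so an input that does not match raises `NoMethodError` on `nil`. The sketch below shows one possible nil-safe variant of the same extraction steps, using only the standard library. All method and constant names here are hypothetical and not part of the gem, and `Time.at` is swapped in for the diff's `DateTime.strptime(..., '%s').to_time`.

```ruby
# Hypothetical, nil-safe variants of the extraction helpers in the diff.
USER_LINK  = %r{tf2r\.com/user/(\d+)\.html} # SteamID64 inside a user link
HEX_COLOUR = /#(\h+)\s*;/                   # hex digits after '#', up to ';'
ENTRIES    = %r{(\d+)/(\d+)}                # "current/max" entry counts

# Returns the SteamID64 as an Integer, or nil when href is not a user link.
def steam_id_from(href)
  m = USER_LINK.match(href.to_s)
  m && m[1].to_i
end

# Returns the lowercase hex colour code, or nil when none is present.
def hex_colour_from(style)
  m = HEX_COLOUR.match(style.to_s)
  m && m[1].downcase
end

# Returns [current_entries, max_entries], or nil when the text has no "n/m".
def entries_from(text)
  m = ENTRIES.match(text.to_s)
  m && [m[1].to_i, m[2].to_i]
end

# Converts a Unix-seconds value (String or Integer) to a Time.
# Raises ArgumentError when the value is not a base-10 integer.
def time_from_unix(seconds)
  Time.at(Integer(seconds.to_s, 10))
end
```

Returning `nil` pushes the "not found" decision to the caller. Whether that is preferable to raising, as the `TODO` in `scrape_user` suggests, is a design choice the diff leaves open.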

data/lib/tf2r/version.rb CHANGED

@@ -1,3 +1,3 @@
 module TF2R
-  VERSION = "0.0.1"
+  VERSION = '0.1.0'
 end