rubyuw 0.99.1

Files changed (48)
  1. data/.gitignore +4 -0
  2. data/README.md +81 -0
  3. data/Rakefile +45 -0
  4. data/VERSION +1 -0
  5. data/lib/rubyuw.rb +9 -0
  6. data/lib/rubyuw/base.rb +122 -0
  7. data/lib/rubyuw/connection.rb +145 -0
  8. data/lib/rubyuw/curriculum_enrollment.rb +75 -0
  9. data/lib/rubyuw/errors.rb +16 -0
  10. data/lib/rubyuw/schedule.rb +52 -0
  11. data/lib/rubyuw/sln.rb +108 -0
  12. data/rubyuw.gemspec +97 -0
  13. data/test/live/curriculum_enrollment_test.rb +47 -0
  14. data/test/live/sln_test.rb +54 -0
  15. data/test/live/test_helper.rb +24 -0
  16. data/test/mocked/base_test.rb +171 -0
  17. data/test/mocked/connection_test.rb +178 -0
  18. data/test/mocked/curriculum_enrollment_test.rb +158 -0
  19. data/test/mocked/fixture_pages/bad_request.html +80 -0
  20. data/test/mocked/fixture_pages/curric_courses.html +2659 -0
  21. data/test/mocked/fixture_pages/curric_no_courses.html +88 -0
  22. data/test/mocked/fixture_pages/curric_no_curric_found.html +86 -0
  23. data/test/mocked/fixture_pages/logged_in_relay.html +660 -0
  24. data/test/mocked/fixture_pages/login.html +121 -0
  25. data/test/mocked/fixture_pages/no_form_login.html +122 -0
  26. data/test/mocked/fixture_pages/no_login_button.html +100 -0
  27. data/test/mocked/fixture_pages/not_logged_in_relay.html +15 -0
  28. data/test/mocked/fixture_pages/registration.html +699 -0
  29. data/test/mocked/fixture_pages/registration_error.html +699 -0
  30. data/test/mocked/fixture_pages/registration_error_no_rows.html +627 -0
  31. data/test/mocked/fixture_pages/registration_no_form.html +700 -0
  32. data/test/mocked/fixture_pages/registration_success.html +718 -0
  33. data/test/mocked/fixture_pages/registration_unknown.html +721 -0
  34. data/test/mocked/fixture_pages/schedule.html +166 -0
  35. data/test/mocked/fixture_pages/schedule_diff_credit_row.html +167 -0
  36. data/test/mocked/fixture_pages/schedule_empty.html +130 -0
  37. data/test/mocked/fixture_pages/schedule_no_credit_row.html +169 -0
  38. data/test/mocked/fixture_pages/sln_does_not_exist.html +85 -0
  39. data/test/mocked/fixture_pages/sln_no_course_info.html +94 -0
  40. data/test/mocked/fixture_pages/sln_no_course_notes.html +94 -0
  41. data/test/mocked/fixture_pages/sln_no_enrollment_info.html +94 -0
  42. data/test/mocked/fixture_pages/sln_status.html +92 -0
  43. data/test/mocked/fixture_pages/welcome.html +93 -0
  44. data/test/mocked/schedule_test.rb +89 -0
  45. data/test/mocked/sln_test.rb +146 -0
  46. data/test/mocked/test_helper.rb +67 -0
  47. data/test/password.rb.sample +8 -0
  48. metadata +118 -0
data/.gitignore ADDED
@@ -0,0 +1,4 @@
1
+ pkg/*
2
+ test/password.rb
3
+ .yardoc
4
+ doc/*
data/README.md ADDED
@@ -0,0 +1,81 @@
1
+ # RubyUW: Ruby Interface to MyUW
2
+
3
+ __RubyUW is NOT supported in any way by the University of Washington__
4
+
5
+ RubyUW provides a programmable interface to MyUW, University
6
+ of Washington's student web portal.
7
+
8
+ ## Why?
9
+
10
+ It's a fun project and it was really a proof of concept more than
11
+ anything. I don't plan on officially supporting this library or
12
+ promising updates in case any features break. It's a good example
13
+ for any Ruby developer on the uses of mechanize and the ability
14
+ to scrape content or simulate a human at a browser.
15
+
16
+ It is also a useful library to create MyUW automation tools
17
+ with. I do not support this.
18
+
19
+ ## How does it work?
20
+
21
+ RubyUW functions by emulating a human in an actual browser. It hunts
22
+ down buttons to click, fields to fill in, etc. It is programmed
23
+ using the Ruby Mechanize library to achieve this level of human
24
+ simulation.
25
+
26
+ Of course this also means that even minor tweaks to the MyUW layout
27
+ could potentially "break" the RubyUW library.
28
+
29
+ ## Installing
30
+
31
+ # Install the gem
32
+ sudo gem sources -a http://gems.github.com
33
+ sudo gem install mitchellh-rubyuw
34
+
35
+ ## Using RubyUW
36
+
37
+ It's easy to get started with RubyUW. Below is a basic example;
38
+ for the comprehensive documentation, see the section below
39
+ titled "Documentation."
40
+
41
+ The following is a quick and simple example:
42
+
43
+ require 'rubyuw'
44
+ RubyUW::Base.authenticate("netid", "password") or raise("Login Failed")
45
+
46
+ # Get SLN information
47
+ sln_info = RubyUW::SLN.find(14153, "AUT+2009")
48
+
49
+ ## Documentation
50
+
51
+ You may view all documentation [at the following URL](http://mitchellh.github.com/RubyUW/):
52
+
53
+ http://mitchellh.github.com/RubyUW/
54
+
55
+ ## Testing
56
+
57
+ RubyUW includes a comprehensive test suite including both live and
58
+ mocked testing. Mocked testing uses "mocked" HTML pages and a lot
59
+ of object stubbing to simulate a real environment and to force
60
+ certain events to occur (such as strange HTML in a request) in order
61
+ to completely test the library. Mock testing can be run straight
62
+ "out of the box" via a rake task:
63
+
64
+ rake test:run_mock
65
+
66
+ RubyUW also includes "live" testing which uses a real UW NetID and
67
+ password to conduct live tests. To run this, see test/password.rb.sample
68
+ and modify it to include your credentials, then save it as password.rb.
69
+ Following this, run another rake task:
70
+
71
+ rake test:run_live
72
+
73
+ Note: Live testing is very very slow. On my machine with the current
74
+ suite it takes about a minute to run all the tests (though this is
75
+ highly dependent on your internet connection).
76
+
77
+ Note: Also, live testing DOES NOT test registration since that actually
78
+ affects your MyUW account in a significant way. To test registration,
79
+ run the registration test directly:
80
+
81
+ cd test/live && ruby registration_test.rb
data/Rakefile ADDED
@@ -0,0 +1,45 @@
1
+ begin
2
+ require 'jeweler'
3
+ Jeweler::Tasks.new do |gemspec|
4
+ gemspec.name = "rubyuw"
5
+ gemspec.summary = "Library which provides a ruby interface to the University of Washington student portal."
6
+ gemspec.email = "mitchell.hashimoto@gmail.com"
7
+ gemspec.homepage = "http://github.com/mitchellh/rubyuw"
8
+ gemspec.description = "Library which provides a ruby interface to the University of Washington student portal."
9
+ gemspec.authors = ["Mitchell Hashimoto"]
10
+
11
+ gemspec.add_dependency('tenderlove-mechanize', '>= 0.9.3.20090623142847')
12
+ end
13
+ Jeweler::GemcutterTasks.new
14
+ rescue LoadError
15
+ puts "Jeweler not available. Install it with: sudo gem install technicalpickles-jeweler -s http://gems.github.com"
16
+ end
17
+
18
+ def run_tests(files)
19
+ files.each { |f| require f }
20
+ end
21
+
22
+ begin
23
+ require 'yard'
24
+ YARD::Rake::YardocTask.new
25
+ rescue LoadError
26
+ desc "Generate YARD documentation."
27
+ task :yardoc do
28
+ puts "Yard is not available. Install with: sudo gem install yard"
29
+ end
30
+ end
31
+
32
+ namespace :test do
33
+ desc "Run all the tests"
34
+ task :run => [:run_mock, :run_live]
35
+
36
+ desc "Run the library test on fixtured data"
37
+ task :run_mock do
38
+ run_tests(Dir[File.join(File.dirname(__FILE__), 'test', 'mocked', '**', '*.rb')])
39
+ end
40
+
41
+ desc "Run the library tests on live data (requires password.rb file). NOTE: This method is very slow."
42
+ task :run_live do
43
+ run_tests(Dir[File.join(File.dirname(__FILE__), 'test', 'live', '**', '*.rb')])
44
+ end
45
+ end
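The `run_mock` and `run_live` tasks above build their file lists with a recursive glob and then `require` each match. The following is a minimal sketch of what that glob selects, assuming a throwaway directory tree; the `suite_files` helper and the `.sort` call are illustrative additions, not part of the Rakefile.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper: collects every Ruby file under <root>/test/<suite>,
# at any depth, using the same glob pattern as the rake tasks.
# Sorting is added here only to make the result order predictable.
def suite_files(root, suite)
  Dir[File.join(root, 'test', suite, '**', '*.rb')].sort
end

found = Dir.mktmpdir do |root|
  mocked = File.join(root, 'test', 'mocked')
  FileUtils.mkdir_p(File.join(mocked, 'fixture_pages'))
  FileUtils.touch(File.join(mocked, 'base_test.rb'))
  FileUtils.touch(File.join(mocked, 'test_helper.rb'))
  FileUtils.touch(File.join(mocked, 'fixture_pages', 'login.html'))

  # Only .rb files match; the HTML fixture is ignored.
  suite_files(root, 'mocked').map { |f| File.basename(f) }
end
```

Note that the pattern also matches `test_helper.rb`, so the helper is required alongside the test files.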
data/VERSION ADDED
@@ -0,0 +1 @@
1
+ 0.99.1
data/lib/rubyuw.rb ADDED
@@ -0,0 +1,9 @@
1
+ require 'rubygems'
2
+ require 'mechanize'
3
+ require 'nokogiri'
4
+ require 'rubyuw/errors'
5
+ require 'rubyuw/base'
6
+ require 'rubyuw/connection'
7
+ require 'rubyuw/sln'
8
+ require 'rubyuw/curriculum_enrollment'
9
+ require 'rubyuw/schedule'
data/lib/rubyuw/base.rb ADDED
@@ -0,0 +1,122 @@
1
+ module RubyUW
2
+ # The base class for RubyUW. As a developer using RubyUW, you'll
3
+ # most likely only use this class to authenticate a connection to
4
+ # MyUW and to verify the connection.
5
+ #
6
+ # The base connection is shared by all other portions of RubyUW.
7
+ # Therefore, if a connection is not established first, you'll
8
+ # probably see a NotLoggedInError from other areas of RubyUW.
9
+ #
10
+ # == Usage Example
11
+ #
12
+ # @example Authenticating with MyUW
13
+ # require 'rubyuw'
14
+ # RubyUW::Base.authenticate("uwnetid", "password")
15
+ class Base
16
+ @@connection = nil
17
+
18
+ class <<self
19
+ # Authenticate with MyUW. This method sets up and verifies
20
+ # a session with MyUW. Every feature of RubyUW which requires
21
+ # authentication (as of writing this documentation this is
22
+ # every feature), will require that this authentication is
23
+ # completed first.
24
+ #
25
+ # @param [String] UW NetID
26
+ # @param [String] Password for the UW NetID
27
+ # @return [Boolean] true if the login succeeded, false otherwise
28
+ def authenticate(netid, password)
29
+ # Clear pre-existing session information first
30
+ logout
31
+
32
+ # Log in
33
+ results = connection.goto("http://myuw.washington.edu").
34
+ verify("//input[@type='submit' and @value='Log in with your UW NetID']").
35
+ submit_form('f').
36
+ submit_form('relay').
37
+ submit_form('query', {
38
+ :user => netid,
39
+ :pass => password
40
+ }).
41
+ execute!
42
+
43
+ relay_form = results.form_with(:name => 'relay')
44
+ return false unless results.form_with(:name => 'query').nil?
45
+ relay_form.submit()
46
+
47
+ true
48
+ end
49
+
50
+ # Checks whether authentication is still valid. This method
51
+ # goes to the MyUW portal and checks to verify that the
52
+ # previous authentication is valid. If no authentication was
53
+ # ever done, this will return false.
54
+ #
55
+ # @return [Boolean]
56
+ def authenticated?
57
+ results = connection.goto("http://myuw.washington.edu").
58
+ submit_form('f').
59
+ execute!
60
+
61
+ !results.search("//div[@class='main_search']").empty?
62
+ end
63
+
64
+ # Log out from MyUW. This method clears the cookies of the
65
+ # connection, also clearing out any session data. This effectively
66
+ # logs a user out of anything.
67
+ def logout
68
+ connection.cookie_jar.clear!
69
+ end
70
+
71
+ # Returns the connection object. Used by other portions of RubyUW
72
+ # to browse the pages, authenticate, and more. This method will
73
+ # typically *never* be called by non-internal systems.
74
+ #
75
+ # @return [WWW::Mechanize] A WWW::Mechanize object.
76
+ def connection
77
+ @@connection ||= RubyUW::Connection.new do |browser|
78
+ browser.user_agent_alias = 'Mac Safari'
79
+ browser.follow_meta_refresh = true
80
+
81
+ # Workaround to avoid frozen object error on SSL pages
82
+ browser.keep_alive = false
83
+
84
+ # Do not keep any history, which is otherwise a
85
+ # surefire way to grow memory usage without bound.
86
+ browser.max_history = 0
87
+
88
+ # Force parsing with Nokogiri. Strange bug introduced
89
+ # in 0.9.3 of WWW::Mechanize defaults to nil.
90
+ # http://github.com/tenderlove/mechanize/issues#issue/5
91
+ browser.html_parser = Nokogiri::HTML
92
+ end
93
+ end
94
+
95
+ # Resets the connection by clearing out the old. This way,
96
+ # when connection is next called, it will recreate the object.
97
+ def reset_connection!
98
+ @@connection = nil
99
+ self
100
+ end
101
+
102
+ # Runs an xpath on a page, mapping the results to a hash map.
103
+ # This method should never have to be called outside the core library
104
+ # code. It is purely a convenience method supplied to other parts
105
+ # of the library since this functionality is repeated a few times
106
+ # in different areas.
107
+ def extract(page, xpath, keys)
108
+ nodes = page.search(xpath)
109
+ return nil if nodes.empty?
110
+
111
+ data = {}
112
+ nodes.each_with_index do |node, i|
113
+ # If we have no keys left, break out of the loop.
114
+ break if i >= keys.length
115
+ data[keys[i]] = node.inner_text.strip.gsub("\302\240", "") unless keys[i].nil?
116
+ end
117
+
118
+ data
119
+ end
120
+ end
121
+ end
122
+ end
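The `extract` helper above pairs each matched node with a key by position, skips positions whose key is `nil`, and ignores nodes beyond the end of the key list. The following is a minimal sketch of that mapping, assuming plain strings in place of the Nokogiri nodes that `page.search(xpath)` would return; the `map_cells` name and the sample values are hypothetical.

```ruby
# Non-breaking space. The original strips the byte sequence "\302\240",
# which is the UTF-8 encoding of this character.
NBSP = "\u00A0"

# Hypothetical stand-in for RubyUW::Base.extract that works on strings.
def map_cells(cells, keys)
  return nil if cells.empty?

  data = {}
  cells.each_with_index do |cell, i|
    break if i >= keys.length # no keys left, ignore the remaining cells
    next if keys[i].nil?      # a nil key means "skip this column"
    data[keys[i]] = cell.strip.gsub(NBSP, "")
  end
  data
end

cells  = [" 14153 ", "HIST 101", "ignored", "Open\u00A0", "extra"]
keys   = [:sln, :course, nil, :status]
result = map_cells(cells, keys)
```

Here `result` holds only the `:sln`, `:course`, and `:status` entries: the third cell is dropped because its key is `nil`, and the fifth because no key remains for it.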
data/lib/rubyuw/connection.rb ADDED
@@ -0,0 +1,145 @@
1
+ module RubyUW
2
+ # Represents a connection to the MyUW portal. Outside developers
3
+ # will most likely not use this class, and will almost certainly
4
+ # not instantiate this class alone. However, for documentation
5
+ # purposes for myself and for potential contributors to RubyUW,
6
+ # this class is documented to the full extent.
7
+ #
8
+ # This class is mostly
9
+ # a fancy wrapper around WWW::Mechanize to provide nifty features
10
+ # such as chaining a workflow to get to a certain result and so on.
11
+ #
12
+ # == Page Flow Representation
13
+ #
14
+ # The main focus of this class is to provide a flexible way for
15
+ # the various pieces of RubyUW to represent and execute web flows.
16
+ # By "web flow" I mean the process of executing a task within a
17
+ # web site. WWW::Mechanize on its own is very good at getting/posting
18
+ # web pages and storing cookies and so on. But it has no good way (yet)
19
+ # of representing advanced multi-page actions. This is where
20
+ # this class comes in.
21
+ #
22
+ # If at any point a step in the process fails (link was not found,
23
+ # for example) then a RubyUW::InvalidPageError is thrown with details
24
+ # on what occurred.
25
+ #
26
+ # Below is a simple example of a three-step web flow which simulates
27
+ # logging into a fake website.
28
+ #
29
+ # connection = RubyUW::Connection.new
30
+ # connection.goto("http://fakewebsite.com").
31
+ # click_link("Log In").
32
+ # submit_form('login', :username => "foo", :password => "baz").
33
+ # execute!
34
+ #
35
+ # In RubyUW, more advanced features were necessary to verify that the
36
+ # page being visited is in fact valid to what we're looking for (to
37
+ # protect against potential changes in the MyUW system). The example
38
+ # below shows how this works. The verify method accepts any valid xpath
39
+ # and fails if the xpath fails to return any matches.
40
+ # Just like any other step, if the verification
41
+ # fails, a simple RubyUW::InvalidPageError is thrown.
42
+ #
43
+ # connection = RubyUW::Connection.new
44
+ # connection.goto("http://fakewebsite.com").
45
+ # verify("//input[@type='submit' and @value='Log in with your UW NetID']").
46
+ # click_submit("Log in with your UW NetID").
47
+ # execute!
48
+ #
49
+ # == Thread Safety
50
+ #
51
+ # The RubyUW::Connection object is *NOT* thread-safe at all. If multiple
52
+ # threads are attempting to do web flows concurrently, the results are
53
+ # unspecified.
54
+ class Connection < WWW::Mechanize
55
+ # Flow item: Go to a specific URL. Calling this method will
56
+ # attach a task to the current flow to go to a specific URL.
57
+ #
58
+ # @param [String] URL to go to (must respond to #to_s)
59
+ # @return self
60
+ def goto(url)
61
+ current_flow.push([:goto, url])
62
+ self
63
+ end
64
+
65
+ # Flow item: Submit a form. Attaches a task to the current flow to
66
+ # submit a form specified by the given form name. Optionally, data
67
+ # parameters may be given as the second argument to be filled into
68
+ # the form.
69
+ #
70
+ # @param [String] Name of the form.
71
+ # @param [Hash] Parameters to pass through to the form.
72
+ # @return self
73
+ def submit_form(name, parameters={})
74
+ current_flow.push([:submit_form, name, parameters])
75
+ self
76
+ end
77
+
78
+ # Flow item: Verify the existence of an element or elements. Attaches
79
+ # a task to verify that the given XPath string matches at least one
80
+ # element.
81
+ #
82
+ # @param [String] XPath of the element(s) to verify existence of.
83
+ # @return self
84
+ def verify(xpath)
85
+ current_flow.push([:verify, xpath])
86
+ self
87
+ end
88
+
89
+ # Execute the current web flow. This task will execute the flow
90
+ # (in order of course) that has been built up and will clear out the
91
+ # flow for future usage. It will return the result of the last executed task.
92
+ def execute!
93
+ # The context variable provides contextual information to the
94
+ # execute_* task. It is merely the result of the last execute
95
+ # task.
96
+ context = nil
97
+
98
+ current_flow.each do |command, *params|
99
+ # The context becomes the first parameter
100
+ params.unshift(context)
101
+
102
+ # Send to the special execution method
103
+ context = send("execute_#{command}!", *params)
104
+ end
105
+
106
+ @current_flow = []
107
+ context
108
+ end
109
+
110
+ # Returns the internal representation of the current web flow.
111
+ def current_flow
112
+ @current_flow ||= []
113
+ end
114
+
115
+ protected
116
+
117
+ # Executes the 'goto' web flow task by simply GETting the
118
+ # url specified. Returns the get result.
119
+ def execute_goto!(context, url)
120
+ self.get(url)
121
+ end
122
+
123
+ # Executes the 'verify' web flow task by checking the
124
+ # context with the xpath. Raises an InvalidPageError if no
125
+ # elements match.
126
+ def execute_verify!(context, xpath)
127
+ results = context.search(xpath)
128
+ raise Errors::InvalidPageError.new(context, "Failed verify step in web flow.") if results.empty?
129
+
130
+ context
131
+ end
132
+
133
+ # Executes the 'submit_form' task. Submits a form given by finding
134
+ # it on the previous page and submitting it.
135
+ def execute_submit_form!(context, name, parameters = {})
136
+ form = context.form_with(:name => name)
137
+ raise Errors::InvalidPageError.new(context, "Failed to find form on submit_form: #{name}") if form.nil?
138
+
139
+ # The #to_s is necessary here to avoid strange exceptions being
140
+ # thrown by WWW::Mechanize. Unit tested.
141
+ parameters.each { |k,v| form[k.to_s] = v.to_s }
142
+ form.submit()
143
+ end
144
+ end
145
+ end
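The flow mechanism in `Connection` queues `[command, *params]` entries and, on `execute!`, dispatches each one to an `execute_<command>!` method, passing the previous step's result along as context. The following is a minimal sketch of that pattern with no network access. The `MiniFlow` class, its fake page table, and the use of `ArgumentError` are hypothetical; it also clears the queue in an `ensure` block, which differs from the code above, where the queue is only cleared after every step succeeds.

```ruby
# Hypothetical miniature of the Connection flow pattern.
class MiniFlow
  def initialize(pages)
    @pages = pages # fake "site": URL => array of form names on that page
  end

  # Queue a step that loads a page.
  def goto(url)
    current_flow.push([:goto, url])
    self
  end

  # Queue a step that checks the loaded page contains a form name.
  def verify(form_name)
    current_flow.push([:verify, form_name])
    self
  end

  # Runs the queued steps in order, threading each result into the next.
  def execute!
    context = nil
    current_flow.each do |command, *params|
      context = send("execute_#{command}!", context, *params)
    end
    context
  ensure
    @current_flow = [] # clear the flow even if a step raised
  end

  def current_flow
    @current_flow ||= []
  end

  protected

  def execute_goto!(_context, url)
    @pages.fetch(url)
  end

  def execute_verify!(context, form_name)
    raise ArgumentError, "Failed verify step: #{form_name}" unless context.include?(form_name)
    context
  end
end

flow = MiniFlow.new("http://example.test/login" => ["f", "relay"])
page = flow.goto("http://example.test/login").verify("relay").execute!
```

Because every builder method returns `self`, steps chain naturally, and nothing runs until `execute!` is called.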
data/lib/rubyuw/curriculum_enrollment.rb ADDED
@@ -0,0 +1,75 @@
1
+ module RubyUW
2
+ # RubyUW::CurriculumEnrollment is used to extract SLN information
3
+ # for an entire "curriculum" of courses (which is how MyUW refers
4
+ # to it). If you have multiple SLNs within the same curriculum
5
+ # (chemistry, computer science, history, etc.), it is more efficient
6
+ # to use the CurriculumEnrollment class rather than SLN.
7
+ #
8
+ # <b>Requires authentication with RubyUW::Base prior to use!</b>
9
+ #
10
+ # == Usage Examples
11
+ #
12
+ # require 'rubyuw'
13
+ # RubyUW::Base.authenticate("netid", "password")
14
+ # curriculum = RubyUW::CurriculumEnrollment.find("HIST", "AUT+2009")
15
+ # curriculum["12345"] # RubyUW::SLN object. Use it like one!
16
+ #
17
+ # After using {RubyUW::CurriculumEnrollment} to grab the SLNs of a
18
+ # specific curriculum, you can access the individual SLNs using
19
+ # the array access notation common to ruby. This access returns
20
+ # a {RubyUW::SLN} object.
21
+ class CurriculumEnrollment
22
+ class <<self
23
+ # Pulls data about a specific curriculum. The SLN data is pulled
24
+ # directly from the MyUW curriculum information page. Authentication
25
+ # is required prior to use.
26
+ #
27
+ # @param [String] Curriculum to search for.
28
+ # @param [String] The term (quarter) to search in.
29
+ def find(curriculum, term)
30
+ raise Errors::NotLoggedInError.new unless Base.authenticated?
31
+
32
+ page = Base.connection.get("https://sdb.admin.washington.edu/timeschd/uwnetid/tsstat.asp?QTRYR=#{term}&CURRIC=#{curriculum}")
33
+ raise Errors::CurriculumDoesNotExistError.new if !curriculum_exists?(page)
34
+
35
+ extract_courses(page, term)
36
+ rescue WWW::Mechanize::ResponseCodeError
37
+ raise Errors::CurriculumDoesNotExistError.new
38
+ end
39
+
40
+ protected
41
+
42
+ def curriculum_exists?(page)
43
+ !(page.body.to_s =~ /No sections found for (.+?)/i)
44
+ end
45
+
46
+ def extract_courses(page, term)
47
+ course_nodes = page.search('//table//tr[count(th)>1 and @bgcolor="#d0d0d0"]//following-sibling::tr')
48
+
49
+ results = {}
50
+ course_nodes.each { |node|
51
+ sln, data = extract_course(node)
52
+ results[sln] = RubyUW::SLN.new(sln, term, data)
53
+ }
54
+
55
+ results
56
+ end
57
+
58
+ def extract_course(node)
59
+ data_keys = [:sln, :course, :section, :type, :title, :current_enrollment, :limit_enrollment,
60
+ :room_capacity, :space_available, nil, :notes]
61
+
62
+ results = Base.extract(node, 'td', data_keys)
63
+ results[:sln].gsub!(/^>/, '') # Strips leading '>' off of quiz sections
64
+
65
+ [results[:sln], results]
66
+ end
67
+ end
68
+ end
69
+
70
+ module Errors
71
+ # An error indicating that the requested curriculum does not
72
+ # exist.
73
+ class CurriculumDoesNotExistError < Exception; end
74
+ end
75
+ end
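Two small checks drive the curriculum parsing above: a page counts as "not found" when its body contains the "No sections found for" message, and SLNs for quiz sections have a leading `>` stripped. The following is a minimal sketch of both, assuming plain strings rather than Mechanize pages; the `clean_sln` name and the sample HTML are hypothetical.

```ruby
# Mirrors curriculum_exists?, but takes the page body as a string.
def curriculum_exists?(body)
  (body =~ /No sections found for (.+?)/i).nil?
end

# Hypothetical helper: returns the SLN without the leading '>' that
# marks quiz sections. Unlike gsub!, it does not modify its argument.
def clean_sln(raw)
  raw.sub(/^>/, "")
end

missing = "<p>No sections found for FAKE.</p>"
found   = "<table><tr><td>12345</td></tr></table>"

exists_missing = curriculum_exists?(missing)
exists_found   = curriculum_exists?(found)
quiz_sln       = clean_sln(">12345")
plain_sln      = clean_sln("12345")
```

The match is case-insensitive, so a differently capitalized error message is still detected.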