rubyuw 0.99.1

Files changed (48)
  1. data/.gitignore +4 -0
  2. data/README.md +81 -0
  3. data/Rakefile +45 -0
  4. data/VERSION +1 -0
  5. data/lib/rubyuw.rb +9 -0
  6. data/lib/rubyuw/base.rb +122 -0
  7. data/lib/rubyuw/connection.rb +145 -0
  8. data/lib/rubyuw/curriculum_enrollment.rb +75 -0
  9. data/lib/rubyuw/errors.rb +16 -0
  10. data/lib/rubyuw/schedule.rb +52 -0
  11. data/lib/rubyuw/sln.rb +108 -0
  12. data/rubyuw.gemspec +97 -0
  13. data/test/live/curriculum_enrollment_test.rb +47 -0
  14. data/test/live/sln_test.rb +54 -0
  15. data/test/live/test_helper.rb +24 -0
  16. data/test/mocked/base_test.rb +171 -0
  17. data/test/mocked/connection_test.rb +178 -0
  18. data/test/mocked/curriculum_enrollment_test.rb +158 -0
  19. data/test/mocked/fixture_pages/bad_request.html +80 -0
  20. data/test/mocked/fixture_pages/curric_courses.html +2659 -0
  21. data/test/mocked/fixture_pages/curric_no_courses.html +88 -0
  22. data/test/mocked/fixture_pages/curric_no_curric_found.html +86 -0
  23. data/test/mocked/fixture_pages/logged_in_relay.html +660 -0
  24. data/test/mocked/fixture_pages/login.html +121 -0
  25. data/test/mocked/fixture_pages/no_form_login.html +122 -0
  26. data/test/mocked/fixture_pages/no_login_button.html +100 -0
  27. data/test/mocked/fixture_pages/not_logged_in_relay.html +15 -0
  28. data/test/mocked/fixture_pages/registration.html +699 -0
  29. data/test/mocked/fixture_pages/registration_error.html +699 -0
  30. data/test/mocked/fixture_pages/registration_error_no_rows.html +627 -0
  31. data/test/mocked/fixture_pages/registration_no_form.html +700 -0
  32. data/test/mocked/fixture_pages/registration_success.html +718 -0
  33. data/test/mocked/fixture_pages/registration_unknown.html +721 -0
  34. data/test/mocked/fixture_pages/schedule.html +166 -0
  35. data/test/mocked/fixture_pages/schedule_diff_credit_row.html +167 -0
  36. data/test/mocked/fixture_pages/schedule_empty.html +130 -0
  37. data/test/mocked/fixture_pages/schedule_no_credit_row.html +169 -0
  38. data/test/mocked/fixture_pages/sln_does_not_exist.html +85 -0
  39. data/test/mocked/fixture_pages/sln_no_course_info.html +94 -0
  40. data/test/mocked/fixture_pages/sln_no_course_notes.html +94 -0
  41. data/test/mocked/fixture_pages/sln_no_enrollment_info.html +94 -0
  42. data/test/mocked/fixture_pages/sln_status.html +92 -0
  43. data/test/mocked/fixture_pages/welcome.html +93 -0
  44. data/test/mocked/schedule_test.rb +89 -0
  45. data/test/mocked/sln_test.rb +146 -0
  46. data/test/mocked/test_helper.rb +67 -0
  47. data/test/password.rb.sample +8 -0
  48. metadata +118 -0
data/.gitignore ADDED
@@ -0,0 +1,4 @@
1
+ pkg/*
2
+ test/password.rb
3
+ .yardoc
4
+ doc/*
data/README.md ADDED
@@ -0,0 +1,81 @@
1
+ # RubyUW: Ruby Interface to MyUW
2
+
3
+ __RubyUW is NOT supported in any way by the University of Washington__
4
+
5
+ RubyUW provides a programmable interface to MyUW, University
6
+ of Washington's student web portal.
7
+
8
+ ## Why?
9
+
10
+ It's a fun project and it was really a proof of concept more than
11
+ anything. I don't plan on officially supporting this library or
12
+ promising updates in case any features break. It's a good example
13
+ for any Ruby developer on the uses of mechanize and the ability
14
+ to scrape content or simulate a human at a browser.
15
+
16
+ It is also a useful library to create MyUW automation tools
17
+ with. I do not support this.
18
+
19
+ ## How does it work?
20
+
21
+ RubyUW functions by emulating a human in an actual browser. It hunts
22
+ down buttons to click, fields to fill in, etc. It is programmed
23
+ using the Ruby Mechanize library to achieve this level of human
24
+ simulation.
25
+
26
+ Of course this also means that even minor tweaks to the MyUW layout
27
+ could potentially "break" the RubyUW library.
28
+
29
+ ## Installing
30
+
31
+ # Install the gem
32
+ sudo gem sources -a http://gems.github.com
33
+ sudo gem install mitchellh-rubyuw
34
+
35
+ ## Using RubyUW
36
+
37
+ It's easy to get started with RubyUW. Below is a basic example;
38
+ for the comprehensive documentation, see the section titled
39
+ "Documentation" below.
40
+
41
+ The following is a quick and simple example:
42
+
43
+ require 'rubyuw'
44
+ RubyUW::Base.authenticate("netid", "password") or raise("Login Failed")
45
+
46
+ # Get SLN information
47
+ sln_info = RubyUW::SLN.find(14153, "AUT+2009")
48
+
49
+ ## Documentation
50
+
51
+ You may view all documentation [at the following URL](http://mitchellh.github.com/RubyUW/):
52
+
53
+ http://mitchellh.github.com/RubyUW/
54
+
55
+ ## Testing
56
+
57
+ RubyUW includes a comprehensive test suite including both live and
58
+ mocked testing. Mocked testing uses "mocked" HTML pages and a lot
59
+ of object stubbing to simulate a real environment and to force
60
+ certain events to occur (such as strange HTML in a request) in order
61
+ to completely test the library. Mock testing can be run straight
62
+ "out of the box" via a rake task:
63
+
64
+ rake test:run_mock
65
+
66
+ RubyUW also includes "live" testing which uses a real UW NetID and
67
+ password to conduct live tests. To run this, see test/password.rb.sample
68
+ and modify it to include your credentials, then save it as password.rb.
69
+ Following this, run another rake task:
70
+
71
+ rake test:run_live
72
+
73
+ Note: Live testing is very very slow. On my machine with the current
74
+ suite it takes about a minute to run all the tests (though this is
75
+ highly dependent on your internet connection).
76
+
77
+ Note: Also, live testing DOES NOT test registration since that actually
78
+ affects your MyUW account in a significant way. To test registration,
79
+ run the registration test directly:
80
+
81
+ cd test/live && ruby registration_test.rb
data/Rakefile ADDED
@@ -0,0 +1,45 @@
1
+ begin
2
+ require 'jeweler'
3
+ Jeweler::Tasks.new do |gemspec|
4
+ gemspec.name = "rubyuw"
5
+ gemspec.summary = "Library which provides a ruby interface to the University of Washington student portal."
6
+ gemspec.email = "mitchell.hashimoto@gmail.com"
7
+ gemspec.homepage = "http://github.com/mitchellh/rubyuw"
8
+ gemspec.description = "Library which provides a ruby interface to the University of Washington student portal."
9
+ gemspec.authors = ["Mitchell Hashimoto"]
10
+
11
+ gemspec.add_dependency('tenderlove-mechanize', '>= 0.9.3.20090623142847')
12
+ end
13
+ Jeweler::GemcutterTasks.new
14
+ rescue LoadError
15
+ puts "Jeweler not available. Install it with: sudo gem install technicalpickles-jeweler -s http://gems.github.com"
16
+ end
17
+
18
+ def run_tests(files)
19
+ files.each { |f| require f }
20
+ end
21
+
22
+ begin
23
+ require 'yard'
24
+ YARD::Rake::YardocTask.new
25
+ rescue LoadError
26
+ desc "Generate YARD documentation."
27
+ task :yardoc do
28
+ puts "Yard is not available. Install with: sudo gem install yard"
29
+ end
30
+ end
31
+
32
+ namespace :test do
33
+ desc "Run all the tests"
34
+ task :run => [:run_mock, :run_live]
35
+
36
+ desc "Run the library test on fixtured data"
37
+ task :run_mock do
38
+ run_tests(Dir[File.join(File.dirname(__FILE__), 'test', 'mocked', '**', '*.rb')])
39
+ end
40
+
41
+ desc "Run the library tests on live data (requires password.rb file). NOTE: This method is very slow."
42
+ task :run_live do
43
+ run_tests(Dir[File.join(File.dirname(__FILE__), 'test', 'live', '**', '*.rb')])
44
+ end
45
+ end
data/VERSION ADDED
@@ -0,0 +1 @@
1
+ 0.99.1
data/lib/rubyuw.rb ADDED
@@ -0,0 +1,9 @@
1
+ require 'rubygems'
2
+ require 'mechanize'
3
+ require 'nokogiri'
4
+ require 'rubyuw/errors'
5
+ require 'rubyuw/base'
6
+ require 'rubyuw/connection'
7
+ require 'rubyuw/sln'
8
+ require 'rubyuw/curriculum_enrollment'
9
+ require 'rubyuw/schedule'
data/lib/rubyuw/base.rb ADDED
@@ -0,0 +1,122 @@
1
+ module RubyUW
2
+ # The base class for RubyUW. As a developer using RubyUW, you'll
3
+ # most likely only use this class to authenticate a connection to
4
+ # MyUW and to verify the connection.
5
+ #
6
+ # The base connection is shared by all other portions of RubyUW.
7
+ # Therefore, if a connection is not established first, you'll
8
+ # probably see a NotLoggedInError from other areas of RubyUW.
9
+ #
10
+ # == Usage Example
11
+ #
12
+ # @example Authenticating with MyUW
13
+ # require 'rubyuw'
14
+ # RubyUW::Base.authenticate("uwnetid", "password")
15
+ class Base
16
+ @@connection = nil
17
+
18
+ class <<self
19
+ # Authenticate with MyUW. This method sets up and verifies
20
+ # a session with MyUW. Every feature of RubyUW which requires
21
+ # authentication (as of writing this documentation this is
22
+ # every feature), will require that this authentication is
23
+ # completed first.
24
+ #
25
+ # @param [String] UW NetID
26
+ # @param [String] Password for the UW NetID
27
+ # @return [Boolean] true if the login succeeded, false otherwise
28
+ def authenticate(netid, password)
29
+ # Clear pre-existing session information first
30
+ logout
31
+
32
+ # Log in
33
+ results = connection.goto("http://myuw.washington.edu").
34
+ verify("//input[@type='submit' and @value='Log in with your UW NetID']").
35
+ submit_form('f').
36
+ submit_form('relay').
37
+ submit_form('query', {
38
+ :user => netid,
39
+ :pass => password
40
+ }).
41
+ execute!
42
+
43
+ relay_form = results.form_with(:name => 'relay')
44
+ return false unless results.form_with(:name => 'query').nil?
45
+ relay_form.submit()
46
+
47
+ true
48
+ end
49
+
50
+ # Checks whether authentication is still valid. This method
51
+ # goes to the MyUW portal and checks to verify that the
52
+ # previous authentication is valid. If no authentication was
53
+ # ever done, this will return false.
54
+ #
55
+ # @return [Boolean]
56
+ def authenticated?
57
+ results = connection.goto("http://myuw.washington.edu").
58
+ submit_form('f').
59
+ execute!
60
+
61
+ !results.search("//div[@class='main_search']").empty?
62
+ end
63
+
64
+ # Log out from MyUW. This method clears the cookies of the
65
+ # connection, also clearing out any session data. This effectively
66
+ # logs a user out of anything.
67
+ def logout
68
+ connection.cookie_jar.clear!
69
+ end
70
+
71
+ # Returns the connection object. Used by other portions of RubyUW
72
+ # to browse the pages, authenticate, and more. This method will
73
+ # typically *never* be called by non-internal systems.
74
+ #
75
+ # @return [WWW::Mechanize] A WWW::Mechanize object.
76
+ def connection
77
+ @@connection ||= RubyUW::Connection.new do |browser|
78
+ browser.user_agent_alias = 'Mac Safari'
79
+ browser.follow_meta_refresh = true
80
+
81
+ # Workaround to avoid a frozen object error on SSL pages
82
+ browser.keep_alive = false
83
+
84
+ # Do not keep any history, which is otherwise a
85
+ # surefire way to grow memory usage without bound.
86
+ browser.max_history = 0
87
+
88
+ # Force parsing with Nokogiri. Strange bug introduced
89
+ # in 0.9.3 of WWW::Mechanize defaults to nil.
90
+ # http://github.com/tenderlove/mechanize/issues#issue/5
91
+ browser.html_parser = Nokogiri::HTML
92
+ end
93
+ end
94
+
95
+ # Resets the connection by clearing out the old. This way,
96
+ # when connection is next called, it will recreate the object.
97
+ def reset_connection!
98
+ @@connection = nil
99
+ self
100
+ end
101
+
102
+ # Runs an xpath on a page, mapping the results to a hash map.
103
+ # This method should never have to be called outside the core library
104
+ # code. It is purely a convenience method supplied to other parts
105
+ # of the library since this functionality is repeated a few times
106
+ # in different areas.
107
+ def extract(page, xpath, keys)
108
+ nodes = page.search(xpath)
109
+ return nil if nodes.empty?
110
+
111
+ data = {}
112
+ nodes.each_with_index do |node, i|
113
+ # If we have no keys left, break out of the loop.
114
+ break if i >= keys.length
115
+ data[keys[i]] = node.inner_text.strip.gsub("\302\240", "") unless keys[i].nil?
116
+ end
117
+
118
+ data
119
+ end
120
+ end
121
+ end
122
+ end
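The behaviour of `Base.extract` above can be illustrated without a live page. The following is only a sketch: `FakeNode` and `extract_from_nodes` are hypothetical names, not part of RubyUW, and the helper takes a plain array of nodes rather than a page and an XPath. It also writes the non-breaking space as `"\u00a0"`, which is assumed to be the character the source's `"\302\240"` byte sequence stands for.

```ruby
# Hypothetical stand-in for a Nokogiri node: only #inner_text is needed here.
FakeNode = Struct.new(:inner_text)

# Mirrors the logic of RubyUW::Base.extract, but works on an array of nodes
# instead of running an XPath against a page (an assumption of this sketch).
def extract_from_nodes(nodes, keys)
  return nil if nodes.empty?

  data = {}
  nodes.each_with_index do |node, i|
    break if i >= keys.length # more nodes than keys: ignore the rest
    next if keys[i].nil?      # a nil key means "skip this column"
    data[keys[i]] = node.inner_text.strip.gsub("\u00a0", "")
  end
  data
end

nodes = [
  FakeNode.new(" 12345 "),
  FakeNode.new("ignored"),
  FakeNode.new("HIST 101\u00a0"),
  FakeNode.new("extra")
]
row = extract_from_nodes(nodes, [:sln, nil, :course])
```

Here `row` holds only the `:sln` and `:course` entries: the second node is skipped because its key is `nil`, and the fourth is dropped because no key remains for it.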
data/lib/rubyuw/connection.rb ADDED
@@ -0,0 +1,145 @@
1
+ module RubyUW
2
+ # Represents a connection to the MyUW portal. Outside developers
3
+ # will most likely not use this class, and will almost certainly
4
+ # not instantiate this class alone. However, for documentation
5
+ # purposes for myself and for potential contributors to RubyUW,
6
+ # this class is documented to the full extent.
7
+ #
8
+ # This class is mostly
9
+ # a fancy wrapper around WWW::Mechanize to provide nifty features
10
+ # such as chaining a workflow to get to a certain result and so on.
11
+ #
12
+ # == Page Flow Representation
13
+ #
14
+ # The main focus of this class is to provide a flexible way for
15
+ # the various pieces of RubyUW to represent and execute web flows.
16
+ # By "web flow" I mean the process of executing a task within a
17
+ # web site. WWW::Mechanize on its own is very good at getting/posting
18
+ # web pages and storing cookies and so on. But it has no good way (yet)
19
+ # of representing advanced multi-page actions. This is where
20
+ # this class comes in.
21
+ #
22
+ # If at any point a step in the process fails (link was not found,
23
+ # for example) then a RubyUW::InvalidPageError is thrown with details
24
+ # on what occurred.
25
+ #
26
+ # Below is a simple example of a three-step web flow which simulates
27
+ # logging into a fake website.
28
+ #
29
+ # connection = RubyUW::Connection.new
30
+ # connection.goto("http://fakewebsite.com").
31
+ # click_link("Log In").
32
+ # submit_form('login', :username => "foo", :password => "baz").
33
+ # execute!
34
+ #
35
+ # In RubyUW, more advanced features were necessary to verify that the
36
+ # page being visited is in fact valid to what we're looking for (to
37
+ # protect against potential changes in the MyUW system). The example
38
+ # below shows how this works. The verify method accepts any valid xpath
39
+ # and fails if the xpath fails to return any matches.
40
+ # Just like any other step, if the verification
41
+ # fails, a simple RubyUW::InvalidPageError is thrown.
42
+ #
43
+ # connection = RubyUW::Connection.new
44
+ # connection.goto("http://fakewebsite.com").
45
+ # verify("//input[@type='submit' and @value='Log in with your UW NetID']").
46
+ # click_submit("Log in with your UW NetID").
47
+ # execute!
48
+ #
49
+ # == Thread Safety
50
+ #
51
+ # The RubyUW::Connection object is *NOT* thread-safe at all. If multiple
52
+ # threads are attempting to do web flows concurrently, the results are
53
+ # unspecified.
54
+ class Connection < WWW::Mechanize
55
+ # Flow item: Go to a specific URL. Calling this method will
56
+ # attach a task to the current flow to go to a specific URL.
57
+ #
58
+ # @param [String] URL to go to (must respond to #to_s)
59
+ # @return self
60
+ def goto(url)
61
+ current_flow.push([:goto, url])
62
+ self
63
+ end
64
+
65
+ # Flow item: Submit a form. Attaches a task to the current flow to
66
+ # submit a form specified by the given form name. Optionally, data
67
+ # parameters may be given as the second argument to be filled into
68
+ # the form.
69
+ #
70
+ # @param [String] Name of the form.
71
+ # @param [Hash] Parameters to pass through to the form.
72
+ # @return self
73
+ def submit_form(name, parameters={})
74
+ current_flow.push([:submit_form, name, parameters])
75
+ self
76
+ end
77
+
78
+ # Flow item: Verify the existence of an element or elements. Attaches
79
+ # a task to verify that the given XPath string matches at least one
80
+ # element.
81
+ #
82
+ # @param [String] XPath of the element(s) to verify existence of.
83
+ # @return self
84
+ def verify(xpath)
85
+ current_flow.push([:verify, xpath])
86
+ self
87
+ end
88
+
89
+ # Execute the current web flow. This task will execute the flow
90
+ # (in order of course) that has been built up and will clear out the
91
+ # flow for future usage. It returns the result of the last executed task.
92
+ def execute!
93
+ # The context variable provides contextual information to the
94
+ # execute_* task. It is merely the result of the last execute
95
+ # task.
96
+ context = nil
97
+
98
+ current_flow.each do |command, *params|
99
+ # The context becomes the first parameter
100
+ params.unshift(context)
101
+
102
+ # Send to the special execution method
103
+ context = send("execute_#{command}!", *params)
104
+ end
105
+
106
+ @current_flow = []
107
+ context
108
+ end
109
+
110
+ # Returns the internal representation of the current web flow.
111
+ def current_flow
112
+ @current_flow ||= []
113
+ end
114
+
115
+ protected
116
+
117
+ # Executes the 'goto' web flow task by simply GETting the
118
+ # url specified. Returns the get result.
119
+ def execute_goto!(context, url)
120
+ self.get(url)
121
+ end
122
+
123
+ # Executes the 'verify' web flow task by checking the
124
+ # context with the xpath. Raises an InvalidPageError if no
125
+ # elements match.
126
+ def execute_verify!(context, xpath)
127
+ results = context.search(xpath)
128
+ raise Errors::InvalidPageError.new(context, "Failed verify step in web flow.") if results.empty?
129
+
130
+ context
131
+ end
132
+
133
+ # Executes the 'submit_form' task. Submits a form given by finding
134
+ # it on the previous page and submitting it.
135
+ def execute_submit_form!(context, name, parameters = {})
136
+ form = context.form_with(:name => name)
137
+ raise Errors::InvalidPageError.new(context, "Failed to find form on submit_form: #{name}") if form.nil?
138
+
139
+ # The #to_s is necessary here to avoid strange exceptions being
140
+ # thrown by WWW::Mechanize. Unit tested.
141
+ parameters.each { |k,v| form[k.to_s] = v.to_s }
142
+ form.submit()
143
+ end
144
+ end
145
+ end
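The queue-then-execute pattern used by `RubyUW::Connection` can be tried in isolation. The sketch below is an assumption-laden miniature, not the library's code: `MiniFlow` is a hypothetical class that makes no HTTP requests, its "pages" are plain strings, `upcase` is an invented step, and `verify` checks for a substring rather than an XPath match.

```ruby
# Hypothetical miniature of the web-flow pattern: steps are queued as
# [command, *params] arrays and dispatched to execute_<command>! methods.
class MiniFlow
  def goto(url)
    current_flow.push([:goto, url])
    self
  end

  def verify(text)
    current_flow.push([:verify, text])
    self
  end

  def upcase
    current_flow.push([:upcase])
    self
  end

  # Runs every queued step in order, feeding each step the previous result.
  def execute!
    context = nil
    current_flow.each do |command, *params|
      context = send("execute_#{command}!", context, *params)
    end
    @current_flow = []
    context
  end

  def current_flow
    @current_flow ||= []
  end

  protected

  # Stand-in for an HTTP GET: returns a string instead of a page object.
  def execute_goto!(_context, url)
    "page at #{url}"
  end

  # Raises unless the previous result contains the given text.
  def execute_verify!(context, text)
    raise "Failed verify step: #{text.inspect} not found" unless context.include?(text)
    context
  end

  def execute_upcase!(context)
    context.upcase
  end
end

flow = MiniFlow.new
result = flow.goto("http://example.com").verify("example").upcase.execute!
```

Each builder method returns `self`, which is what makes the chained call style possible. As in the source, the queue is reset only after every step has run, so a step that raises appears to leave the earlier steps queued for the next `execute!` call.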
data/lib/rubyuw/curriculum_enrollment.rb ADDED
@@ -0,0 +1,75 @@
1
+ module RubyUW
2
+ # RubyUW::CurriculumEnrollment is used to extract SLN information
3
+ # for an entire "curriculum" of courses (which is how MyUW refers
4
+ # to it). If you have multiple SLNs within the same curriculum
5
+ # (chemistry, computer science, history, etc.), it is more efficient
6
+ # to use the CurriculumEnrollment class rather than SLN.
7
+ #
8
+ # <b>Requires authentication with RubyUW::Base prior to use!</b>
9
+ #
10
+ # == Usage Examples
11
+ #
12
+ # require 'rubyuw'
13
+ # RubyUW::Base.authenticate("netid", "password")
14
+ # curriculum = RubyUW::CurriculumEnrollment.find("HIST", "AUT+2009")
15
+ # curriculum["12345"] # RubyUW::SLN object. Use it like one!
16
+ #
17
+ # After using {RubyUW::CurriculumEnrollment} to grab the SLNs of a
18
+ # specific curriculum, you can access the individual SLNs using
19
+ # the array access notation common to ruby. This access returns
20
+ # a {RubyUW::SLN} object.
21
+ class CurriculumEnrollment
22
+ class <<self
23
+ # Pulls data about a specific curriculum. The SLN data is pulled
24
+ # directly from the MyUW curriculum information page. Authentication
25
+ # is required prior to use.
26
+ #
27
+ # @param [String] Curriculum to search for.
28
+ # @param [String] The term (quarter) to search in.
29
+ def find(curriculum, term)
30
+ raise Errors::NotLoggedInError.new unless Base.authenticated?
31
+
32
+ page = Base.connection.get("https://sdb.admin.washington.edu/timeschd/uwnetid/tsstat.asp?QTRYR=#{term}&CURRIC=#{curriculum}")
33
+ raise Errors::CurriculumDoesNotExistError.new if !curriculum_exists?(page)
34
+
35
+ extract_courses(page, term)
36
+ rescue WWW::Mechanize::ResponseCodeError
37
+ raise Errors::CurriculumDoesNotExistError.new
38
+ end
39
+
40
+ protected
41
+
42
+ def curriculum_exists?(page)
43
+ !(page.body.to_s =~ /No sections found for (.+?)/i)
44
+ end
45
+
46
+ def extract_courses(page, term)
47
+ course_nodes = page.search('//table//tr[count(th)>1 and @bgcolor="#d0d0d0"]//following-sibling::tr')
48
+
49
+ results = {}
50
+ course_nodes.each { |node|
51
+ sln, data = extract_course(node)
52
+ results[sln] = RubyUW::SLN.new(sln, term, data)
53
+ }
54
+
55
+ results
56
+ end
57
+
58
+ def extract_course(node)
59
+ data_keys = [:sln, :course, :section, :type, :title, :current_enrollment, :limit_enrollment,
60
+ :room_capacity, :space_available, nil, :notes]
61
+
62
+ results = Base.extract(node, 'td', data_keys)
63
+ results[:sln].gsub!(/^>/, '') # Strips leading '>' off of quiz sections
64
+
65
+ [results[:sln], results]
66
+ end
67
+ end
68
+ end
69
+
70
+ module Errors
71
+ # An error indicating that the curriculum requested does not
72
+ # exist.
73
+ class CurriculumDoesNotExistError < Exception; end
74
+ end
75
+ end
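The two string checks in `CurriculumEnrollment` are easy to exercise on their own. This is a minimal sketch under stated assumptions: both helpers are standalone re-statements that take plain strings instead of Mechanize pages, `clean_sln` is a hypothetical name that returns a new string instead of mutating in place, and the sample HTML snippets are invented for illustration.

```ruby
# Same regular expression as curriculum_exists? in the source, applied to a
# plain string body. Returns false when the "no sections" notice is present.
def curriculum_exists?(body)
  !(body.to_s =~ /No sections found for (.+?)/i)
end

# Same substitution as in extract_course: strips the leading '>' that marks
# quiz sections. Hypothetical helper name; non-mutating for clarity.
def clean_sln(raw)
  raw.sub(/^>/, '')
end

missing_page = "<p>No sections found for FAKE</p>" # invented sample markup
listing_page = "<table><tr><td>12345</td></tr></table>" # invented sample markup

exists_for_missing = curriculum_exists?(missing_page)
exists_for_listing = curriculum_exists?(listing_page)
quiz_sln = clean_sln(">12345")
```

The existence check depends entirely on the wording of the notice on the page, so it would presumably need updating if that message ever changed.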