tap-mechanize 0.5.0

data/History ADDED
@@ -0,0 +1,41 @@
1
+ == 0.5.0 / 2009-03-23
2
+
3
+ Total refactoring of the entire namespace. TapHttp is now
4
+ known as TapMechanize. Obviously not backwards compatible.
5
+
6
+ * updated redirect-http to allow for returns within
7
+ the existing onSubmit
8
+ * Request now returns Page rather than page content
9
+ by default
10
+ * refactored HttpTest to Mechanize::Test
11
+ * added Tap::Mechanize::Agent
12
+
13
+ == 0.4.0 / 2009-03-17
14
+
15
+ Significant rewrite using an updated version of TapServer
16
+ and Mechanize. TapHttp now serves its own redirection
17
+ commands via TapUbiquity.
18
+
19
+ == 0.3.1 / 2009-02-19
20
+
21
+ Improved handling of multipart/form data.
22
+
23
+ * http_to_yaml returns config file as an attachment.
24
+
25
+ == 0.3.0 / 2009-02-19
26
+
27
+ Rework of CGI scripts as Tap controllers. Nearly complete
28
+ internal refactoring (this is not a backwards compatible
29
+ release).
30
+
31
+ * Reworked CGI scripts as Tap Controllers
32
+ * Reworked Dispatch as the Request module
33
+ * Added Get and Submit tasks
34
+ * Reworked HttpTest to allow specification of a
35
+ Rack application
36
+
37
+ == 0.2.1 / 2009-02-17
38
+
39
+ Minor updates to utilize Tap 0.12.0
40
+
41
+ * Added History
data/MIT-LICENSE ADDED
@@ -0,0 +1,19 @@
1
+ Copyright (c) 2008-2009, Regents of the University of Colorado.
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining a copy of this
4
+ software and associated documentation files (the "Software"), to deal in the Software
5
+ without restriction, including without limitation the rights to use, copy, modify, merge,
6
+ publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons
7
+ to whom the Software is furnished to do so, subject to the following conditions:
8
+
9
+ The above copyright notice and this permission notice shall be included in all copies or
10
+ substantial portions of the Software.
11
+
12
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
13
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
14
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
15
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
16
+ HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
17
+ WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
18
+ FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
19
+ OTHER DEALINGS IN THE SOFTWARE.
data/README ADDED
@@ -0,0 +1,58 @@
1
+ = {Tap Mechanize}[http://tap.rubyforge.org/projects/tap-mechanize]
2
+
3
+ A task library for submitting http requests using Tap[http://tap.rubyforge.org] and Mechanize[http://mechanize.rubyforge.org]
4
+
5
+ == Description
6
+
7
+ Tap::Mechanize provides tasks and controllers to automate interaction with
8
+ websites, and in particular with web applications. Tap::Mechanize provides a
9
+ {Ubiquity}[http://labs.mozilla.com/2008/08/introducing-ubiquity/] command to
10
+ redirect and capture HTTP requests as YAML. The resulting request
11
+ configuration files can be edited and resubmitted using a Request task.
12
+ Multi-step actions, file uploads, and actions that require the use of HTTPS
13
+ may all be automated in this way.
14
+
15
+ * Lighthouse[http://bahuvrihi.lighthouseapp.com/projects/9908-tap-task-application/tickets]
16
+ * Github[http://github.com/bahuvrihi/tap-mechanize/tree/master]
17
+ * {Google Group}[http://groups.google.com/group/ruby-on-tap]
18
+
19
+ === Usage
20
+
21
+ Tap::Mechanize submits HTTP requests using the Tap::Mechanize::Submit task.
22
+ Headers and parameters may be specified, but only a uri is required.
23
+
24
+ r = Tap::Mechanize::Submit.new
25
+ r.process(:uri => 'http://www.google.com')[0, 80]
26
+ # => "<html><head><meta http-equiv=\"content-type\" content=\"text/html; charset=ISO-8859"
27
+
28
+ === Capturing Web Forms
29
+
30
+ HTTP requests from web forms may be captured and resubmitted using a
31
+ combination of tools. First start a tap server from the command line:
32
+
33
+ % tap server
34
+
35
+ Now open a browser and work through the tutorial: http://localhost:8080/capture/tutorial
36
+
37
+ The current capture scripts redirect via the onsubmit and onclick events in a
38
+ web page. This works in most cases, but is known to fail in some circumstances
39
+ (e.g. JavaServer Pages). A workaround exists using the {Live HTTP
40
+ Headers}[https://addons.mozilla.org/en-US/firefox/addon/3829] addon. Start a
41
+ web server and see http://localhost:8080/capture/http.
42
+
43
+ == Installation
44
+
45
+ Tap::Mechanize is available as a gem on
46
+ RubyForge[http://rubyforge.org/projects/tap]. Use:
47
+
48
+ % gem install tap-mechanize
49
+
50
+ Note this package was once called tap-http.
51
+
52
+ == Info
53
+
54
+ Copyright (c) 2008-2009, Regents of the University of Colorado.
55
+ Developer:: {Simon Chiang}[http://bahuvrihi.wordpress.com], {Biomolecular Structure Program}[http://biomol.uchsc.edu/], {Hansen Lab}[http://hsc-proteomics.uchsc.edu/hansenlab/]
56
+ Support:: CU Denver School of Medicine Deans Academic Enrichment Fund
57
+ Licence:: {MIT-Style}[link:files/MIT-LICENSE.html]
58
+
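The README describes captured requests as YAML files that can be edited and resubmitted. The following is a minimal sketch of that round trip using only the standard library. It assumes a request file is a list of hashes with the `uri`, `request_method` and `params` keys that the agent code reads; the URI and parameter values are illustrative and not taken from a real capture.

```ruby
require 'yaml'

# Assumed shape of a captured request list; values are made up for illustration.
requests = [
  {
    'uri'            => 'http://localhost:8080/capture/say',
    'request_method' => 'POST',
    'params'         => { 'words' => 'hello' }
  }
]

# Serialize the requests as they might be stored in a capture file...
text = YAML.dump(requests)

# ...edit a parameter after loading, as one might before resubmitting.
loaded = YAML.load(text)
loaded.first['params']['words'] = 'goodbye'

puts YAML.dump(loaded)
```

Editing the loaded structure rather than the raw text keeps the file valid YAML regardless of quoting or indentation.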
@@ -0,0 +1,101 @@
1
+ require 'mechanize'
2
+
3
+ module Tap
4
+ module Mechanize
5
+
6
+ # A Configurable version of the WWW::Mechanize class.
7
+ class Agent < ::WWW::Mechanize
8
+ include Configurable
9
+
10
+ DEFAULT_ATTRIBUTES = DEFAULT_ATTRIBUTES.dup
11
+ DEFAULT_ATTRIBUTES[nil] = {:reader => false, :writer => false}
12
+
13
+ config :ca_file, nil
14
+ config :cert, nil
15
+ config :conditional_requests, true, &c.switch
16
+ config :follow_meta_refresh, false, &c.switch
17
+ config :keep_alive, true, &c.switch
18
+ config :keep_alive_time, 300, &c.integer_or_nil
19
+ config :key, nil
20
+ config :open_timeout, nil
21
+ config :pass, nil
22
+ config :read_timeout, nil
23
+ config :redirect_ok, true, &c.switch
24
+ config :redirection_limit, 20, &c.integer_or_nil
25
+ config :user_agent, "WWW-Mechanize/0.9.2 (http://rubyforge.org/projects/mechanize/)"
26
+ config :verify_callback, nil
27
+ config :watch_for_set, nil
28
+
29
+ def initialize(config={})
30
+ super()
31
+ initialize_config(config)
32
+ end
33
+
34
+ # Fetches the specified request. Request must be a hash at least
35
+ # specifying a uri; these are the allowed keys and defaults:
36
+ #
37
+ # uri:: nil, must be specified
38
+ # request_method:: get
39
+ # params:: {}
40
+ # headers:: {}
41
+ #
42
+ def fetch(request)
43
+ request = symbolize(request)
44
+
45
+ # note get handles nil request methods (ie '') and has to
46
+ # reassign uri to url to make mechanize happy
47
+ case request[:request_method].to_s
48
+ when /get/i, '' then get(request.merge(:url => request[:uri]))
49
+ when /post/i then submit(prepare_form(request), nil)
50
+ when /head/i then head(nil, nil, request)
51
+ when /put/i then put(nil, nil, request)
52
+ when /delete/i then delete(nil, nil, request)
53
+ else raise "unsupported request method: #{request.inspect}"
54
+ end
55
+ end
56
+
57
+ protected
58
+
59
+ # taken from ActiveSupport
60
+ def symbolize(hash) # :nodoc:
61
+ hash.inject({}) do |options, (key, value)|
62
+ options[(key.to_sym rescue key) || key] = value
63
+ options
64
+ end
65
+ end
66
+
67
+ # A subclass of Hash used exclusively as the node for a form.
68
+ # Evidently a search method is necessary.
69
+ class FormNode < Hash # :nodoc:
70
+ def search(*args); []; end
71
+ end
72
+
73
+ # prepares a Mechanize::Form for the request. this method is patterned
74
+ # after what happens in Mechanize#post and hopefully will not be
75
+ # necessary in the future.
76
+ def prepare_form(request) # :nodoc:
77
+ node = FormNode.new
78
+ node['method'] = request[:request_method]
79
+ node['action'] = request[:uri]
80
+ node['enctype'] = request[:headers] ? request[:headers]['Content-Type'] : nil
81
+
82
+ form = Form.new(node)
83
+ request[:params].each_pair do |key, value|
84
+ case value
85
+ when IO
86
+ form.enctype = 'multipart/form-data'
87
+ upload = Form::FileUpload.new(key.to_s, ::File.basename(value.path))
88
+ upload.file_data = value.read
89
+ form.file_uploads << upload
90
+ when Array
91
+ fields = value.collect {|val| Form::Field.new(key.to_s, val)}
92
+ form.fields.concat fields
93
+ else
94
+ form.fields << Form::Field.new(key.to_s, value)
95
+ end
96
+ end
97
+ form
98
+ end
99
+ end
100
+ end
101
+ end
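The agent above normalizes request keys, picks an HTTP verb from `request_method`, and sorts POST parameters into plain fields and file uploads. The sketch below reproduces that branching without Mechanize so it can be run on its own. `select_verb` and `classify_params` are hypothetical helpers written for illustration; they return plain symbols and arrays instead of calling Mechanize or building `Form` objects.

```ruby
require 'tmpdir'

# Same key normalization as Agent#symbolize: string keys become symbols,
# keys that cannot be converted are kept as they are.
def symbolize(hash)
  hash.inject({}) do |options, (key, value)|
    options[(key.to_sym rescue key) || key] = value
    options
  end
end

# Hypothetical helper: reports which branch of Agent#fetch a request would take.
# A missing request method falls through to :get, as in the original.
def select_verb(request)
  case symbolize(request)[:request_method].to_s
  when /get/i, '' then :get
  when /post/i    then :post
  when /head/i    then :head
  when /put/i     then :put
  when /delete/i  then :delete
  else raise "unsupported request method: #{request.inspect}"
  end
end

# Hypothetical helper: mirrors the case statement in prepare_form, collecting
# [name, value] pairs for fields and [name, filename, data] for uploads.
def classify_params(params)
  fields, uploads = [], []
  params.each_pair do |key, value|
    case value
    when IO
      uploads << [key.to_s, File.basename(value.path), value.read]
    when Array
      value.each { |val| fields << [key.to_s, val] }
    else
      fields << [key.to_s, value]
    end
  end
  { :fields => fields, :uploads => uploads }
end

# Usage: one scalar, one multi-valued parameter, and one open file.
Dir.mktmpdir do |dir|
  path = File.join(dir, 'notes.txt')
  File.open(path, 'w') { |io| io << 'hello' }
  File.open(path) do |io|
    p select_verb('request_method' => 'POST')
    p classify_params('q' => 'ruby', 'tags' => %w[a b], 'file' => io)
  end
end
```

Note that the `when IO` branch only matches real `IO` objects such as an open `File`; a `StringIO` is not an `IO` subclass and would be treated as an ordinary field value.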
@@ -0,0 +1,194 @@
1
+ require 'tap/controller'
2
+ require 'tap/mechanize/utils'
3
+ require 'tap/ubiquity/utils'
4
+
5
+ module Tap
6
+ module Mechanize
7
+ # :startdoc::controller
8
+ # :startdoc::ubiquity
9
+ class Capture < Tap::Controller
10
+ include Tap::Mechanize::Utils
11
+ include Tap::Ubiquity::Utils
12
+
13
+ PREFIX = '__redirect_http_'
14
+ REDIRECT_PARAMETER = '__redirect_http_original_action'
15
+
16
+ # Brings up the tutorial.
17
+ def index
18
+ render 'index.erb', :locals => {:captures => persistence.index }
19
+ end
20
+
21
+ def create(id, keep_content=true)
22
+ persistence.create(id) do |io|
23
+ io << YAML.dump([parse_request(keep_content)])
24
+ end
25
+ download(id)
26
+ end
27
+
28
+ def show(id)
29
+ response['Content-Type'] = "text/plain"
30
+ persistence.read(id)
31
+ end
32
+
33
+ def download(id)
34
+ path = persistence.path(id)
35
+ filename = id
36
+ filename += ".yml" if File.extname(id) == ""
37
+
38
+ response['Content-Type'] = "text/plain"
39
+ response['Content-Disposition'] = "attachment; filename=#{filename};"
40
+ persistence.read(id)
41
+ end
42
+
43
+ def update(id="request", keep_content=true)
44
+ path = persistence.path(id)
45
+ requests = File.exist?(path) ? YAML.load_file(path) : []
46
+ requests << parse_request(keep_content)
47
+
48
+ persistence.update(id) do |io|
49
+ io << YAML.dump(requests)
50
+ end
51
+ download(id)
52
+ end
53
+
54
+ def destroy(id)
55
+ persistence.destroy(id)
56
+ redirect uri(:index)
57
+ end
58
+
59
+ # Brings up a tutorial teaching how to capture and resubmit forms.
60
+ def tutorial
61
+ serve js_injection(:redirect_http) do |link|
62
+ content = render 'tutorial.erb'
63
+ content + link
64
+ end
65
+ end
66
+
67
+ def test
68
+ render 'test.erb'
69
+ end
70
+
71
+ # Say is the target of the tutorial.
72
+ def say
73
+ "<pre>#{request.params['words']}</pre>"
74
+ end
75
+
76
+ # Returns the redirection script.
77
+ def redirect_http
78
+ css = render 'redirect.css'
79
+ script = render 'redirect.js', :locals => {
80
+ :redirect_parameter => REDIRECT_PARAMETER,
81
+ :redirect_action => uri(:update),
82
+ }
83
+
84
+ content = render 'redirect_http.erb', :locals => {
85
+ :css => css,
86
+ :script => script
87
+ }
88
+
89
+ if request.get?
90
+ response['Content-Type'] = 'text/plain'
91
+ %Q{
92
+ <div id="#{prefix}">
93
+ #{content}
94
+ </div>}
95
+ else
96
+ response['Content-Type'] = 'text/javascript'
97
+ %Q{
98
+ if(current = document.getElementById("#{prefix}")) {
99
+ RedirectHttp.revert();
100
+ } else {
101
+ var div = document.createElement("div");
102
+ div.id = "#{prefix}";
103
+ div.innerHTML = #{content.to_json};
104
+ document.body.insertBefore(div, document.body.firstChild);
105
+ RedirectHttp.redirect();
106
+ }
107
+ }
108
+ end
109
+ end
110
+
111
+ # Parses HTTP request
112
+ def http
113
+ if request.get?
114
+ render 'http.erb'
115
+ else
116
+ keep_content = request.params['keep_content'] == "true"
117
+
118
+ hash = {}
119
+ parse_http_request(request.params['http'], keep_content).each_pair do |key, value|
120
+ hash[key.to_s] = value
121
+ end
122
+
123
+ # remove extraneous data
124
+ hash.delete('headers')
125
+ hash.delete('version')
126
+
127
+ response['Content-Type'] = "text/plain"
128
+ YAML.dump(hash)
129
+ end
130
+ end
131
+
132
+ protected
133
+
134
+ # helper for rendering... saves specification of
135
+ # :locals => {:prefix => PREFIX}
136
+ def prefix # :nodoc:
137
+ PREFIX
138
+ end
139
+
140
+ def capture_overloaded_parameters # :nodoc:
141
+ # perform the actions of Rack::Request::POST, but capture
142
+ # overloaded parameter names
143
+ env = request.env
144
+ env["rack.request.form_input"] = env["rack.input"]
145
+ unless env["rack.request.form_hash"] = parse_multipart(env)
146
+ env["rack.request.form_vars"] = env["rack.input"].read
147
+ env["rack.request.form_hash"] = Rack::Utils.parse_query(env["rack.request.form_vars"])
148
+ env["rack.input"].rewind if env["rack.input"].respond_to?(:rewind)
149
+ end
150
+ end
151
+
152
+ # helper to parse the request into a request hash for
153
+ # use by a Tap::Mechanize::Submit task
154
+ def parse_request(keep_content=true) # :nodoc:
155
+ if keep_content.kind_of?(String)
156
+ keep_content = keep_content =~ /true/i
157
+ end
158
+
159
+ capture_overloaded_parameters
160
+
161
+ hash = {}
162
+ parse_rack_request(request, keep_content).each_pair do |key, value|
163
+ hash[key.to_s] = value
164
+ end
165
+
166
+ action = hash['params'].delete(REDIRECT_PARAMETER)
167
+
168
+ hash['uri'] = case action
169
+ when /^http/
170
+ # action is an href already
171
+ action
172
+ when /^\//
173
+ # make action relative to host
174
+ action, query = action.split('?', 2)
175
+ uri = URI.parse(hash['headers']['Referer'].to_s)
176
+ uri.path = action
177
+ uri.query = query.to_s.gsub(/\s/, "+")
178
+ uri.to_s
179
+ else
180
+ # make action relative to Referer
181
+ base = File.dirname(hash['headers']['Referer'].to_s)
182
+ File.join(base, action)
183
+ end
184
+
185
+ # remove extraneous data
186
+ hash.delete('headers')
187
+ hash.delete('version')
188
+
189
+ hash
190
+ end
191
+
192
+ end
193
+ end
194
+ end
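The `parse_request` helper above rebuilds the target URI from the captured form action and the `Referer` header in three ways: absolute actions are used as-is, host-relative actions replace the path and query of the Referer, and anything else is joined to the Referer's directory. The sketch below copies those three branches into a standalone function so the behavior can be tried directly. `resolve_action` is a hypothetical name and the example URLs are made up.

```ruby
require 'uri'

# Hypothetical helper mirroring the case statement in parse_request.
def resolve_action(action, referer)
  case action
  when /^http/
    # action is already a full URI
    action
  when /^\//
    # action is relative to the host named in the Referer
    path, query = action.split('?', 2)
    uri = URI.parse(referer)
    uri.path = path
    uri.query = query.to_s.gsub(/\s/, "+")
    uri.to_s
  else
    # action is relative to the directory of the Referer
    File.join(File.dirname(referer), action)
  end
end

referer = 'http://example.com/forms/page.html'
puts resolve_action('http://other.example/submit', referer)
puts resolve_action('/search?q=a b', referer)
puts resolve_action('submit.cgi', referer)
```

Whitespace in the query string is replaced with `+` before it is assigned, since a raw space is not valid in a URI query.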
@@ -0,0 +1,20 @@
1
+ require 'tap/mechanize/request'
2
+
3
+ module Tap
4
+ module Mechanize
5
+ # :startdoc::manifest gets the uri
6
+ #
7
+ # Submits an HTTP request to the specified uri and returns the message
8
+ # body.
9
+ #
10
+ # % tap run -- get http://tap.rubyforge.org --: dump
11
+ #
12
+ class Get < Request
13
+
14
+ # Gets the uri and returns the page content.
15
+ def process(uri)
16
+ super(:uri => uri).content
17
+ end
18
+ end
19
+ end
20
+ end
@@ -0,0 +1,33 @@
1
+ require 'tap/mechanize/agent'
2
+
3
+ module Tap
4
+ module Mechanize
5
+ class Request < Tap::Task
6
+ nest :mechanize, Agent # the mechanize agent
7
+
8
+ # Returns the mechanize agent.
9
+ #--
10
+ # Overrides the default reader to ensure the agent log is set.
11
+ def mechanize
12
+ @mechanize ||= begin
13
+ agent = Agent.new
14
+ agent.log = app.logger
15
+ agent
16
+ end
17
+ end
18
+
19
+ # Submits each request in order and returns the final Page.
20
+ # Returns nil if no requests are specified.
21
+ def process(requests)
22
+ unless requests.kind_of?(Array)
23
+ requests = [requests]
24
+ end
25
+
26
+ requests.inject(nil) do |last_page, request|
27
+ mechanize.fetch(request)
28
+ end
29
+ end
30
+
31
+ end
32
+ end
33
+ end
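`Request#process` accepts either a single request or an array, fetches each in order, and returns only the result of the last fetch (or nil when the array is empty). The sketch below shows that pattern in isolation. `RecordingAgent` is a hypothetical stand-in for `Tap::Mechanize::Agent` that returns a string instead of a Mechanize page, and `process` takes the agent as an argument purely so the example runs without Tap.

```ruby
# Hypothetical stand-in for the agent: remembers each request it was given
# and returns a string in place of a Mechanize page.
class RecordingAgent
  attr_reader :seen

  def initialize
    @seen = []
  end

  def fetch(request)
    @seen << request
    "page for #{request[:uri]}"
  end
end

# Same control flow as Request#process: wrap a lone request in an array,
# fetch each request in order, and return the last result.
def process(agent, requests)
  requests = [requests] unless requests.kind_of?(Array)
  requests.inject(nil) do |_last_page, request|
    agent.fetch(request)
  end
end

agent = RecordingAgent.new
p process(agent, [{ :uri => 'http://example.com/login' },
                  { :uri => 'http://example.com/report' }])
p agent.seen.length
```

Because the accumulator is discarded on each step, earlier pages only matter through whatever state the agent keeps between requests, for example cookies in a real Mechanize agent.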