tap-mechanize 0.5.0
- data/History +41 -0
- data/MIT-LICENSE +19 -0
- data/README +58 -0
- data/lib/tap/mechanize/agent.rb +101 -0
- data/lib/tap/mechanize/capture.rb +194 -0
- data/lib/tap/mechanize/get.rb +20 -0
- data/lib/tap/mechanize/request.rb +33 -0
- data/lib/tap/mechanize/submit.rb +30 -0
- data/lib/tap/mechanize/test.rb +59 -0
- data/lib/tap/mechanize/test/echo_server.rb +20 -0
- data/lib/tap/mechanize/test/mock_server.rb +31 -0
- data/lib/tap/mechanize/utils.rb +296 -0
- data/tap.yml +1 -0
- data/views/tap/mechanize/capture/http.erb +31 -0
- data/views/tap/mechanize/capture/index.erb +11 -0
- data/views/tap/mechanize/capture/redirect.css +28 -0
- data/views/tap/mechanize/capture/redirect.js +184 -0
- data/views/tap/mechanize/capture/redirect_http.erb +15 -0
- data/views/tap/mechanize/capture/test.erb +108 -0
- data/views/tap/mechanize/capture/tutorial.erb +57 -0
- metadata +108 -0
data/History
ADDED
@@ -0,0 +1,41 @@
+== 0.5.0 / 2009-03-23
+
+Total refactoring of the entire namespace. TapHttp is now
+known as TapMechanize. Obviously not backwards compatible.
+
+* updated redirect-http to allow for returns within
+  the existing onSubmit
+* Request now returns Page rather than page content
+  by default
+* refactored HttpTest to Mechanize::Test
+* added Tap::Mechanize::Agent
+
+== 0.4.0 / 2009-03-17
+
+Significant rewrite using an updated version of TapServer
+and Mechanize. TapHttp now serves its own redirection
+commands via TapUbiquity.
+
+== 0.3.1 / 2009-02-19
+
+Improved handling of multipart/form data.
+
+* http_to_yaml returns config file as an attachment.
+
+== 0.3.0 / 2009-02-19
+
+Rework of cgi scripts as tap controllers. Nearly complete
+internal refactoring (this is not a backwards compatible
+release).
+
+* Reworked CGI scripts as Tap Controllers
+* Reworked Dispatch as the Request module
+* Added Get and Submit tasks
+* Reworked HttpTest to allow specification of a
+  Rack application
+
+== 0.2.1 / 2009-02-17
+
+Minor updates to utilize Tap 0.12.0
+
+* Added History
data/MIT-LICENSE
ADDED
@@ -0,0 +1,19 @@
+Copyright (c) 2008-2009, Regents of the University of Colorado.
+
+Permission is hereby granted, free of charge, to any person obtaining a copy of this
+software and associated documentation files (the "Software"), to deal in the Software
+without restriction, including without limitation the rights to use, copy, modify, merge,
+publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons
+to whom the Software is furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all copies or
+substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
+HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
+WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+OTHER DEALINGS IN THE SOFTWARE.
data/README
ADDED
@@ -0,0 +1,58 @@
+= {Tap Mechanize}[http://tap.rubyforge.org/projects/tap-mechanize]
+
+A task library for submitting http requests using Tap[http://tap.rubyforge.org] and Mechanize[http://mechanize.rubyforge.org]
+
+== Description
+
+Tap::Mechanize provides tasks and controllers to automate interaction with
+websites, and in particular with web applications. Tap::Mechanize provides a
+{Ubiquity}[http://labs.mozilla.com/2008/08/introducing-ubiquity/] command to
+redirect and capture HTTP requests as YAML. The resulting request
+configuration files can be edited and resubmitted using a Request task.
+Multi-step actions, file uploads, and actions that require the use of HTTPS
+may all be automated in this way.
+
+* Lighthouse[http://bahuvrihi.lighthouseapp.com/projects/9908-tap-task-application/tickets]
+* Github[http://github.com/bahuvrihi/tap-mechanize/tree/master]
+* {Google Group}[http://groups.google.com/group/ruby-on-tap]
+
+=== Usage
+
+Tap::Mechanize submits HTTP requests using the Tap::Mechanize::Submit task.
+Headers and parameters may be specified, but only a uri is required.
+
+  r = Tap::Mechanize::Submit.new
+  r.process(:uri => 'http://www.google.com')[0, 80]
+  # => "<html><head><meta http-equiv=\"content-type\" content=\"text/html; charset=ISO-8859"
+
+=== Capturing Web Forms
+
+HTTP requests from web forms may be captured and resubmitted using a
+combination of tools. First start a tap server from the command line:
+
+  % tap server
+
+Now open a browser and work through the tutorial: http://localhost:8080/capture/tutorial
+
+The current capture scripts redirect via the onsubmit and onclick events in a
+web page. This works in most cases, but is known to fail in some circumstances
+(e.g. Java Server Pages). A workaround exists using the {Live HTTP
+Headers}[https://addons.mozilla.org/en-US/firefox/addon/3829] addon. Start a
+web server and see http://localhost:8080/capture/http.
+
+== Installation
+
+Tap::Mechanize is available as a gem on
+RubyForge[http://rubyforge.org/projects/tap]. Use:
+
+  % gem install tap-mechanize
+
+Note this package was once called tap-http.
+
+== Info
+
+Copyright (c) 2008-2009, Regents of the University of Colorado.
+Developer:: {Simon Chiang}[http://bahuvrihi.wordpress.com], {Biomolecular Structure Program}[http://biomol.uchsc.edu/], {Hansen Lab}[http://hsc-proteomics.uchsc.edu/hansenlab/]
+Support:: CU Denver School of Medicine Deans Academic Enrichment Fund
+Licence:: {MIT-Style}[link:files/MIT-LICENSE.html]
+
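The README describes captured requests as YAML configuration files that can be edited and resubmitted. The following is a minimal sketch of such a round trip using only Ruby's standard YAML library. The key names (uri, request_method, params) follow the ones read by Agent#fetch and written by Capture in this diff; the URLs and parameter values are hypothetical, and the exact layout of a real capture file may differ.

```ruby
require 'yaml'

# Hypothetical captured requests; the keys mirror those read by Agent#fetch.
requests = [
  { 'uri' => 'http://localhost:8080/capture/say',
    'request_method' => 'POST',
    'params' => { 'words' => 'hello world' } },
  { 'uri' => 'http://localhost:8080/capture/tutorial',
    'request_method' => 'GET',
    'params' => {} }
]

# Serialize the list as a capture file would be written, then load it back
# and edit a parameter before the requests are resubmitted.
config = YAML.dump(requests)
edited = YAML.load(config)
edited[0]['params']['words'] = 'edited words'
```

Because the edit is made on the deserialized copy, the original `requests` array is left untouched; only the edited configuration would be handed to a Request task.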
data/lib/tap/mechanize/agent.rb
ADDED
@@ -0,0 +1,101 @@
+require 'mechanize'
+
+module Tap
+  module Mechanize
+
+    # A Configurable version of the WWW::Mechanize class.
+    class Agent < ::WWW::Mechanize
+      include Configurable
+
+      DEFAULT_ATTRIBUTES = DEFAULT_ATTRIBUTES.dup
+      DEFAULT_ATTRIBUTES[nil] = {:reader => false, :writer => false}
+
+      config :ca_file, nil
+      config :cert, nil
+      config :conditional_requests, true, &c.switch
+      config :follow_meta_refresh, false, &c.switch
+      config :keep_alive, true, &c.switch
+      config :keep_alive_time, 300, &c.integer_or_nil
+      config :key, nil
+      config :open_timeout, nil
+      config :pass, nil
+      config :read_timeout, nil
+      config :redirect_ok, true, &c.switch
+      config :redirection_limit, 20, &c.integer_or_nil
+      config :user_agent, "WWW-Mechanize/0.9.2 (http://rubyforge.org/projects/mechanize/)"
+      config :verify_callback, nil
+      config :watch_for_set, nil
+
+      def initialize(config={})
+        super()
+        initialize_config(config)
+      end
+
+      # Fetches the specified request. Request must be a hash at least
+      # specifying a uri; these are the allowed keys and defaults:
+      #
+      #   uri::            nil, must be specified
+      #   request_method:: get
+      #   params::         {}
+      #   headers::        {}
+      #
+      def fetch(request)
+        request = symbolize(request)
+
+        # note get handles nil request methods (ie '') and has to
+        # reassign uri to url to make mechanize happy
+        case request[:request_method].to_s
+        when /get/i, '' then get(request.merge(:url => request[:uri]))
+        when /post/i then submit(prepare_form(request), nil)
+        when /head/i then head(nil, nil, request)
+        when /put/i then put(nil, nil, request)
+        when /delete/i then delete(nil, nil, request)
+        else raise "unsupported request method: #{request.inspect}"
+        end
+      end
+
+      protected
+
+      # taken from ActiveSupport
+      def symbolize(hash) # :nodoc:
+        hash.inject({}) do |options, (key, value)|
+          options[(key.to_sym rescue key) || key] = value
+          options
+        end
+      end
+
+      # A subclass of Hash used exclusively as the node for a form.
+      # Evidently a search method is necessary.
+      class FormNode < Hash # :nodoc:
+        def search(*args); []; end
+      end
+
+      # prepares a Mechanize::Form for the request. this method is patterned
+      # after what happens in Mechanize#post and hopefully will not be
+      # necessary in the future.
+      def prepare_form(request) # :nodoc:
+        node = FormNode.new
+        node['method'] = request[:request_method]
+        node['action'] = request[:uri]
+        node['enctype'] = request[:headers] ? request[:headers]['Content-Type'] : nil
+
+        form = Form.new(node)
+        request[:params].each_pair do |key, value|
+          case value
+          when IO
+            form.enctype = 'multipart/form-data'
+            upload = Form::FileUpload.new(key.to_s, ::File.basename(value.path))
+            upload.file_data = value.read
+            form.file_uploads << upload
+          when Array
+            fields = value.collect {|val| Form::Field.new(key.to_s, val)}
+            form.fields.concat fields
+          else
+            form.fields << Form::Field.new(key.to_s, value)
+          end
+        end
+        form
+      end
+    end
+  end
+end
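Agent#fetch normalizes the request keys to symbols and then picks an HTTP verb with a case statement over regular expressions. The dispatch order can be sketched without Mechanize as follows. This is an illustration only: `handler_for` is a hypothetical helper that returns the name of the branch rather than performing a request, `symbolize` is rewritten with `each_with_object`, and an ArgumentError is raised here where the original raises a plain RuntimeError.

```ruby
# Convert keys to symbols where possible, leaving other keys as they are.
def symbolize(hash)
  hash.each_with_object({}) do |(key, value), options|
    options[key.respond_to?(:to_sym) ? key.to_sym : key] = value
  end
end

# Return the name of the branch that would handle the request.
# The first matching `when` wins, so the order of the branches matters.
def handler_for(request)
  request = symbolize(request)
  case request[:request_method].to_s
  when /get/i, '' then :get      # a missing method falls back to GET
  when /post/i    then :post
  when /head/i    then :head
  when /put/i     then :put
  when /delete/i  then :delete
  else raise ArgumentError, "unsupported request method: #{request.inspect}"
  end
end
```

Note that the patterns are unanchored, so any method string containing "get" selects the first branch; string and symbol keys are treated alike because of the symbolize step.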
data/lib/tap/mechanize/capture.rb
ADDED
@@ -0,0 +1,194 @@
+require 'tap/controller'
+require 'tap/mechanize/utils'
+require 'tap/ubiquity/utils'
+
+module Tap
+  module Mechanize
+    # :startdoc::controller
+    # :startdoc::ubiquity
+    class Capture < Tap::Controller
+      include Tap::Mechanize::Utils
+      include Tap::Ubiquity::Utils
+
+      PREFIX = '__redirect_http_'
+      REDIRECT_PARAMETER = '__redirect_http_original_action'
+
+      # Brings up the index of captured requests.
+      def index
+        render 'index.erb', :locals => {:captures => persistence.index }
+      end
+
+      def create(id, keep_content=true)
+        persistence.create(id) do |io|
+          io << YAML.dump([parse_request(keep_content)])
+        end
+        download(id)
+      end
+
+      def show(id)
+        response['Content-Type'] = "text/plain"
+        persistence.read(id)
+      end
+
+      def download(id)
+        path = persistence.path(id)
+        filename = id
+        filename += ".yml" if File.extname(id) == ""
+
+        response['Content-Type'] = "text/plain"
+        response['Content-Disposition'] = "attachment; filename=#{filename};"
+        persistence.read(id)
+      end
+
+      def update(id="request", keep_content=true)
+        path = persistence.path(id)
+        requests = File.exists?(path) ? YAML.load_file(path) : []
+        requests << parse_request(keep_content)
+
+        persistence.update(id) do |io|
+          io << YAML.dump(requests)
+        end
+        download(id)
+      end
+
+      def destroy(id)
+        persistence.destroy(id)
+        redirect uri(:index)
+      end
+
+      # Brings up a tutorial teaching how to capture and resubmit forms.
+      def tutorial
+        serve js_injection(:redirect_http) do |link|
+          content = render 'tutorial.erb'
+          content + link
+        end
+      end
+
+      def test
+        render 'test.erb'
+      end
+
+      # Say is the target of the tutorial.
+      def say
+        "<pre>#{request.params['words']}</pre>"
+      end
+
+      # Returns the redirection script.
+      def redirect_http
+        css = render 'redirect.css'
+        script = render 'redirect.js', :locals => {
+          :redirect_parameter => REDIRECT_PARAMETER,
+          :redirect_action => uri(:update),
+        }
+
+        content = render 'redirect_http.erb', :locals => {
+          :css => css,
+          :script => script
+        }
+
+        if request.get?
+          response['Content-Type'] = 'text/plain'
+          %Q{
+<div id="#{prefix}">
+#{content}
+</div>}
+        else
+          response['Content-Type'] = 'text/javascript'
+          %Q{
+if(current = document.getElementById("#{prefix}")) {
+  RedirectHttp.revert();
+} else {
+  var div = document.createElement("div");
+  div.id = "#{prefix}";
+  div.innerHTML = #{content.to_json};
+  document.body.insertBefore(div, document.body.firstChild);
+  RedirectHttp.redirect();
+}
+}
+        end
+      end
+
+      # Parses HTTP request
+      def http
+        if request.get?
+          render 'http.erb'
+        else
+          keep_content = request.params['keep_content'] == "true"
+
+          hash = {}
+          parse_http_request(request.params['http'], keep_content).each_pair do |key, value|
+            hash[key.to_s] = value
+          end
+
+          # remove extraneous data
+          hash.delete('headers')
+          hash.delete('version')
+
+          response['Content-Type'] = "text/plain"
+          YAML.dump(hash)
+        end
+      end
+
+      protected
+
+      # helper for rendering... saves specification of
+      # :locals => {:prefix => PREFIX}
+      def prefix # :nodoc:
+        PREFIX
+      end
+
+      def capture_overloaded_parameters # :nodoc:
+        # perform the actions of Rack::Request::POST, but capture
+        # overloaded parameter names
+        env = request.env
+        env["rack.request.form_input"] = env["rack.input"]
+        unless env["rack.request.form_hash"] = parse_multipart(env)
+          env["rack.request.form_vars"] = env["rack.input"].read
+          env["rack.request.form_hash"] = Rack::Utils.parse_query(env["rack.request.form_vars"])
+          env["rack.input"].rewind if env["rack.input"].respond_to?(:rewind)
+        end
+      end
+
+      # helper to parse the request into a request hash for
+      # use by a Tap::Mechanize::Submit task
+      def parse_request(keep_content=true) # :nodoc:
+        if keep_content.kind_of?(String)
+          keep_content = keep_content =~ /true/i
+        end
+
+        capture_overloaded_parameters
+
+        hash = {}
+        parse_rack_request(request, keep_content).each_pair do |key, value|
+          hash[key.to_s] = value
+        end
+
+        action = hash['params'].delete(REDIRECT_PARAMETER)
+
+        hash['uri'] = case action
+        when /^http/
+          # action is an href already
+          action
+        when /^\//
+          # make action relative to host
+          action, query = action.split('?', 2)
+          uri = URI.parse(hash['headers']['Referer'].to_s)
+          uri.path = action
+          uri.query = query.to_s.gsub(/\s/, "+")
+          uri.to_s
+        else
+          # make action relative to Referer
+          base = File.dirname(hash['headers']['Referer'].to_s)
+          File.join(base, action)
+        end
+
+        # remove extraneous data
+        hash.delete('headers')
+        hash.delete('version')
+
+        hash
+      end
+
+    end
+  end
+end
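Capture#parse_request rebuilds the target uri of a captured form from its original action and the Referer header, with three cases: an absolute action, an action relative to the host, and an action relative to the referring page. A standalone sketch of that resolution using only the standard library is given below. `resolve_action` is a hypothetical name, the example.com URLs in the note are placeholders, and unlike the original this sketch clears the query when the action has none rather than assigning an empty string.

```ruby
require 'uri'

# Resolve a form action against the page that contained the form.
def resolve_action(action, referer)
  case action
  when /^http/
    # the action is already an absolute href
    action
  when /^\//
    # the action is relative to the host: keep scheme and host from the
    # referer, replace the path, and encode whitespace in the query as '+'
    path, query = action.split('?', 2)
    uri = URI.parse(referer)
    uri.path = path
    uri.query = query ? query.gsub(/\s/, '+') : nil
    uri.to_s
  else
    # the action is relative to the directory of the referring page
    File.join(File.dirname(referer), action)
  end
end
```

For instance, with a referer of `http://example.com/forms/edit.html`, an action of `save` resolves into the `forms` directory, while an action of `/submit` replaces the whole path.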
data/lib/tap/mechanize/get.rb
ADDED
@@ -0,0 +1,20 @@
+require 'tap/mechanize/request'
+
+module Tap
+  module Mechanize
+    # :startdoc::manifest gets the uri
+    #
+    # Submits an Http request to the specified uri and returns the message
+    # body.
+    #
+    #   % tap run -- get http://tap.rubyforge.org --: dump
+    #
+    class Get < Request
+
+      # Gets the uri and returns the page content.
+      def process(uri)
+        super(:uri => uri).content
+      end
+    end
+  end
+end
data/lib/tap/mechanize/request.rb
ADDED
@@ -0,0 +1,33 @@
+require 'tap/mechanize/agent'
+
+module Tap
+  module Mechanize
+    class Request < Tap::Task
+      nest :mechanize, Agent  # the mechanize agent
+
+      # Returns the mechanize agent.
+      #--
+      # Overrides the default reader to ensure the agent log is set.
+      def mechanize
+        @mechanize ||= begin
+          agent = Agent.new
+          agent.log = app.logger
+          agent
+        end
+      end
+
+      # Submits each request in order and returns the final Page.
+      # Returns nil if no requests are specified.
+      def process(requests)
+        unless requests.kind_of?(Array)
+          requests = [requests]
+        end
+
+        requests.inject(nil) do |last_page, request|
+          mechanize.fetch(request)
+        end
+      end
+
+    end
+  end
+end
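Request#process accepts either a single request hash or an array of them, submits each in order, and returns whatever the last fetch returned (nil for an empty list). The control flow can be sketched without Tap or Mechanize as below. `process_requests` and `fetcher` are hypothetical stand-ins: the lambda plays the role of the agent's fetch method and simply records the order in which it was called.

```ruby
# Submit each request in order through a fetcher and return the last result.
# `fetcher` is any object that responds to call(request).
def process_requests(requests, fetcher)
  # a single request is wrapped so that both forms are handled alike
  requests = [requests] unless requests.kind_of?(Array)

  # inject(nil) yields nil for an empty list and the final result otherwise
  requests.inject(nil) do |_last_page, request|
    fetcher.call(request)
  end
end

# A stand-in fetcher that records the uris it sees, in order.
seen = []
fetcher = lambda do |request|
  seen << request[:uri]
  "page for #{request[:uri]}"
end
```

Earlier results are discarded on purpose; in a multi-step capture the intermediate requests matter for their side effects (cookies, session state held by the agent) rather than their return values.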