mechanize 0.7.5 → 0.7.6
- data/{CHANGELOG.txt → History.txt} +54 -31
- data/Manifest.txt +4 -1
- data/Rakefile +0 -29
- data/lib/www/mechanize.rb +35 -20
- data/lib/www/mechanize/cookie_jar.rb +97 -6
- data/lib/www/mechanize/form.rb +2 -2
- data/lib/www/mechanize/form/option.rb +1 -1
- data/lib/www/mechanize/form/select_list.rb +4 -0
- data/lib/www/mechanize/unsupported_scheme_error.rb +10 -0
- data/test/htdocs/google.html +0 -0
- data/test/htdocs/tc_links.html +2 -0
- data/test/servlets.rb +1 -0
- data/test/tc_authenticate.rb +25 -0
- data/test/tc_cookie_jar.rb +29 -0
- data/test/tc_errors.rb +1 -1
- data/test/tc_history_added.rb +16 -0
- data/test/tc_links.rb +15 -0
- data/test/tc_mech.rb +10 -0
- data/test/tc_option.rb +17 -0
- data/test/tc_select.rb +9 -0
- metadata +9 -6
@@ -1,31 +1,54 @@
 = Mechanize CHANGELOG
 
-
+=== 0.7.6
+
+* New Features:
+  * Added support for reading Mozilla cookie jars. Thanks Chris Riddoch!
+  * Moving text, password, hidden, int to default. Thanks Tim Harper!
+  * Mechanize#history_added callback for page loads. Thanks Tobi Reif!
+  * Mechanize#scheme_handlers callbacks for handling unsupported schemes on
+    links.
+
+* Bug Fixes:
+  * Ignoring scheme case
+    http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=470642
+  * Not encoding tildes in uris. Thanks Bruno. [#19380]
+  * Resetting request bodys when retrying form posts. Thanks Bruno. [#19379]
+  * Throwing away keep alive connections on EPIPE and ECONNRESET.
+  * Duplicating request headers when retrying a 401. Thanks Hiroshi Ichikawa.
+  * Simulating an EOF error when a response length is bad. Thanks Tobias Gruetzmacher.
+    http://rubyforge.org/tracker/index.php?func=detail&aid=19178&group_id=1453&atid=5711
+  * Defaulting option tags to the inner text.
+    http://rubyforge.org/tracker/index.php?func=detail&aid=19976&group_id=1453&atid=5709
+  * Supporting blank strings for option values.
+    http://rubyforge.org/tracker/index.php?func=detail&aid=19975&group_id=1453&atid=5709
+
+=== 0.7.5
 
 * Fixed a bug when fetching files and not pages. Thanks Mat Schaffer!
 
-
+=== 0.7.4
 
 * doh!
 
-
+=== 0.7.3
 
 * Pages are now yielded to a blocks given to WWW::Mechanize#get
 * WWW::Mechanize#get now takes hash arguments for uri parameters.
 * WWW::Mechanize#post takes an IO object as a parameter and posts correctly.
 * Fixing a strange zlib inflate problem on windows
 
-
+=== 0.7.2
 
 * Handling gzipped responses with no Content-Length header
 
-
+=== 0.7.1
 
 * Added iPhone to the user agent aliases. [#17572]
 * Fixed a bug with EOF errors in net/http. [#17570]
 * Handling 0 length gzipped responses. [#17471]
 
-
+=== 0.7.0
 
 * Removed Ruby 1.8.2 support
 * Changed parser to lazily parse links
@@ -33,7 +56,7 @@
 * Adding verify_callback for SSL requests. Thanks Mike Dalessio!
 * Fixed a bug with Accept-Language header. Thanks Bill Siggelkow.
 
-
+=== 0.6.11
 
 * Detecting single quotes in meta redirects.
 * Adding pretty inspect for ruby versions > 1.8.4 (Thanks Joel Kociolek)
@@ -44,7 +67,7 @@
 * Added a FAQ
   http://rubyforge.org/tracker/?func=detail&aid=15772&group_id=1453&atid=5709
 
-
+=== 0.6.10
 
 * Made digest authentication work with POSTs.
 * Made sure page was HTML before following meta refreshes.
@@ -63,7 +86,7 @@
 * Aliasing inspect to pretty_inspect. Thanks Eric Promislow.
   http://rubyforge.org/pipermail/mechanize-users/2007-July/000157.html
 
-
+=== 0.6.9
 
 * Updating UTF-8 support for urls
 * Adding AREA tags to the links list.
@@ -75,7 +98,7 @@
 * Added Digest Authentication support. Thanks to Ryan Davis and Eric Hodel,
   you get a gold star!
 
-
+=== 0.6.8
 
 * Keep alive can be shut off now with WWW::Mechanize#keep_alive
 * Conditional requests can be shut off with WWW::Mechanize#conditional_requests
@@ -86,12 +109,12 @@
 * Updating compatability with hpricot
 * Added more unit tests
 
-
+=== 0.6.7
 
 * Fixed a bug with keep-alive requests
 * [#9549] fixed problem with cookie paths
 
-
+=== 0.6.6
 
 * Removing hpricot overrides
 * Fixed a bug where alt text can be nil. Thanks Yannick!
@@ -101,7 +124,7 @@
 * [#9434] Fixed bug where html entities weren't decoded
 * [#9150] Updated mechanize history to deal with redirects
 
-
+=== 0.6.5
 
 * Copying headers to a hash to prevent memory leaks
 * Speeding up page parsing
@@ -116,7 +139,7 @@
   http://rubyforge.org/tracker/?func=detail&aid=7563&group_id=1453&atid=5709
 * Added MSIE 7.0 user agent string
 
-
+=== 0.6.4
 
 * Adding the "redirect_ok" method to Mechanize to stop mechanize from
   following redirects.
@@ -133,7 +156,7 @@
 * Fixed bug [#6548]. Input type of 'button' was not being added as a button.
 * Fixed bug [#7139]. REXML parser calls hpricot parser by accident
 
-
+=== 0.6.3
 
 * Added keys and values methods to Form
 * Added has_value? to Form
@@ -148,7 +171,7 @@
 * Fixed a bug where '#' symbols are encoded
   http://rubyforge.org/forum/message.php?msg_id=14747
 
-
+=== 0.6.2
 
 * Added a yield to Page#form so that dealing with forms can be more DSL like.
 * Added the parsed page to the ResponseCodeError so that the parsed results
@@ -156,7 +179,7 @@
   http://rubyforge.org/pipermail/mechanize-users/2006-September/000007.html
 * Updated documentation (Thanks to Paul Smith)
 
-
+=== 0.6.1
 
 * Added a method to Form called "submit". Now forms can be submitted by
   calling a method on the form.
@@ -174,7 +197,7 @@
 * Fixed a bug with loading text in to links.
   http://rubyforge.org/pipermail/mechanize-users/2006-September/000000.html
 
-
+=== 0.6.0
 
 * Changed main parser to use hpricot
 * Made WWW::Mechanize::Page class searchable like hpricot
@@ -186,7 +209,7 @@
 * Removed REXML helper methods since the main parser is now hpricot
 * Overhauled cookie parser to use WEBrick::Cookie
 
-
+=== 0.5.4
 
 * Added WWW::Mechanize#trasact for saving history state between in a
   transaction. See the EXAMPLES file. Thanks Johan Kiviniemi.
@@ -201,7 +224,7 @@
 * Fixed a bug with saving files on windows
 * Fixed a bug with the filename being set in forms
 
-
+=== 0.5.3
 
 * Mechanize#click will now act on the first element of an array. So if an
   array of links is passed to WWW::Mechanize#click, the first link is clicked.
@@ -219,7 +242,7 @@
 * Updated log4r support for a speed increase. Thanks Yinon Bentor
 * Added inspect methods and pretty printing
 
-
+=== 0.5.2
 
 * Fixed a bug with input names that are nil
 * Added a warning when using attr_finder because attr_finder will be deprecated
@@ -236,12 +259,12 @@
   WWW::Mechanize::Form#set_fields. Which can be used like so:
   form.set_fields( :foo => 'bar', :name => 'Aaron' )
 
-
+=== 0.5.1
 
 * Fixed bug with file uploads
 * Added performance tweaks to the cookie class
 
-
+=== 0.5.0
 
 * Added pluggable parsers. (Thanks to Eric Kolve for the idea)
 * Changed namespace so all classes are under WWW::Mechanize.
@@ -256,7 +279,7 @@
 * Removed support for body filters in favor of pluggable parsers.
 * Fixed cookie bug adding a '/' when the url is missing one (Thanks Nick Dainty)
 
-
+=== 0.4.7
 
 * Fixed bug with no action in forms. Thanks to Adam Wiggins
 * Setting a default user-agent string
@@ -265,7 +288,7 @@
   (thanks to Gregory Brown)
 * Added WWW::Mechanize#get_file for fetching non text/html files
 
-
+=== 0.4.6
 
 * Added support for proxies
 * Added a uri field to WWW::Link
@@ -276,7 +299,7 @@
   allows syntax as such: form.fields.name('q').value = 'xyz'
   Before it was like this: form.fields.name('q').first.value = 'xyz'
 
-
+=== 0.4.5
 
 * Added support for multiple values of the same name
 * Updated build_query_string to take an array of arrays (Thanks Michal Janeczek)
@@ -286,13 +309,13 @@
 * Fixed a bug with empty select lists
 * Fixing a problem with cookies not handling no spaces after semicolons
 
-
+=== 0.4.4
 
 * Fixed error in method signature, basic_authetication is now basic_auth
 * Fixed bug with encoding names in file uploads (Big thanks to Alex Young)
 * Added options to the select list
 
-
+=== 0.4.3
 
 * Added syntactic sugar for finding things
 * Fixed bug with HttpOnly option in cookies
@@ -300,21 +323,21 @@
 * Defaulted dropdown lists to the first element
 * Added unit tests
 
-
+=== 0.4.2
 
 * Added support for iframes
 * Made mechanize dependant on ruby-web rather than narf
 * Added unit tests
 * Fixed a bunch of warnings
 
-
+=== 0.4.1
 
 * Added support for file uploading
 * Added support for frames (Thanks Gabriel[mailto:leerbag@googlemail.com])
 * Added more unit tests
 * Fixed some bugs
 
-
+=== 0.4.0
 
 * Added more unit tests
 * Added a cookie jar with better cookie support, included expiration of cookies
data/Manifest.txt
CHANGED
@@ -1,7 +1,7 @@
-CHANGELOG.txt
 EXAMPLES.txt
 FAQ.txt
 GUIDE.txt
+History.txt
 LICENSE.txt
 Manifest.txt
 NOTES.txt
@@ -41,6 +41,7 @@ lib/www/mechanize/page/link.rb
 lib/www/mechanize/page/meta.rb
 lib/www/mechanize/pluggable_parsers.rb
 lib/www/mechanize/response_code_error.rb
+lib/www/mechanize/unsupported_scheme_error.rb
 test/data/htpasswd
 test/data/server.crt
 test/data/server.csr
@@ -105,6 +106,7 @@ test/tc_forms.rb
 test/tc_frames.rb
 test/tc_gzipping.rb
 test/tc_history.rb
+test/tc_history_added.rb
 test/tc_html_unscape_forms.rb
 test/tc_if_modified_since.rb
 test/tc_keep_alive.rb
@@ -113,6 +115,7 @@ test/tc_mech.rb
 test/tc_mechanize_file.rb
 test/tc_multi_select.rb
 test/tc_no_attributes.rb
+test/tc_option.rb
 test/tc_page.rb
 test/tc_pluggable_parser.rb
 test/tc_post_form.rb
data/Rakefile
CHANGED
@@ -7,31 +7,6 @@ require 'mechanize'
 class MechHoe < Hoe
   def define_tasks
     super
-
-    desc "Tag code"
-    task :tag do |p|
-      abort "Must supply VERSION=x.y.z" unless ENV['VERSION']
-      v = ENV['VERSION'].gsub(/\./, '_')
-
-      rf = RubyForge.new
-      user = rf.userconfig['username']
-
-      baseurl = "svn+ssh://#{user}@rubyforge.org//var/svn/#{name}"
-      sh "svn cp -m 'tagged REL-#{v}' . #{ baseurl }/tags/REL-#{ v }"
-    end
-
-    desc "Branch code"
-    Rake::Task.define_task("branch") do |p|
-      abort "Must supply VERSION=x.y.z" unless ENV['VERSION']
-      v = ENV['VERSION'].split(/\./)[0..1].join('_')
-
-      rf = RubyForge.new
-      user = rf.userconfig['username']
-
-      baseurl = "svn+ssh://#{user}@rubyforge.org/var/svn/#{name}"
-      sh "svn cp -m'branched #{v}' #{baseurl}/trunk #{baseurl}/branches/RB-#{v}"
-    end
-
     desc "Update SSL Certificate"
     Rake::Task.define_task('ssl_cert') do |p|
       sh "openssl genrsa -des3 -out server.key 1024"
@@ -51,10 +26,6 @@ MechHoe.new('mechanize', WWW::Mechanize::VERSION) do |p|
   p.author = 'Aaron Patterson'
   p.email = 'aaronp@rubyforge.org'
   p.summary = "Mechanize provides automated web-browsing"
-  p.description = p.paragraphs_of('README.txt', 3).join("\n\n")
-  p.url = p.paragraphs_of('README.txt', 1).first.strip
-  p.changes = p.paragraphs_of('CHANGELOG.txt', 0..2).join("\n\n")
   p.extra_deps = [['hpricot', '>= 0.5.0']]
 end
 
-
data/lib/www/mechanize.rb
CHANGED
@@ -8,6 +8,7 @@ require 'digest/md5'
 
 require 'www/mechanize/content_type_error'
 require 'www/mechanize/response_code_error'
+require 'www/mechanize/unsupported_scheme_error'
 require 'www/mechanize/cookie'
 require 'www/mechanize/cookie_jar'
 require 'www/mechanize/history'
@@ -38,7 +39,7 @@ module WWW
   class Mechanize
     ##
     # The version of Mechanize you are using.
-    VERSION = '0.7.
+    VERSION = '0.7.6'
 
     ##
     # User Agent aliases
@@ -70,6 +71,8 @@ module WWW
     attr_accessor :conditional_requests
     attr_accessor :follow_meta_refresh
     attr_accessor :verify_callback
+    attr_accessor :history_added
+    attr_accessor :scheme_handlers
 
     attr_reader :history
     attr_reader :pluggable_parser
@@ -87,6 +90,7 @@ module WWW
       @read_timeout = nil
       @user_agent = AGENT_ALIASES['Mechanize']
       @watch_for_set = nil
+      @history_added = nil
       @ca_file = nil # OpenSSL server certificate file
 
       # callback for OpenSSL errors while verifying the server certificate
@@ -123,10 +127,19 @@ module WWW
       @connection_cache = {}
       @keep_alive_time = 300
       @keep_alive = true
+
+      @scheme_handlers = Hash.new { |h,k|
+        h[k] = lambda { |link, page|
+          raise UnsupportedSchemeError.new(k)
+        }
+      }
+      @scheme_handlers['http'] = lambda { |link, page| link }
+      @scheme_handlers['https'] = @scheme_handlers['http']
+      @scheme_handlers['relative'] = @scheme_handlers['http']
 
       yield self if block_given?
     end
-
+
     def max_history=(length); @history.max_size = length; end
     def max_history; @history.max_size; end
 
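The handler table added in this hunk can be tried in isolation. The following is a minimal sketch, not the library's code: `UnsupportedSchemeError` here is a local stand-in for `WWW::Mechanize::UnsupportedSchemeError`, and `build_scheme_handlers` and `resolve` are hypothetical helper names chosen for illustration.

```ruby
require 'uri'

# Local stand-in for WWW::Mechanize::UnsupportedSchemeError (assumption).
class UnsupportedSchemeError < RuntimeError
  attr_reader :scheme

  def initialize(scheme)
    @scheme = scheme
    super("unsupported scheme: #{scheme}")
  end
end

# Unknown schemes lazily get a handler that raises; http, https and
# relative links share one handler that returns the link unchanged.
def build_scheme_handlers
  handlers = Hash.new do |hash, scheme|
    hash[scheme] = lambda { |link, page| raise UnsupportedSchemeError.new(scheme) }
  end
  handlers['http'] = lambda { |link, page| link }
  handlers['https'] = handlers['http']
  handlers['relative'] = handlers['http']
  handlers
end

# Choose the handler by lower-cased scheme, or 'relative' for relative URIs.
def resolve(handlers, url, page = nil)
  key = url.relative? ? 'relative' : url.scheme.downcase
  handlers[key].call(url, page)
end
```

Assigning a lambda to a key such as `handlers['ftp']` replaces the raising default for that scheme, which is how a caller would opt in to schemes the agent does not fetch itself.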
@@ -403,7 +416,7 @@ module WWW
 
     def to_absolute_uri(url, cur_page=current_page())
       unless url.is_a? URI
-        url = url.to_s.strip.gsub(/[^#{0.chr}-#{
+        url = url.to_s.strip.gsub(/[^#{0.chr}-#{126.chr}]/) { |match|
           sprintf('%%%X', match.unpack($KCODE == 'UTF8' ? 'U' : 'c')[0])
         }
 
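The added line escapes only characters outside the range 0 to 126, so a tilde (code 126) is left alone. A rough byte-level sketch of that idea for current Ruby follows; it is an assumption-laden illustration, since `$KCODE` no longer exists and the helper name `escape_non_ascii` is hypothetical. The removed line is cut off in this listing, so only the new behaviour is mirrored.

```ruby
# Percent-escape every byte above 126; ASCII bytes, including "~", pass through.
def escape_non_ascii(url)
  url.to_s.strip.bytes.map { |byte|
    byte > 126 ? format('%%%X', byte) : byte.chr
  }.join
end
```

Unlike the diff's `unpack('U')` branch, this works per byte, so a multi-byte UTF-8 character becomes several `%XX` groups.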
@@ -418,6 +431,7 @@ module WWW
         )
       end
 
+      url = @scheme_handlers[url.relative? ? 'relative' : url.scheme.downcase].call(url, cur_page)
       url.path = '/' if url.path.length == 0
 
       # construct an absolute uri
@@ -456,7 +470,7 @@ module WWW
 
     # Creates a new request object based on the scheme and type
     def fetch_request(uri, type = :get)
-      raise "unsupported scheme" unless ['http', 'https'].include?(uri.scheme)
+      raise "unsupported scheme: #{uri.scheme}" unless ['http', 'https'].include?(uri.scheme.downcase)
       if type == :get
         Net::HTTP::Get.new(uri.request_uri)
       else
@@ -466,7 +480,7 @@ module WWW
 
     # uri is an absolute URI
     def fetch_page(uri, request, cur_page=current_page(), request_data=[])
-      raise "unsupported scheme" unless ['http', 'https'].include?(uri.scheme)
+      raise "unsupported scheme: #{uri.scheme}" unless ['http', 'https'].include?(uri.scheme.downcase)
 
       log.info("#{ request.class }: #{ request.path }") if log
 
@@ -546,6 +560,8 @@ module WWW
             body.write(part)
             log.debug("Read #{total} bytes") if log
           }
+          # Net::HTTP ignores EOFError if Content-length is given, so we emulate it here.
+          raise EOFError if response.content_length() && response.content_length() != total
           body.rewind
 
           response.each_header { |k,v|
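The length check added here can be shown without a network connection. This is a minimal sketch under stated assumptions: `read_body` is a hypothetical helper, a `StringIO` stands in for the response stream, and `declared_length` plays the role of `response.content_length()`.

```ruby
require 'stringio'

# Read a stream in chunks and raise EOFError when the byte count differs
# from the declared length. A nil length means "unknown", so nothing is checked.
def read_body(io, declared_length, chunk_size = 4)
  body = StringIO.new
  total = 0
  while (part = io.read(chunk_size))
    total += part.bytesize
    body.write(part)
  end
  raise EOFError if declared_length && declared_length != total
  body.rewind
  body
end
```

Raising `EOFError` here lets the same rescue path that handles dropped connections also handle truncated bodies.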
@@ -598,9 +614,10 @@ module WWW
           }
 
         }
-      rescue EOFError
+      rescue EOFError, Errno::ECONNRESET, Errno::EPIPE
         log.error("Rescuing EOF error") if log
         http_obj.finish
+        request.body = nil
         http_obj.start
         retry
       end
@@ -652,30 +669,28 @@ module WWW
         else
           @auth_hash[uri.host] = :basic
         end
-
-
-
-
-
+        # Copy the request headers for the second attempt
+        req = fetch_request(uri, request.method.downcase.to_sym)
+        request.each_header do |k,v|
+          req[k] = v
+        end
+        return fetch_page(uri, req, cur_page, request_data)
       end
 
       raise ResponseCodeError.new(page), "Unhandled response", caller
     end
 
     def self.build_query_string(parameters)
-
-
-
-
-
-          WEBrick::HTTPUtils.escape_form(v.to_s)].join("=")
-      }
-
-      vals.join("&")
+      parameters.map { |k,v|
+        k &&
+          [WEBrick::HTTPUtils.escape_form(k.to_s),
+           WEBrick::HTTPUtils.escape_form(v.to_s)].join("=")
+      }.compact.join('&')
     end
 
     def add_to_history(page)
       @history.push(page, to_absolute_uri(page.uri))
+      history_added.call(page) if history_added
     end
   end
 end
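The rewritten `build_query_string` maps each pair to `key=value`, drops pairs whose key is nil, and joins the rest. A standalone sketch of the same shape is below. It swaps in `URI.encode_www_form_component` for `WEBrick::HTTPUtils.escape_form`, because WEBrick is not bundled with recent Ruby releases; that substitution is an assumption of this example, and the two escapers may differ on some characters.

```ruby
require 'uri'

# Build "k=v&k2=v2" from a hash or an array of pairs, skipping nil keys.
def build_query_string(parameters)
  parameters.map { |k, v|
    k &&
      [URI.encode_www_form_component(k.to_s),
       URI.encode_www_form_component(v.to_s)].join('=')
  }.compact.join('&')
end
```

For instance, `build_query_string([[:q, 'hello'], [nil, 'x']])` yields a string containing only the `q` pair.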
data/lib/www/mechanize/cookie_jar.rb
CHANGED
@@ -60,16 +60,40 @@ module WWW
         cookies
       end
 
-      # Save the cookie jar to a file
-
+      # Save the cookie jar to a file in the format specified.
+      #
+      # Available formats:
+      # :yaml <- YAML structure
+      # :cookiestxt <- Mozilla's cookies.txt format
+      def save_as(file, format = :yaml)
         ::File.open(file, "w") { |f|
-
+          case format
+          when :yaml:
+            YAML::dump(@jar, f)
+          when :cookiestxt:
+            dump_cookiestxt(f)
+          else
+            raise "Unknown cookie jar file format"
+          end
         }
       end
 
-      # Load cookie jar from a file
-
-
+      # Load cookie jar from a file in the format specified.
+      #
+      # Available formats:
+      # :yaml <- YAML structure.
+      # :cookiestxt <- Mozilla's cookies.txt format
+      def load(file, format = :yaml)
+        @jar = ::File.open(file) { |f|
+          case format
+          when :yaml:
+            YAML::load(f)
+          when :cookiestxt:
+            load_cookiestxt(f)
+          else
+            raise "Unknown cookie jar file format"
+          end
+        }
       end
 
       # Clear the cookie jar
@@ -77,6 +101,73 @@ module WWW
         @jar = {}
       end
 
+
+      # Read cookies from Mozilla cookies.txt-style IO stream
+      def load_cookiestxt(io)
+        now = Time.now
+        fakeuri = Struct.new(:host) # add_cookie wants something resembling a URI.
+
+        io.each_line do |line|
+          line.chomp!
+          line.gsub!(/#.+/, '')
+          fields = line.split("\t")
+
+          next if fields.length != 7
+
+          expires_seconds = fields[4].to_i
+          begin
+            expires = Time.at(expires_seconds)
+          rescue
+            next
+            # Just in case we ever decide to support DateTime...
+            # expires = DateTime.new(1970,1,1) + ((expires_seconds + 1) / (60*60*24.0))
+          end
+          next if expires < now
+
+          c = WWW::Mechanize::Cookie.new(fields[5], fields[6])
+          c.domain = fields[0]
+          # Field 1 indicates whether the cookie can be read by other machines at the same domain.
+          # This is computed by the cookie implementation, based on the domain value.
+          c.path = fields[2] # Path for which the cookie is relevant
+          c.secure = (fields[3] == "TRUE") # Requires a secure connection
+          c.expires = expires # Time the cookie expires.
+          c.version = 0 # Conforms to Netscape cookie spec.
+
+          add(fakeuri.new(c.domain), c)
+        end
+        @jar
+      end
+
+      # Write cookies to Mozilla cookies.txt-style IO stream
+      def dump_cookiestxt(io)
+        @jar.each_pair do |domain, cookies|
+          cookies.each_pair do |name, cookie|
+            fields = []
+            fields[0] = cookie.domain
+
+            if cookie.domain =~ /^\./
+              fields[1] = "TRUE"
+            else
+              fields[1] = "FALSE"
+            end
+
+            fields[2] = cookie.path
+
+            if cookie.secure == true
+              fields[3] = "TRUE"
+            else
+              fields[3] = "FALSE"
+            end
+
+            fields[4] = cookie.expires.to_i.to_s
+
+            fields[5] = cookie.name
+            fields[6] = cookie.value
+            io.puts(fields.join("\t"))
+          end
+        end
+      end
+
       private
       # Remove expired cookies
       def cleanup
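The cookies.txt layout used by `load_cookiestxt` and `dump_cookiestxt` is seven tab-separated fields per line: domain, domain-wide flag, path, secure flag, expiry as epoch seconds, name, value. The sketch below round-trips that layout without the library. `SketchCookie` and the free-standing method signatures are assumptions made for illustration; the real methods operate on the jar's `@jar` hash and `WWW::Mechanize::Cookie`.

```ruby
require 'stringio'

# Hypothetical minimal cookie record, used only by this example.
SketchCookie = Struct.new(:domain, :path, :secure, :expires, :name, :value)

# Write one tab-separated line per cookie.
def dump_cookiestxt(cookies, io)
  cookies.each do |c|
    io.puts([c.domain,
             c.domain.start_with?('.') ? 'TRUE' : 'FALSE',
             c.path,
             c.secure ? 'TRUE' : 'FALSE',
             c.expires.to_i.to_s,
             c.name,
             c.value].join("\t"))
  end
end

# Parse lines back, skipping comments, malformed lines and expired cookies.
def load_cookiestxt(io, now = Time.now)
  cookies = []
  io.each_line do |line|
    fields = line.chomp.gsub(/#.+/, '').split("\t")
    next if fields.length != 7
    expires = Time.at(fields[4].to_i)
    next if expires < now
    cookies << SketchCookie.new(fields[0], fields[2], fields[3] == 'TRUE',
                                expires, fields[5], fields[6])
  end
  cookies
end
```

As in the diff, anything after a `#` is treated as a comment, so a cookie value containing `#` would not survive the round trip in this sketch.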
data/lib/www/mechanize/form.rb
CHANGED
@@ -211,8 +211,6 @@ module WWW
           name = node['name']
           next if name.nil? && !(type == 'submit' || type =='button')
           case type
-          when 'text', 'password', 'hidden', 'int'
-            @fields << Field.new(node['name'], node['value'] || '')
           when 'radio'
             @radiobuttons << RadioButton.new(node['name'], node['value'], node.has_attribute?('checked'), self)
           when 'checkbox'
@@ -225,6 +223,8 @@ module WWW
             @buttons << Button.new(node['name'], node['value'])
           when 'image'
             @buttons << ImageButton.new(node['name'], node['value'])
+          else
+            @fields << Field.new(node['name'], node['value'] || '')
           end
         end
 
data/lib/www/mechanize/form/option.rb
CHANGED
@@ -14,7 +14,7 @@ module WWW
 
         def initialize(node, select_list)
           @text = node.inner_text
-          @value = Mechanize.html_unescape(node['value'])
+          @value = Mechanize.html_unescape(node['value'] || node.inner_text)
           @selected = node.has_attribute? 'selected'
           @select_list = select_list # The select list this option belongs to
         end
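The `node['value'] || node.inner_text` fallback relies on Ruby treating an empty string as truthy: a missing `value` attribute (nil) falls back to the inner text, while an explicit `value=""` is kept. A small sketch of that behaviour follows; `FakeOptionNode` and `option_value` are hypothetical names, and the `html_unescape` step from the diff is left out.

```ruby
# Hypothetical stand-in for a parsed <option> node.
class FakeOptionNode
  attr_reader :inner_text

  def initialize(inner_text, attributes = {})
    @inner_text = inner_text
    @attributes = attributes
  end

  # Attribute lookup, returning nil when the attribute is absent.
  def [](name)
    @attributes[name]
  end
end

# nil falls back to the inner text; '' is truthy in Ruby and is kept.
def option_value(node)
  node['value'] || node.inner_text
end
```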
data/test/htdocs/google.html
CHANGED
File without changes
data/test/htdocs/tc_links.html
CHANGED
data/test/servlets.rb
CHANGED
data/test/tc_authenticate.rb
CHANGED
@@ -11,6 +11,31 @@ class BasicAuthTest < Test::Unit::TestCase
     assert_equal('You are authenticated', page.body)
   end
 
+  def test_post_auth_success
+    class << @agent
+      alias :old_fetch_request :fetch_request
+      attr_accessor :requests
+      def fetch_request(*args)
+        @requests ||= []
+        @requests << old_fetch_request(*args)
+        @requests.last
+      end
+    end
+    @agent.basic_auth('user', 'pass')
+    page = @agent.post("http://localhost/basic_auth")
+    assert_equal('You are authenticated', page.body)
+    assert_equal(2, @agent.requests.length)
+    r1 = @agent.requests[0]
+    r2 = @agent.requests[1]
+    assert r1['Content-Type']
+    assert r2['Content-Type']
+    assert_equal(r1['Content-Type'], r2['Content-Type'])
+
+    assert r1['Content-Length']
+    assert r2['Content-Length']
+    assert_equal(r1['Content-Length'], r2['Content-Length'])
+  end
+
   def test_auth_bad_user_pass
     @agent.basic_auth('aaron', 'aaron')
     begin
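This test checks that the second request after a 401 carries the same Content-Type and Content-Length as the first. The header copy it exercises can be sketched with plain `Net::HTTP` request objects. `duplicate_request` is a hypothetical helper; the library does the copy inline, via `fetch_request` and `each_header`.

```ruby
require 'net/http'

# Build a fresh request of the same kind and copy every header onto it.
def duplicate_request(request)
  klass = request.method == 'GET' ? Net::HTTP::Get : Net::HTTP::Post
  copy = klass.new(request.path)
  request.each_header { |key, value| copy[key] = value }
  copy
end
```

A fresh object is built rather than reusing the first request, which matches the diff's approach of calling `fetch_request` again before the retry.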
data/test/tc_cookie_jar.rb
CHANGED
@@ -282,4 +282,33 @@ class CookieJarTest < Test::Unit::TestCase
         :expires => Time.now - (10 * 86400))))
     assert_equal(0, jar.cookies(url).length)
   end
+
+
+  def test_save_and_read_cookiestxt
+    values = { :name => 'Foo',
+               :value => 'Bar',
+               :path => '/',
+               :expires => Time.now + (10 * 86400),
+               :domain => 'rubyforge.org'
+             }
+    url = URI.parse('http://rubyforge.org/')
+
+    jar = WWW::Mechanize::CookieJar.new
+    assert_equal(0, jar.cookies(url).length)
+
+    # Add one cookie with an expiration date in the future
+    cookie = cookie_from_hash(values)
+    jar.add(url, cookie)
+    jar.add(url, cookie_from_hash(values.merge( :name => 'Baz' )))
+    assert_equal(2, jar.cookies(url).length)
+
+    jar.save_as("cookies.txt", :cookiestxt)
+    jar.clear!
+    assert_equal(0, jar.cookies(url).length)
+
+    jar.load("cookies.txt", :cookiestxt)
+    assert_equal(2, jar.cookies(url).length)
+
+    FileUtils.rm("cookies.txt")
+  end
 end
data/test/tc_errors.rb
CHANGED
data/test/tc_history_added.rb
ADDED
@@ -0,0 +1,16 @@
+require File.dirname(__FILE__) + "/helper"
+
+class HistoryAddedTest < Test::Unit::TestCase
+  def setup
+    @agent = WWW::Mechanize.new
+  end
+
+  def test_history_added_gets_called
+    onload = 0
+    @agent.history_added = lambda { |page|
+      onload += 1
+    }
+    page = @agent.get('http://localhost/tc_blank_form.html')
+    assert_equal(1, onload)
+  end
+end
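The `history_added` hook tested here is an optional callable that runs each time a page is pushed onto the history. A minimal sketch of the pattern is below; `SketchAgent` is a hypothetical class, and its history is a plain array rather than the library's `History` object.

```ruby
# Hypothetical agent showing an optional callback fired on every history push.
class SketchAgent
  attr_accessor :history_added
  attr_reader :history

  def initialize
    @history = []
    @history_added = nil # no callback by default
  end

  def add_to_history(page)
    @history.push(page)
    history_added.call(page) if history_added
  end
end
```

Because the callback defaults to nil and is guarded with `if history_added`, agents that never set it behave as before.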
data/test/tc_links.rb
CHANGED
@@ -5,6 +5,21 @@ class LinksMechTest < Test::Unit::TestCase
     @agent = WWW::Mechanize.new
   end
 
+  def test_unsupported_link_types
+    page = @agent.get("http://google.com/tc_links.html")
+    link = page.links.text('javascript link').first
+    assert_raise(WWW::Mechanize::UnsupportedSchemeError) {
+      link.click
+    }
+
+    @agent.scheme_handlers['javascript'] = lambda { |link, page|
+      URI.parse('http://localhost/tc_links.html')
+    }
+    assert_nothing_raised {
+      link.click
+    }
+  end
+
   def test_base
     page = @agent.get("http://google.com/tc_base_link.html")
     page = page.links.first.click
data/test/tc_mech.rb
CHANGED
@@ -5,11 +5,21 @@ class TestMechMethods < Test::Unit::TestCase
     @agent = WWW::Mechanize.new
   end
 
+  def test_get_with_tilde
+    page = @agent.get('http://localhost/?foo=~2')
+    assert_equal('http://localhost/?foo=~2', page.uri.to_s)
+  end
+
   def test_get_with_params
     page = @agent.get('http://localhost/', { :q => 'hello' })
     assert_equal('http://localhost/?q=hello', page.uri.to_s)
   end
 
+  def test_get_with_upper_http
+    page = @agent.get('HTTP://localhost/', { :q => 'hello' })
+    assert_equal('HTTP://localhost/?q=hello', page.uri.to_s)
+  end
+
   def test_get_with_referer
     class << @agent
       attr_reader :request
data/test/tc_option.rb
ADDED
@@ -0,0 +1,17 @@
+require File.dirname(__FILE__) + "/helper"
+
+class OptionTest < Test::Unit::TestCase
+  class FakeAttribute < Hash
+    attr_reader :inner_text
+    def initialize(inner_text)
+      @inner_text = inner_text
+    end
+    alias :has_attribute? :has_key?
+  end
+
+  def test_option_missing_value
+    attribute = FakeAttribute.new('blah')
+    option = WWW::Mechanize::Form::Option.new(attribute, nil)
+    assert_equal('blah', option.value)
+  end
+end
data/test/tc_select.rb
CHANGED
@@ -73,6 +73,15 @@ class SelectTest < Test::Unit::TestCase
     assert_equal(1, page.links.text('list:1').length)
   end
 
+  def test_select_with_empty_value
+    list = @form.fields.name('list').first
+    list.options.last.instance_variable_set(:@value, '')
+    list.options.last.tick
+    page = @agent.submit(@form)
+    assert_equal(1, page.links.length)
+    assert_equal(1, page.links.text('list:').length)
+  end
+
   def test_select_with_click
     @form.list = ['1', 'Aaron']
     @form.fields.name('list').first.options[3].tick
metadata
CHANGED
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: mechanize
 version: !ruby/object:Gem::Version
-  version: 0.7.
+  version: 0.7.6
 platform: ruby
 authors:
 - Aaron Patterson
@@ -9,7 +9,7 @@ autorequire:
 bindir: bin
 cert_chain: []
 
-date: 2008-
+date: 2008-05-11 00:00:00 -07:00
 default_executable:
 dependencies:
 - !ruby/object:Gem::Dependency
@@ -37,19 +37,19 @@ executables: []
 extensions: []
 
 extra_rdoc_files:
-- CHANGELOG.txt
 - EXAMPLES.txt
 - FAQ.txt
 - GUIDE.txt
+- History.txt
 - LICENSE.txt
 - Manifest.txt
 - NOTES.txt
 - README.txt
 files:
-- CHANGELOG.txt
 - EXAMPLES.txt
 - FAQ.txt
 - GUIDE.txt
+- History.txt
 - LICENSE.txt
 - Manifest.txt
 - NOTES.txt
@@ -89,6 +89,7 @@ files:
 - lib/www/mechanize/page/meta.rb
 - lib/www/mechanize/pluggable_parsers.rb
 - lib/www/mechanize/response_code_error.rb
+- lib/www/mechanize/unsupported_scheme_error.rb
 - test/data/htpasswd
 - test/data/server.crt
 - test/data/server.csr
@@ -153,6 +154,7 @@ files:
 - test/tc_frames.rb
 - test/tc_gzipping.rb
 - test/tc_history.rb
+- test/tc_history_added.rb
 - test/tc_html_unscape_forms.rb
 - test/tc_if_modified_since.rb
 - test/tc_keep_alive.rb
@@ -161,6 +163,7 @@ files:
 - test/tc_mechanize_file.rb
 - test/tc_multi_select.rb
 - test/tc_no_attributes.rb
+- test/tc_option.rb
 - test/tc_page.rb
 - test/tc_pluggable_parser.rb
 - test/tc_post_form.rb
@@ -181,7 +184,7 @@ files:
 - test/tc_upload.rb
 - test/test_all.rb
 has_rdoc: true
-homepage: http://mechanize.rubyforge.org/
+homepage: " http://mechanize.rubyforge.org/"
 post_install_message:
 rdoc_options:
 - --main
@@ -203,7 +206,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
 requirements: []
 
 rubyforge_project: mechanize
-rubygems_version: 1.0
+rubygems_version: 1.1.0
 signing_key:
 specification_version: 2
 summary: Mechanize provides automated web-browsing