mechanize 0.8.4 → 0.8.5
- data/EXAMPLES.txt +1 -1
- data/GUIDE.txt +11 -12
- data/History.txt +24 -0
- data/Manifest.txt +2 -1
- data/README.txt +1 -1
- data/Rakefile +2 -2
- data/lib/www/mechanize.rb +43 -11
- data/lib/www/mechanize/chain/header_resolver.rb +1 -1
- data/lib/www/mechanize/chain/request_resolver.rb +4 -0
- data/lib/www/mechanize/chain/response_body_parser.rb +1 -1
- data/lib/www/mechanize/chain/response_reader.rb +1 -1
- data/lib/www/mechanize/chain/ssl_resolver.rb +1 -1
- data/lib/www/mechanize/chain/uri_resolver.rb +21 -9
- data/lib/www/mechanize/cookie_jar.rb +11 -6
- data/lib/www/mechanize/file_response.rb +2 -0
- data/lib/www/mechanize/form.rb +33 -13
- data/lib/www/mechanize/form/multi_select_list.rb +2 -2
- data/lib/www/mechanize/form/radio_button.rb +1 -1
- data/lib/www/mechanize/list.rb +15 -38
- data/lib/www/mechanize/page.rb +17 -10
- data/lib/www/mechanize/page/link.rb +2 -2
- data/lib/www/mechanize/redirect_not_get_or_head_error.rb +20 -0
- data/test/helper.rb +4 -0
- data/test/htdocs/relative/tc_relative_links.html +1 -0
- data/test/servlets.rb +38 -8
- data/test/test_checkboxes.rb +14 -15
- data/test/test_cookie_class.rb +5 -4
- data/test/test_cookie_jar.rb +29 -0
- data/test/test_encoded_links.rb +1 -1
- data/test/test_errors.rb +1 -1
- data/test/test_follow_meta.rb +50 -6
- data/test/test_form_as_hash.rb +5 -5
- data/test/test_forms.rb +28 -27
- data/test/test_links.rb +22 -16
- data/test/test_mech.rb +24 -7
- data/test/test_multi_select.rb +27 -27
- data/test/test_pluggable_parser.rb +4 -4
- data/test/test_radiobutton.rb +35 -42
- data/test/test_redirect_verb_handling.rb +45 -0
- data/test/test_referer.rb +1 -1
- data/test/test_relative_links.rb +4 -4
- data/test/test_scheme.rb +4 -0
- data/test/test_select.rb +15 -15
- data/test/test_select_all.rb +1 -1
- data/test/test_select_none.rb +1 -1
- data/test/test_select_noopts.rb +1 -1
- data/test/test_set_fields.rb +4 -4
- data/test/test_textarea.rb +13 -13
- data/test/test_upload.rb +1 -1
- metadata +10 -6
- data/NOTES.txt +0 -318
data/NOTES.txt
DELETED
@@ -1,318 +0,0 @@
-= Mechanize Release Notes
-
-== 0.6.4 (Gwendolyn)
-
-Custom request headers can now be added to Mechanize by subclassing mechanize
-and defining the Mechanize#set_headers method. For example:
-  class A < WWW::Mechanize
-    def set_headers(uri, request, cur_page)
-      super(uri, request, cur_page)
-      request.add_field('Cookie', 'name=Aaron')
-      request
-    end
-  end
-The Mechanize#redirect_ok method has been added so that you can keep mechanize
-from following redirects.
-
-== 0.6.3 (Big Man)
-
-Mechanize 0.6.3 (Big Man) has a few bug fixes and some new features added to
-the Form class. The Form class is now more hash-like. I've added a
-Form#add_field! method that will add a field to your form. Form#[]= will now
-add a field if the key doesn't exist. For example, if your form doesn't have
-an input field named 'foo', the following two lines of code are equivalent, and
-will create a field named 'foo':
-  form['foo'] = 'bar'
-or
-  form.add_field!('foo', 'bar')
-To make forms more hash-like, has_value? and has_key? methods have been added.
-
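The hash-like behaviour described in the 0.6.3 notes can be sketched roughly as follows. This is a minimal, self-contained illustration and not the library's code: `SketchForm` and its nested `Field` struct are hypothetical names, and only the method names `add_field!`, `[]=`, `has_key?`, and `has_value?` come from the notes.

```ruby
# Hypothetical miniature form, for illustration only (not WWW::Mechanize::Form).
class SketchForm
  Field = Struct.new(:name, :value)

  def initialize
    @fields = []
  end

  # Always appends a new field.
  def add_field!(name, value = nil)
    @fields << Field.new(name, value)
  end

  # Returns the value of the first field with this name, or nil.
  def [](name)
    field = @fields.find { |f| f.name == name }
    field && field.value
  end

  # Updates the first matching field, or adds one if the key is missing.
  def []=(name, value)
    field = @fields.find { |f| f.name == name }
    if field
      field.value = value
    else
      add_field!(name, value)
    end
  end

  def has_key?(name)
    @fields.any? { |f| f.name == name }
  end

  def has_value?(value)
    @fields.any? { |f| f.value == value }
  end
end

form = SketchForm.new
form['foo'] = 'bar'          # no 'foo' field yet, so one is created
puts form.has_key?('foo')    # true
puts form.has_value?('bar')  # true
```

Assigning to an existing key updates the first matching field instead of adding a second one, which is the assumption this sketch makes about duplicate names.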
-== 0.6.2 (Bridget)
-
-Mechanize 0.6.2 (Bridget) is a fairly small bug fix release. You can now
-access the parsed page when a ResponseCodeError is thrown. For example, this
-loads a page that doesn't exist, but gives you access to the parsed 404 page:
-  begin
-    WWW::Mechanize.new().get('http://google.com/asdfasdfadsf.html')
-  rescue WWW::Mechanize::ResponseCodeError => ex
-    puts ex.page
-  end
-Accessing forms is now more DSL like. When manipulating a form, for example,
-you can use the following syntax:
-  page.form('formname') { |form|
-    form.first_name = "Aaron"
-  }.submit
-Documentation has also been updated thanks to Paul Smith.
-
-== 0.6.1 (Chuck)
-
-Mechanize version 0.6.1 (Chuck) is done, and is ready for you to use. This
-post "my trip to Europe" release includes many bug fixes and a handful of
-new features.
-
-New features include a submit method on forms, a click method on links, and an
-REXML pluggable parser. Now you can submit a form just by calling a method on
-the form, rather than passing the form to the submit method on the mech object.
-The click method on links lets you click the link by calling a method on the
-link rather than passing the link to the click method on the mech object.
-Lastly, the REXML pluggable parser lets you use your pre-0.6.0 code with
-0.6.1. See the CHANGELOG for more details.
-
-== 0.6.0 (Rufus)
-
-WWW::Mechanize 0.6.0 aka Rufus is ready! This hpricot flavored pie has
-finished cooling on the window sill and is ready for you to eat. But if you
-don't want to eat it, you can just download it and use it. I would
-understand that.
-
-The best new feature in this release in my opinion is the hpricot flavoring
-packed inside. Mechanize now uses hpricot as its html parser. This means
-mechanize gets a huge speed boost, and you can use the power of hpricot for
-scraping data. Page objects returned from mechanize will allow you to use
-hpricot search methods:
-  agent.get('http://rubyforge.org').search("//strong")
-or
-  agent.get('http://rubyforge.org')/"strong"
-
-The click method on mechanize has been updated so that you can click on links
-you find using hpricot methods:
-  agent.click (page/"a").first
-Or click on frames:
-  agent.click (page/"frame").first
-
-The cookie parser has been overhauled to be more RFC 2109 compliant and to
-use WEBrick cookies. Dependencies on ruby-web and mime-types have been
-removed in favor of using hpricot and WEBrick respectively.
-
-attr_finder and REXML helper methods have been removed.
-
-== 0.5.4 (Sylvester)
-
-WWW::Mechanize 0.5.4 aka Sylvester is fresh out of the frying pan and into
-the fire! It is also ready for you to download and use.
-
-New features include WWW::Mechanize#transact (thanks to Johan Kiviniemi) which
-lets you maintain your history state between transactions. Forms can now be
-accessed as a hash. For example, to set the value of an input field, you can
-do the following:
-  form['name'] = "Aaron"
-Doing this assumes that you are setting the first field. If there are multiple
-fields with the same name, you must use a different method to set the value.
-
-Form file uploads will now read the file specified by FileUpload#file_name.
-The mime type will also be automatically determined for you! Take a look
-at the EXAMPLES file for a new flickr upload script.
-
-Lastly, gzip encoding is now supported! WWW::Mechanize now supports pages
-being sent gzip encoded. This means less network bandwidth. Yay!
-
-== 0.5.3 (Twan)
-
-Here it is. Mechanize 0.5.3 also named the "Twan" release. There are a few
-new features, a few fixed bugs, and some other stuff too!
-
-First, new features. WWW::Mechanize#click has been updated to operate on the
-first link if an array is passed in. How is this helpful? It allows
-syntax like this:
-  agent.click page.links.first
-to be like this:
-  agent.click page.links
-This trick was actually implemented in WWW::Mechanize::List. If you send a
-method to WWW::Mechanize::List, and it doesn't know how to respond, it will
-try calling that method on the first element of the list. But it only does
-that for methods with no arguments.
-
-Radio buttons, check boxes, and select lists can now be ticked, unticked, and
-clicked. Now to select the second radio button from a list, you can do this:
-  form.radiobuttons.name('color')[1].click
-Mechanize will handle unchecking all of the other radio buttons with the same
-name.
-
-Pretty printing has been added so that inspecting mechanize objects is very
-pretty. Go ahead and try it out!
-  pp page
-Or even
-  pp page.forms.first
-
-Now, bugfixes. A bug was fixed when spaces are passed in as part of the URL
-to WWW::Mechanize#get. Thanks to Eric Kolve, a bug was fixed with methods
-that conflict with rails. Thanks to Yinon Bentor for sending in a patch to
-improve Log4r support and a slight speed increase.
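The first-element delegation described for WWW::Mechanize::List in the 0.5.3 notes could look something like the sketch below. It is an assumption about the general technique (`method_missing` forwarding), not the library's actual implementation; `DelegatingList` and `SketchLink` are hypothetical names introduced for illustration.

```ruby
# Hypothetical stand-in for a list that forwards unknown calls to its first
# element; illustrative only, not WWW::Mechanize::List itself.
class DelegatingList < Array
  def method_missing(name, *args, &block)
    # Only zero-argument calls are forwarded, as the notes describe.
    if args.empty? && !empty? && first.respond_to?(name)
      first.public_send(name)
    else
      super
    end
  end
end

SketchLink = Struct.new(:text, :href)

links = DelegatingList.new([
  SketchLink.new('Home', '/'),
  SketchLink.new('About', '/about')
])

puts links.href                       # forwarded to links.first.href
puts links.href == links.first.href   # true
```

A call that carries an argument, such as `links.href = '/x'`, is not forwarded in this sketch and raises `NoMethodError`.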
-== 0.5.2
-
-This release comes with a few cool new features. First, support for select
-lists which are "multi" has been added. This means that you can select
-multiple values for a select list that has the "multiple" attribute. See
-WWW::Mechanize::MultiSelectList for more information.
-
-New methods for select lists have been added. You can use the select_all
-method to select all options and select_none to select no
-options. Options can now be "selected" which selects an option, "unselected",
-which unselects an option, and "clicked" which toggles the current status of
-the option. What this means is that instead of having to select the first
-option like this:
-  select_list.value = select_list.options.first.value
-You can select the first option by just saying this:
-  select_list.options.first.select
-Of course you can still set the select list to an arbitrary value by just
-setting the value of the select list.
-
-A new method has been added to Form so that multiple fields can be set at the
-same time. To set 'foo', and 'name' at the same time on the form, you can do
-the following:
-  form.set_fields( :foo => 'bar', :name => 'Aaron' )
-Or to set the second field named 'name' you can do the following:
-  form.set_fields( :name => ['Aaron', 1] )
-
-Finally, attr_finder has been deprecated, and all syntax like this:
-  @agent.links(:text => 'foo')
-needs to be changed to:
-  @agent.links.text('foo')
-With this release you will just get a warning, and the code will be removed in
-0.6.0.
-
-== 0.5.1
-
-This release is a small bugfix release. The main bug fixed in this release is
-a problem with file uploading. I have also made some performance improvements
-to cookie parsing.
-
-== 0.5.0
-
-Good News first:
-
-This release has many new great features! Mechanize has been updated to
-handle any content type a web server returns using a system called "Pluggable
-Parsers". Mechanize has always been able to handle any content type
-(sort of), but the pluggable parser system lets us cleanly handle any
-content type by instantiating a class for the content type returned from the
-server. For example, when a web server returns type 'text/html', mechanize asks
-the pluggable parser for a class to instantiate for 'text/html'. Mechanize
-then instantiates that class and returns it. Users can define their own
-parsers, and register them with the Pluggable Parser so that mechanize will
-instantiate your class when the content type you specify is returned. This
-allows you to easily preprocess your HTML, or even use other HTML parsers.
-Content types that the pluggable parser doesn't know how to handle will
-return WWW::Mechanize::File which has basic functionality like a 'save_as'
-method. For more information, see the RDoc for
-WWW::Mechanize::PluggableParser; also see the EXAMPLES file.
-
-A 'save_as' method has been added so that any page downloaded can be easily
-saved to a file.
-
-The cookie jar for mechanize can now be saved to disk and loaded back up at
-another time. If your script needs to save cookie state between executions,
-you can now use the 'save_as' and 'load' methods on WWW::Mechanize::CookieJar.
-
-Form fields can now be treated as accessors. This means that if you have a
-form with the fields 'username' and 'password', you could manipulate them like
-this:
-
-  form.username = 'test'
-  form.password = 'testing'
-  puts "username: #{form.username}"
-  puts "password: #{form.password}"
-
-Form fields can still be accessed in the usual way in case there are multiple
-input fields with the same name.
-
-Bad news second:
-
-In this release, the name space has been altered to be more consistent. Many
-classes used to be under WWW directly; they are now all under WWW::Mechanize.
-For example, in 0.4.7 Page was WWW::Page; in this release it is now
-WWW::Mechanize::Page. This may break your code, but if you aren't using
-class names directly, everything should be fine.
-
-Body filters have been removed in favor of Pluggable Parsers.
-
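The pluggable-parser idea from the 0.5.0 notes (look up a class by content type, instantiate it, and fall back to a generic file object) might be sketched as below. This is an illustration of the dispatch pattern under stated assumptions, not the WWW::Mechanize::PluggableParser API: `SketchParserRegistry`, `register`, `parse`, `GenericFile`, `HtmlPage`, and `SketchCsvDocument` are all hypothetical names; only `save_as` is named in the notes.

```ruby
# Hypothetical content-type registry; illustrative only.
class SketchParserRegistry
  # Fallback for unknown content types, with basic save_as support.
  class GenericFile
    attr_reader :body

    def initialize(body)
      @body = body
    end

    def save_as(path)
      File.open(path, 'wb') { |f| f.write(body) }
    end
  end

  # Default handler for 'text/html'.
  class HtmlPage < GenericFile
    def title
      body[%r{<title>(.*?)</title>}m, 1]
    end
  end

  def initialize
    @parsers = { 'text/html' => HtmlPage }
    @default = GenericFile
  end

  # Associates a user-defined class with a content type.
  def register(content_type, klass)
    @parsers[content_type] = klass
  end

  # Instantiates the class registered for the content type, or the fallback.
  def parse(content_type, body)
    @parsers.fetch(content_type, @default).new(body)
  end
end

# A user-defined parser for a content type the registry does not know yet.
class SketchCsvDocument < SketchParserRegistry::GenericFile
  def rows
    body.lines.map { |line| line.chomp.split(',') }
  end
end

registry = SketchParserRegistry.new
registry.register('text/csv', SketchCsvDocument)

puts registry.parse('text/csv', "a,b\n1,2\n").rows.inspect
puts registry.parse('application/pdf', '%PDF').class
```

Keeping the fallback as the superclass of every parser means any returned object can be saved with `save_as`, whichever class handled it.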
-== 0.4.7
-
-This release of mechanize comes with a few bug fixes including fixing a
-bug when there is no action specified in a form.
-
-In this version, a default user agent string is now set for mechanize. Also
-a convenience method WWW::Mechanize#get_file has been added for fetching
-non text/html files.
-
-== 0.4.6
-
-The 0.4.6 release comes with proxy support which can be enabled by calling
-the set_proxy method on your WWW::Mechanize object. Once you have set your
-proxy settings, all mechanize requests will go through the proxy.
-
-A new "visited?" method has been added to WWW::Mechanize so that you can see
-if any particular URL is in your history.
-
-Image alt text support has been added to links. If a link contains an image
-with no text, the alt text of the image will be used. For example:
-
-  <a href="foo.html"><img src="foo.gif" alt="Foo Image"></a>
-
-This link will contain the text "Foo Image", and can be found like this:
-
-  link = page.links.text('Foo Image')
-
-Lists of things have been updated so that you can set a value without
-specifying the position in the array. It will just assume that you want to
-set the value on the first element. For example, the following two statements
-are equivalent:
-
-  form.fields.name('q').first.value = 'xyz' # Old syntax
-  form.fields.name('q').value = 'xyz' # New syntax
-
-This new syntax comes with a note of caution; make sure you know you want to
-set only the first value. There could be multiple fields with the name 'q'.
-
-== 0.4.5
-
-This release comes with a new filtering system. You can now manipulate the
-response body before mechanize parses it. This can be useful if you know that
-the HTML you need to parse is broken, or if you want to speed up the parsing.
-This filter can be done on a global basis, or on a per page basis. Check out
-the new examples in the EXAMPLES file for usage.
-
-This release is also starting to phase out the misspelled method
-WWW::Mechanize#basic_authetication. If you are using that method, please
-switch to WWW::Mechanize#basic_auth.
-
-The 0.4.5 release has many bug fixes, most notably better cookie parsing and
-better form support.
-
-== 0.4.4
-
-This release of mechanize comes with a new "Option" object that can be
-accessed from select fields on forms. That means that you can figure out
-what option to set based on the text in the select field. For example:
-
-  selectlist = form.fields.name('selectlist').first
-  selectlist.value = selectlist.options.find { |o| o.text == 'foo'}.value
-
-== 0.4.3
-
-The new syntax for finding things like forms, fields, frames, etcetera looks
-like this:
-
-  page.links.with.text 'Some Text'
-
-The preceding statement will find all links in a page with the text
-'Some Text'. This can be applied to form fields as well:
-
-  form.fields.with.name 'email'
-
-These can be chained as well like this:
-
-  form.fields.with.name('email').and.with.value('blah@domain.com')
-
-'with' and 'and' can be omitted, and the old way is still supported. The
-following statements all do the same thing:
-
-  form.fields.find_all { |f| f.name == 'email' }
-  form.fields.with.name('email')
-  form.fields.name('email')
-  form.fields(:name => 'email')
-
-Regular expressions are also supported:
-
-  form.fields.with.name(/email/)
-
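One way the chainable finder syntax from the 0.4.3 notes could work is sketched below: `with` and `and` simply return the list itself, and each attribute filter returns a new, narrower list. This is a guess at the technique rather than the library's implementation; `FinderList` and `SketchField` are hypothetical names, and the use of `===` for matching is an assumption made so that both strings and regular expressions work as patterns.

```ruby
# Hypothetical chainable list; illustrative only.
class FinderList < Array
  # 'with' and 'and' are pure syntax sugar that return the list unchanged.
  def with
    self
  end
  alias_method :and, :with

  # One filter per attribute; `===` accepts a String or a Regexp pattern.
  %i[name value].each do |attr|
    define_method(attr) do |pattern|
      FinderList.new(select { |item| pattern === item.public_send(attr) })
    end
  end
end

SketchField = Struct.new(:name, :value)

fields = FinderList.new([
  SketchField.new('email', 'blah@domain.com'),
  SketchField.new('email_confirm', ''),
  SketchField.new('q', 'xyz')
])

puts fields.with.name('email').and.with.value('blah@domain.com').size  # 1
puts fields.name(/email/).size                                         # 2
```

Because every filter returns another `FinderList`, the filters can be chained in any order, with or without the `with` and `and` connectors.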