mechanize 2.8.0 → 2.8.1
- checksums.yaml +4 -4
- data/.github/workflows/ci-test.yml +1 -1
- data/CHANGELOG.md +32 -32
- data/README.md +1 -3
- data/lib/mechanize/page.rb +3 -3
- data/lib/mechanize/version.rb +1 -1
- data/mechanize.gemspec +1 -1
- data/test/test_mechanize_page_encoding.rb +23 -1
- metadata +3 -3
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz:
- data.tar.gz:
+ metadata.gz: 53105ca453eb763c9cb80bea61508bbc08e500f96aceae911191490e4bb03af5
+ data.tar.gz: f5e0f84257bc299060775ce26d52b80d89153db95d3bfa98d8bfbdc1c3cb5fce
  SHA512:
- metadata.gz:
- data.tar.gz:
+ metadata.gz: 4a55d7aedbc2e81ed13d073c6ff63b60fb4ad060c771589c0024abb3b02c55ee5be54fa5c44b60c78dc3d201e31358c1f9a4922befff223c5f68befb462a4a6a
+ data.tar.gz: cf5967ce52c29e352dc820d5fd943434d37fe279a2ae957531f88f96a23ce7b7c206066c236e5308b912c6757ca8e130e354d26f7dd90ed529522aa7cc8be24a
data/CHANGELOG.md
CHANGED
@@ -1,19 +1,30 @@
  # Mechanize CHANGELOG

+ ## 2.8.1 / 2021-05-09
+
+ ### Fix
+
+ * Gracefully handle parsing errors that contain an invalid byte sequence. Previously, if libxml2 registered a parsing error that itself contained invalid encoding, an exception might be raised. (#553)
+
+
  ## 2.8.0 / 2021-04-01

-
- * Mechanize now requires Ruby 2.5 or newer.
- * Move from `ntlm-http` to `rubyntlm` gem. (#495, #574)
+ ### Requirements

- *
-
-
-
-
+ * Mechanize now requires Ruby 2.5 or newer.
+ * Move from `ntlm-http` to `rubyntlm` gem. (#495, #574)
+
+ ### New Features
+
+ * Page::Link#uri now handles non-ASCII `href`s. (#569) @terryyin
+ * FileConnection supports Windows drive letters (#483)
+ * Credential headers 'Authorization' and 'Cookie' are deleted on cross-origin redirects. (#538) @kyoshidajp
+ * ContentDispositionParser handles ISO8601 date headers, to be robust with websites that ignore RFC2183. (#554) @reitermarkus
+
+ ### Bug fix
+
+ * POST headers 'Content-Length', 'Content-MD5', and 'Content-Type' are deleted in a case-insensitive manner on redirects. Previously these headers were treated as case-sensitive.

- * Bug fix
- * POST headers 'Content-Length', 'Content-MD5', and 'Content-Type' are deleted in a case-insensitive manner on redirects. Previously these headers were treated as case-sensitive.

  ## 2.7.7 / 2021-02-01

@@ -117,8 +128,7 @@
  * Mechanize::Agent#response_read will now raise a
  Mechanize::ResponseReadError instead of an EOFError and avoid losing
  requested content. #296.
- * Depend on http-cookie, add backwards compatible deprecations.
- #257 Akinori MUSHA.
+ * Depend on http-cookie, add backwards compatible deprecations. #257 Akinori MUSHA.
  * Added `Download#save!` for overwriting existing files. #300 Sean Kim.

  * Bug fix
@@ -147,13 +157,10 @@
  * Added iPad and Android user agents. #277 by sambit, #278 by seansay.

  * Bug fix
- * Mechanize#cert and Mechanize#key now return the values set by
-
- * Mechanize no longer submits disabled form fields. #276 by Bogdan Gusiev,
- #279 by Ricardo Valeriano.
+ * Mechanize#cert and Mechanize#key now return the values set by #cert= and #key=. #244, #245 (Thanks, Robert Gogolok!)
+ * Mechanize no longer submits disabled form fields. #276 by Bogdan Gusiev, #279 by Ricardo Valeriano.
  * Mechanize::File#save now behaves like Mechanize::Download#save in
- that it will create the parent directory before saving.
- #272, #280 by Ryan Kowalick
+ that it will create the parent directory before saving. #272, #280 by Ryan Kowalick
  * Ensure `application/xml` is registered as an XML parser in
  `PluggableParser`, not just `text/xml`. #266 James Gregory
  * Mechanize now writes cookiestxt with a prefixed dot for wildcard domain
@@ -173,8 +180,7 @@
  In mechanize 3 the old "Mac FireFox" user-agent alias will be removed.
  Pull request #231 by Gavin Miller.
  * Mechanize now authenticates using the raw challenge, not a reconstructed
- one, to avoid dealing with quoting rules of RFC 2617. Fixes failures in
- #231 due to net-http-digest_auth 1.2.1
+ one, to avoid dealing with quoting rules of RFC 2617. Fixes failures in #231 due to net-http-digest_auth 1.2.1
  * Fixed Content-Disposition parameter parser to be case insensitive. #233
  * Fixed redirection counting in following meta refresh. #240

@@ -205,8 +211,7 @@
  terminate chunked transfer-encoding properly. Issue #116
  * Mechanize no longer raises an exception when multiple identical
  radiobuttons are checked. Issue #214 by Matthias Guenther
- * Fixed documentation for pre_connect_hooks and post_connect_hooks. Issue
- #226 by Robert Poor
+ * Fixed documentation for pre_connect_hooks and post_connect_hooks. Issue #226 by Robert Poor
  * Worked around ruby 1.8 run with -Ku and ISO-8859-1 encoded characters in
  URIs. Issue #228 by Stanislav O.Pogrebnyak

@@ -272,8 +277,7 @@
  * SSL parameters and proxy may now be set at any time. Issue #194 by
  dsisnero.
  * Improved Mechanize::Page with #image_with and #images_with and
- Mechanize::Page::Image various img element attribute accessors, #caption,
- #extname, #mime_type and #fetch. Pull request #173 by kitamomonga
+ Mechanize::Page::Image various img element attribute accessors, #caption, #extname, #mime_type and #fetch. Pull request #173 by kitamomonga
  * Added MIME type parsing for content-types in Mechanize::PluggableParser
  for fine-grained parser choices. Parsers will be chosen based on exact
  match, simplified type or media type in that order. See
@@ -336,8 +340,7 @@
  * SSL connections will be verified against the system certificate store by
  default.
  * Added Mechanize#retry_change_requests to allow mechanize to retry POST and
- other non-idempotent requests when you know it is safe to do so. Issue
- #123
+ other non-idempotent requests when you know it is safe to do so. Issue #123
  * Mechanize can now stream files directly to disk without loading them into
  memory first through Mechanize::Download, a pluggable parser for
  downloading files.
@@ -352,8 +355,7 @@
  agent.pluggable_parser.default = Mechanize::Download
  * Added Mechanize#content_encoding_hooks which allow handling of
  non-standard content encodings like "agzip". Patch #125 by kitamomonga
- * Added dom_class to elements and the element matcher like dom_id. Patch
- #156 by Dan Hansen.
+ * Added dom_class to elements and the element matcher like dom_id. Patch #156 by Dan Hansen.
  * Added support for the HTML5 keygen form element. See
  http://dev.w3.org/html5/spec/Overview.html#the-keygen-element Patch #157
  by Victor Costan.
@@ -402,8 +404,7 @@
  * The original Referer value persists on redirection. Issue #150
  * Do not send a referer on a Refresh header based redirection.
  * Fixed encoding error in tests when LANG=C. Patch #142 by jinschoi.
- * The order of items in a form submission now match the DOM order. Patch
- #129 by kitamomonga
+ * The order of items in a form submission now match the DOM order. Patch #129 by kitamomonga
  * Fixed proxy example in EXAMPLE. Issue #146 by NielsKSchjoedt

  ## 2.0.1 / 2011-06-28
@@ -471,8 +472,7 @@ Mechanize is now under the MIT license
  * Mechanize now implements session cookies. GH #78
  * Mechanize now implements deflate decoding. GH #40
  * Mechanize now allows a certificate and key to be passed directly. GH #71
- * Mechanize::Form::MultiSelectList now implements #option_with and
- #options_with. GH #42
+ * Mechanize::Form::MultiSelectList now implements #option_with and #options_with. GH #42
  * Add Mechanize::Page::Link#rel and #rel?(kind) to read and test the rel
  attribute.
  * Add Mechanize::Page#canonical_uri to read a </tt><link
data/README.md
CHANGED
@@ -74,6 +74,4 @@ Thank you to Michael Neumann for starting the Ruby version. Thanks to everyone w

  ## License

- This library is distributed under the MIT license. Please see
-
-
+ This library is distributed under the MIT license. Please see [LICENSE.txt](https://github.com/sparklemotion/mechanize/blob/main/LICENSE.txt).
data/lib/mechanize/page.rb
CHANGED
@@ -104,9 +104,9 @@ class Mechanize::Page < Mechanize::File
      parser = self.parser unless parser
      return false if parser.errors.empty?
      parser.errors.any? do |error|
-       error.message =~ /(indicate\ encoding)|
-
-
+       error.message.scrub =~ /(indicate\ encoding)|
+                               (Invalid\ char)|
+                               (input\ conversion\ failed)/x
      end
    end

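The one-line change above calls `String#scrub` on the parser error message before applying the regex. A minimal standalone sketch of that idea follows. It assumes a hypothetical `encoding_error_message?` helper and hand-built strings in place of real libxml2 errors; only the pattern itself is taken from the diff.

```ruby
# Pattern copied from the diff above. The /x flag lets it span several lines,
# and "\ " keeps the literal spaces that /x would otherwise ignore.
ENCODING_ERROR_PATTERN = /(indicate\ encoding)|
                          (Invalid\ char)|
                          (input\ conversion\ failed)/x

# Hypothetical helper (not part of Mechanize): String#scrub replaces invalid
# bytes with U+FFFD, so the string handed to the regex is always valid.
def encoding_error_message?(message)
  !!(message.scrub =~ ENCODING_ERROR_PATTERN)
end

# Hand-built stand-ins for parser error messages that end in a stray byte.
broken = "input conversion failed due to input error, bytes \xE3"
other  = "Comment not terminated \xE3"

puts broken.valid_encoding?           # false
puts encoding_error_message?(broken)  # true
puts encoding_error_message?(other)   # false
```

Scrubbing changes only the string passed to the match, so the original error message on the parser is left untouched.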
data/lib/mechanize/version.rb
CHANGED
data/mechanize.gemspec
CHANGED
@@ -7,7 +7,7 @@ require 'mechanize/version'
  Gem::Specification.new do |spec|
    spec.name = "mechanize"
    spec.version = Mechanize::VERSION
-   spec.homepage = "
+   spec.homepage = "https://github.com/sparklemotion/mechanize"
    spec.summary = 'The Mechanize library is used for automating interaction with websites'
    spec.description =
    [
data/test/test_mechanize_page_encoding.rb
CHANGED
@@ -183,5 +183,27 @@ class TestMechanizePageEncoding < Mechanize::TestCase
      assert_equal Encoding::UTF_8, result.text.encoding
    end

-
+   def test_parser_error_message_containing_encoding_errors
+     # https://github.com/sparklemotion/mechanize/issues/553
+     body = <<~EOF
+       <html>
+         <body>
+           <!--
+           ## メモ
+           処理の一般化, 二重ループ, 多重ループ
+           wzxhzdk:25
+           -->
+     EOF
+     page = util_page body

+     # this should not raise an "invalid byte sequence in UTF-8" error while processing parsing errors
+     page.search("body")
+
+     # let's assert on the setup: a libxml2-returned parsing error itself contains an invalid character
+     assert(error = page.parser.errors.find { |e| e.message.include?("Comment not terminated") })
+     exception = assert_raises(ArgumentError) do
+       error.message =~ /any regex just to trigger encoding error/
+     end
+     assert_includes(exception.message, "invalid byte sequence in UTF-8")
+   end unless RUBY_ENGINE == 'jruby' # this is a libxml2-specific condition
+ end
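The setup assertion in this test relies on Ruby refusing to run a regex match against a string whose bytes are invalid for its encoding. The sketch below illustrates that behaviour on its own. The `match_outcome` helper is hypothetical, and the hand-built string only stands in for the libxml2 message that the real test obtains from the parser.

```ruby
# Hypothetical helper for illustration: report what a regex match does with
# the given message instead of letting the exception propagate.
def match_outcome(message)
  message =~ /any regex/ ? :matched : :no_match
rescue ArgumentError => e
  e.message
end

valid  = "Comment not terminated"
broken = "Comment not terminated \xE3" # stray byte, not valid UTF-8

puts match_outcome(valid)   # no_match
puts match_outcome(broken)  # prints the ArgumentError message
```

Rescuing inside the helper keeps the sketch runnable end to end, whereas the real test uses `assert_raises` to capture the same exception.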
metadata
CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: mechanize
  version: !ruby/object:Gem::Version
- version: 2.8.
+ version: 2.8.1
  platform: ruby
  authors:
  - Eric Hodel
@@ -12,7 +12,7 @@ authors:
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2021-
+ date: 2021-05-09 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: addressable
@@ -474,7 +474,7 @@ files:
  - test/test_mechanize_util.rb
  - test/test_mechanize_xml_file.rb
  - test/test_multi_select.rb
- homepage:
+ homepage: https://github.com/sparklemotion/mechanize
  licenses:
  - MIT
  metadata: {}