mechanize 0.5.4 → 0.6.0
- data/CHANGELOG +12 -0
- data/GUIDE +125 -0
- data/NOTES +28 -0
- data/README +9 -5
- data/lib/mechanize.rb +14 -15
- data/lib/mechanize/cookie.rb +35 -55
- data/lib/mechanize/form.rb +39 -48
- data/lib/mechanize/form_elements.rb +7 -9
- data/lib/mechanize/hpricot.rb +12 -0
- data/lib/mechanize/inspect.rb +0 -6
- data/lib/mechanize/mech_version.rb +1 -3
- data/lib/mechanize/page.rb +70 -115
- data/lib/mechanize/page_elements.rb +10 -6
- data/test/htdocs/frame_test.html +1 -1
- data/test/htdocs/tc_no_attributes.html +16 -0
- data/test/tc_checkboxes.rb +8 -8
- data/test/tc_cookie_jar.rb +36 -28
- data/test/tc_mech.rb +21 -1
- data/test/tc_no_attributes.rb +20 -0
- data/test/tc_page.rb +1 -1
- data/test/tc_pluggable_parser.rb +31 -17
- data/test/tc_pretty_print.rb +1 -1
- data/test/tc_radiobutton.rb +4 -4
- data/test/ts_mech.rb +1 -1
- metadata +126 -134
- data/lib/mechanize/module.rb +0 -27
- data/lib/mechanize/parsing.rb +0 -224
- data/test/parse.rb +0 -39
- data/test/tc_parsing.rb +0 -64
- data/test/test_mech.rb +0 -27
data/CHANGELOG
CHANGED
@@ -1,5 +1,17 @@
 = Mechanize CHANGELOG
 
+== 0.6.0
+
+* Changed main parser to use hpricot
+* Made WWW::Mechanize::Page class searchable like hpricot
+* Updated WWW::Mechanize#click to support hpricot links like this:
+    @agent.click (page/"a").first
+* Clicking a Frame is now possible:
+    @agent.click (page/"frame").first
+* Removed deprecated attr_finder
+* Removed REXML helper methods since the main parser is now hpricot
+* Overhauled cookie parser to use WEBrick::Cookie
+
 == 0.5.4
 
 * Added WWW::Mechanize#trasact for saving history state between in a
data/GUIDE
ADDED
@@ -0,0 +1,125 @@
+= Getting Started With WWW::Mechanize
+This guide is meant to get you started using Mechanize. By the end of this
+guide, you should be able to fetch pages, click links, fill out and submit
+forms, scrape data, and many other hopefully useful things. This guide
+really just scratches the surface of what is available, but should be enough
+information to get you really going!
+
+== Let's Fetch a Page!
+First thing is first. Make sure that you've required mechanize and that you
+instantiate a new mechanize object:
+  require 'rubygems'
+  require 'mechanize'
+
+  agent = WWW::Mechanize.new
+Now we'll use the agent we've created to fetch a page. Let's fetch google
+with our mechanize agent:
+  page = agent.get('http://google.com/')
+What just happened? We told mechanize to go pick up google's main page.
+Mechanize stored any cookies that were set, and followed any redirects that
+google may have sent. The agent gave us back a page that we can use to
+scrape data, find links to click, or find forms to fill out.
+
+Next, lets try finding some links to click.
+
+== Finding Links
+Mechanize returns a page object whenever you get a page, post, or submit a
+form. When a page is fetched, the agent will parse the page and put a list
+of links on the page object.
+
+Now that we've fetched google's homepage, lets try listing all of the links:
+  page.links.each do |link|
+    puts link.text
+  end
+We can list the links, but Mechanize gives a few shortcuts to help us find a
+link to click on. Lets say we wanted to click the link whose text is 'News'.
+Normally, we would have to do this:
+  page = agent.click page.links.find { |l| l.name == 'News' }
+But Mechanize gives us a shortcut. Instead we can say this:
+  page = agent.click page.links.name('News')
+That shortcut says "find all links with the name 'News'". You're probably
+thinking "there could be multiple links with that text!", and you would be
+correct! If you pass a list of links to the "click" method, Mechanize will
+click on the first one. If you wanted to click on the second news link, you
+could do this:
+  agent.click page.links.name('News')[1]
+We can even find a link with a certain href like so:
+  page.links.href('/something')
+Or chain them together to find a link with certain text and certain href:
+  page.links.name('News').href('/something')
+
+These shortcuts that mechanize provides are available on any list that you
+can fetch like frames, iframes, or forms. Now that we know how to find and
+click links, lets try something more complicated like filling out a form.
+
+== Filling Out Forms
+Lets continue with our google example. Here's the code we have so far:
+  require 'rubygems'
+  require 'mechanize'
+
+  agent = WWW::Mechanize.new
+  page = agent.get('http://google.com/')
+If we pretty print the page, we can see that there is one form named 'f',
+that has a couple buttons and a few fields:
+  pp page
+Now that we know the name of the form, lets fetch it off the page:
+  google_form = page.form('f')
+Mechanize lets you access form input fields in a few different ways, but the
+most convenient is that you can access input fields as accessors on the
+object. So lets set the form field named 'q' on the form to 'ruby mechanize':
+  google_form.q = 'ruby mechanize'
+To make sure that we set the value, lets pretty print the form, and you should
+see a line similar to this:
+  #<WWW::Mechanize::Field:0x1403488 @name="q", @value="ruby mechanize">
+If you saw that the value of 'q' changed, you're on the right track! Now we
+can submit the form and 'press' the submit button and print the results:
+  page = agent.submit(google_form, google_form.buttons.first)
+  pp page
+What we just did was equivalent to putting text in the search field and
+clicking the 'Google Search' button. If we had submitted the form without
+a button, it would be like typing in the text field and hitting the return
+button.
+
+Lets take a look at the code all together:
+  require 'rubygems'
+  require 'mechanize'
+
+  agent = WWW::Mechanize.new
+  page = agent.get('http://google.com/')
+  google_form = page.form('f')
+  google_form.q = 'ruby mechanize'
+  page = agent.submit(google_form)
+  pp page
+
+Before we go on to screen scraping, lets take a look at forms a little more
+in depth. Unless you want to skip ahead!
+
+== Advanced Form Techniques
+In this section, I want to touch on using the different types in input fields
+possible with a form. Password and textarea fields can be treated just like
+text input fields. Select fields are very similar to text fields, but they
+have many options associated with them. If you select one option, mechanize
+will deselect the other options (unless it is a multi select!).
+
+For example, lets select an option on a list:
+  form.fields.name('list').options[0].select
+
+Now lets take a look at checkboxes and radio buttons. To select a checkbox,
+just check it like this:
+  form.checkboxes.name('box').check
+Radio buttons are very similar to checkboxes, but they know how to uncheck
+other radio buttons of the same name. Just check a radio button like you
+would a checkbox:
+  form.radiobuttons.name('box')[1].check
+Mechanize also makes file uploads easy! Just find the file upload field, and
+tell it what file name you want to upload:
+  form.file_uploads.file_name = "somefile.jpg"
+
+== Scraping Data
+Mechanize uses hpricot[http://code.whytheluckystiff.net/hpricot/] to parse
+html. What does this mean for you? You can treat a mechanize page like
+an hpricot object. After you have used Mechanize to navigate to the page
+that you need to scrape, then scrape it using hpricot methods:
+  agent.get('http://someurl.com/').search("//p[@class='posted']")
+For more information on this powerful scraper, take a look at
+HpricotBasics[http://code.whytheluckystiff.net/hpricot/wiki/HpricotBasics]
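The list shortcuts the GUIDE describes (`page.links.name('News').href('/something')`) can be pictured as a list that turns unknown one-argument method calls into attribute filters. The following is only a sketch of that idea in plain Ruby, not the code of `WWW::Mechanize::List`; the names `Link` and `FilterList` are made up for illustration, and the sketch filters on `text` where the GUIDE uses `name`.

```ruby
# Hypothetical stand-in for a link; not the Mechanize class.
Link = Struct.new(:text, :href)

# An Array whose unknown one-argument methods act as attribute filters.
class FilterList < Array
  def method_missing(attr, *args)
    return super unless args.length == 1 && all? { |item| item.respond_to?(attr) }

    wanted = args.first
    # `===` lets callers pass either an exact String or a Regexp.
    FilterList.new(select { |item| wanted === item.public_send(attr) })
  end

  def respond_to_missing?(attr, include_private = false)
    (any? && all? { |item| item.respond_to?(attr) }) || super
  end
end

links = FilterList.new([
  Link.new('News', '/news'),
  Link.new('Images', '/images'),
  Link.new('News', '/archive/news')
])

# Each filter returns another FilterList, so filters chain.
puts links.text('News').length
puts links.text('News')[1].href
puts links.text('News').href(%r{\A/archive}).first.href
```

Because every filter returns the same list type, indexing (`[1]`) and chaining fall out of ordinary Array behaviour; that matches how the GUIDE combines `name(...)` with `[1]` and `href(...)`.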
data/NOTES
CHANGED
@@ -1,5 +1,33 @@
 = Mechanize Release Notes
 
+== 0.6.0 (Rufus)
+
+WWW::Mechanize 0.6.0 aka Rufus is ready! This hpricot flavored pie has
+finished cooling on the window sill and is ready for you to eat. But if you
+don't want to eat it, you can just download it and use it. I would
+understand that.
+
+The best new feature in this release in my opinion is the hpricot flavoring
+packed inside. Mechanize now uses hpricot as its html parser. This means
+mechanize gets a huge speed boost, and you can use the power of hpricot for
+scraping data. Page objects returned from mechanize will allow you to use
+hpricot search methods:
+  agent.get('http://rubyforge.org').search("//strong")
+or
+  agent.get('http://rubyforge.org')/"strong"
+
+The click method on mechanize has been updated so that you can click on links
+you find using hpricot methods:
+  agent.click (page/"a").first
+Or click on frames:
+  agent.click (page/"frame").first
+
+The cookie parser has been overhauled to be more RFC 2109 compliant and to
+use WEBrick cookies. Dependencies on ruby-web and mime-types have been
+removed in favor of using hpricot and WEBrick respectively.
+
+attr_finder and REXML helper methods have been removed.
+
 == 0.5.4 (Sylvester)
 
 WWW::Mechanize 0.5.4 aka Sylvester is fresh out the the frying pan and in to
data/README
CHANGED
@@ -1,20 +1,23 @@
 = WWW::Mechanize
 
-The Mechanize library is used for automating interaction with
+The Mechanize library is used for automating interaction with websites.
+Mechanize automatically stores and sends cookies, follows redirects,
 can follow links, and submit forms. Form fields can be populated and
-submitted.
+submitted. Mechanize also keeps track of the sites that you have visited as
+a history.
 
 == Dependencies
 
 * ruby 1.8.2
+* hpricot[http://code.whytheluckystiff.net/hpricot/]
 
 Note that the files in the net-overrides/ directory are taken from Ruby 1.9.0.
 
-* ruby-web 1.1.0 (http://rubyforge.org/projects/ruby-web/)
 
 == Examples
 
-
+If you are just starting, check out the GUIDE[link://files/GUIDE.html].
+Also, check out the EXAMPLES[link://files/EXAMPLES.html] file.
 
 == Authors
 
@@ -24,7 +27,8 @@ Copyright (c) 2005 by Michael Neumann (mneumann@ntecs.de)
 New Code:
 Copyright (c) 2006 by Aaron Patterson (aaronp@rubyforge.org)
 
-This library comes with a shameless plug for employing me
+This library comes with a shameless plug for employing me
+(Aaron[http://tenderlovemaking.com/]) programming
 Ruby, my favorite language!
 
 == License
data/lib/mechanize.rb
CHANGED
@@ -15,11 +15,10 @@ require 'net/http'
 require 'net/https'
 
 require 'uri'
-require 'webrick'
+require 'webrick/httputils'
 require 'zlib'
 require 'stringio'
-require '
-require 'mechanize/module'
+require 'mechanize/hpricot'
 require 'mechanize/mech_version'
 require 'mechanize/cookie'
 require 'mechanize/errors'
@@ -29,7 +28,6 @@ require 'mechanize/form_elements'
 require 'mechanize/list'
 require 'mechanize/page'
 require 'mechanize/page_elements'
-require 'mechanize/parsing'
 require 'mechanize/inspect'
 
 module WWW
@@ -132,7 +130,7 @@ class Mechanize
 
     # Fetches the URL passed in and returns a page.
     def get(url)
-      cur_page = current_page
+      cur_page = current_page || Page.new( nil, {'content-type'=>'text/html'})
 
       # fetch the page
       abs_uri = to_absolute_uri(url, cur_page)
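The `get` hunk above substitutes an empty `Page` when there is no current page, so that `to_absolute_uri` always receives a page object. A minimal sketch of the same nil-guard idea using only the standard `uri` library follows; `absolute_uri` is a hypothetical helper written for this note, not Mechanize's `to_absolute_uri`, and it raises where Mechanize instead supplies a blank page.

```ruby
require 'uri'

# Hypothetical helper: resolve +url+ against +base+, where +base+ may be nil
# (the situation on the very first request, before any page was fetched).
def absolute_uri(url, base = nil)
  uri = URI.parse(url.to_s)
  return uri if uri.absolute?

  # A relative URL cannot be resolved without something to resolve it against.
  raise ArgumentError, "no page to resolve #{url.inspect} against" if base.nil?

  URI.join(base.to_s, url.to_s)
end

puts absolute_uri('http://example.com/')
puts absolute_uri('/news', 'http://example.com/a/b')
puts absolute_uri('c', 'http://example.com/a/b')
```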
@@ -151,7 +149,9 @@ class Mechanize
     # Clicks the WWW::Mechanize::Link object passed in and returns the
     # page fetched.
     def click(link)
-      uri = to_absolute_uri(
+      uri = to_absolute_uri(
+        link.attributes['href'] || link.attributes['src'] || link.href
+      )
       get(uri)
     end
 
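The new `click` body looks for an `href` first and falls back to `src`, which is what lets the same method follow both `<a>` and `<frame>` elements. The fallback can be sketched in isolation as below; `Node` and `click_target` are hypothetical names for this note, and the sketch assumes the node exposes its attributes as a plain Hash.

```ruby
# Hypothetical node type: anything exposing a tag name and an attributes hash.
Node = Struct.new(:name, :attributes)

# Pick the URL a "click" on this node should follow.
def click_target(node)
  attrs = node.attributes || {}
  target = attrs['href'] || attrs['src']
  raise ArgumentError, "<#{node.name}> has neither href nor src" if target.nil?

  target
end

puts click_target(Node.new('a', 'href' => '/news'))
puts click_target(Node.new('frame', 'src' => '/menu.html'))
```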
@@ -168,11 +168,10 @@ class Mechanize
     # or
     # agent.post('http://example.com/', [ ["foo", "bar"] ])
     def post(url, query={})
-
-
-      node =
-      node.
-      node.add_attribute('enctype', 'application/x-www-form-urlencoded')
+      node = Hpricot::Elem.new(Hpricot::STag.new('form'))
+      node.attributes = {}
+      node.attributes['method'] = 'POST'
+      node.attributes['enctype'] = 'application/x-www-form-urlencoded'
 
       form = Form.new(node)
       query.each { |k,v|
@@ -246,7 +245,7 @@ class Mechanize
     end
 
     def post_form(url, form)
-      cur_page = current_page
+      cur_page = current_page || Page.new(nil, {'content-type'=>'text/html'})
 
       request_data = form.request_data
 
@@ -279,7 +278,7 @@ class Mechanize
 
       log.info("#{ request.class }: #{ uri.to_s }") if log
 
-      page =
+      page = nil
 
       http_obj = Net::HTTP.new( uri.host,
                                 uri.port,
@@ -323,7 +322,7 @@ class Mechanize
       # Add User-Agent header to request
       request.add_field('User-Agent', @user_agent) if @user_agent
 
-      request.basic_auth(@user, @password) if @user
+      request.basic_auth(@user, @password) if @user || @password
 
       # Log specified headers for the request
       if log
@@ -348,7 +347,7 @@ class Mechanize
       (response.get_fields('Set-Cookie')||[]).each do |cookie|
         Cookie::parse(uri, cookie) { |c|
           log.debug("saved cookie: #{c}") if log
-          @cookie_jar.add(c)
+          @cookie_jar.add(uri, c)
         }
       end
 
data/lib/mechanize/cookie.rb
CHANGED
@@ -1,69 +1,48 @@
 require 'yaml'
 require 'time'
+require 'webrick/cookie'
 
 module WWW
   class Mechanize
     # This class is used to represent an HTTP Cookie.
-    class Cookie
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-      next unless name
-
-      name.strip!
-
-      # Set the cookie to invalid if the domain is incorrect
-      case name.downcase
-      when 'path'
-        cookie[:path] = value
+    class Cookie < WEBrick::Cookie
+      def self.parse(uri, str)
+        cookies = []
+        str.gsub(/(,([^;,]*=)|,$)/) { "\r\n#{$2}" }.split(/\r\n/).each { |c|
+          cookie_elem = c.split(/;/)
+          first_elem = cookie_elem.shift
+          first_elem.strip!
+          key, value = first_elem.split(/=/, 2)
+          cookie = new(key, WEBrick::HTTPUtils.dequote(value))
+          cookie_elem.each{|pair|
+            pair.strip!
+            key, value = pair.split(/=/, 2)
+            if value
+              value = WEBrick::HTTPUtils.dequote(value.strip)
+            end
+            case key.downcase
+            when "domain" then cookie.domain = value.sub(/^\./, '')
+            when "path" then cookie.path = value
             when 'expires'
-        cookie
+              cookie.expires = begin
                 Time::parse(value)
               rescue
                 Time.now
               end
-      when
-
-      when
-
-
-        # Reject cookies not for this domain
-        # TODO Move the logic to reject based on host to the jar
-        unless uri.host =~ /#{cookie[:domain]}$/
-          valid_cookie = false
-        end
-      when 'httponly'
-        # do nothing
-        # http://msdn.microsoft.com/workshop/author/dhtml/httponly_cookies.asp
-      else
-        cookie[:name] = name
-        cookie[:value] = value
+            when "max-age" then cookie.max_age = Integer(value)
+            when "comment" then cookie.comment = value
+            when "version" then cookie.version = Integer(value)
+            when "secure" then cookie.secure = true
             end
-
-
-
-
-
-      cookie
-      cookie
-
-
-      yield Cookie.new(cookie)
-      end
+          }
+          cookie.path ||= uri.path
+          cookie.secure ||= false
+          cookie.domain ||= uri.host
+          # Move this in to the cookie jar
+          yield cookie if block_given?
+          cookies << cookie
+        }
+        return cookies
       end
 
       def to_s
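The new `Cookie.parse` splits one `Set-Cookie` header into individual cookies, then splits each cookie into a `name=value` pair followed by attributes. A self-contained sketch of that flow is shown below. It is an approximation written for this note, not the method from the diff: `SketchCookie` and `parse_set_cookie` are hypothetical names, values are not dequoted, only four attributes are handled, and an unparsable `expires` becomes `nil` where the diff falls back to `Time.now`.

```ruby
require 'time'

# Hypothetical value object standing in for WEBrick::Cookie.
SketchCookie = Struct.new(:name, :value, :domain, :path, :expires, :secure)

def parse_set_cookie(header, default_host, default_path = '/')
  # Split only on commas that begin a new "name=" pair, so the comma inside
  # an expires date ("Mon, 01 Jan 2007 ...") does not end the cookie early.
  header.split(/,(?=[^;,]*=)/).map do |chunk|
    first, *attrs = chunk.split(';').map(&:strip)
    name, value = first.split('=', 2)
    cookie = SketchCookie.new(name, value, nil, nil, nil, false)

    attrs.each do |pair|
      next if pair.empty?

      key, val = pair.split('=', 2)
      case key.downcase
      when 'domain' then cookie.domain = val.to_s.sub(/\A\./, '')
      when 'path'   then cookie.path = val
      when 'secure' then cookie.secure = true
      when 'expires'
        cookie.expires = begin
          Time.parse(val.to_s)
        rescue ArgumentError
          nil
        end
      end
    end

    # Fall back to the request's host and path, as the diff does with uri.
    cookie.domain ||= default_host
    cookie.path   ||= default_path
    cookie
  end
end

header = 'a=1; path=/x; secure, b=2; domain=.example.com; ' \
         'expires=Mon, 01 Jan 2007 00:00:00 GMT'
parse_set_cookie(header, 'www.example.com').each do |cookie|
  puts "#{cookie.name}=#{cookie.value} domain=#{cookie.domain} path=#{cookie.path}"
end
```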
@@ -81,7 +60,8 @@ module WWW
       end
 
       # Add a cookie to the Jar.
-      def add(cookie)
+      def add(uri, cookie)
+        return unless uri.host =~ /#{cookie.domain}$/
         unless @jar.has_key?(cookie.domain)
           @jar[cookie.domain] = Hash.new
         end
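The guard added to `add` interpolates the cookie domain straight into a regular expression, so `.` matches any character and the match is not tied to a label boundary; for example, `'badexample.com' =~ /example.com$/` succeeds. A stricter suffix check could look like the sketch below. This is an alternative written for illustration, not what the diff does, and `domain_match?` is a hypothetical name.

```ruby
# Hypothetical stricter check: the host must equal the cookie domain or end
# with "." followed by it, compared case-insensitively.
def domain_match?(host, cookie_domain)
  host   = host.to_s.downcase
  domain = cookie_domain.to_s.downcase.sub(/\A\./, '')
  return false if domain.empty?

  host == domain || host.end_with?(".#{domain}")
end

puts domain_match?('www.example.com', 'example.com')
puts domain_match?('badexample.com', 'example.com')
```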
data/lib/mechanize/form.rb
CHANGED
@@ -1,5 +1,3 @@
-require 'mime/types'
-
 module WWW
   class Mechanize
     # =Synopsis
@@ -26,12 +24,13 @@ module WWW
       attr_reader :form_node, :elements_node
       attr_accessor :method, :action, :name
 
-
+      attr_reader :fields, :buttons, :file_uploads, :radiobuttons, :checkboxes
       attr_reader :enctype
 
       def initialize(form_node, elements_node)
         @form_node, @elements_node = form_node, elements_node
 
+        @form_node.attributes ||= {}
         @method = (@form_node.attributes['method'] || 'GET').upcase
         @action = @form_node.attributes['action']
         @name = @form_node.attributes['name']
@@ -41,22 +40,6 @@ module WWW
         parse
       end
 
-      # In the case of malformed HTML, fields of multiple forms might occure in this forms'
-      # field array. If the fields have the same name, posterior fields overwrite former fields.
-      # To avoid this, this method rejects all posterior duplicate fields.
-
-      def uniq_fields!
-        names_in = {}
-        fields.reject! {|f|
-          if names_in.include?(f.name)
-            true
-          else
-            names_in[f.name] = true
-            false
-          end
-        }
-      end
-
       # This method builds an array of arrays that represent the query
       # parameters to be used with this form. The return value can then
       # be used to create a query string for this form.
@@ -130,38 +113,45 @@ module WWW
         @radiobuttons = WWW::Mechanize::List.new
         @checkboxes = WWW::Mechanize::List.new
 
-
+        # Find all input tags
+        (@elements_node/'input').each do |node|
+          node.attributes ||= {}
           type = (node.attributes['type'] || 'text').downcase
+          name = node.attributes['name']
+          next if type != 'submit' && name.nil?
+          case type
+          when 'text', 'password', 'hidden', 'int'
+            @fields << Field.new(node.attributes['name'], node.attributes['value'] || '')
+          when 'radio'
+            @radiobuttons << RadioButton.new(node.attributes['name'], node.attributes['value'], node.attributes.has_key?('checked'), self)
+          when 'checkbox'
+            @checkboxes << CheckBox.new(node.attributes['name'], node.attributes['value'], node.attributes.has_key?('checked'), self)
+          when 'file'
+            @file_uploads << FileUpload.new(node.attributes['name'], nil)
+          when 'submit'
+            @buttons << Button.new(node.attributes['name'], node.attributes['value'])
+          when 'image'
+            @buttons << ImageButton.new(node.attributes['name'], node.attributes['value'])
+          end
+        end
 
-
-
+        # Find all textarea tags
+        (@elements_node/'textarea').each do |node|
+          next if node.attributes.nil?
+          next if node.attributes['name'].nil?
+          @fields << Field.new(node.attributes['name'], node.all_text)
+        end
 
-
-
-
-
-
-
-
-
-          @checkboxes << CheckBox.new(node.attributes['name'], node.attributes['value'], node.attributes.has_key?('checked'), self)
-        when 'file'
-          @file_uploads << FileUpload.new(node.attributes['name'], nil)
-        when 'submit'
-          @buttons << Button.new(node.attributes['name'], node.attributes['value'])
-        when 'image'
-          @buttons << ImageButton.new(node.attributes['name'], node.attributes['value'])
-        end
-      when 'textarea'
-        @fields << Field.new(node.attributes['name'], node.all_text)
-      when 'select'
-        if node.attributes.has_key? 'multiple'
-          @fields << MultiSelectList.new(node.attributes['name'], node)
-        else
-          @fields << SelectList.new(node.attributes['name'], node)
-        end
+        # Find all select tags
+        (@elements_node/'select').each do |node|
+          next if node.attributes.nil?
+          next if node.attributes['name'].nil?
+          if node.attributes.has_key? 'multiple'
+            @fields << MultiSelectList.new(node.attributes['name'], node)
+          else
+            @fields << SelectList.new(node.attributes['name'], node)
           end
-
+        end
       end
 
       def rand_string(len = 10)
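The rewritten `parse` walks every `input` element, defaults a missing `type` to `text`, skips unnamed inputs unless they are submit buttons, and sorts the rest into per-kind lists. The classification step alone can be sketched with plain hashes, as below. `classify_inputs` is a hypothetical helper for this note; it takes attribute hashes rather than hpricot nodes and stores the hashes themselves instead of building `Field`, `Button` and similar objects.

```ruby
# Sort attribute hashes for <input> elements into buckets by input type.
def classify_inputs(inputs)
  buckets = Hash.new { |hash, key| hash[key] = [] }

  inputs.each do |attrs|
    attrs ||= {}                                   # element without attributes
    type = (attrs['type'] || 'text').downcase      # missing type means text
    name = attrs['name']
    next if type != 'submit' && name.nil?          # unnamed inputs send nothing

    bucket =
      case type
      when 'text', 'password', 'hidden', 'int' then :fields
      when 'radio'                             then :radiobuttons
      when 'checkbox'                          then :checkboxes
      when 'file'                              then :file_uploads
      when 'submit', 'image'                   then :buttons
      end
    buckets[bucket] << attrs if bucket             # unknown types are ignored
  end

  buckets
end

result = classify_inputs([
  { 'name' => 'q' },
  { 'type' => 'SUBMIT' },
  { 'type' => 'radio', 'name' => 'size', 'value' => 'L' },
  nil
])
result.each { |kind, items| puts "#{kind}: #{items.length}" }
```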
@@ -189,7 +179,8 @@ module WWW
 
           if file.file_data.nil? and ! file.file_name.nil?
             file.file_data = ::File.open(file.file_name, "rb") { |f| f.read }
-            file.mime_type =
+            file.mime_type = WEBrick::HTTPUtils.mime_type(file.file_name,
+              WEBrick::HTTPUtils::DefaultMimeTypes)
           end
 
           if file.mime_type != nil
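With the `mime/types` dependency gone, the upload's content type is now looked up from the file name's extension in WEBrick's default table. The lookup itself amounts to an extension-to-type map with a generic fallback, sketched below without WEBrick so it runs on a bare Ruby install. `MIME_BY_EXTENSION` and `guess_mime_type` are hypothetical names, and the table is a tiny made-up subset rather than WEBrick's `DefaultMimeTypes`.

```ruby
# Hypothetical, deliberately small extension table.
MIME_BY_EXTENSION = {
  'jpg'  => 'image/jpeg',
  'jpeg' => 'image/jpeg',
  'png'  => 'image/png',
  'txt'  => 'text/plain',
  'html' => 'text/html'
}.freeze

# Guess a MIME type from the file name, falling back to a generic binary type.
def guess_mime_type(file_name, table = MIME_BY_EXTENSION)
  ext = File.extname(file_name.to_s).delete_prefix('.').downcase
  table.fetch(ext, 'application/octet-stream')
end

puts guess_mime_type('somefile.jpg')
puts guess_mime_type('README')
```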