mechanize 0.6.0 → 0.6.1


data/CHANGELOG CHANGED
@@ -1,5 +1,23 @@
  = Mechanize CHANGELOG
 
+ == 0.6.1
+
+ * Added a method to Form called "submit". Now forms can be submitted by
+ calling a method on the form.
+ * Added a click method to links
+ * Added an REXML pluggable parser for backwards compatability. To use it,
+ just do this:
+ agent.pluggable_parser.html = WWW::Mechanize::REXMLPage
+ * Fixed a bug with referrers by adding a page attribute to forms and links.
+ * Fixed a bug where domain names were case sensitive.
+ http://tenderlovemaking.com/2006/09/04/road-to-ruby-mechanize-060/#comment-53
+ * Fixed a bug with URI escaped links.
+ http://rubyforge.org/pipermail/mechanize-users/2006-September/000002.html
+ * Fixed a bug when options in select lists don't have a value. Thanks Dan Higham
+ [#5837] Code in lib/mechanize/form_elements.rb is incorrect.
+ * Fixed a bug with loading text in to links.
+ http://rubyforge.org/pipermail/mechanize-users/2006-September/000000.html
+
  == 0.6.0
 
  * Changed main parser to use hpricot
data/NOTES CHANGED
@@ -1,5 +1,19 @@
  = Mechanize Release Notes
 
+ == 0.6.1 (Chuck)
+
+ Mechanize version 0.6.1 (Chuck) is done, and is ready for you to use. This
+ post "my trip to europe" release includes many bug fixes and a handful of
+ new features.
+
+ New features include, a submit method on forms, a click method on links, and an
+ REXML pluggable parser. Now you can submit a form just by calling a method on
+ the form, rather than passing the form to the submit method on the mech object.
+ The click method on links lets you click the link by calling a method on the
+ link rather than passing the link to the click method on the mech object.
+ Lastly, the REXML pluggable parser lets you use your pre-0.6.0 code with
+ 0.6.1. See the CHANGELOG for more details.
+
  == 0.6.0 (Rufus)
 
  WWW::Mechanize 0.6.0 aka Rufus is ready! This hpricot flavored pie has
data/lib/mechanize.rb CHANGED
@@ -129,8 +129,9 @@ class Mechanize
      end
 
      # Fetches the URL passed in and returns a page.
-     def get(url)
-       cur_page = current_page || Page.new( nil, {'content-type'=>'text/html'})
+     def get(url, referer=nil)
+       cur_page = referer || current_page ||
+         Page.new( nil, {'content-type'=>'text/html'})
 
        # fetch the page
        abs_uri = to_absolute_uri(url, cur_page)
@@ -152,7 +153,13 @@ class Mechanize
        uri = to_absolute_uri(
          link.attributes['href'] || link.attributes['src'] || link.href
        )
-       get(uri)
+       referer =
+         begin
+           link.page
+         rescue
+           nil
+         end
+       get(uri, referer)
      end
 
      # Equivalent to the browser back button. Returns the most recent page
@@ -233,7 +240,10 @@ class Mechanize
      private
 
      def to_absolute_uri(url, cur_page=current_page())
-       url = URI.parse(URI.escape(url.to_s.strip)) unless url.is_a? URI
+       url = URI.parse(
+         URI.escape(
+           URI.unescape(url.to_s.strip)
+         )) unless url.is_a? URI
 
        # construct an absolute uri
        if url.relative?
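
Note on the hunk above: unescaping before escaping keeps an href that is already percent-encoded, such as `link%20with%20space.html`, from being encoded a second time. The following is only a minimal sketch of that idea. `URI.escape` and `URI.unescape` were removed from later Ruby versions, so the sketch assumes `URI::RFC2396_Parser` as a stand-in, and `normalize_href` is a hypothetical helper name that does not appear in mechanize.

```ruby
require 'uri'

# Stand-in for the removed URI.escape / URI.unescape (an assumption of this
# sketch; mechanize 0.6.1 itself calls the old module-level methods).
PARSER = URI::RFC2396_Parser.new

# Hypothetical helper: decode first, then encode, so an href that is already
# percent-encoded is not encoded again.
def normalize_href(href)
  PARSER.escape(PARSER.unescape(href.to_s.strip))
end

# Escaping alone re-encodes the existing "%" characters.
puts PARSER.escape('link%20with%20space.html')
# Decoding first leaves the encoded href unchanged.
puts normalize_href('link%20with%20space.html')
# An href with literal spaces ends up in the same encoded form.
puts normalize_href('link with space.html')
```

Both hrefs from the new tc_links.html fixture normalize to the same request path under this approach.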
@@ -245,7 +255,8 @@ class Mechanize
      end
 
      def post_form(url, form)
-       cur_page = current_page || Page.new(nil, {'content-type'=>'text/html'})
+       cur_page = form.page || current_page ||
+         Page.new( nil, {'content-type'=>'text/html'})
 
        request_data = form.request_data
 
@@ -380,6 +391,8 @@ class Mechanize
          response.code
        )
 
+       page.mech = self if page.respond_to? :mech=
+
        log.info("status: #{ page.code }") if log
 
        if page.respond_to? :watch_for_set
@@ -7,8 +7,7 @@ module WWW
      # This class is used to represent an HTTP Cookie.
      class Cookie < WEBrick::Cookie
        def self.parse(uri, str)
-         cookies = []
-         str.gsub(/(,([^;,]*=)|,$)/) { "\r\n#{$2}" }.split(/\r\n/).each { |c|
+         return str.split(/,(?=[^;,]*=)|,$/).collect { |c|
            cookie_elem = c.split(/;/)
            first_elem = cookie_elem.shift
            first_elem.strip!
@@ -40,9 +39,7 @@ module WWW
            cookie.domain ||= uri.host
            # Move this in to the cookie jar
            yield cookie if block_given?
-           cookies << cookie
          }
-         return cookies
        end
 
        def to_s
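
Note on the cookie parsing hunks above: the new `split` pattern uses a lookahead so that a comma only separates cookies when the text after it looks like the start of another `name=value` pair, which leaves the comma inside an `expires` date alone. A small sketch of that behaviour, using the regular expression from the diff; `split_set_cookie` and the sample header are made up for illustration.

```ruby
# Pattern taken from the hunk above: split on a comma only when what follows
# looks like another "name=value" pair, or when the comma ends the header.
COOKIE_SEPARATOR = /,(?=[^;,]*=)|,$/

# Hypothetical helper name, used for illustration only.
def split_set_cookie(header)
  header.split(COOKIE_SEPARATOR).map(&:strip)
end

# Sample header (invented): two cookies, the first with a date containing a comma.
header = 'sid=abc; expires=Wed, 09 Jun 2021 10:18:14 GMT; path=/, theme=dark; path=/'
split_set_cookie(header).each { |cookie| puts cookie }
```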
@@ -61,12 +58,13 @@ module WWW
 
        # Add a cookie to the Jar.
        def add(uri, cookie)
-         return unless uri.host =~ /#{cookie.domain}$/
-         unless @jar.has_key?(cookie.domain)
-           @jar[cookie.domain] = Hash.new
+         return unless uri.host =~ /#{cookie.domain}$/i
+         normal_domain = cookie.domain.downcase
+         unless @jar.has_key?(normal_domain)
+           @jar[normal_domain] = Hash.new
          end
 
-         @jar[cookie.domain][cookie.name] = cookie
+         @jar[normal_domain][cookie.name] = cookie
          cleanup()
          cookie
        end
@@ -77,7 +75,7 @@ module WWW
          cookies = []
          url.path = '/' if url.path.empty?
          @jar.each_key do |domain|
-           if url.host =~ /#{domain}$/
+           if url.host =~ /#{domain}$/i
              @jar[domain].each_key do |name|
                if url.path =~ /^#{@jar[domain][name].path}/
                  if @jar[domain][name].expires.nil?
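
Note on the cookie jar hunks above: the fix has two parts, matching the host against the domain case-insensitively and storing cookies under a downcased domain key. A stripped-down sketch of that combination follows. `TinyJar` is a hypothetical class, not the gem's `CookieJar`, and unlike the diff it passes the domain through `Regexp.escape` and anchors with `\z`, which is a deliberate deviation.

```ruby
# Hypothetical miniature jar; it stores plain values instead of Cookie objects.
class TinyJar
  def initialize
    @jar = Hash.new { |hash, domain| hash[domain] = {} }
  end

  # Store the value under the downcased domain so "RuByForge.Org" and
  # "rubyforge.org" share one bucket.
  def add(host, domain, name, value)
    return unless domain_match?(host, domain)
    @jar[domain.downcase][name] = value
  end

  # Return every stored value whose domain matches the host, ignoring case.
  def cookies(host)
    @jar.select { |domain, _| domain_match?(host, domain) }
        .values.flat_map(&:values)
  end

  private

  # Deviation from the diff: the domain is escaped before interpolation.
  def domain_match?(host, domain)
    !!(host =~ /#{Regexp.escape(domain)}\z/i)
  end
end

jar = TinyJar.new
jar.add('rubyforge.org', 'rubyforge.org', 'Foo', 'Bar')
jar.add('rubyforge.org', 'RuByForge.Org', 'aaron', 'Bar')
puts jar.cookies('RuByFoRgE.oRg').length
```

This mirrors the scenario exercised by `test_domain_case` later in the diff.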
@@ -212,10 +212,12 @@ module WWW
      # puts form['name']
      class Form < GlobalForm
        attr_reader :node
+       attr_reader :page
 
-       def initialize(node)
-         @node = node
-         super(@node, @node)
+       def initialize(node, mech=nil, page=nil)
+         super(node, node)
+         @page = page
+         @mech = mech
        end
 
        # Fetch the first field whose name is equal to field_name
@@ -268,6 +270,11 @@ module WWW
          end
          super
        end
+
+       # Submit this form with the button passed in
+       def submit(button=nil)
+         @mech.submit(self, button)
+       end
      end
    end
  end
@@ -213,6 +213,7 @@ module WWW
        alias :selected? :selected
 
        def initialize(node, select_list)
+         node.attributes ||= {}
          @text = node.all_text
          @value = node.attributes['value']
          @selected = node.attributes.has_key?('selected') ? true : false
@@ -6,6 +6,9 @@ class Hpricot::Elem
        if child.respond_to? :content
          text << child.content
        end
+       if child.respond_to? :all_text
+         text << child.all_text
+       end
      end
      text
    end
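
Note on the hunk above: recursing into children that respond to `all_text` is what lets a link such as `Aaron <b>James</b> Patterson` report its full text. A self-contained sketch of the same recursion over a toy tree follows; `Node` is a hypothetical struct standing in for Hpricot elements, and plain strings stand in for text nodes.

```ruby
# Hypothetical stand-in for a parsed element: a tag name plus children, where
# each child is either another Node or a plain String of text.
Node = Struct.new(:name, :children) do
  # Concatenate the text of this node and of every nested node, in order.
  def all_text
    children.map { |child| child.is_a?(Node) ? child.all_text : child.to_s }.join
  end
end

link = Node.new('a', ['Aaron ', Node.new('b', ['James']), ' Patterson'])
puts link.all_text
```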
@@ -54,12 +54,16 @@ module WWW
        alias :and :with
 
        def method_missing(meth_sym, *args)
-         return first.send(meth_sym) if args.empty?
-         arg = args.first
-         if arg.class == Regexp
-           WWW::Mechanize::List.new(find_all { |e| e.send(meth_sym) =~ arg })
+         if length > 0
+           return first.send(meth_sym) if args.empty?
+           arg = args.first
+           if arg.class == Regexp
+             WWW::Mechanize::List.new(find_all { |e| e.send(meth_sym) =~ arg })
+           else
+             WWW::Mechanize::List.new(find_all { |e| e.send(meth_sym) == arg })
+           end
          else
-           WWW::Mechanize::List.new(find_all { |e| e.send(meth_sym) == arg })
+           ''
          end
        end
      end
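
Note on the hunk above: the list forwards unknown methods to its elements, filtering by string equality or regular expression when an argument is given, and now returns an empty string when the list is empty. A minimal sketch of that dispatch follows. `FilterList` and the `Link` struct are hypothetical stand-ins rather than the gem's classes, and the `respond_to_missing?` override is an addition of this sketch.

```ruby
# Hypothetical stand-in for a parsed link.
Link = Struct.new(:text, :href)

# Hypothetical list: unknown methods are forwarded to the elements.
class FilterList < Array
  def method_missing(name, *args)
    # An empty list answers '' for any forwarded call, as in the new code path.
    return '' if empty?
    # No argument: forward the call to the first element.
    return first.send(name) if args.empty?

    wanted = args.first
    matches = select do |element|
      value = element.send(name)
      wanted.is_a?(Regexp) ? value =~ wanted : value == wanted
    end
    FilterList.new(matches)
  end

  # Not in the diff: keeps respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    (!empty? && first.respond_to?(name)) || super
  end
end

links = FilterList.new([Link.new('Home', '/'), Link.new('Form Test', '/form_test.html')])
puts links.text('Form Test').first.href
puts links.text(/home/i).length
```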
@@ -1,5 +1,5 @@
  module WWW
    class Mechanize
-     Version = '0.6.0'
+     Version = '0.6.1'
    end
  end
@@ -17,16 +17,23 @@ module WWW
      class Page < File
        attr_reader :root, :title, :watch_for_set
        attr_reader :frames, :iframes, :links, :forms, :meta, :watches
+       attr_accessor :mech
 
-       def initialize(uri=nil, response=nil, body=nil, code=nil)
+       def initialize(uri=nil, response=nil, body=nil, code=nil, mech=nil)
          super(uri, response, body, code)
          @watch_for_set = {}
+         @mech = mech
 
          yield self if block_given?
 
          raise Mechanize::ContentTypeError.new(response['content-type']) unless
            content_type() =~ /^text\/html/
-         parse_html if body && response
+
+         # construct parser and feed with HTML
+         if body && response
+           @root = Hpricot.parse(body)
+           parse_html
+         end
        end
 
        # Get the response header
@@ -62,9 +69,6 @@ module WWW
        private
 
        def parse_html
-         # construct parser and feed with HTML
-         @root = Hpricot.parse(@body)
-
          @forms = WWW::Mechanize::List.new
          @links = WWW::Mechanize::List.new
          @meta = WWW::Mechanize::List.new
@@ -79,14 +83,14 @@ module WWW
 
          # Find all the form tags
          (@root/'form').each do |html_form|
-           form = Form.new(html_form)
+           form = Form.new(html_form, @mech, self)
            form.action ||= @uri
            @forms << form
          end
 
          # Find all the 'a' tags
          (@root/'a').each do |node|
-           @links << Link.new(node)
+           @links << Link.new(node, @mech, self)
          end
 
          # Find all 'meta' tags
@@ -99,19 +103,19 @@ module WWW
            if equiv != nil && equiv.downcase == 'refresh'
              if content != nil && content =~ /^\d+\s*;\s*url\s*=\s*(\S+)/i
                node.attributes['href'] = $1
-               @meta << Meta.new(node)
+               @meta << Meta.new(node, @mech, self)
              end
            end
          end
 
          # Find all 'frame' tags
          (@root/'frame').each do |node|
-           @frames << Frame.new(node)
+           @frames << Frame.new(node, @mech, self)
          end
 
          # Find all 'iframe' tags
          (@root/'iframe').each do |node|
-           @iframes << Frame.new(node)
+           @iframes << Frame.new(node, @mech, self)
          end
 
          # Find all watch tags
@@ -13,13 +13,16 @@ module WWW
        attr_reader :href
        attr_reader :text
        attr_reader :attributes
+       attr_reader :page
        alias :to_s :text
 
-       def initialize(node)
+       def initialize(node, mech, page)
          node.attributes ||= {}
          @node = node
          @href = node.attributes['href']
          @text = node.all_text
+         @page = page
+         @mech = mech
          @attributes = node.attributes
 
          # If there is no text, try to find an image and use it's alt text
@@ -36,6 +39,11 @@ module WWW
        def uri
          URI.parse(@href)
        end
+
+       # Click on this link
+       def click
+         @mech.click self
+       end
      end
 
      # This class encapsulates a Meta tag. Mechanize treats meta tags just
@@ -53,7 +61,8 @@ module WWW
        alias :src :href
        alias :name :text
 
-       def initialize(node)
+       def initialize(node, mech, referer)
+         super(node, mech, referer)
          node.attributes ||= {}
          @node = node
          @text = node.attributes['name']
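
Note on the page and link hunks above: each link now remembers both the agent and the page it came from, so `click` can delegate to the agent and the agent can use the link's own page as the referer instead of whatever page was fetched last. The sketch below shows that flow with stand-in classes. `Agent`, `Page` and `Link` here are hypothetical, no network access takes place, and the fake `get` only records which page acted as the referer.

```ruby
# Hypothetical page record: its URI, the URI of its referer, and its links.
Page = Struct.new(:uri, :referer, :links)

# Hypothetical link that knows its agent and its originating page.
class Link
  attr_reader :href, :page

  def initialize(href, mech, page)
    @href = href
    @mech = mech
    @page = page
  end

  # Delegate to the agent, in the spirit of the click method added above.
  def click
    @mech.click(self)
  end
end

# Hypothetical agent with a fake fetch.
class Agent
  attr_reader :history

  def initialize
    @history = []
  end

  # Fall back to the most recent page only when no referer is given.
  def get(uri, referer = nil)
    referer ||= @history.last
    page = Page.new(uri, referer && referer.uri, [])
    @history << page
    page
  end

  # Use the page the link was found on as the referer.
  def click(link)
    get(link.href, link.page)
  end
end

agent = Agent.new
first = agent.get('/tc_referer.html')
first.links << Link.new('/referer', agent, first)
agent.get('/tc_pretty_print.html')
puts first.links.first.click.referer
```

Even though another page was fetched in between, the referer recorded for the click is the page that owns the link, which is the situation `test_fetch_two` checks later in the diff.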
@@ -0,0 +1,37 @@
+ require 'web/htmltools/xmltree'
+ require 'mechanize/rexml'
+
+ class WWW::Mechanize::REXMLPage < WWW::Mechanize::Page
+   def initialize(uri=nil, response=nil, body=nil, code=nil, mech=nil)
+     super(uri, response, body, code)
+     @watch_for_set = {}
+     @mech = mech
+
+     yield self if block_given?
+
+     raise Mechanize::ContentTypeError.new(response['content-type']) unless
+       content_type() =~ /^text\/html/
+
+     # construct parser and feed with HTML
+     parser = HTMLTree::XMLParser.new
+     begin
+       parser.feed(@body)
+     rescue => ex
+       if ex.message =~ /attempted adding second root element to document/ and
+         # Put the whole document inside a single root element, which I
+         # simply name <root>, just to make the parser happy. It's no
+         #longer valid HTML, but without a single root element, it's not
+         # valid HTML as well.
+
+         # TODO: leave a possible doctype definition outside this element.
+         parser = HTMLTree::XMLParser.new
+         parser.feed("<root>" + @body + "</root>")
+       else
+         raise
+       end
+     end
+
+     @root = parser.document
+     parse_html if body && response
+   end
+ end
@@ -0,0 +1,236 @@
1
+ #
2
+ # Copyright (c) 2005 by Michael Neumann (mneumann@ntecs.de).
3
+ # Released under the same terms of license as Ruby.
4
+ #
5
+
6
+ require 'rexml/rexml'
7
+
8
+ class REXML::Text
9
+ def collect_text_recursively
10
+ value()
11
+ end
12
+ end
13
+
14
+ class REXML::Comment
15
+ def collect_text_recursively
16
+ []
17
+ end
18
+ end
19
+
20
+ module REXML::Node
21
+
22
+ # Aliasing functions to get rid of warnings. Remove when support for 1.8.2
23
+ # is dropped.
24
+ if RUBY_VERSION > "1.8.2"
25
+ alias :old_each_recursive :each_recursive
26
+ alias :old_find_first_recursive :find_first_recursive
27
+ alias :old_index_in_parent :index_in_parent
28
+ end
29
+
30
+ def search(arg)
31
+ list = WWW::Mechanize::List.new
32
+ each_recursive { |n|
33
+ list << n if n.name.downcase == arg
34
+ }
35
+ list
36
+ end
37
+
38
+ alias :/ :search
39
+
40
+ # Visit all subnodes of +self+ recursively
41
+
42
+ def each_recursive(&block) # :yields: node
43
+ self.elements.each {|node|
44
+ block.call(node)
45
+ node.each_recursive(&block)
46
+ }
47
+ end
48
+
49
+ # Find (and return) first subnode (recursively) for which the block evaluates
50
+ # to true. Returns +nil+ if none was found.
51
+
52
+ def find_first_recursive(&block) # :yields: node
53
+ each_recursive {|node|
54
+ return node if block.call(node)
55
+ }
56
+ return nil
57
+ end
58
+
59
+ # Find all subnodes (recursively) for which the block evaluates to true.
60
+
61
+ def find_all_recursive(&block) # :yields: node
62
+ arr = []
63
+ each_recursive {|node|
64
+ arr << node if block.call(node)
65
+ }
66
+ arr
67
+ end
68
+
69
+ # Returns the index that +self+ has in its parent's elements array, so that
70
+ # the following equation holds true:
71
+ #
72
+ # node == node.parent.elements[node.index_in_parent]
73
+
74
+ def index_in_parent
75
+ parent.index(self)+1
76
+ end
77
+
78
+ # Recursivly collects all text strings starting into an array.
79
+ #
80
+ # E.g. the method would return [["abc"], "def"] for this node:
81
+ #
82
+ # <i><b>abc</b>def</i>
83
+
84
+ def collect_text_recursively
85
+ map {|n| n.collect_text_recursively}
86
+ end
87
+
88
+ # Returns all text of all subnodes (recursivly), merged into one string.
89
+ # This is equivalent to:
90
+ #
91
+ # collect_text_recursively.flatten.join("")
92
+
93
+ def all_text
94
+ collect_text_recursively.flatten.join("")
95
+ end
96
+
97
+ alias :text :all_text
98
+
99
+ end
100
+
101
+ #
102
+ # Starting with +root_node+, we recursively look for a node with the given
103
+ # +tag+, the given +attributes+ (a Hash) and whoose text equals or matches the
104
+ # +text+ string or regular expression.
105
+ #
106
+ # To find the following node:
107
+ #
108
+ # <td class='abc'>text</td>
109
+ #
110
+ # We use:
111
+ #
112
+ # find_node(root, 'td', {'class' => 'abc'}, "text")
113
+ #
114
+ # Returns +nil+ if no matching node was found.
115
+
116
+ def find_node(root_node, tag, attributes, text=nil)
117
+ root_node.find_first_recursive {|node|
118
+ node.name == tag and
119
+ attributes.all? {|attr, val| node.attributes[attr] == val} and
120
+ (text ? text === node.text : true)
121
+ }
122
+ end
123
+
124
+ #
125
+ # Extract specific columns (specified by the position of it's corrensponding
126
+ # header column) from a table.
127
+ #
128
+ # Given the following table:
129
+ #
130
+ # <table>
131
+ # <tr>
132
+ # <td>A</td>
133
+ # <td>B</td>
134
+ # <td>C</td>
135
+ # </tr>
136
+ # <tr>
137
+ # <td>A.1</td>
138
+ # <td>B.1</td>
139
+ # <td>C.1</td>
140
+ # </tr>
141
+ # <tr>
142
+ # <td>A.2</td>
143
+ # <td>B.2</td>
144
+ # <td>C.2</td>
145
+ # </tr>
146
+ # </table>
147
+ #
148
+ # To extract the first (A) and last (C) column:
149
+ #
150
+ # extract_from_table(root_node, ["A", "C"])
151
+ #
152
+ # And you get this as result:
153
+ #
154
+ # [
155
+ # ["A.1", "C.1"],
156
+ # ["A.2", "C.2"]
157
+ # ]
158
+ #
159
+
160
+ def extract_from_table(root_node, headers, header_tags = %w(td th))
161
+
162
+ # extract and collect all header nodes
163
+
164
+ header_nodes = headers.collect { |header|
165
+ root_node.find_first_recursive {|node|
166
+ header_tags.include?(node.name.downcase) and header === node.all_text
167
+ }
168
+ }
169
+
170
+ raise "some headers not found" if header_nodes.compact.size < headers.size
171
+
172
+ # assert that all headers have the same parent 'header_row', which is the row
173
+ # in which the header_nodes are contained. 'table' is the surrounding table tag.
174
+
175
+ header_row = header_nodes.first.parent
176
+ table = header_row.parent
177
+
178
+ raise "different parents" unless header_nodes.all? {|n| n.parent == header_row}
179
+
180
+ # we now iterate over all rows in the table that follows the header_row.
181
+ # for each row we collect the elements at the same positions as the header_nodes.
182
+ # this is what we finally return from the method.
183
+
184
+ (header_row.index_in_parent .. table.elements.size).collect do |inx|
185
+ row = table.elements[inx]
186
+ header_nodes.collect { |n| row.elements[ n.parent.elements.index(n) ].text }
187
+ end
188
+ end
189
+
190
+ # Given a HTML table, this method returns a matrix (2-dim array), with all the
191
+ # table-data elements correctly placed in it.
192
+ #
193
+ # If there's a table data element which uses 'colspan', that node is stored in
194
+ # at the current position of the row followed by (colspan-1) nil values.
195
+ #
196
+ # Example:
197
+ #
198
+ # <table>
199
+ # <tr>
200
+ # <td>A</td>
201
+ # <td>B</td>
202
+ # </tr>
203
+ # <tr>
204
+ # <td colspan="2">C</td>
205
+ # </tr>
206
+ # </table>
207
+ #
208
+ # Result:
209
+ #
210
+ # [
211
+ # [A, B],
212
+ # [C, nil]
213
+ # ]
214
+ #
215
+ # where A, B and C are the corresponding "<td>" nodes.
216
+ #
217
+
218
+ def table_to_matrix(table_node)
219
+ matrix = []
220
+
221
+ # for each row
222
+ table_node.elements.each('tr') {|r|
223
+ row = []
224
+ r.elements.each {|data|
225
+ next unless ['td', 'th'].include?(data.name)
226
+ row << data
227
+
228
+ # fill with empty elements
229
+ colspan = (data.attributes['colspan'] || 1).to_i
230
+ (colspan - 1).times { row << nil }
231
+ }
232
+ matrix << row
233
+ }
234
+
235
+ return matrix
236
+ end
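
Note on `table_to_matrix` above: each cell is placed in its row and followed by `colspan - 1` nil placeholders, so that column positions line up across rows. The sketch below reproduces only that padding step on plain data. `rows_to_matrix` and `pad_row` are hypothetical names, and cells are plain hashes rather than REXML elements.

```ruby
# Hypothetical helper: place each cell, then pad with nils for its colspan.
def pad_row(cells)
  cells.each_with_object([]) do |cell, row|
    row << cell
    colspan = (cell[:colspan] || 1).to_i
    (colspan - 1).times { row << nil }
  end
end

# Hypothetical counterpart of table_to_matrix for rows given as arrays of hashes.
def rows_to_matrix(rows)
  rows.map { |cells| pad_row(cells) }
end

a = { text: 'A' }
b = { text: 'B' }
c = { text: 'C', colspan: '2' }
p rows_to_matrix([[a, b], [c]])
```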
@@ -4,6 +4,7 @@
4
4
  <select name="list">
5
5
  <option value="1">Option 1</option>
6
6
  <option value="2">Option 2</option>
7
+ <option>Option No Value</option>
7
8
  <option value="3">Option 3</option>
8
9
  <option value="4">Option 4</option>
9
10
  <option value="5">Option 5</option>
@@ -0,0 +1,5 @@
1
+ <html>
2
+ <body>
3
+ This is a webpage that has a space in the filename.
4
+ </body>
5
+ </html>
@@ -0,0 +1,15 @@
1
+ <html>
2
+ <body>
3
+ <a href="thing.html"><b>Bold Dude</b></a>
4
+ <a href="thing.html">Dude</a>
5
+ <a href="thing.html">Aaron <b>James</b> Patterson</a>
6
+ <a href="thing.html"><b>Aaron</b> Patterson</a>
7
+ <a href="thing.html">Ruby <b>Rocks!</b></a>
8
+ <!-- Testing a bug with escaped stuff in links:
9
+ http://rubyforge.org/pipermail/mechanize-users/2006-September/000002.html
10
+ -->
11
+ <a href="link%20with%20space.html">encoded space</a>
12
+ <a href="link with space.html">not encoded space</a>
13
+ <!-- End escaped bug -->
14
+ </body>
15
+ </html>
@@ -0,0 +1,10 @@
1
+ <html>
2
+ <body>
3
+ <a href="/referer">Referer Servlet</a>
4
+ <br />
5
+ <form method="post" action="/referer">
6
+ <input type="text" name="first" /></br>
7
+ <input type="submit" value="Submit" />
8
+ </form>
9
+ </body>
10
+ </html>
data/test/server.rb CHANGED
@@ -23,6 +23,7 @@ s.mount("/file_upload", FileUploadTest)
23
23
  s.mount("/bad_content_type", BadContentTypeTest)
24
24
  s.mount("/content_type_test", ContentTypeTest)
25
25
  s.mount("/gzip", GzipServlet)
26
+ s.mount("/referer", RefererServlet)
26
27
 
27
28
  htpasswd = WEBrick::HTTPAuth::Htpasswd.new(base_dir + '/data/htpasswd')
28
29
  auth = WEBrick::HTTPAuth::BasicAuth.new(
data/test/servlets.rb CHANGED
@@ -4,6 +4,18 @@ require 'date'
4
4
  require 'zlib'
5
5
  require 'stringio'
6
6
 
7
+ class RefererServlet < WEBrick::HTTPServlet::AbstractServlet
8
+ def do_GET(req, res)
9
+ res['Content-Type'] = "text/html"
10
+ res.body = req['Referer']
11
+ end
12
+
13
+ def do_POST(req, res)
14
+ res['Content-Type'] = "text/html"
15
+ res.body = req['Referer']
16
+ end
17
+ end
18
+
7
19
  class GzipServlet < WEBrick::HTTPServlet::AbstractServlet
8
20
  def do_GET(req, res)
9
21
  if req['Accept-Encoding'] =~ /gzip/
@@ -15,6 +15,34 @@ class CookieJarTest < Test::Unit::TestCase
15
15
  }
16
16
  c
17
17
  end
18
+
19
+ def test_domain_case
20
+ values = { :name => 'Foo',
21
+ :value => 'Bar',
22
+ :path => '/',
23
+ :expires => Time.now + (10 * 86400),
24
+ :domain => 'rubyforge.org'
25
+ }
26
+ url = URI.parse('http://rubyforge.org/')
27
+
28
+ jar = WWW::Mechanize::CookieJar.new
29
+ assert_equal(0, jar.cookies(url).length)
30
+
31
+ # Add one cookie with an expiration date in the future
32
+ cookie = cookie_from_hash(values)
33
+ jar.add(url, cookie)
34
+ assert_equal(1, jar.cookies(url).length)
35
+
36
+ jar.add(url, cookie_from_hash( values.merge( :domain => 'RuByForge.Org',
37
+ :name => 'aaron'
38
+ ) ) )
39
+
40
+ assert_equal(2, jar.cookies(url).length)
41
+
42
+ url2 = URI.parse('http://RuByFoRgE.oRg/')
43
+ assert_equal(2, jar.cookies(url2).length)
44
+ end
45
+
18
46
  def test_add_future_cookies
19
47
  values = { :name => 'Foo',
20
48
  :value => 'Bar',
data/test/tc_forms.rb CHANGED
@@ -39,6 +39,26 @@ class FormsMechTest < Test::Unit::TestCase
39
39
  assert_not_nil(page.links.text('first:Patterson').first)
40
40
  end
41
41
 
42
+ # Test calling submit on the form object
43
+ def test_submit_on_form
44
+ page = @agent.get("http://localhost:#{PORT}/form_multival.html")
45
+ form = page.forms.name('post_form').first
46
+
47
+ assert_not_nil(form)
48
+ assert_equal(2, form.fields.name('first').length)
49
+
50
+ form.fields.name('first')[0].value = 'Aaron'
51
+ form.fields.name('first')[1].value = 'Patterson'
52
+
53
+ page = form.submit
54
+
55
+ assert_not_nil(page)
56
+
57
+ assert_equal(2, page.links.length)
58
+ assert_not_nil(page.links.text('first:Aaron').first)
59
+ assert_not_nil(page.links.text('first:Patterson').first)
60
+ end
61
+
42
62
  # Test submitting form with two fields of the same name
43
63
  def test_get_multival
44
64
  page = @agent.get("http://localhost:#{PORT}/form_multival.html")
data/test/tc_links.rb CHANGED
@@ -46,4 +46,45 @@ class LinksMechTest < Test::Unit::TestCase
46
46
  assert_equal("http://localhost:#{PORT}/form_test.html",
47
47
  @agent.history.last.uri.to_s)
48
48
  end
49
+
50
+ def test_click_method
51
+ page = @agent.get("http://localhost:#{PORT}/frame_test.html")
52
+ link = page.links.text("Form Test")
53
+ assert_not_nil(link)
54
+ assert_equal('Form Test', link.text)
55
+ page = link.click
56
+ assert_equal("http://localhost:#{PORT}/form_test.html",
57
+ @agent.history.last.uri.to_s)
58
+ end
59
+
60
+ def test_find_bold_link
61
+ page = @agent.get("http://localhost:#{PORT}/tc_links.html")
62
+ link = page.links.text(/Bold Dude/)
63
+ assert_equal(1, link.length)
64
+ assert_equal('Bold Dude', link.first.text)
65
+
66
+ link = page.links.text('Aaron James Patterson')
67
+ assert_equal(1, link.length)
68
+ assert_equal('Aaron James Patterson', link.first.text)
69
+
70
+ link = page.links.text('Aaron Patterson')
71
+ assert_equal(1, link.length)
72
+ assert_equal('Aaron Patterson', link.first.text)
73
+
74
+ link = page.links.text('Ruby Rocks!')
75
+ assert_equal(1, link.length)
76
+ assert_equal('Ruby Rocks!', link.first.text)
77
+ end
78
+
79
+ def test_link_with_encoded_space
80
+ page = @agent.get("http://localhost:#{PORT}/tc_links.html")
81
+ link = page.links.text('encoded space').first
82
+ page = @agent.click link
83
+ end
84
+
85
+ def test_link_with_space
86
+ page = @agent.get("http://localhost:#{PORT}/tc_links.html")
87
+ link = page.links.text('not encoded space').first
88
+ page = @agent.click link
89
+ end
49
90
  end
@@ -0,0 +1,46 @@
1
+ $:.unshift File.join(File.dirname(__FILE__), "..", "lib")
2
+
3
+ require 'test/unit'
4
+ require 'rubygems'
5
+ require 'mechanize'
6
+ require 'test_includes'
7
+
8
+ class RefererTest < Test::Unit::TestCase
9
+ include TestMethods
10
+
11
+ def setup
12
+ @agent = WWW::Mechanize.new
13
+ end
14
+
15
+ def test_no_referer
16
+ page = @agent.get("http://localhost:#{PORT}/referer")
17
+ assert_equal('', page.body)
18
+ end
19
+
20
+ def test_send_referer
21
+ page = @agent.get("http://localhost:#{PORT}/tc_referer.html")
22
+ page = @agent.click page.links.first
23
+ assert_equal("http://localhost:#{PORT}/tc_referer.html", page.body)
24
+ end
25
+
26
+ def test_fetch_two
27
+ page1 = @agent.get("http://localhost:#{PORT}/tc_referer.html")
28
+ page2 = @agent.get("http://localhost:#{PORT}/tc_pretty_print.html")
29
+ page = @agent.click page1.links.first
30
+ assert_equal("http://localhost:#{PORT}/tc_referer.html", page.body)
31
+ end
32
+
33
+ def test_fetch_two_first
34
+ page1 = @agent.get("http://localhost:#{PORT}/tc_referer.html")
35
+ page2 = @agent.get("http://localhost:#{PORT}/tc_pretty_print.html")
36
+ page = @agent.click page1.links
37
+ assert_equal("http://localhost:#{PORT}/tc_referer.html", page.body)
38
+ end
39
+
40
+ def test_post_form
41
+ page1 = @agent.get("http://localhost:#{PORT}/tc_referer.html")
42
+ page2 = @agent.get("http://localhost:#{PORT}/tc_pretty_print.html")
43
+ page = @agent.submit page1.forms.first
44
+ assert_equal("http://localhost:#{PORT}/tc_referer.html", page.body)
45
+ end
46
+ end
data/test/ts_mech.rb CHANGED
@@ -40,6 +40,7 @@ require 'tc_pretty_print'
40
40
  require 'tc_textarea'
41
41
  require 'tc_no_attributes'
42
42
  require 'tc_gzipping'
43
+ require 'tc_referer'
43
44
  #require 'tc_proxy'
44
45
  #require 'tc_ssl_server'
45
46
 
metadata CHANGED
@@ -3,11 +3,11 @@ rubygems_version: 0.9.0
3
3
  specification_version: 1
4
4
  name: mechanize
5
5
  version: !ruby/object:Gem::Version
6
- version: 0.6.0
7
- date: 2006-09-06 00:00:00 -07:00
6
+ version: 0.6.1
7
+ date: 2006-09-23 00:00:00 -07:00
8
8
  summary: Mechanize provides automated web-browsing
9
9
  require_paths:
10
- - lib
10
+ - lib
11
11
  email: aaronp@rubyforge.org
12
12
  homepage: mechanize.rubyforge.org
13
13
  rubyforge_project: mechanize
@@ -18,140 +18,145 @@ bindir: bin
18
18
  has_rdoc: true
19
19
  required_ruby_version: !ruby/object:Gem::Version::Requirement
20
20
  requirements:
21
- - - ">"
22
- - !ruby/object:Gem::Version
23
- version: 0.0.0
21
+ -
22
+ - ">"
23
+ - !ruby/object:Gem::Version
24
+ version: 0.0.0
24
25
  version:
25
26
  platform: ruby
26
27
  signing_key:
27
28
  cert_chain:
28
29
  post_install_message:
29
30
  authors:
30
- - Aaron Patterson
31
+ - Aaron Patterson
31
32
  files:
32
- - test/tc_mech.rb
33
- - test/ts_mech.rb
34
- - test/tc_no_attributes.rb
35
- - test/tc_links.rb
36
- - test/tc_select_all.rb
37
- - test/tc_page.rb
38
- - test/test_includes.rb
39
- - test/tc_checkboxes.rb
40
- - test/tc_watches.rb
41
- - test/tc_cookies.rb
42
- - test/proxy.rb
43
- - test/data
44
- - test/tc_cookie_jar.rb
45
- - test/tc_forms.rb
46
- - test/tc_select_none.rb
47
- - test/tc_multi_select.rb
48
- - test/tc_pluggable_parser.rb
49
- - test/tc_select_noopts.rb
50
- - test/ssl_server.rb
51
- - test/tc_pretty_print.rb
52
- - test/tc_post_form.rb
53
- - test/tc_errors.rb
54
- - test/tc_gzipping.rb
55
- - test/tc_authenticate.rb
56
- - test/README
57
- - test/tc_radiobutton.rb
58
- - test/tc_form_no_inputname.rb
59
- - test/tc_upload.rb
60
- - test/tc_cookie_class.rb
61
- - test/tc_set_fields.rb
62
- - test/tc_select.rb
63
- - test/server.rb
64
- - test/htdocs
65
- - test/tc_ssl_server.rb
66
- - test/tc_textarea.rb
67
- - test/tc_proxy.rb
68
- - test/tc_frames.rb
69
- - test/tc_bad_links.rb
70
- - test/tc_response_code.rb
71
- - test/servlets.rb
72
- - test/tc_save_file.rb
73
- - test/data/server.key
74
- - test/data/server.csr
75
- - test/data/server.pem
76
- - test/data/server.crt
77
- - test/data/htpasswd
78
- - test/htdocs/file_upload.html
79
- - test/htdocs/tc_no_attributes.html
80
- - test/htdocs/iframe_test.html
81
- - test/htdocs/form_select_all.html
82
- - test/htdocs/form_no_action.html
83
- - test/htdocs/form_test.html
84
- - test/htdocs/bad_form_test.html
85
- - test/htdocs/alt_text.html
86
- - test/htdocs/frame_test.html
87
- - test/htdocs/tc_radiobuttons.html
88
- - test/htdocs/form_multi_select.html
89
- - test/htdocs/form_set_fields.html
90
- - test/htdocs/index.html
91
- - test/htdocs/find_link.html
92
- - test/htdocs/google.html
93
- - test/htdocs/no_title_test.html
94
- - test/htdocs/button.jpg
95
- - test/htdocs/form_multival.html
96
- - test/htdocs/tc_bad_links.html
97
- - test/htdocs/tc_checkboxes.html
98
- - test/htdocs/tc_textarea.html
99
- - test/htdocs/form_select_none.html
100
- - test/htdocs/form_select_noopts.html
101
- - test/htdocs/form_select.html
102
- - test/htdocs/form_no_input_name.html
103
- - test/htdocs/tc_pretty_print.html
104
- - lib/mechanize.rb
105
- - lib/mechanize
106
- - lib/mechanize/errors.rb
107
- - lib/mechanize/page.rb
108
- - lib/mechanize/form_elements.rb
109
- - lib/mechanize/net-overrides
110
- - lib/mechanize/cookie.rb
111
- - lib/mechanize/inspect.rb
112
- - lib/mechanize/mech_version.rb
113
- - lib/mechanize/list.rb
114
- - lib/mechanize/hpricot.rb
115
- - lib/mechanize/pluggable_parsers.rb
116
- - lib/mechanize/page_elements.rb
117
- - lib/mechanize/form.rb
118
- - lib/mechanize/net-overrides/net
119
- - lib/mechanize/net-overrides/net/http.rb
120
- - lib/mechanize/net-overrides/net/https.rb
121
- - lib/mechanize/net-overrides/net/protocol.rb
122
- - README
123
- - EXAMPLES
124
- - CHANGELOG
125
- - LICENSE
126
- - NOTES
127
- - GUIDE
33
+ - test/data
34
+ - test/htdocs
35
+ - test/proxy.rb
36
+ - test/README
37
+ - test/server.rb
38
+ - test/servlets.rb
39
+ - test/ssl_server.rb
40
+ - test/tc_authenticate.rb
41
+ - test/tc_bad_links.rb
42
+ - test/tc_checkboxes.rb
43
+ - test/tc_cookie_class.rb
44
+ - test/tc_cookie_jar.rb
45
+ - test/tc_cookies.rb
46
+ - test/tc_errors.rb
47
+ - test/tc_form_no_inputname.rb
48
+ - test/tc_forms.rb
49
+ - test/tc_frames.rb
50
+ - test/tc_gzipping.rb
51
+ - test/tc_links.rb
52
+ - test/tc_mech.rb
53
+ - test/tc_multi_select.rb
54
+ - test/tc_no_attributes.rb
55
+ - test/tc_page.rb
56
+ - test/tc_pluggable_parser.rb
57
+ - test/tc_post_form.rb
58
+ - test/tc_pretty_print.rb
59
+ - test/tc_proxy.rb
60
+ - test/tc_radiobutton.rb
61
+ - test/tc_referer.rb
62
+ - test/tc_response_code.rb
63
+ - test/tc_save_file.rb
64
+ - test/tc_select.rb
65
+ - test/tc_select_all.rb
66
+ - test/tc_select_none.rb
67
+ - test/tc_select_noopts.rb
68
+ - test/tc_set_fields.rb
69
+ - test/tc_ssl_server.rb
70
+ - test/tc_textarea.rb
71
+ - test/tc_upload.rb
72
+ - test/tc_watches.rb
73
+ - test/test_includes.rb
74
+ - test/ts_mech.rb
75
+ - test/data/htpasswd
76
+ - test/data/server.crt
77
+ - test/data/server.csr
78
+ - test/data/server.key
79
+ - test/data/server.pem
80
+ - test/htdocs/alt_text.html
81
+ - test/htdocs/bad_form_test.html
82
+ - test/htdocs/button.jpg
83
+ - test/htdocs/file_upload.html
84
+ - test/htdocs/find_link.html
85
+ - test/htdocs/form_multi_select.html
86
+ - test/htdocs/form_multival.html
87
+ - test/htdocs/form_no_action.html
88
+ - test/htdocs/form_no_input_name.html
89
+ - test/htdocs/form_select.html
90
+ - test/htdocs/form_select_all.html
91
+ - test/htdocs/form_select_none.html
92
+ - test/htdocs/form_select_noopts.html
93
+ - test/htdocs/form_set_fields.html
94
+ - test/htdocs/form_test.html
95
+ - test/htdocs/frame_test.html
96
+ - test/htdocs/google.html
97
+ - test/htdocs/iframe_test.html
98
+ - test/htdocs/index.html
99
+ - test/htdocs/link with space.html
100
+ - test/htdocs/no_title_test.html
101
+ - test/htdocs/tc_bad_links.html
102
+ - test/htdocs/tc_checkboxes.html
103
+ - test/htdocs/tc_links.html
104
+ - test/htdocs/tc_no_attributes.html
105
+ - test/htdocs/tc_pretty_print.html
106
+ - test/htdocs/tc_radiobuttons.html
107
+ - test/htdocs/tc_referer.html
108
+ - test/htdocs/tc_textarea.html
109
+ - lib/mechanize
110
+ - lib/mechanize.rb
111
+ - lib/mechanize/cookie.rb
112
+ - lib/mechanize/errors.rb
113
+ - lib/mechanize/form.rb
114
+ - lib/mechanize/form_elements.rb
115
+ - lib/mechanize/hpricot.rb
116
+ - lib/mechanize/inspect.rb
117
+ - lib/mechanize/list.rb
118
+ - lib/mechanize/mech_version.rb
119
+ - lib/mechanize/net-overrides
120
+ - lib/mechanize/page.rb
121
+ - lib/mechanize/page_elements.rb
122
+ - lib/mechanize/parsers
123
+ - lib/mechanize/pluggable_parsers.rb
124
+ - lib/mechanize/rexml.rb
125
+ - lib/mechanize/net-overrides/net
126
+ - lib/mechanize/net-overrides/net/http.rb
127
+ - lib/mechanize/net-overrides/net/https.rb
128
+ - lib/mechanize/net-overrides/net/protocol.rb
129
+ - lib/mechanize/parsers/rexml_page.rb
130
+ - README
131
+ - EXAMPLES
132
+ - CHANGELOG
133
+ - LICENSE
134
+ - NOTES
135
+ - GUIDE
128
136
  test_files: []
129
-
130
137
  rdoc_options:
131
- - --main
132
- - README
133
- - --title
134
- - "'WWW::Mechanize RDoc'"
138
+ - "--main"
139
+ - README
140
+ - "--title"
141
+ - "'WWW::Mechanize RDoc'"
135
142
  extra_rdoc_files:
136
- - README
137
- - EXAMPLES
138
- - CHANGELOG
139
- - LICENSE
140
- - NOTES
141
- - GUIDE
143
+ - README
144
+ - EXAMPLES
145
+ - CHANGELOG
146
+ - LICENSE
147
+ - NOTES
148
+ - GUIDE
142
149
  executables: []
143
-
144
150
  extensions: []
145
-
146
151
  requirements: []
147
-
148
152
  dependencies:
149
- - !ruby/object:Gem::Dependency
150
- name: hpricot
151
- version_requirement:
152
- version_requirements: !ruby/object:Gem::Version::Requirement
153
- requirements:
154
- - - ">"
155
- - !ruby/object:Gem::Version
156
- version: 0.0.0
157
- version:
153
+ - !ruby/object:Gem::Dependency
154
+ name: hpricot
155
+ version_requirement:
156
+ version_requirements: !ruby/object:Gem::Version::Requirement
157
+ requirements:
158
+ -
159
+ - ">"
160
+ - !ruby/object:Gem::Version
161
+ version: 0.0.0
162
+ version: