hpricot 0.8.2 → 0.8.3

data/CHANGELOG CHANGED
@@ -1,16 +1,32 @@
+ = 0.8.3
+ === 3 November, 2010
+ * GH#8: Nil-check before downcasing attribute key
+ * GH#25: Proper ruby 1.9 encoding support
+ * GH#28. Use integers instead of ?? on 1.9, which is just a string.
+ * including noscript to ElementInclusions , so that hpricot wont fail
+ when trying to parse a meta tag inside head section when noscript is
+ present.
+ * latest changes from fast_xs mainline
+ * Fixes to get Hpricot running on Rubinius:
+ * Use free, not XFREE
+ * Remove RSTRUCT craziness, don't break Array#at
+
  = 0.8.2
  === 5 November, 2009
  * Bring JRuby support up to speed, including Java-based hpricot_css support
  * Change JRuby fast_xs to have same escaping behavior as C fast_xs
-
- = 0.8.1
- === 3 April, 2009
  * fix for issue #2, downcasing of html attributes inside the parser.
  * solve issue #3 with bogus etags being preserved in `to_s` rather than just `to_original_html`.
  * fix error when attempting to reparent cleared node. (issue #5)
  * Hpricot::Attributes proxy object for using `ele.attributes[k] = v` directly.
  however, it is preferred to use the jquery-like `elements.attr(k, v)`.
 
+ = 0.8.1
+ === 3 April, 2009
+ * big problems on Ruby 1.8.6, use INT2FIX instead of INT2NUM. hashes were being cast to bignums.
+ * patch for 1.8.5 to define RARRAY_PTR. thanks, mike perham!
+ * inspecting empty document bug, courtesy of @TalLevAmi.
+
  = 0.8
  === 31st March, 2009
  * Saving memory and speed by using RStruct-based elements in the C extension.
@@ -1,4 +1,4 @@
- = Hpricot, Read Any HTML
+ # Hpricot, Read Any HTML
 
  Hpricot is a fast, flexible HTML parser written in C. It's designed to be very
  accommodating (like Tanaka Akira's HTree) and to have a very helpful library
@@ -13,21 +13,21 @@ thing.
  *Please read this entire document* before making assumptions about how this
  software works.
 
- == An Overview
+ ## An Overview
 
  Let's clear up what Hpricot is.
 
- # Hpricot is *a standalone library*. It requires no other libraries. Just Ruby!
- # While priding itself on speed, Hpricot *works hard to sort out bad HTML* and
+ * Hpricot is *a standalone library*. It requires no other libraries. Just Ruby!
+ * While priding itself on speed, Hpricot *works hard to sort out bad HTML* and
  pays a small penalty in order to get that right. So that's slightly more important
  to me than speed.
- # *If you can see it in Firefox, then Hpricot should parse it.* That's
+ * *If you can see it in Firefox, then Hpricot should parse it.* That's
  how it should be! Let me know the minute it's otherwise.
- # Primarily, Hpricot is used for reading HTML and tries to sort out troubled
+ * Primarily, Hpricot is used for reading HTML and tries to sort out troubled
  HTML by having some idea of what good HTML is. Some people still like to use
  Hpricot for XML reading, but *remember to use the Hpricot::XML() method* for that!
 
- == The Hpricot Kingdom
+ ## The Hpricot Kingdom
 
  First, here are all the links you need to know:
 
@@ -43,57 +43,57 @@ not going to say "Use at your own risk" because I don't want this library to be
  risky. If you trip on something, I'll share the liability by repairing things
  as quickly as I can. Your responsibility is to report the inadequacies.
 
- == Installing Hpricot
+ ## Installing Hpricot
 
  You may get the latest stable version from Rubyforge. Win32 binaries,
  Java binaries (for JRuby), and source gems are available.
 
- $ gem install hpricot
+ $ gem install hpricot
 
- == An Hpricot Showcase
+ ## An Hpricot Showcase
 
  We're going to run through a big pile of examples to get you jump-started.
  Many of these examples are also found at
  http://wiki.github.com/hpricot/hpricot/hpricot-basics, in case you
  want to add some of your own.
 
- === Loading Hpricot Itself
+ ### Loading Hpricot Itself
 
  You have probably got the gem, right? To load Hpricot:
 
- require 'rubygems'
- require 'hpricot'
+ require 'rubygems'
+ require 'hpricot'
 
  If you've installed the plain source distribution, go ahead and just:
 
- require 'hpricot'
+ require 'hpricot'
 
- === Load an HTML Page
+ ### Load an HTML Page
 
  The <tt>Hpricot()</tt> method takes a string or any IO object and loads the
  contents into a document object.
 
- doc = Hpricot("<p>A simple <b>test</b> string.</p>")
+ doc = Hpricot("<p>A simple <b>test</b> string.</p>")
 
  To load from a file, just get the stream open:
 
- doc = open("index.html") { |f| Hpricot(f) }
+ doc = open("index.html") { |f| Hpricot(f) }
 
  To load from a web URL, use <tt>open-uri</tt>, which comes with Ruby:
 
- require 'open-uri'
- doc = open("http://qwantz.com/") { |f| Hpricot(f) }
+ require 'open-uri'
+ doc = open("http://qwantz.com/") { |f| Hpricot(f) }
 
  Hpricot uses an internal buffer to parse the file, so the IO will stream
  properly and large documents won't be loaded into memory all at once. However,
  the parsed document object will be present in memory, in its entirety.
 
- === Search for Elements
+ ### Search for Elements
 
  Use <tt>Doc.search</tt>:
 
- doc.search("//p[@class='posted']")
- #=> #<Hpricot:Elements[{p ...}, {p ...}]>
+ doc.search("//p[@class='posted']")
+ #=> #<Hpricot:Elements[{p ...}, {p ...}]>
 
  <tt>Doc.search</tt> can take an XPath or CSS expression. In the above example,
  all paragraph <tt><p></tt> elements are grabbed which have a <tt>class</tt>
@@ -101,126 +101,127 @@ attribute of <tt>"posted"</tt>.
 
  A shortcut is to use the divisor:
 
- (doc/"p.posted")
- #=> #<Hpricot:Elements[{p ...}, {p ...}]>
+ (doc/"p.posted")
+ #=> #<Hpricot:Elements[{p ...}, {p ...}]>
 
- === Finding Just One Element
+ ### Finding Just One Element
 
  If you're looking for a single element, the <tt>at</tt> method will return the
  first element matched by the expression. In this case, you'll get back the
  element itself rather than the <tt>Hpricot::Elements</tt> array.
 
- doc.at("body")['onload']
+ doc.at("body")['onload']
 
  The above code will find the body tag and give you back the <tt>onload</tt>
  attribute. This is the most common reason to use the element directly: when
  reading and writing HTML attributes.
 
- === Fetching the Contents of an Element
+ ### Fetching the Contents of an Element
 
  Just as with browser scripting, the <tt>inner_html</tt> property can be used to
  get the inner contents of an element.
 
- (doc/"#elementID").inner_html
- #=> "..<b>contents</b>.."
+ (doc/"#elementID").inner_html
+ #=> "..contents.."
 
  If your expression matches more than one element, you'll get back the contents
  of ''all the matched elements''. So you may want to use <tt>first</tt> to be
  sure you get back only one.
 
- (doc/"#elementID").first.inner_html
- #=> "..<b>contents</b>.."
+ (doc/"#elementID").first.inner_html
+ #=> "..contents.."
 
- === Fetching the HTML for an Element
+ ### Fetching the HTML for an Element
 
  If you want the HTML for the whole element (not just the contents), use
  <tt>to_html</tt>:
 
- (doc/"#elementID").to_html
- #=> "<div id='elementID'>...</div>"
+ (doc/"#elementID").to_html
+ #=> "<div id='elementID'>...</div>"
 
- === Looping
+ ### Looping
 
  All searches return a set of <tt>Hpricot::Elements</tt>. Go ahead and loop
  through them like you would an array.
 
- (doc/"p/a/img").each do |img|
- puts img.attributes['class']
- end
+ (doc/"p/a/img").each do |img|
+ puts img.attributes['class']
+ end
 
- === Continuing Searches
+ ### Continuing Searches
 
  Searches can be continued from a collection of elements, in order to search deeper.
 
- # find all paragraphs.
- elements = doc.search("/html/body//p")
- # continue the search by finding any images within those paragraphs.
- (elements/"img")
- #=> #<Hpricot::Elements[{img ...}, {img ...}]>
+ # find all paragraphs.
+ elements = doc.search("/html/body//p")
+ # continue the search by finding any images within those paragraphs.
+ (elements/"img")
+ #=> #<Hpricot::Elements[{img ...}, {img ...}]>
 
  Searches can also be continued by searching within container elements.
 
- # find all images within paragraphs.
- doc.search("/html/body//p").each do |para|
- puts "== Found a paragraph =="
- pp para
+ # find all images within paragraphs.
+ doc.search("/html/body//p").each do |para|
+ puts "== Found a paragraph =="
+ pp para
 
- imgs = para.search("img")
- if imgs.any?
- puts "== Found #{imgs.length} images inside =="
- end
- end
+ imgs = para.search("img")
+ if imgs.any?
+ puts "== Found #{imgs.length} images inside =="
+ end
+ end
 
  Of course, the most succinct ways to do the above are using CSS or XPath.
 
- # the xpath version
- (doc/"/html/body//p//img")
- # the css version
- (doc/"html > body > p img")
- # ..or symbols work, too!
- (doc/:html/:body/:p/:img)
+ # the xpath version
+ (doc/"/html/body//p//img")
+ # the css version
+ (doc/"html > body > p img")
+ # ..or symbols work, too!
+ (doc/:html/:body/:p/:img)
 
- === Looping Edits
+ ### Looping Edits
 
  You may certainly edit objects from within your search loops. Then, when you
  spit out the HTML, the altered elements will show.
 
- (doc/"span.entryPermalink").each do |span|
- span.attributes['class'] = 'newLinks'
- end
- puts doc
+
+ (doc/"span.entryPermalink").each do |span|
+ span.attributes['class'] = 'newLinks'
+ end
+ puts doc
 
  This changes all <tt>span.entryPermalink</tt> elements to
  <tt>span.newLinks</tt>. Keep in mind that there are often more convenient ways
  of doing this. Such as the <tt>set</tt> method:
 
- (doc/"span.entryPermalink").set(:class => 'newLinks')
+ (doc/"span.entryPermalink").set(:class => 'newLinks')
 
- === Figuring Out Paths
+ ### Figuring Out Paths
 
  Every element can tell you its unique path (either XPath or CSS) to get to the
  element from the root tag.
 
  The <tt>css_path</tt> method:
 
- doc.at("div > div:nth(1)").css_path
- #=> "div > div:nth(1)"
- doc.at("#header").css_path
- #=> "#header"
+ doc.at("div > div:nth(1)").css_path
+ #=> "div > div:nth(1)"
+ doc.at("#header").css_path
+ #=> "#header"
 
  Or, the <tt>xpath</tt> method:
 
- doc.at("div > div:nth(1)").xpath
- #=> "/div/div:eq(1)"
- doc.at("#header").xpath
- #=> "//div[@id='header']"
+ doc.at("div > div:nth(1)").xpath
+ #=> "/div/div:eq(1)"
+ doc.at("#header").xpath
+ #=> "//div[@id='header']"
 
- == Hpricot Fixups
+ ## Hpricot Fixups
 
  When loading HTML documents, you have a few settings that can make Hpricot more
  or less intense about how it gets involved.
 
- == :fixup_tags
+ ## :fixup_tags
 
  Really, there are so many ways to clean up HTML and your intentions may be to
  keep the HTML as-is. So Hpricot's default behavior is to keep things flexible.
@@ -229,7 +230,7 @@ Making sure to open and close all the tags, but ignore any validation problems.
  As of Hpricot 0.4, there's a new <tt>:fixup_tags</tt> option which will attempt
  to shift the document's tags to meet XHTML 1.0 Strict.
 
- doc = open("index.html") { |f| Hpricot f, :fixup_tags => true }
+ doc = open("index.html") { |f| Hpricot f, :fixup_tags => true }
 
  This doesn't quite meet the XHTML 1.0 Strict standard, it just tries to follow
  the rules a bit better. Like: say Hpricot finds a paragraph in a link, it's
@@ -238,13 +239,13 @@ where paragraphs don't belong.
 
  If an unknown element is found, it is ignored. Again, <tt>:fixup_tags</tt>.
 
- == :xhtml_strict
+ ## :xhtml_strict
 
  So, let's go beyond just trying to fix the hierarchy. The
  <tt>:xhtml_strict</tt> option really tries to force the document to be an XHTML
  1.0 Strict document. Even at the cost of removing elements that get in the way.
 
- doc = open("index.html") { |f| Hpricot f, :xhtml_strict => true }
+ doc = open("index.html") { |f| Hpricot f, :xhtml_strict => true }
 
  What measures does <tt>:xhtml_strict</tt> take?
 
@@ -254,7 +255,7 @@ What measures does <tt>:xhtml_strict</tt> take?
  4. Remove illegal content.
  5. Alter the doctype to XHTML 1.0 Strict.
 
- == Hpricot.XML()
+ ## Hpricot.XML()
 
  The last option is the <tt>:xml</tt> option, which makes some slight variations
  on the standard mode. The main difference is that :xml mode won't try to output
@@ -266,9 +267,9 @@ to case, friends.
 
  The primary way to use Hpricot's XML mode is to call the Hpricot.XML method:
 
- doc = open("http://redhanded.hobix.com/index.xml") do |f|
- Hpricot.XML(f)
- end
+ doc = open("http://redhanded.hobix.com/index.xml") do |f|
+ Hpricot.XML(f)
+ end
 
  *Also, :fixup_tags is canceled out by the :xml option.* This is because
  :fixup_tags makes assumptions based how HTML is structured. Specifically, how
data/Rakefile CHANGED
@@ -1,10 +1,12 @@
- require 'rake'
  require 'rake/clean'
  require 'rake/gempackagetask'
  require 'rake/rdoctask'
  require 'rake/testtask'
- require 'fileutils'
- include FileUtils
+ begin
+ require 'rake/extensiontask'
+ rescue LoadError
+ abort "To build, please first gem install rake-compiler"
+ end
 
  RbConfig = Config unless defined?(RbConfig)
 
@@ -12,13 +14,14 @@ NAME = "hpricot"
  REV = (`#{ENV['GIT'] || "git"} rev-list HEAD`.split.length + 1).to_s
  VERS = ENV['VERSION'] || "0.8" + (REV ? ".#{REV}" : "")
  PKG = "#{NAME}-#{VERS}"
- BIN = "*.{bundle,jar,so,o,obj,pdb,lib,def,exp,class}"
- CLEAN.include ["ext/hpricot_scan/#{BIN}", "ext/fast_xs/#{BIN}", "lib/**/#{BIN}",
+ BIN = "*.{bundle,jar,so,o,obj,pdb,lib,def,exp,class,rbc}"
+ CLEAN.include ["#{BIN}", "ext/**/#{BIN}", "lib/**/#{BIN}", "test/**/#{BIN}",
  'ext/fast_xs/Makefile', 'ext/hpricot_scan/Makefile',
- '**/.*.sw?', '*.gem', '.config', 'pkg']
- RDOC_OPTS = ['--quiet', '--title', 'The Hpricot Reference', '--main', 'README', '--inline-source']
- PKG_FILES = %w(CHANGELOG COPYING README Rakefile) +
- Dir.glob("{bin,doc,test,lib,extras}/**/*") +
+ '**/.*.sw?', '*.gem', '.config', 'pkg', 'lib/hpricot_scan.rb', 'lib/fast_xs.rb']
+ RDOC_OPTS = ['--quiet', '--title', 'The Hpricot Reference', '--main', 'README.md', '--inline-source']
+ PKG_FILES = %w(CHANGELOG COPYING README.md Rakefile) +
+ Dir.glob("{bin,doc,test,extras}/**/*") +
+ (Dir.glob("lib/**/*.rb") - %w(lib/hpricot_scan.rb lib/fast_xs.rb)) +
  Dir.glob("ext/**/*.{h,java,c,rb,rl}") +
  %w[ext/hpricot_scan/hpricot_scan.c ext/hpricot_scan/hpricot_css.c ext/hpricot_scan/HpricotScanService.java] # needed because they are generated later
  RAGEL_C_CODE_GENERATION_STYLES = {
@@ -39,7 +42,7 @@ SPEC =
  s.platform = Gem::Platform::RUBY
  s.has_rdoc = true
  s.rdoc_options += RDOC_OPTS
- s.extra_rdoc_files = ["README", "CHANGELOG", "COPYING"]
+ s.extra_rdoc_files = ["README.md", "CHANGELOG", "COPYING"]
  s.summary = "a swift, liberal HTML parser with a fantastic library"
  s.description = s.summary
  s.author = "why the lucky stiff"
@@ -52,26 +55,54 @@ SPEC =
  s.bindir = "bin"
  end
 
- Win32Spec = SPEC.dup
- Win32Spec.platform = 'x86-mswin32'
- Win32Spec.files = PKG_FILES + ["lib/hpricot_scan.so", "lib/fast_xs.so"]
- Win32Spec.extensions = []
+ # FAT cross-compile
+ # Pass RUBY_CC_VERSION=1.8.7:1.9.2 when packaging for 1.8+1.9 mswin32 binaries
+ %w(hpricot_scan fast_xs).each do |target|
+ Rake::ExtensionTask.new(target, SPEC) do |ext|
+ ext.lib_dir = File.join('lib', target) if ENV['RUBY_CC_VERSION']
+ ext.cross_compile = true # enable cross compilation (requires cross compile toolchain)
+ ext.cross_platform = 'i386-mswin32' # forces the Windows platform instead of the default one
+ end
 
- WIN32_PKG_DIR = "#{PKG}-mswin32"
+ # HACK around 1.9.2 cross .def file creation
+ def_file = "tmp/i386-mswin32/#{target}/1.9.2/#{target}-i386-mingw32.def"
+ directory File.dirname(def_file)
+ file def_file => File.dirname(def_file) do |t|
+ File.open(t.name, "w") do |f|
+ f << "EXPORTS\nInit_#{target}\n"
+ end
+ end
+
+ task File.join(File.dirname(def_file), "Makefile") => def_file
+ # END HACK
+ file "lib/#{target}.rb" do |t|
+ File.open(t.name, "w") do |f|
+ f.puts %{require "#{target}/\#{RUBY_VERSION.sub(/\\.\\d+$/, '')}/#{target}"}
+ end
+ end
+ end
+ file 'ext/hpricot_scan/extconf.rb' => :ragel
+
+ desc "set environment variables to build and/or test with debug options"
+ task :debug do
+ ENV['CFLAGS'] ||= ""
+ ENV['CFLAGS'] += " -g -DDEBUG"
+ end
 
  desc "Does a full compile, test run"
  if defined?(JRUBY_VERSION)
- task :default => [:compile_java, :test]
+ task :default => [:compile_java, :clean_fat_rb, :test]
  else
- task :default => [:compile, :test]
+ task :default => [:compile, :clean_fat_rb, :test]
  end
 
- desc "Packages up Hpricot."
- task :package => [:clean, :ragel]
-
- desc "Releases packages for all Hpricot packages and platforms."
- task :release => [:package, :package_win32, :package_jruby]
+ task :clean_fat_rb do
+ rm_f "lib/hpricot_scan.rb"
+ rm_f "lib/fast_xs.rb"
+ end
 
+ desc "Packages up Hpricot for all platforms."
+ task :package => [:clean]
 
  desc "Run all the tests"
  Rake::TestTask.new do |t|
@@ -83,8 +114,8 @@ end
  Rake::RDocTask.new do |rdoc|
  rdoc.rdoc_dir = 'doc/rdoc'
  rdoc.options += RDOC_OPTS
- rdoc.main = "README"
- rdoc.rdoc_files.add ['README', 'CHANGELOG', 'COPYING', 'lib/**/*.rb']
+ rdoc.main = "README.md"
+ rdoc.rdoc_files.add ['README.md', 'CHANGELOG', 'COPYING', 'lib/**/*.rb']
  end
 
  Rake::GemPackageTask.new(SPEC) do |p|
@@ -92,53 +123,32 @@ Rake::GemPackageTask.new(SPEC) do |p|
  p.gem_spec = SPEC
  end
 
- ['hpricot_scan', 'fast_xs'].each do |extension|
- ext = "ext/#{extension}"
- ext_so = "#{ext}/#{extension}.#{Config::CONFIG['DLEXT']}"
- ext_files = FileList[
- "#{ext}/*.c",
- "#{ext}/*.h",
- "#{ext}/*.rl",
- "#{ext}/extconf.rb",
- "#{ext}/Makefile",
- "lib"
- ]
-
- desc "Builds just the #{extension} extension"
- task extension.to_sym => [:ragel, "#{ext}/Makefile", ext_so ]
-
- file "#{ext}/Makefile" => ["#{ext}/extconf.rb"] do
- Dir.chdir(ext) do ruby "extconf.rb" end
- end
-
- file ext_so => ext_files do
- Dir.chdir(ext) do
- sh(RUBY_PLATFORM =~ /mswin/ ? 'nmake' : 'make')
+ ### Win32 Packages ###
+ Win32Spec = SPEC.dup
+ Win32Spec.platform = 'i386-mswin32'
+ Win32Spec.files = PKG_FILES + %w(hpricot_scan fast_xs).map do |t|
+ unless ENV['RUBY_CC_VERSION']
+ file "lib/#{t}/1.8/#{t}.so" do
+ abort "ERROR while packaging: re-run for fat win32 gems:\nrake #{ARGV.join(' ')} RUBY_CC_VERSION=1.8.7:1.9.2"
  end
- cp ext_so, "lib"
  end
+ ["lib/#{t}.rb", "lib/#{t}/1.8/#{t}.so", "lib/#{t}/1.9/#{t}.so"]
+ end.flatten
+ Win32Spec.extensions = []
 
- desc "Cross-compile the #{extension} extension for win32"
- file "#{extension}_win32" => [WIN32_PKG_DIR] do
- cp "extras/mingw-rbconfig.rb", "#{WIN32_PKG_DIR}/ext/#{extension}/rbconfig.rb"
- sh "cd #{WIN32_PKG_DIR}/ext/#{extension}/ && ruby -I. extconf.rb && make"
- mv "#{WIN32_PKG_DIR}/ext/#{extension}/#{extension}.so", "#{WIN32_PKG_DIR}/lib"
- end
+ Rake::GemPackageTask.new(Win32Spec) do |p|
+ p.need_tar = false
+ p.gem_spec = Win32Spec
  end
 
- task "lib" do
- directory "lib"
- end
+ JRubySpec = SPEC.dup
+ JRubySpec.platform = 'java'
+ JRubySpec.files = PKG_FILES + ["lib/hpricot_scan.jar", "lib/fast_xs.jar"]
+ JRubySpec.extensions = []
 
- desc "Compiles the Ruby extension"
- task :compile => [:hpricot_scan, :fast_xs] do
- if Dir.glob(File.join("lib","hpricot_scan.*")).length == 0
- STDERR.puts "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!"
- STDERR.puts "Gem actually failed to build. Your system is"
- STDERR.puts "NOT configured properly to build hpricot."
- STDERR.puts "!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!"
- exit(1)
- end
+ Rake::GemPackageTask.new(JRubySpec) do |p|
+ p.need_tar = false
+ p.gem_spec = JRubySpec
  end
 
  desc "Determines the Ragel version and displays it on the console along with the location of the Ragel binary."
@@ -178,27 +188,7 @@ task :ragel_java => [:ragel_version] do
  end
  end
 
- ### Win32 Packages ###
-
- desc "Package up the Win32 distribution."
- file WIN32_PKG_DIR => [:package] do
- sh "tar zxf pkg/#{PKG}.tgz"
- mv PKG, WIN32_PKG_DIR
- end
-
- desc "Build the binary RubyGems package for win32"
- task :package_win32 => ["fast_xs_win32", "hpricot_scan_win32"] do
- Dir.chdir("#{WIN32_PKG_DIR}") do
- Gem::Builder.new(Win32Spec).build
- verbose(true) {
- mv Dir["*.gem"].first, "../pkg/"
- }
- end
- end
-
- CLEAN.include WIN32_PKG_DIR
-
- ### JRuby Packages ###
+ ### JRuby Compile ###
 
  def java_classpath_arg # myriad of ways to discover JRuby classpath
  begin
@@ -211,7 +201,11 @@ def java_classpath_arg # myriad of ways to discover JRuby classpath
  jruby_cpath = ENV['JRUBY_PARENT_CLASSPATH'] || ENV['JRUBY_HOME'] &&
  FileList["#{ENV['JRUBY_HOME']}/lib/*.jar"].join(File::PATH_SEPARATOR)
  end
- jruby_cpath ? "-cp \"#{jruby_cpath}\"" : ""
+ unless jruby_cpath || ENV['CLASSPATH'] =~ /jruby/
+ abort %{WARNING: No JRuby classpath has been set up.
+ Define JRUBY_HOME=/path/to/jruby on the command line or in the environment}
+ end
+ "-cp \"#{jruby_cpath}\""
  end
 
  def compile_java(filenames, jarname)
@@ -231,42 +225,10 @@ task :fast_xs_java do
  end
  end
 
- desc "Compiles the JRuby extensions"
- task :compile_java => [:hpricot_scan_java, :fast_xs_java] do
- %w(hpricot_scan fast_xs).each {|ext| mv "ext/#{ext}/#{ext}.jar", "lib"}
- end
-
- JRubySpec = SPEC.dup
- JRubySpec.platform = 'java'
- JRubySpec.files = PKG_FILES + ["lib/hpricot_scan.jar", "lib/fast_xs.jar"]
- JRubySpec.extensions = []
-
- JRUBY_PKG_DIR = "#{PKG}-java"
-
- desc "Package up the JRuby distribution."
- file JRUBY_PKG_DIR => [:ragel_java, :package] do
- sh "tar zxf pkg/#{PKG}.tgz"
- mv PKG, JRUBY_PKG_DIR
- end
-
- desc "Build the RubyGems package for JRuby"
- task :package_jruby => JRUBY_PKG_DIR do
- Dir.chdir("#{JRUBY_PKG_DIR}") do
- Rake::Task[:compile_java].invoke
- Gem::Builder.new(JRubySpec).build
- verbose(true) {
- mv Dir["*.gem"].first, "../pkg/#{JRUBY_PKG_DIR}.gem"
- }
+ %w(hpricot_scan fast_xs).each do |ext|
+ file "lib/#{ext}.jar" => "#{ext}_java" do |t|
+ mv "ext/#{ext}/#{ext}.jar", "lib"
  end
+ task :compile_java => "lib/#{ext}.jar"
  end
 
- CLEAN.include JRUBY_PKG_DIR
-
- task :install do
- sh %{rake package}
- sh %{sudo gem install pkg/#{NAME}-#{VERS}}
- end
-
- task :uninstall => [:clean] do
- sh %{sudo gem uninstall #{NAME}}
- end
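
Reviewer note on the Rakefile change: the `file "lib/#{target}.rb"` task added in 0.8.3 writes a one-line loader stub whose escaping is easy to misread in diff form. The sketch below reproduces that string literal outside of Rake to show what ends up in the generated file and which subdirectory it would resolve to. The helper names `fat_loader_stub` and `abi_dir` are hypothetical and exist only for this illustration; they are not part of the Rakefile.

```ruby
# Hypothetical helper (not in the Rakefile): returns the text that the
# `file "lib/#{target}.rb"` task writes. `\#` keeps the RUBY_VERSION
# interpolation literal, so it is evaluated later, when the stub is required.
def fat_loader_stub(target)
  %{require "#{target}/\#{RUBY_VERSION.sub(/\\.\\d+$/, '')}/#{target}"}
end

# Hypothetical helper: the same substitution the stub performs at load time,
# dropping the final ".N" component of a version string.
def abi_dir(ruby_version)
  ruby_version.sub(/\.\d+$/, '')
end

puts fat_loader_stub("fast_xs")
puts abi_dir("1.8.7")
puts abi_dir("1.9.2")
```

Under these assumptions the stub sends a 1.8.7 interpreter to `lib/fast_xs/1.8/` and a 1.9.2 interpreter to `lib/fast_xs/1.9/`, which matches the two `.so` paths listed in `Win32Spec.files`.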