html2tex 0.1.2 → 0.1.3

data/README.md CHANGED
@@ -59,3 +59,22 @@ comprehensive.
  StringScanner is used to process the HTML, but cannot read from a stream
  directly, so the entire input document must be read into memory as a string
  first.
+
+ UTF-8 is assumed everywhere; other character encodings will produce odd
+ results. If the HTML file to be processed is not in UTF-8 encoding with unix
+ line endings (at least, on Linux/OS X/etc.), _fix that first_. The usual
+ suspects will help here:
+
+ iconv -f windows-1252 -t utf-8 < somefile-win1252.html > somefile-utf8.html
+ dos2unix somefile-utf8.html
+
+ Next steps
+ ----------
+
+ If you have XeLaTex, you can easily turn the generated `.tex` file into a PDF:
+
+ xelatex my-book.tex
+
+ For better results, tweak the font settings or use a custom class like [this][ebook.cls].
+
+ [ebook.cls]: http://github.com/threedaymonk/gutenberg2pdf/blob/master/ebook.cls
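The README addition in the hunk above asks for UTF-8 input with Unix line endings and suggests `iconv` and `dos2unix` for the conversion. As a rough illustration only, the same normalisation could be done in Ruby before the document is handed to the converter. This is a minimal sketch, assuming Ruby 1.9 or later and a Windows-1252 source; the helper name `to_utf8_unix` is hypothetical and is not part of html2tex.

```ruby
# Hypothetical helper, not part of html2tex: roughly mirrors
#   iconv -f windows-1252 -t utf-8
#   dos2unix
# for a document that has already been read into memory as raw bytes.
def to_utf8_unix(bytes, source_encoding = "Windows-1252")
  # Label the raw bytes with their real encoding, then transcode to UTF-8.
  text = bytes.dup.force_encoding(source_encoding).encode("UTF-8")
  # Convert DOS (CRLF) line endings to Unix (LF) line endings.
  text.gsub("\r\n", "\n")
end

# Usage with in-memory bytes; for a file, the bytes could come from
# File.binread("somefile-win1252.html").
raw = "caf\xE9\r\nbar\r\n".b
puts to_utf8_unix(raw)
```

Working on the whole string at once fits the constraint in the README context lines: the document has to be in memory as a string anyway before StringScanner can process it.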
@@ -21,6 +21,7 @@ class HTML2TeX
  private
  def read_html_head
  scanner.scan %r{\s*}
+ scanner.scan %r{<\?xml[^>]*?\?>\s*}i
  scanner.scan %r{<!doctype[^>]*>\s*}i
  scanner.scan %r{<html[^>]*>\s*}i
  if head = scanner.scan(%r{<head[^>]*>.*?</head>}im)
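The one-line change in the hunk above lets the scanner step over an optional XML declaration (`<?xml ... ?>`) before it looks for the doctype. The effect can be reproduced in isolation with the standard `strscan` library. This is a sketch that reuses the regular expressions from the hunk; the method name `skip_prolog` is hypothetical and does not appear in html2tex.

```ruby
require "strscan"

# Hypothetical helper, not part of html2tex: consumes the optional prolog
# (leading whitespace, XML declaration, doctype, opening <html> tag) and
# returns whatever input is left.
def skip_prolog(html)
  scanner = StringScanner.new(html)
  # Each scan returns nil and leaves the position unchanged when its
  # pattern does not match at the current position, so every step here
  # is effectively optional.
  scanner.scan(%r{\s*})
  scanner.scan(%r{<\?xml[^>]*?\?>\s*}i)   # the pattern added in 0.1.3
  scanner.scan(%r{<!doctype[^>]*>\s*}i)
  scanner.scan(%r{<html[^>]*>\s*}i)
  scanner.rest
end

xhtml = %{<?xml version="1.0" encoding="UTF-8"?>\n<!DOCTYPE html>\n<html>\n<head><title>T</title></head>}
puts skip_prolog(xhtml)
```

Without the XML declaration pattern, the doctype and `<html>` scans would fail to match at the start of such a document, so the scanner would still be positioned at `<?xml` when the `<head>` scan runs.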
@@ -1,3 +1,3 @@
  class HTML2TeX
- VERSION = "0.1.2"
+ VERSION = "0.1.3"
  end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: html2tex
  version: !ruby/object:Gem::Version
- version: 0.1.2
+ version: 0.1.3
  platform: ruby
  authors:
  - Paul Battley
@@ -9,7 +9,7 @@ autorequire:
  bindir: bin
  cert_chain: []
 
- date: 2010-05-09 00:00:00 +01:00
+ date: 2010-05-10 00:00:00 +01:00
  default_executable:
  dependencies:
  - !ruby/object:Gem::Dependency