html2doc 0.9.3 → 0.9.4
- checksums.yaml +4 -4
- data/.github/workflows/macos.yml +5 -1
- data/.github/workflows/ubuntu.yml +5 -1
- data/.github/workflows/windows.yml +5 -1
- data/README.adoc +1 -1
- data/lib/html2doc/mime.rb +5 -6
- data/lib/html2doc/version.rb +1 -1
- data/spec/html2doc_spec.rb +2 -0
- metadata +2 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: dcc9db6cc57352a2e100bf1c1ff4127518d79fb67b958f52cd36580a4ab0b716
+  data.tar.gz: 93979a73373bad8d3405ce8dd621455b54514d3e2e0017d9d91096b038405edb
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: d5d29322e87a5ec048dfacd91fe3bb3ecc27478348062cae38cff7dfa3c30fb8fdca06a20cd248e87d50d48a483fa1e24e60bb3d195ab900fb80a65289090df4
+  data.tar.gz: 2803cb07d404a133fd7debee942c79e95154e7eb703527c96a00b8f10a03219d36f99b414c0ef964bd3beec01aed178524b2068b8b6ce45771e5246556a769ee
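The checksums.yaml entries above are hex digests of the two archives packed inside the .gem file. As a minimal sketch, assuming the archives have already been extracted to disk, such digests could be recomputed with Ruby's standard Digest library and compared against the values in the diff. The helper name `gem_member_digests` is hypothetical and is not part of html2doc or RubyGems.

```ruby
require "digest"

# Hypothetical helper: returns the SHA256 and SHA512 hex digests of a file,
# keyed the same way as the two sections of checksums.yaml.
def gem_member_digests(path)
  data = File.binread(path) # read as raw bytes, no encoding conversion
  {
    "SHA256" => Digest::SHA256.hexdigest(data),
    "SHA512" => Digest::SHA512.hexdigest(data),
  }
end
```

Calling `gem_member_digests("data.tar.gz")` on the extracted archive should then yield the two `data.tar.gz` values listed under `SHA256` and `SHA512`, provided the file is the one shipped in 0.9.4.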
data/.github/workflows/macos.yml
CHANGED
data/README.adoc
CHANGED
@@ -113,7 +113,7 @@ The bad news is that Word's understanding of HTML is HTML 4. In order for bookma
 
 The good news with generating a Word document via HTML is that Word understands CSS, and you can determine much of what the Word document looks like by manipulating that CSS. That extends to features that are not part of HTML CSS: if you want to work out how to get Word to do something in CSS, save a Word document that already does what you want as HTML, and inspect the HTML and CSS you get.
 
-The bad news is that Word's implementation of CSS is poorly documented -- even if Office HTML is documented in a 1300 page document (online at https://stigmortenmyre.no/mso/, https://www.rodriguezcommaj.com/assets/resources/microsoft-office-html-and-xml-reference.pdf), and the CSS selectors are only partially and selectively implemented. For list styles, for example, `mso-level-text` governs how the list label is displayed; but it is only recognised in a `@list` style: it is ignored in a CSS rule like `ol li`, or in a `style` attribute on a node. Working out the right CSS for what you want will take some trial and error, and you are better placed to try to do things Word's way than the right way.
+The bad news is that Word's implementation of CSS is poorly documented -- even if Office HTML is documented in a 1300 page document (online at https://stigmortenmyre.no/mso/, https://www.rodriguezcommaj.com/assets/resources/microsoft-office-html-and-xml-reference.pdf), and the CSS selectors are only partially and selectively implemented. For list styles, for example, `mso-level-text` governs how the list label is displayed; but it is only recognised in a `@list` style: it is ignored in a CSS rule like `ol li`, or in a `style` attribute on a node. CSS selectors only support classes, in ancestor relations: `p.class1 ol.class2` is supported, but `#id1` is not, and neither is `p > ol`. Working out the right CSS for what you want will take some trial and error, and you are better placed to try to do things Word's way than the right way.
 
 === XSLT
 
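The sentence added to the README states a rule that can be checked mechanically: selectors built from element names and classes, combined only by the descendant (whitespace) combinator, are expected to work in Word. A rough sketch of such a check follows. The method name `word_friendly_selector?` is hypothetical, the rule encodes only what the README says (it has not been verified against Word itself), and allowing at most one class per step is an assumption.

```ruby
# Hypothetical check based solely on the README's description:
# each whitespace-separated step may be "tag", ".class" or "tag.class".
# IDs (#id1), child combinators (>) and anything else are rejected.
def word_friendly_selector?(selector)
  step = /\A(?:[a-zA-Z][a-zA-Z0-9]*)?(?:\.[A-Za-z_][\w-]*)?\z/
  steps = selector.strip.split(/\s+/)
  return false if steps.empty?
  steps.all? { |s| step.match?(s) }
end

word_friendly_selector?("p.class1 ol.class2") # => true
word_friendly_selector?("#id1")               # => false
word_friendly_selector?("p > ol")             # => false
```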
data/lib/html2doc/mime.rb
CHANGED
@@ -87,10 +87,9 @@ module Html2Doc
 
   # only processes locally stored images
   def self.image_cleanup(docxml, dir, localdir)
-    #docxml.xpath(IMAGE_PATH).each do |i|
     docxml.traverse do |i|
       next unless i.element? && %w(img v:imagedata).include?(i.name)
-      warnsvg(i["src"])
+      #warnsvg(i["src"])
       next if /^http/.match i["src"]
       next if %r{^data:image/[^;]+;base64}.match i["src"]
       local_filename = %r{^([A-Z]:)?/}.match(i["src"]) ? i["src"] :
@@ -115,12 +114,12 @@ module Html2Doc
       if a.size == 2 && !(/ src="https?:/.match a[1]) &&
           !(%r{ src="data:image/[^;]+;base64}.match a[1])
         m = / src=['"](?<src>[^"']+)['"]/.match a[1]
-        warnsvg(m[:src])
+        #warnsvg(m[:src])
         m2 = /\.(?<suffix>[a-zA-Z_0-9]+)$/.match m[:src]
-        new_filename = "
+        new_filename = "#{mkuuid}.#{m2[:suffix]}"
         old_filename = %r{^([A-Z]:)?/}.match(m[:src]) ? m[:src] : File.join(localdir, m[:src])
-        FileUtils.cp old_filename, File.join(dir,
-        a[1].sub!(%r{ src=['"](?<src>[^"']+)['"]}, " src='
+        FileUtils.cp old_filename, File.join(dir, new_filename)
+        a[1].sub!(%r{ src=['"](?<src>[^"']+)['"]}, " src='file:///C:/Doc/#{filename}_files/#{new_filename}'")
       end
       a.join
     end
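The second mime.rb hunk renames each locally stored image to a UUID-based filename and points the tag's `src` at the Word `_files` folder. The sketch below isolates that rewrite as a stand-alone method so it can be tried outside the gem. It is an illustration, not the gem's code: `rewrite_local_src` is a hypothetical name, `SecureRandom.uuid` stands in for html2doc's own `mkuuid` (whose implementation is not shown in this diff), the file copy is left out, and the `nil` guards are additions that the original lines do not have.

```ruby
require "securerandom"

# Hypothetical stand-alone version of the src rewrite shown in the hunk.
# tag:      the text of one image tag, e.g. '<img src="a.png">'
# filename: the document base name used for the "<filename>_files" folder
# Returns [rewritten_tag, new_filename], or [tag, nil] if the tag is left alone.
def rewrite_local_src(tag, filename)
  m = / src=['"](?<src>[^"']+)['"]/.match(tag)
  return [tag, nil] if m.nil?
  # Remote images and inline base64 data are skipped, as in the hunk.
  return [tag, nil] if m[:src].match?(/\Ahttps?:/)
  return [tag, nil] if m[:src].match?(%r{\Adata:image/[^;]+;base64})
  m2 = /\.(?<suffix>[a-zA-Z_0-9]+)$/.match(m[:src])
  return [tag, nil] if m2.nil? # no file suffix: added guard, not in the original
  new_filename = "#{SecureRandom.uuid}.#{m2[:suffix]}" # stand-in for mkuuid
  new_tag = tag.sub(/ src=['"][^"']+['"]/,
                    " src='file:///C:/Doc/#{filename}_files/#{new_filename}'")
  [new_tag, new_filename]
end
```

For example, `rewrite_local_src('<img src="images/a.png">', "test")` returns a tag whose `src` begins with `file:///C:/Doc/test_files/` and ends in `.png`, while a tag with an `https://` source comes back unchanged.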
data/lib/html2doc/version.rb
CHANGED
data/spec/html2doc_spec.rb
CHANGED
@@ -566,10 +566,12 @@ RSpec.describe Html2Doc do
     OUTPUT
   end
 
+=begin
   it "warns about SVG" do
     simple_body = '<img src="https://example.com/19160-6.svg">'
     expect{ Html2Doc.process(html_input(simple_body), filename: "test") }.to output("https://example.com/19160-6.svg: SVG not supported\n").to_stderr
   end
+=end
 
   it "processes epub:type footnotes" do
     simple_body = '<div>This is a very simple
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: html2doc
 version: !ruby/object:Gem::Version
-  version: 0.9.
+  version: 0.9.4
 platform: ruby
 authors:
 - Ribose Inc.
 autorequire:
 bindir: bin
 cert_chain: []
-date:
+date: 2020-01-30 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: htmlentities