sablon 0.0.22 → 0.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 8dce87aaca368f43d657f2ff7fe7249e6f46ddee
4
- data.tar.gz: 8fddd1630ba575d38c6ab790d82ee33e88c33b65
3
+ metadata.gz: 7bfea533a76e6d1eea7475e9916cce5c9d0b8be1
4
+ data.tar.gz: 58707c6b8a095d4e1e7d3be17bc39bec340d8ad0
5
5
  SHA512:
6
- metadata.gz: 8f9b5dfcec4a943d674e4144cfa9d545a99340d14592b4e6aa09e19199ae363f919d182473004787ca43e7153ffa1bd48bdfd828d83bb3474ee811feb301c8ba
7
- data.tar.gz: 592faea43070727caf1608ce097aa10fc43f0821c0614a658c7a4af3474a37e501d90115d2d5ad54cdf6303250ed2a7ba2cb21ab6949d79d0cef4fabbe6bf26f
6
+ metadata.gz: 42659e24a433a882c3d5c10f8dc9204a9e2a701e87d504c6fa3aa1d080c86f3a141de937df4e76db43ae21c527ea1eb2d816f0a766b24754ac6553e19d18ec7d
7
+ data.tar.gz: 5a4077e98f5214a8ac0603376ca90dbbc90726caa30013b66d480b994dff020541b891da30e3057f48bb077ebd1698f74a809c6c171070f8ce824519bcc74d1e
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
1
1
  PATH
2
2
  remote: .
3
3
  specs:
4
- sablon (0.0.22)
4
+ sablon (0.1.0)
5
5
  nokogiri (>= 1.6.0)
6
6
  rubyzip (>= 1.1.1)
7
7
 
data/README.md CHANGED
@@ -114,15 +114,31 @@ word_processing_ml = <<-XML.gsub("\n", "")
114
114
  </w:p>
115
115
  XML
116
116
 
117
+ context = {
118
+ long_description: Sablon.content(:word_ml, word_processing_ml)
119
+ }
120
+ template.render_to_file File.expand_path("~/Desktop/output.docx"), context
121
+ ```
122
+ In the example above the entire paragraph is replaced because not all of the inserted nodes are valid children of a paragraph (w:p) element. The example below shows inline insertion, where only runs are added and only the merge field is removed instead of the entire paragraph.
123
+
124
+ **Important:** All text must be wrapped in a run tag for valid inline insertion because WordML is still inserted directly into the document "as is" without any structure transformations other than run properties being merged.
125
+
126
+ ```ruby
127
+ word_processing_ml = <<-XML.gsub("\n", "")
128
+ <w:r w:rsidRPr="00B97C39">
129
+ <w:rPr>
130
+ <w:b />
131
+ </w:rPr>
132
+ <w:t>this is bold text</w:t>
133
+ </w:r>
134
+ XML
135
+
117
136
  context = {
118
137
  long_description: Sablon.content(:word_ml, word_processing_ml)
119
138
  }
120
139
  template.render_to_file File.expand_path("~/Desktop/output.docx"), context
121
140
  ```
122
141
 
123
- IMPORTANT: This feature is very much *experimental*. Currently, the insertion
124
- will replace the containing paragraph. This means that other content in the same
125
- paragraph is discarded.
126
142
 
127
143
  ##### HTML
128
144
 
@@ -136,12 +152,43 @@ is sufficient:
136
152
  To use HTML insertion prepare the context like so:
137
153
 
138
154
  ```ruby
139
- html_body = <<-HTML
140
- <div>This text can contain <em>additional formatting</em>
141
- according to the <strong>HTML</strong> specification.</div>
142
- <p style="text-align: right; background-color: #FFFF00">Right aligned
143
- content with a yellow background color</p>
144
- <div><span style="color: #123456">Inline styles</span> are possible as well</div>
155
+ html_body = <<-HTML.strip
156
+ <div>
157
+ This text can contain <em>additional formatting</em> according to the
158
+ <strong>HTML</strong> specification. As well as links to external
159
+ <a href="https://github.com/senny/sablon">websites</a>, don't forget
160
+ the "http/https" bit.
161
+ </div>
162
+
163
+ <p style="text-align: right; background-color: #FFFF00">
164
+ Right aligned content with a yellow background color.
165
+ </p>
166
+
167
+ <div>
168
+ <span style="color: #123456">Inline styles</span> are possible as well
169
+ </div>
170
+
171
+ <table style="border: 1px solid #0000FF;">
172
+ <caption>Tables can also be created via HTML</caption>
173
+ <tr>
174
+ <td>Cell 1 only text</td>
175
+ <td>
176
+ <ul>
177
+ <li>List in Table - 1</li>
178
+ <li>List in Table - 2</li>
179
+ </ul>
180
+ </td>
181
+ </tr>
182
+ <tr>
183
+ <td></td>
184
+ <td>
185
+ <table style="border: 1px solid #FF0000;">
186
+ <tr><th>A</th><th>B</th></tr>
187
+ <tr><td>C</td><td>D</td></tr>
188
+ </table>
189
+ </td>
190
+ </tr>
191
+ </table>
145
192
  HTML
146
193
  context = {
147
194
  article: Sablon.content(:html, html_body) }
@@ -151,24 +198,49 @@ context = {
151
198
  template.render_to_file File.expand_path("~/Desktop/output.docx"), context
152
199
  ```
153
200
 
154
- Currently, HTML insertion is somewhat limited. It is recommended that the block level tags such as `p` and `div` are not nested within each other, otherwise the final document may not generate as anticipated. List tags (`ul` and `ol`) and inline tags (`span`, `b`, `em`, etc.) can be nested as deeply as needed.
155
-
156
- Not all tags are supported. Currently supported tags are defined in [configuration.rb](lib/sablon/configuration/configuration.rb) for paragraphs in method `prepare_paragraph` and for text runs in `prepare_run`.
157
-
158
- Basic conversion of CSS inline styles into matching WordML properties in supported through the `style=" ... "` attribute in the HTML markup. Not all possible styles are supported and only a small subset of CSS styles have a direct WordML equivalent. Styles are passed onto nested elements. The currently supported styles are also defined in [configuration.rb](lib/sablon/configuration/configuration.rb) in method `process_style`. Simple toggle properties that aren't directly supported can be added using the `text-decoration: ` style attribute with the proper WordML tag name as the value. Paragraph and Run property reference can be found at:
201
+ There is no 1:1 conversion between HTML and Office Open XML; however, a large
202
+ number of tags are very similar. HTML insertion is relatively complete,
203
+ covering several key content structures such as paragraphs, tables and lists.
204
+ The snippet above showcases some of the capabilities present; for a comprehensive
205
+ example please see the HTML insertion test fixture [here](test/fixtures/html/html_test_content.html).
206
+ All HTML element conversions are defined in [configuration.rb](lib/sablon/configuration/configuration.rb)
207
+ with their matching AST classes defined in [ast.rb](lib/sablon/html/ast.rb).
208
+
209
+ Basic conversion of CSS inline styles into matching WordML properties is possible
210
+ using the `style=" ... "` attribute in the HTML markup. Not all CSS properties
211
+ are supported as only a small subset of CSS styles have a direct Office Open XML
212
+ equivalent. Styles are passed on to nested elements if the parent can't use them.
213
+ The currently supported styles are also defined in [configuration.rb](lib/sablon/configuration/configuration.rb). Toggle
214
+ properties that aren't directly supported can be added using the
215
+ `text-decoration: ` style attribute with the proper XML tag name as the
216
+ value (e.g. `text-decoration: dstrike` for `w:dstrike`). Simple single value properties that do not need a conversion can be added using the XML property name directly, omitting the `w:` prefix
217
+ (e.g. `highlight: cyan` for `w:highlight`).
218
+
219
+ Table, Paragraph and Run property references can be found at:
159
220
  * http://officeopenxml.com/WPparagraphProperties.php
160
221
  * http://officeopenxml.com/WPtextFormatting.php
222
+ * http://officeopenxml.com/WPtableProperties.php
223
+
224
+ The full Office Open XML specification used to develop the HTML converter
225
+ can be found [here](https://www.ecma-international.org/publications/standards/Ecma-376.htm) (3rd Edition).
161
226
 
162
- If you wish to write out your HTML code in an indented human readable fashion, or you are pulling content from the ERB templating engine in rails the following regular expression can help eliminate extraneous whitespace in the final document.
163
- ```ruby
164
- # combine all white space
165
- html_str = html_str.gsub(/\s+/, ' ')
166
- # clear any white space between block level tags and other content
167
- html_str.gsub(%r{\s*<(/?(?:h\d|div|p|br|ul|ol|li).*?)>\s*}, '<\1>')
168
- ```
169
227
 
170
- IMPORTANT: Currently, the insertion will replace the containing paragraph. This means that other content in the same paragraph is discarded.
228
+ The example above shows an HTML insertion operation that will replace the entire paragraph. In the same fashion as WordML, inline HTML insertion is possible where only the merge field is replaced, as long as only "inline" elements are used. "Inline" in this context does not necessarily mean the same thing as it does in CSS; here it means that once the HTML is converted to WordML, only valid children of a paragraph (w:p) tag exist. Unlike WordML insertion, plain text can be used without being wrapped in tags when working with HTML. See the example below:
171
229
 
230
+ ```ruby
231
+ inline_html = <<-HTML.strip
232
+ This text can contain <em>additional formatting</em> according to the
233
+ <strong>HTML</strong> specification. As well as links to external
234
+ <a href="https://github.com/senny/sablon">websites</a>, don't forget
235
+ the "http/https" bit.
236
+ HTML
237
+ context = {
238
+ article: Sablon.content(:html, inline_html)
239
+ # alternative method using special key format
240
+ # 'html:article' => inline_html
241
+ }
242
+ template.render_to_file File.expand_path("~/Desktop/output.docx"), context
243
+ ```
172
244
 
173
245
  #### Conditionals
174
246
 
data/lib/sablon.rb CHANGED
@@ -3,6 +3,7 @@ require 'nokogiri'
3
3
 
4
4
  require "sablon/version"
5
5
  require "sablon/configuration/configuration"
6
+ require "sablon/relationship"
6
7
 
7
8
  require "sablon/numbering"
8
9
  require "sablon/context"
@@ -53,11 +53,16 @@ module Sablon
53
53
  @permitted_html_tags = {}
54
54
  tags = {
55
55
  # special tag used for elements with no parent, i.e. top level
56
- '#document-fragment' => { type: :block, ast_class: :root, allowed_children: :_block },
56
+ '#document-fragment' => { type: :block, ast_class: :root, allowed_children: %i[_block _inline] },
57
57
 
58
58
  # block level tags
59
+ table: { type: :block, ast_class: :table, allowed_children: %i[caption thead tbody tfoot tr ]},
60
+ tr: { type: :block, ast_class: :table_row, allowed_children: %i[th td] },
61
+ th: { type: :block, ast_class: :table_cell, properties: { b: nil, jc: 'center' }, allowed_children: %i[_block _inline] },
62
+ td: { type: :block, ast_class: :table_cell, allowed_children: %i[_block _inline] },
59
63
  div: { type: :block, ast_class: :paragraph, properties: { pStyle: 'Normal' }, allowed_children: :_inline },
60
64
  p: { type: :block, ast_class: :paragraph, properties: { pStyle: 'Paragraph' }, allowed_children: :_inline },
65
+ caption: { type: :block, ast_class: :paragraph, properties: { pStyle: 'Caption' }, allowed_children: :_inline },
61
66
  h1: { type: :block, ast_class: :paragraph, properties: { pStyle: 'Heading1' }, allowed_children: :_inline },
62
67
  h2: { type: :block, ast_class: :paragraph, properties: { pStyle: 'Heading2' }, allowed_children: :_inline },
63
68
  h3: { type: :block, ast_class: :paragraph, properties: { pStyle: 'Heading3' }, allowed_children: :_inline },
@@ -68,6 +73,11 @@ module Sablon
68
73
  ul: { type: :block, ast_class: :list, properties: { pStyle: 'ListBullet' }, allowed_children: %i[ul li] },
69
74
  li: { type: :block, ast_class: :list_paragraph },
70
75
 
76
+ # inline style tags for tables
77
+ thead: { type: :inline, ast_class: nil, properties: { tblHeader: nil }, allowed_children: :tr },
78
+ tbody: { type: :inline, ast_class: nil, properties: {}, allowed_children: :tr },
79
+ tfoot: { type: :inline, ast_class: nil, properties: {}, allowed_children: :tr },
80
+
71
81
  # inline style tags
72
82
  span: { type: :inline, ast_class: nil, properties: {} },
73
83
  strong: { type: :inline, ast_class: nil, properties: { b: nil } },
@@ -80,6 +90,7 @@ module Sablon
80
90
  sup: { type: :inline, ast_class: nil, properties: { vertAlign: 'superscript' } },
81
91
 
82
92
  # inline content tags
93
+ a: { type: :inline, ast_class: :hyperlink, properties: { rStyle: 'Hyperlink' } },
83
94
  text: { type: :inline, ast_class: :run, properties: {}, allowed_children: [] },
84
95
  br: { type: :inline, ast_class: :newline, properties: {}, allowed_children: [] }
85
96
  }
@@ -122,6 +133,67 @@ module Sablon
122
133
  },
123
134
  'text-align' => ->(v) { return 'jc', v }
124
135
  },
136
+ # Styles specific to the Table AST class
137
+ table: {
138
+ 'border' => lambda { |v|
139
+ props = @defined_style_conversions[:node][:_border].call(v)
140
+ #
141
+ return 'tblBorders', [
142
+ { top: props }, { start: props }, { bottom: props },
143
+ { end: props }, { insideH: props }, { insideV: props }
144
+ ]
145
+ },
146
+ 'margin' => lambda { |v|
147
+ vals = v.split.map do |s|
148
+ @defined_style_conversions[:node][:_sz].call(s)
149
+ end
150
+ #
151
+ props = [vals[0], vals[0], vals[0], vals[0]] if vals.length == 1
152
+ props = [vals[0], vals[1], vals[0], vals[1]] if vals.length == 2
153
+ props = [vals[0], vals[1], vals[2], vals[1]] if vals.length == 3
154
+ props = [vals[0], vals[1], vals[2], vals[3]] if vals.length > 3
155
+ return 'tblCellMar', [
156
+ { top: { w: props[0], type: 'dxa' } },
157
+ { end: { w: props[1], type: 'dxa' } },
158
+ { bottom: { w: props[2], type: 'dxa' } },
159
+ { start: { w: props[3], type: 'dxa' } }
160
+ ]
161
+ },
162
+ 'cellspacing' => lambda { |v|
163
+ v = @defined_style_conversions[:node][:_sz].call(v)
164
+ return 'tblCellSpacing', { w: v, type: 'dxa' }
165
+ },
166
+ 'width' => lambda { |v|
167
+ v = @defined_style_conversions[:node][:_sz].call(v)
168
+ return 'tblW', { w: v, type: 'dxa' }
169
+ }
170
+ },
171
+ # Styles specific to the TableCell AST class
172
+ table_cell: {
173
+ 'border' => lambda { |v|
174
+ value = @defined_style_conversions[:table]['border'].call(v)[1]
175
+ return 'tcBorders', value
176
+ },
177
+ 'colspan' => ->(v) { return 'gridSpan', v },
178
+ 'margin' => lambda { |v|
179
+ value = @defined_style_conversions[:table]['margin'].call(v)[1]
180
+ return 'tcMar', value
181
+ },
182
+ 'rowspan' => lambda { |v|
183
+ return 'vMerge', 'restart' if v == 'start'
184
+ return 'vMerge', v if v == 'continue'
185
+ return 'vMerge', nil if v == 'end'
186
+ },
187
+ 'vertical-align' => ->(v) { return 'vAlign', v },
188
+ 'white-space' => lambda { |v|
189
+ return 'noWrap', nil if v == 'nowrap'
190
+ return 'tcFitText', 'true' if v == 'fit'
191
+ },
192
+ 'width' => lambda { |v|
193
+ value = @defined_style_conversions[:table]['width'].call(v)[1]
194
+ return 'tcW', value
195
+ }
196
+ },
125
197
  # Styles specific to the Paragraph AST class
126
198
  paragraph: {
127
199
  'border' => lambda { |v|
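The `margin` conversion in the hunk above follows the CSS shorthand rule for one to four values and emits them in the top, end, bottom, start order used for `tblCellMar`. A minimal sketch of just that expansion step, assuming the values have already been converted to numbers. `expand_margin` is a hypothetical helper name and not part of Sablon, and the explicit error for an empty list is a choice made for this sketch only:

```ruby
# Hypothetical helper: expands a CSS margin shorthand (1 to 4 values)
# into [top, end, bottom, start], mirroring the 'margin' lambda above.
def expand_margin(values)
  raise ArgumentError, 'margin needs at least one value' if values.empty?

  case values.length
  when 1 then [values[0]] * 4
  when 2 then [values[0], values[1], values[0], values[1]]
  when 3 then [values[0], values[1], values[2], values[1]]
  else values.first(4) # values beyond the fourth are ignored
  end
end

top, end_side, bottom, start_side = expand_margin([10, 20, 30])
puts "top=#{top} end=#{end_side} bottom=#{bottom} start=#{start_side}"
```

With three values the second one is reused for the start side, which matches the `vals.length == 3` branch in the diff.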
@@ -71,11 +71,85 @@ module Sablon
71
71
  def self.id; :word_ml end
72
72
  def self.wraps?(value) false end
73
73
 
74
+ def initialize(value)
75
+ super Nokogiri::XML.fragment(value)
76
+ end
77
+
74
78
  def append_to(paragraph, display_node, env)
75
- Nokogiri::XML.fragment(xml).children.reverse.each do |child|
76
- paragraph.add_next_sibling child
79
+ # if all nodes are inline then add them to the existing paragraph
80
+ # otherwise replace the paragraph with the new content.
81
+ if all_inline?
82
+ pr_tag = display_node.parent.at_xpath('./w:rPr')
83
+ add_siblings_to(display_node.parent, pr_tag)
84
+ display_node.parent.remove
85
+ else
86
+ add_siblings_to(paragraph)
87
+ paragraph.remove
88
+ end
89
+ end
90
+
91
+ # This allows proper equality checks with other WordML content objects.
92
+ # Due to the fact the `xml` attribute is a live Nokogiri object
93
+ # the default `==` comparison returns false unless it is the exact
94
+ # same object being compared. This method instead checks if the XML
95
+ # being added to the document is the same when the `other` object is
96
+ # an instance of the WordML content class.
97
+ def ==(other)
98
+ if other.class == self.class
99
+ xml.to_s == other.xml.to_s
100
+ else
101
+ super
102
+ end
103
+ end
104
+
105
+ private
106
+
107
+ # Returns `true` if all of the xml nodes to be inserted are inline,
+ # i.e. valid children of a w:p tag
108
+ def all_inline?
109
+ (xml.children.map(&:node_name) - inline_tags).empty?
110
+ end
111
+
112
+ # Array of tags allowed to be a child of the w:p XML tag as defined
113
+ # by the Open XML specification
114
+ def inline_tags
115
+ %w[w:bdo w:bookmarkEnd w:bookmarkStart w:commentRangeEnd
116
+ w:commentRangeStart w:customXml
117
+ w:customXmlDelRangeEnd w:customXmlDelRangeStart
118
+ w:customXmlInsRangeEnd w:customXmlInsRangeStart
119
+ w:customXmlMoveFromRangeEnd w:customXmlMoveFromRangeStart
120
+ w:customXmlMoveToRangeEnd w:customXmlMoveToRangeStart
121
+ w:del w:dir w:fldSimple w:hyperlink w:ins w:moveFrom
122
+ w:moveFromRangeEnd w:moveFromRangeStart w:moveTo
123
+ w:moveToRangeEnd w:moveToRangeStart m:oMath m:oMathPara
124
+ w:pPr w:proofErr w:r w:sdt w:smartTag]
125
+ end
126
+
127
+ # Adds the XML to be inserted in the document as siblings to the
128
+ # node passed in. Run properties are merged here because of namespace
129
+ # issues when working with a document fragment
130
+ def add_siblings_to(node, rpr_tag = nil)
131
+ xml.children.reverse.each do |child|
132
+ node.add_next_sibling child
133
+ # merge properties
134
+ next unless rpr_tag
135
+ merge_rpr_tags(child, rpr_tag.children)
136
+ end
137
+ end
138
+
139
+ # Merges the provided properties into the run properties of the
140
+ # node passed in. Properties are only added if they are not already
141
+ # defined on the node itself.
142
+ def merge_rpr_tags(node, props)
143
+ # first assert that all child runs (w:r tags) have a w:rPr tag
144
+ node.xpath('.//w:r').each do |child|
145
+ child.prepend_child '<w:rPr></w:rPr>' unless child.at_xpath('./w:rPr')
146
+ end
147
+ #
148
+ # merge run props, only adding them if they aren't already defined
149
+ node.xpath('.//w:rPr').each do |pr_tag|
150
+ existing = pr_tag.children.map(&:node_name)
151
+ props.map { |pr| pr_tag << pr unless existing.include? pr.node_name }
77
152
  end
78
- paragraph.remove
79
153
  end
80
154
  end
81
155
 
@@ -5,6 +5,7 @@ module Sablon
5
5
  attr_reader :template
6
6
  attr_reader :numbering
7
7
  attr_reader :context
8
+ attr_reader :relationship
8
9
 
9
10
  # returns a new environment with merged contexts
10
11
  def alter_context(context = {})
@@ -20,9 +21,11 @@ module Sablon
20
21
  if parent_env
21
22
  @template = parent_env.template
22
23
  @numbering = parent_env.numbering
24
+ @relationship = parent_env.relationship
23
25
  else
24
26
  @template = template
25
27
  @numbering = Numbering.new
28
+ @relationship = Relationship.new
26
29
  end
27
30
  #
28
31
  @context = Context.transform_hash(context)
@@ -1,4 +1,5 @@
1
1
  require "sablon/html/ast_builder"
2
+ require "sablon/html/node_properties"
2
3
 
3
4
  module Sablon
4
5
  class HTMLConverter
@@ -90,81 +91,6 @@ module Sablon
90
91
  end
91
92
  end
92
93
 
93
- # Manages the properties for an AST node
94
- class NodeProperties
95
- attr_reader :transferred_properties
96
-
97
- def self.paragraph(properties)
98
- new('w:pPr', properties, Paragraph::PROPERTIES)
99
- end
100
-
101
- def self.run(properties)
102
- new('w:rPr', properties, Run::PROPERTIES)
103
- end
104
-
105
- def initialize(tagname, properties, whitelist)
106
- @tagname = tagname
107
- filter_properties(properties, whitelist)
108
- end
109
-
110
- def inspect
111
- @properties.map { |k, v| v ? "#{k}=#{v}" : k }.join(';')
112
- end
113
-
114
- def [](key)
115
- @properties[key]
116
- end
117
-
118
- def []=(key, value)
119
- @properties[key] = value
120
- end
121
-
122
- def to_docx
123
- "<#{@tagname}>#{properties_word_ml}</#{@tagname}>" unless @properties.empty?
124
- end
125
-
126
- private
127
-
128
- # processes properties adding those on the whitelist to the
129
- # properties instance variable and those not to the transferred_properties
130
- # isntance variable
131
- def filter_properties(properties, whitelist)
132
- @transferred_properties = {}
133
- @properties = {}
134
- #
135
- properties.each do |key, value|
136
- if whitelist.include? key.to_s
137
- @properties[key] = value
138
- else
139
- @transferred_properties[key] = value
140
- end
141
- end
142
- end
143
-
144
- # processes attributes defined on the node into wordML property syntax
145
- def properties_word_ml
146
- @properties.map { |k, v| transform_attr(k, v) }.join
147
- end
148
-
149
- # properties that have a list as the value get nested in tags and
150
- # each entry in the list is transformed. When a value is a hash the
151
- # keys in the hash are used to explicitly build the XML tag attributes.
152
- def transform_attr(key, value)
153
- if value.is_a? Array
154
- sub_attrs = value.map do |sub_prop|
155
- sub_prop.map { |k, v| transform_attr(k, v) }
156
- end
157
- "<w:#{key}>#{sub_attrs.join}</w:#{key}>"
158
- elsif value.is_a? Hash
159
- props = value.map { |k, v| format('w:%s="%s"', k, v) if v }
160
- "<w:#{key} #{props.compact.join(' ')} />"
161
- else
162
- value = format('w:val="%s" ', value) if value
163
- "<w:#{key} #{value}/>"
164
- end
165
- end
166
- end
167
-
168
94
  # A container for an array of AST nodes with convenience methods to
169
95
  # work with the internal array as if it were a regular node
170
96
  class Collection < Node
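The `NodeProperties` class removed in this hunk appears to have moved to its own file, judging by the new `require "sablon/html/node_properties"`; that file's content is not shown here. As a reading aid, the sketch below restates the removed `transform_attr` logic as a standalone function so the three value shapes produced by the style conversions (plain value, hash, array of hashes) can be tried in isolation. `property_to_word_ml` is a hypothetical name used only for this illustration:

```ruby
# Standalone restatement of the removed NodeProperties#transform_attr.
# Arrays become nested tags, hashes become explicit w: attributes, and
# any other value becomes an optional w:val attribute.
def property_to_word_ml(key, value)
  if value.is_a?(Array)
    nested = value.map do |sub_prop|
      sub_prop.map { |k, v| property_to_word_ml(k, v) }
    end
    # Array#join flattens the nested arrays into one string
    "<w:#{key}>#{nested.join}</w:#{key}>"
  elsif value.is_a?(Hash)
    attrs = value.map { |k, v| format('w:%s="%s"', k, v) if v }
    "<w:#{key} #{attrs.compact.join(' ')} />"
  else
    val = value ? format('w:val="%s" ', value) : ''
    "<w:#{key} #{val}/>"
  end
end

puts property_to_word_ml(:b, nil)
puts property_to_word_ml(:jc, 'center')
puts property_to_word_ml(:tblBorders, [{ top: { sz: 4, val: 'single' } }])
```

The array form is what the new `border` and `margin` table conversions return, which is why each side ends up as a child tag of `w:tblBorders` or `w:tblCellMar`.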
@@ -189,6 +115,10 @@ module Sablon
189
115
  def inspect
190
116
  "[#{nodes.map(&:inspect).join(', ')}]"
191
117
  end
118
+
119
+ def <<(node)
120
+ @nodes << node
121
+ end
192
122
  end
193
123
 
194
124
  # Stores all of the AST nodes from the current fragment of HTML being
@@ -217,10 +147,23 @@ module Sablon
217
147
  # An AST node representing the top level content container for a word
218
148
  # document. These cannot be nested within other paragraph elements
219
149
  class Paragraph < Node
150
+ attr_accessor :runs
151
+
220
152
  PROPERTIES = %w[framePr ind jc keepLines keepNext numPr
221
153
  outlineLvl pBdr pStyle rPr sectPr shd spacing
222
154
  tabs textAlignment].freeze
223
- attr_accessor :runs
155
+
156
+ # Permitted child tags defined by the OpenXML spec
157
+ CHILD_TAGS = %w[w:bdo w:bookmarkEnd w:bookmarkStart w:commentRangeEnd
158
+ w:commentRangeStart w:customXml
159
+ w:customXmlDelRangeEnd w:customXmlDelRangeStart
160
+ w:customXmlInsRangeEnd w:customXmlInsRangeStart
161
+ w:customXmlMoveFromRangeEnd w:customXmlMoveFromRangeStart
162
+ w:customXmlMoveToRangeEnd w:customXmlMoveToRangeStart
163
+ w:del w:dir w:fldSimple w:hyperlink w:ins w:moveFrom
164
+ w:moveFromRangeEnd w:moveFromRangeStart w:moveTo
165
+ w:moveToRangeEnd w:moveToRangeStart m:oMath m:oMathPara
166
+ w:pPr w:proofErr w:r w:sdt w:smartTag]
224
167
 
225
168
  def initialize(env, node, properties)
226
169
  super
@@ -340,6 +283,195 @@ module Sablon
340
283
  end
341
284
  end
342
285
 
286
+ # Builds a table from html table tags
287
+ class Table < Node
288
+ PROPERTIES = %w[jc shd tblBorders tblCaption tblCellMar tblCellSpacing
289
+ tblInd tblLayout tblLook tblOverlap tblpPr tblStyle
290
+ tblStyleColBandSize tblStyleRowBandSize tblW].freeze
291
+
292
+ def initialize(env, node, properties)
293
+ super
294
+
295
+ # Process properties
296
+ properties = self.class.process_properties(properties)
297
+ @properties = NodeProperties.table(properties)
298
+ trans_props = transferred_properties
299
+
300
+ # Pull out the caption node if it exists and convert it separately.
301
+ # If multiple caption tags are defined, only the first one is kept.
302
+ @caption = node.xpath('./caption').remove
303
+ @caption = nil if @caption.empty?
304
+ if @caption
305
+ cap_side_pat = /caption-side: ?(top|bottom)/
306
+ @cap_side = @caption.attr('style').to_s.match(cap_side_pat).to_a[1]
307
+ node.add_previous_sibling @caption
308
+ @caption = ASTBuilder.html_to_ast(env, @caption, trans_props)[0]
309
+ end
310
+
311
+ # convert remaining child nodes and pass on transferrable properties
312
+ @children = ASTBuilder.html_to_ast(env, node.children, trans_props)
313
+ @children = Collection.new(@children)
314
+ end
315
+
316
+ def to_docx
317
+ if @caption && @cap_side == 'bottom'
318
+ super('w:tbl') + @caption.to_docx
319
+ elsif @caption
320
+ # caption always goes above table unless explicitly set to "bottom"
321
+ @caption.to_docx + super('w:tbl')
322
+ else
323
+ super('w:tbl')
324
+ end
325
+ end
326
+
327
+ def accept(visitor)
328
+ super
329
+ @children.accept(visitor)
330
+ end
331
+
332
+ def inspect
333
+ if @caption && @cap_side == 'bottom'
334
+ "<Table{#{@properties.inspect}}: #{@children.inspect}, #{@caption.inspect}>"
335
+ elsif @caption
336
+ "<Table{#{@properties.inspect}}: #{@caption.inspect}, #{@children.inspect}>"
337
+ else
338
+ "<Table{#{@properties.inspect}}: #{@children.inspect}>"
339
+ end
340
+ end
341
+
342
+ private
343
+
344
+ def children_to_docx
345
+ @children.to_docx
346
+ end
347
+ end
348
+
349
+ # Converts html table rows into wordML table rows
350
+ class TableRow < Node
351
+ PROPERTIES = %w[cantSplit hidden jc tblCellSpacing tblHeader
352
+ trHeight tblPrEx].freeze
353
+
354
+ def initialize(env, node, properties)
355
+ super
356
+ properties = self.class.process_properties(properties)
357
+ @properties = NodeProperties.table_row(properties)
358
+ #
359
+ trans_props = transferred_properties
360
+ @children = ASTBuilder.html_to_ast(env, node.children, trans_props)
361
+ @children = Collection.new(@children)
362
+ end
363
+
364
+ def to_docx
365
+ super('w:tr')
366
+ end
367
+
368
+ def accept(visitor)
369
+ super
370
+ @children.accept(visitor)
371
+ end
372
+
373
+ def inspect
374
+ "<TableRow{#{@properties.inspect}}: #{@children.inspect}>"
375
+ end
376
+
377
+ private
378
+
379
+ def children_to_docx
380
+ @children.to_docx
381
+ end
382
+ end
383
+
384
+ # Converts html table cells into wordML table cells
385
+ class TableCell < Node
386
+ PROPERTIES = %w[gridSpan hideMark noWrap shd tcBorders tcFitText
387
+ tcMar tcW vAlign vMerge].freeze
388
+
389
+ # Permitted child tags defined by the OpenXML spec
390
+ CHILD_TAGS = %w[w:altChunk w:bookmarkEnd w:bookmarkStart w:commentRangeEnd
391
+ w:commentRangeStart w:customXml w:customXmlDelRangeEnd
392
+ w:customXmlDelRangeStart w:customXmlInsRangeEnd
393
+ w:customXmlInsRangeStart w:customXmlMoveFromRangeEnd
394
+ w:customXmlMoveFromRangeStart w:customXmlMoveToRangeEnd
395
+ w:customXmlMoveToRangeStart w:del w:ins w:moveFrom
396
+ w:moveFromRangeEnd w:moveFromRangeStart w:moveTo
397
+ w:moveToRangeEnd w:moveToRangeStart m:oMath m:oMathPara
398
+ w:p w:permEnd w:permStart w:proofErr w:sdt w:tbl w:tcPr]
399
+
400
+ def initialize(env, node, properties)
401
+ super
402
+ properties = self.class.process_properties(properties)
403
+ @properties = NodeProperties.table_cell(properties)
404
+ #
405
+ # Nodes are processed first "as is" and then based on the XML
406
+ # generated wrapped by paragraphs.
407
+ trans_props = transferred_properties
408
+ @children = ASTBuilder.html_to_ast(env, node.children, trans_props)
409
+ @children = wrap_with_paragraphs(env, @children)
410
+ end
411
+
412
+ def to_docx
413
+ super('w:tc')
414
+ end
415
+
416
+ def accept(visitor)
417
+ super
418
+ @children.accept(visitor)
419
+ end
420
+
421
+ def inspect
422
+ "<TableCell{#{@properties.inspect}}: #{@children.inspect}>"
423
+ end
424
+
425
+ private
426
+
427
+ # Wraps nodes in Paragraph AST nodes if needed to produce a valid
428
+ # document
429
+ def wrap_with_paragraphs(env, nodes)
430
+ # convert all nodes to live xml, and use first node to determine
431
+ # if that AST node should be wrapped in a paragraph
432
+ nodes_xml = nodes.map { |n| Nokogiri::XML.fragment(n.to_docx) }
433
+ #
434
+ para = nil
435
+ new_nodes = []
436
+ nodes_xml.each_with_index do |n, i|
437
+ next unless n.children.first
438
+ # add all nodes that need wrapping to a paragraph sequentially.
439
+ # New paragraphs are created when something that doesn't need
440
+ # wrapping is encountered to retain proper content ordering.
441
+ first_node_name = n.children.first.node_name
442
+ if wrapped_by_paragraph.include? first_node_name
443
+ if para.nil?
444
+ para = new_paragraph(env)
445
+ new_nodes << para
446
+ end
447
+ para.runs << nodes[i]
448
+ else
449
+ new_nodes << nodes[i]
450
+ para = nil
451
+ end
452
+ end
453
+ # Ensure the table cell has an empty paragraph if nothing else
454
+ new_nodes << new_paragraph(env) if new_nodes.empty?
455
+ # filter nils and return
456
+ Collection.new(new_nodes.compact)
457
+ end
458
+
459
+ # Returns a list of child tags that need to be wrapped in a paragraph
460
+ def wrapped_by_paragraph
461
+ Paragraph::CHILD_TAGS - self.class::CHILD_TAGS
462
+ end
463
+
464
+ # Creates a new Paragraph AST node, with no children
465
+ def new_paragraph(env)
466
+ para = Nokogiri::HTML.fragment('<p></p>').first_element_child
467
+ ASTBuilder.html_to_ast(env, [para], transferred_properties).first
468
+ end
469
+
470
+ def children_to_docx
471
+ @children.to_docx
472
+ end
473
+ end
474
+
343
475
  # Create a run of text in the document, runs cannot be nested within
344
476
  # each other
345
477
  class Run < Node
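The grouping done by `TableCell#wrap_with_paragraphs` in the hunk above can be hard to follow inside the diff. The sketch below reproduces only the control flow with plain Ruby data, under the assumption that each item has already been classified as `:inline` (needs a paragraph wrapper) or `:block` (can sit in the cell as is). `group_for_cell` and the hash-based "paragraph" are hypothetical stand-ins, not Sablon objects:

```ruby
# Hypothetical stand-in for the wrapping logic: consecutive inline items
# share one paragraph, and any block item closes the current paragraph.
def group_for_cell(items)
  grouped = []
  current = nil # paragraph currently collecting inline items

  items.each do |kind, label|
    if kind == :inline
      if current.nil?
        current = { paragraph: [] }
        grouped << current
      end
      current[:paragraph] << label
    else
      grouped << label
      current = nil # the next inline item starts a fresh paragraph
    end
  end

  # A cell always ends up with at least one (possibly empty) paragraph.
  grouped << { paragraph: [] } if grouped.empty?
  grouped
end

items = [[:inline, 'run 1'], [:inline, 'run 2'], [:block, 'table'], [:inline, 'run 3']]
group_for_cell(items).each { |entry| p entry }
```

Resetting `current` after a block item is what keeps the content in its original order, as the comment in the diff describes.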
@@ -387,5 +519,46 @@ module Sablon
387
519
  "<w:br/>"
388
520
  end
389
521
  end
522
+
523
+ # Creates a clickable URL in the Word document. Only external URLs are
524
+ # supported.
525
+ class Hyperlink < Node
526
+ def initialize(env, node, properties)
527
+ super
528
+ # properties are passed directly to runs because hyperlink nodes
529
+ # don't have a corresponding property tag like runs or paragraphs.
530
+ @runs = ASTBuilder.html_to_ast(env, node.children, properties)
531
+ @runs = Collection.new(@runs)
532
+ @target = node.attributes['href'].value
533
+ #
534
+ hyperlink_relation = {
535
+ Id: 'rId' + SecureRandom.uuid.delete('-'),
536
+ Type: 'http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink',
537
+ Target: @target,
538
+ TargetMode: 'External'
539
+ }
540
+ env.relationship.relationships << hyperlink_relation
541
+ @attributes = { 'r:id' => hyperlink_relation[:Id] }
542
+ end
543
+
544
+ def to_docx
545
+ super('w:hyperlink')
546
+ end
547
+
548
+ def inspect
549
+ "<Hyperlink{target:#{@target}}: #{@runs.inspect}>"
550
+ end
551
+
552
+ def accept(visitor)
553
+ super
554
+ @runs.accept(visitor)
555
+ end
556
+
557
+ private
558
+
559
+ def children_to_docx
560
+ @runs.to_docx
561
+ end
562
+ end
390
563
  end
391
564
  end
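The `Hyperlink` node above registers each link as a relationship hash with a random `rId`. The sketch below rebuilds that hash outside of Sablon and shows one way it could be written as a `<Relationship/>` element. The serialization step is an assumption for illustration, since the `Relationship` class itself is not part of this excerpt, and both method names are hypothetical:

```ruby
require 'securerandom'

# Builds the same hash shape that Hyperlink#initialize pushes onto
# env.relationship.relationships in the diff above.
def external_hyperlink_relation(url)
  {
    Id: 'rId' + SecureRandom.uuid.delete('-'),
    Type: 'http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink',
    Target: url,
    TargetMode: 'External'
  }
end

# Assumption: a plain string serialization with minimal XML escaping.
# Sablon's own writer may differ.
def relationship_xml(relation)
  escapes = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }
  attrs = relation.map do |name, value|
    format('%s="%s"', name, value.gsub(/[&<>"]/, escapes))
  end
  "<Relationship #{attrs.join(' ')}/>"
end

relation = external_hyperlink_relation('https://github.com/senny/sablon')
puts relationship_xml(relation)
```

Because the id is random, two links to the same URL produce two separate relationship entries in this sketch.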