sablon 0.1.1 → 0.2.0

Files changed (56)
  1. checksums.yaml +4 -4
  2. data/Gemfile.lock +2 -2
  3. data/README.md +36 -5
  4. data/lib/sablon.rb +0 -3
  5. data/lib/sablon/configuration/html_tag.rb +1 -1
  6. data/lib/sablon/content.rb +56 -0
  7. data/lib/sablon/context.rb +2 -0
  8. data/lib/sablon/document_object_model/content_types.rb +35 -0
  9. data/lib/sablon/document_object_model/file_handler.rb +26 -0
  10. data/lib/sablon/document_object_model/model.rb +94 -0
  11. data/lib/sablon/document_object_model/numbering.rb +94 -0
  12. data/lib/sablon/document_object_model/relationships.rb +111 -0
  13. data/lib/sablon/environment.rb +13 -16
  14. data/lib/sablon/html/ast.rb +14 -13
  15. data/lib/sablon/html/ast_builder.rb +18 -5
  16. data/lib/sablon/html/node_properties.rb +3 -3
  17. data/lib/sablon/operations.rb +59 -0
  18. data/lib/sablon/processor/document.rb +48 -11
  19. data/lib/sablon/processor/section_properties.rb +11 -4
  20. data/lib/sablon/template.rb +88 -47
  21. data/lib/sablon/version.rb +1 -1
  22. data/misc/image-example.png +0 -0
  23. data/test/configuration_test.rb +22 -22
  24. data/test/content_test.rb +50 -0
  25. data/test/context_test.rb +37 -1
  26. data/test/environment_test.rb +4 -1
  27. data/test/executable_test.rb +0 -2
  28. data/test/fixtures/cv_sample.docx +0 -0
  29. data/test/fixtures/html_sample.docx +0 -0
  30. data/test/fixtures/images/c3po.jpg +0 -0
  31. data/test/fixtures/images/clone.jpg +0 -0
  32. data/test/fixtures/images/darth_vader.jpg +0 -0
  33. data/test/fixtures/images/r2d2.jpg +0 -0
  34. data/test/fixtures/images_sample.docx +0 -0
  35. data/test/fixtures/images_template.docx +0 -0
  36. data/test/fixtures/loops_sample.docx +0 -0
  37. data/test/fixtures/loops_template.docx +0 -0
  38. data/test/fixtures/recipe_sample.docx +0 -0
  39. data/test/fixtures/xml/image.xml +91 -0
  40. data/test/fixtures/xml/loop_with_unique_ids.xml +152 -0
  41. data/test/fixtures/xml/mock_document/word/document.xml +12 -0
  42. data/test/html/ast_test.rb +10 -5
  43. data/test/html/converter_style_test.rb +9 -9
  44. data/test/html/converter_test.rb +66 -81
  45. data/test/html/node_properties_test.rb +2 -2
  46. data/test/html_test.rb +2 -6
  47. data/test/processor/document_test.rb +80 -3
  48. data/test/processor/section_properties_test.rb +68 -0
  49. data/test/sablon_test.rb +77 -5
  50. data/test/test_helper.rb +109 -9
  51. metadata +33 -9
  52. data/lib/sablon/numbering.rb +0 -23
  53. data/lib/sablon/processor/numbering.rb +0 -47
  54. data/lib/sablon/relationship.rb +0 -47
  55. data/lib/sablon/test/assertions.rb +0 -22
  56. data/test/section_properties_test.rb +0 -41
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 9aa52423b659a11000ab780b41d5236af1004e39
- data.tar.gz: aab45da4847e119795627db459141db3d43c870b
+ metadata.gz: ed6349a93ba2cb199faa068f080fac477c2bc8d3
+ data.tar.gz: db1af586f261a43916313337fd504f6a68ddc610
  SHA512:
- metadata.gz: 8ef1f9feafe3b59eb90860ffef9e3db54b47a030d3195ca3cc27265d0fb92db688ddbeae47a947c939c516e3c175d313d768d100d97791ea958e7beec96c586c
- data.tar.gz: 5c881774d042b730eabe05a2d294cc667f035cfd28fccef66fc8abcd5f7d7ec463be6a860a8d3b6139138e0e61df1b32efe3754b4d31733687f76bbb46442cc6
+ metadata.gz: 63efa28431abe48b000a1cce4a8e22d065997fa8b010b9161c82fa69c13c972d6dbbbe45a8cb6ca100a884c44e867f2a483af0cfe682cfed0bfc2db66d4fbc50
+ data.tar.gz: 69c728d1c6cb4301e8521a19aabb80d8dfe4f09f8f2fec86a0f8ae6a311e49611a7da9be88e9c27764aaecb96f31fb343eb003283718040f59c8f0feb889232c
data/Gemfile.lock CHANGED
@@ -1,7 +1,7 @@
  PATH
  remote: .
  specs:
- sablon (0.1.1)
+ sablon (0.2.0)
  nokogiri (>= 1.6.0)
  rubyzip (>= 1.1.1)
  
@@ -10,7 +10,7 @@ GEM
  specs:
  mini_portile2 (2.3.0)
  minitest (5.10.3)
- nokogiri (1.8.1)
+ nokogiri (1.8.2)
  mini_portile2 (~> 2.3.0)
  rake (12.3.0)
  rubyzip (1.2.1)
data/README.md CHANGED
@@ -1,12 +1,12 @@
  # Sablon
  
- [![Gem Version](https://badge.fury.io/rb/sablon.svg)](http://badge.fury.io/rb/sablon) [![Build Status](https://travis-ci.org/senny/sablon.svg?branch=master)](https://travis-ci.org/senny/sablon)
+ [![Gem Version](https://badge.fury.io/rb/sablon.svg)](http://badge.fury.io/rb/sablon)
+ [![Build Status](https://travis-ci.org/senny/sablon.svg?branch=master)](https://travis-ci.org/senny/sablon)
  
  Is a document template processor for Word `docx` files. It leverages Word's
  built-in formatting and layouting capabilities to make template creation easy
  and efficient.
  
- *Note: Sablon is still in early development. Please report if you encounter any issues along the way.*
  
  #### Table of Contents
  * [Installation](#installation)
@@ -15,6 +15,7 @@ and efficient.
  * [Content Insertion](#content-insertion)
  * [WordProcessingML](#wordprocessingml)
  * [HTML](#html)
+ * [Images (Beta)](#images-beta)
  * [Conditionals](#conditionals)
  * [Loops](#loops)
  * [Nesting](#nesting)
@@ -225,14 +226,14 @@ The full Open Office XML specification used to develop the HTML converter
  can be found [here](https://www.ecma-international.org/publications/standards/Ecma-376.htm) (3rd Edition).
  
  
- The example above shows an HTML insertion operation that will replace the entire paragraph. In the same fashion as WordML, inline HTML insertion is possible where only the merge field is replaced as long as only "inline" elements are used. "Inline" in this context does not necessarily mean the same thing as it does in CSS, in this case it means that once the HTML is converted to WordML only valid children of a paragraph (w:p) tag exist. Unlike WordML insertion plain text can be used without being wrapped in tags when working with HTML, see the example below:
+ The example above shows an HTML insertion operation that will replace the entire paragraph. In the same fashion as WordML, inline HTML insertion is possible where only the merge field is replaced as long as only "inline" elements are used. "Inline" in this context does not necessarily mean the same thing as it does in CSS, in this case it means that once the HTML is converted to WordML only valid children of a paragraph (w:p) tag exist. As with WordML all plain text needs to be wrapped in a HTML tag. A simple `<span>..</span>` tag enclosing all other elements will suffice. See the example below:
  
  ```ruby
  inline_html = <<-HTML.strip
- This text can contain <em>additional formatting</em> according to the
+ <span>This text can contain <em>additional formatting</em> according to the
  <strong>HTML</strong> specification. As well as links to external
  <a href="https://github.com/senny/sablon">websites</a>, don't forget
- the "http/https" bit.
+ the "http/https" bit.</span>
  HTML
  context = {
  article: Sablon.content(:html, inline_html) }
@@ -242,6 +243,35 @@ context = {
  template.render_to_file File.expand_path("~/Desktop/output.docx"), context
  ```
  
+
+ ##### Images (beta)
+
+ Images can be added to the document using a placeholder image wrapped in a
+ pair of merge fields set up as `«@figure:start»` and `«@figure:end»`. Where
+ in this case "figure" is the key of the context hash storing the image.
+
+ Images are wrapped in an instance of a `Sablon::Content` class in the same
+ fashion as HTML or WordML strings. An image may be initialized from multiple
+ sources such as file paths, URLs, or any object that exposes a `#read`
+ method that returns image data. When using a "readable object" if the object
+ doesn't have a `#filename` method then a `filename: '...'` option
+ needs to be added to the `Sablon.content` method call.
+ ```ruby
+ context = {
+ figure: Sablon.content(:image, 'fixtures/images/c3po.jpg'),
+ figure2: Sablon.content(:image, string_io_obj, filename: 'test.png')
+ # alternative method using special key format for simple paths and URLs
+ # 'image:figure' => 'fixtures/images/c3po.jpg'
+ }
+ ```
+
+ Example:
+ ![image merge fields example](misc/image-example.png)
+
+ Additional examples of usage can be found in
+ [images_template.docx](test/fixtures/images_template.docx) and
+ in [sablon_test.rb](test/sablon_test.rb).
+
  #### Conditionals
  
  Sablon can render parts of the template conditionally based on the value of a
@@ -265,6 +295,7 @@ For more complex conditionals you can use a predicate like so:
  «body:endIf»
  ```
  
+
  #### Loops
  
  Loops repeat parts of the document.
data/lib/sablon.rb CHANGED
@@ -3,15 +3,12 @@ require 'nokogiri'
  
  require "sablon/version"
  require "sablon/configuration/configuration"
- require "sablon/relationship"
  
- require "sablon/numbering"
  require "sablon/context"
  require "sablon/environment"
  require "sablon/template"
  require "sablon/processor/document"
  require "sablon/processor/section_properties"
- require "sablon/processor/numbering"
  require "sablon/parser/mail_merge"
  require "sablon/operations"
  require "sablon/html/converter"
data/lib/sablon/configuration/html_tag.rb CHANGED
@@ -49,7 +49,7 @@ module Sablon
  # etc. All the keys need to be symbols to avoid getting reparsed
  # with the element's CSS attributes.
  @properties = options.fetch(:properties, {})
- @properties = Hash[@properties.map { |k, v| [k.to_sym, v] }]
+ @properties = Hash[@properties.map { |k, v| [k.to_s, v] }]
  # Set permitted child tags or tag groups
  self.allowed_children = options[:allowed_children]
  end
data/lib/sablon/content.rb CHANGED
@@ -1,3 +1,5 @@
+ require 'open-uri'
+
  module Sablon
  module Content
  class << self
@@ -170,8 +172,62 @@ module Sablon
  end
  end
  
+ # Handles reading image data and inserting it into the document
+ class Image < Struct.new(:name, :data, :local_rid)
+ attr_reader :rid_by_file
+
+ def self.id; :image end
+ def self.wraps?(value) false end
+
+ def inspect
+ "#<Image #{name}:#{@rid_by_file}>"
+ end
+
+ def initialize(source, attributes = {})
+ attributes = Hash[attributes.map { |k, v| [k.to_s, v] }]
+ # If the source object is readable, use it as such otherwise open
+ # and read the content
+ if source.respond_to?(:read)
+ name, img_data = process_readable(source, attributes)
+ else
+ name = File.basename(source)
+ img_data = IO.binread(source)
+ end
+ #
+ super name, img_data
+ @attributes = attributes
+ # rId's are separate for each XML file but I want to be able
+ # to reuse the actual image file itself.
+ @rid_by_file = {}
+ end
+
+ def append_to(paragraph, display_node, env) end
+
+ private
+
+ # Reads the data and attempts to find a filename from either the
+ # attributes hash or a #filename method on the source object itself.
+ # A filename is required inorder for MS Word to know the content type.
+ def process_readable(source, attributes)
+ if attributes['filename']
+ name = attributes['filename']
+ elsif source.respond_to?(:filename)
+ name = source.filename
+ else
+ begin
+ name = File.basename(source)
+ rescue TypeError
+ raise ArgumentError, "Error: Could not determine filename from source, try: `Sablon.content(readable_obj, filename: '...')`"
+ end
+ end
+ #
+ [File.basename(name), source.read]
+ end
+ end
+
  register Sablon::Content::String
  register Sablon::Content::WordML
  register Sablon::Content::HTML
+ register Sablon::Content::Image
  end
  end
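The filename lookup that `process_readable` performs can be tried in isolation. The following is a minimal sketch, assuming only the precedence visible in the diff (an explicit `filename` attribute first, then a `#filename` method on the source, then `File.basename`). The helper name `resolve_image_name` is hypothetical and not part of Sablon, and the error message is shortened.

```ruby
require 'stringio'

# Hypothetical helper mirroring the lookup order of Image#process_readable.
# It only resolves the name; it does not read any image data.
def resolve_image_name(source, attributes = {})
  attributes = Hash[attributes.map { |k, v| [k.to_s, v] }]
  name =
    if attributes['filename']
      attributes['filename']
    elsif source.respond_to?(:filename)
      source.filename
    else
      begin
        # Raises TypeError for objects that are not path-like (e.g. StringIO)
        File.basename(source)
      rescue TypeError
        raise ArgumentError, 'could not determine a filename from source'
      end
    end
  File.basename(name)
end

io = StringIO.new('fake image bytes')
puts resolve_image_name(io, filename: 'uploads/test.png') # → test.png
```

A plain `StringIO` passed without the `filename:` option ends up in the `rescue TypeError` branch, which is why the README asks for the option whenever the readable object has no `#filename` method.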
data/lib/sablon/context.rb CHANGED
@@ -17,6 +17,8 @@ module Sablon
  case value
  when Hash
  [key, transform_hash(value)]
+ when Array
+ [key, value.map { |v| v.is_a?(Hash) ? transform_hash(v) : v }]
  else
  [key, value]
  end
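The added `when Array` branch means that hashes nested inside arrays now receive the same transformation as top-level hashes. Below is a simplified sketch of that recursion. It assumes a transformation that only converts keys to strings, which is an assumption made for illustration: the real `transform_hash` is not shown in full in this diff, and `stringify_keys_deep` is a hypothetical name.

```ruby
# Hypothetical stand-in for the recursion in Sablon::Context.transform_hash.
# The only "transformation" applied here is converting keys to strings.
def stringify_keys_deep(hash)
  hash.each_with_object({}) do |(key, value), result|
    result[key.to_s] =
      case value
      when Hash
        stringify_keys_deep(value)
      when Array
        # Only hash elements are transformed; other elements pass through.
        value.map { |v| v.is_a?(Hash) ? stringify_keys_deep(v) : v }
      else
        value
      end
  end
end

context = { title: 'Menu', items: [{ name: 'Soup' }, 'plain string'] }
p stringify_keys_deep(context)
```

As in the diff, only hashes that are direct elements of an array are handled; an array nested inside another array is passed through unchanged.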
data/lib/sablon/document_object_model/content_types.rb ADDED
@@ -0,0 +1,35 @@
+ require 'sablon/document_object_model/file_handler'
+
+ module Sablon
+ module DOM
+ # Adds new content types to the document
+ class ContentTypes < FileHandler
+ #
+ # extends the Model class so it now has an "add_content_type" method
+ def self.extend_model(model_klass)
+ super do
+ define_method(:add_content_type) do |extension, type|
+ @dom['[Content_Types].xml'].add_content_type(extension, type)
+ end
+ end
+ end
+
+ # Sets up the class instance to handle new relationships for a document.
+ # I only care about tags that have an integer component
+ def initialize(xml_node)
+ super
+ #
+ @types = xml_node.root
+ end
+
+ # Adds a new content type to the file
+ def add_content_type(extension, type)
+ #
+ # don't add duplicate extensions to the document
+ return unless @types.css(%(Default[Extension="#{extension}"])).empty?
+ #
+ @types << %(<Default Extension="#{extension}" ContentType="#{type}"/>)
+ end
+ end
+ end
+ end
data/lib/sablon/document_object_model/file_handler.rb ADDED
@@ -0,0 +1,26 @@
+ module Sablon
+ module DOM
+ # An abstract class used to setup other file handling classes
+ class FileHandler
+ #
+ # extends the Model class using instance eval with a block argument
+ def self.extend_model(model_klass, &block)
+ model_klass.instance_eval(&block)
+ end
+
+ # All subclasses should be initialized only accepting the content
+ # as a single argument.
+ def initialize(content); end
+
+ # Finds the maximum value of an attribute by converting it to an
+ # integer. Non numeric portions of values are ignored. The method can
+ # be either xpath or css, xpath being the default.
+ def max_attribute_value(xml_node, selector, attr_name, query_method: :xpath)
+ xml_node.send(query_method, selector).map.inject(0) do |max, node|
+ next max unless (match = node.attr(attr_name).match(/(\d+)/))
+ [max, match[1].to_i].max
+ end
+ end
+ end
+ end
+ end
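The reduction inside `max_attribute_value` can be illustrated without Nokogiri. This is a minimal sketch that applies the same "first run of digits, keep the maximum" logic to plain attribute strings; `max_numeric_id` is a hypothetical name, and the `to_s` call is an addition in this sketch so that a missing (`nil`) attribute is skipped rather than raising.

```ruby
# Hypothetical plain-Ruby version of the reduction in
# FileHandler#max_attribute_value, working on strings instead of XML nodes.
def max_numeric_id(values)
  values.inject(0) do |max, value|
    match = value.to_s.match(/(\d+)/)
    next max unless match # no digits: keep the current maximum
    [max, match[1].to_i].max
  end
end

puts max_numeric_id(%w[rId1 rId12 rId3]) # → 12
puts max_numeric_id(['rIdFoo', nil])     # → 0
```

Starting the fold at 0 means the first id handed out after incrementing is 1 when the file contains no numbered entries.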
data/lib/sablon/document_object_model/model.rb ADDED
@@ -0,0 +1,94 @@
+ require 'sablon/document_object_model/file_handler'
+ require 'sablon/document_object_model/content_types'
+ require 'sablon/document_object_model/numbering'
+ require 'sablon/document_object_model/relationships'
+
+ module Sablon
+ # Stores classes used to build and interact with the template by treating
+ # it as a full document model instead of disparate components that are
+ # packaged together.
+ module DOM
+ class << self
+ # Allows new handlers to be registered for different components of
+ # the MS Word document. The pattern passed in is used to determine
+ # if a file in the entry set should be handled by the class.
+ def register_dom_handler(pattern, klass)
+ handlers[pattern] = klass
+ klass.extend_model(Sablon::DOM::Model)
+ end
+
+ def wrap_with_handler(entry_name, content)
+ key = handlers.keys.detect { |pat| entry_name =~ pat }
+ if key
+ handlers[key].new(content)
+ else
+ Sablon::DOM::FileHandler.new(content)
+ end
+ end
+
+ private
+
+ def handlers
+ @handlers ||= {}
+ end
+ end
+
+ # Object to represent an entire template and it's XML contents
+ class Model
+ attr_accessor :current_entry
+ attr_reader :zip_contents
+
+ # setup the DOM by reading and storing all XML files in the template
+ # in memory
+ def initialize(zip_io_stream)
+ @current_entry = nil
+ @zip_contents = {}
+ zip_io_stream.each do |entry|
+ next unless entry.file?
+ content = entry.get_input_stream.read
+ @zip_contents[entry.name] = wrap_entry(entry.name, content)
+ end
+ #
+ @dom = build_dom(@zip_contents)
+ end
+
+ # Returns the corresponding DOM handled file
+ def [](entry_name)
+ @dom[entry_name]
+ end
+
+ private
+
+ # Determines how the content in the zip file entry should be wrapped
+ def wrap_entry(entry_name, content)
+ if entry_name =~ /\.(?:xml|rels)$/
+ Nokogiri::XML(content)
+ else
+ content
+ end
+ end
+
+ # constructs the dom model using helper clases defined under this
+ # namespace.
+ def build_dom(entries)
+ key_values = entries.map do |entry_name, content|
+ [entry_name, Sablon::DOM.wrap_with_handler(entry_name, content)]
+ end
+ #
+ Hash[key_values]
+ end
+
+ def create_entry_if_not_exist(name, init_content = '')
+ return unless @zip_contents[name].nil?
+ #
+ # create the entry and add it to the dom
+ @zip_contents[name] = wrap_entry(name, init_content)
+ @dom[name] = Sablon::DOM.wrap_with_handler(name, @zip_contents[name])
+ end
+ end
+
+ register_dom_handler(%r{word/numbering.xml}, Sablon::DOM::Numbering)
+ register_dom_handler(/.rels$/, Sablon::DOM::Relationships)
+ register_dom_handler(/Content_Types/, Sablon::DOM::ContentTypes)
+ end
+ end
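The dispatch performed by `register_dom_handler` and `wrap_with_handler` is a small pattern-to-class registry. The sketch below reproduces that idea in plain Ruby under stated assumptions: `GenericHandler`, `RelsHandler`, `NumberingHandler` and `HandlerRegistry` are hypothetical stand-ins rather than Sablon classes, and the patterns escape the dot, which the patterns registered in the diff do not.

```ruby
# Hypothetical stand-ins for the handler classes; not Sablon's real classes.
class GenericHandler
  attr_reader :content

  def initialize(content)
    @content = content
  end
end

class RelsHandler < GenericHandler; end
class NumberingHandler < GenericHandler; end

# Minimal registry mapping a Regexp pattern to a handler class.
module HandlerRegistry
  @handlers = {}

  def self.register(pattern, klass)
    @handlers[pattern] = klass
  end

  # Wraps content with the first registered handler whose pattern matches
  # the entry name, falling back to GenericHandler when none matches.
  def self.wrap(entry_name, content)
    key = @handlers.keys.detect { |pattern| entry_name =~ pattern }
    (key ? @handlers[key] : GenericHandler).new(content)
  end
end

HandlerRegistry.register(%r{word/numbering\.xml}, NumberingHandler)
HandlerRegistry.register(/\.rels\z/, RelsHandler)

puts HandlerRegistry.wrap('word/_rels/document.xml.rels', '<xml/>').class # → RelsHandler
puts HandlerRegistry.wrap('word/media/image1.png', 'bytes').class         # → GenericHandler
```

Because Ruby hashes keep insertion order, the first registered pattern that matches wins, so overlapping patterns should be registered from most to least specific.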
data/lib/sablon/document_object_model/numbering.rb ADDED
@@ -0,0 +1,94 @@
+ require 'sablon/document_object_model/file_handler'
+
+ module Sablon
+ module DOM
+ # Manages the creation of new list definitions
+ class Numbering < FileHandler
+ Definition = Struct.new(:numid, :abstract_id, :style)
+
+ # extends the Model class using instance eval with a block argument
+ def self.extend_model(model_klass, &block)
+ super do
+ #
+ # adds a list definition to the numbering.xml file
+ define_method(:add_list_definition) do |style|
+ @dom['word/numbering.xml'].add_list_definition(style)
+ end
+ end
+ end
+
+ # Sets up the class to add new list definitions to the number.xml
+ # file
+ def initialize(xml_node)
+ super
+ #
+ @numbering = xml_node.root
+ #
+ @max_numid = max_attribute_value('//w:num', 'w:numId')
+ #
+ selector = '//w:abstractNum'
+ @max_abstract_id = max_attribute_value(selector, 'w:abstractNumId')
+ end
+
+ # adds a new relationship and returns the corresponding rId for it
+ def add_list_definition(style)
+ definition = create_definition(style)
+ #
+ # update numbering file with new definitions
+ node = @numbering.xpath('//w:abstractNum').last
+ node.add_next_sibling(abstract_tag(definition))
+ #
+ node = @numbering.xpath('//w:num').last
+ node.add_next_sibling(definition_tag(definition))
+ #
+ definition
+ end
+
+ private
+
+ # Finds the maximum value of an attribute by converting it to an
+ # integer. Non numeric portions of values are ignored.
+ def max_attribute_value(selector, attr_name)
+ super(@numbering, selector, attr_name)
+ end
+
+ # Creates a new list definition tag to define a list
+ def definition_tag(definition)
+ <<-XML.gsub(/^\s+|\n/, '')
+ <w:num w:numId="#{definition.numid}">
+ <w:abstractNumId w:val="#{definition.abstract_id}" />
+ </w:num>
+ XML
+ end
+
+ # Creates a new abstract numbering definition tag to style a list
+ def abstract_tag(definition)
+ abstract_num = find_abstract_definition(definition.style)
+ abstract_num['w:abstractNumId'] = definition.abstract_id
+ abstract_num.xpath('./w:nsid').each(&:remove)
+ #
+ abstract_num
+ end
+
+ # Locates and copies the first abstract numbering definition with
+ # the expected style. If one can not be found an error is raised.
+ def find_abstract_definition(style)
+ path = "//w:abstractNum[descendant-or-self::*[w:pStyle[@w:val='#{style}']]]"
+ unless (abstract_num = @numbering.at_xpath(path))
+ msg = "Could not find w:abstractNum definition for style: '#{style}'"
+ raise ArgumentError, msg
+ end
+ #
+ abstract_num.dup
+ end
+
+ # Creates a new instance of the Definition struct, after incrementing
+ # the max id values
+ def create_definition(style)
+ @max_numid += 1
+ @max_abstract_id += 1
+ Definition.new(@max_numid, @max_abstract_id, style)
+ end
+ end
+ end
+ end
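The id bookkeeping in `create_definition` and the one-line XML produced by `definition_tag` can be exercised without a numbering.xml file. The following is a minimal sketch: the `Definition` struct and the heredoc are copied from the diff, while `ListCounter` and its constructor arguments are hypothetical stand-ins for the state that `Sablon::DOM::Numbering` derives from the document.

```ruby
# Struct copied from the diff.
Definition = Struct.new(:numid, :abstract_id, :style)

# Hypothetical stand-in for the id bookkeeping in Sablon::DOM::Numbering.
class ListCounter
  def initialize(max_numid, max_abstract_id)
    @max_numid = max_numid
    @max_abstract_id = max_abstract_id
  end

  # Hands out the next free pair of ids for a list style.
  def create_definition(style)
    @max_numid += 1
    @max_abstract_id += 1
    Definition.new(@max_numid, @max_abstract_id, style)
  end

  # Builds the <w:num> tag; the gsub strips indentation and newlines so the
  # fragment ends up on a single line.
  def definition_tag(definition)
    <<-XML.gsub(/^\s+|\n/, '')
      <w:num w:numId="#{definition.numid}">
        <w:abstractNumId w:val="#{definition.abstract_id}" />
      </w:num>
    XML
  end
end

counter = ListCounter.new(2, 1)
definition = counter.create_definition('ListBullet')
puts counter.definition_tag(definition)
# → <w:num w:numId="3"><w:abstractNumId w:val="2" /></w:num>
```

Each call to `create_definition` advances both counters, so two lists that share a style still receive distinct `w:numId` and `w:abstractNumId` values.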