standard-procedure-consolidate 0.3.1 → 0.4.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: e7a086625fa1f07169c2f9eba544538df266fbd83702efeedd13998d1a9e83b3
-   data.tar.gz: f80f0c11aa839c9dba9c4dc41758471ccfc31d7d38159e902e11d48cc8016c82
+   metadata.gz: 39e8b33a8f43ddb4a3bff6ccfb91ee9608d741dc9dff3759ea7c45b1c3cf3d3c
+   data.tar.gz: 71d357d9e60a98f199c7d67ab3f6a0b5ee8999218162fabf43a4fc4283f04a85
  SHA512:
-   metadata.gz: 72b26cbc5f54cc481ad49cbb604338591e27f1a503330372ab41920a05f3cbde4b1d6b6aa4b607b1189fbd8fd5ed30102d9741b8b6cf762c0cb39c900ac19327
-   data.tar.gz: 33ab5fbe36794cf24ca73ce70ec6da8f91cfe5430a438d5794da54bd668e72dd956331401ac983f5d15b7f77d60ef8ac6a3d438e4a0fb6279736c6dd79342931
+   metadata.gz: e68a557b00ecb43feb93cdc9a5c9a4c09edd93f8421206be3372e617809bc9ce01cb9ad4928e6f97f49b6db64fd9a329dd2ff7360d5d7b618ce455d0f080c6b1
+   data.tar.gz: 94417403770d855c0df0d1927b3de00235a668503790e06388de83cadb3b3d2b533f394fab3ddcda5ace164bbe7394810c8ad6b00092eb553552f74fabaa2e5e
data/CHANGELOG.md CHANGED
@@ -1,3 +1,11 @@
+ # [0.4.0] - 2024-12-16
+
+ Image embedding works
+
+ # [0.3.9] - 2024-12-4
+
+ Image embedding - not fully tested but it seems to work in a few test cases
+
  ## [0.3.1] - 2024-11-22

  Ensure that the substituted nodes are reinserted correctly into the output document, attempting to restore formatting at the paragraph level (although it does lose formatting at lower levels than this - so-called "run" nodes which represent arbitrary spans of characters within the paragraph).
data/README.md CHANGED
@@ -32,7 +32,7 @@ Using the ruby library:

  ```ruby
  Consolidate::Docx::Merge.open "/path/to/file.docx" do |doc|
-   puts doc.field_names
+   puts doc.text_field_names

    doc.data first_name: "Alice", product: "Palm Pilot", date: "23rd January 2002", user_name: "Bob"
    doc.write_to "/path/to/merge-file.docx"
@@ -55,6 +55,27 @@ examine /path/to/file.docx verbose
  consolidate /path/to/file.docx /path/to/merge-file.docx first_name=Alice "product=Palm Pilot" "date=23rd January 2022" "user_name=Bob" verbose
  ```

+ ### Embedding images
+
+ If any of the merge fields end with `_image` then it is assumed that they represent an image to be substituted into the document. (At present this is not available on the command line).
+
+ ```ruby
+ Consolidate::Docx::Merge.open "/path/to/file.docx" do |doc|
+   puts doc.text_field_names
+   puts doc.image_field_names
+
+   doc.data first_name: "Alice", product: "Palm Pilot", date: "23rd January 2002", user_name: "Bob", header_logo_image: Consolidate::Image.new(name: "logo.png", width: 1024, height: 128, path: "/path/to/logo.png"), promotion_image: Consolidate::Image.new(name: "promotion.jpg", width: 2048, height: 2048, url: "https://myshop.com/promotion.jpg"), local_image: Consolidate::Image.new(name: "local.png", width: 256, height: 256, contents: File.open("/path/to/local.png"))
+   doc.write_to "/path/to/merge-file.docx"
+ end
+ ```
+
+ The `Consolidate::Image` can be used to provide an image in three ways:
+ - via a path
+ - via a URL
+ - via an IO object
+
+ Note - if the merge routine detects _any_ image fields within the document, it will attempt to load all image data (triggering file reads or HTTP requests) and it will embed those images into the output .docx file - even if the merge field in question is never used.
+
  ### History

  Originally, this gem was intended to open a Word .docx file, find the mailmerge fields within it and then substitute new values.
@@ -0,0 +1 @@
+ 778ec8cf89aa2d0932dd9169e3282acb810a898213efdf92b1cd50e9b8bec739142b8e50006bc8223c1a7b58b3ca86f64a9c1809ec8ce27ed7ef715d36e1b4d0
@@ -0,0 +1 @@
+ 58d1597374e775340c60e3e49b9d65437909b482f7be3040fb9fdca5d8d4b4d3020508a9ee01e3521b2c552e1b853d0475335d1ca1c092b4172c1e9ac86df640
data/lib/consolidate/docx/image.rb ADDED
@@ -0,0 +1,32 @@
+ # frozen_string_literal: true
+
+ require "zip"
+ require "nokogiri"
+
+ module Consolidate
+   module Docx
+     class Image < SimpleDelegator
+       # Path to use when referencing this image from other documents
+       def media_path = "media/#{name}"
+
+       # Path to use when storing this image within the docx
+       def storage_path = "word/#{media_path}"
+
+       # Convert width from pixels to EMU
+       def width = super * emu_per_width_pixel
+
+       # Convert height from pixels to EMU
+       def height = super * emu_per_height_pixel
+
+       # Get the width of this image in EMU up to a maximum page width (also in EMU)
+       def clamped_width(maximum = 7_772_400) = [width, maximum].min
+
+       # Get the height of this image in EMU adjusted for a maximum page width (also in EMU)
+       def clamped_height(maximum = 7_772_400) = (height * clamped_width(maximum).to_f / width.to_f).to_i
+
+       def emu_per_width_pixel = 914_400 / dpi[:x]
+
+       def emu_per_height_pixel = 914_400 / dpi[:y]
+     end
+   end
+ end
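The pixel-to-EMU arithmetic in `Consolidate::Docx::Image` can be checked by hand. What follows is a minimal standalone sketch, not the gem's class: the helper names `pixels_to_emu`, `clamp_width` and `clamp_height` are invented for illustration, and the 72 dpi default is the hard-coded value that `Consolidate::Image` uses in this release.

```ruby
# Standalone sketch of the arithmetic only; these helpers are hypothetical
# and are not part of the gem.
EMU_PER_INCH = 914_400

# Convert a pixel dimension to EMU (72 dpi assumed, as hard-coded in this release)
def pixels_to_emu(pixels, dpi = 72)
  pixels * (EMU_PER_INCH / dpi)
end

# Limit a width in EMU to a maximum width, also in EMU
def clamp_width(width_emu, maximum)
  [width_emu, maximum].min
end

# Scale the height by the same ratio that the width was reduced by
def clamp_height(width_emu, height_emu, maximum)
  (height_emu * clamp_width(width_emu, maximum).to_f / width_emu).to_i
end

width = pixels_to_emu(1024)   # the 1024x128 logo from the README example
height = pixels_to_emu(128)
maximum = 7_772_400           # default maximum used by clamped_width

puts clamp_width(width, maximum)          # 7772400
puts clamp_height(width, height, maximum) # 971550
```

At 72 dpi each pixel is 12,700 EMU, so the 1024-pixel logo is wider than the default maximum and both dimensions are scaled by the same ratio to preserve the aspect ratio.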
data/lib/consolidate/docx/image_reference_node_builder.rb ADDED
@@ -0,0 +1,88 @@
+ # frozen_string_literal: true
+
+ require "zip"
+ require "nokogiri"
+
+ module Consolidate
+   module Docx
+     class ImageReferenceNodeBuilder < Data.define(:field_name, :image, :node_id, :image_number, :document)
+       def call
+         Nokogiri::XML::Node.new("w:drawing", document).tap do |drawing|
+           drawing["xmlns:a"] = "http://schemas.openxmlformats.org/drawingml/2006/main"
+           drawing << Nokogiri::XML::Node.new("wp:inline", document).tap do |inline|
+             inline["distT"] = "0"
+             inline["distB"] = "0"
+             inline["distL"] = "0"
+             inline["distR"] = "0"
+             inline << Nokogiri::XML::Node.new("wp:extent", document).tap do |extent|
+               extent["cx"] = image.clamped_width(max_width_from(document))
+               extent["cy"] = image.clamped_height(max_width_from(document))
+             end
+             inline << Nokogiri::XML::Node.new("wp:effectExtent", document).tap do |effect_extent|
+               effect_extent["l"] = "0"
+               effect_extent["t"] = "0"
+               effect_extent["r"] = "0"
+               effect_extent["b"] = "0"
+             end
+             inline << Nokogiri::XML::Node.new("wp:cNvGraphicFramePr", document).tap do |c_nv_graphic_frame_pr|
+               c_nv_graphic_frame_pr << Nokogiri::XML::Node.new("a:graphicFrameLocks", document).tap do |graphic_frame_locks|
+                 graphic_frame_locks["noChangeAspect"] = true
+               end
+             end
+             inline << Nokogiri::XML::Node.new("a:graphic", document).tap do |graphic|
+               graphic["xmlns:a"] = "http://schemas.openxmlformats.org/drawingml/2006/main"
+               graphic << Nokogiri::XML::Node.new("a:graphicData", document).tap do |graphic_data|
+                 graphic_data["uri"] = "http://schemas.openxmlformats.org/drawingml/2006/picture"
+                 graphic_data << Nokogiri::XML::Node.new("pic:pic", document).tap do |pic|
+                   pic["xmlns:pic"] = "http://schemas.openxmlformats.org/drawingml/2006/picture"
+                   pic << Nokogiri::XML::Node.new("pic:nvPicPr", document).tap do |nv_pic_pr|
+                     nv_pic_pr << Nokogiri::XML::Node.new("pic:cNvPr", document).tap do |c_nv_pr|
+                       c_nv_pr["id"] = image_number
+                       c_nv_pr["name"] = image.name
+                       c_nv_pr["descr"] = image.name
+                       c_nv_pr["hidden"] = false
+                       c_nv_pr << Nokogiri::XML::Node.new("pic:cNvPicPr", document)
+                     end
+                   end
+                   pic << Nokogiri::XML::Node.new("pic:blipFill", document).tap do |blip_fill|
+                     blip_fill << Nokogiri::XML::Node.new("a:blip", document).tap do |blip|
+                       blip["r:embed"] = node_id
+                     end
+                     blip_fill << Nokogiri::XML::Node.new("a:stretch", document).tap do |stretch|
+                       stretch << Nokogiri::XML::Node.new("a:fillRect", document)
+                     end
+                   end
+                   pic << Nokogiri::XML::Node.new("pic:spPr", document).tap do |sp_pr|
+                     sp_pr << Nokogiri::XML::Node.new("a:xfrm", document).tap do |xfrm|
+                       xfrm << Nokogiri::XML::Node.new("a:off", document).tap do |off|
+                         off["x"] = "0"
+                         off["y"] = "0"
+                       end
+                       xfrm << Nokogiri::XML::Node.new("a:ext", document).tap do |ext|
+                         ext["cx"] = image.clamped_width(max_width_from(document))
+                         ext["cy"] = image.clamped_height(max_width_from(document))
+                       end
+                     end
+                     sp_pr << Nokogiri::XML::Node.new("a:prstGeom", document).tap do |prst_geom|
+                       prst_geom["prst"] = "rect"
+                       prst_geom << Nokogiri::XML::Node.new("a:avLst", document)
+                     end
+                   end
+                 end
+               end
+             end
+           end
+         end
+       end
+
+       DEFAULT_PAGE_WIDTH = 12_240
+       TWENTIETHS_OF_A_POINT_TO_EMU = 635
+       DEFAULT_PAGE_WIDTH_IN_EMU = DEFAULT_PAGE_WIDTH * TWENTIETHS_OF_A_POINT_TO_EMU
+
+       private def max_width_from document
+         page_width = (document.at_xpath("//w:sectPr/w:pgSz/@w:w")&.value || DEFAULT_PAGE_WIDTH).to_i
+         page_width * TWENTIETHS_OF_A_POINT_TO_EMU
+       end
+     end
+   end
+ end
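The page-width constants above can be related to each other with a little arithmetic. This is a rough sketch under stated assumptions: `page_width_in_emu` is a hypothetical helper that takes the `w:w` attribute value as a string (or `nil` when the attribute is missing) rather than a Nokogiri document, and the constant names other than the two numeric values are invented for illustration.

```ruby
# Illustrative constants; only 12_240 and 635 appear in the diff itself.
TWIPS_PER_INCH = 1_440                        # twentieths of a point: 72 points * 20
EMU_PER_INCH = 914_400
EMU_PER_TWIP = EMU_PER_INCH / TWIPS_PER_INCH  # 635, matching TWENTIETHS_OF_A_POINT_TO_EMU
DEFAULT_PAGE_WIDTH_TWIPS = 12_240             # 8.5 inches

# Hypothetical helper: convert a w:w attribute value (String or nil) to EMU,
# falling back to the default width when the attribute is absent
def page_width_in_emu(attribute_value)
  (attribute_value || DEFAULT_PAGE_WIDTH_TWIPS).to_i * EMU_PER_TWIP
end

puts page_width_in_emu(nil)      # 7772400, the default maximum in clamped_width
puts page_width_in_emu("11906")  # 7560310, an A4 page width
```

The fallback value of 7,772,400 EMU is the same number used as the default `maximum` argument in `Consolidate::Docx::Image#clamped_width`.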
data/lib/consolidate/docx/merge.rb CHANGED
@@ -2,69 +2,88 @@

  require "zip"
  require "nokogiri"
+ require_relative "image_reference_node_builder"
+ require_relative "image"

  module Consolidate
    module Docx
      class Merge
        def self.open(path, verbose: false, &block)
-         new(path, verbose: verbose, &block)
+         new(path, verbose: verbose, &block).tap do |merge|
+           block&.call merge
+         end
          path
        end

-       def initialize(path, verbose: false, &block)
+       def initialize(path, verbose: false)
          @verbose = verbose
-         @output = {}
          @zip = Zip::File.open(path)
          @documents = load_documents
-         block&.call self
+         @relations = load_relations
+         @contents_xml = load_and_update_contents_xml
+         @output = {}
+         @images = {}
+         @mapping = {}
        end

        # Helper method to display the contents of the document and the merge fields from the CLI
        def examine
-         documents = document_names.join(", ")
-         fields = field_names.join(", ")
-         puts "Documents: #{documents}"
-         puts "Merge fields: #{fields}"
+         puts "Documents: #{document_names.join(", ")}"
+         puts "Content documents: #{content_document_names.join(", ")}"
+         puts "Merge fields: #{text_field_names.join(", ")}"
+         puts "Image fields: #{image_field_names.join(", ")}"
        end

        # Read all documents within the docx and extract any merge fields
-       def field_names
-         tag_nodes.collect do |tag_node|
-           field_names_from tag_node
-         end.flatten.compact.uniq
-       end
+       def text_field_names = @text_field_names ||= tag_nodes.collect { |tag_node| text_field_names_from tag_node }.flatten.compact.uniq
+
+       # Read all documents within the docx and extract any image fields
+       def image_field_names = @image_field_names ||= tag_nodes.collect { |tag_node| image_field_names_from tag_node }.flatten.compact.uniq

        # List the documents stored within this docx
-       def document_names
-         @zip.entries.collect { |entry| entry.name }
-       end
+       def document_names = @zip.entries.map(&:name)

-       # Substitute the data from the merge fields with the values provided
-       def data mapping = {}
-         mapping = mapping.transform_keys(&:to_s)
+       # List the content within this docx
+       def content_document_names = @documents.keys
+
+       # List the field names that are present in the merge data
+       def merge_field_names = @mapping.keys

+       # Set the merge data and perform the substitution - creating copies of any documents that contain merge tags and replacing the tags with the supplied data
+       def data mapping = {}
+         @mapping = mapping.transform_keys(&:to_s)
          if verbose
-           puts "...substitutions..."
-           mapping.each do |key, value|
-             puts " #{key} => #{value}"
-           end
+           puts "...mapping data"
+           puts @mapping.keys.select { |field_name| text_field_names.include?(field_name) }.map { |field_name| "... #{field_name} => #{@mapping[field_name]}" }.join("\n")
          end

-         @documents.each do |name, document|
-           output_document = substitute document.dup, mapping: mapping, document_name: name
+         @images = load_images_and_link_relations

-           @output[name] = output_document.serialize save_with: 0
+         @documents.each do |name, document|
+           @output[name] = substitute(document.dup, document_name: name).serialize save_with: 0
          end
        end

-       # Write the new document to the given path
        def write_to path
          puts "...writing to #{path}" if verbose
          Zip::File.open(path, Zip::File::CREATE) do |out|
-           zip.each do |entry|
-             out.get_output_stream(entry.name) do |o|
-               o.write(output[entry.name] || zip.read(entry.name))
-             end
+           @output[contents_xml] = @contents_xml.serialize save_with: 0
+
+           @images.each do |field_name, image|
+             puts "... writing image #{field_name} to #{image.storage_path}" if verbose
+             out.get_output_stream(image.storage_path) { |o| o.write image.contents }
+           end
+
+           @relations.each do |relation_name, relations|
+             puts "... writing relations #{relation_name}" if verbose
+             out.get_output_stream(relation_name) { |o| o.write relations }
+           end
+
+           @zip.reject do |entry|
+             @relations.key? entry.name
+           end.each do |entry|
+             puts "... writing updated document to #{entry.name}" if verbose
+             out.get_output_stream(entry.name) { |o| o.write(@output[entry.name] || @relations[entry.name] || @zip.read(entry.name)) }
            end
          end
        end
@@ -72,81 +91,195 @@ module Consolidate
        private

        attr_reader :verbose
-       attr_reader :zip
-       attr_reader :xml
-       attr_reader :documents
-       attr_accessor :output
-       TAG = /\{\{\s*(\S+)\s*\}\}/
+
+       def contents_xml = "[Content_Types].xml"
+
+       # Regex to find merge fields that contain text
+       def text_tag = /\{\{\s*(?!.*_image\b)(\S+)\s*\}\}/i
+
+       # Regex to find merge fields that contain images
+       def image_tag = /\{\{\s*(\S+_image)\s*\}\}/i
+
+       # Regex to find merge fields containing the given field name
+       def tag_for(field_name) = /\{\{\s*#{field_name}\s*\}\}/
+
+       # Find all nodes in all relevant documents that contain a merge field
+       def tag_nodes = @documents.collect { |name, document| tag_nodes_for document }.flatten
+
+       # go through all paragraph nodes of the document
+       # selecting any that contain a merge tag
+       def tag_nodes_for(document) = (document / "//w:p").select { |paragraph| paragraph.content.match(text_tag) || paragraph.content.match(image_tag) }
+
+       # Extract the text field name(s) from the paragraph
+       def text_field_names_from(tag_node) = (matches = tag_node.content.scan(text_tag)).empty? ? nil : matches.flatten.map(&:strip)
+
+       # Extract the image field name(s) from the paragraph
+       def image_field_names_from(tag_node) = (matches = tag_node.content.scan(image_tag)).empty? ? nil : matches.flatten.map(&:strip)
+
+       # Unique number for each image field
+       def relation_number_for(field_name) = @mapping.keys.index(field_name) + 1000
+
+       # Identifier to use when linking a merge field to the actual image file contents
+       def relation_id_for(field_name) = "rId#{relation_number_for(field_name)}"
+
+       # Empty relations document for documents that do not already have one
+       def default_relations_document = %(<?xml version="1.0" encoding="UTF-8"?><Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships"></Relationships>)

        def load_documents
-         @zip.entries.each_with_object({}) do |entry, documents|
+         @zip.entries.each_with_object({}) do |entry, results|
+           next unless entry.name.match?(/word\/(document|header|footer|footnotes|endnotes).?\.xml/)
+           puts "...reading document #{entry.name}" if verbose
+           contents = @zip.get_input_stream entry
+           results[entry.name] = Nokogiri::XML(contents) { |x| x.noent }
+         end
+       end
+
+       def load_relations
+         @zip.entries.each_with_object({}) do |entry, results|
            next unless entry.name.match?(/word\/(document|header|footer|footnotes|endnotes).?\.xml/)
-           puts "...reading #{entry.name}" if verbose
-           xml = @zip.get_input_stream entry
-           documents[entry.name] = Nokogiri::XML(xml) { |x| x.noent }
+           relation_document = entry.name.gsub("word/", "word/_rels/").gsub(".xml", ".xml.rels")
+           puts "...reading or building relations for #{relation_document}" if verbose
+           contents = @zip.find_entry(relation_document) ? @zip.get_input_stream(relation_document) : default_relations_document
+           results[relation_document] = Nokogiri::XML(contents) { |x| x.noent }
          end
        ensure
          @zip.close
        end

-       # Collect all the nodes that contain merge fields
-       def tag_nodes
-         documents.collect do |name, document|
-           tag_nodes_for document
-         end.flatten
+       # Create relation links for each image field and store the image data
+       def load_images_and_link_relations
+         load_images.tap do |images|
+           link_relations_to images
+         end
+       end
+
+       # Build a mapping of image paths to the image data so that the image data can be stored in the output docx
+       def load_images
+         image_field_names.each_with_object({}) do |field_name, result|
+           result[field_name] = Consolidate::Docx::Image.new(@mapping[field_name])
+           puts "... #{field_name} => #{result[field_name].media_path}" if verbose
+         end
        end

-       # go through all w:t (Word Text???) nodes of the document
-       # find any nodes that contain "{{"
-       # then find the ancestor node that also includes the ending "}}"
-       # This collection of nodes contains all the merge fields for this document
-       def tag_nodes_for document
-         (document / "//w:p").select do |paragraph|
-           paragraph.content.match(TAG)
+       # Update all relation documents to include a relationship for each image field and its stored image path
+       def link_relations_to images
+         @relations.each do |name, xml|
+           puts "... linking images in #{name}" if verbose
+           images.each do |field_name, image|
+             # Is this image already referenced in this relationship document?
+             next unless xml.at_xpath("//Relationship[@Target=\"#{image.media_path}\"]").nil?
+             puts "... #{relation_id_for(field_name)} => #{image.media_path}" if verbose
+             xml.root << Nokogiri::XML::Node.new("Relationship", xml).tap do |relation|
+               relation["Id"] = relation_id_for(field_name)
+               relation["Type"] = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/image"
+               relation["Target"] = image.media_path
+             end
+           end
          end
        end

-       # Extract the merge field name from the node
-       def field_names_from(tag_node)
-         matches = tag_node.content.scan(TAG)
-         matches.empty? ? nil : matches.flatten.map(&:strip)
+       def load_and_update_contents_xml
+         puts "...reading and updating #{contents_xml}" if verbose
+         content = @zip.get_input_stream(contents_xml)
+         Nokogiri::XML(content) { |x| x.noent }.tap do |document|
+           add_content_relations_to document
+         end
        end

        # Go through the given document, replacing any merge fields with the values provided
        # and storing the results in a new document
-       def substitute document, document_name:, mapping: {}
+       def substitute document, document_name:
+         puts "...substituting fields in #{document_name}" if verbose && tag_nodes_for(document).any?
+         substitute_text document, document_name: document_name
+         substitute_images document, document_name: document_name
+       end
+
+       def substitute_text document, document_name:
          tag_nodes_for(document).each do |tag_node|
-           field_names = field_names_from tag_node
-           puts "Original Node for #{field_names} is #{tag_node}" if verbose
+           field_names = text_field_names_from(tag_node) || []

-           # Extract the paragraph properties node if it exists
+           # Extract the properties (formatting) nodes if they exist
            paragraph_properties = tag_node.search ".//w:pPr"
            run_properties = tag_node.at_xpath ".//w:rPr"

+           # Get the current contents, then substitute any text fields
            text = tag_node.content
+
            field_names.each do |field_name|
-             field_value = mapping[field_name].to_s
-             puts "...substituting #{field_name} with #{field_value} in #{document_name}" if verbose
-             text = text.gsub(/{{\s*#{field_name}\s*}}/, field_value)
+             field_value = @mapping[field_name].to_s
+             puts "... substituting '#{field_name}' with '#{field_value}'" if verbose
+             text = text.gsub(tag_for(field_name), field_value)
            end

            # Create a new text node with the substituted text
            text_node = Nokogiri::XML::Node.new("w:t", tag_node.document)
            text_node.content = text

-           # Create a new run node to hold the substituted text and the paragraph properties
+           # Create a new run node to hold the run properties and substitute text node
            run_node = Nokogiri::XML::Node.new("w:r", tag_node.document)
-           run_node << run_properties if run_properties
+           run_node << run_properties unless run_properties.nil?
            run_node << text_node
+           # Add the paragraph properties and the run node to the tag node
            tag_node.children = Nokogiri::XML::NodeSet.new(document, paragraph_properties.to_a + [run_node])
+         rescue => ex
+           # Have to mangle the exception message otherwise it outputs the entire document
+           puts ex.message.to_s[0..255]
+           puts ex.backtrace.first
+         end
+         document
+       end
+
+       # Go through the given document, replacing any merge fields with the values provided
+       # and storing the results in a new document
+       def substitute_images document, document_name:
+         tag_nodes_for(document).each do |tag_node|
+           field_names = image_field_names_from(tag_node) || []
+           # Extract the properties (formatting) nodes if they exist
+           paragraph_properties = tag_node.search ".//w:pPr"
+           run_properties = tag_node.at_xpath ".//w:rPr"

-           puts "TAG NODE FOR #{field_names} IS #{tag_node}" if verbose
+           pieces = tag_node.content.split(image_tag)
+           # Split the content into pieces - either text or an image merge field
+           # Then replace the text with text nodes or the image merge fields with drawing nodes
+           replacement_nodes = pieces.collect do |piece|
+             field_name = piece.strip
+             if field_names.include? field_name
+               image = @images[field_name]
+               puts "... substituting '#{field_name}' with '<#{relation_id_for(field_name)}/>'" if verbose
+               ImageReferenceNodeBuilder.new(field_name: field_name, image: image, node_id: relation_id_for(field_name), image_number: relation_number_for(field_name), document: document).call
+             else
+               Nokogiri::XML::Node.new("w:t", document) { |t| t.content = piece }
+             end
+           end
+           run_nodes = (replacement_nodes.map { |node| Nokogiri::XML::Node.new("w:r", document) { |run_node| run_node.children = node } } + [run_properties]).compact
+           tag_node.children = Nokogiri::XML::NodeSet.new(document, paragraph_properties.to_a + run_nodes)
          rescue => ex
            # Have to mangle the exception message otherwise it outputs the entire document
            puts ex.message.to_s[0..255]
+           puts ex.backtrace.first
          end
          document
        end
+
+       CONTENT_RELATIONS = {
+         jpeg: "image/jpg",
+         png: "image/png",
+         bmp: "image/bmp",
+         gif: "image/gif",
+         tif: "image/tif",
+         pdf: "application/pdf",
+         mov: "application/movie"
+       }.freeze
+
+       def add_content_relations_to document
+         CONTENT_RELATIONS.each do |file_type, content_type|
+           next unless document.at_xpath("//Default[@Extension=\"#{file_type}\"]").nil?
+           document.root << Nokogiri::XML::Node.new("Default", document).tap do |relation|
+             relation["Extension"] = file_type
+             relation["ContentType"] = content_type
+           end
+         end
+       end
      end
    end
  end
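To see how the new `text_tag` and `image_tag` patterns classify merge fields, the two regular expressions can be tried on plain strings. This is a small sketch only: the patterns are copied from the hunk above, while the constants `TEXT_TAG` and `IMAGE_TAG` and the helpers `text_fields_in` and `image_fields_in` are invented for illustration and operate on a string instead of a Nokogiri paragraph node.

```ruby
# Patterns copied from the merge.rb hunk above
TEXT_TAG = /\{\{\s*(?!.*_image\b)(\S+)\s*\}\}/i
IMAGE_TAG = /\{\{\s*(\S+_image)\s*\}\}/i

# Hypothetical helpers (not part of the gem) listing field names found in paragraph text
def text_fields_in(paragraph)
  paragraph.scan(TEXT_TAG).flatten
end

def image_fields_in(paragraph)
  paragraph.scan(IMAGE_TAG).flatten
end

p text_fields_in("Dear {{ first_name }}, your {{product}} has shipped")
# => ["first_name", "product"]
p image_fields_in("{{ header_logo_image }}")
# => ["header_logo_image"]

# A paragraph that mixes a text field with an image field
mixed = "{{ first_name }} {{ header_logo_image }}"
p text_fields_in(mixed)   # => []
p image_fields_in(mixed)  # => ["header_logo_image"]
```

The last case is worth noting when reviewing this change: the negative lookahead `(?!.*_image\b)` scans the rest of the paragraph text, so a text field that precedes an image field in the same paragraph is not reported by the text pattern.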
data/lib/consolidate/image.rb ADDED
@@ -0,0 +1,27 @@
+ # frozen_string_literal: true
+
+ module Consolidate
+   class Image
+     attr_reader :name, :width, :height, :aspect_ratio, :dpi
+
+     def initialize name:, width:, height:, path: nil, url: nil, contents: nil
+       @name = name
+       @width = width
+       @height = height
+       @path = path
+       @url = url
+       @contents = contents
+       @aspect_ratio = width.to_f / height.to_f
+       # TODO: Read this from the contents
+       @dpi = {x: 72, y: 72}
+     end
+
+     def to_s = name
+
+     def contents = @contents ||= contents_from_path || contents_from_url
+
+     private def contents_from_path = @path.nil? ? nil : File.read(@path)
+
+     private def contents_from_url = @url.nil? ? nil : URI.open(@url).read # standard:disable Security/Open
+   end
+ end
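The `contents` method above resolves the image data lazily and memoises the result with `||=`. A minimal sketch of that pattern follows, assuming a simplified stand-in class: `LazyImage` is hypothetical, it is not the gem's `Consolidate::Image`, and it leaves out the URL branch so that the example runs without network access.

```ruby
require "tempfile"

# Hypothetical stand-in for illustration only; not the gem's Consolidate::Image
class LazyImage
  def initialize(path: nil, contents: nil)
    @path = path
    @contents = contents
  end

  # Use the supplied contents if present, otherwise read the file on first access
  # and keep the result for later calls
  def contents
    @contents ||= (@path.nil? ? nil : File.binread(@path))
  end
end

Tempfile.create("logo") do |file|
  file.write("fake image bytes")
  file.flush

  image = LazyImage.new(path: file.path)
  puts image.contents        # read from disk on the first call
  puts image.contents.length # served from the memoised value
end
```

Because nothing is read until `contents` is called, constructing the object is cheap; the cost is paid when the merge routine loads the image data.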
data/lib/consolidate/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module Consolidate
-   VERSION = "0.3.1"
+   VERSION = "0.4.0"
  end
data/lib/consolidate.rb CHANGED
@@ -1,4 +1,5 @@
  module Consolidate
    require_relative "consolidate/version"
+   require_relative "consolidate/image"
    require_relative "consolidate/docx/merge"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: standard-procedure-consolidate
  version: !ruby/object:Gem::Version
-   version: 0.3.1
+   version: 0.4.0
  platform: ruby
  authors:
  - Rahoul Baruah
- autorequire:
+ autorequire:
  bindir: exe
  cert_chain: []
- date: 2024-11-22 00:00:00.000000000 Z
+ date: 2024-12-16 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: rubyzip
@@ -63,10 +63,15 @@ files:
  - checksums/standard-procedure-consolidate-0.2.0.gem.sha512
  - checksums/standard-procedure-consolidate-0.3.0.gem.sha512
  - checksums/standard-procedure-consolidate-0.3.1.gem.sha512
+ - checksums/standard-procedure-consolidate-0.3.9.gem.sha512
+ - checksums/standard-procedure-consolidate-0.4.0.gem.sha512
  - exe/consolidate
  - exe/examine
  - lib/consolidate.rb
+ - lib/consolidate/docx/image.rb
+ - lib/consolidate/docx/image_reference_node_builder.rb
  - lib/consolidate/docx/merge.rb
+ - lib/consolidate/image.rb
  - lib/consolidate/version.rb
  - sig/standard/procedure/consolidate.rbs
  - tmp/.keep
@@ -78,7 +83,7 @@ metadata:
    homepage_uri: https://github.com/standard-procedure/standard-procedure-consolidate
    source_code_uri: https://github.com/standard-procedure/standard-procedure-consolidate
    changelog_uri: https://github.com/standard-procedure/standard-procedure-consolidate/blob/main/CHANGELOG.md
- post_install_message:
+ post_install_message:
  rdoc_options: []
  require_paths:
  - lib
@@ -94,7 +99,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
        version: '0'
  requirements: []
  rubygems_version: 3.4.19
- signing_key:
+ signing_key:
  specification_version: 4
  summary: Simple ruby mailmerge for Microsoft Word .docx files.
  test_files: []