standard-procedure-consolidate 0.1.3 → 0.2.0
- checksums.yaml +4 -4
- data/.nova/Configuration.json +3 -0
- data/CHANGELOG.md +9 -3
- data/README.md +61 -16
- data/checksums/standard-procedure-consolidate-0.1.4.gem.sha512 +1 -0
- data/checksums/standard-procedure-consolidate-0.2.0.gem.sha512 +1 -0
- data/exe/consolidate +31 -0
- data/exe/examine +18 -0
- data/lib/consolidate/docx/merge.rb +40 -58
- data/lib/consolidate/version.rb +1 -1
- metadata +7 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: f61fdd9cdf73d8609574952bca3a57839983f74f100aea6f8b54a4248b526efa
|
4
|
+
data.tar.gz: 0a64110c54d435f10c1465b96dd61e150c2ff56ae887788b5880d7bf489e93d5
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: ac0a18d02265eb79b2c9ffda09b2a13825e520a08faef45c1a0f3fd29877b11abf46776554a7c7e2e4157e560f0269889a95f3a9dc3dc4f49b66bda51848e37c
|
7
|
+
data.tar.gz: 1fce15f81e519792c058dc731cd298db46dc4163d7c0c6cd753c87c2ec7a672aa4b9e350d526d8a641ebdb13bd3dd98b1a9ebd8bb854cd6d0e2342899103b883
|
data/CHANGELOG.md
CHANGED
@@ -1,3 +1,12 @@
|
|
1
|
+
## [0.2.0] - 2023-09-13
|
2
|
+
|
3
|
+
Thrown away the mail-merge implementation and replaced it with a simple search/replace.
|
4
|
+
Added in the command line utilities for examining and consolidating documents.
|
5
|
+
|
6
|
+
## [0.1.4] - 2023-09-11
|
7
|
+
|
8
|
+
Updated which files get exclusions after crashes in production with some client files
|
9
|
+
|
1
10
|
## [0.1.3] - 2023-09-07
|
2
11
|
|
3
12
|
Customer had an issue where some fields were not merging correctly
|
@@ -21,6 +30,3 @@ Customer had an issue where some fields were not merging correctly
|
|
21
30
|
- Initial release
|
22
31
|
|
23
32
|
## [Unreleased]
|
24
|
-
|
25
|
-
|
26
|
-
|
data/README.md
CHANGED
@@ -1,35 +1,80 @@
|
|
1
1
|
# Standard::Procedure::Consolidate
|
2
2
|
|
3
|
-
A simple gem for performing
|
3
|
+
A simple gem for performing search and replace on Microsoft Word .docx files.
|
4
4
|
|
5
|
-
Important: I can't claim the credit for this - I found [this gist](https://gist.github.com/ericmason/7200448) and have
|
5
|
+
Important: I can't claim the credit for this - I found [this gist](https://gist.github.com/ericmason/7200448) and have adapted it for my needs.
|
6
6
|
|
7
|
-
|
7
|
+
## Search/Replace for field placeholders
|
8
8
|
|
9
|
+
If you have a Word .docx file that looks like this (ignoring formatting):
|
9
10
|
|
10
|
-
|
11
|
+
```
|
12
|
+
Dear {{ first_name }},
|
11
13
|
|
12
|
-
|
14
|
+
Thank you for your purchase of {{ product }} on the {{ date }}. We hope that with the proper care and attention it will give you years of happy use.
|
15
|
+
|
16
|
+
Regards
|
17
|
+
|
18
|
+
{{ user_name }}
|
19
|
+
```
|
20
|
+
|
21
|
+
We have marked out the "fields" by using squiggly brackets - in a manner similar to (but simpler than) Liquid or Mustache templates. In this example, we have fields for first_name, product, date and user_name.
|
22
|
+
|
23
|
+
Consolidate reads your .docx file, locates these fields and then replaces them with the values you have supplied, writing the output to a new file.
|
24
|
+
|
25
|
+
NOTE: These are not traditional Word "mail-merge fields" - these are just fragments of text that are within the Word document. See the history section for why this does not work with merge-fields.
|
13
26
|
|
14
27
|
## Usage
|
15
28
|
|
16
|
-
|
29
|
+
There is a command-line tool or a ruby library that you can include into your code.
|
30
|
+
|
31
|
+
Using the ruby library:
|
17
32
|
|
18
33
|
```ruby
|
19
|
-
Consolidate::Docx::Merge.open "/path/to/docx" do |
|
20
|
-
puts
|
21
|
-
|
34
|
+
Consolidate::Docx::Merge.open "/path/to/file.docx" do |doc|
|
35
|
+
puts doc.field_names
|
36
|
+
|
37
|
+
doc.data first_name: "Alice", product: "Palm Pilot", date: "23rd January 2002", user_name: "Bob"
|
38
|
+
doc.write_to "/path/to/merge-file.docx"
|
39
|
+
end
|
22
40
|
```
|
23
|
-
To perform a merge, replacing merge fields with supplied values:
|
24
41
|
|
25
|
-
|
26
|
-
|
27
|
-
|
28
|
-
|
29
|
-
|
42
|
+
Using the command line:
|
43
|
+
|
44
|
+
```sh
|
45
|
+
examine /path/to/file.docx
|
46
|
+
|
47
|
+
consolidate /path/to/file.docx /path/to/merge-file.docx first_name=Alice "product=Palm Pilot" "date=23rd January 2022" "user_name=Bob"
|
48
|
+
```
|
49
|
+
|
50
|
+
If you want to see what the routine is doing you can add the `verbose` option.
|
51
|
+
|
52
|
+
```sh
|
53
|
+
examine /path/to/file.docx verbose
|
54
|
+
|
55
|
+
consolidate /path/to/file.docx /path/to/merge-file.docx first_name=Alice "product=Palm Pilot" "date=23rd January 2022" "user_name=Bob" verbose
|
30
56
|
```
|
31
57
|
|
32
|
-
|
58
|
+
### History
|
59
|
+
|
60
|
+
Originally, this gem was intended to open a Word .docx file, find the mailmerge fields within it and then substitute new values.
|
61
|
+
|
62
|
+
I managed to get a basic version of this working with a variety of different Word files and all seemed good. Until my client reported that when they went to print the document, the mailmerge fields reappeared! My best guess is that Word is thinking that print-time is a trigger for merging in a data source (for example, printing out a form letter to 200 customers), so all the substitution work that this gem does is then discarded and Word asks for the merge data again. The frustrating thing is I can't figure out how Word keeps the references to the merge fields after they've been substituted.
|
63
|
+
|
64
|
+
So instead, this does a simple search and replace - looking for fields within squiggly brackets and substituting them.
|
65
|
+
|
66
|
+
### How it works
|
67
|
+
|
68
|
+
This is a bit sketchy and pieced together from the [code I found]((https://gist.github.com/ericmason/7200448)) and various bits of skimming through the published file format.
|
69
|
+
|
70
|
+
A .docx file is actually a .zip file that contains a number of .xml documents. Some of these are for storing formatting information (fonts, styles and various metadata) and some are for storing the actual document contents.
|
71
|
+
|
72
|
+
Consolidate looks for word/document.xml files, plus any files that match word/header*.xml, word/footer*.xml, word/footnotes*.xml and word/endnotes*.xml. It parses the XML, looking for text nodes that contain squiggly brackets. If it finds them, it then checks to see if we have supplied a data value for the matching field-name and replaces the contents of the node.
|
73
|
+
|
74
|
+
## Installation
|
75
|
+
|
76
|
+
$ bundle add standard-procedure-consolidate
|
77
|
+
|
33
78
|
|
34
79
|
## Development
|
35
80
|
|
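Note on the README's "Search/Replace for field placeholders" section: the idea can be tried on a plain string without opening a .docx file. The following is a minimal sketch, not the gem's implementation. The helper name `fill_placeholders` and the constant `PLACEHOLDER` are hypothetical, and the choice to leave unknown placeholders untouched is an assumption made for illustration.

```ruby
# Hypothetical helper, not part of the gem: replaces every {{ name }}
# placeholder in a plain string with the matching value.
PLACEHOLDER = /\{\{\s*(\S+?)\s*\}\}/

def fill_placeholders(text, fields)
  # Accept symbol or string keys, as the gem's `data` method does.
  values = fields.transform_keys(&:to_s)
  text.gsub(PLACEHOLDER) do |placeholder|
    name = Regexp.last_match(1)
    # Assumption: placeholders without a supplied value are left as they are.
    values.key?(name) ? values[name].to_s : placeholder
  end
end

letter = "Dear {{ first_name }}, thank you for your purchase of {{ product }} on the {{date}}."
puts fill_placeholders(letter, first_name: "Alice", product: "Palm Pilot")
```

The block form of `gsub` handles several placeholders in one string, and the lazy `\S+?` keeps adjacent placeholders such as `{{a}}{{b}}` separate.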
data/checksums/standard-procedure-consolidate-0.1.4.gem.sha512
ADDED
@@ -0,0 +1 @@
|
|
1
|
+
da681c73463215031518b2606899503d3217ab92e1877bd8f966e5dc3ebc971ae686958b78da821feb24374f84d179e86cdaf29dbb90f0bacb0a7cbde8e66c6b
|
data/checksums/standard-procedure-consolidate-0.2.0.gem.sha512
ADDED
@@ -0,0 +1 @@
|
|
1
|
+
872129b729c3531b72fbecc9bea75262ebc771d8e0a9f4d41b30ad87f73ee664aaccf8d602a410c3b500f067cd636f88c74eecb887cb68a345b6ea9d575c2140
|
data/exe/consolidate
CHANGED
@@ -1,3 +1,34 @@
|
|
1
1
|
#!/usr/bin/env ruby
|
2
2
|
|
3
3
|
require "consolidate"
|
4
|
+
|
5
|
+
if ARGV[0].nil? || ARGV[0] == ""
|
6
|
+
puts "# Standard::Procedure::Consolidate"
|
7
|
+
puts "## Mailmerge for simple Microsoft Word .docx files."
|
8
|
+
puts ""
|
9
|
+
puts "Create a new file with the mailmerge fields replaced by the data you have supplied"
|
10
|
+
puts "USAGE: consolidate path/to/myfile.docx path/to/mynewfile.docx \"field1=value1\" \"field2=value2\" \"field3=value3\""
|
11
|
+
puts ""
|
12
|
+
else
|
13
|
+
input = ARGV[0]
|
14
|
+
output = ARGV[1]
|
15
|
+
data = {}
|
16
|
+
verbose = false
|
17
|
+
2.upto ARGV.length do |index|
|
18
|
+
arg = ARGV[index]
|
19
|
+
next if arg.nil?
|
20
|
+
if arg.strip == "verbose"
|
21
|
+
verbose = true
|
22
|
+
else
|
23
|
+
pieces = arg.split("=")
|
24
|
+
key = pieces.first.strip
|
25
|
+
value = pieces.last.strip
|
26
|
+
data[key] = value
|
27
|
+
end
|
28
|
+
end
|
29
|
+
|
30
|
+
Consolidate::Docx::Merge.open input, verbose: verbose do |doc|
|
31
|
+
doc.data data
|
32
|
+
doc.write_to output
|
33
|
+
end and nil
|
34
|
+
end
|
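Note on the argument loop in exe/consolidate: `arg.split("=")` followed by `pieces.first` and `pieces.last` keeps only the text after the last `=`, so a value that itself contains `=` would be shortened. A minimal sketch of one alternative follows. The helper name `parse_field_args` is hypothetical and this is not the gem's code. It assumes arguments without an `=` (other than `verbose`) should be skipped.

```ruby
# Hypothetical helper, not part of the gem: turns ["a=1", "verbose"] style
# arguments into a data hash plus a verbose flag.
def parse_field_args(args)
  data = {}
  verbose = false
  args.each do |arg|
    if arg.strip == "verbose"
      verbose = true
    else
      # A limit of 2 keeps any later "=" characters inside the value.
      key, value = arg.split("=", 2)
      next if value.nil? # assumption: skip arguments without an "=" sign
      data[key.strip] = value.strip
    end
  end
  [data, verbose]
end

data, verbose = parse_field_args(["first_name=Alice", "formula=E=mc^2", "verbose"])
p data
p verbose
```

Iterating over the array directly also avoids indexing one position past the end of `ARGV`, which the script currently guards against with `next if arg.nil?`.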
data/exe/examine
ADDED
@@ -0,0 +1,18 @@
|
|
1
|
+
#!/usr/bin/env ruby
|
2
|
+
# frozen_string_literal: true
|
3
|
+
|
4
|
+
require "consolidate"
|
5
|
+
|
6
|
+
if ARGV[0].nil? || ARGV[0] == ""
|
7
|
+
puts "# Standard::Procedure::Consolidate"
|
8
|
+
puts "## Mailmerge for simple Microsoft Word .docx files."
|
9
|
+
puts ""
|
10
|
+
puts "Examine the mailmerge fields inside a .docx file"
|
11
|
+
puts "USAGE: examine path/to/myfile.docx"
|
12
|
+
puts "Option: examine path/to/myfile.docx verbose"
|
13
|
+
puts ""
|
14
|
+
else
|
15
|
+
Consolidate::Docx::Merge.open ARGV[0], verbose: (ARGV[1] == "verbose") do |doc|
|
16
|
+
doc.examine
|
17
|
+
end and nil
|
18
|
+
end
|
data/lib/consolidate/docx/merge.rb
CHANGED
@@ -12,23 +12,47 @@ module Consolidate
|
|
12
12
|
end
|
13
13
|
|
14
14
|
def examine
|
15
|
-
|
16
|
-
|
15
|
+
documents = document_names.join(", ")
|
16
|
+
fields = field_names.join(", ")
|
17
|
+
puts "Documents: #{documents}"
|
18
|
+
puts "Merge fields: #{fields}"
|
19
|
+
end
|
20
|
+
|
21
|
+
def field_names
|
22
|
+
documents.collect do |name, document|
|
23
|
+
(document / "//w:t").collect do |text_node|
|
24
|
+
next unless (matches = text_node.content.match(/{{\s*(\S+)\s*}}/))
|
25
|
+
field_name = matches[1].strip
|
26
|
+
puts "...field #{field_name} found in #{name}" if verbose
|
27
|
+
field_name
|
28
|
+
end.compact
|
29
|
+
end.flatten
|
30
|
+
end
|
31
|
+
|
32
|
+
def document_names
|
33
|
+
@zip.entries.collect { |entry| entry.name }
|
17
34
|
end
|
18
35
|
|
19
36
|
def data fields = {}
|
20
37
|
fields = fields.transform_keys(&:to_s)
|
21
38
|
|
39
|
+
if verbose
|
40
|
+
puts "...substitutions..."
|
41
|
+
fields.each do |key, value|
|
42
|
+
puts " #{key} => #{value}"
|
43
|
+
end
|
44
|
+
end
|
45
|
+
|
22
46
|
@documents.each do |name, document|
|
23
47
|
result = document.dup
|
24
|
-
result =
|
25
|
-
result = substitute_style_two_with result, fields
|
48
|
+
result = substitute result, fields, name
|
26
49
|
|
27
50
|
@output[name] = result.serialize save_with: 0
|
28
51
|
end
|
29
52
|
end
|
30
53
|
|
31
54
|
def write_to path
|
55
|
+
puts "...writing to #{path}" if verbose
|
32
56
|
Zip::File.open(path, Zip::File::CREATE) do |out|
|
33
57
|
zip.each do |entry|
|
34
58
|
out.get_output_stream(entry.name) do |o|
|
@@ -46,8 +70,6 @@ module Consolidate
|
|
46
70
|
attr_reader :documents
|
47
71
|
attr_accessor :output
|
48
72
|
|
49
|
-
EXCLUSIONS = %w{_rels/.rels [Content_Types].xml word/_rels/document.xml.rels word/theme/theme1.xml word/settings.xml word/_rels/settings.xml.rels word/styles.xml word/webSettings.xml word/fontTable.xml docProps/core.xml docProps/app.xml}
|
50
|
-
|
51
73
|
def initialize(path, verbose: false, &block)
|
52
74
|
raise "No block given" unless block
|
53
75
|
@verbose = verbose
|
@@ -56,8 +78,8 @@ module Consolidate
|
|
56
78
|
begin
|
57
79
|
@zip = Zip::File.open(path)
|
58
80
|
@zip.entries.each do |entry|
|
59
|
-
next
|
60
|
-
puts "
|
81
|
+
next unless entry.name.match?(/word\/(document|header|footer|footnotes|endnotes).?\.xml/)
|
82
|
+
puts "...reading #{entry.name}" if verbose
|
61
83
|
xml = @zip.get_input_stream entry
|
62
84
|
@documents[entry.name] = Nokogiri::XML(xml) { |x| x.noent }
|
63
85
|
end
|
@@ -67,56 +89,16 @@ module Consolidate
|
|
67
89
|
end
|
68
90
|
end
|
69
91
|
|
70
|
-
def
|
71
|
-
|
72
|
-
|
73
|
-
|
74
|
-
|
75
|
-
|
76
|
-
|
77
|
-
|
78
|
-
|
79
|
-
|
80
|
-
(document / "//w:fldSimple").collect do |field|
|
81
|
-
value = field.attributes["instr"].value.strip
|
82
|
-
puts "...found #{value} (v1) in #{name}" if verbose
|
83
|
-
value.include?("MERGEFIELD") ? value.gsub("MERGEFIELD", "").strip : nil
|
84
|
-
end.compact
|
85
|
-
end.flatten
|
86
|
-
end
|
87
|
-
|
88
|
-
def extract_style_two
|
89
|
-
documents.collect do |name, document|
|
90
|
-
(document / "//w:instrText").collect do |instr|
|
91
|
-
value = instr.inner_text
|
92
|
-
puts "...found #{value} (v2) in #{name}" if verbose
|
93
|
-
value.include?("MERGEFIELD") ? value.gsub("MERGEFIELD", "").strip : nil
|
94
|
-
end.compact
|
95
|
-
end.flatten
|
96
|
-
end
|
97
|
-
|
98
|
-
def substitute_style_one_with document, fields
|
99
|
-
# Word's first way of doing things
|
100
|
-
(document / "//w:fldSimple").each do |field|
|
101
|
-
if field.attributes["instr"].value =~ /MERGEFIELD (\S+)/
|
102
|
-
text_node = (field / ".//w:t").first
|
103
|
-
next unless text_node
|
104
|
-
puts "...substituting v1 #{field.attributes["instr"]} with #{fields[$1]}" if verbose
|
105
|
-
text_node.inner_html = fields[$1].to_s
|
106
|
-
end
|
107
|
-
end
|
108
|
-
document
|
109
|
-
end
|
110
|
-
|
111
|
-
def substitute_style_two_with document, fields
|
112
|
-
# Word's second way of doing things
|
113
|
-
(document / "//w:instrText").each do |instr|
|
114
|
-
if instr.inner_text =~ /MERGEFIELD (\S+)/
|
115
|
-
text_node = instr.parent.next_sibling.next_sibling.xpath(".//w:t").first
|
116
|
-
text_node ||= instr.parent.next_sibling.next_sibling.next_sibling.xpath(".//w:t").first
|
117
|
-
next unless text_node
|
118
|
-
puts "...substituting v2 #{instr.inner_text} with #{fields[$1]}" if verbose
|
119
|
-
text_node.inner_html = fields[$1].to_s
|
92
|
+
def substitute document, fields, document_name
|
93
|
+
(document / "//w:t").each do |text_node|
|
94
|
+
next unless (matches = text_node.content.match(/{{\s*(\S+)\s*}}/))
|
95
|
+
field_name = matches[1].strip
|
96
|
+
if fields.has_key? field_name
|
97
|
+
field_value = fields[field_name]
|
98
|
+
puts "...substituting #{field_name} with #{field_value} in #{document_name}" if verbose
|
99
|
+
text_node.content = text_node.content.gsub(matches[1], field_value).gsub("{{", "").gsub("}}", "")
|
100
|
+
elsif verbose
|
101
|
+
puts "...found #{field_name} but no replacement value"
|
120
102
|
end
|
121
103
|
end
|
122
104
|
document
|
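Note on the entry filter that replaces the `EXCLUSIONS` list: the pattern can be checked against a list of names without opening an archive. The sketch below copies the regular expression from the hunk above. The entry names are hypothetical examples, not taken from a real document.

```ruby
# Pattern copied from the initialize hunk in this diff.
CONTENT_ENTRY = /word\/(document|header|footer|footnotes|endnotes).?\.xml/

# Hypothetical entry names for illustration; a real archive may differ.
entries = [
  "[Content_Types].xml",
  "word/document.xml",
  "word/header1.xml",
  "word/footer2.xml",
  "word/footnotes.xml",
  "word/styles.xml",
  "word/header10.xml"
]

# Keep only the entries the pattern would select for parsing.
selected = entries.select { |name| name.match?(CONTENT_ENTRY) }
puts selected
```

Because `.?` allows at most one character after the base name, `word/header10.xml` is not selected by this pattern, which may matter for documents with ten or more headers or footers.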
data/lib/consolidate/version.rb
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: standard-procedure-consolidate
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.
|
4
|
+
version: 0.2.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Rahoul Baruah
|
8
8
|
autorequire:
|
9
9
|
bindir: exe
|
10
10
|
cert_chain: []
|
11
|
-
date: 2023-09-
|
11
|
+
date: 2023-09-13 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: rubyzip
|
@@ -43,10 +43,12 @@ email:
|
|
43
43
|
- rahoulb@standardprocedure.app
|
44
44
|
executables:
|
45
45
|
- consolidate
|
46
|
+
- examine
|
46
47
|
extensions: []
|
47
48
|
extra_rdoc_files: []
|
48
49
|
files:
|
49
50
|
- ".devcontainer/devcontainer.json"
|
51
|
+
- ".nova/Configuration.json"
|
50
52
|
- ".rspec"
|
51
53
|
- ".ruby-version"
|
52
54
|
- ".standard.yml"
|
@@ -58,7 +60,10 @@ files:
|
|
58
60
|
- checksums/standard-procedure-consolidate-0.1.0.gem.sha512
|
59
61
|
- checksums/standard-procedure-consolidate-0.1.2.gem.sha512
|
60
62
|
- checksums/standard-procedure-consolidate-0.1.3.gem.sha512
|
63
|
+
- checksums/standard-procedure-consolidate-0.1.4.gem.sha512
|
64
|
+
- checksums/standard-procedure-consolidate-0.2.0.gem.sha512
|
61
65
|
- exe/consolidate
|
66
|
+
- exe/examine
|
62
67
|
- lib/consolidate.rb
|
63
68
|
- lib/consolidate/docx/merge.rb
|
64
69
|
- lib/consolidate/version.rb
|