standard-procedure-consolidate 0.1.4 → 0.3.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: f45c16dc8e6f37f91043673804db8076030a804ab6c08fa1139f4d86cdf54b79
4
- data.tar.gz: 7c77802d6b578ac50b9f9c8b5bad9ff097918fab5644eb3c771b2baf98451a36
3
+ metadata.gz: 1bd2015cc0f88bc4b16bae7d8ecd335fb7b28c2bc940f81b1479e4a58ecce647
4
+ data.tar.gz: 44905254b9536f40f839d64d0448c4fb40b1a3720fb4d9ba3e2251468175eb9b
5
5
  SHA512:
6
- metadata.gz: 2ed822b460867d75395fdb9253ef3501619781d3ab31acaca7db4daf699658f8253d4a063336bcc2321bf6dd08f0207e91600172f8ec84e4263af055ee9107a7
7
- data.tar.gz: 37a8f85beb0b9c65ba76bf9457edcb9e141d1e87aca1f65c01b5a4e43f45f143c66695d5922eb3ffafca43625f9e3745db8fd2e3c516dd6ffb73172a8c9437ac
6
+ metadata.gz: 36dee64a5f21766e06e7b5eb213e5a5b4dee9ca04da8bde4b429c4d93c5bd49275226975c7e9887214ea5a3b5975c4f955b245ab45e0103656b928ab59854789
7
+ data.tar.gz: 687bba668eb1f137b67c82819561ec2e8c2c3bc9a54f03aa4b6ac76958ae3a15988af8db5f78e7e95adb59968b5764b32c37e71d257b7885d1eb0e10d6e3f56a
data/.devcontainer/devcontainer.json CHANGED
@@ -1,7 +1,23 @@
1
1
  // For format details, see https://aka.ms/devcontainer.json. For config options, see the
2
2
  // README at: https://github.com/devcontainers/templates/tree/main/src/ruby
3
3
  {
4
- "name": "Ruby",
5
- "image": "mcr.microsoft.com/devcontainers/ruby:1-3.2-bullseye",
6
- "postCreateCommand": "bundle install"
7
- }
4
+ "name": "Ruby",
5
+ "image": "mcr.microsoft.com/devcontainers/ruby:1-3.2-bullseye",
6
+ "postCreateCommand": "bundle install",
7
+ "customizations": {
8
+ "vscode": {
9
+ "extensions": [
10
+ "Shopify.ruby-extensions-pack",
11
+ "testdouble.vscode-standard-ruby",
12
+ "manuelpuyol.erb-linter",
13
+ "Shopify.ruby-lsp",
14
+ "aki77.rails-db-schema",
15
+ "miguel-savignano.ruby-symbols",
16
+ "sibiraj-s.vscode-scss-formatter",
17
+ "Thadeu.vscode-run-rspec-file",
18
+ "Cronos87.yaml-symbols",
19
+ "aliariff.vscode-erb-beautify"
20
+ ]
21
+ }
22
+ }
23
+ }
data/.ruby-version CHANGED
@@ -1 +1 @@
1
- 3.2.2
1
+ 3.2.5
data/.standard.yml CHANGED
@@ -1,3 +1,3 @@
1
1
  # For available configuration options, see:
2
2
  # https://github.com/testdouble/standard
3
- ruby_version: 2.6
3
+ ruby_version: 3.2.5
data/CHANGELOG.md CHANGED
@@ -1,3 +1,12 @@
1
+ ## [0.3.0] - 2024-11-21
2
+
3
+ Updated the code that examines the docx file for merge fields to deal with Word formatting tags being inserted in the middle of the merge fields.
4
+
5
+ ## [0.2.0] - 2023-09-13
6
+
7
+ Thrown away the mail-merge implementation and replaced it with a simple search/replace.
8
+ Added in the command line utilities for examining and consolidating documents.
9
+
1
10
  ## [0.1.4] - 2023-09-11
2
11
 
3
12
  Updated which files get exclusions after crashes in production with some client files
data/README.md CHANGED
@@ -1,51 +1,84 @@
1
1
  # Standard::Procedure::Consolidate
2
2
 
3
- A simple gem for performing mailmerge on Microsoft Word .docx files.
3
+ A simple gem for performing search and replace on Microsoft Word .docx files.
4
4
 
5
- Important: I can't claim the credit for this - I found [this gist](https://gist.github.com/ericmason/7200448) and have just adapted it.
5
+ Important: I can't claim the credit for this - I found [this gist](https://gist.github.com/ericmason/7200448) and have adapted it for my needs.
6
6
 
7
- It's pretty simple, so it probably won't work with complex Word documents, but it does what I need. YMMV.
7
+ ## Search/Replace for field placeholders
8
8
 
9
+ If you have a Word .docx file that looks like this (ignoring formatting):
9
10
 
10
- ## Installation
11
+ ```
12
+ Dear {{ first_name }},
11
13
 
12
- $ bundle add standard-procedure-consolidate
14
+ Thank you for your purchase of {{ product }} on the {{ date }}. We hope that with the proper care and attention it will give you years of happy use.
15
+
16
+ Regards
17
+
18
+ {{ user_name }}
19
+ ```
20
+
21
+ We have marked out the "fields" by using squiggly brackets - in a manner similar to (but simpler than) Liquid or Mustache templates. In this example, we have fields for first_name, product, date and user_name.
22
+
23
+ Consolidate reads your .docx file, locates these fields and then replaces them with the values you have supplied, writing the output to a new file.
24
+
25
+ NOTE: These are not traditional Word "mail-merge fields" - these are just fragments of text that are within the Word document. See the history section for why this does not work with merge-fields.
13
26
 
14
27
  ## Usage
15
28
 
16
- To list the merge fields within a document:
29
+ There is a command-line tool or a ruby library that you can include into your code.
30
+
31
+ Using the ruby library:
17
32
 
18
33
  ```ruby
19
- Consolidate::Docx::Merge.open "/path/to/docx" do |merge|
20
- merge.examine
21
- end and nil
34
+ Consolidate::Docx::Merge.open "/path/to/file.docx" do |doc|
35
+ puts doc.field_names
36
+
37
+ doc.data first_name: "Alice", product: "Palm Pilot", date: "23rd January 2002", user_name: "Bob"
38
+ doc.write_to "/path/to/merge-file.docx"
39
+ end
22
40
  ```
23
- To perform a merge, replacing merge fields with supplied values:
24
41
 
25
- ```ruby
26
- Consolidate::Docx::Merge.open "/path/to/docx" do |merge|
27
- merge.data "Name" => "Alice Aadvark", "Company" => "TinyCo", "Job_Title" => "CEO"
28
- merge.write_to "/path/to/output.docx"
29
- end
42
+ Using the command line:
43
+
44
+ ```sh
45
+ examine /path/to/file.docx
46
+
47
+ consolidate /path/to/file.docx /path/to/merge-file.docx first_name=Alice "product=Palm Pilot" "date=23rd January 2022" "user_name=Bob"
48
+ ```
49
+
50
+ If you want to see what the routine is doing you can add the `verbose` option.
51
+
52
+ ```sh
53
+ examine /path/to/file.docx verbose
54
+
55
+ consolidate /path/to/file.docx /path/to/merge-file.docx first_name=Alice "product=Palm Pilot" "date=23rd January 2022" "user_name=Bob" verbose
30
56
  ```
31
57
 
32
- NOTE: The merge fields are case-sensitive - which is why they should be supplied as strings (using the older `{ "key" => "value" }` style ruby hash).
58
+ ### History
33
59
 
34
- If your merge isn't working, you can pass `verbose: true` to `open` and it will list the internal .xml documents it finds, the fields it finds within those .xml documents and the substitutions it is trying to perform.
60
+ Originally, this gem was intended to open a Word .docx file, find the mailmerge fields within it and then substitute new values.
35
61
 
36
- ## How it works
62
+ I managed to get a basic version of this working with a variety of different Word files and all seemed good. Until my client reported that when they went to print the document, the mailmerge fields reappeared! My best guess is that Word is thinking that print-time is a trigger for merging in a data source (for example, printing out a form letter to 200 customers), so all the substitution work that this gem does is then discarded and Word asks for the merge data again. The frustrating thing is I can't figure out how Word keeps the references to the merge fields after they've been substituted.
63
+
64
+ So instead, this does a simple search and replace - looking for fields within squiggly brackets and substituting them.
65
+
66
+ ### How it works
37
67
 
38
68
  This is a bit sketchy and pieced together from the [code I found](https://gist.github.com/ericmason/7200448) and various bits of skimming through the published file format.
39
69
 
40
- A .docx file (unlike the previous .doc file), is actually a .zip file that contains a number of .xml files. The contents of the document are stored in these .xml files, along with configuration information and stuff like fonts and styles.
70
+ A .docx file is actually a .zip file that contains a number of .xml documents. Some of these are for storing formatting information (fonts, styles and various metadata) and some are for storing the actual document contents.
71
+
72
+ Consolidate looks for word/document.xml files, plus any files that match word/header*.xml, word/footer*.xml, word/footnotes*.xml and word/endnotes*.xml. It parses the XML, looking for text nodes that contain squiggly brackets. If it finds them, it then checks to see if we have supplied a data value for the matching field-name and replaces the contents of the node.
41
73
 
42
- When setting up a merge field, Word adds some special tags into the .xml file. There appear to be two different versions of how it does this - using `w:fldSimple` or `w:instrText` tags. Consolidate goes through each .xml file within the document (ignoring some which are configuration related) and looks for these two types of tag.
74
+ ## Installation
75
+
76
+ $ bundle add standard-procedure-consolidate
43
77
 
44
- The `data` method uses the hash of `field: value` data you supply, copies the .xml files and performs a straight text substitution on any matching merge fields. Then `write_to`
45
78
 
46
79
  ## Development
47
80
 
48
- The repo contains a .devcontainer folder - this contains instructions for a development container that has everything needed to build the project. Once the container has started, you can use `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
81
+ The repo contains a .devcontainer folder - this contains instructions for a development container that has everything needed to build the project. Once the container has started, you can use `bin/setup` to install dependencies. Then, run `bundle exec rake` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
49
82
 
50
83
  `bundle exec rake install` will install the gem on your local machine (obviously not from within the devcontainer though). To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and the created tag, and push the `.gem` file to [rubygems.org](https://rubygems.org).
51
84
 
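The placeholder substitution described in the README above can be illustrated on a plain string. This is a minimal sketch, not the gem's code: `fill_placeholders` and `PLACEHOLDER` are hypothetical names, and it works on raw text rather than on the XML inside a .docx file. It also assumes that fields with no supplied value should be left untouched.

```ruby
# Hypothetical helper, not part of the gem: replaces {{ field }} placeholders in a plain string.
PLACEHOLDER = /\{\{\s*(\S+?)\s*\}\}/

def fill_placeholders(text, mapping)
  # Accept symbol or string keys, mirroring the transform_keys(&:to_s) call in the gem
  values = mapping.transform_keys(&:to_s)
  text.gsub(PLACEHOLDER) do |placeholder|
    name = Regexp.last_match(1)
    # Assumption: unknown fields are left as-is so they are easy to spot in the output
    values.key?(name) ? values[name].to_s : placeholder
  end
end

letter = "Dear {{ first_name }}, thank you for your purchase of {{ product }}."
puts fill_placeholders(letter, first_name: "Alice", product: "Palm Pilot")
```

In a real .docx file the placeholder text may be split across several XML nodes by Word's formatting tags, which is the case the 0.3.0 changelog entry addresses; this sketch does not attempt that.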
data/Rakefile CHANGED
@@ -7,4 +7,4 @@ RSpec::Core::RakeTask.new(:spec)
7
7
 
8
8
  require "standard/rake"
9
9
 
10
- task default: %i[spec standard]
10
+ task default: %i[standard:fix spec]
data/checksums/standard-procedure-consolidate-0.2.0.gem.sha512 ADDED
@@ -0,0 +1 @@
1
+ 872129b729c3531b72fbecc9bea75262ebc771d8e0a9f4d41b30ad87f73ee664aaccf8d602a410c3b500f067cd636f88c74eecb887cb68a345b6ea9d575c2140
data/checksums/standard-procedure-consolidate-0.3.0.gem.sha512 ADDED
@@ -0,0 +1 @@
1
+ 098078b13e3bb4e3370cbc6bdb0195300489dfb33a5c27a201a4c5f1f0184bba73557264e674f49da41dbc62ad991e8ae99bef1debf417a5b1a363b1d56829f1
data/exe/consolidate CHANGED
@@ -1,3 +1,34 @@
1
1
  #!/usr/bin/env ruby
2
2
 
3
3
  require "consolidate"
4
+
5
+ if ARGV[0].nil? || ARGV[0] == ""
6
+ puts "# Standard::Procedure::Consolidate"
7
+ puts "## Mailmerge for simple Microsoft Word .docx files."
8
+ puts ""
9
+ puts "Create a new file with the mailmerge fields replaced by the data you have supplied"
10
+ puts "USAGE: consolidate path/to/myfile.docx path/to/mynewfile.docx \"field1=value1\" \"field2=value2\" \"field3=value3\""
11
+ puts ""
12
+ else
13
+ input = ARGV[0]
14
+ output = ARGV[1]
15
+ data = {}
16
+ verbose = false
17
+ 2.upto ARGV.length do |index|
18
+ arg = ARGV[index]
19
+ next if arg.nil?
20
+ if arg.strip == "verbose"
21
+ verbose = true
22
+ else
23
+ pieces = arg.split("=")
24
+ key = pieces.first.strip
25
+ value = pieces.last.strip
26
+ data[key] = value
27
+ end
28
+ end
29
+
30
+ Consolidate::Docx::Merge.open input, verbose: verbose do |doc|
31
+ doc.data data
32
+ doc.write_to output
33
+ end and nil
34
+ end
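The argument loop in `exe/consolidate` above splits each `field=value` argument on every `=` and keeps `pieces.last`, so a value that itself contains `=` would lose everything before its final `=`. One possible alternative is sketched below; `parse_field_args` is a hypothetical helper that is not in the gem, and it relies on the `limit` argument of `String#split` to split on the first `=` only.

```ruby
# Hypothetical helper, not part of the gem: turns CLI arguments into a data hash and a verbose flag.
def parse_field_args(args)
  data = {}
  verbose = false
  args.each do |arg|
    next if arg.nil? || arg.strip.empty?
    if arg.strip == "verbose"
      verbose = true
    else
      # A limit of 2 splits on the first "=" only, so later "=" characters stay in the value
      key, value = arg.split("=", 2)
      data[key.strip] = value.to_s.strip
    end
  end
  [data, verbose]
end

data, verbose = parse_field_args(["first_name=Alice", "formula=a=b", "verbose"])
p data
p verbose
```

An argument with no `=` at all ends up with an empty string as its value here; whether that or an error is the better behaviour is a design choice left open.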
data/exe/examine ADDED
@@ -0,0 +1,18 @@
1
+ #!/usr/bin/env ruby
2
+ # frozen_string_literal: true
3
+
4
+ require "consolidate"
5
+
6
+ if ARGV[0].nil? || ARGV[0] == ""
7
+ puts "# Standard::Procedure::Consolidate"
8
+ puts "## Mailmerge for simple Microsoft Word .docx files."
9
+ puts ""
10
+ puts "Examine the mailmerge fields inside a .docx file"
11
+ puts "USAGE: examine path/to/myfile.docx"
12
+ puts "Option: examine path/to/myfile.docx verbose"
13
+ puts ""
14
+ else
15
+ Consolidate::Docx::Merge.open ARGV[0], verbose: (ARGV[1] == "verbose") do |doc|
16
+ doc.examine
17
+ end and nil
18
+ end
data/lib/consolidate/docx/merge.rb CHANGED
@@ -11,24 +11,55 @@ module Consolidate
11
11
  path
12
12
  end
13
13
 
14
+ def initialize(path, verbose: false, &block)
15
+ @verbose = verbose
16
+ @output = {}
17
+ @zip = Zip::File.open(path)
18
+ @documents = load_documents
19
+ block&.call self
20
+ end
21
+
22
+ # Helper method to display the contents of the document and the merge fields from the CLI
14
23
  def examine
15
- puts "Documents: #{extract_document_names}"
16
- puts "Merge fields: #{extract_field_names}"
24
+ documents = document_names.join(", ")
25
+ fields = field_names.join(", ")
26
+ puts "Documents: #{documents}"
27
+ puts "Merge fields: #{fields}"
17
28
  end
18
29
 
19
- def data fields = {}
20
- fields = fields.transform_keys(&:to_s)
30
+ # Read all documents within the docx and extract any merge fields
31
+ def field_names
32
+ tag_nodes.collect do |tag_node|
33
+ field_name_from tag_node
34
+ end.compact.uniq
35
+ end
36
+
37
+ # List the documents stored within this docx
38
+ def document_names
39
+ @zip.entries.collect { |entry| entry.name }
40
+ end
41
+
42
+ # Substitute the data from the merge fields with the values provided
43
+ def data mapping = {}
44
+ mapping = mapping.transform_keys(&:to_s)
45
+
46
+ if verbose
47
+ puts "...substitutions..."
48
+ mapping.each do |key, value|
49
+ puts " #{key} => #{value}"
50
+ end
51
+ end
21
52
 
22
53
  @documents.each do |name, document|
23
- result = document.dup
24
- result = substitute_style_one_with result, fields
25
- result = substitute_style_two_with result, fields
54
+ output_document = substitute document.dup, mapping: mapping, document_name: name
26
55
 
27
- @output[name] = result.serialize save_with: 0
56
+ @output[name] = output_document.serialize save_with: 0
28
57
  end
29
58
  end
30
59
 
60
+ # Write the new document to the given path
31
61
  def write_to path
62
+ puts "...writing to #{path}" if verbose
32
63
  Zip::File.open(path, Zip::File::CREATE) do |out|
33
64
  zip.each do |entry|
34
65
  out.get_output_stream(entry.name) do |o|
@@ -38,7 +69,7 @@ module Consolidate
38
69
  end
39
70
  end
40
71
 
41
- protected
72
+ private
42
73
 
43
74
  attr_reader :verbose
44
75
  attr_reader :zip
@@ -46,82 +77,62 @@ module Consolidate
46
77
  attr_reader :documents
47
78
  attr_accessor :output
48
79
 
49
- def initialize(path, verbose: false, &block)
50
- raise "No block given" unless block
51
- @verbose = verbose
52
- @output = {}
53
- @documents = {}
54
- begin
55
- @zip = Zip::File.open(path)
56
- @zip.entries.each do |entry|
57
- next unless entry.name =~ /word\/(document|header|footer|footnotes|endnotes).?\.xml/
58
- puts "Reading #{entry.name}" if verbose
59
- xml = @zip.get_input_stream entry
60
- @documents[entry.name] = Nokogiri::XML(xml) { |x| x.noent }
61
- end
62
- yield self
63
- ensure
64
- @zip.close
80
+ def load_documents
81
+ @zip.entries.each_with_object({}) do |entry, documents|
82
+ next unless entry.name.match?(/word\/(document|header|footer|footnotes|endnotes).?\.xml/)
83
+ puts "...reading #{entry.name}" if verbose
84
+ xml = @zip.get_input_stream entry
85
+ documents[entry.name] = Nokogiri::XML(xml) { |x| x.noent }
65
86
  end
87
+ ensure
88
+ @zip.close
66
89
  end
67
90
 
68
- def extract_document_names
69
- @zip.entries.collect { |entry| entry.name }.join(", ")
70
- end
71
-
72
- def extract_field_names
73
- (extract_style_one + extract_style_two).uniq.join(", ")
74
- end
75
-
76
- def extract_style_one
91
+ # Collect all the nodes that contain merge fields
92
+ def tag_nodes
77
93
  documents.collect do |name, document|
78
- (document / "//w:fldSimple").collect do |field|
79
- value = field.attributes["instr"].value.strip
80
- puts "...found #{value} (v1) in #{name}" if verbose
81
- value.include?("MERGEFIELD") ? value.gsub("MERGEFIELD", "").strip : nil
82
- end.compact
94
+ tag_nodes_for document
83
95
  end.flatten
84
96
  end
85
97
 
86
- def extract_style_two
87
- documents.collect do |name, document|
88
- (document / "//w:instrText").collect do |instr|
89
- value = instr.inner_text
90
- puts "...found #{value} (v2) in #{name}" if verbose
91
- value.include?("MERGEFIELD") ? value.gsub("MERGEFIELD", "").strip : nil
92
- end.compact
93
- end.flatten
98
+ # go through all w:t (Word Text???) nodes of the document
99
+ # find any nodes that contain "{{"
100
+ # then find the ancestor node that also includes the ending "}}"
101
+ # This collection of nodes contains all the merge fields for this document
102
+ def tag_nodes_for document
103
+ (document / "//w:t").collect do |node|
104
+ (node.children.any? { |child| child.content.include? "{{" }) ? enclosing_node_for_start_tag(node) : nil
105
+ end.compact
94
106
  end
95
107
 
96
- def substitute_style_one_with document, fields
97
- # Word's first way of doing things
98
- (document / "//w:fldSimple").each do |field|
99
- if field.attributes["instr"].value =~ /MERGEFIELD (\S+)/
100
- text_node = (field / ".//w:t").first
101
- next unless text_node
102
- puts "...substituting v1 #{field.attributes["instr"]} with #{fields[$1]}" if verbose
103
- text_node.inner_html = fields[$1].to_s
104
- end
105
- end
106
- document
108
+ # Extract the merge field name from the node
109
+ def field_name_from(tag_node)
110
+ return nil unless (matches = tag_node.content.match(/{{\s*(\S+)\s*}}/))
111
+ field_name = matches[1].strip
112
+ puts "...field #{field_name} found" if verbose
113
+ field_name.to_s
107
114
  end
108
115
 
109
- def substitute_style_two_with document, fields
110
- # Word's second way of doing things
111
- (document / "//w:instrText").each do |instr|
112
- if instr.inner_text =~ /MERGEFIELD (\S+)/
113
- text_node = instr.parent.next_sibling.next_sibling.xpath(".//w:t").first
114
- text_node ||= instr.parent.next_sibling.next_sibling.next_sibling.xpath(".//w:t").first
115
- next unless text_node
116
- puts "...substituting v2 #{instr.inner_text} with #{fields[$1]}" if verbose
117
- text_node.inner_html = fields[$1].to_s
118
- end
116
+ # Go through the given document, replacing any merge fields with the values provided
117
+ # and storing the results in a new document
118
+ def substitute document, document_name:, mapping: {}
119
+ tag_nodes_for(document).each do |tag_node|
120
+ field_name = field_name_from tag_node
121
+ next unless mapping.has_key? field_name
122
+ field_value = mapping[field_name]
123
+ puts "...substituting #{field_name} with #{field_value} in #{document_name}" if verbose
124
+ tag_node.content = tag_node.content.gsub(field_name, field_value).gsub(/{{\s*/, "").gsub(/\s*}}/, "")
125
+ rescue => ex
126
+ # Have to mangle the exception message otherwise it outputs the entire document
127
+ puts ex.message.to_s[0..255]
119
128
  end
120
129
  document
121
130
  end
122
131
 
123
- def close
124
- zip.close
132
+ # Find the ancestor node that contains both the start {{ text and the end }} text enclosing the merge field
133
+ def enclosing_node_for_start_tag(node)
134
+ return node if node.content.include? "}}"
135
+ node.parent.nil? ? nil : enclosing_node_for_start_tag(node.parent)
125
136
  end
126
137
  end
127
138
  end
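To see which zip entries the pattern in `load_documents` picks up, it can be applied to a list of names in isolation. This is a small sketch: the regular expression is copied from the diff above, while the entry names are made up for illustration and `DOCUMENT_PATTERN` is a hypothetical constant name.

```ruby
# Pattern copied from load_documents; the constant name is hypothetical.
DOCUMENT_PATTERN = /word\/(document|header|footer|footnotes|endnotes).?\.xml/

# Illustrative entry names, assumed to resemble the contents of a typical .docx archive
entry_names = [
  "[Content_Types].xml",
  "word/document.xml",
  "word/header1.xml",
  "word/footer2.xml",
  "word/footnotes.xml",
  "word/styles.xml",
  "word/header10.xml"
]

selected = entry_names.select { |name| name.match?(DOCUMENT_PATTERN) }
puts selected
```

Because `.?` allows at most one character between the part name and `.xml`, an entry such as `word/header10.xml` is not selected by this pattern. Whether Word ever produces such names in practice is not something the diff confirms.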
data/lib/consolidate/version.rb CHANGED
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  module Consolidate
4
- VERSION = "0.1.4"
4
+ VERSION = "0.3.0"
5
5
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: standard-procedure-consolidate
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.4
4
+ version: 0.3.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Rahoul Baruah
8
- autorequire:
8
+ autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2023-09-11 00:00:00.000000000 Z
11
+ date: 2024-11-21 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: rubyzip
@@ -43,11 +43,11 @@ email:
43
43
  - rahoulb@standardprocedure.app
44
44
  executables:
45
45
  - consolidate
46
+ - examine
46
47
  extensions: []
47
48
  extra_rdoc_files: []
48
49
  files:
49
50
  - ".devcontainer/devcontainer.json"
50
- - ".nova/Configuration.json"
51
51
  - ".rspec"
52
52
  - ".ruby-version"
53
53
  - ".standard.yml"
@@ -60,7 +60,10 @@ files:
60
60
  - checksums/standard-procedure-consolidate-0.1.2.gem.sha512
61
61
  - checksums/standard-procedure-consolidate-0.1.3.gem.sha512
62
62
  - checksums/standard-procedure-consolidate-0.1.4.gem.sha512
63
+ - checksums/standard-procedure-consolidate-0.2.0.gem.sha512
64
+ - checksums/standard-procedure-consolidate-0.3.0.gem.sha512
63
65
  - exe/consolidate
66
+ - exe/examine
64
67
  - lib/consolidate.rb
65
68
  - lib/consolidate/docx/merge.rb
66
69
  - lib/consolidate/version.rb
@@ -74,7 +77,7 @@ metadata:
74
77
  homepage_uri: https://github.com/standard-procedure/standard-procedure-consolidate
75
78
  source_code_uri: https://github.com/standard-procedure/standard-procedure-consolidate
76
79
  changelog_uri: https://github.com/standard-procedure/standard-procedure-consolidate/blob/main/CHANGELOG.md
77
- post_install_message:
80
+ post_install_message:
78
81
  rdoc_options: []
79
82
  require_paths:
80
83
  - lib
@@ -89,8 +92,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
89
92
  - !ruby/object:Gem::Version
90
93
  version: '0'
91
94
  requirements: []
92
- rubygems_version: 3.4.10
93
- signing_key:
95
+ rubygems_version: 3.4.19
96
+ signing_key:
94
97
  specification_version: 4
95
98
  summary: Simple ruby mailmerge for Microsoft Word .docx files.
96
99
  test_files: []
data/.nova/Configuration.json DELETED
@@ -1,3 +0,0 @@
1
- {
2
-
3
- }