mitimes-htmltoword 0.1.0

@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: b2cb47f2c09c5cfba188e84d7e1dcb542fdd3919
4
+ data.tar.gz: d3b1e4627bd5d398f47a86cf527336eff42c780d
5
+ SHA512:
6
+ metadata.gz: c8709bfe39cce07dd9958e6ea0e0a69d24e7eca3aa818702f173ee72386267cbbd4357330c59a676fb14f7fe6be99a2da502c99823ed754b14a200c9ab7f956b
7
+ data.tar.gz: 6801104c290c4155dbc8397fe8ff4cb3f661e1268a7a85507f8c3c1777326cdaf4624815d5cab4ea1dfeee41eafbf0dd4bc722a92cac7c72e1fc6f358eb18f8a
@@ -0,0 +1,179 @@
1
+ # Ruby Html to word Gem
2
+
3
+ This gem creates MS Word docx documents from simple HTML documents. This makes it easy to generate dynamic reports and forms that your users can download as MS Word docx files.
4
+
5
+ Add this line to your application's Gemfile:
6
+
7
+ gem 'htmltoword'
8
+
9
+ And then execute:
10
+
11
+ $ bundle
12
+
13
+ Or install it yourself as:
14
+
15
+ $ gem install htmltoword
16
+
17
+
18
+ **Note:** Since version 0.4.0 the ```create``` method returns a string with the contents of the file. If you want to save the file, use ```create_and_save```. See the Usage section for details.
19
+
20
+ ## Usage
21
+
22
+ ### Standalone
23
+
24
+ The ```create``` function returns the contents of the generated file as a string, so you can handle it however suits you best.
25
+ To save the file at a specific location instead, call ```create_and_save``` with the target path.
26
+
27
+ Using the default word file as a template:
28
+ ```ruby
29
+ require 'htmltoword'
30
+
31
+ my_html = '<html><head></head><body><p>Hello</p></body></html>'
32
+ document = Htmltoword::Document.create(my_html)
33
+ file = Htmltoword::Document.create_and_save(my_html, file_path)
34
+ ```
35
+
36
+ Using your custom word file as a template, where you can set up your own styles for normal text, h1, h2, etc.:
37
+ ```ruby
38
+ require 'htmltoword'
39
+
40
+ # Configure the location of your custom templates
41
+ Htmltoword.config.custom_templates_path = 'some_path'
42
+
43
+ my_html = '<html><head></head><body><p>Hello</p></body></html>'
44
+ document = Htmltoword::Document.create(my_html, word_template_file_name)
45
+ file = Htmltoword::Document.create_and_save(my_html, file_path, word_template_file_name)
46
+ ```
47
+
48
+ The ```create``` function will return a string with the file, so you can do with it what you consider best.
49
+ The ```create_and_save``` function will create the file in the specified file_path.
50
+
51
+ ### With Rails
52
+ **For htmltoword version >= 0.2**
53
+ An action controller renderer has been defined, so there's no need to declare the mime-type; you can just respond to the .docx format. It then looks for views with the extension ```.docx.erb```, which provide the HTML that is rendered in the Word file.
54
+
55
+ ```ruby
56
+ # On your controller.
57
+ respond_to :docx
58
+
59
+ # filename and word_template are optional. By default it will name the file as your action and use the default template provided by the gem. The use of the .docx in the filename and word_template is optional.
60
+ def my_action
61
+ # ...
62
+ respond_with(@object, filename: 'my_file.docx', word_template: 'my_template.docx')
63
+ # Alternatively, if you don't want to create the .docx.erb template, pass the HTML directly (use one call or the other, not both):
64
+ respond_with(@object, content: '<html><body>some html</body></html>', filename: 'my_file.docx')
65
+ end
66
+
67
+ def my_action2
68
+ # ...
69
+ respond_to do |format|
70
+ format.docx do
71
+ render docx: 'my_view', filename: 'my_file.docx'
72
+ # Alternatively, if you don't want to create the .docx.erb template, pass the HTML directly (use one call or the other, not both):
73
+ render docx: 'my_file.docx', content: '<html><body>some html</body></html>'
74
+ end
75
+ end
76
+ end
77
+ ```
78
+
79
+ Example of my_view.docx.erb
80
+ ```
81
+ <h1> My custom template </h1>
82
+ <%= render partial: 'my_partial', collection: @objects, as: :item %>
83
+ ```
84
+ Example of _my_partial.docx.erb
85
+ ```
86
+ <h3><%= item.title %></h3>
87
+ <p> My html for item <%= item.id %> goes here </p>
88
+ ```
89
+
90
+ **For htmltoword version <= 0.1.8**
91
+ ```ruby
92
+ # Add mime-type in /config/initializers/mime_types.rb:
93
+ Mime::Type.register "application/vnd.openxmlformats-officedocument.wordprocessingml.document", :docx
94
+
95
+ # Add docx responder in your controller
96
+ def show
97
+ respond_to do |format|
98
+ format.docx do
99
+ file = Htmltoword::Document.create params[:docx_html_source], "file_name.docx"
100
+ send_file file.path, :disposition => "attachment"
101
+ end
102
+ end
103
+ end
104
+ ```
105
+
106
+ ```javascript
107
+ // OPTIONAL: Use a jquery click handler to store the markup in a hidden form field before the form is submitted.
108
+ // Using this strategy makes it easy to allow users to dynamically edit the document that will be turned
109
+ // into a docx file, for example by toggling sections of a document.
110
+ $('#download-as-docx').on('click', function () {
111
+ $('input[name="docx_html_source"]').val('<!DOCTYPE html>\n' + $('.delivery').html());
112
+ });
113
+ ```
114
+
115
+ ### Configure templates and xslt paths
116
+
117
+ From version 2.0 you can configure the location of default and custom templates and xslt files. By default, templates are defined under ```lib/htmltoword/templates``` and xslt files under ```lib/htmltoword/xslt```.
118
+
119
+ ```ruby
120
+ Htmltoword.configure do |config|
121
+ config.custom_templates_path = 'path_for_custom_templates'
122
+ # If you modify this path, there should be a 'default.docx' file in there
123
+ config.default_templates_path = 'path_for_default_template'
124
+ # If you modify this path, there should be a 'html_to_wordml.xslt' file in there
125
+ config.default_xslt_path = 'some_path'
126
+ # The use of additional custom xslt will come soon
127
+ config.custom_xslt_path = 'some_path'
128
+ end
129
+ ```
130
+
131
+ ## Features
132
+
133
+ All standard html elements are supported and will create the closest equivalent in wordml. For example, spans create inline elements and divs create block-like elements.
134
+
135
+ ### Highlighting text
136
+
137
+ You can highlight text by wrapping it in a span with class ```h``` and adding a ```data-style``` attribute with a color that wordml supports (http://www.schemacentral.com/sc/ooxml/t-w_ST_HighlightColor.html), e.g.:
138
+
139
+ ```html
140
+ <span class="h" data-style="green">This text will have a green highlight</span>
141
+ ```
142
+
143
+ ### Page breaks
144
+
145
+ To create page breaks, simply add a div with class ```-page-break```, e.g.:
146
+
147
+ ```html
148
+ <div class="-page-break"></div>
149
+ ```
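
The markup for the two features above can also be generated from Ruby before it is handed to the gem. The following is a minimal sketch under stated assumptions: the helpers `escape_html`, `highlight` and `page_break` are hypothetical names and are not part of htmltoword; only the ```h``` class, the ```data-style``` attribute and the ```-page-break``` class come from the README.

```ruby
# Hypothetical helpers (not part of htmltoword) that emit the markup
# described in the README for highlights and page breaks.
HTML_ESCAPES = { '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' }.freeze

# Escape the characters that would otherwise break the generated HTML.
def escape_html(text)
  text.to_s.gsub(/[&<>"]/, HTML_ESCAPES)
end

# Wrap text in a span that the README says is rendered as a highlight.
def highlight(text, color)
  %(<span class="h" data-style="#{escape_html(color)}">#{escape_html(text)}</span>)
end

# Emit the div that the README says is rendered as a page break.
def page_break
  '<div class="-page-break"></div>'
end

html = "<p>#{highlight('Reviewed & approved', 'green')}</p>#{page_break}<p>Next page</p>"
puts html
```

The resulting string could then be passed as the HTML argument of ```create``` or ```create_and_save```.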
150
+
151
+ ## Contributing / Extending
152
+
153
+ Word docx files are essentially just a zipped collection of xml files and resources.
154
+ This gem contains a standard empty MS Word docx file and a stylesheet to transform arbitrary html into wordml.
155
+ The basic functioning of this gem can be summarised as:
156
+
157
+ 1. Transform the input html to wordml.
158
+ 2. Unzip the empty word docx file bundled with the gem and replace its document.xml content with the transformed result of step 1.
159
+ 3. Zip up contents again into a resulting .docx file.
160
+
161
+ For more info about WordML: http://rep.oio.dk/microsoft.com/officeschemas/wordprocessingml_article.htm
162
+
163
+ Contributions would be very much appreciated.
164
+
165
+ 1. Fork it
166
+ 2. Create your feature branch (`git checkout -b my-new-feature`)
167
+ 3. Commit your changes (`git commit -am 'Add some feature'`)
168
+ 4. Push to the branch (`git push origin my-new-feature`)
169
+ 5. Create new Pull Request
170
+
171
+ ## License
172
+
173
+ (The MIT License)
174
+
175
+ Copyright © 2013:
176
+
177
+ * Cristina Matonte
178
+
179
+ * Nicholas Frandsen
@@ -0,0 +1,4 @@
1
+ require "bundler/gem_tasks"
2
+ require 'rspec/core/rake_task'
3
+ task :default => :spec
4
+ RSpec::Core::RakeTask.new
@@ -0,0 +1,36 @@
1
+ #!/usr/bin/env ruby
2
+ require 'methadone'
3
+ require 'rmultimarkdown'
4
+ require_relative '../lib/htmltoword'
5
+
6
+ include Methadone::Main
7
+ include Methadone::CLILogging
8
+
9
+ main do |input, output|
10
+ puts "Converting #{input} to #{output}" if options[:verbose]
11
+ markup = File.read input
12
+ if options[:format] == 'markdown'
13
+ markup = markdown2html(markup)
14
+ end
15
+ Htmltoword::Document.create_and_save(markup, output, options[:template_name], options[:extras])
16
+ puts "Done" if options[:verbose]
17
+ end
18
+
19
+ def markdown2html(text)
20
+ MultiMarkdown.new(text.to_s).to_html
21
+ end
22
+
23
+ version Htmltoword::VERSION
24
+ description 'Convert simple html input (or markdown) to MS Word (docx)'
25
+ arg :input, :required
26
+ arg :output, :required
27
+
28
+ on('--verbose', '-v', 'Be verbose')
29
+ on('--extras', '-e', 'Use extra formatting features')
30
+ on('-t TEMPLATE_NAME', '--template_name', 'Use custom word base template (.docx file)')
31
+ on('-f FORMAT', '--format', 'Format', /markdown|html/)
32
+
33
+ # options['ip-address'] = '127.0.0.1'
34
+ # on('-i IP_ADDRESS', '--ip-address', 'IP Address', /^\d+\.\d+\.\d+\.\d+$/)
35
+
36
+ go!
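
The difference between a boolean switch and an option that takes a value matters for the script above, because it reads `options[:template_name]` and `options[:format]`. The sketch below shows both kinds of declaration using only Ruby's standard `OptionParser`. It is not the script's Methadone-based implementation: the method name `parse_cli`, the `--template NAME` spelling and the `'html'` default for the format are assumptions made for illustration.

```ruby
require 'optparse'

# Parse a command line similar to the one the script declares,
# using only the standard library (illustrative, not the gem's CLI).
def parse_cli(argv)
  options = { verbose: false, extras: false, template_name: nil, format: 'html' }
  parser = OptionParser.new do |opts|
    # No placeholder: these are plain on/off switches.
    opts.on('-v', '--verbose', 'Be verbose') { options[:verbose] = true }
    opts.on('-e', '--extras', 'Use extra formatting features') { options[:extras] = true }
    # A placeholder after the switch makes the option take a value.
    opts.on('-t', '--template NAME', 'Custom template name') { |name| options[:template_name] = name }
    # The array restricts the accepted values.
    opts.on('-f', '--format FORMAT', %w[markdown html], 'Input format') { |f| options[:format] = f }
  end
  input, output = parser.parse(argv) # whatever is left are positional arguments
  options.merge(input: input, output: output)
end

p parse_cli(%w[-v -t report -f markdown notes.md notes.docx])
```

A switch declared without a placeholder only records that it was given, so a template name can never reach the converter through it.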
@@ -0,0 +1,28 @@
1
+ # encoding: UTF-8
2
+ require 'nokogiri'
3
+ require 'zip'
4
+ require_relative 'htmltoword/configuration'
5
+
6
+ module Htmltoword
7
+ class << self
8
+ def configure
9
+ yield configuration
10
+ end
11
+
12
+ def configuration
13
+ @configuration ||= Configuration.new
14
+ end
15
+
16
+ alias_method :config, :configuration
17
+ end
18
+ end
19
+
20
+ require_relative 'htmltoword/version'
21
+ require_relative 'htmltoword/helpers/templates_helper'
22
+ require_relative 'htmltoword/helpers/xslt_helper'
23
+ require_relative 'htmltoword/document'
24
+
25
+ if defined?(Rails)
26
+ require_relative 'htmltoword/renderer'
27
+ require_relative 'htmltoword/railtie'
28
+ end
@@ -0,0 +1,12 @@
1
+ module Htmltoword
2
+ class Configuration
3
+ attr_accessor :default_templates_path, :custom_templates_path, :default_xslt_path, :custom_xslt_path
4
+
5
+ def initialize
6
+ @default_templates_path = File.join(File.expand_path('../', __FILE__), 'templates')
7
+ @custom_templates_path = File.join(File.expand_path('../', __FILE__), 'templates')
8
+ @default_xslt_path = File.join(File.expand_path('../', __FILE__), 'xslt')
9
+ @custom_xslt_path = File.join(File.expand_path('../', __FILE__), 'xslt')
10
+ end
11
+ end
12
+ end
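
The `configure` / `configuration` pair and the `Configuration` class above form a common Ruby pattern: a lazily created settings object that is yielded to a block. The sketch below reproduces the pattern in a stand-alone module so that it runs without the gem. `DemoConverter` is a hypothetical name, and only one of the four path settings is shown.

```ruby
# Stand-in module (hypothetical name) showing the configure/yield pattern.
module DemoConverter
  class Configuration
    attr_accessor :custom_templates_path

    def initialize
      # Default next to this file; Dir.pwd is a fallback when __dir__ is nil.
      @custom_templates_path = File.join(__dir__ || Dir.pwd, 'templates')
    end
  end

  class << self
    # Yield the shared configuration object so callers can set values on it.
    def configure
      yield configuration
    end

    # Create the configuration on first use and reuse it afterwards.
    def configuration
      @configuration ||= Configuration.new
    end

    alias_method :config, :configuration
  end
end

DemoConverter.configure { |c| c.custom_templates_path = '/tmp/my_templates' }
puts DemoConverter.config.custom_templates_path
```

Because the object is memoized, values set in one `configure` block stay visible to every later call of `config`.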
@@ -0,0 +1,96 @@
1
+ module Htmltoword
2
+ class Document
3
+ include XSLTHelper
4
+
5
+ class << self
6
+ include TemplatesHelper
7
+ def create(content, template_name = nil, extras = false)
8
+ template_name += extension if template_name && !template_name.end_with?(extension)
9
+ document = new(template_file(template_name))
10
+ document.replace_files(content, extras)
11
+ document.generate
12
+ end
13
+
14
+ def create_and_save(content, file_path, template_name = nil, extras = false)
15
+ File.open(file_path, 'wb') do |out|
16
+ out << create(content, template_name, extras)
17
+ end
18
+ end
19
+
20
+ def create_with_content(template, content, extras = false)
21
+ template += extension unless template.end_with?(extension)
22
+ document = new(template_file(template))
23
+ document.replace_files(content, extras)
24
+ document.generate
25
+ end
26
+
27
+ def extension
28
+ '.docx'
29
+ end
30
+
31
+ def doc_xml_file
32
+ 'word/document.xml'
33
+ end
34
+
35
+ def numbering_xml_file
36
+ 'word/numbering.xml'
37
+ end
38
+
39
+ def relations_xml_file
40
+ 'word/_rels/document.xml.rels'
41
+ end
42
+ end
43
+
44
+ def initialize(template_path)
45
+ @replaceable_files = {}
46
+ @template_path = template_path
47
+ end
48
+
49
+ #
50
+ # Generate a string representing the contents of a docx file.
51
+ #
52
+ def generate
53
+ Zip::File.open(@template_path) do |template_zip|
54
+ buffer = Zip::OutputStream.write_buffer do |out|
55
+ template_zip.each do |entry|
56
+ out.put_next_entry entry.name
57
+ if @replaceable_files[entry.name] && entry.name == Document.doc_xml_file
58
+ source = entry.get_input_stream.read
59
+ # Change only the body of document. TODO: Improve this...
60
+ source = source.sub(/(<w:body>)((.|\n)*?)(<w:sectPr)/, "\\1#{@replaceable_files[entry.name]}\\4")
61
+ out.write(source)
62
+ elsif @replaceable_files[entry.name]
63
+ out.write(@replaceable_files[entry.name])
64
+ else
65
+ out.write(template_zip.read(entry.name))
66
+ end
67
+ end
68
+ end
69
+ buffer.string
70
+ end
71
+ end
72
+
73
+ def replace_files(html, extras = false)
74
+ html = '<body></body>' if html.nil? || html.empty?
75
+ source = Nokogiri::HTML(html.gsub(/>\s+</, '><'))
76
+ transform_and_replace(source, xslt_path('numbering'), Document.numbering_xml_file)
77
+ transform_and_replace(source, xslt_path('relations'), Document.relations_xml_file)
78
+ transform_doc_xml(source, extras)
79
+ end
80
+
81
+ def transform_doc_xml(source, extras = false)
82
+ transformed_source = xslt(stylesheet_name: 'cleanup').transform(source)
83
+ transformed_source = xslt(stylesheet_name: 'inline_elements').transform(transformed_source)
84
+ transform_and_replace(transformed_source, document_xslt(extras), Document.doc_xml_file, extras)
85
+ end
86
+
87
+ private
88
+
89
+ def transform_and_replace(source, stylesheet_path, file, remove_ns = false)
90
+ stylesheet = xslt(stylesheet_path: stylesheet_path)
91
+ content = stylesheet.apply_to(source)
92
+ content.gsub!(/\s*xmlns:(\w+)="(.*?)\s*"/, '') if remove_ns
93
+ @replaceable_files[file] = content
94
+ end
95
+ end
96
+ end
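
The core of `generate` above is the substitution that swaps the template's body for the transformed content while keeping its `<w:sectPr` section properties. The sketch below isolates that step on a plain string so it can be tried without a docx file. `replace_body` is a hypothetical helper, and it uses the block form of `String#sub` with the `/m` flag rather than the replacement-string form used in the gem.

```ruby
# Replace everything between <w:body> and the first <w:sectPr in a
# document.xml string, keeping the template's section properties.
def replace_body(document_xml, new_body)
  # With a block, the return value is inserted literally, so backslashes
  # in new_body are not interpreted as backreferences.
  document_xml.sub(/(<w:body>).*?(<w:sectPr)/m) { "#{$1}#{new_body}#{$2}" }
end

template = "<w:document><w:body><w:p>old</w:p>\n<w:sectPr/></w:body></w:document>"
puts replace_body(template, '<w:p><w:r><w:t>Hello</w:t></w:r></w:p>')
```

The block form is a design choice for the sketch: with a replacement string, a sequence such as `\1` inside the generated markup would be expanded instead of copied.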
@@ -0,0 +1,9 @@
1
+ module Htmltoword
2
+ module TemplatesHelper
3
+ def template_file(template_file_name = nil)
4
+ default_path = File.join(::Htmltoword.config.default_templates_path, 'default.docx')
5
+ template_path = template_file_name.nil? ? '' : File.join(::Htmltoword.config.custom_templates_path, template_file_name)
6
+ File.exist?(template_path) ? template_path : default_path
7
+ end
8
+ end
9
+ end
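
`template_file` above falls back to `default.docx` whenever no name is given or the named custom template does not exist. The sketch below mirrors that lookup with the directories passed in explicitly, so it runs against a temporary directory. `pick_template` and the file names are hypothetical.

```ruby
require 'tmpdir'
require 'fileutils'

# Return the custom template if it exists, otherwise the default one.
def pick_template(default_dir, custom_dir, name = nil)
  default_path = File.join(default_dir, 'default.docx')
  custom_path = name ? File.join(custom_dir, name) : ''
  File.exist?(custom_path) ? custom_path : default_path
end

Dir.mktmpdir do |dir|
  FileUtils.touch(File.join(dir, 'report.docx'))
  puts pick_template(dir, dir, 'report.docx')  # the existing custom template
  puts pick_template(dir, dir, 'missing.docx') # falls back to default.docx
end
```

One consequence of this design is that a misspelled template name does not raise an error; the default template is used silently.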
@@ -0,0 +1,17 @@
1
+ module Htmltoword
2
+ module XSLTHelper
3
+ def document_xslt(extras = false)
4
+ file_name = extras ? 'htmltoword' : 'base'
5
+ xslt_path(file_name)
6
+ end
7
+
8
+ def xslt_path(template_name)
9
+ File.join(Htmltoword.config.default_xslt_path, "#{template_name}.xslt")
10
+ end
11
+
12
+ def xslt(stylesheet_name: nil, stylesheet_path: nil)
13
+ return Nokogiri::XSLT(File.open(stylesheet_path)) if stylesheet_path
14
+ Nokogiri::XSLT(File.open(xslt_path(stylesheet_name)))
15
+ end
16
+ end
17
+ end
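
`document_xslt` and `xslt_path` above select a stylesheet by name and join it onto the configured directory. The sketch below combines the two steps in one function that needs neither Nokogiri nor the gem's configuration. `stylesheet_path` is a hypothetical name, and the directory is passed in rather than read from `Htmltoword.config`.

```ruby
# Pick the stylesheet name the way document_xslt does, then build its path.
def stylesheet_path(xslt_dir, extras: false)
  name = extras ? 'htmltoword' : 'base'
  File.join(xslt_dir, "#{name}.xslt")
end

puts stylesheet_path('lib/htmltoword/xslt')
puts stylesheet_path('lib/htmltoword/xslt', extras: true)
```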
@@ -0,0 +1,25 @@
1
+ module Htmltoword
2
+ class Railtie < ::Rails::Railtie
3
+ initializer 'htmltoword.setup' do
4
+ unless defined? Mime::DOCX
5
+ Mime::Type.register 'application/vnd.openxmlformats-officedocument.wordprocessingml.document', :docx
6
+ end
7
+
8
+ ActionController::Renderers.add :docx do |file_name, options|
9
+ Htmltoword::Renderer.send_file(self, file_name, options)
10
+ end
11
+
12
+ if defined? ActionController::Responder
13
+ ActionController::Responder.class_eval do
14
+ def to_docx
15
+ if @default_response
16
+ @default_response.call(options)
17
+ else
18
+ controller.render({ docx: controller.action_name }.merge(options))
19
+ end
20
+ end
21
+ end
22
+ end
23
+ end
24
+ end
25
+ end
@@ -0,0 +1,43 @@
1
+ module Htmltoword
2
+ class Renderer
3
+ class << self
4
+ def send_file(context, filename, options = {})
5
+ new(context, filename, options).send_file
6
+ end
7
+ end
8
+
9
+ def initialize(context, filename, options)
10
+ @word_template = options[:word_template].presence
11
+ @disposition = options.fetch(:disposition, 'attachment')
12
+ @use_extras = options.fetch(:extras, false)
13
+ @file_name = file_name(filename, options)
14
+ @context = context
15
+ define_template(filename, options)
16
+ @content = options[:content] || @context.render_to_string(options)
17
+ end
18
+
19
+ def send_file
20
+ document = Htmltoword::Document.create(@content, @word_template, @use_extras)
21
+ @context.send_data(document, filename: @file_name, type: Mime::DOCX, disposition: @disposition)
22
+ end
23
+
24
+ private
25
+
26
+ def define_template(filename, options)
27
+ if options[:template] == @context.action_name
28
+ if filename =~ %r{^([^\/]+)/(.+)$}
29
+ options[:prefixes] ||= []
30
+ options[:prefixes].unshift $1
31
+ options[:template] = $2
32
+ else
33
+ options[:template] = filename
34
+ end
35
+ end
36
+ end
37
+
38
+ def file_name(filename, options)
39
+ name = options[:filename].presence || filename
40
+ name =~ /\.docx$/ ? name : "#{name}.docx"
41
+ end
42
+ end
43
+ end
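
The private `file_name` method above decides the download name: it prefers `options[:filename]`, falls back to the rendered name, and appends `.docx` when it is missing. The sketch below restates that rule in plain Ruby so it runs without Rails. `docx_file_name` is a hypothetical name, the blank check stands in for ActiveSupport's `presence`, and `end_with?` is used in place of the regular expression.

```ruby
# Plain-Ruby version of the file-name rule: prefer options[:filename],
# fall back to the rendered name, and make sure the result ends in .docx.
def docx_file_name(rendered_name, options = {})
  given = options[:filename]
  # Treat nil, empty and whitespace-only values as "not given".
  name = given.to_s.strip.empty? ? rendered_name : given
  name.end_with?('.docx') ? name : "#{name}.docx"
end

puts docx_file_name('my_action')
puts docx_file_name('my_action', filename: 'report.docx')
```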