combine_pdf 0.0.2 → 0.0.3

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 797001336a5b4f1598ae399bd69161f9f2b4b4ef
4
- data.tar.gz: a684409037ef5205aff23512a8fe9a04ec53d828
3
+ metadata.gz: 47a88cf13f6bb93a5a75cbe4c227c204e16e3ec3
4
+ data.tar.gz: 3571716409c5586fef95c067211d4b890baa017a
5
5
  SHA512:
6
- metadata.gz: f74a67926f556606587211b4885e736cd9a8690c85aa62b56eb9deda4497a928ebb55a37ae4156f4ff748903c23b77e1aedb3f686a9c0a002201c6c6abe64afc
7
- data.tar.gz: 3e73fd966be0fa30d5626d2216a5bc92bb133c26abd7648374991fadc8438f7cf13d6e549a264af6a8533ad110f76d9f244aa8443282440cd950d214e098d40d
6
+ metadata.gz: 89cce7a4ebc0dee9d51563568ea64918d8047b1d37fd7084f9dea424e6e4a06c5f5ad731fc868051d6f9c264d51ab6d7a93e8fbeb402a43542a3f4f109269273
7
+ data.tar.gz: 80a54f7d2c58399115b4be46a5b039fe1d0ece65c714ec878d1c22a0bf55a4d273506f8893448fdf2c1284a2646b9c1a43590e6da889e6ea25401ecd323e9161
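The checksums.yaml hunk above records SHA1 and SHA512 digests for `metadata.gz` and `data.tar.gz`. As a minimal sketch of how such digests might be recomputed and compared with Ruby's standard `Digest` library (the helper name `digest_matches?` and the sample data are hypothetical and not part of the gem):

```ruby
require 'digest/sha1'
require 'digest/sha2'

# Hypothetical helper: recompute a digest over the given bytes and compare it
# with a recorded hex string (case-insensitive on the recorded side).
def digest_matches?(data, expected_hex, algorithm = Digest::SHA512)
  algorithm.hexdigest(data) == expected_hex.downcase
end

# Stand-in bytes; a real check would read the archive with IO.binread.
sample = "example archive bytes"
recorded_sha1   = Digest::SHA1.hexdigest(sample)
recorded_sha512 = Digest::SHA512.hexdigest(sample)

puts digest_matches?(sample, recorded_sha1, Digest::SHA1)   # unchanged data, SHA1
puts digest_matches?(sample, recorded_sha512)               # unchanged data, SHA512
puts digest_matches?(sample + "x", recorded_sha512)         # altered data
```

Any change to the packaged bytes changes both digests, which is why every checksum line differs between the two releases.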
@@ -1,95 +1,94 @@
1
1
  # -*- encoding : utf-8 -*-
2
- ########################################################
3
- ## Thoughts from reading the ISO 32000-1:2008
4
- ## this file is part of the CombinePDF library and the code
5
- ## is subject to the same license (GPLv3).
6
- ##
7
- ##
8
- ## === Merge PDFs!
9
- ## This is a pure ruby library to merge PDF files.
10
- ## In the future, this library will also allow stamping and watermarking PDFs (it allows this now, only with some issues).
11
- ##
12
- ## I started the project as a model within a RoR (Ruby on Rails) application, and as it grew I moved it to a local gem.
13
- ## I fell in love with the project, even if it is still young and in the raw.
14
- ## It is very simple to parse pdfs - from files:
15
- ## >> pdf = CombinePDF.new "file_name.pdf"
16
- ## or from data:
17
- ## >> pdf = CombinePDF.parse "%PDF-1.4 .... [data]"
18
- ## It's also easy to start an empty pdf:
19
- ## >> pdf = CombinePDF.new
20
- ## Merging is a breeze:
21
- ## >> pdf << CombinePDF.new("another_file_name.pdf")
22
- ## and saving the final PDF is a one-liner:
23
- ## >> pdf.save "output_file_name.pdf"
24
- ## Also, as a side effect, we can get all sorts of info about our pdf... such as the page count:
25
- ## >> pdf.version # will tell you the PDF version (if discovered). you can also reset this manually.
26
- ## >> pdf.pages.length # will tell you how many pages are actually displayed
27
- ## >> pdf.all_pages.length # will tell you how many page objects actually exist (can be more or fewer than the pages displayed)
28
- ## >> pdf.info # a hash with the Info dictionary from the PDF file (if discovered).
29
- ## === Stamp PDF files
30
- ## <b>has issues with specific PDF files - please see the issues</b>: https://github.com/boazsegev/combine_pdf/issues/2
31
- ## You can use PDF files as stamps.
32
- ## For instance, let's say you have this wonderful PDF (maybe one you created with prawn), and you want to stamp the company header and footer on every page.
33
- ## So you created your Prawn PDF file (Amazing library and hard work there, I totally recommend to have a look @ https://github.com/prawnpdf/prawn ):
34
- ## >> prawn_pdf = Prawn::Document.new
35
- ## >> ...(fill your new PDF with goodies)...
36
- ## Stamping every page is a breeze.
37
- ## We start by moving the PDF created by prawn into a CombinePDF object.
38
- ## >> pdf = CombinePDF.parse prawn_pdf.render
39
- ## Next we extract the stamp from our stamp pdf template:
40
- ## >> pdf_stamp = CombinePDF.new "stamp_file_name.pdf"
41
- ## >> stamp_page = pdf_stamp.pages[0]
42
- ## And off we stamp each page:
43
- ## >> pdf.pages.each {|page| pages << stamp_page}
44
- ## Of course, we can save the stamped output:
45
- ## >> pdf.save "output_file_name.pdf"
46
- ## === Decryption & Filters
47
- ## Some PDF files are encrypted and some are compressed (the use of filters)...
48
- ## There is very little support for encrypted files and very very basic and limited support for compressed files.
49
- ## I need help with that.
50
- ## === Comments and file structure
51
- ## If you want to help with the code, please be aware:
52
- ## I'm a self-taught hobbyist at heart. The documentation is lacking and the comments in the code are poor guidelines.
53
- ## The code itself should be very straightforward, but feel free to ask whatever you want.
54
- ## === Credit
55
- ## Caige Nichols wrote an amazing RC4 gem which I used in my code.
56
- ## I wanted to install the gem, but I had issues with the internet and ended up copying the code itself into the combine_pdf_decrypt class file.
57
- ## Credit for his wonderful work is given here. Please respect his license and copyright... and mine.
58
- ## === License
59
- ## GPLv3
60
- ########################################################
2
+
3
+ # this file is part of the CombinePDF library and the code
4
+ # is subject to the same license (GPLv3).
5
+ #########################################################
6
+
7
+
8
+
9
+ # PDF object types cross reference:
10
+ # Indirect objects, references, dictionaries and streams are Hash
11
+ # arrays are Array
12
+ # strings are String
13
+ # names are Symbols (String.to_sym)
14
+ # numbers are Fixnum or Float
15
+ # booleans are TrueClass or FalseClass
16
+
61
17
  require 'zlib'
62
18
  require 'strscan'
63
19
  require 'combine_pdf/combine_pdf_pdf'
64
20
  require 'combine_pdf/combine_pdf_decrypt'
65
21
  require 'combine_pdf/combine_pdf_filter'
66
22
  require 'combine_pdf/combine_pdf_parser'
23
+
24
+ # This is a pure ruby library to merge PDF files.
25
+ # In the future, this library will also allow stamping and watermarking PDFs (it allows this now, only with some issues).
26
+ #
27
+ # PDF objects can be used to combine or to inject data.
28
+ # == Combine / Merge
29
+ # To combine PDF files (or data):
30
+ # pdf = CombinePDF.new
31
+ # pdf << CombinePDF.new("file1.pdf") # one way to combine, very fast.
32
+ # CombinePDF.new("file2.pdf").pages.each {|page| pdf << page} # different way to combine, slower.
33
+ # pdf.save "combined.pdf"
34
+ # == Stamp / Watermark
35
+ # <b>has issues with specific PDF files - please see the issues</b>: https://github.com/boazsegev/combine_pdf/issues/2
36
+ # To stamp PDF pages, first create the stamp from a PDF file:
37
+ # stamp_pdf_file = CombinePDF.new "stamp_pdf_file.pdf"
38
+ # stamp_page = stamp_pdf_file.pages[0]
39
+ # Once the stamp is created, inject it into the PDF pages:
40
+ # pdf = CombinePDF.new "file1.pdf"
41
+ # pdf.pages.each {|page| page << stamp_page}
42
+ # Notice the << operator is on a page and not a PDF object. The << operator acts differently on PDF objects and on Pages.
43
+ #
44
+ # Notice that page objects are Hash class objects and the << operator was added to the Page instances without altering the class.
45
+ #
46
+ # == Decryption & Filters
47
+ #
48
+ # Some PDF files are encrypted and some are compressed (the use of filters)...
49
+ #
50
+ # There is very little support for encrypted files and very very basic and limited support for compressed files.
51
+ #
52
+ # I need help with that.
53
+ #
54
+ # == Comments and file structure
55
+ #
56
+ # If you want to help with the code, please be aware:
57
+ #
58
+ # I'm a self-taught hobbyist at heart. The documentation is lacking and the comments in the code are poor guidelines.
59
+ #
60
+ # The code itself should be very straightforward, but feel free to ask whatever you want.
61
+ #
62
+ # == Credit
63
+ #
64
+ # Caige Nichols wrote an amazing RC4 gem which I used in my code.
65
+ #
66
+ # I wanted to install the gem, but I had issues with the internet and ended up copying the code itself into the combine_pdf_decrypt class file.
67
+ #
68
+ # Credit for his wonderful work is given here. Please respect his license and copyright... and mine.
69
+ #
70
+ # == License
71
+ #
72
+ # GPLv3
67
73
  module CombinePDF
68
74
  module_function
69
- ################################################################
70
- ## These are the "gateway" functions for the model.
71
- ## These functions are open to the public.
72
- ################################################################
73
- # PDF object types cross reference:
74
- # Indirect objects, references, dictionaries and streams are Hash
75
- # arrays are Array
76
- # strings are String
77
- # names are Symbols (String.to_sym)
78
- # numbers are Fixnum or Float
79
- # boolean are TrueClass or FalseClass
80
75
 
76
+ # Create an empty PDF object or create a PDF object from a file (parsing the file).
77
+ # file_name:: is the name of a file to be parsed.
81
78
  def new(file_name = "")
82
79
  raise TypeError, "couldn't parse and data, expecting type String" unless file_name.is_a? String
83
80
  return PDF.new() if file_name == ''
84
81
  PDF.new( PDFParser.new( IO.read(file_name).force_encoding(Encoding::ASCII_8BIT) ) )
85
82
  end
83
+ # Create a PDF object from raw PDF data (parsing the data).
84
+ # data:: is a string that represents the content of a PDF file.
86
85
  def parse(data)
87
86
  raise TypeError, "couldn't parse and data, expecting type String" unless data.is_a? String
88
87
  PDF.new( PDFParser.new(data) )
89
88
  end
90
89
  end
91
90
 
92
- module CombinePDF
91
+ module CombinePDF #:nodoc: all
93
92
  ################################################################
94
93
  ## These are common functions, used within the different classes
95
94
  ## These functions aren't open to the public.
@@ -105,7 +104,7 @@ module CombinePDF
105
104
  41 => 41, #)
106
105
  92 => 92 #\
107
106
  }
108
- module PDFOperations
107
+ module PDFOperations #:nodoc: all
109
108
  module_function
110
109
  def inject_to_page page = {Type: :Page, MediaBox: [0,0,612.0,792.0], Resources: {}, Contents: []}, stream = nil, top = true
111
110
  # make sure both the page receiving the new data and the injected page are of the correct data type.
@@ -158,9 +157,12 @@ module CombinePDF
158
157
  page
159
158
  end
160
159
  # copy_and_secure_for_injection(page)
161
- # - page is a page in the pages array, i.e. pdf.pages[0]
160
+ # - page is a page in the pages array, i.e.
161
+ # pdf.pages[0]
162
162
  # takes a page object and:
163
+ #
163
164
  # makes a deep copy of the page (Ruby defaults to pointers, so this will copy the memory).
165
+ #
164
166
  # then it will rewrite the content stream with renamed resources, so as to avoid name conflicts.
165
167
  def copy_and_secure_for_injection(page)
166
168
  # copy page
@@ -335,6 +337,7 @@ module CombinePDF
335
337
 
336
338
 
337
339
 
340
+ # Formats an object into PDF format. This is used by the PDF object to format the PDF file, and it is used in the secure injection, which is still being developed.
338
341
  def _object_to_pdf object
339
342
  case
340
343
  when object.nil?
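The diff shows only the first lines of `_object_to_pdf`, so its body is not visible here. As a rough illustration of the idea, the type cross reference in the header comment (Hash for dictionaries, Array for arrays, String for strings, Symbol for names) can be turned into PDF syntax along the following lines. This is a simplified sketch under those assumptions, not the library's implementation; `format_pdf_object` is a hypothetical name, and streams and indirect references are left out.

```ruby
# Hypothetical, simplified formatter; not the library's _object_to_pdf.
def format_pdf_object(object)
  case object
  when nil then "null"
  when true, false then object.to_s
  when Symbol then "/#{object}"              # PDF name
  when Integer, Float then object.to_s       # very small floats would need fixed-point output
  when String
    # Literal string: escape backslashes and parentheses.
    "(" + object.gsub(/[\\()]/) { |c| "\\#{c}" } + ")"
  when Array
    "[" + object.map { |item| format_pdf_object(item) }.join(" ") + "]"
  when Hash
    pairs = object.map { |key, value| "#{format_pdf_object(key)} #{format_pdf_object(value)}" }
    "<<" + pairs.join(" ") + ">>"
  else
    raise TypeError, "cannot format #{object.class} as a PDF object"
  end
end

puts format_pdf_object({ Type: :Page, MediaBox: [0, 0, 612.0, 792.0] })
```

The recursion mirrors the nesting of PDF objects: dictionaries and arrays format their members with the same method.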
@@ -5,7 +5,7 @@
5
5
  ## is subject to the same license.
6
6
  ########################################################
7
7
 
8
- module CombinePDF
8
+ module CombinePDF #:nodoc: all
9
9
 
10
10
  class PDFWriter
11
11
 
@@ -5,7 +5,7 @@
5
5
  ## is subject to the same license.
6
6
  ########################################################
7
7
 
8
- module CombinePDF
8
+ module CombinePDF #:nodoc: all
9
9
  class PDFDecrypt
10
10
 
11
11
  def initialize objects=[], root_doctionary = {}
@@ -151,48 +151,47 @@ module CombinePDF
151
151
  ## copying it from the web page I had in my cache.
152
152
  ## This wonderful work was done by Caige Nichols.
153
153
  #####################################################
154
+ # class RC4
155
+ # def initialize(str)
156
+ # begin
157
+ # raise SyntaxError, "RC4: Key supplied is blank" if str.eql?('')
154
158
 
155
- class RC4
156
- def initialize(str)
157
- begin
158
- raise SyntaxError, "RC4: Key supplied is blank" if str.eql?('')
159
+ # @q1, @q2 = 0, 0
160
+ # @key = []
161
+ # str.each_byte { |elem| @key << elem } while @key.size < 256
162
+ # @key.slice!(256..@key.size-1) if @key.size >= 256
163
+ # @s = (0..255).to_a
164
+ # j = 0
165
+ # 0.upto(255) do |i|
166
+ # j = (j + @s[i] + @key[i] ) % 256
167
+ # @s[i], @s[j] = @s[j], @s[i]
168
+ # end
169
+ # end
170
+ # end
159
171
 
160
- @q1, @q2 = 0, 0
161
- @key = []
162
- str.each_byte { |elem| @key << elem } while @key.size < 256
163
- @key.slice!(256..@key.size-1) if @key.size >= 256
164
- @s = (0..255).to_a
165
- j = 0
166
- 0.upto(255) do |i|
167
- j = (j + @s[i] + @key[i] ) % 256
168
- @s[i], @s[j] = @s[j], @s[i]
169
- end
170
- end
171
- end
172
+ # def encrypt!(text)
173
+ # process text
174
+ # end
172
175
 
173
- def encrypt!(text)
174
- process text
175
- end
176
+ # def encrypt(text)
177
+ # process text.dup
178
+ # end
176
179
 
177
- def encrypt(text)
178
- process text.dup
179
- end
180
+ # alias_method :decrypt, :encrypt
180
181
 
181
- alias_method :decrypt, :encrypt
182
+ # private
182
183
 
183
- private
184
+ # def process(text)
185
+ # text.unpack("C*").map { |c| c ^ round }.pack("C*")
186
+ # end
184
187
 
185
- def process(text)
186
- text.unpack("C*").map { |c| c ^ round }.pack("C*")
187
- end
188
-
189
- def round
190
- @q1 = (@q1 + 1) % 256
191
- @q2 = (@q2 + @s[@q1]) % 256
192
- @s[@q1], @s[@q2] = @s[@q2], @s[@q1]
193
- @s[(@s[@q1]+@s[@q2]) % 256]
194
- end
195
- end
188
+ # def round
189
+ # @q1 = (@q1 + 1) % 256
190
+ # @q2 = (@q2 + @s[@q1]) % 256
191
+ # @s[@q1], @s[@q2] = @s[@q2], @s[@q1]
192
+ # @s[(@s[@q1]+@s[@q2]) % 256]
193
+ # end
194
+ # end
196
195
 
197
196
  end
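The hunk above comments out the RC4 class copied from Caige Nichols' gem. For readers unfamiliar with the cipher, the two stages visible in that code (key scheduling in `initialize`, keystream generation in `round`) can be sketched on their own as follows. The class name `SketchRC4` and its method names are hypothetical; this is an illustration of the standard RC4 algorithm, not a replacement for the gem's code.

```ruby
# Hypothetical standalone RC4 sketch for illustration only.
class SketchRC4
  def initialize(key)
    raise ArgumentError, "key must not be empty" if key.empty?
    key_bytes = key.bytes
    # Key scheduling: start from the identity permutation and shuffle it
    # using the key bytes, repeated cyclically over 256 steps.
    @s = (0..255).to_a
    j = 0
    256.times do |i|
      j = (j + @s[i] + key_bytes[i % key_bytes.size]) % 256
      @s[i], @s[j] = @s[j], @s[i]
    end
    @i = 0
    @j = 0
  end

  # XOR every input byte with the next keystream byte.
  # Encryption and decryption are the same operation.
  def process(text)
    text.bytes.map { |byte| byte ^ next_keystream_byte }.pack("C*")
  end

  private

  # Keystream generation: advance the two indices, swap, and emit one byte.
  def next_keystream_byte
    @i = (@i + 1) % 256
    @j = (@j + @s[@i]) % 256
    @s[@i], @s[@j] = @s[@j], @s[@i]
    @s[(@s[@i] + @s[@j]) % 256]
  end
end

cipher_text = SketchRC4.new("Key").process("Plaintext")
puts cipher_text.unpack("H*").first
round_trip = SketchRC4.new("Key").process(cipher_text)
puts round_trip
```

Because the cipher is a plain XOR with the keystream, a fresh instance built from the same key turns the ciphertext back into the plaintext, which is why the original code aliases `decrypt` to `encrypt`.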
198
197
 
@@ -5,7 +5,7 @@
5
5
  ## is subject to the same license.
6
6
  ########################################################
7
7
 
8
- module CombinePDF
8
+ module CombinePDF #:nodoc: all
9
9
 
10
10
  module PDFFilter
11
11
  module_function
@@ -4,7 +4,7 @@
4
4
  ## this file is part of the CombinePDF library and the code
5
5
  ## is subject to the same license.
6
6
  ########################################################
7
- module CombinePDF
7
+ module CombinePDF #:nodoc: all
8
8
 
9
9
  ########################################################
10
10
  ## This is the Parser class.
@@ -5,11 +5,26 @@
5
5
  ## is subject to the same license.
6
6
  ########################################################
7
7
  module CombinePDF
8
- ########################################################
9
- ## PDF class is the PDF object that can save itself to
10
- ## a file and that can be used as a container for a full
11
- ## PDF file data, including version etc'.
12
- ########################################################
8
+ #######################################################
9
+ # PDF class is the PDF object that can save itself to
10
+ # a file and that can be used as a container for a full
11
+ # PDF file data, including version etc'.
12
+ #
13
+ # PDF objects can be used to combine or to inject data.
14
+ # == Combine
15
+ # To combine PDF files (or data):
16
+ # pdf = CombinePDF.new
17
+ # pdf << CombinePDF.new("file1.pdf") # one way to combine, very fast.
18
+ # CombinePDF.new("file2.pdf").pages.each {|page| pdf << page} # different way to combine, slower.
19
+ # pdf.save "combined.pdf"
20
+ # == Stamp / Watermark
21
+ # To stamp PDF pages, first create the stamp from a PDF file:
22
+ # stamp_pdf_file = CombinePDF.new "stamp_pdf_file.pdf"
23
+ # stamp_page = stamp_pdf_file.pages[0]
24
+ # Once the stamp is created, inject it into the PDF pages:
25
+ # pdf = CombinePDF.new "file1.pdf"
26
+ # pdf.pages.each {|page| page << stamp_page} # notice the << operator is on a page and not a PDF object.
27
+ #######################################################
13
28
  class PDF
14
29
  attr_reader :objects, :info
15
30
  attr_accessor :string_output
@@ -43,7 +58,9 @@ module CombinePDF
43
58
  end
44
59
 
45
60
  # Formats the data to PDF formats and returns a binary string that represents the PDF file content.
61
+ #
46
62
  # This method is used by the save(file_name) method to save the content to a file.
63
+ #
47
64
  # use this to export the PDF file without saving to disk (such as sending through HTTP, etc.).
48
65
  def to_pdf
49
66
  #reset version if not specified
@@ -90,15 +107,22 @@ module CombinePDF
90
107
  end
91
108
 
92
109
  # Save the PDF to a file.
93
- # save(file_name)
94
- # - file_name is a string or path object for the output.
95
- # Notice! if the file exists, it WILL be overwritten.
110
+ #
111
+ # file_name:: is a string or path object for the output.
112
+ #
113
+ # <b>Notice!</b> if the file exists, it <b>WILL</b> be overwritten.
96
114
  def save(file_name)
97
115
  IO.binwrite file_name, to_pdf
98
116
  end
99
- # this function returns all the pages cataloged in the catalog.
117
+ # this method returns all the pages cataloged in the catalog.
118
+ #
100
119
  # if no catalog is passed, it seeks the existing catalog(s) and searches
101
120
  # for any registered Page objects.
121
+ #
122
+ # This method also adds the << operator to each page instance, so that content can be
123
+ # injected to the pages, as described above.
124
+ #
125
+ # (page objects are Hash class objects. the << operator is added to the specific instances without changing the class)
102
126
  def pages(catalogs = nil)
103
127
  page_list = []
104
128
  if catalogs == nil
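The comment added in this hunk says that page objects are plain Hash instances and that `<<` is added to specific instances without changing the Hash class. The diff does not show how the library does this; one way to get that effect in Ruby is a singleton method, sketched below with stand-in data (the hashes and the string stamp are hypothetical, and real injection does considerably more than appending to `:Contents`).

```ruby
# Hypothetical stand-ins for parsed page dictionaries.
page       = { Type: :Page, Contents: [] }
other_hash = { Type: :Page, Contents: [] }

# Define << on this one instance only; the Hash class itself is untouched.
page.define_singleton_method(:<<) do |stamp|
  self[:Contents] << stamp
  self # return the page so calls can be chained
end

page << "stamp content stream"

puts page[:Contents].inspect
puts page.respond_to?(:<<)
puts other_hash.respond_to?(:<<)
```

Only the decorated instance responds to `<<`; other hashes are unaffected, which matches the note that the class is not altered.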
@@ -136,19 +160,13 @@ module CombinePDF
136
160
  page_list
137
161
  end
138
162
 
139
- # this function returns all the Page objects - regardless of order and even if not cataloged
140
- # could be used for finding "lost" pages... but actually rather useless.
141
- def all_pages
142
- #########
143
- ## Only return the page item, but make sure all references are connected so that
144
- ## referenced items can be reached through the connections.
145
- [].tap {|out| each_object {|obj| out << obj if obj.is_a?(Hash) && obj[:Type] == :Page } }
146
- end
147
-
148
163
  # this function adds pages or CombinePDF objects at the end of the file (merge)
149
164
  # for example:
165
+ #
150
166
  # pdf = CombinePDF.new "first_file.pdf"
167
+ #
151
168
  # pdf << CombinePDF.new("second_file.pdf")
169
+ #
152
170
  # pdf.save "both_files_merged.pdf"
153
171
  def << (obj)
154
172
  #########
@@ -181,7 +199,16 @@ module CombinePDF
181
199
  warn "Shouldn't add objects to the file if they are not top-level indirect PDF objects."
182
200
  end
183
201
  end
184
-
202
+ end
203
+ class PDF #:nodoc: all
204
+ # this function returns all the Page objects - regardless of order and even if not cataloged
205
+ # could be used for finding "lost" pages... but actually rather useless.
206
+ def all_pages
207
+ #########
208
+ ## Only return the page item, but make sure all references are connected so that
209
+ ## referenced items can be reached through the connections.
210
+ [].tap {|out| each_object {|obj| out << obj if obj.is_a?(Hash) && obj[:Type] == :Page } }
211
+ end
185
212
  def serialize_objects_and_references(object = nil)
186
213
  warn "connecting objects with their references (serialize_objects_and_references)."
187
214
 
@@ -322,7 +349,8 @@ module CombinePDF
322
349
  catalog_object
323
350
  end
324
351
  # this is an alternative to the rebuild_catalog method
325
- # this method might eventually be used by the to_pdf method, for streamlining the PDF output.
352
+ # this method is used by the to_pdf method, for streamlining the PDF output.
353
+ # there is no point in calling the method before preparing the output.
326
354
  def rebuild_catalog_and_objects
327
355
  catalog = rebuild_catalog
328
356
  @objects = []
@@ -332,6 +360,7 @@ module CombinePDF
332
360
  catalog
333
361
  end
334
362
 
363
+ # disabled, don't use. simply returns true.
335
364
  def rebuild_resources
336
365
 
337
366
  warn "Resources re-building disabled as it isn't worth the price in peformance as of yet."
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: combine_pdf
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.2
4
+ version: 0.0.3
5
5
  platform: ruby
6
6
  authors:
7
7
  - Boaz Segev