pdfbeads 1.0.0

data/COPYING ADDED
@@ -0,0 +1,339 @@
1
+ GNU GENERAL PUBLIC LICENSE
2
+ Version 2, June 1991
3
+
4
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.
5
+ 675 Mass Ave, Cambridge, MA 02139, USA
6
+ Everyone is permitted to copy and distribute verbatim copies
7
+ of this license document, but changing it is not allowed.
8
+
9
+ Preamble
10
+
11
+ The licenses for most software are designed to take away your
12
+ freedom to share and change it. By contrast, the GNU General Public
13
+ License is intended to guarantee your freedom to share and change free
14
+ software--to make sure the software is free for all its users. This
15
+ General Public License applies to most of the Free Software
16
+ Foundation's software and to any other program whose authors commit to
17
+ using it. (Some other Free Software Foundation software is covered by
18
+ the GNU Library General Public License instead.) You can apply it to
19
+ your programs, too.
20
+
21
+ When we speak of free software, we are referring to freedom, not
22
+ price. Our General Public Licenses are designed to make sure that you
23
+ have the freedom to distribute copies of free software (and charge for
24
+ this service if you wish), that you receive source code or can get it
25
+ if you want it, that you can change the software or use pieces of it
26
+ in new free programs; and that you know you can do these things.
27
+
28
+ To protect your rights, we need to make restrictions that forbid
29
+ anyone to deny you these rights or to ask you to surrender the rights.
30
+ These restrictions translate to certain responsibilities for you if you
31
+ distribute copies of the software, or if you modify it.
32
+
33
+ For example, if you distribute copies of such a program, whether
34
+ gratis or for a fee, you must give the recipients all the rights that
35
+ you have. You must make sure that they, too, receive or can get the
36
+ source code. And you must show them these terms so they know their
37
+ rights.
38
+
39
+ We protect your rights with two steps: (1) copyright the software, and
40
+ (2) offer you this license which gives you legal permission to copy,
41
+ distribute and/or modify the software.
42
+
43
+ Also, for each author's protection and ours, we want to make certain
44
+ that everyone understands that there is no warranty for this free
45
+ software. If the software is modified by someone else and passed on, we
46
+ want its recipients to know that what they have is not the original, so
47
+ that any problems introduced by others will not reflect on the original
48
+ authors' reputations.
49
+
50
+ Finally, any free program is threatened constantly by software
51
+ patents. We wish to avoid the danger that redistributors of a free
52
+ program will individually obtain patent licenses, in effect making the
53
+ program proprietary. To prevent this, we have made it clear that any
54
+ patent must be licensed for everyone's free use or not licensed at all.
55
+
56
+ The precise terms and conditions for copying, distribution and
57
+ modification follow.
58
+
59
+ GNU GENERAL PUBLIC LICENSE
60
+ TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
61
+
62
+ 0. This License applies to any program or other work which contains
63
+ a notice placed by the copyright holder saying it may be distributed
64
+ under the terms of this General Public License. The "Program", below,
65
+ refers to any such program or work, and a "work based on the Program"
66
+ means either the Program or any derivative work under copyright law:
67
+ that is to say, a work containing the Program or a portion of it,
68
+ either verbatim or with modifications and/or translated into another
69
+ language. (Hereinafter, translation is included without limitation in
70
+ the term "modification".) Each licensee is addressed as "you".
71
+
72
+ Activities other than copying, distribution and modification are not
73
+ covered by this License; they are outside its scope. The act of
74
+ running the Program is not restricted, and the output from the Program
75
+ is covered only if its contents constitute a work based on the
76
+ Program (independent of having been made by running the Program).
77
+ Whether that is true depends on what the Program does.
78
+
79
+ 1. You may copy and distribute verbatim copies of the Program's
80
+ source code as you receive it, in any medium, provided that you
81
+ conspicuously and appropriately publish on each copy an appropriate
82
+ copyright notice and disclaimer of warranty; keep intact all the
83
+ notices that refer to this License and to the absence of any warranty;
84
+ and give any other recipients of the Program a copy of this License
85
+ along with the Program.
86
+
87
+ You may charge a fee for the physical act of transferring a copy, and
88
+ you may at your option offer warranty protection in exchange for a fee.
89
+
90
+ 2. You may modify your copy or copies of the Program or any portion
91
+ of it, thus forming a work based on the Program, and copy and
92
+ distribute such modifications or work under the terms of Section 1
93
+ above, provided that you also meet all of these conditions:
94
+
95
+ a) You must cause the modified files to carry prominent notices
96
+ stating that you changed the files and the date of any change.
97
+
98
+ b) You must cause any work that you distribute or publish, that in
99
+ whole or in part contains or is derived from the Program or any
100
+ part thereof, to be licensed as a whole at no charge to all third
101
+ parties under the terms of this License.
102
+
103
+ c) If the modified program normally reads commands interactively
104
+ when run, you must cause it, when started running for such
105
+ interactive use in the most ordinary way, to print or display an
106
+ announcement including an appropriate copyright notice and a
107
+ notice that there is no warranty (or else, saying that you provide
108
+ a warranty) and that users may redistribute the program under
109
+ these conditions, and telling the user how to view a copy of this
110
+ License. (Exception: if the Program itself is interactive but
111
+ does not normally print such an announcement, your work based on
112
+ the Program is not required to print an announcement.)
113
+
114
+ These requirements apply to the modified work as a whole. If
115
+ identifiable sections of that work are not derived from the Program,
116
+ and can be reasonably considered independent and separate works in
117
+ themselves, then this License, and its terms, do not apply to those
118
+ sections when you distribute them as separate works. But when you
119
+ distribute the same sections as part of a whole which is a work based
120
+ on the Program, the distribution of the whole must be on the terms of
121
+ this License, whose permissions for other licensees extend to the
122
+ entire whole, and thus to each and every part regardless of who wrote it.
123
+
124
+ Thus, it is not the intent of this section to claim rights or contest
125
+ your rights to work written entirely by you; rather, the intent is to
126
+ exercise the right to control the distribution of derivative or
127
+ collective works based on the Program.
128
+
129
+ In addition, mere aggregation of another work not based on the Program
130
+ with the Program (or with a work based on the Program) on a volume of
131
+ a storage or distribution medium does not bring the other work under
132
+ the scope of this License.
133
+
134
+ 3. You may copy and distribute the Program (or a work based on it,
135
+ under Section 2) in object code or executable form under the terms of
136
+ Sections 1 and 2 above provided that you also do one of the following:
137
+
138
+ a) Accompany it with the complete corresponding machine-readable
139
+ source code, which must be distributed under the terms of Sections
140
+ 1 and 2 above on a medium customarily used for software interchange; or,
141
+
142
+ b) Accompany it with a written offer, valid for at least three
143
+ years, to give any third party, for a charge no more than your
144
+ cost of physically performing source distribution, a complete
145
+ machine-readable copy of the corresponding source code, to be
146
+ distributed under the terms of Sections 1 and 2 above on a medium
147
+ customarily used for software interchange; or,
148
+
149
+ c) Accompany it with the information you received as to the offer
150
+ to distribute corresponding source code. (This alternative is
151
+ allowed only for noncommercial distribution and only if you
152
+ received the program in object code or executable form with such
153
+ an offer, in accord with Subsection b above.)
154
+
155
+ The source code for a work means the preferred form of the work for
156
+ making modifications to it. For an executable work, complete source
157
+ code means all the source code for all modules it contains, plus any
158
+ associated interface definition files, plus the scripts used to
159
+ control compilation and installation of the executable. However, as a
160
+ special exception, the source code distributed need not include
161
+ anything that is normally distributed (in either source or binary
162
+ form) with the major components (compiler, kernel, and so on) of the
163
+ operating system on which the executable runs, unless that component
164
+ itself accompanies the executable.
165
+
166
+ If distribution of executable or object code is made by offering
167
+ access to copy from a designated place, then offering equivalent
168
+ access to copy the source code from the same place counts as
169
+ distribution of the source code, even though third parties are not
170
+ compelled to copy the source along with the object code.
171
+
172
+ 4. You may not copy, modify, sublicense, or distribute the Program
173
+ except as expressly provided under this License. Any attempt
174
+ otherwise to copy, modify, sublicense or distribute the Program is
175
+ void, and will automatically terminate your rights under this License.
176
+ However, parties who have received copies, or rights, from you under
177
+ this License will not have their licenses terminated so long as such
178
+ parties remain in full compliance.
179
+
180
+ 5. You are not required to accept this License, since you have not
181
+ signed it. However, nothing else grants you permission to modify or
182
+ distribute the Program or its derivative works. These actions are
183
+ prohibited by law if you do not accept this License. Therefore, by
184
+ modifying or distributing the Program (or any work based on the
185
+ Program), you indicate your acceptance of this License to do so, and
186
+ all its terms and conditions for copying, distributing or modifying
187
+ the Program or works based on it.
188
+
189
+ 6. Each time you redistribute the Program (or any work based on the
190
+ Program), the recipient automatically receives a license from the
191
+ original licensor to copy, distribute or modify the Program subject to
192
+ these terms and conditions. You may not impose any further
193
+ restrictions on the recipients' exercise of the rights granted herein.
194
+ You are not responsible for enforcing compliance by third parties to
195
+ this License.
196
+
197
+ 7. If, as a consequence of a court judgment or allegation of patent
198
+ infringement or for any other reason (not limited to patent issues),
199
+ conditions are imposed on you (whether by court order, agreement or
200
+ otherwise) that contradict the conditions of this License, they do not
201
+ excuse you from the conditions of this License. If you cannot
202
+ distribute so as to satisfy simultaneously your obligations under this
203
+ License and any other pertinent obligations, then as a consequence you
204
+ may not distribute the Program at all. For example, if a patent
205
+ license would not permit royalty-free redistribution of the Program by
206
+ all those who receive copies directly or indirectly through you, then
207
+ the only way you could satisfy both it and this License would be to
208
+ refrain entirely from distribution of the Program.
209
+
210
+ If any portion of this section is held invalid or unenforceable under
211
+ any particular circumstance, the balance of the section is intended to
212
+ apply and the section as a whole is intended to apply in other
213
+ circumstances.
214
+
215
+ It is not the purpose of this section to induce you to infringe any
216
+ patents or other property right claims or to contest validity of any
217
+ such claims; this section has the sole purpose of protecting the
218
+ integrity of the free software distribution system, which is
219
+ implemented by public license practices. Many people have made
220
+ generous contributions to the wide range of software distributed
221
+ through that system in reliance on consistent application of that
222
+ system; it is up to the author/donor to decide if he or she is willing
223
+ to distribute software through any other system and a licensee cannot
224
+ impose that choice.
225
+
226
+ This section is intended to make thoroughly clear what is believed to
227
+ be a consequence of the rest of this License.
228
+
229
+ 8. If the distribution and/or use of the Program is restricted in
230
+ certain countries either by patents or by copyrighted interfaces, the
231
+ original copyright holder who places the Program under this License
232
+ may add an explicit geographical distribution limitation excluding
233
+ those countries, so that distribution is permitted only in or among
234
+ countries not thus excluded. In such case, this License incorporates
235
+ the limitation as if written in the body of this License.
236
+
237
+ 9. The Free Software Foundation may publish revised and/or new versions
238
+ of the General Public License from time to time. Such new versions will
239
+ be similar in spirit to the present version, but may differ in detail to
240
+ address new problems or concerns.
241
+
242
+ Each version is given a distinguishing version number. If the Program
243
+ specifies a version number of this License which applies to it and "any
244
+ later version", you have the option of following the terms and conditions
245
+ either of that version or of any later version published by the Free
246
+ Software Foundation. If the Program does not specify a version number of
247
+ this License, you may choose any version ever published by the Free Software
248
+ Foundation.
249
+
250
+ 10. If you wish to incorporate parts of the Program into other free
251
+ programs whose distribution conditions are different, write to the author
252
+ to ask for permission. For software which is copyrighted by the Free
253
+ Software Foundation, write to the Free Software Foundation; we sometimes
254
+ make exceptions for this. Our decision will be guided by the two goals
255
+ of preserving the free status of all derivatives of our free software and
256
+ of promoting the sharing and reuse of software generally.
257
+
258
+ NO WARRANTY
259
+
260
+ 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
261
+ FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
262
+ OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
263
+ PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
264
+ OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
265
+ MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
266
+ TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
267
+ PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
268
+ REPAIR OR CORRECTION.
269
+
270
+ 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
271
+ WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
272
+ REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
273
+ INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
274
+ OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
275
+ TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
276
+ YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
277
+ PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
278
+ POSSIBILITY OF SUCH DAMAGES.
279
+
280
+ END OF TERMS AND CONDITIONS
281
+
282
+ Appendix: How to Apply These Terms to Your New Programs
283
+
284
+ If you develop a new program, and you want it to be of the greatest
285
+ possible use to the public, the best way to achieve this is to make it
286
+ free software which everyone can redistribute and change under these terms.
287
+
288
+ To do so, attach the following notices to the program. It is safest
289
+ to attach them to the start of each source file to most effectively
290
+ convey the exclusion of warranty; and each file should have at least
291
+ the "copyright" line and a pointer to where the full notice is found.
292
+
293
+ <one line to give the program's name and a brief idea of what it does.>
294
+ Copyright (C) 19yy <name of author>
295
+
296
+ This program is free software; you can redistribute it and/or modify
297
+ it under the terms of the GNU General Public License as published by
298
+ the Free Software Foundation; either version 2 of the License, or
299
+ (at your option) any later version.
300
+
301
+ This program is distributed in the hope that it will be useful,
302
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
303
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
304
+ GNU General Public License for more details.
305
+
306
+ You should have received a copy of the GNU General Public License
307
+ along with this program; if not, write to the Free Software
308
+ Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
309
+
310
+ Also add information on how to contact you by electronic and paper mail.
311
+
312
+ If the program is interactive, make it output a short notice like this
313
+ when it starts in an interactive mode:
314
+
315
+ Gnomovision version 69, Copyright (C) 19yy name of author
316
+ Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
317
+ This is free software, and you are welcome to redistribute it
318
+ under certain conditions; type `show c' for details.
319
+
320
+ The hypothetical commands `show w' and `show c' should show the appropriate
321
+ parts of the General Public License. Of course, the commands you use may
322
+ be called something other than `show w' and `show c'; they could even be
323
+ mouse-clicks or menu items--whatever suits your program.
324
+
325
+ You should also get your employer (if you work as a programmer) or your
326
+ school, if any, to sign a "copyright disclaimer" for the program, if
327
+ necessary. Here is a sample; alter the names:
328
+
329
+ Yoyodyne, Inc., hereby disclaims all copyright interest in the program
330
+ `Gnomovision' (which makes passes at compilers) written by James Hacker.
331
+
332
+ <signature of Ty Coon>, 1 April 1989
333
+ Ty Coon, President of Vice
334
+
335
+ This General Public License does not permit incorporating your program into
336
+ proprietary programs. If your program is a subroutine library, you may
337
+ consider it more useful to permit linking proprietary applications with the
338
+ library. If this is what you want to do, use the GNU Library General
339
+ Public License instead of this License.
data/ChangeLog ADDED
@@ -0,0 +1,3 @@
1
+ 2010 November 7 (Alexey Kryukov) Version 1.0.0
2
+
3
+ * Initial release
data/README ADDED
@@ -0,0 +1,53 @@
1
+ PDFBeads -- convert scanned images to a single PDF file
2
+ Version 1.0 (November 2010)
3
+
4
+ Copyright (C) 2010 Alexey Kryukov (amkryukov@gmail.com).
5
+ All rights reserved.
6
+
7
+ PDFBeads is a small utility written in Ruby which takes scanned
8
+ page images and converts them into a single PDF file. Unlike other
9
+ PDF creation tools, PDFBeads attempts to implement the approach
10
+ typically used for DjVu books. Its key feature is separating scanned
11
+ text (typically black, but indexed images with a small number of
12
+ colors are also accepted) from halftone pictures. Each type of
13
+ graphical data is encoded into its own layer with a specific
14
+ compression method and resolution.
15
+
16
+ The name `PDFBeads' has been selected for the package because
17
+ building PDF files from separate images is comparable to threading
18
+ beads on a string. It also seems to be a good choice for a Ruby
19
+ application.
20
+
21
+ Here are a few operations you can perform with PDFBeads:
22
+
23
+ * encode B&W images using either CCITT Group 4 Fax or JBIG2
24
+ compression method (you'll need Adam Langley's jbig2 utility,
25
+ available at https://github.com/agl/jbig2enc/ , for JBIG2
26
+ compression);
27
+
28
+ * combine halftone or indexed pictures with previously binarized
29
+ text pages, placing them into the background layer. Various
30
+ compression methods of background images (JPEG2000, JPEG or
31
+ PNG-styled deflate compression) are supported;
32
+
33
+ * split mixed images where binarized text is combined with color
34
+ or grayscale pictures (such pages may be produced with ScanTailor --
35
+ an interactive post-processing tool for scanned pages, available
36
+ at http://scantailor.sourceforge.net) and encode each layer
37
+ separately;
38
+
39
+ * correctly process indexed images with a limited number of colors,
40
+ encoding each color separately into the foreground layer;
41
+
42
+ * split color images into background and foreground layers (similar
43
+ to BG44 and FG44 chunks in a DjVu file) according to a given mask;
44
+
45
+ * create PDF files with TOC and metadata;
46
+
47
+ * read text from hOCR files and create a hidden text layer in the PDF
48
+ file.
49
+
50
+ Note that PDFBeads is intended for creating PDF files from previously
51
+ processed images, and so it can't do some operations (e.g. converting
52
+ color or grayscale scans to B&W) which should be typically performed with
53
+ a special scan processing application, such as ScanTailor.
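
The README item about indexed images ("encoding each color separately into the foreground layer") can be pictured with a small sketch. This is an illustration under stated assumptions, not PDFBeads' implementation: the page is modelled as a plain array of palette indices instead of a real image file, index 0 is assumed to be the paper background, and `split_indexed` is a hypothetical helper name.

```ruby
# Hypothetical illustration only: PDFBeads itself works on real image files.
# Split an indexed "image" (rows of palette indices) into one bitonal mask
# per non-background color, so each color could be encoded separately.
def split_indexed(pixels, background_index = 0)
  colors = pixels.flatten.uniq - [background_index]
  colors.sort.each_with_object({}) do |color, masks|
    # A mask holds 1 where the pixel has this color and 0 elsewhere.
    masks[color] = pixels.map { |row| row.map { |px| px == color ? 1 : 0 } }
  end
end

# Toy 4x3 page with two ink colors (indices 1 and 2) on background 0.
page = [
  [0, 1, 1, 0],
  [0, 2, 0, 0],
  [1, 0, 0, 2]
]

masks = split_indexed(page)
masks.each do |color, mask|
  puts "color #{color}:"
  mask.each { |row| puts row.join }
end
```

Each resulting mask is a bitonal image, which is the kind of data that G4 or JBIG2 compression is designed for.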
data/bin/pdfbeads ADDED
@@ -0,0 +1,189 @@
1
+ #!/usr/bin/env ruby1.9.1
2
+ # encoding: UTF-8
3
+
4
+ ######################################################################
5
+ #
6
+ # PDFBeads -- convert scanned images to a single PDF file
7
+ # Version 1.0
8
+ #
9
+ # Unlike other PDF creation tools, this utility attempts to implement
10
+ # the approach typically used for DjVu books. Its key feature is
11
+ # separating scanned text (typically black, but indexed images with
12
+ # a small number of colors are also accepted) from halftone images
13
+ # placed into a background layer.
14
+ #
15
+ # Copyright (C) 2010 Alexey Kryukov (amkryukov@gmail.com).
16
+ # All rights reserved.
17
+ #
18
+ # This program is free software; you can redistribute it and/or modify
19
+ # it under the terms of the GNU General Public License as published by
20
+ # the Free Software Foundation; either version 2 of the License, or
21
+ # (at your option) any later version.
22
+ #
23
+ # This program is distributed in the hope that it will be useful,
24
+ # but WITHOUT ANY WARRANTY; without even the implied warranty of
25
+ # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
26
+ # GNU General Public License for more details.
27
+ #
28
+ # You should have received a copy of the GNU General Public License
29
+ # along with this program; if not, write to the Free Software
30
+ # Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
31
+ #
32
+ #######################################################################
33
+
34
+ require 'optparse'
35
+ require 'iconv'
36
+ require 'time'
37
+
38
+ require 'pdfbeads'
39
+ include PDFBeads
40
+
41
+ pdfargs = Hash[
42
+ :labels => nil,
43
+ :toc => nil,
44
+ :pagelayout => 'TwoPageRight',
45
+ :meta => nil
46
+ ]
47
+ pageargs = Hash[
48
+ :maxcolors => 4,
49
+ :st_resolution => 0,
50
+ :bg_resolution => 300,
51
+ :bg_format => 'JP2',
52
+ :st_format => 'JBIG2',
53
+ :pages_per_dict => 15,
54
+ :force_update => false,
55
+ :force_grayscale => false
56
+ ]
57
+ outpath = 'STDOUT'
58
+
59
+ OptionParser.new() do |opts|
60
+ opts.set_summary_width(24)
61
+ opts.set_summary_indent(" ")
62
+
63
+ opts.banner = "Usage: pdfbeads [options] [files to process] > out.pdf"
64
+
65
+ opts.separator "\n"
66
+ opts.separator "PDF file properties:\n"
67
+
68
+ opts.on("-C", "--toc TOCFILE",
69
+ "Build PDF outline dictionary from a text file") do |tocfile|
70
+ pdfargs[:toc] = tocfile
71
+ end
72
+ opts.on("-L", "--labels LSPEC",
73
+ "Specify page labels for user-friendly page numbering") do |lspec|
74
+ pdfargs[:labels] = lspec
75
+ end
76
+ opts.on("-M", "--meta METAFILE",
77
+ "Take metadata for the PDF file from a text file") do |metafile|
78
+ pdfargs[:meta] = metafile
79
+ end
80
+ opts.on("-P", "--pagelayout LAYOUT",
81
+ ['SinglePage', 'OneColumn', 'TwoColumnLeft', 'TwoColumnRight', 'TwoPageLeft', 'TwoPageRight'],
82
+ "Specify the default page layout for PDF viewer, where",
83
+ "LAYOUT is `SinglePage', `OneColumn', `TwoColumnLeft',",
84
+ "`TwoColumnRight', `TwoPageLeft', or `TwoPageRight'") do |pagelayout|
85
+
86
+ pdfargs[:pagelayout] = pagelayout
87
+ end
88
+
89
+ opts.separator "\n"
90
+ opts.separator "Image encoding and compression options:\n"
91
+
92
+ opts.on("-f", "--force-update",
93
+ "Always write subsidiary image files even if a file",
94
+ "with the same name is already found on the disk") do |f|
95
+ pageargs[:force_update] = f
96
+ end
97
+ opts.on("-m", "--mask-compression FORMAT", ['JBIG2', 'jbig2', 'G4', 'g4', 'Group4', 'CCITTFax'],
98
+ "Compression method for foreground text mask in PDF",
99
+ "pages (JBIG2 or G4). JBIG2 is used by default, unless",
100
+ "the encoder is not available" ) do |format|
101
+ if 'JBIG2'.eql? format.upcase
102
+ pageargs[:st_format] = 'JBIG2'
103
+ else
104
+ pageargs[:st_format] = 'G4'
105
+ end
106
+ end
107
+ opts.on("-p", "--pages-per-dict NUM",
108
+ "Generate one shared JBIG2 dictionary per NUM pages.",
109
+ "This option is only applied when JBIG2 compression",
110
+ "is used. Default value is #{pageargs[:pages_per_dict]}") do |p|
111
+ pageargs[:pages_per_dict] = p.to_i
112
+ end
113
+ opts.on("-r", "--force-resolution DPI",
114
+ "Set resolution for foreground mask images to the",
115
+ "specified value (in pixels per inch). Note that the",
116
+ "image is not actually resampled.") do |r|
117
+ pageargs[:st_resolution] = r.to_f
118
+ end
119
+ opts.on("-x", "--max-colors NUM",
120
+ "If pdfbeads finds an indexed file with NUM or",
121
+ "fewer colors, then it will attempt to split it into",
122
+ "several bitonal images and encode them all into the",
123
+ "PDF page mask. Otherwise the file is treated just",
124
+ "like a normal greyscale or color image. Default",
125
+ "value is #{pageargs[:maxcolors]}") do |num|
126
+ pageargs[:maxcolors] = num.to_i
127
+ end
128
+
129
+ opts.separator "\n"
130
+ opts.separator "The following options are only applied when pdfbeads attempts to split"
131
+ opts.separator "a mixed source image into text mask and background layer:\n"
132
+
133
+ opts.on("-b", "--bg-compression FORMAT",
134
+ ['JP2', 'JPX', 'J2K', 'JPEG2000', 'JPG', 'JPEG', 'LOSSLESS', 'PNG', 'DEFLATE'],
135
+ "Compression method for background images. Acceptable",
136
+ "values are JP2|JPX|JPEG2000, JPG|JPEG or LOSSLESS.",
137
+ "JP2 is used by default, unless this format is not",
138
+ "supported by the available version of ImageMagick" ) do |format|
139
+ case format.upcase
140
+ when 'JP2', 'JPX', 'J2K', 'JPEG2000'
141
+ pageargs[:bg_format] = 'JP2'
142
+ when 'JPG', 'JPEG'
143
+ pageargs[:bg_format] = 'JPG'
144
+ else
145
+ pageargs[:bg_format] = 'PNG'
146
+ end
147
+ end
148
+ opts.on("-B", "--bg-resolution DPI",
149
+ "Set resolution for background images (300dpi default)" ) do |dpi|
150
+ pageargs[:bg_resolution] = dpi.to_f
151
+ end
152
+ opts.on("-g", "--grayscale",
153
+ "When separating text from background, always convert",
154
+ "background images to grayscale") do |g|
155
+ pageargs[:force_grayscale] = g
156
+ end
157
+
158
+ opts.separator "\n"
159
+ opts.separator "General options:\n"
160
+
161
+ opts.on("-o", "--output FILE",
162
+ "Print output to a file instead of STDOUT") do |f|
163
+ outpath = f
164
+ end
165
+ opts.on_tail("-h", "--help", "Show this message") do
166
+ puts opts
167
+ exit
168
+ end
169
+
170
+ opts.parse!(ARGV)
171
+ end
172
+
173
+ if ARGV.length == 0
174
+ files = Dir.glob("*").sort
175
+ else
176
+ files = ARGV.sort
177
+ end
178
+
179
+ pages = PageDataProvider.new( files,pageargs )
180
+ if pages.length == 0
181
+ $stderr.puts "pdfbeads: no pages to process"
182
+ else
183
+ if pageargs[:st_format].eql? 'JBIG2'
184
+ pageargs[:st_format] = 'G4' unless pages.jbig2Encode()
185
+ end
186
+ pdfproc = PDFBuilder.new( pdfargs )
187
+ pdfproc.process( pages,pageargs[:st_format] )
188
+ pdfproc.output( outpath )
189
+ end
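
A note on the option handling in the script above: OptionParser passes option arguments to the block as strings unless a type is given, and the enumerated list for `--mask-compression` accepts both `JBIG2` and `jbig2`. The following is a minimal sketch of one way to handle both points, using a reduced option set. `parse_page_options` is a hypothetical helper that is not part of pdfbeads; only the `Integer` coercion and the array-of-candidates argument are standard OptionParser features.

```ruby
require 'optparse'

# Hypothetical helper, not part of pdfbeads: parse a reduced option set,
# letting OptionParser coerce numeric arguments and normalizing aliases.
def parse_page_options(argv)
  pageargs = { :maxcolors => 4, :pages_per_dict => 15, :st_format => 'JBIG2' }

  OptionParser.new do |opts|
    # Passing Integer makes OptionParser convert (and validate) the argument.
    opts.on("-p", "--pages-per-dict NUM", Integer) { |n| pageargs[:pages_per_dict] = n }
    opts.on("-x", "--max-colors NUM", Integer) { |n| pageargs[:maxcolors] = n }
    opts.on("-m", "--mask-compression FORMAT",
            ['JBIG2', 'jbig2', 'G4', 'g4', 'Group4', 'CCITTFax']) do |format|
      # Compare case-insensitively so that `jbig2' is not mapped to G4.
      pageargs[:st_format] = format.upcase == 'JBIG2' ? 'JBIG2' : 'G4'
    end
  end.parse!(argv)

  pageargs
end

args = parse_page_options(%w[-p 20 -x 8 -m jbig2 page1.tif])
p args
```

With the `Integer` type, a non-numeric value such as `-p many` raises `OptionParser::InvalidArgument` instead of silently storing a string.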