file_pipeline 0.0.1 → 0.0.2

Files changed (3)
  1. checksums.yaml +4 -4
  2. data/README.rdoc +513 -0
  3. metadata +4 -216
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: fe1107c14a49c1372b1d3f62c76db36b262311113d3661b4be3ac3b7275f97b8
4
- data.tar.gz: f01ab4be7e0eb246a7da41fa734a2e88491dc90ce01033b3b7a125c012029525
3
+ metadata.gz: cda1036fc126999807374257bd750430e152b18e401af8a0acb8087a47e92951
4
+ data.tar.gz: b9acc95e3f3a4cdbdd70d61ebbb1a9bd8c41614cd390455f12ecc18156ceb586
5
5
  SHA512:
6
- metadata.gz: bc53dff7ae7c5a9955f72e83ee2be935f59db398093b555531b5bdbd619faf4637d6bf5effa1842fcf8b3906a90c1626394b892aa6fd24042063b7e78c2f86fb
7
- data.tar.gz: 14f493a7bf76fc55c4e60df8d2f999ae66b5770a064c3cbf03b46e52b8a03602629e399cfb6d395109cbf87d89f7dfc459d3f9911c7b1bdc959fc8b9e49c7881
6
+ metadata.gz: 2bb410c9ed6ec2d7a1d9eaa57012d1f48521311f930ab3ffd96c2ce249498d0e7cc164a9676fd0d6f66b3060d89fec7a1347680eaf1107c9433ea899ad5615e5
7
+ data.tar.gz: 41bd9a73cfc94b6c378a21d2eee38cf6badbd59ab29c55faa2a3668a43db01dd0cf5e31e9cd431d2492bc1d854900140844e77014ed0fbfd5bd92bc7d4848100
data/README.rdoc ADDED
@@ -0,0 +1,513 @@
1
+ = file_pipeline
2
+
3
+ The <tt>file_pipeline</tt> gem provides a framework for nondestructive
4
+ application of file operation batches to files.
5
+
6
+ == Installation
7
+
8
+ gem install file_pipeline
9
+
10
+ == Dependencies
11
+
12
+ The file operations included in the gem require
13
+ {ruby-vips}[https://github.com/libvips/ruby-vips] for image manipulation and
14
+ {multi_exiftool}[https://github.com/janfri/multi_exiftool] for image file
15
+ metadata extraction and manipulation.
16
+
17
+ While these dependencies should be installed automatically with the gem,
18
+ ruby-vips depends on {libvips}[https://libvips.github.io/libvips/], and
19
+ multi_exiftool depends on
20
+ {Exiftool}[https://www.sno.phy.queensu.ca/~phil/exiftool/], which will not be
21
+ installed automatically.
22
+
23
+ == Usage
24
+
25
+ The basic usage is to create a new FilePipeline::Pipeline object and define any
26
+ file operations that are to be performed, apply it to a
27
+ FilePipeline::VersionedFile object initialized with the image to be processed,
28
+ then finalize the versioned file.
29
+
30
+ require 'file_pipeline'
31
+
32
+ # create a new instance of Pipeline
33
+ my_pipeline = FilePipeline::Pipeline.new
34
+
35
+ # configure an operation to scale an image to 1280 x 960 pixels
36
+ my_pipeline.define_operation('scale', :width => 1280, :height => 960)
37
+
38
+ # create an instance of VersionedFile for the file '~/image.jpg'
39
+ image = FilePipeline::VersionedFile.new('~/image.jpg')
40
+
41
+ # apply the pipeline to the versioned file
42
+ my_pipeline.apply_to(image)
43
+
44
+ # finalize the versioned file, replacing the original
45
+ image.finalize(:overwrite => true)
46
+
47
+ === Setting up a Pipeline
48
+
49
+ Pipeline objects can be set up to contain default file operations included in
50
+ the gem or with custom file operations (see
51
+ {Custom file operations}[rdoc-label:label-Custom+file+operations] for
52
+ instructions on how to create custom operations).
53
+
54
+ ==== Basic set up with default operations
55
+
56
+ To define an operation, pass the class name of the operation in underscore
57
+ notation and without the containing module name, and any options to
58
+ {#define_operation}[rdoc-ref:FilePipeline::Pipeline#define_operation].
59
+
60
+ The example below adds an instance of
61
+ PtiffConversion[rdoc-ref:FilePipeline::FileOperations::PtiffConversion] with
62
+ the <tt>:tile_width</tt> and <tt>:tile_height</tt> options each set to 64
63
+ pixels.
64
+
65
+ my_pipeline = FilePipeline::Pipeline.new
66
+ my_pipeline.define_operation('ptiff_conversion',
67
+ :tile_width => 64, :tile_height => 64)
68
+
69
+ Calls to <tt>#define_operation</tt> can be chained:
70
+
71
+ my_pipeline = FilePipeline::Pipeline.new
72
+ my_pipeline.define_operation('scale', :width => 1280, :height => 1024)
73
+ .define_operation('exif_restoration')
74
+
75
+ Alternatively, operations can be defined during initialization by passing a
76
+ block to {#new}[rdoc-ref:FilePipeline::Pipeline.new].
77
+
78
+ my_pipeline = FilePipeline::Pipeline.new do |pipeline|
79
+ pipeline.define_operation('scale', :width => 1280, :height => 1024)
80
+ pipeline.define_operation('exif_restoration')
81
+ end
82
+
83
+ When using the default operations included in the gem, it is sufficient to
84
+ call <tt>#define_operation</tt> with the desired operations and options.
85
+
86
+ ==== Using custom file operations
87
+
88
+ When file operations are to be used that are not included in the gem, place
89
+ the source files for the class definitions in one or more directories and
90
+ initialize the Pipeline object with the directory paths. The directories will
91
+ be added to the {source directories}[rdoc-ref:FilePipeline.source_directories].
92
+
93
+ Directories are added to the source directories in reverse order, so that
94
+ directories added later will have precedence when searching source files. The
95
+ default operations directory included in the gem will be searched last. This
96
+ allows overriding of operations without changing the code in existing classes.
97
+
98
+ If, for example, there are two directories with custom file operation classes,
99
+ <tt>'~/custom_operations'</tt> and <tt>'~/other_operations'</tt>, the new
100
+ instance of Pipeline can be set up to look for source files first in
101
+ <tt>'~/other_operations'</tt>, then in <tt>'~/custom_operations'</tt>, and
102
+ finally in the included default operations.
103
+
104
+ The basename for source files _must_ be the class name in underscore notation
105
+ without the containing module name. If, for example, the operation is
106
+ <tt>FileOperations::MyOperation</tt>, the source file basename should be
107
+ <tt>'my_operation.rb'</tt>
108
+
109
+ my_pipeline = FilePipeline::Pipeline.new('~/custom_operations',
110
+ '~/other_operations')
111
+ my_pipeline.define_operation('my_operation')
112
+
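The naming and precedence rules above can be sketched in plain Ruby. This is only an illustration under stated assumptions: +underscore+, +search_order+, and +locate_source+ are hypothetical helpers written for this sketch, not methods of the gem, and <tt>'default_operations'</tt> is a placeholder for the gem's own operations directory.

```ruby
# Hypothetical helpers for illustration; none of these are part of the
# file_pipeline API.

# Converts a class name such as 'MyOperation' to 'my_operation'.
def underscore(class_name)
  class_name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
end

# Builds a search list in which directories given later come first and the
# default directory comes last.
def search_order(custom_directories, default_directory)
  custom_directories.reverse + [default_directory]
end

# Returns the path of the first existing source file for the class name,
# or nil if no directory contains one.
def locate_source(class_name, directories)
  basename = "#{underscore(class_name)}.rb"
  directories.map { |dir| File.join(dir, basename) }
             .find { |path| File.exist?(path) }
end

# '~/other_operations' is searched first, the placeholder default last.
order = search_order(['~/custom_operations', '~/other_operations'],
                     'default_operations')
```

Because the search stops at the first match, a <tt>my_operation.rb</tt> placed in a directory that was added later would shadow a file of the same name in an earlier directory or in the defaults.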
113
+ See {Custom file operations}[rdoc-label:label-Custom+file+operations] for
114
+ instructions on how to write file operations.
115
+
116
+ === Nondestructive application to files
117
+
118
+ Pipelines[rdoc-ref:FilePipeline::Pipeline] work on
119
+ {versioned files}[rdoc-ref:FilePipeline::VersionedFile], which allow for
120
+ non-destructive application of all file operations.
121
+
122
+ To create a versioned file, initialize an instance with the path to the
123
+ original file:
124
+
125
+ # create an instance of VersionedFile for the file '~/image.jpg'
126
+ image = FilePipeline::VersionedFile.new('~/image.jpg')
127
+
128
+ As long as no operations have been applied, this will have no effect in the
129
+ file system. Only when the first operation is applied will VersionedFile
130
+ create a {working directory}[rdoc-ref:FilePipeline::VersionedFile#directory] in
131
+ the same directory as the original file. The working directory will have the
132
+ name of the file basename without extension and the suffix <tt>'_versions'</tt>
133
+ added.
134
+
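As a rough illustration of the naming rule just described, the working directory path could be derived as follows. The helper +versions_directory+ is hypothetical and exists only for this sketch; the gem computes the path internally.

```ruby
# Hypothetical helper, not part of the file_pipeline API: derives the
# working directory path from the path of the original file.
def versions_directory(file)
  path = File.expand_path(file)      # resolves '~' and relative paths
  name = File.basename(path, '.*')   # basename without extension
  File.join(File.dirname(path), "#{name}_versions")
end

versions_directory('/tmp/photos/image.jpg')
```

For <tt>'/tmp/photos/image.jpg'</tt> this yields a directory named <tt>image_versions</tt> next to the original file.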
135
+ Pipelines can be applied to a single versioned file with the
136
+ {#apply_to}[rdoc-ref:FilePipeline::Pipeline#apply_to] method of the pipeline
137
+ instance, or to an array of versioned files with the
138
+ {#batch_apply}[rdoc-ref:FilePipeline::Pipeline#batch_apply] method of the
139
+ pipeline instance.
140
+
141
+ === Accessing file metadata and captured data.
142
+
143
+ *Limitations*: this currently only works for _Exif_ metadata of image files.
144
+
145
+ VersionedFile provides access to a file's metadata via the
146
+ {#metadata}[rdoc-ref:FilePipeline::VersionedFile#metadata] method of the
147
+ versioned file instance.
148
+
149
+ Both the metadata for the original file and the current (latest) version can
150
+ be accessed:
151
+
152
+ image = FilePipeline::VersionedFile.new('~/image.jpg')
153
+
154
+ # access the metadata for the current version
155
+ image.metadata
156
+
157
+ Note that if no file operations have been applied by a pipeline object, this
158
+ will return the metadata for the original, which in that case is the current
159
+ (latest) version.
160
+
161
+ To explicitly get the metadata for the original file even if there are newer
162
+ versions available, pass the <tt>:for_version</tt> option with the symbol
163
+ <tt>:original</tt>:
164
+
165
+ # access the metadata for the original file
166
+ image.metadata(:for_version => :original)
167
+
168
+ Some file operations can compromise metadata; many image processing libraries
169
+ will not preserve all _Exif_ tags and their values when converting images to
170
+ a different format, but only write a small subset of tags to the file they
171
+ create. In these cases, the
172
+ {ExifRestoration}[rdoc-ref:FilePipeline::FileOperations::ExifRestoration]
173
+ operation can be used to try to restore the tags that have been discarded, but
174
+ it cannot write all tags. It will store all tags and their values that it could
175
+ not write back to the file and return them as captured data.
176
+
177
+ Likewise, if the
178
+ {ExifRedaction}[rdoc-ref:FilePipeline::FileOperations::ExifRedaction] operation is applied
179
+ to delete sensitive tags (e.g. GPS location data), it will return all deleted
180
+ exif tags and their values as captured data.
181
+
182
+ The
183
+ {#recovered_metadata}[rdoc-ref:FilePipeline::VersionedFile#recovered_metadata]
184
+ method of the versioned file instance returns a hash with all metadata that was
185
+ deleted or could not be restored by operations:
186
+
187
+ delete_tags = ['CreatorTool', 'Software']
188
+
189
+ my_pipeline = FilePipeline::Pipeline.new do |pipeline|
190
+ pipeline.define_operation('scale', width: 1280, height: 1024)
191
+ pipeline.define_operation('exif_restoration')
192
+ pipeline.define_operation('exif_redaction', :redact_tags => delete_tags)
193
+ end
194
+
195
+ image = FilePipeline::VersionedFile.new('~/image.jpg')
196
+ my_pipeline.apply_to(image)
197
+
198
+ image.recovered_metadata
199
+
200
+ For information on retrieving other kinds of captured data, refer to
201
+ the versioned file instance methods
202
+ {#captured_data}[rdoc-ref:FilePipeline::VersionedFile#captured_data],
203
+ {#captured_data_for}[rdoc-ref:FilePipeline::VersionedFile#captured_data_for],
204
+ and
205
+ {#captured_data_with}[rdoc-ref:FilePipeline::VersionedFile#captured_data_with].
206
+
207
+ === Finalizing files
208
+
209
+ Once all file operations of a pipeline object have been applied to a
210
+ versioned file object, it can be finalized by calling the
211
+ {#finalize}[rdoc-ref:FilePipeline::VersionedFile#finalize] method of the
212
+ instance.
213
+
214
+ Finalization will write the current version to the same directory that
215
+ contains the original. It will by default preserve the original by adding
216
+ a suffix to the basename of the final version. If the <tt>:overwrite</tt>
217
+ option for the method is passed with +true+, it will delete the original and
218
+ write the final version to the same basename as the original.
219
+
220
+ image = FilePipeline::VersionedFile.new('~/image.jpg')
221
+
222
+ # finalize the versioned file, preserving the original
223
+ image.finalize
224
+
225
+ # finalize the versioned file, replacing the original
226
+ image.finalize(:overwrite => true)
227
+
228
+ The working directory with all other versions will be deleted after the final
229
+ version has been written.
230
+
231
+ == Custom file operations
232
+
233
+ === Module nesting
234
+
235
+ File operation classes _must_ be defined in the FilePipeline::FileOperations
236
+ module for {automatic requiring}[rdoc-ref:FilePipeline.load] of source files to
237
+ work.
238
+
239
+ === Implementing from scratch
240
+
241
+ ==== Initializer
242
+
243
+ The <tt>#initialize</tt> method _must_ take an +options+ argument (a hash
244
+ with a default value, or a <em>double splat</em>), and the options _must_ be exposed
245
+ through an <tt>#options</tt> getter method.
246
+
247
+ The options passed can be anything the file operation needs to properly
248
+ configure a specific instance of the operation.
249
+
250
+ This requirement is imposed by the
251
+ {#define_operation}[rdoc-ref:FilePipeline::Pipeline#define_operation] instance
252
+ method of Pipeline, which will automatically load and initialize an instance of
253
+ the file operation with any options provided as a hash.
254
+
255
+ ===== Examples
256
+
257
+ class MyOperation
258
+ attr_reader :options
259
+
260
+ # initializer with a default
261
+ def initialize(options = {})
262
+ @options = options
263
+ end
264
+ end
265
+
266
+ class MyOperation
267
+ attr_reader :options
268
+
269
+ # initializer with a double splat
270
+ def initialize(**options)
271
+ @options = options
272
+ end
273
+ end
274
+
275
+ Consider a file operation +CopyrightNotice+ that will add copyright
276
+ information to an image file's _Exif_ metadata; the value for the copyright
277
+ tag could be passed as an option.
278
+
279
+ copyright_notice = CopyrightNotice.new(:copyright => 'The Photographer')
280
+
281
+ ==== The <tt>run</tt> method
282
+
283
+ File operations _must_ implement a <tt>#run</tt> method that takes three
284
+ arguments (or a _splat_) in order to be used in a Pipeline.
285
+
286
+ ===== Arguments
287
+
288
+ The three arguments required for implementations of <tt>#run</tt> are:
289
+ * the path to the <em>file to be modified</em>
290
+ * the path to the _directory_ to which new files will be saved.
291
+ * the path to the <em>original file</em>, from which the first version in a
292
+ succession of modified versions has been created.
293
+
294
+ The <em>original file</em> will only be used by file operations that require
295
+ it for reference, e.g. to restore file metadata that was compromised by
296
+ other file operations.
297
+
298
+ ===== Return value
299
+
300
+ The method _must_ return the path to the file that was created by the
301
+ operation (preferably in the _directory_). It _may_ also return a
302
+ {Results}[rdoc-ref:FilePipeline::FileOperations::Results] object, containing the
303
+ operation itself, a _success_ flag (+true+ or +false+), and any logs or data
304
+ returned by the operation.
305
+
306
+ If results are returned with the path to the created file, both values must
307
+ be wrapped in an array, with the path as the first element, the results as
308
+ the second.
309
+
310
+ ===== Example
311
+
312
+ def run(src_file, directory, original)
313
+ # make a path to which the created file will be written
314
+ out_file = File.join(directory, 'new_file_name.extension')
315
+
316
+ # create a Results object reporting success with no logs or data
317
+ results = Results.new(self, true, nil)
318
+
319
+ # create a new out_file based on src_file in directory
320
+ # ...
321
+
322
+ # return the path to the new file and the results object
323
+ [out_file, results]
324
+ end
325
+
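The two permitted return shapes (a bare path, or a path paired with results) can be handled uniformly by a caller. The sketch below is self-contained and runs without the gem: the +Results+ struct is a stand-in whose field names are assumptions, and +CopyOperation+ and +unpack+ are hypothetical names introduced only for this illustration.

```ruby
require 'fileutils'
require 'tmpdir'

# Stand-in for FilePipeline::FileOperations::Results; the field names are
# assumptions made for this sketch.
Results = Struct.new(:operation, :success, :log)

# Hypothetical operation that copies the source file unchanged.
class CopyOperation
  attr_reader :options

  def initialize(**options)
    @options = options
  end

  # Takes the three required arguments and returns [path, results].
  def run(src_file, directory, _original)
    out_file = File.join(directory, "copy_of_#{File.basename(src_file)}")
    FileUtils.cp(src_file, out_file)
    [out_file, Results.new(self, true, nil)]
  end
end

# Accepts either a bare path or a [path, results] pair and always returns
# a two-element array; results is nil when only a path was returned.
def unpack(return_value)
  path, results = Array(return_value)
  [path, results]
end

Dir.mktmpdir do |dir|
  src = File.join(dir, 'image.jpg')
  File.write(src, 'placeholder data')
  path, results = unpack(CopyOperation.new.run(src, dir, src))
  puts "#{File.basename(path)} success=#{results.success}"
end
```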
326
+ ==== Captured data tags
327
+
328
+ Captured data tags can be used to
329
+ {filter captured data}[rdoc-ref:FilePipeline::VersionedFile#captured_data_with]
330
+ accumulated during successive file operations.
331
+
332
+ Operations that return data as part of the results _should_ respond to
333
+ <tt>:captured_data_tag</tt> and return one of the
334
+ {tag constants}[rdoc-ref:FilePipeline::FileOperations::CapturedDataTags].
335
+
336
+ ===== Example
337
+
338
+ # returns NO_DATA
339
+ def captured_data_tag
340
+ CapturedDataTags::NO_DATA
341
+ end
342
+
343
+ === Subclassing FileOperation
344
+
345
+ The {FileOperation}[rdoc-ref:FilePipeline::FileOperations::FileOperation] class
346
+ is an abstract superclass that provides a scaffold to facilitate the creation of
347
+ file operations that conform to the requirements.
348
+
349
+ It implements a
350
+ {#run}[rdoc-ref:FilePipeline::FileOperations::FileOperation#run] method, that
351
+ takes the required three arguments and returns the path to the newly created
352
+ file and a Results object.
353
+
354
+ When the operation was successful,
355
+ {success}[rdoc-ref:FilePipeline::FileOperations::Results#success] will be
356
+ +true+. When an exception was raised, that exception will be rescued and returned
357
+ as the {log}[rdoc-ref:FilePipeline::FileOperations::Results#log], and
358
+ {success}[rdoc-ref:FilePipeline::FileOperations::Results#success] will be
359
+ +false+.
360
+
361
+ The standard <tt>#run</tt> method of the FileOperation class does not contain
362
+ logic to perform the actual file operation, but will call an
363
+ {#operation method}[rdoc-label:label-The+operation+method] that _must_ be
364
+ defined in the subclass unless the subclass overrides the <tt>#run</tt> method.
365
+
366
+ The <tt>#run</tt> method will generate the new path that is passed to the
367
+ <tt>#operation</tt> method, and to which the latter will write the new
368
+ version of the file. The new file path will need an appropriate file type
369
+ extension. The default behavior is to assume that the extension will be the
370
+ same as for the file that was passed in as the basis from which the new
371
+ version will be created. If the operation will result in a different file
372
+ type, the subclass _should_ define a <tt>#target_extension</tt> method that
373
+ returns the appropriate file extension (see
374
+ {Target file extensions}[rdoc-label:label-Target+file+extensions]).
375
+
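The division of labor between <tt>#run</tt>, <tt>#operation</tt>, and <tt>#target_extension</tt> can be sketched with a stand-in base class. This is not the gem's implementation: +SketchOperation+, the +Results+ struct fields, the way the output file is named, and the convention that a +nil+ target extension means "keep the source extension" are all assumptions made for illustration.

```ruby
require 'tmpdir'

# Stand-in for the Results class; field names are assumptions.
Results = Struct.new(:operation, :success, :log)

# Hypothetical base class illustrating the behavior described above.
class SketchOperation
  attr_reader :options

  def initialize(options = {}, defaults = {})
    @options = defaults.merge(options)
  end

  # Generates the output path, delegates the work to #operation, and
  # converts any raised exception into a failed Results object.
  def run(src_file, directory, original = nil)
    extension = target_extension || File.extname(src_file)
    out_file = File.join(directory,
                         "#{File.basename(src_file, '.*')}#{extension}")
    log = operation(src_file, out_file, original)
    [out_file, Results.new(self, true, log)]
  rescue StandardError => e
    [out_file, Results.new(self, false, e)]
  end

  # nil means: keep the extension of the source file (sketch convention).
  def target_extension
    nil
  end
end

# Subclass that changes the file type and therefore the extension.
class ToTiff < SketchOperation
  def operation(src_file, out_file, _original)
    File.write(out_file, File.read(src_file)) # placeholder for a conversion
    ["wrote #{File.basename(out_file)}"]
  end

  def target_extension
    '.tiff'
  end
end

# Subclass whose operation always fails.
class Failing < SketchOperation
  def operation(*_args)
    raise ArgumentError, 'cannot process file'
  end
end

Dir.mktmpdir do |dir|
  src = File.join(dir, 'image.jpg')
  File.write(src, 'pixels')
  _path, ok = ToTiff.new.run(src, dir, src)
  _path, failed = Failing.new.run(src, dir, src)
  puts "ToTiff: #{ok.success}, Failing: #{failed.success}"
end
```

Keeping the rescue in <tt>#run</tt> means subclasses only have to raise on failure; they do not need to build failure results themselves.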
376
+ ==== Initializer
377
+
378
+ The +initialize+ method _must_ take an +options+ argument (a hash with a
379
+ default value or a <em>double splat</em>).
380
+
381
+ ===== Options and defaults
382
+
383
+ The initializer can call +super+ and pass the +options+ hash and any
384
+ defaults (a hash with default options). This will update the defaults with
385
+ the actual options passed to +initialize+ and assign them to the
386
+ {#options}[rdoc-ref:FilePipeline::FileOperations::FileOperation#options]
387
+ attribute.
388
+
389
+ If the initializer does not call +super+, it _must_ assign the options to
390
+ the <tt>@options</tt> instance variable or expose them through an
391
+ <tt>#options</tt> getter method.
392
+
393
+ If it calls +super+ but must ensure some options are always set to a
394
+ specific value, those should be set after the call to +super+.
395
+
396
+ ===== Examples
397
+
398
+ # initializer without defaults calling super
399
+ def initialize(**options)
400
+ super(options)
401
+ end
402
+
403
+ # initializer with defaults calling super
404
+ def initialize(**options)
405
+ defaults = { :option_a => true, :option_b => false }
406
+ super(options, defaults)
407
+ end
408
+
409
+ # initializer with defaults calling super, ensures :option_c => true
410
+ def initialize(**options)
411
+ defaults = { :option_a => true, :option_b => false }
412
+ super(options, defaults)
413
+ @options[:option_c] = true
414
+ end
415
+
416
+ # initializer that does not call super
417
+ def initialize(**options)
418
+ @options = options
419
+ end
420
+
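How defaults, passed options, and enforced options interact can be checked with a small stand-alone sketch. +BaseOperation+ is a hypothetical stand-in for the superclass initializer (assumed here to merge the passed options over the defaults, as described above), and +Resize+ with its option names is invented for the example.

```ruby
# Hypothetical stand-in for the superclass; the real
# FileOperation#initialize may differ in detail.
class BaseOperation
  attr_reader :options

  # Passed options take precedence over defaults.
  def initialize(options = {}, defaults = {})
    @options = defaults.merge(options)
  end
end

# Invented subclass with two defaults and one enforced option.
class Resize < BaseOperation
  def initialize(**options)
    defaults = { :width => 1024, :height => 768 }
    super(options, defaults)
    @options[:keep_ratio] = true # set after super, so it always wins
  end
end

resize = Resize.new(:width => 640, :keep_ratio => false)
resize.options
# => { :width => 640, :height => 768, :keep_ratio => true }
```

The passed <tt>:width</tt> replaces the default, the untouched <tt>:height</tt> default survives, and the caller's <tt>:keep_ratio => false</tt> is overruled because the assignment happens after the call to +super+.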
421
+ ==== The <tt>operation</tt> method
422
+
423
+ The <tt>#operation</tt> method contains the logic specific to a given
424
+ subclass of FileOperation and must be defined in that subclass unless the
425
+ <tt>#run</tt> method is overridden.
426
+
427
+ ===== Arguments
428
+
429
+ The <tt>#operation</tt> method must accept three arguments:
430
+
431
+ * the path to the <em>file to be modified</em>
432
+ * the path for the <em>file to be created</em> by the operation.
433
+ * the path to the <em>original file</em>, from which the first version in a
434
+ succession of modified versions has been created.
435
+
436
+ The <em>original file</em> will only be used by file operations that require
437
+ it for reference, e.g. to restore file metadata that was compromised by
438
+ other file operations.
439
+
440
+ ===== Return Value
441
+
442
+ The method _can_ return anything that can be interpreted by
443
+ {LogDataParser}[rdoc-ref:FilePipeline::FileOperations::LogDataParser],
444
+ including nothing.
445
+
446
+ It will usually return any log output that the logic of <tt>#operation</tt>
447
+ has generated, and/or data captured. If data is captured that is to be used
448
+ later, the subclass should override the <tt>#captured_data_tag</tt> method to
449
+ return the appropriate
450
+ {tag constant}[rdoc-ref:FilePipeline::FileOperations::CapturedDataTags].
451
+
452
+ ===== Examples
453
+
454
+ # creates out_file based on src_file, captures metadata differences
455
+ # between out_file and original, returns log messages and captured data
456
+ def operation(src_file, out_file, original)
457
+ captured_data = {}
458
+ log_messages = []
459
+
460
+ # write the new version based on src_file to out_file
461
+ # compare metadata of out_file with original, store any differences
462
+ # in captured_data and append any log messages to log_messages
463
+
464
+ [log_messages, captured_data]
465
+ end
466
+
467
+ # takes the third argument for the original file but does not use it
468
+ # creates out_file based on src_file, returns log messages
469
+ def operation(src_file, out_file, _)
471
+ log_messages = []
472
+
473
+ # write the new version based on src_file to out_file
474
+
475
+ log_messages
476
+ end
477
+
478
+ # takes arguments as a splat and destructures them to avoid having the
479
+ # unused third argument
480
+ # creates out_file based on src_file, returns nothing
481
+ def operation(*args)
482
+ src_file, out_file = args
483
+
484
+ # write the new version based on src_file to out_file
485
+
486
+ return
487
+ end
488
+
489
+ ==== Target file extensions
490
+
491
+ If the file that the operation creates is of a different type than the file
492
+ the version is based upon, the class _must_ define the
493
+ <tt>#target_extension</tt> method that returns the appropriate file type
494
+ extension.
495
+
496
+ In most cases, the resulting file type will be predictable (static), and in
497
+ such cases, the method can just return a string with the extension.
498
+
499
+ An alternative would be to provide the expected extension as an option
500
+ to the initializer.
501
+
502
+ ===== Examples
503
+
504
+ # always returns '.tiff'
505
+ def target_extension
506
+ '.tiff'
507
+ end
508
+
509
+ # returns the extension specified in #options +:extension+
510
+ # my_operation = MyOperation.new(:extension => '.dng')
511
+ def target_extension
512
+ options[:extension]
513
+ end
metadata CHANGED
@@ -1,7 +1,7 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: file_pipeline
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.1
4
+ version: 0.0.2
5
5
  platform: ruby
6
6
  authors:
7
7
  - Martin Stein
@@ -38,227 +38,15 @@ dependencies:
38
38
  - - "~>"
39
39
  - !ruby/object:Gem::Version
40
40
  version: 2.0.14
41
- description: "= file_pipeline\n\nThe <tt>file_pipeline</tt> gem provides a framework
42
- for nondestructive \napplication of file operation batches to files.\n\n== Installation\n\n
43
- \ gem install file_pipeline\n\n== Dependencies\n\nThe file operations included in
44
- the gem require \n{ruby-vips}[https://github.com/libvips/ruby-vips] for image manipulation
45
- and\n{multi_exiftool}[https://github.com/janfri/multi_exiftool] for image file\nmetdata
46
- extraction and manipulation.\n\nWhile these dependencies should be installed automatically
47
- with the gem, \nruby-vips depends on {libvips}[https://libvips.github.io/libvips/],
48
- and\nmulti_exiftool depends on\n{Exiftool}[https://www.sno.phy.queensu.ca/~phil/exiftool/],
49
- which will not be\ninstalled automatically.\n\n== Usage\n\nThe basic usage is to
50
- create a new FilePipeline::Pipeline object and define any\nfile operations that
51
- are to be performed, apply it to a \nFilePipeline::VersionedFile object initialized
52
- with the image to be processed,\nthen finalize the versioned file.\n\n require
53
- 'file_pipeline'\n\n # create a new instance of Pipeline\n my_pipeline = FilePipeline::Pipeline.new\n\n
54
- \ # configure an operation to scale an image to 1280 x 960 pixels\n my_pipeline.define_operation('scale',
55
- :width => 1280, :height => 960)\n\n # create an instance of VersionedFile for the
56
- file '~/image.jpg'\n image = FilePipeline::VersionedFile.new('~/image.jpg')\n\n
57
- \ # apply the pipeline to the versioned file\n my_pipeline.apply_to(image)\n\n
58
- \ # finalize the versioned file, replacing the original\n image.finalize(:overwrite
59
- => true)\n\n=== Setting up a Pipeline\n\nPipeline objects can be set up to contain
60
- default file operations included in\nthe gem or with custom file operations (see\n{Custom
61
- file operations}[rdoc-label:label-Custom+file+operations] for\ninstructions on how
62
- to create custom operations).\n\n==== Basic set up with default operations\n\nTo
63
- define an operation, pass the class name of the operation in underscore\nnotation
64
- and without the containing module name, and any options to\n{#define_operation}[rdoc-ref:FilePipeline::Pipeline#define_operation].\n\nThe
65
- example below adds an instance of \nPtiffConversion[rdoc-ref:FilePipeline::FileOperations::PtiffConversion]
66
- with\nthe <tt>:tile_width</tt> and <tt>:tile_height</tt> options each set to 64\npixels.\n\n
67
- \ my_pipeline = FilePipeline::Pipeline.new\n my_pipeline.define_operation('ptiff_conversion',\n
68
- \ :tile_width => 64, :tile_height => 64)\n\nChaining
69
- is possible\n\n my_pipeline = FilePipeline::Pipeline.new\n my_pipeline.define_operation('scale',
70
- :width => 1280, :height => 1024)\n .define_operation('exif_restoration')\n\nAlternatively,
71
- operations can be defined during initialization by passing a\nblock to {#new}[rdoc-ref:FilePipeline::Pipeline.new].\n\n
72
- \ my_pipeline = FilePipeline::Pipeline.new do |pipeline|\n pipeline.define_operation('scale',
73
- :width => 1280, :height => 1024)\n pipeline.define_operation('exif_restoration')\n
74
- \ end\n\nWhen using the default operations included in the gem, it is sufficient
75
- to\ncall <tt>#define_operation</tt> with the desired operations and options.\n\n====
76
- Using custom file operations\n\nWhen file operations are to be used that are not
77
- included in the gem, place\nthe source files for the class definitions in one or
78
- more directories and\ninitialize the Pipeline object with the directory paths. The
79
- directories will\nbe added to the {source directories}[rdoc-ref:FilePipeline.source_directories].\n\nDirectories
80
- are added to the source directories in reverse order, so that\ndirectories added
81
- later will have precedence when searching source files. The\ndefault operations
82
- directory included in the gem will be searched last. This\nallows overriding of
83
- operations without changing the code in existing classes.\n\nIf, for example, there
84
- are two directories with custom file operation classes,\n<tt>'~/custom_operations'</tt>
85
- and <tt>'~/other_operations'</tt>, the new\ninstance of Pipeline can be set up to
86
- look for source files first in\n<tt>'~/other_operations'</tt>, then in <tt>'~/custom_operations'</tt>,
87
- and\nfinally in the included default operations.\n\nThe basename for source files
88
- _must_ be the class name in underscore notation\nwithout the containing module name.
89
- If, for example, the operation is\n<tt>FileOperations::MyOperation</tt>, the source
90
- file basename should be\n<tt>'my_operation.rb'</tt>\n\n my_pipeline = FilePipeline::Pipeline.new('~/custom_operations',\n
91
- \ '~/other_operations')\n my_pipeline.define_operation('my_operation')\n\nSee
92
- {Custom file operations}[rdoc-label:label-Custom+file+operations] for \ninstructions
93
- for how to write file operations.\n\n=== Nondestructive application to files\n\nPipelines[rdoc-ref:FilePipeline::Pipeline]
94
- work on \n{versioned files}[rdoc-ref:FilePipeline::VersionedFile], which allow for\nnon-destructive
95
- application of all file operations.\n\nTo create a versioned file, initialize an
96
- instance with the path to the \noriginal file:\n\n # create an instance of VersionedFile
97
- for the file '~/image.jpg'\n image = FilePipeline::VersionedFile.new('~/image.jpg')\n\nAs
98
- long as no operations have been applied, this will have no effect in the\nfile system.
99
- Only when the first operation is applied will VersionedFile\ncreate a {working directory}[rdoc-ref:FilePipeline::VersionedFile#directory]
100
- in\nthe same directory as the original file. The working directory will have the\nname
101
- of the file basename without extension and the suffix <tt>'_versions'</tt>\nadded.\n\nPipelines
102
- can be applied to a singe versioned file with the\nthe {#apply_to}[rdoc-ref:FilePipeline::Pipeline#apply_to]
103
- method of the pipeline\ninstance, or to an array of versioned files with the\n{#batch_apply}[rdoc-ref:FilePipeline::Pipeline#batch_apply]
104
- method of the \npipeline instance.\n\n=== Accessing file metadata and captured data.\n\n*Limitations*:
105
- this currently only works for _Exif_ metadata of image files.\n\nVersionedFile provides
106
- access to a files metadata via the\n{#metadata}[rdoc-ref:FilePipeline::VersionedFile#metadata]
107
- method of the \nversioned file instance.\n\nBoth the metadata for the original file
108
- and the current (latest) version can\nbe accessed:\n\n image = FilePipeline::VersionedFile.new('~/image.jpg')\n\n
109
- \ # access the metadata for the current version\n image.metadata\n\nNote that if
110
- no file operations have been applied by a pipeline object, this\nwill return the
111
- metadata for the original, which in that case is the current\n(latest) version.\n\nTo
112
- explicitly get the metadata for the original file even if there are newer\nversions
113
- available, pass the <tt>:for_version</tt> option with the symbol\n<tt>:original</tt>:\n\n
114
- \ # access the metadata for the original file\n image.metadata(:for_version =>
115
- :original)\n\nSome file operations can comprise metadata; many image processing
116
- libraries\nwill not preserve all _Exif_ tags and their values when converting images
117
- to\na different format, but only write a small subset of tags to the file they\ncreate.
118
- In these cases, the \n{ExifRestoration}[rdoc-ref:FilePipeline::FileOperations::ExifRestoration]\noperation
119
- can be used to try to restore the tags that have been discarded, but\nit can not
120
- write all tags. It will store all tags and their values that it could \nnot write
121
- back to the file and return them as captured data.\n\nLikewise, if the \n{ExifRedaction}[rdoc-ref:FilePipeline::FileOperations::ExifRedaction]
122
- is applied\nto delete sensitive tags (e.g. GPS location data), it will return all
123
- deleted \nexif tags and their values as captured data.\n\nThe\n{#recovered_metadata}[rdoc-ref:FilePipeline::VersionedFile#recovered_metadata]\nof
124
- the versioned file instance will return a hash with all metadata that was\ndeleted
125
- or could not be restored by operations:\n\n delete_tags = ['CreatorTool', 'Software']\n\n
126
- \ my_pipeline = FilePipeline::Pipeline.new do |pipeline|\n pipeline.define_operation('scale',
127
- width: 1280, height: 1024)\n pipeline.define_operation('exif_restoration')\n
128
- \ Pipeline.define_operation('exif_redaction', :redact_tags => delete_tags)\n end\n\n
129
- \ image = FilePipeline::VersionedFile.new('~/image.jpg')\n my_pipeline.apply_to(image)\n\n
130
- \ image.recovered_metadata\n\nFor information on retrieving other kinds of captured
131
- data, refer to\nthe versioned file instance methods\n{#captured_data}[rdoc-ref:FilePipeline::VersionedFile#captured_data],\n{#captured_data_for}[rdoc-ref:FilePipeline::VersionedFile#captured_data_for],
132
- \nand\n{#captured_data_with}[rdoc-ref:FilePipeline::VersionedFile#captured_data_with].\n\n===
133
- Finalizing files\n\nOnce all file operations of a pipeline object have been applied
134
- to a\nversioned file object, it can be finalized by calling the\n{#finalize}[rdoc-ref:FilePipeline::VersionedFile#finalize]
135
- method of the\ninstance.\n\nFinalization will write the current version to the same
136
- directory that\ncontains the original. It will by default preserve the original
137
- by adding\na suffix to the basename of the final version. If the <tt>:overwrite</tt>\noption
138
- for the method is passed with +true+, it will delete the original and\nwrite the
139
- final version to the same basename as the original.\n\n image = FilePipeline::VersionedFile.new('~/image.jpg')\n\n
140
- \ # finalize the versioned file, preserving the original\n image.finalize\n\n #
141
- finalize the versioned file, replacing the original\n image.finalize(:overwrite
142
- => true)\n\nThe work directory with all other versions will be deleted after the
143
- final\nversion has been written.\n\n== Custom file operations\n\n=== Module nesting\n\nFile
144
- operation classes _must_ be defined in the FilePipeline::FileOperations\nmodule
145
- for {automatic requiring}[rdoc-ref:FilePipeline.load] of source files to\nwork.\n\n===
146
- Implementing from scratch\n\n==== Initializer\n\nThe <tt>#initialize</tt> method
147
- _must_ take an +options+ argument (a hash\nwith a default value, or a <em>double
148
- splat</em>) and _must_ be exposed\nthrough an <tt>#options</tt> getter method.\n\nThe
149
- options passed can be any that the file operation needs to properly configure\na specific
150
- instance of the operation.\n\nThis requirement is imposed by the \n{#define_operation}[rdoc-ref:FilePipeline::Pipeline#define_operation]
151
- instance\nmethod of Pipeline, which will automatically load and initialize an instance
152
- of\nthe file operation with any options provided as a hash.\n\n===== Examples\n\n
153
- \ class MyOperation\n attr_reader :options\n\n # initializer with a default\n
154
- \ def initialize(options = {})\n @options = options\n end\n end\n\n class
155
- MyOperation\n attr_reader :options\n\n # initializer with a double splat\n
156
- \ def initialize(**options)\n @options = options\n end\n end\n\nConsider
157
- a file operation +CopyrightNotice+ that will add copyright\ninformation to an image
158
- file's _Exif_ metadata, the value for the copyright\ntag could be passed as an option.\n\n
159
- copyright_notice = CopyrightNotice.new(:copyright => 'The Photographer')\n\n====
160
- The <tt>run</tt> method\n\nFile operations _must_ implement a <tt>#run</tt> method
161
- that takes three\narguments (or a _splat_) in order to be used in a Pipeline.\n\n=====
162
- Arguments\n\nThe three arguments required for implementations of <tt>#run</tt> are:\n*
163
- the path to the <em>file to be modified</em>\n* the path to the _directory_ to which
164
- new files will be saved.\n* the path to the <em>original file</em>, from which the
165
- first version in a\n succession of modified versions has been created.\n\nThe <em>original
166
- file</em> will only be used by file operations that require\nit for reference, e.g.
167
- to restore file metadata that was compromised by\nother file operations.\n\n=====
168
- Return value\n\nThe method _must_ return the path to the file that was created by
169
- the\noperation (preferably in the _directory_). It _may_ also return a \n{Results}[rdoc-ref:FilePipeline::FileOperations::Results]
170
- object, containing the\noperation itself, a _success_ flag (+true+ or +false+),
171
- and any logs or data\nreturned by the operation.\n\nIf results are returned with
172
- the path to the created file, both values must\nbe wrapped in an array, with the
173
- path as the first element, the results as\nthe second.\n\n===== Example\n\n def
174
- run(src_file, directory, original)\n # make a path to which the created file
175
- will be written\n out_file = File.join(directory, 'new_file_name.extension')\n\n
176
- \ # create a Results object reporting success with no logs or data\n results
177
- = Results.new(self, true, nil)\n\n # create a new out_file based on src_file
178
- in directory\n # ...\n\n # return the path to the new file and the results
179
- object\n [out_file, results]\n end\n\n==== Captured data tags\n\nCaptured data
180
- tags can be used to\n{filter captured data}[rdoc-ref:FilePipeline::VersionedFile#captured_data_with]
181
- \naccumulated during successive file operations.\n\nOperations that return data
182
- as part of the results _should_ respond to\n<tt>:captured_data_tag</tt> and return
183
- one of the \n{tag constants}[rdoc-ref:FilePipeline::FileOperations::CapturedDataTags].\n\n=====
184
- Example\n\n # returns NO_DATA\n def captured_data_tag\n CapturedDataTags::NO_DATA\n
185
- \ end\n\n=== Subclassing FileOperation\n\nThe {FileOperation}[rdoc-ref:FilePipeline::FileOperations::FileOperation]
186
- class\nis an abstract superclass that provides a scaffold to facilitate the creation
187
- of\nfile operations that conform to the requirements.\n\nIt implements a \n{#run}[rdoc-ref:FilePipeline::FileOperations::FileOperation#run]
188
- method, that\ntakes the required three arguments and returns the path to the newly
189
- created\nfile and a Results object.\n\nWhen the operation was successful,\n{success}[rdoc-ref:FilePipeline::FileOperations::Results#success]
190
- will be\n+true+. When an exception was raised, that exception will be rescued and
191
- returned\nas the {log}[rdoc-ref:FilePipeline::FileOperations::Results#log], and\n{success}[rdoc-ref:FilePipeline::FileOperations::Results#success]
192
- will be \n+false+.\n\nThe standard <tt>#run</tt> method of the FileOperation class
193
- does not contain\nlogic to perform the actual file operation, but will call an \n{#operation
194
- method}[rdoc-label:label-The+operation+method] that _must_ be\ndefined in the subclass
195
- unless the subclass overrides the <tt>#run</tt> method.\n\nThe <tt>#run</tt> method
196
- will generate the new path that is passed to the\n<tt>#operation</tt> method, and
197
- to which the latter will write the new\nversion of the file. The new file path will
198
- need an appropriate file type\nextension. The default behavior is to assume that
199
- the extension will be the\nsame as for the file that was passed in as the basis
200
- from which the new\nversion will be created. If the operation will result in a different
201
- file\ntype, the subclass _should_ define a <tt>#target_extension</tt> method that\nreturns
202
- the appropriate file extension (see\n{Target file extensions}[rdoc-label:label-Target+file+extensions]).\n\n====
203
- Initializer\n\nThe +initialize+ method _must_ take an +options+ argument (a hash
204
- with a\ndefault value or a <em>double splat</em>).\n\n===== Options and defaults\n\nThe
205
- initializer can call +super+ and pass the +options+ hash and any\ndefaults (a hash
206
- with default options). This will update the defaults with\nthe actual options passed
207
- to +initialize+ and assign them to the\n{#options}[rdoc-ref:FilePipeline::FileOperations::FileOperation#options]
208
- \nattribute.\n\nIf the initializer does not call +super+, it _must_ assign the options
209
- to\nthe <tt>@options</tt> instance variable or expose them through an\n<tt>#options</tt>
210
- getter method.\n\nIf it calls +super+ but must ensure some options are always set
211
- to a\nspecific value, those should be set after the call to +super+.\n\n===== Examples\n\n
212
- \ # initializer without defaults calling super\n def initialize(**options)\n super(options)\n
213
- \ end\n\n # initializer with defaults calling super\n def initialize(**options)\n
214
- \ defaults = { :option_a => true, :option_b => false }\n super(options, defaults)\n
215
- \ end\n\n # initializer with defaults calling super, ensures :option_c => true\n
216
- \ def initialize(**options)\n defaults = { :option_a => true, :option_b => false
217
- }\n super(options, defaults)\n @options[:option_c] = true\n end\n\n # initializer
218
- that does not call super\n def initialize(**options)\n @options = options\n
219
- \ end\n\n==== The <tt>operation</tt> method\n\nThe <tt>#operation</tt> method contains
220
- the logic specific to a given\nsubclass of FileOperation and must be defined in
221
- that subclass unless the\n<tt>#run</tt> method is overridden.\n\n===== Arguments\n\nThe
222
- <tt>#operation</tt> method must accept three arguments:\n\n* the path to the <em>file
223
- to be modified</em>\n* the path for the <em>file to be created</em> by the operation.\n*
224
- the path to the <em>original file</em>, from which the first version in a\n succession
225
- of modified versions has been created.\n\nThe <em>original file</em> will only be
226
- used by file operations that require\nit for reference, e.g. to restore file metadata
227
- that was compromised by\nother file operations.\n\n===== Return Value\n\nThe method
228
- _can_ return anything that can be interpreted by\n{LogDataParser}[rdoc-ref:FilePipeline::FileOperations::LogDataParser],\nincluding
229
- nothing.\n\nIt will usually return any log output that the logic of <tt>#operation</tt>\nhas
230
- generated, and/or data captured. If data is captured that is to be used\nlater,
231
- the subclass should override the <tt>#captured_data_tag</tt> method to\nreturn the
232
- appropriate \n{tag constant}[rdoc-ref:FilePipeline::FileOperations::CapturedDataTags].\n\n=====
233
- Examples\n\n # creates out_file based on src_file, captures metadata differences\n
234
- \ # between out_file and original, returns log messages and captured data\n def
235
- operation(src_file, out_file, original)\n captured_data = {}\n log_messages
236
- = []\n\n # write the new version based on src_file to out_file\n # compare
237
- metadata of out_file with original, store any differences\n # in captured_data
238
- and append any log messages to log_messages\n\n [log_messages, captured_data]\n
239
- \ end\n\n # takes the third argument for the original file but does not use it\n
240
- \ # creates out_file based on src_file, returns log messages\n def operation(src_file,
241
- out_file, _)\n log_messages = []\n\n # write
242
- the new version based on src_file to out_file\n\n log_messages\n end\n\n #
243
- takes arguments as a splat and destructures them to avoid having the\n # unused
244
- third argument\n # creates out_file based on src_file, returns nothing\n def operation(*args)\n
245
- \ src_file, out_file = args\n\n # write the new version based on src_file to
246
- out_file\n\n return\n end\n\n==== Target file extensions\n\nIf the file that
247
- the operation creates is of a different type than the file\nthe version is based
248
- upon, the class _must_ define the\n<tt>#target_extension</tt> method that returns
249
- the appropriate file type\nextension.\n\nIn most cases, the resulting file type
250
- will be predictable (static), and in\nsuch cases, the method can just return a string
251
- with the extension.\n\nAn alternative would be to provide the expected extension
252
- as an #option\nto the initializer.\n\n===== Examples\n\n # always returns '.tiff'\n
253
- \ def target_extension\n '.tiff'\n end\n\n # returns the extension specified
254
- in #options +:extension+\n # my_operation = MyOperation.new(:extension => '.dng')\n
255
- \ def target_extension\n options[:extension]\n end\n"
41
+ description: The <tt>file_pipeline</tt> gem provides a framework for nondestructive
42
+ application of file operation batches to files.
256
43
  email: loveablelobster@fastmail.fm
257
44
  executables: []
258
45
  extensions: []
259
46
  extra_rdoc_files: []
260
47
  files:
261
48
  - LICENSE
49
+ - README.rdoc
262
50
  - lib/file_pipeline.rb
263
51
  - lib/file_pipeline/errors.rb
264
52
  - lib/file_pipeline/errors/failed_modification_error.rb