dicom 0.9.4 → 0.9.5

@@ -0,0 +1,7 @@
+ ---
+ SHA1:
+   metadata.gz: 39b5531cc2e35e76e34a22584cd3543462c371c6
+   data.tar.gz: d4c4309d054a1c2495c746956c5aa5a67536edcb
+ SHA512:
+   metadata.gz: 68a07acd231106b00284009a3799621f44fa050944ea0076367545f807b76e5b57c1771cca985879951c80f0004ada6615e9c2a7bd9170d07fffe6dae62a57d1
+   data.tar.gz: 28f1655dea19ba8530a4f6aacaf307e7734ea153f0a546248289b43ccf949c8734179bb768db8bc57418ce25e1e112e918099bdfc03023d01b4110e0f36d0d2c
@@ -1,3 +1,47 @@
+ = 0.9.5
+
+ === 26th March, 2013
+
+ * DICOM module:
+ * Use hyphen instead of underscore in ruby-dicom application title.
+ * Added the DICOM load module method, a flexible method for loading DICOM data:
+ * Accepts one or more directories (in which all files are scanned).
+ * Accepts one or more file paths (which are loaded as DICOM objects).
+ * Accepts one or more DICOM objects, which are called with #to_dcm and returned.
+ * Network:
+ * Bind the DServer to '0.0.0.0' by default to avoid ECONNREFUSED issues.
+ * Significantly improved network receive performance.
+ * Enable adding private UIDs to ruby-dicom's DICOM library.
+ * Read element and UID dictionaries with UTF-8 encoding.
+ * DObject (/Item):
+ * Added DObject#anonymize for easy anonymization of a single DICOM object.
+ * Added add_element and add_sequence methods for conveniently creating new elements/sequences belonging to a specific DObject or Item.
+ * Fixed an issue where the NArray library was needed when trying to pass an array to the pixels= method.
+ * Fixed an issue where both Magick libraries were needed when trying to pass an image object to the image= method.
+ * Added DObject#was_dcm_on_input attribute to distinguish between file and DObject elements given to DICOM::load.
+ * Added option :include_empty_parents to DObject#write.
+ * Removed the deprecated DObject#write :add_meta option.
+ * Added DObject#source attribute for keeping track of a DICOM object's origin.
+ * Defaults to ignoring duplicate data elements and sequences instead of replacing the original element.
+ * Added the :overwrite option to DObject#read and #parse which makes ruby-dicom overwrite elements when duplicates are encountered.
+ * Anonymizer:
+ * Added Anonymizer#to_anonymizer, as well as equality and state methods.
+ * Use 'O' instead of 'N' as the default replacement value for Patient's Sex.
+ * Added :random_file_name option for increased security in the case of sensitive file name information.
+ * Make all Anonymizer attributes accessible through options with Anonymizer#new.
+ * Removed the deprecated identity_file feature.
+ * Deprecated Anonymizer#execute, along with its accompanying methods.
+ * Added the Anonymizer#anonymize method, which is intended to replace the execute method.
+ * Improved the Anonymizer's conformance with the guidelines in the DICOM standard:
+ * Delete the File Meta Information group (0002) on anonymization.
+ * Add the Patient Identity Removed element with value 'YES'.
+ * Add de-identification method code sequence.
+ * Added 7 new elements to the default anonymization list.
+ * Enabled complete remapping of UIDs in the Anonymizer (keeping all references between files/series/studies valid).
+ * Added Anonymizer :recursive option, for anonymizing entire element trees (not just the top level).
+ * Added Anonymizer :encryption option, which allows the simultaneous preservation of privacy and key/value relations in the audit file.
+
+
  = 0.9.4
 
  === 10th September, 2012
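
The duplicate-handling entries in the 0.9.5 notes (ignore duplicates by default, replace them when :overwrite is given) can be pictured with a small sketch. This is not ruby-dicom code: `collect_elements` and its plain Hash store are hypothetical stand-ins for the parser, used only to illustrate the two policies.

```ruby
# Hypothetical stand-in for a parser's element store (an assumption for
# illustration, not the ruby-dicom implementation).
def collect_elements(pairs, overwrite = false)
  elements = {}
  pairs.each do |tag, value|
    # Keep the first occurrence of a tag unless overwriting was requested.
    next if elements.key?(tag) && !overwrite
    elements[tag] = value
  end
  elements
end

# The same tag appears twice in the parsed stream.
parsed = [['0010,0010', 'First'], ['0010,0020', '123'], ['0010,0010', 'Second']]
default_result   = collect_elements(parsed)        # first occurrence is kept
overwrite_result = collect_elements(parsed, true)  # last occurrence wins
```

Under the default policy the first value seen for a tag survives; with overwriting enabled the last one does.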
@@ -0,0 +1,83 @@
+ = Contributing Code
+
+ So you want to contribute to ruby-dicom? That's great! Thank you! If you are a
+ first-time committer, however, you must take a moment to read these instructions.
+
+ The preferred method, by far, is to convey your code contribution in the form
+ of a pull request on {github}[https://github.com/dicom/ruby-dicom].
+
+ == Committer's Recipe
+
+ * Fork the repository (for bonus points, use a topical branch name).
+ * Execute the specification (rspec tests) to verify that all spec examples pass
+   (if in doubt, check out the rakefile for instructions).
+ * Add a spec example for your change. Only refactoring and documentation changes
+   require no new tests. If you are adding functionality or fixing a bug, a test is mandatory!
+ * Alter the code to make the new spec example(s) pass.
+ * Keep it simple! One issue per pull request. Mixing two or more independent
+   issues in the same pull request will complicate the review of your request and
+   may result in a rejection (even if independent parts of the commit are sound).
+ * Don't modify the version or changelog.
+ * Push to your fork and submit a pull request.
+ * Wait for feedback (this shouldn't take too long). The pull request may be accepted right
+   away, it may be rejected (with a reason specified), or it may spark a discussion where
+   changes/improvements are suggested in order for the pull request to be accepted.
+
+ == Guidelines
+
+ In order to increase the chances of your pull request being accepted,
+ please follow the project's coding guidelines. Ideally, your contribution must not
+ add {technical debt}[http://en.wikipedia.org/wiki/Technical_debt] to the project.
+
+ * Provide thorough documentation. It should follow the format used by this project
+   and give information on parameters, exceptions and return values where relevant.
+   Provide examples for non-trivial use cases.
+ * Read the excellent {Github Ruby Styleguide}[https://github.com/styleguide/ruby]
+   if you are new to collaborative Ruby development. Do note though, that we actually
+   don't follow all styles listed yet (perhaps we should?!).
+ * Some sample patterns of ours:
+ * Indentation: Two spaces (no tabs).
+ * No trailing whitespace. Blank lines should not have any space.
+ * Method parameters: my_method(my_arg) is preferred instead of my_method( my_arg ) or my_method my_arg
+ * Assignment: a = b and not a=b
+ * In general: Follow the conventions you see used in the source code already.
+
+ == Contribution Agreement
+
+ ruby-dicom is licensed under the {GPL v3}[http://www.gnu.org/licenses/gpl.html],
+ and to be in the best position to enforce the GPL, the copyright status of ruby-dicom
+ needs to be as simple as possible. To achieve this, contributors should only provide
+ contributions which are their own work, and either:
+
+ a) Assign the copyright on the contribution to myself, Christoffer Lervåg
+
+ or
+
+ b) Disclaim copyright on it and thus put it in the public domain
+
+ Copyright assignment (a) is the preferred and encouraged option for larger
+ code contributions, and is assumed unless otherwise specified.
+
+ Please see the {GNU FAQ}[http://www.gnu.org/licenses/gpl-faq.html#AssignCopyright]
+ for a fuller explanation of the need for this.
+
+ == Credits
+
+ All contributors are credited, with full name and link to their github account,
+ in the README file. If such an accreditation is not wanted (for whatever reason),
+ please let me know, either in the pull request or in private.
+
+
+ = Other ways to contribute
+
+ Don't want to get your hands dirty with source code and git? Don't worry,
+ there are other ways in which you can contribute to the project as well!
+
+ * Create an issue on github for feature requests or bug reports.
+ * Weigh in with your opinion on existing issues.
+ * Write a tutorial.
+ * Answer questions, or tell the community about your exciting ruby-dicom projects in the
+   {mailing list}[http://groups.google.com/group/ruby-dicom].
+ * Academic works: Properly reference ruby-dicom in your work and tell us about it.
+ * Spread the word: Tell your colleagues about ruby-dicom.
+ * Make a donation.
@@ -110,7 +110,7 @@ Example:
 
  == COPYRIGHT
 
- Copyright 2008-2012 Christoffer Lervåg
+ Copyright 2008-2013 Christoffer Lervåg
 
  This program is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
@@ -143,6 +143,7 @@ Please don't hesitate to email me if you have any feedback related to this proje
  * {Jeff Miller}[https://github.com/jeffmax]
  * {Donnie Millar}[https://github.com/dmillar]
  * {Björn Albers}[https://github.com/bjoernalbers]
- * {Lars Benner}[https://github.com/Maturin]
  * {Felix Petriconi}[https://github.com/FelixPetriconi]
  * {Steven Bedrick}[https://github.com/stevenbedrick]
+ * {Lars Benner}[https://github.com/Maturin]
+ * {Brett Goulder}[https://github.com/brettgoulder]
@@ -9,23 +9,22 @@ Gem::Specification.new do |s|
  s.date = Time.now
  s.summary = "Library for handling DICOM files and DICOM network communication."
  s.require_paths = ['lib']
- s.author = "Christoffer Lervag"
- s.email = "chris.lervag@gmail.com"
- s.homepage = "http://dicom.rubyforge.org/"
- s.license = "GPLv3"
+ s.author = 'Christoffer Lervag'
+ s.email = 'chris.lervag@gmail.com'
+ s.homepage = 'http://dicom.rubyforge.org/'
+ s.license = 'GPLv3'
  s.description = "DICOM is a standard widely used throughout the world to store and transfer medical image data. This library enables efficient and powerful handling of DICOM in Ruby, to the benefit of any student or professional who would like to use their favorite language to process DICOM files and communicate across the network."
  s.files = Dir["{lib}/**/*", "[A-Z]*"]
  s.rubyforge_project = 'dicom'
 
  s.required_ruby_version = '>= 1.9.2'
- s.required_rubygems_version = '>= 1.8.6'
 
- s.add_development_dependency('bundler', '>= 1.0.0')
- s.add_development_dependency('rake', '>= 0.9.2.2')
- s.add_development_dependency('rspec', '>= 2.9.0')
- s.add_development_dependency('mocha', '>= 0.10.5')
- s.add_development_dependency('narray', '>= 0.6.0.0')
- s.add_development_dependency('rmagick', '>= 2.12.0')
- s.add_development_dependency('mini_magick', '>= 3.2.1')
- s.add_development_dependency('yard', '>= 0.8.2')
+ s.add_development_dependency('bundler', '~> 1.3')
+ s.add_development_dependency('mocha', '~> 0.13')
+ s.add_development_dependency('mini_magick', '~> 3.5')
+ s.add_development_dependency('narray', '~> 0.6.0.8')
+ s.add_development_dependency('rake', '~> 0.9.6')
+ s.add_development_dependency('rmagick', '~> 2.13.2')
+ s.add_development_dependency('rspec', '~> 2.13')
+ s.add_development_dependency('yard', '~> 0.8.5')
  end
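
The gemspec hunk above replaces open-ended `>=` constraints with pessimistic `~>` constraints. The practical difference can be checked with RubyGems' own `Gem::Requirement` and `Gem::Version` classes. The `satisfies?` helper below is a hypothetical convenience wrapper, and the probed version numbers are illustrative, not versions the project is known to have tested against.

```ruby
require 'rubygems'

# Hypothetical helper: true if the version string satisfies the requirement string.
def satisfies?(requirement, version)
  Gem::Requirement.new(requirement).satisfied_by?(Gem::Version.new(version))
end

# Old style: any later release is accepted, including new major versions.
old_rake_accepts_major = satisfies?('>= 0.9.2.2', '10.0.0')

# New style: '~> 0.9.6' allows 0.9.6 and later patch releases, but not 0.10 or above.
new_rake_accepts_patch = satisfies?('~> 0.9.6', '0.9.7')
new_rake_accepts_major = satisfies?('~> 0.9.6', '10.0.0')

# With two segments, '~> 1.3' allows every 1.x release from 1.3 on, but not 2.0.
bundler_accepts_minor = satisfies?('~> 1.3', '1.9.0')
bundler_accepts_major = satisfies?('~> 1.3', '2.0.0')
```

The last segment of a `~>` constraint is the one allowed to grow, so the number of segments given decides how tightly a dependency is pinned.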
@@ -1,670 +1,649 @@
1
- module DICOM
2
-
3
- # This is a convenience class for handling the anonymization (de-identification) of DICOM files.
4
- #
5
- # @note
6
- # For 'advanced' anonymization, a good resource might be the work on "Clinical Trials
7
- # De-identification Profiles" by the DICOM Standards Committee, Working Group 18:
8
- # ftp://medical.nema.org/medical/dicom/supps/sup142_pc.pdf
9
- #
10
- class Anonymizer
11
- include Logging
12
-
13
- # An AuditTrail instance used for this anonymization (if specified).
14
- attr_reader :audit_trail
15
- # The file name used for the AuditTrail serialization (if specified).
16
- attr_reader :audit_trail_file
17
- # A boolean that, if set to true, causes all anonymized tags to be blanked instead of being given a generic value.
18
- attr_accessor :blank
19
- # A boolean that, if set to true, causes all anonymized tags to get enumerated values, enabling post-anonymization identification by the user.
20
- attr_accessor :enumeration
21
- # The identity file attribute.
22
- attr_reader :identity_file
23
- # A boolean that if set as true, will make the anonymization delete all private tags.
24
- attr_accessor :delete_private
25
- # The path where the anonymized files will be saved. If this value is not set, the original DICOM files will be overwritten.
26
- attr_accessor :write_path
27
- # A boolean indicating whether or not UIDs shall be replaced when executing the anonymization.
28
- attr_accessor :uid
29
- # The DICOM UID root to use when generating new UIDs.
30
- attr_accessor :uid_root
31
- # An array of UID tags that will be anonymized if the uid option is used.
32
- attr_accessor :uids
33
-
34
- # Creates an Anonymizer instance.
35
- #
36
- # @note To customize logging behaviour, refer to the Logging module documentation.
37
- # @param [Hash] options the options to create an anonymizer instance with
38
- # @option options [String] :audit_trail a file name path. If the file contains old audit data, these are loaded and used in the current anonymization.
39
- # @option options [Boolean] :uid if true, all (top level) UIDs will be replaced with custom generated UIDs. To preserve UID relations in studies/series, the AuditTrail feature must be used.
40
- # @option options [String] :uid_root an organization (or custom) UID root to use when replacing UIDs.
41
- # @example Create an Anonymizer instance and restrict the log output
42
- # a = Anonymizer.new
43
- # a.logger.level = Logger::ERROR
44
- # @example Perform anonymization using the audit trail feature
45
- # a = Anonymizer.new(:audit_trail => "trail.json")
46
- # a.enumeration = true
47
- # a.folder = "//dicom/today/"
48
- # a.write_path = "//anonymized/"
49
- # a.execute
50
- #
51
- def initialize(options={})
52
- # Default value of accessors:
53
- @blank = false
54
- @enumeration = false
55
- @delete_private = false
56
- # Array of folders to be processed for anonymization:
57
- @folders = Array.new
58
- # Folders that will be skipped:
59
- @exceptions = Array.new
60
- # Data elements which will be anonymized (the array will hold a list of tag strings):
61
- @tags = Array.new
62
- # Default values to use on anonymized data elements:
63
- @values = Array.new
64
- # Which data elements will have enumeration applied, if requested by the user:
65
- @enumerations = Array.new
66
- # We use a Hash to store information from DICOM files if enumeration is desired:
67
- @enum_old_hash = Hash.new
68
- @enum_new_hash = Hash.new
69
- # All the files to be anonymized will be put in this array:
70
- @files = Array.new
71
- # Write paths will be determined later and put in this array:
72
- @write_paths = Array.new
73
- # Register the uid anonymization option:
74
- @uid = options[:uid]
75
- # Set the uid_root to be used when anonymizing study_uid series_uid and sop_instance_uid
76
- @uid_root = options[:uid_root] ? options[:uid_root] : UID_ROOT
77
- # Setup audit trail if requested:
78
- if options[:audit_trail]
79
- @audit_trail_file = options[:audit_trail]
80
- if File.exists?(@audit_trail_file) && File.size(@audit_trail_file) > 2
81
- # Load the pre-existing audit trail from file:
82
- @audit_trail = AuditTrail.read(@audit_trail_file)
83
- else
84
- # Start from scratch with an empty audit trail:
85
- @audit_trail = AuditTrail.new
86
- end
87
- end
88
- # Set the default data elements to be anonymized:
89
- set_defaults
90
- end
91
-
92
- # Adds an exception folder which will be avoided when anonymizing.
93
- #
94
- # @param [String] path a path that will be avoided
95
- # @example Adding a folder
96
- # a.add_exception("/home/dicom/tutorials/")
97
- #
98
- def add_exception(path)
99
- raise ArgumentError, "Expected String, got #{path.class}." unless path.is_a?(String)
100
- if path
101
- # Remove last character if the path ends with a file separator:
102
- path.chop! if path[-1..-1] == File::SEPARATOR
103
- @exceptions << path
104
- end
105
- end
106
-
107
- # Adds a folder whose files will be anonymized.
108
- #
109
- # @param [String] path a path that will be included in the anonymization
110
- # @example Adding a folder
111
- # a.add_folder("/home/dicom")
112
- #
113
- def add_folder(path)
114
- raise ArgumentError, "Expected String, got #{path.class}." unless path.is_a?(String)
115
- @folders << path
116
- end
117
-
118
- # Checks the enumeration status of this tag.
119
- #
120
- # @param [String] tag a data element tag
121
- # @return [Boolean, NilClass] the enumeration status of the tag, or nil if the tag has no match
122
- #
123
- def enum(tag)
124
- raise ArgumentError, "Expected String, got #{tag.class}." unless tag.is_a?(String)
125
- raise ArgumentError, "Expected a valid tag of format 'GGGG,EEEE', got #{tag}." unless tag.tag?
126
- pos = @tags.index(tag)
127
- if pos
128
- return @enumerations[pos]
129
- else
130
- logger.warn("The specified tag (#{tag}) was not found in the list of tags to be anonymized.")
131
- return nil
132
- end
133
- end
134
-
135
- # Executes the anonymization process.
136
- #
137
- # This method is run when all settings have been finalized for the Anonymizer instance.
138
- #
139
- # @note Only top level data elements are anonymized!
140
- #
141
- def execute
142
- # FIXME: This method has grown a bit lengthy. Perhaps it should be looked at one day.
143
- # Search through the folders to gather all the files to be anonymized:
144
- logger.info("Initiating anonymization process.")
145
- start_time = Time.now.to_f
146
- logger.info("Searching for files...")
147
- load_files
148
- logger.info("Done.")
149
- if @files.length > 0
150
- if @tags.length > 0
151
- logger.info(@files.length.to_s + " files have been identified in the specified folder(s).")
152
- if @write_path
153
- # Determine the write paths, as anonymized files will be written to a separate location:
154
- logger.info("Processing write paths...")
155
- process_write_paths
156
- logger.info("Done")
157
- else
158
- # Overwriting old files:
159
- logger.warn("Separate write folder not specified. Existing DICOM files will be overwritten.")
160
- @write_paths = @files
161
- end
162
- # If the user wants enumeration, we need to prepare variables for storing
163
- # existing information associated with each tag:
164
- create_enum_hash if @enumeration
165
- # Start the read/update/write process:
166
- logger.info("Initiating read/update/write process. This may take some time...")
167
- # Monitor whether every file read/write was successful:
168
- all_read = true
169
- all_write = true
170
- files_written = 0
171
- files_failed_read = 0
172
- begin
173
- require 'progressbar'
174
- pbar = ProgressBar.new("Anonymizing", @files.length)
175
- rescue LoadError
176
- pbar = nil
177
- end
178
- # Temporarily increase the log threshold to suppress messages from the DObject class:
179
- anonymizer_level = logger.level
180
- logger.level = Logger::FATAL
181
- @files.each_index do |i|
182
- pbar.inc if pbar
183
- # Read existing file to DICOM object:
184
- dcm = DObject.read(@files[i])
185
- if dcm.read?
186
- # Anonymize the desired tags:
187
- @tags.each_index do |j|
188
- if dcm.exists?(@tags[j])
189
- element = dcm[@tags[j]]
190
- if element.is_a?(Element)
191
- if @blank
192
- value = ""
193
- elsif @enumeration
194
- old_value = element.value
195
- # Only launch enumeration logic if there is an actual value to the data element:
196
- if old_value
197
- value = enumerated_value(old_value, j)
198
- else
199
- value = ""
200
- end
201
- else
202
- # Use the value that has been set for this tag:
203
- value = @values[j]
204
- end
205
- element.value = value
206
- end
207
- end
208
- end
209
- # Handle UIDs if requested:
210
- replace_uids(dcm) if @uid
211
- # Delete private tags?
212
- dcm.delete_private if @delete_private
213
- # Delete Tags marked for removal:
214
- @delete_tags.each_index do |j|
215
- dcm.delete(@delete_tags[j]) if dcm.exists?(@delete_tags[j])
216
- end
217
- # Write DICOM file:
218
- dcm.write(@write_paths[i])
219
- if dcm.written?
220
- files_written += 1
221
- else
222
- all_write = false
223
- end
224
- else
225
- all_read = false
226
- files_failed_read += 1
227
- end
228
- end
229
- pbar.finish if pbar
230
- # Finished anonymizing files. Reset the log threshold:
231
- logger.level = anonymizer_level
232
- # Print elapsed time and status of anonymization:
233
- end_time = Time.now.to_f
234
- logger.info("Anonymization process completed!")
235
- if all_read
236
- logger.info("All files in the specified folder(s) were SUCCESSFULLY read to DICOM objects.")
237
- else
238
- logger.warn("Some files were NOT successfully read (#{files_failed_read} files). If some folder(s) contain non-DICOM files, this is expected.")
239
- end
240
- if all_write
241
- logger.info("All DICOM objects were SUCCESSFULLY written as DICOM files (#{files_written} files).")
242
- else
243
- logger.warn("Some DICOM objects were NOT successfully written to file. You are advised to investigate the result (#{files_written} files successfully written).")
244
- end
245
- @audit_trail.write(@audit_trail_file) if @audit_trail
246
- # Has user requested enumeration and specified an identity file in which to store the anonymized values?
247
- if @enumeration and @identity_file and !@audit_trail
248
- logger.info("Writing identity file.")
249
- write_identity_file
250
- logger.info("Done")
251
- end
252
- elapsed = (end_time-start_time).to_s
253
- logger.info("Elapsed time: #{elapsed[0..elapsed.index(".")+1]} seconds")
254
- else
255
- logger.warn("No tags were selected for anonymization. Aborting.")
256
- end
257
- else
258
- logger.warn("No files were found in specified folders. Aborting.")
259
- end
260
- end
261
-
262
- # Setter method for the identity file.
263
- #
264
- # @deprecated The identity file feature is deprecated!
265
- # Please use the AuditTrail feature instead.
266
- # @param [String] file_name the path of the identity file
267
- #
268
- def identity_file=(file_name)
269
- # Deprecation warning:
270
- logger.warn("The identity_file feature of the Anonymization class has been deprecated! Please use the AuditTrail feature instead.")
271
- @identity_file = file_name
272
- end
273
-
274
- # Prints to screen a list of which tags are currently selected for anonymization along with
275
- # the replacement values that will be used and enumeration status.
276
- #
277
- def print
278
- # Extract the string lengths which are needed to make the formatting nice:
279
- names = Array.new
280
- types = Array.new
281
- tag_lengths = Array.new
282
- name_lengths = Array.new
283
- type_lengths = Array.new
284
- value_lengths = Array.new
285
- @tags.each_index do |i|
286
- name, vr = LIBRARY.name_and_vr(@tags[i])
287
- names << name
288
- types << vr
289
- tag_lengths[i] = @tags[i].length
290
- name_lengths[i] = names[i].length
291
- type_lengths[i] = types[i].length
292
- value_lengths[i] = @values[i].to_s.length unless @blank
293
- value_lengths[i] = "" if @blank
294
- end
295
- # To give the printed output a nice format we need to check the string lengths of some of these arrays:
296
- tag_maxL = tag_lengths.max
297
- name_maxL = name_lengths.max
298
- type_maxL = type_lengths.max
299
- value_maxL = value_lengths.max
300
- # Format string array for print output:
301
- lines = Array.new
302
- @tags.each_index do |i|
303
- # Configure empty spaces:
304
- s = " "
305
- f1 = " "*(tag_maxL-@tags[i].length+1)
306
- f2 = " "*(name_maxL-names[i].length+1)
307
- f3 = " "*(type_maxL-types[i].length+1)
308
- f4 = " " if @blank
309
- f4 = " "*(value_maxL-@values[i].to_s.length+1) unless @blank
310
- if @enumeration
311
- enum = @enumerations[i]
312
- else
313
- enum = ""
314
- end
315
- if @blank
316
- value = ""
317
- else
318
- value = @values[i]
319
- end
320
- tag = @tags[i]
321
- lines << tag + f1 + names[i] + f2 + types[i] + f3 + value.to_s + f4 + enum.to_s
322
- end
323
- # Print to screen:
324
- lines.each do |line|
325
- puts line
326
- end
327
- end
328
-
329
- # Removes a tag from the list of tags that will be anonymized.
330
- #
331
- # @param [String] tag a data element tag
332
- # @example Do not anonymize the Patient's Name tag
333
- # a.remove_tag("0010,0010")
334
- #
335
- def remove_tag(tag)
336
- raise ArgumentError, "Expected String, got #{tag.class}." unless tag.is_a?(String)
337
- raise ArgumentError, "Expected a valid tag of format 'GGGG,EEEE', got #{tag}." unless tag.tag?
338
- pos = @tags.index(tag)
339
- if pos
340
- @tags.delete_at(pos)
341
- @values.delete_at(pos)
342
- @enumerations.delete_at(pos)
343
- end
344
- end
345
-
346
- # Completely deletes a tag from the file.
347
- #
348
- # @param [String] tag a data element tag
349
- # @example Completely delete the Patient's Name tag from the DICOM files
350
- # a.delete_tag("0010,0010")
351
- #
352
- def delete_tag(tag)
353
- raise ArgumentError, "Expected String, got #{tag.class}." unless tag.is_a?(String)
354
- raise ArgumentError, "Expected a valid tag of format 'GGGG,EEEE', got #{tag}." unless tag.tag?
355
- @delete_tags.push(tag) if not @delete_tags.include?(tag)
356
- end
357
-
358
- # Sets the anonymization settings for the specified tag. If the tag is already present in the list
359
- # of tags to be anonymized, its settings are updated, and if not, a new tag entry is created.
360
- #
361
- # @param [String] tag a data element tag
362
- # @param [Hash] options the anonymization settings for the specified tag
363
- # @option options [String, Integer, Float] :value the replacement value to be used when anonymizing this data element. Defaults to the pre-existing value and "" for new tags.
364
- # @option options [Boolean] :enum specifies whether enumeration is to be used for this tag. Defaults to the pre-existing value and false for new tags.
365
- # @example Set the anonymization settings of the Patient's Name tag
366
- # a.set_tag("0010,0010", :value => "MrAnonymous", :enum => true)
367
- #
368
- def set_tag(tag, options={})
369
- raise ArgumentError, "Expected String, got #{tag.class}." unless tag.is_a?(String)
370
- raise ArgumentError, "Expected a valid tag of format 'GGGG,EEEE', got #{tag}." unless tag.tag?
371
- pos = @tags.index(tag)
372
- if pos
373
- # Update existing values:
374
- @values[pos] = options[:value] if options[:value]
375
- @enumerations[pos] = options[:enum] if options[:enum] != nil
376
- else
377
- # Add new elements:
378
- @tags << tag
379
- @values << (options[:value] ? options[:value] : default_value(tag))
380
- @enumerations << (options[:enum] ? options[:enum] : false)
381
- end
382
- end
383
-
384
- # Gives the value which will be used when anonymizing this tag.
385
- #
386
- # @note If enumeration is selected for a string type tag, a number will be
387
- # appended in addition to the string that is returned here.
388
- #
389
- # @param [String] tag a data element tag
390
- # @return [String, Integer, Float, NilClass] the replacement value for the specified tag, or nil if the tag is not matched
391
- #
392
- def value(tag)
393
- raise ArgumentError, "Expected String, got #{tag.class}." unless tag.is_a?(String)
394
- raise ArgumentError, "Expected a valid tag of format 'GGGG,EEEE', got #{tag}." unless tag.tag?
395
- pos = @tags.index(tag)
396
- if pos
397
- return @values[pos]
398
- else
399
- logger.warn("The specified tag (#{tag}) was not found in the list of tags to be anonymized.")
400
- return nil
401
- end
402
- end
403
-
404
-
405
- private
406
-
407
-
408
- # Finds the common path (if any) in the instance file path array, by performing a recursive search
409
- # on the folders that make up the path of one such file.
410
- #
411
- # @param [Array<String>] str_arr an array of folder strings from the path of a select file
412
- # @param [Fixnum] index the index of the folder in str_arr to check against all file paths
413
- # @return [Fixnum] the index of the last folder in the path of the selected file that is common for all file paths
414
- #
415
- def common_path(str_arr, index=0)
416
- common_folders = Array.new
417
- # Find out how much of the path is similar for all files in @files array:
418
- folder = str_arr[index]
419
- all_match = true
420
- @files.each do |f|
421
- all_match = false unless f.include?(folder)
422
- end
423
- if all_match
424
- # Need to check the next folder in the array:
425
- result = common_path(str_arr, index + 1)
426
- else
427
- # Current folder did not match, which means last possible match is current index -1.
428
- result = index - 1
429
- end
430
- return result
431
- end
432
-
433
- # Creates a hash that is used for storing information that is used when enumeration is selected.
434
- #
435
- def create_enum_hash
436
- @enumerations.each_index do |i|
437
- @enum_old_hash[@tags[i]] = Array.new
438
- @enum_new_hash[@tags[i]] = Array.new
439
- end
440
- end
441
-
442
- # Determines a default value to use for anonymizing the given tag.
443
- #
444
- # @param [String] tag a data element tag
445
- # @return [String, Integer, Float] the default replacement value for a given tag
446
- #
447
- def default_value(tag)
448
- name, vr = LIBRARY.name_and_vr(tag)
449
- conversion = VALUE_CONVERSION[vr] || :to_s
450
- case conversion
451
- when :to_i then return 0
452
- when :to_f then return 0.0
453
- else
454
- # Assume type is string and return an empty string:
455
- return ""
456
- end
457
- end
458
-
459
- # Handles the enumeration for the given data element tag.
460
- # If its value has been encountered before, its corresponding enumerated
461
- # replacement value is retrieved, and if a new original value is encountered,
462
- # a new enumerated replacement value is found by increasing an index by 1.
463
- #
464
- # @param [String, Integer, Float] original the original value of the tag to be anonymized
465
- # @param [Fixnum] j the index of this tag in the tag-related instance arrays
466
- # @return [String, Integer, Float] the replacement value which is used for the anonymization of the tag
467
- #
468
- def enumerated_value(original, j)
469
- # Is enumeration requested for this tag?
470
- if @enumerations[j]
471
- if @audit_trail
472
- # Check if this original value has been encountered already:
473
- replacement = @audit_trail.replacement(@tags[j], original)
474
- unless replacement
475
- # This original value has not been encountered yet. Determine the index to use.
476
- index = @audit_trail.records(@tags[j]).length + 1
477
- # Create the replacement value:
478
- if @values[j].is_a?(String)
479
- replacement = @values[j] + index.to_s
480
- else
481
- replacement = @values[j] + index
482
- end
483
- # Add this tag record to the audit trail:
484
- @audit_trail.add_record(@tags[j], original, replacement)
485
- end
486
- else
487
- # Retrieve earlier used anonymization values:
488
- previous_old = @enum_old_hash[@tags[j]]
489
- previous_new = @enum_new_hash[@tags[j]]
490
- p_index = previous_old.length
491
- if previous_old.index(original) == nil
492
- # Current value has not been encountered before:
493
- replacement = @values[j]+(p_index + 1).to_s
494
- # Store value in array (and hash):
495
- previous_old << original
496
- previous_new << replacement
497
- @enum_old_hash[@tags[j]] = previous_old
498
- @enum_new_hash[@tags[j]] = previous_new
499
- else
500
- # Current value has been observed before:
501
- replacement = previous_new[previous_old.index(original)]
502
- end
503
- end
504
- else
505
- replacement = @values[j]
506
- end
507
- return replacement
508
- end
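
The enumeration logic in the removed `enumerated_value` method (reuse the replacement for a value seen before, otherwise append the next index to the default value) can be restated compactly. The sketch below is a hypothetical helper, not part of ruby-dicom: the class name `ValueEnumerator` and its single Hash store are assumptions that replace the parallel `@enum_old_hash`/`@enum_new_hash` arrays and leave out the audit trail.

```ruby
# Hypothetical helper (not part of ruby-dicom): maps each distinct original
# value of a tag to a stable, enumerated replacement value.
class ValueEnumerator
  def initialize
    # tag => { original value => replacement value }
    @seen = Hash.new { |hash, tag| hash[tag] = {} }
  end

  # Returns the replacement for the original value, creating one on first sight.
  def replacement(tag, original, base)
    records = @seen[tag]
    records[original] ||= begin
      index = records.length + 1
      # Strings get the index appended; numeric defaults get it added.
      base.is_a?(String) ? base + index.to_s : base + index
    end
  end
end

enum   = ValueEnumerator.new
first  = enum.replacement('0010,0010', 'Doe^John', 'Patient')
second = enum.replacement('0010,0010', 'Roe^Jane', 'Patient')
again  = enum.replacement('0010,0010', 'Doe^John', 'Patient')
```

Keying the store by original value makes the lookup a single Hash access instead of an `Array#index` scan, while the generated values follow the same numbering scheme.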
509
-
510
- # Discovers all the files contained in the specified directory (all its sub-directories),
511
- # and adds these files to the instance file array.
512
- #
513
- def load_files
514
- # Load find library:
515
- require 'find'
516
- # Iterate through the folders (and their subfolders) to extract all files:
517
- for dir in @folders
518
- Find.find(dir) do |path|
519
- if FileTest.directory?(path)
520
- proceed = true
521
- @exceptions.each do |e|
522
- proceed = false if e == path
523
- end
524
- if proceed
525
- next
526
- else
527
- Find.prune # Don't look any further into this directory.
528
- end
529
- else
530
- @files << path # Store the file in our array
531
- end
532
- end
533
- end
534
- end
535
-
-     # Analyzes the write_path and the 'read' file path to determine if they have some common root.
-     # If there are parts of the file path that exists also in the write path, the common parts will
-     # not be added to the write_path. The processed paths are put in a write_path instance array.
-     #
-     def process_write_paths
-       # First make sure @write_path ends with a file separator character:
-       last_character = @write_path[-1..-1]
-       @write_path = @write_path + File::SEPARATOR unless last_character == File::SEPARATOR
-       # Differing behaviour if we have one, or several files in our array:
-       if @files.length == 1
-         # Write path is requested write path + old file name:
-         str_arr = @files[0].split(File::SEPARATOR)
-         @write_paths << @write_path + str_arr.last
-       else
-         # Several files.
-         # Find out how much of the path they have in common, remove that and
-         # add the remaining to the @write_path:
-         str_arr = @files[0].split(File::SEPARATOR)
-         last_match_index = common_path(str_arr)
-         if last_match_index >= 0
-           # Remove the matching folders from the path that will be added to @write_path:
-           @files.each do |file|
-             arr = file.split(File::SEPARATOR)
-             part_to_write = arr[(last_match_index+1)..(arr.length-1)].join(File::SEPARATOR)
-             @write_paths << @write_path + part_to_write
-           end
-         else
-           # No common folders. Add all of original path to write path:
-           @files.each do |file|
-             @write_paths << @write_path + file
-           end
-         end
-       end
-     end
-
-     # Replaces the UIDs of the given DICOM object.
-     #
-     # @note Empty UIDs are ignored (we don't generate new UIDs for these).
-     # @note If AuditTrail is set, the relationship between old and new UIDs are preserved,
-     #   and the relations between files in a study/series should remain valid.
-     # @param [DObject] dcm the dicom object to be processed
-     #
-     def replace_uids(dcm)
-       @uids.each_pair do |tag, prefix|
-         original = dcm.value(tag)
-         if original && original.length > 0
-           # We have a UID value, go ahead and replace it:
-           if @audit_trail
-             # Check if the UID has been encountered already:
-             replacement = @audit_trail.replacement(tag, original)
-             unless replacement
-               # The UID has not been stored previously. Generate a new one:
-               replacement = DICOM.generate_uid(@uid_root, prefix)
-               # Add this tag record to the audit trail:
-               @audit_trail.add_record(tag, original, replacement)
-             end
-             # Replace the UID in the DICOM object:
-             dcm[tag].value = replacement
-             # NB! The SOP Instance UID must also be written to the Media Storage SOP Instance UID tag:
-             dcm["0002,0003"].value = replacement if tag == "0008,0018" && dcm.exists?("0002,0003")
-           else
-             # We don't care about preserving UID relations. Just insert a custom UID:
-             dcm[tag].value = DICOM.generate_uid(@uid_root, prefix)
-           end
-         end
-       end
-     end
-
-     # Sets up some default information variables that are used by the Anonymizer.
-     #
-     def set_defaults
-       # A hash of UID tags to be replaced (if requested) and prefixes to use for each tag:
-       @uids = {
-         "0008,0018" => 3, # SOP Instance UID
-         "0020,000D" => 1, # Study Instance UID
-         "0020,000E" => 2, # Series Instance UID
-         "0020,0052" => 9 # Frame of Reference UID
-       }
-       # Sets up default tags that will be anonymized, along with default replacement values and enumeration settings.
-       # This data is stored in 3 separate instance arrays for tags, values and enumeration.
-       data = [
-         ["0008,0012", "20000101", false], # Instance Creation Date
-         ["0008,0013", "000000.00", false], # Instance Creation Time
-         ["0008,0020", "20000101", false], # Study Date
-         ["0008,0023", "20000101", false], # Image Date
-         ["0008,0030", "000000.00", false], # Study Time
-         ["0008,0033", "000000.00", false], # Image Time
-         ["0008,0050", "", true], # Accession Number
-         ["0008,0080", "Institution", true], # Institution name
-         ["0008,0090", "Physician", true], # Referring Physician's name
-         ["0008,1010", "Station", true], # Station name
-         ["0008,1070", "Operator", true], # Operator's Name
-         ["0010,0010", "Patient", true], # Patient's name
-         ["0010,0020", "ID", true], # Patient's ID
-         ["0010,0030", "20000101", false], # Patient's Birth Date
-         ["0010,0040", "N", false], # Patient's Sex
-         ["0020,4000", "", false], # Image Comments
-       ].transpose
-       @tags = data[0]
-       @values = data[1]
-       @enumerations = data[2]
-       # Tags to be deleted completely during anonymization:
-       @delete_tags = [
-       ]
-     end
-
-     # Writes an identity file, which allows reidentification of DICOM files that have been anonymized
-     # using the enumeration feature. Values are saved in a text file, using semi colon delineation.
-     #
-     # @deprecated The identity file feature is deprecated!
-     #   Please use the AuditTrail feature instead.
-     #
-     def write_identity_file
-       raise ArgumentError, "Expected String, got #{@identity_file.class}. Unable to write identity file." unless @identity_file.is_a?(String)
-       # Open file and prepare to write text:
-       File.open(@identity_file, 'w') do |output|
-         # Cycle through each
-         @tags.each_index do |i|
-           if @enumerations[i]
-             # This tag has had enumeration. Gather original and anonymized values:
-             old_values = @enum_old_hash[@tags[i]]
-             new_values = @enum_new_hash[@tags[i]]
-             # Print the tag label, then new_value;old_value in the following rows.
-             output.print @tags[i] + "\n"
-             old_values.each_index do |j|
-               output.print new_values[j].to_s.rstrip + ";" + old_values[j].to_s.rstrip + "\n"
-             end
-             # Print empty line for separation between different tags:
-             output.print "\n"
-           end
-         end
-       end
-     end
-   end
- end
+ module DICOM
+
+   # This is a convenience class for handling the anonymization
+   # (de-identification) of DICOM files.
+   #
+   # @note
+   #   For a thorough introduction to the concept of DICOM anonymization,
+   #   please refer to The DICOM Standard, Part 15: Security and System
+   #   Management Profiles, Annex E: Attribute Confidentiality Profiles.
+   #   For guidance on settings for individual data elements, please
+   #   refer to DICOM PS 3.15, Annex E, Table E.1-1: Application Level
+   #   Confidentiality Profile Attributes.
+   #
+   class Anonymizer
+     include Logging
+
+     # An AuditTrail instance used for this anonymization (if specified).
+     attr_reader :audit_trail
+     # The file name used for the AuditTrail serialization (if specified).
+     attr_reader :audit_trail_file
+     # A boolean that, if set as true, will cause all anonymized tags to be blank instead of getting some generic value.
+     attr_accessor :blank
+     # A hash of elements (represented by tag keys) that will be deleted from the DICOM objects on anonymization.
+     attr_reader :delete
+     # A boolean that, if set as true, will make the anonymization delete all private tags.
+     attr_accessor :delete_private
+     # The cryptographic hash function to be used for encrypting DICOM values recorded in an audit trail file.
+     attr_reader :encryption
+     # A boolean that, if set as true, will cause all anonymized tags to get enumerated values, to enable post-anonymization re-identification by the user.
+     attr_accessor :enumeration
+     # The logger level which is applied to DObject operations during anonymization (defaults to Logger::FATAL).
+     attr_reader :logger_level
+     # A boolean that, if set as true, will cause all anonymized files to be written with random file names (if write_path has been specified).
+     attr_accessor :random_file_name
+     # A boolean that, if set as true, will cause the anonymization to run on all levels of the DICOM file tag hierarchy.
+     attr_accessor :recursive
+     # A boolean indicating whether or not UIDs shall be replaced when executing the anonymization.
+     attr_accessor :uid
+     # The DICOM UID root to use when generating new UIDs.
+     attr_accessor :uid_root
+     # The path where the anonymized files will be saved. If this value is not set, the original DICOM files will be overwritten.
+     attr_accessor :write_path
+
+     # Creates an Anonymizer instance.
+     #
+     # @note To customize logging behaviour, refer to the Logging module documentation.
+     # @param [Hash] options the options to create an anonymizer instance with
+     # @option options [String] :audit_trail a file name path (if the file contains old audit data, these are loaded and used in the current anonymization)
+     # @option options [Boolean] :blank toggles whether to set the values of anonymized elements as empty instead of some generic value
+     # @option options [Boolean] :delete_private toggles whether private elements are to be deleted
+     # @option options [TrueClass, Digest::Class] :encryption if set as true, the default hash function (MD5) will be used for representing DICOM values in an audit file. Otherwise a Digest class can be given, e.g. Digest::SHA256
+     # @option options [Boolean] :enumeration toggles whether (some) elements get enumerated values (to enable post-anonymization re-identification)
+     # @option options [Fixnum] :logger_level the logger level which is applied to DObject operations during anonymization (defaults to Logger::FATAL)
+     # @option options [Boolean] :random_file_name toggles whether anonymized files will be given random file names when rewritten (in combination with the :write_path option)
+     # @option options [Boolean] :recursive toggles whether to anonymize on all sub-levels of the DICOM object tag hierarchies
+     # @option options [Boolean] :uid toggles whether UIDs will be replaced with custom generated UIDs (beware that to preserve UID relations in studies/series, the audit_trail feature must be used)
+     # @option options [String] :uid_root an organization (or custom) UID root to use when replacing UIDs
+     # @option options [String] :write_path a directory where the anonymized files are re-written (if not specified, files are overwritten)
+     # @example Create an Anonymizer instance and increase the log output
+     #   a = Anonymizer.new
+     #   a.logger.level = Logger::INFO
+     # @example Perform anonymization using the audit trail feature
+     #   a = Anonymizer.new(:audit_trail => 'trail.json')
+     #   a.enumeration = true
+     #   a.write_path = '//anonymized/'
+     #   a.anonymize('//dicom/today/')
+     #
+     def initialize(options={})
+       # Transfer options to attributes:
+       @blank = options[:blank]
+       @delete_private = options[:delete_private]
+       @enumeration = options[:enumeration]
+       @logger_level = options[:logger_level] || Logger::FATAL
+       @random_file_name = options[:random_file_name]
+       @recursive = options[:recursive]
+       @uid = options[:uid]
+       @uid_root = options[:uid_root] ? options[:uid_root] : UID_ROOT
+       @write_path = options[:write_path]
+       # Array of folders to be processed for anonymization:
+       @folders = Array.new
+       # Folders that will be skipped:
+       @exceptions = Array.new
+       # Data elements which will be anonymized (the array will hold a list of tag strings):
+       @tags = Array.new
+       # Default values to use on anonymized data elements:
+       @values = Array.new
+       # Which data elements will have enumeration applied, if requested by the user:
+       @enumerations = Array.new
+       # We use a Hash to store information from DICOM files if enumeration is desired:
+       @enum_old_hash = Hash.new
+       @enum_new_hash = Hash.new
+       # All the files to be anonymized will be put in this array:
+       @files = Array.new
+       @prefixes = Hash.new
+       # Setup audit trail if requested:
+       if options[:audit_trail]
+         @audit_trail_file = options[:audit_trail]
+         if File.exists?(@audit_trail_file) && File.size(@audit_trail_file) > 2
+           # Load the pre-existing audit trail from file:
+           @audit_trail = AuditTrail.read(@audit_trail_file)
+         else
+           # Start from scratch with an empty audit trail:
+           @audit_trail = AuditTrail.new
+         end
+         # Set up encryption if indicated:
+         if options[:encryption]
+           require 'digest'
+           if options[:encryption].respond_to?(:hexdigest)
+             @encryption = options[:encryption]
+           else
+             @encryption = Digest::MD5
+           end
+         end
+       end
+       # Set the default data elements to be anonymized:
+       set_defaults
+     end
+
+     # Checks for equality.
+     #
+     # Other and self are considered equivalent if they are
+     # of compatible types and their attributes are equivalent.
+     #
+     # @param other an object to be compared with self.
+     # @return [Boolean] true if self and other are considered equivalent
+     #
+     def ==(other)
+       if other.respond_to?(:to_anonymizer)
+         other.send(:state) == state
+       end
+     end
+
+     alias_method :eql?, :==
+
+     # Anonymizes the given DICOM data with the settings of this Anonymizer instance.
+     #
+     # @param [String, DObject, Array<String, DObject>] data single or multiple DICOM data (directories, file paths, binary strings, DICOM objects)
+     # @return [Array<DObject>] an array of the anonymized DICOM objects
+     #
+     def anonymize(data)
+       dicom = prepare(data)
+       if @tags.length > 0
+         dicom.each do |dcm|
+           anonymize_dcm(dcm)
+           # Write DICOM object to file unless it was passed to the anonymizer as an object:
+           write(dcm) unless dcm.was_dcm_on_input
+         end
+       else
+         logger.warn("No tags have been selected for anonymization. Aborting anonymization.")
+       end
+       # Reset the ruby-dicom log threshold to its original level:
+       logger.level = @original_level
+       # Save the audit trail (if used):
+       @audit_trail.write(@audit_trail_file) if @audit_trail
+       logger.info("Anonymization complete.")
+       dicom
+     end
+
+     # Specifies that the given tag is to be completely deleted
+     # from the anonymized DICOM objects.
+     #
+     # @param [String] tag a data element tag
+     # @example Completely delete the Patient's Name tag from the DICOM files
+     #   a.delete_tag('0010,0010')
+     #
+     def delete_tag(tag)
+       raise ArgumentError, "Expected String, got #{tag.class}." unless tag.is_a?(String)
+       raise ArgumentError, "Expected a valid tag of format 'GGGG,EEEE', got #{tag}." unless tag.tag?
+       @delete[tag] = true
+     end
+
+     # Checks the enumeration status of this tag.
+     #
+     # @param [String] tag a data element tag
+     # @return [Boolean, NilClass] the enumeration status of the tag, or nil if the tag has no match
+     #
+     def enum(tag)
+       raise ArgumentError, "Expected String, got #{tag.class}." unless tag.is_a?(String)
+       raise ArgumentError, "Expected a valid tag of format 'GGGG,EEEE', got #{tag}." unless tag.tag?
+       pos = @tags.index(tag)
+       if pos
+         return @enumerations[pos]
+       else
+         logger.warn("The specified tag (#{tag}) was not found in the list of tags to be anonymized.")
+         return nil
+       end
+     end
+
+     # Computes a hash code for this object.
+     #
+     # @note Two objects with the same attributes will have the same hash code.
+     #
+     # @return [Fixnum] the object's hash code
+     #
+     def hash
+       state.hash
+     end
+
+     # Removes a tag from the list of tags that will be anonymized.
+     #
+     # @param [String] tag a data element tag
+     # @example Do not anonymize the Patient's Name tag
+     #   a.remove_tag('0010,0010')
+     #
+     def remove_tag(tag)
+       raise ArgumentError, "Expected String, got #{tag.class}." unless tag.is_a?(String)
+       raise ArgumentError, "Expected a valid tag of format 'GGGG,EEEE', got #{tag}." unless tag.tag?
+       pos = @tags.index(tag)
+       if pos
+         @tags.delete_at(pos)
+         @values.delete_at(pos)
+         @enumerations.delete_at(pos)
+       end
+     end
+
+     # Sets the anonymization settings for the specified tag. If the tag is already present in the list
+     # of tags to be anonymized, its settings are updated, and if not, a new tag entry is created.
+     #
+     # @param [String] tag a data element tag
+     # @param [Hash] options the anonymization settings for the specified tag
+     # @option options [String, Integer, Float] :value the replacement value to be used when anonymizing this data element. Defaults to the pre-existing value for known tags, and to a type-dependent empty value ('', 0 or 0.0) for new tags.
+     # @option options [Boolean] :enum specifies if enumeration is to be used for this tag. Defaults to the pre-existing value for known tags, and to false for new tags.
+     # @example Set the anonymization settings of the Patient's Name tag
+     #   a.set_tag('0010,0010', :value => 'MrAnonymous', :enum => true)
+     #
+     def set_tag(tag, options={})
+       raise ArgumentError, "Expected String, got #{tag.class}." unless tag.is_a?(String)
+       raise ArgumentError, "Expected a valid tag of format 'GGGG,EEEE', got #{tag}." unless tag.tag?
+       pos = @tags.index(tag)
+       if pos
+         # Update existing values:
+         @values[pos] = options[:value] if options[:value]
+         @enumerations[pos] = options[:enum] if options[:enum] != nil
+       else
+         # Add new elements:
+         @tags << tag
+         @values << (options[:value] ? options[:value] : default_value(tag))
+         @enumerations << (options[:enum] ? options[:enum] : false)
+       end
+     end
+
+     # Returns self.
+     #
+     # @return [Anonymizer] self
+     #
+     def to_anonymizer
+       self
+     end
+
+     # Gives the value which will be used when anonymizing this tag.
+     #
+     # @note If enumeration is selected for a string type tag, a number will be
+     #   appended in addition to the string that is returned here.
+     #
+     # @param [String] tag a data element tag
+     # @return [String, Integer, Float, NilClass] the replacement value for the specified tag, or nil if the tag is not matched
+     #
+     def value(tag)
+       raise ArgumentError, "Expected String, got #{tag.class}." unless tag.is_a?(String)
+       raise ArgumentError, "Expected a valid tag of format 'GGGG,EEEE', got #{tag}." unless tag.tag?
+       pos = @tags.index(tag)
+       if pos
+         return @values[pos]
+       else
+         logger.warn("The specified tag (#{tag}) was not found in the list of tags to be anonymized.")
+         return nil
+       end
+     end
+
+
+     private
+
+
+     # Performs anonymization on a DICOM object.
+     #
+     # @param [DObject] dcm a DICOM object
+     #
+     def anonymize_dcm(dcm)
+       # Extract the data element parents to investigate:
+       parents = element_parents(dcm)
+       parents.each do |parent|
+         # Anonymize the desired tags:
+         @tags.each_index do |j|
+           if parent.exists?(@tags[j])
+             element = parent[@tags[j]]
+             if element.is_a?(Element)
+               if @blank
+                 value = ''
+               elsif @enumeration
+                 old_value = element.value
+                 # Only launch enumeration logic if there is an actual value to the data element:
+                 if old_value
+                   value = enumerated_value(old_value, j)
+                 else
+                   value = ''
+                 end
+               else
+                 # Use the value that has been set for this tag:
+                 value = @values[j]
+               end
+               element.value = value
+             end
+           end
+         end
+         # Delete elements marked for deletion:
+         @delete.each_key do |tag|
+           parent.delete(tag) if parent.exists?(tag)
+         end
+       end
+       # General DICOM object manipulation:
+       # Add a Patient Identity Removed attribute (as per
+       # DICOM PS 3.15, Annex E, E.1.1 De-Identifier, point 6):
+       dcm.add(Element.new('0012,0062', 'YES'))
+       # Add a De-Identification Method Code Sequence Item:
+       dcm.add(Sequence.new('0012,0064')) unless dcm.exists?('0012,0064')
+       i = dcm['0012,0064'].add_item
+       i.add(Element.new('0012,0063', 'De-identified by the ruby-dicom Anonymizer'))
+       # FIXME: At some point we should add a set of de-identification method codes, as per
+       # DICOM PS 3.16 CID 7050 which corresponds to the settings chosen for the anonymizer.
+       # Delete the old File Meta Information group (as per
+       # DICOM PS 3.15, Annex E, E.1.1 De-Identifier, point 7):
+       dcm.delete_group('0002')
+       # Handle UIDs if requested:
+       replace_uids(parents) if @uid
+       # Delete private tags if indicated:
+       dcm.delete_private if @delete_private
+     end
+
+     # Gives the value to be used for the audit trail, which is either
+     # the original value itself, or an encrypted string based on it.
+     #
+     # @param [String, Integer, Float] original the original value of the tag to be anonymized
+     # @return [String, Integer, Float] with encryption, a hash string is returned, otherwise the original value
+     #
+     def at_value(original)
+       @encryption ? @encryption.hexdigest(original) : original
+     end
+
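The digest step in at_value can be tried outside the library. The following is a minimal standalone sketch that assumes only Ruby's standard digest library. The helper name audit_value is hypothetical and not part of ruby-dicom, and the to_s conversion of non-string values is an assumption of this sketch, not something the method above does.

```ruby
require 'digest'

# Hypothetical helper (not part of ruby-dicom): returns the value to record in
# an audit trail, hashed when a digest class is given, untouched otherwise.
def audit_value(original, digest_class = nil)
  # Digest classes expect a String, so other values are converted first
  # (an assumption made for this sketch).
  digest_class ? digest_class.hexdigest(original.to_s) : original
end

plain  = audit_value('John Doe')                  # original value, unchanged
md5    = audit_value('John Doe', Digest::MD5)     # 32 hex characters
sha256 = audit_value('John Doe', Digest::SHA256)  # 64 hex characters
```

Because the same input always gives the same digest, a hashed audit trail can still recognise a value it has seen before without storing that value in clear text.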
+     # Creates a hash that is used for storing information that is used when enumeration is selected.
+     #
+     def create_enum_hash
+       @enumerations.each_index do |i|
+         @enum_old_hash[@tags[i]] = Array.new
+         @enum_new_hash[@tags[i]] = Array.new
+       end
+     end
+
+     # Determines a default value to use for anonymizing the given tag.
+     #
+     # @param [String] tag a data element tag
+     # @return [String, Integer, Float] the default replacement value for a given tag
+     #
+     def default_value(tag)
+       name, vr = LIBRARY.name_and_vr(tag)
+       conversion = VALUE_CONVERSION[vr] || :to_s
+       case conversion
+         when :to_i then return 0
+         when :to_f then return 0.0
+       else
+         # Assume type is string and return an empty string:
+         return ''
+       end
+     end
+
+     # Creates a write path for the given DICOM object, based on the object's
+     # original file path and the write_path attribute.
+     #
+     # @param [DObject] dcm a DICOM object
+     # @return [String] the destination directory path
+     #
+     def destination(dcm)
+       # Split the source path into dir and file:
+       source_dir = File.dirname(dcm.source)
+       source_folders = source_dir.split(File::SEPARATOR)
+       target_folders = @write_path.split(File::SEPARATOR)
+       # If the first element is the current dir symbol, get rid of it:
+       source_folders.delete('.')
+       # Check for equality of folder names in a range limited by the shortest array:
+       common_length = [source_folders.length, target_folders.length].min
+       uncommon_index = nil
+       common_length.times do |i|
+         if target_folders[i] != source_folders[i]
+           uncommon_index = i
+           break
+         end
+       end
+       # Create the output path by joining the two paths together using the determined index:
+       append_path = uncommon_index ? source_folders[uncommon_index..-1] : nil
+       [target_folders, append_path].compact.join(File::SEPARATOR)
+     end
+
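The path merge performed by destination can be traced with plain strings. This is a minimal standalone sketch that mirrors the steps above. The function name merged_destination and the example paths are hypothetical, and it takes the source file path directly instead of a DObject.

```ruby
# Hypothetical standalone version of the path merge (not the library method):
# the folders of the source directory, from the first differing position
# onward, are appended to the target directory.
def merged_destination(source_file, write_path)
  source_folders = File.dirname(source_file).split(File::SEPARATOR)
  target_folders = write_path.split(File::SEPARATOR)
  # Drop the current-directory symbol, as the method above does:
  source_folders.delete('.')
  common_length = [source_folders.length, target_folders.length].min
  # First position at which the two folder lists disagree (nil if none):
  uncommon_index = (0...common_length).find { |i| target_folders[i] != source_folders[i] }
  append_path = uncommon_index ? source_folders[uncommon_index..-1] : nil
  # Array#join flattens the nested array using the same separator:
  [target_folders, append_path].compact.join(File::SEPARATOR)
end

shared_root = merged_destination('/data/dicom/today/a.dcm', '/data/anon')  # => "/data/anon/dicom/today"
inside_dest = merged_destination('/data/anon/sub/a.dcm', '/data/anon')     # => "/data/anon"
relative    = merged_destination('./in/a.dcm', 'out')                      # => "out/in"
```

The second call shows a consequence of comparing only up to the shorter folder list: when no folder differs within that range, nothing is appended and the extra source sub-folder is not carried over.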
+     # Extracts all parents from a DObject instance which potentially
+     # have child (data) elements. This typically means the DObject
+     # instance itself as well as items (i.e. not sequences).
+     # Note that unless the @recursive attribute has been set,
+     # this method will only return the DObject (placed inside an array).
+     #
+     # @param [DObject] dcm a DICOM object
+     # @return [Array<DObject, Item>] an array containing either just a DObject or also all parental child items within the tag hierarchy
+     #
+     def element_parents(dcm)
+       parents = Array.new
+       parents << dcm
+       if @recursive
+         dcm.sequences.each do |s|
+           parents += element_parents_recursive(s)
+         end
+       end
+       parents
+     end
+
+     # Recursively extracts all item parents from a sequence instance (including
+     # any sub-sequences) which actually contain child (data) elements.
+     #
+     # @param [Sequence] sequence a Sequence instance
+     # @return [Array<Item>] an array containing the items within the tag hierarchy that contain child elements
+     #
+     def element_parents_recursive(sequence)
+       parents = Array.new
+       sequence.items.each do |i|
+         parents << i if i.elements?
+         i.sequences.each do |s|
+           parents += element_parents_recursive(s)
+         end
+       end
+       parents
+     end
+
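The recursive collection above can be illustrated without the DICOM classes. In this minimal sketch, Node and Seq are hypothetical stand-ins for the library's Item and Sequence, and collect_parents is a hypothetical name. It only demonstrates the traversal order (an item first, then the items of its sub-sequences).

```ruby
# Hypothetical stand-ins for the library's Item and Sequence classes:
Node = Struct.new(:name, :elements, :sequences) do
  # True if this item holds at least one data element.
  def elements?
    !elements.empty?
  end
end
Seq = Struct.new(:items)

# Collects every item (at any depth) that holds at least one data element.
def collect_parents(sequence)
  sequence.items.flat_map do |item|
    own = item.elements? ? [item] : []
    own + item.sequences.flat_map { |s| collect_parents(s) }
  end
end

inner = Node.new('inner', ['0010,0010'], [])
empty = Node.new('empty', [], [Seq.new([inner])])  # no elements of its own
outer = Node.new('outer', ['0008,0020'], [])
names = collect_parents(Seq.new([empty, outer])).map(&:name)  # => ["inner", "outer"]
```

An item without elements of its own is skipped, but its sub-sequences are still searched, which is why 'inner' is found through 'empty'.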
+     # Handles the enumeration for the given data element tag.
+     # If its value has been encountered before, its corresponding enumerated
+     # replacement value is retrieved, and if a new original value is encountered,
+     # a new enumerated replacement value is found by increasing an index by 1.
+     #
+     # @param [String, Integer, Float] original the original value of the tag to be anonymized
+     # @param [Fixnum] j the index of this tag in the tag-related instance arrays
+     # @return [String, Integer, Float] the replacement value which is used for the anonymization of the tag
+     #
+     def enumerated_value(original, j)
+       # Is enumeration requested for this tag?
+       if @enumerations[j]
+         if @audit_trail
+           # Check if the original value has been encountered already:
+           replacement = @audit_trail.replacement(@tags[j], at_value(original))
+           unless replacement
+             # This original value has not been encountered yet. Determine the index to use.
+             index = @audit_trail.records(@tags[j]).length + 1
+             # Create the replacement value:
+             if @values[j].is_a?(String)
+               replacement = @values[j] + index.to_s
+             else
+               replacement = @values[j] + index
+             end
+             # Add this tag record to the audit trail:
+             @audit_trail.add_record(@tags[j], at_value(original), replacement)
+           end
+         else
+           # Retrieve earlier used anonymization values:
+           previous_old = @enum_old_hash[@tags[j]]
+           previous_new = @enum_new_hash[@tags[j]]
+           p_index = previous_old.length
+           if previous_old.index(original) == nil
+             # Current value has not been encountered before:
+             replacement = @values[j]+(p_index + 1).to_s
+             # Store value in array (and hash):
+             previous_old << original
+             previous_new << replacement
+             @enum_old_hash[@tags[j]] = previous_old
+             @enum_new_hash[@tags[j]] = previous_new
+           else
+             # Current value has been observed before:
+             replacement = previous_new[previous_old.index(original)]
+           end
+         end
+       else
+         replacement = @values[j]
+       end
+       return replacement
+     end
+
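The enumeration rule described above (reuse the replacement for a known value, otherwise take the base value plus the next index) can be sketched on its own. ValueEnumerator is a hypothetical class that is not part of ruby-dicom. It uses a single Hash in place of the audit trail or the paired old/new arrays.

```ruby
# Hypothetical enumerator (not part of ruby-dicom): maps each distinct original
# value to the base value plus a running index, and reuses earlier mappings.
class ValueEnumerator
  def initialize(base)
    @base = base
    @seen = {}  # original value => replacement value
  end

  def replacement_for(original)
    @seen[original] ||= begin
      index = @seen.length + 1
      # Strings get the index appended; numbers get it added:
      @base.is_a?(String) ? @base + index.to_s : @base + index
    end
  end
end

patients = ValueEnumerator.new('Patient')
first  = patients.replacement_for('Doe^John')  # => "Patient1"
second = patients.replacement_for('Roe^Jane')  # => "Patient2"
again  = patients.replacement_for('Doe^John')  # => "Patient1"
```

Keeping the mapping stable is what lets files from the same patient stay grouped after anonymization, and what makes later re-identification possible when the mapping is saved.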
+     # Establishes a prefix for a given UID tag.
+     # This makes it somewhat easier to distinguish
+     # between different types of randomly generated UIDs.
+     #
+     # @param [String] tag a data element string tag
+     #
+     def prefix(tag)
+       if @prefixes[tag]
+         @prefixes[tag]
+       else
+         @prefixes[tag] = @prefixes.length + 1
+         @prefixes[tag]
+       end
+     end
+
+     # Prepares the data for anonymization.
+     #
+     # @param [String, DObject, Array<String, DObject>] data single or multiple DICOM data (directories, file paths, binary strings, DICOM objects)
+     # @return [Array<DObject>] an array of the loaded DObject instances
+     #
+     def prepare(data)
+       logger.info("Loading DICOM data.")
+       # Temporarily adjust the ruby-dicom log threshold (usually to suppress messages from the DObject class):
+       @original_level = logger.level
+       logger.level = @logger_level
+       dicom = DICOM.load(data)
+       logger.level = @original_level
+       logger.info("#{dicom.length} DICOM objects have been prepared for anonymization.")
+       logger.level = @logger_level
+       # Set up enumeration if requested:
+       create_enum_hash if @enumeration
+       require 'securerandom' if @random_file_name
+       dicom
+     end
+
+     # Replaces the UIDs of the given DICOM object.
+     #
+     # @note Empty UIDs are ignored (we don't generate new UIDs for these).
+     # @note If AuditTrail is set, the relationship between old and new UIDs is preserved,
+     #   and the relations between files in a study/series should remain valid.
+     # @param [Array<DObject, Item>] parents dicom parent objects whose child elements will be investigated
+     #
+     def replace_uids(parents)
+       parents.each do |parent|
+         parent.each_element do |element|
+           if element.vr == ('UI') and !@static_uids[element.tag]
+             original = element.value
+             if original && original.length > 0
+               # We have a UID value, go ahead and replace it:
+               if @audit_trail
+                 # Check if the UID has been encountered already:
+                 replacement = @audit_trail.replacement('uids', original)
+                 unless replacement
+                   # The UID has not been stored previously. Generate a new one:
+                   replacement = DICOM.generate_uid(@uid_root, prefix(element.tag))
+                   # Add this tag record to the audit trail:
+                   @audit_trail.add_record('uids', original, replacement)
+                 end
+                 # Replace the UID in the DICOM object:
+                 element.value = replacement
+               else
+                 # We don't care about preserving UID relations. Just insert a custom UID:
+                 element.value = DICOM.generate_uid(@uid_root, prefix(element.tag))
+               end
+             end
+           end
+         end
+       end
+     end
+
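The relation-preserving behaviour noted above comes from looking up each original UID before generating a new one. The following is a minimal sketch of that idea. UidRemapper is a hypothetical class, the root '1.2.3' is a made-up example, and the generated UID format is an assumption for illustration that does not reproduce DICOM.generate_uid.

```ruby
require 'securerandom'

# Hypothetical UID remapper (not the library implementation). The UID format
# produced here is an assumption for illustration only.
class UidRemapper
  def initialize(root)
    @root = root
    @prefixes = {}  # tag => small integer prefix, assigned on first use
    @map = {}       # original UID => replacement UID
  end

  # Gives each tag its own prefix number, in order of first appearance.
  def prefix(tag)
    @prefixes[tag] ||= @prefixes.length + 1
  end

  # Returns the replacement UID, reusing the earlier one for a known original.
  def replace(tag, original)
    return original if original.nil? || original.empty?
    @map[original] ||= "#{@root}.#{prefix(tag)}.#{SecureRandom.random_number(10**12)}"
  end
end

remapper = UidRemapper.new('1.2.3')  # hypothetical UID root
study_a = remapper.replace('0020,000D', '1.2.840.1')
study_b = remapper.replace('0020,000D', '1.2.840.1')  # same original, same replacement
series  = remapper.replace('0020,000E', '1.2.840.2')  # second tag gets prefix 2
```

Two files that shared a Study Instance UID before remapping share the replacement afterwards, which is the property the audit trail provides in the method above.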
+     # Sets up some default information variables that are used by the Anonymizer.
+     #
+     def set_defaults
+       # Some UIDs should not be remapped even if uid anonymization has been requested:
+       @static_uids = {
+         # Private related:
+         '0002,0100' => true,
+         '0004,1432' => true,
+         # Coding scheme related:
+         '0008,010C' => true,
+         '0008,010D' => true,
+         # Transfer syntax related:
+         '0002,0010' => true,
+         '0400,0010' => true,
+         '0400,0510' => true,
+         '0004,1512' => true,
+         # SOP class related:
+         '0000,0002' => true,
+         '0000,0003' => true,
+         '0002,0002' => true,
+         '0004,1510' => true,
+         '0004,151A' => true,
+         '0008,0016' => true,
+         '0008,001A' => true,
+         '0008,001B' => true,
+         '0008,0062' => true,
+         '0008,1150' => true,
+         '0008,115A' => true
+       }
+       # Sets up default tags that will be anonymized, along with default replacement values and enumeration settings.
+       # This data is stored in 3 separate instance arrays for tags, values and enumeration.
+       data = [
+         ['0008,0012', '20000101', false], # Instance Creation Date
+         ['0008,0013', '000000.00', false], # Instance Creation Time
+         ['0008,0020', '20000101', false], # Study Date
+         ['0008,0021', '20000101', false], # Series Date
+         ['0008,0022', '20000101', false], # Acquisition Date
+         ['0008,0023', '20000101', false], # Image Date
+         ['0008,0030', '000000.00', false], # Study Time
+         ['0008,0031', '000000.00', false], # Series Time
+         ['0008,0032', '000000.00', false], # Acquisition Time
+         ['0008,0033', '000000.00', false], # Image Time
+         ['0008,0050', '', true], # Accession Number
+         ['0008,0080', 'Institution', true], # Institution name
+         ['0008,0081', 'Address', true], # Institution Address
+         ['0008,0090', 'Physician', true], # Referring Physician's name
+         ['0008,1010', 'Station', true], # Station name
+         ['0008,1040', 'Department', true], # Institutional Department name
+         ['0008,1070', 'Operator', true], # Operator's Name
+         ['0010,0010', 'Patient', true], # Patient's name
+         ['0010,0020', 'ID', true], # Patient's ID
+         ['0010,0030', '20000101', false], # Patient's Birth Date
+         ['0010,0040', 'O', false], # Patient's Sex
+         ['0010,1010', '', false], # Patient's Age
+         ['0020,4000', '', false], # Image Comments
+       ].transpose
+       @tags = data[0]
+       @values = data[1]
+       @enumerations = data[2]
+       # Tags to be deleted completely during anonymization:
+       @delete = Hash.new
+     end
+
+     # Collects the attributes of this instance.
+     #
+     # @return [Array] an array of attributes
+     #
+     def state
+       [
+         @tags, @values, @enumerations, @delete, @blank,
+         @delete_private, @enumeration, @logger_level,
+         @random_file_name, @recursive, @uid, @uid_root, @write_path
+       ]
+     end
+
+     # Writes a DICOM object to file.
+     #
+     # @param [DObject] dcm a DICOM object
+     #
+     def write(dcm)
+       if @write_path
+         # The DICOM object is to be written to a separate directory. If the
+         # original and the new directories have a common root, this is taken into
+         # consideration when determining the object's write path:
+         path = destination(dcm)
+         if @random_file_name
+           file_name = "#{SecureRandom.hex(16)}.dcm"
+         else
+           file_name = File.basename(dcm.source)
+         end
+         dcm.write(File.join(path, file_name))
+       else
+         # The original DICOM file is overwritten with the anonymized DICOM object:
+         dcm.write(dcm.source)
+       end
+     end
+
+   end
+
+ end