caruby-tissue 1.2.1 → 1.2.2
- data/History.txt +4 -0
- data/LICENSE +1 -1
- data/README.md +79 -30
- data/bin/crtmigrate +2 -2
- data/{examples/galena/bin → bin}/migrate.rb +0 -0
- data/bin/seed +26 -0
- data/bin/seed.rb +43 -0
- data/conf/extract/simple_fields.yaml +4 -0
- data/conf/migration/filter_fields.yaml +7 -0
- data/{examples/galena/conf → conf}/migration/filter_migration.yaml +0 -0
- data/conf/migration/frozen_fields.yaml +11 -0
- data/{examples/galena/conf → conf}/migration/frozen_migration.yaml +0 -0
- data/conf/migration/general_fields.yaml +44 -0
- data/{examples/galena/conf → conf}/migration/general_migration.yaml +0 -0
- data/conf/migration/simple_fields.yaml +30 -0
- data/{examples/galena/conf → conf}/migration/simple_migration.yaml +0 -0
- data/{examples/galena/conf → conf}/migration/small_fields.yaml +0 -0
- data/{examples/galena/conf → conf}/migration/small_migration.yaml +0 -0
- data/examples/galena/README.md +46 -6
- data/examples/galena/bin/seed +26 -0
- data/examples/galena/conf/migration/frozen_fields.yaml +1 -0
- data/examples/galena/conf/migration/general_fields.yaml +2 -0
- data/examples/galena/data/filter.csv +1 -1
- data/examples/galena/data/frozen.csv +1 -1
- data/examples/galena/data/general.csv +1 -1
- data/examples/galena/doc/CaTissue.html +2 -2
- data/examples/galena/doc/CaTissue/Participant.html +1 -1
- data/examples/galena/doc/CaTissue/SpecimenCollectionGroup.html +1 -1
- data/examples/galena/doc/CaTissue/StorageContainer.html +6 -6
- data/examples/galena/doc/CaTissue/TissueSpecimen.html +1 -1
- data/examples/galena/doc/Galena.html +4 -122
- data/examples/galena/doc/Galena/Seed.html +1 -1
- data/examples/galena/doc/Galena/Seed/Defaults.html +28 -24
- data/examples/galena/doc/_index.html +1 -8
- data/examples/galena/doc/class_list.html +1 -1
- data/examples/galena/doc/file.README.html +52 -7
- data/examples/galena/doc/index.html +52 -7
- data/examples/galena/doc/method_list.html +9 -25
- data/examples/galena/doc/top-level-namespace.html +1 -1
- data/examples/galena/lib/galena/migration/frozen_shims.rb +4 -15
- data/examples/galena/lib/galena/seed/defaults.rb +16 -4
- data/{examples/galena/lib → lib}/README.html +0 -0
- data/lib/catissue/cli/command.rb +6 -9
- data/lib/catissue/cli/migrate.rb +11 -10
- data/lib/catissue/cli/smoke.rb +5 -5
- data/lib/catissue/database.rb +31 -8
- data/lib/catissue/domain/abstract_specimen.rb +1 -1
- data/lib/catissue/domain/collection_protocol.rb +29 -13
- data/lib/catissue/domain/participant_medical_identifier.rb +1 -1
- data/lib/catissue/domain/site.rb +3 -0
- data/lib/catissue/domain/specimen.rb +17 -14
- data/lib/catissue/domain/specimen_collection_group.rb +2 -5
- data/lib/catissue/extract/delta.rb +2 -6
- data/lib/catissue/migration/migrator.rb +6 -0
- data/lib/catissue/resource.rb +5 -2
- data/lib/catissue/util/log.rb +3 -3
- data/lib/catissue/version.rb +1 -1
- data/{examples/galena/lib → lib}/galena.rb +0 -0
- data/{examples/galena/bin → lib/galena/cli}/seed.rb +1 -1
- data/lib/galena/migration/filter_shims.rb +43 -0
- data/lib/galena/migration/frozen_shims.rb +53 -0
- data/lib/galena/seed/defaults.rb +109 -0
- metadata +27 -17
- data/examples/galena/doc/CaTissue/CollectionProtocolRegistration.html +0 -181
data/History.txt
CHANGED
data/LICENSE
CHANGED
data/README.md
CHANGED
@@ -1,44 +1,93 @@
-caRuby Tissue
-
+Galena caRuby Tissue example
+============================
 
-
-
-
-
-**License**: MIT License
-**Latest Version**: 1.2.1
-**Release Date**: November 23rd 2010
+Synopsis
+--------
+This directory contains the caRuby Tissue example for the hypothetical Galena Cancer Center.
+The example files are a useful template for building your own migrator.
 
-
+The Galena example demonstrates how to load the content of a custom tissue bank into caTissue.
+The use cases illustrate several common migration impediments:
 
-
-
+* Different terminology than caTissue
+* Different associations than caTissue
+* Incomplete input for caTissue
+* Denormalized input
+* Inconsistent input
+* Input data scrubbing
+* Aliquot inference
+* Pre-defined caTissue protocol
 
-
+Setup
+-----
+1. Run the `crtexample --list` command to display the Galena example location.
 
-2.
+2. Copy the example into a location of your choosing.
 
-3.
+3. Configure a caTissue client to connect to a test caTissue instance, as described in the
+caTissue Technical Guide.
 
-
-
+4. Define the caRuby Tissue access property file as described in the configuration
+[FAQ](how-do-i-configure-caruby-to-work-with-catissue).
 
-
+Migration
+---------
+The example migration input data resides in the `data` directory.
+Each CSV input file holds one row for each specimen.
 
-
-
+Each example has a field mapping configuration in the `conf/migration` directory.
+For example, the `simple.csv` input file is migrated into caTissue using the
+`simple_migration.yaml` configuration file.
 
-
+Migrate the Galena `simple` example as follows:
 
-
----------
+1. Open a console in the copied Galena example location.
 
-
-- Initial public release
+2. Run the following:
 
-
-
+bin/seed
+
+This command initializes the administrative objects in the Galena test database,
+including the Galena collection protocol, site, cancer center, tissue bank and coordinator.
+
+3. Run the following:
+
+crtmigrate --target TissueSpecimen --mapping conf/migration/simple_fields.yaml data/simple.csv
+
+This command migrates the CSV record in the `simple.csv` input file into a caTissue
+`TissueSpecimen` based on the `simple_fields.yaml` mapping file.
+The command will take a couple of minutes to finish, since the less information
+you provide caRuby the more it works to fill in the missing bits. In the meantime,
+peruse the configuration and data files to see which data are migrated and
+where this data ends up in caTissue.
+
+4. Open the caTissue application on the test server and verify the content of the
+Galena CP collection protocol.
+
+The other examples are run in a similar manner. Each example demonstrates different
+features of the caRuby Migration utility as follows:
+
+* simple - a good starting point with limited input fields
+* minimal - the fewest possible input fields without writing custom Ruby shim code
+* general - lots of input fields, no custom Ruby code
+* filter - a smattering of custom Ruby shim code to convert input values to caTissue values
+* frozen - an example demonstrating how to import storage locations
+
+Try running an example with the `--debug` flag and look at the `log/migration.log` file to see
+what caRuby is up to behind the scenes (hint: a lot!).
+
+Input data
+----------
+The sample Galena Tissue Bank CSV input files hold one row for each specimen.
+Common fields are as follows:
 
-
-
-
+* MRN - Patient Medical Record Number
+* Initials - Patient name initials
+* Frozen? - Flag indicating whether the specimen is frozen
+* SPN - Surgical Pathology Number
+* Collection Date - Date the specimen was acquired by the tissue bank
+* Received Date - Date the specimen was donated by the participant
+* Quantity - Amount collected
+* Box - Tissue storage container
+* X - the tissue box column
+* Y - the tissue box row
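The README's description of the input (one CSV row per specimen, keyed by headings such as MRN and Quantity) can be pictured with a small Ruby sketch. This is only an illustration using Ruby's standard `csv` library, not the caRuby migrator itself; the sample row and its values are invented, and the real `data/*.csv` files may use different columns or formats.

```ruby
require 'csv'

# Hypothetical sample input: the headings follow the README field list,
# but the values are invented for illustration.
SAMPLE = <<~CSV_TEXT
  MRN,Initials,Frozen?,SPN,Collection Date,Quantity
  1234,JS,Yes,SP-01,2010-11-23,4.0
CSV_TEXT

# Parse with the first record as the heading record, so each row
# becomes a hash keyed by the CSV heading.
rows = CSV.parse(SAMPLE, headers: true).map(&:to_h)

rows.each do |row|
  puts "MRN #{row['MRN']}: quantity #{row['Quantity']}, frozen? #{row['Frozen?']}"
end
```

Each hash key here corresponds to a heading that a field mapping file would pair with a caTissue attribute path.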
data/bin/crtmigrate
CHANGED
File without changes
|
data/bin/seed
ADDED
@@ -0,0 +1,26 @@
+#!/usr/bin/env jruby
+#
+# seed: seeds the Galena example administrative objects in the database
+#
+
+# load the caRuby Tissue gem
+require 'rubygems'
+begin
+  gem 'caruby-tissue'
+rescue LoadError
+  # if the gem is not available, then try a local source
+  $:.unshift File.join(File.dirname(__FILE__), '..', '..', '..', 'lib')
+end
+
+# add the Galena lib to the path
+$:.unshift File.join(File.dirname(__FILE__), '..', 'lib')
+
+# the default log file
+DEF_LOG_FILE = 'log/migration.log'
+
+require 'catissue'
+require 'catissue/cli/command'
+
+require 'galena/seed/defaults'
+
+CaTissue::CLI::Command.new.start { Galena::Seed.defaults.ensure_exists }
data/bin/seed.rb
ADDED
@@ -0,0 +1,43 @@
+#!/usr/bin/env jruby
+#
+# crtseed-galena: seeds the Galena example administrative objects in the database
+#
+# == Usage
+#
+# catissue-seed-galena.rb [options] file
+#
+# --help, -h:
+# print this help message and exit
+#
+# --log file, -l file:
+# log file (default ./log/migration.log)
+#
+# --debug, -d:
+# print debug messages to log (optional)
+#
+# == License
+#
+# This program is licensed under the terms of the +LEGAL+ file in
+# the source distribution.
+
+# load the required gems
+require 'rubygems'
+begin
+  gem 'caruby-tissue'
+rescue LoadError
+  # if the gem is not available, then try a local source
+  $:.unshift File.join(File.dirname(__FILE__), '..', '..', '..', 'lib')
+end
+
+# add the Galena lib to the path
+$:.unshift File.join(File.dirname(__FILE__), '..', 'lib')
+
+# the default log file
+DEF_LOG_FILE = 'log/migration.log'
+
+require 'catissue'
+require 'catissue/cli/command'
+
+require 'galena/seed/defaults'
+
+CaTissue::Command.new.start { Galena::Seed.defaults.ensure_exists }
data/conf/migration/filter_fields.yaml
ADDED
@@ -0,0 +1,7 @@
+Protocol: CollectionProtocol.short_title
+MRN: ParticipantMedicalIdentifier.medical_record_number
+Initials: Participant.first_name, Participant.last_name
+Frozen?: TissueSpecimen.specimen_type
+SPN: SpecimenCollectionGroup.surgical_pathology_number
+Collection Date: ReceivedEventParameters.timestamp
+Quantity: TissueSpecimen.initial_quantity
File without changes
|
data/conf/migration/frozen_fields.yaml
ADDED
@@ -0,0 +1,11 @@
+# This is the Galena frozen example migration field mapping file.
+# This example extends the simple migration with storage fields
+# and the frozen defaults.
+
+MRN: ParticipantMedicalIdentifier.medical_record_number, Participant.last_name
+SPN: SpecimenCollectionGroup.surgical_pathology_number
+Collection Date: ReceivedEventParameters.timestamp
+Quantity: TissueSpecimen.initial_quantity
+Box: SpecimenPosition.container.name
+X: SpecimenPosition.position_dimension_one
+Y: SpecimenPosition.position_dimension_two
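A mapping such as `Box: SpecimenPosition.container.name` names a nested attribute path. As a rough sketch of what following such a path could look like, the Ruby below walks the intermediate attributes and sets the last one. The `Container` and `Position` structs and the `assign` helper are hypothetical stand-ins invented for this illustration; they are not the caTissue domain classes, and this is not how caRuby necessarily implements the migration.

```ruby
# Stand-in classes for illustration only (not the caTissue domain classes).
Container = Struct.new(:name)
Position = Struct.new(:container, :position_dimension_one, :position_dimension_two)

# Hypothetical helper: follows a dotted attribute path from the given
# object and assigns the value to the final attribute in the path.
def assign(object, attribute_path, value)
  *parents, last = attribute_path.split('.')
  # Walk the intermediate attributes, e.g. "container" in "container.name".
  owner = parents.inject(object) { |current, attribute| current.public_send(attribute) }
  owner.public_send("#{last}=", value)
end

position = Position.new(Container.new, nil, nil)
assign(position, 'container.name', 'Box 1')
assign(position, 'position_dimension_one', 2)
assign(position, 'position_dimension_two', 3)

puts position.container.name
puts position.position_dimension_one
```

The sketch assumes the intermediate object (the container) already exists; creating missing intermediate objects is outside its scope.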
File without changes
|
data/conf/migration/general_fields.yaml
ADDED
@@ -0,0 +1,44 @@
+# This is the Galena general example migration field mapping file in the format:
+# input field: caTissue property
+
+Protocol: CollectionProtocol.short_title
+
+Collection Site: Site.name
+
+SPN: SpecimenCollectionGroup.surgical_pathology_number
+
+PPI: CollectionProtocolRegistration.protocol_participant_identifier
+
+MRN: ParticipantMedicalIdentifier.medical_record_number
+
+First Name: Participant.first_name
+
+Last Name: Participant.last_name
+
+Collection Point: CollectionProtocolEvent.event_point
+
+Receiver: ReceivedEventParameters.user.login_name
+
+Received Timestamp: ReceivedEventParameters.timestamp
+
+Collector: CollectionEventParameters.user.login_name
+
+Collected Timestamp: CollectionEventParameters.timestamp
+
+Diagnosis: SpecimenCollectionGroup.clinical_diagnosis
+
+Anatomic Site: SpecimenCharacteristics.tissue_site
+
+Development: TissueSpecimen.pathological_status
+
+Label: TissueSpecimen.label
+
+Type: TissueSpecimen.specimen_type
+
+Quantity: TissueSpecimen.initial_quantity
+
+Box: SpecimenPosition.container.name
+
+X: SpecimenPosition.position_dimension_one
+
+Y: SpecimenPosition.position_dimension_two
File without changes
|
data/conf/migration/simple_fields.yaml
ADDED
@@ -0,0 +1,30 @@
+# This is the Galena simple example migration field mapping file.
+# Each entry is in the format:
+# heading: paths
+# The heading is the CSV file label in the leading heading record.
+# The paths is a comma-separated list of CaTissue domain object attribute paths specifying
+# how the CSV heading maps to a CaTissue attribute value. Each attribute path is in the format:
+# class.attribute[.attribute]
+# where class is a CaTissue class and attribute is a Ruby accessor method defined in the CaTissue class.
+# The accessor method can be one of the following:
+# * a Java property name, e.g. lastName
+# * the pre-defined Rubyized underscore form of the Java property, e.g. last_name
+
+# The collection protocol title for the specimen
+Protocol: CollectionProtocol.short_title
+
+# The input MRN field is used for the PMI MRN, CPR PPI and Participant last name.
+# Since the migration source is for a single collection site, the MRN is unique within the protocol.
+# The simple input does not have a Participant name field. It is a useful practice to set the caTissue
+# name to some value, since caTissue uses the name for display and sorting. For want of a better
+# value, the name is set to the MRN.
+MRN: ParticipantMedicalIdentifier.medical_record_number, Participant.last_name
+
+# The input SPN field is the SCG SPN value.
+SPN: SpecimenCollectionGroup.surgical_pathology_number
+
+# The input Collection Date is the Specimen received timestamp.
+Collection Date: ReceivedEventParameters.timestamp
+
+# The input Quantity is the target Specimen initial quantity.
+Quantity: Specimen.initial_quantity
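The `heading: paths` format described in the mapping file comments can be sketched in plain Ruby. The snippet below is an illustration under stated assumptions, not caRuby's own parser: `parse_mapping` and `underscore` are hypothetical helpers, and the excerpt deliberately uses `Participant.lastName` to show the Java property form being normalized to the underscore form.

```ruby
require 'yaml'

# Hypothetical excerpt in the "heading: paths" format. The MRN entry uses the
# Java property form (lastName) on purpose, to show the normalization step.
MAPPING_TEXT = <<~MAPPING
  Protocol: CollectionProtocol.short_title
  MRN: ParticipantMedicalIdentifier.medical_record_number, Participant.lastName
  Quantity: Specimen.initial_quantity
MAPPING

# Converts a Java-style property name (lastName) to the underscore form (last_name).
def underscore(name)
  name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
end

# Parses the mapping text into { heading => [[class_name, [attribute, ...]], ...] }.
def parse_mapping(text)
  YAML.safe_load(text).each_with_object({}) do |(heading, paths), result|
    # The paths value is a comma-separated list of class.attribute[.attribute] paths.
    result[heading] = paths.split(/\s*,\s*/).map do |path|
      class_name, *attributes = path.split('.')
      [class_name, attributes.map { |attribute| underscore(attribute) }]
    end
  end
end

mapping = parse_mapping(MAPPING_TEXT)
mapping.each { |heading, targets| puts "#{heading} -> #{targets.inspect}" }
```

A single heading can fan out to several targets, which is how the MRN entry feeds both the medical record number and the participant last name.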
File without changes
|
File without changes
|
File without changes
|
data/examples/galena/README.md
CHANGED
@@ -27,8 +27,8 @@ Setup
 3. Configure a caTissue client to connect to a test caTissue instance, as described in the
 caTissue Technical Guide.
 
-4. Define the caRuby Tissue access property file as described in
-[
+4. Define the caRuby Tissue access property file as described in the configuration
+[FAQ](how-do-i-configure-caruby-to-work-with-catissue).
 
 Migration
 ---------
@@ -45,9 +45,49 @@ Migrate the Galena `simple` example as follows:
 
 2. Run the following:
 
-
+bin/seed
+
+This command initializes the administrative objects in the Galena test database,
+including the Galena collection protocol, site, cancer center, tissue bank and coordinator.
+
+3. Run the following:
 
-
-
+crtmigrate --target TissueSpecimen --mapping conf/migration/simple_fields.yaml data/simple.csv
+
+This command migrates the CSV record in the `simple.csv` input file into a caTissue
+`TissueSpecimen` based on the `simple_fields.yaml` mapping file.
+The command will take a couple of minutes to finish, since the less information
+you provide caRuby the more it works to fill in the missing bits. In the meantime,
+peruse the configuration and data files to see which data are migrated and
+where this data ends up in caTissue.
+
+4. Open the caTissue application on the test server and verify the content of the
+Galena CP collection protocol.
 
-The other examples are run in a similar manner.
+The other examples are run in a similar manner. Each example demonstrates different
+features of the caRuby Migration utility as follows:
+
+* simple - a good starting point with limited input fields
+* minimal - the fewest possible input fields without writing custom Ruby shim code
+* general - lots of input fields, no custom Ruby code
+* filter - a smattering of custom Ruby shim code to convert input values to caTissue values
+* frozen - an example demonstrating how to import storage locations
+
+Try running an example with the `--debug` flag and look at the `log/migration.log` file to see
+what caRuby is up to behind the scenes (hint: a lot!).
+
+Input data
+----------
+The sample Galena Tissue Bank CSV input files hold one row for each specimen.
+Common fields are as follows:
+
+* MRN - Patient Medical Record Number
+* Initials - Patient name initials
+* Frozen? - Flag indicating whether the specimen is frozen
+* SPN - Surgical Pathology Number
+* Collection Date - Date the specimen was acquired by the tissue bank
+* Received Date - Date the specimen was donated by the participant
+* Quantity - Amount collected
+* Box - Tissue storage container
+* X - the tissue box column
+* Y - the tissue box row
data/examples/galena/bin/seed
ADDED
@@ -0,0 +1,26 @@
+#!/usr/bin/env jruby
+#
+# seed: seeds the Galena example administrative objects in the database
+#
+
+# load the caRuby Tissue gem
+require 'rubygems'
+begin
+  gem 'caruby-tissue'
+rescue LoadError
+  # if the gem is not available, then try a local source
+  $:.unshift File.join(File.dirname(__FILE__), '..', '..', '..', 'lib')
+end
+
+# add the Galena lib to the path
+$:.unshift File.join(File.dirname(__FILE__), '..', 'lib')
+
+# the default log file
+DEF_LOG_FILE = 'log/migration.log'
+
+require 'catissue'
+require 'catissue/cli/command'
+
+require 'galena/seed/defaults'
+
+CaTissue::CLI::Command.new.start { Galena::Seed.defaults.ensure_exists }
data/examples/galena/conf/migration/frozen_fields.yaml
CHANGED
@@ -2,6 +2,7 @@
 # This example extends the simple migration with storage fields
 # and the frozen defaults.
 
+Protocol: CollectionProtocol.short_title
 MRN: ParticipantMedicalIdentifier.medical_record_number, Participant.last_name
 SPN: SpecimenCollectionGroup.surgical_pathology_number
 Collection Date: ReceivedEventParameters.timestamp