caruby-tissue 1.2.1 → 1.2.2
- data/History.txt +4 -0
- data/LICENSE +1 -1
- data/README.md +79 -30
- data/bin/crtmigrate +2 -2
- data/{examples/galena/bin → bin}/migrate.rb +0 -0
- data/bin/seed +26 -0
- data/bin/seed.rb +43 -0
- data/conf/extract/simple_fields.yaml +4 -0
- data/conf/migration/filter_fields.yaml +7 -0
- data/{examples/galena/conf → conf}/migration/filter_migration.yaml +0 -0
- data/conf/migration/frozen_fields.yaml +11 -0
- data/{examples/galena/conf → conf}/migration/frozen_migration.yaml +0 -0
- data/conf/migration/general_fields.yaml +44 -0
- data/{examples/galena/conf → conf}/migration/general_migration.yaml +0 -0
- data/conf/migration/simple_fields.yaml +30 -0
- data/{examples/galena/conf → conf}/migration/simple_migration.yaml +0 -0
- data/{examples/galena/conf → conf}/migration/small_fields.yaml +0 -0
- data/{examples/galena/conf → conf}/migration/small_migration.yaml +0 -0
- data/examples/galena/README.md +46 -6
- data/examples/galena/bin/seed +26 -0
- data/examples/galena/conf/migration/frozen_fields.yaml +1 -0
- data/examples/galena/conf/migration/general_fields.yaml +2 -0
- data/examples/galena/data/filter.csv +1 -1
- data/examples/galena/data/frozen.csv +1 -1
- data/examples/galena/data/general.csv +1 -1
- data/examples/galena/doc/CaTissue.html +2 -2
- data/examples/galena/doc/CaTissue/Participant.html +1 -1
- data/examples/galena/doc/CaTissue/SpecimenCollectionGroup.html +1 -1
- data/examples/galena/doc/CaTissue/StorageContainer.html +6 -6
- data/examples/galena/doc/CaTissue/TissueSpecimen.html +1 -1
- data/examples/galena/doc/Galena.html +4 -122
- data/examples/galena/doc/Galena/Seed.html +1 -1
- data/examples/galena/doc/Galena/Seed/Defaults.html +28 -24
- data/examples/galena/doc/_index.html +1 -8
- data/examples/galena/doc/class_list.html +1 -1
- data/examples/galena/doc/file.README.html +52 -7
- data/examples/galena/doc/index.html +52 -7
- data/examples/galena/doc/method_list.html +9 -25
- data/examples/galena/doc/top-level-namespace.html +1 -1
- data/examples/galena/lib/galena/migration/frozen_shims.rb +4 -15
- data/examples/galena/lib/galena/seed/defaults.rb +16 -4
- data/{examples/galena/lib → lib}/README.html +0 -0
- data/lib/catissue/cli/command.rb +6 -9
- data/lib/catissue/cli/migrate.rb +11 -10
- data/lib/catissue/cli/smoke.rb +5 -5
- data/lib/catissue/database.rb +31 -8
- data/lib/catissue/domain/abstract_specimen.rb +1 -1
- data/lib/catissue/domain/collection_protocol.rb +29 -13
- data/lib/catissue/domain/participant_medical_identifier.rb +1 -1
- data/lib/catissue/domain/site.rb +3 -0
- data/lib/catissue/domain/specimen.rb +17 -14
- data/lib/catissue/domain/specimen_collection_group.rb +2 -5
- data/lib/catissue/extract/delta.rb +2 -6
- data/lib/catissue/migration/migrator.rb +6 -0
- data/lib/catissue/resource.rb +5 -2
- data/lib/catissue/util/log.rb +3 -3
- data/lib/catissue/version.rb +1 -1
- data/{examples/galena/lib → lib}/galena.rb +0 -0
- data/{examples/galena/bin → lib/galena/cli}/seed.rb +1 -1
- data/lib/galena/migration/filter_shims.rb +43 -0
- data/lib/galena/migration/frozen_shims.rb +53 -0
- data/lib/galena/seed/defaults.rb +109 -0
- metadata +27 -17
- data/examples/galena/doc/CaTissue/CollectionProtocolRegistration.html +0 -181
data/History.txt
CHANGED
data/LICENSE
CHANGED
data/README.md
CHANGED
@@ -1,44 +1,93 @@
-caRuby Tissue
-
+Galena caRuby Tissue example
+============================

-
-
-
-
-**License**: MIT License
-**Latest Version**: 1.2.1
-**Release Date**: November 23rd 2010
+Synopsis
+--------
+This directory contains the caRuby Tissue example for the hypothetical Galena Cancer Center.
+The example files are a useful template for building your own migrator.

-
+The Galena example demonstrates how to load the content of a custom tissue bank into caTissue.
+The use cases illustrate several common migration impediments:

-
-
+* Different terminology than caTissue
+* Different associations than caTissue
+* Incomplete input for caTissue
+* Denormalized input
+* Inconsistent input
+* Input data scrubbing
+* Aliquot inference
+* Pre-defined caTissue protocol

-
+Setup
+-----
+1. Run the `crtexample --list` command to display the Galena example location.

-2.
+2. Copy the example into a location of your choosing.

-3.
+3. Configure a caTissue client to connect to a test caTissue instance, as described in the
+caTissue Technical Guide.

-
-
+4. Define the caRuby Tissue access property file as described in the configuration
+[FAQ](how-do-i-configure-caruby-to-work-with-catissue).

-
+Migration
+---------
+The example migration input data resides in the `data` directory.
+Each CSV input file holds one row for each specimen.

-
-
+Each example has a field mapping configuration in the `conf/migration` directory.
+For example, the `simple.csv` input file is migrated into caTissue using the
+`simple_migration.yaml` configuration file.

-
+Migrate the Galena `simple` example as follows:

-
----------
+1. Open a console in the copied Galena example location.

-
-- Initial public release
+2. Run the following:

-
-
+bin/seed
+
+This command initializes the administrative objects in the Galena test database,
+including the Galena collection protocol, site, cancer center, tissue bank and coordinator.
+
+3. Run the following:
+
+crtmigrate --target TissueSpecimen --mapping conf/migration/simple_fields.yaml data/simple.csv
+
+This command migrates the CSV record in the `simple.csv` input file into a caTissue
+`TissueSpecimen` based on the `simple_fields.yaml` mapping file.
+The command will take a couple of minutes to finish, since the less information
+you provide caRuby the more it works to fill in the missing bits. In the meantime,
+peruse the configuration and data files to see which data are migrated and
+where this data ends up in caTissue.
+
+4. Open the caTissue application on the test server and verify the content of the
+Galena CP collection protocol.
+
+The other examples are run in a similar manner. Each example demonstrates different
+features of the caRuby Migration utility as follows:
+
+* simple - a good starting point with limited input fields
+* minimal - the fewest possible input fields without writing custom Ruby shim code
+* general - lots of input fields, no custom Ruby code
+* filter - a smattering of custom Ruby shim code to convert input values to caTissue values
+* frozen - an example demonstrating how to import storage locations
+
+Try running an example with the `--debug` flag and look at the `log/migration.log` file to see
+what caRuby is up to behind the scenes (hint: a lot!).
+
+Input data
+----------
+The sample Galena Tissue Bank CSV input files hold one row for each specimen.
+Common fields are as follows:

-
-
-
+* MRN - Patient Medical Record Number
+* Initials - Patient name initials
+* Frozen? - Flag indicating whether the specimen is frozen
+* SPN - Surgical Pathology Number
+* Collection Date - Date the specimen was acquired by the tissue bank
+* Received Date - Date the specimen was donated by the participant
+* Quantity - Amount collected
+* Box - Tissue storage container
+* X - the tissue box column
+* Y - the tissue box row
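Reviewer note: the README describes CSV input with a leading heading record and one row per specimen. A minimal sketch of reading such a file with Ruby's standard `csv` library follows. The sample values and the `specimen_rows` helper are hypothetical and are not part of caRuby; only the heading names come from the field list in the README.

```ruby
require 'csv'

# Hypothetical sample input. The headings follow the README field list,
# but the values are made up for illustration.
SAMPLE = <<~ROWS
  MRN,SPN,Collection Date,Quantity
  1234,SP-01,2010-11-23,4
ROWS

# Parses CSV text with a leading heading record into one hash per specimen row.
# This helper is an illustration only, not a caRuby method.
def specimen_rows(text)
  CSV.parse(text, headers: true).map(&:to_h)
end

rows = specimen_rows(SAMPLE)
rows.each { |row| puts "MRN #{row['MRN']}: quantity #{row['Quantity']}" }
```

Each row is keyed by its CSV heading, which is the same heading the field mapping files use on the left-hand side of each entry.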
data/bin/crtmigrate
CHANGED
File without changes
data/bin/seed
ADDED
@@ -0,0 +1,26 @@
+#!/usr/bin/env jruby
+#
+# seed: seeds the Galena example administrative objects in the database
+#
+
+# load the caRuby Tissue gem
+require 'rubygems'
+begin
+  gem 'caruby-tissue'
+rescue LoadError
+  # if the gem is not available, then try a local source
+  $:.unshift File.join(File.dirname(__FILE__), '..', '..', '..', 'lib')
+end
+
+# add the Galena lib to the path
+$:.unshift File.join(File.dirname(__FILE__), '..', 'lib')
+
+# the default log file
+DEF_LOG_FILE = 'log/migration.log'
+
+require 'catissue'
+require 'catissue/cli/command'
+
+require 'galena/seed/defaults'
+
+CaTissue::CLI::Command.new.start { Galena::Seed.defaults.ensure_exists }
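Reviewer note: the script above activates the installed gem and falls back to a local source directory when the gem is missing. The same pattern can be sketched in isolation as follows. The `activate_or_fallback` helper and the gem name `no-such-gem-for-illustration` are hypothetical, chosen so that the fallback branch runs; the sketch relies only on `Kernel#gem` raising a `LoadError` subclass for a gem that is not installed.

```ruby
require 'rubygems'

# Tries to activate the named gem. If the gem is not installed, prepends a
# local source directory to the load path instead. Returns :gem or :local.
# This helper is an illustration only, not part of caRuby.
def activate_or_fallback(gem_name, local_lib)
  gem gem_name
  :gem
rescue LoadError
  $LOAD_PATH.unshift(local_lib) unless $LOAD_PATH.include?(local_lib)
  :local
end

# Resolve a lib directory next to this script's parent directory.
local_lib = File.expand_path(File.join('..', 'lib'), __dir__ || Dir.pwd)

# The gem name below is a hypothetical placeholder that should not exist.
source = activate_or_fallback('no-such-gem-for-illustration', local_lib)
puts "loaded from #{source}"
```

Returning a symbol makes it easy to log which source was used, something the released script does not do.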
data/bin/seed.rb
ADDED
@@ -0,0 +1,43 @@
+#!/usr/bin/env jruby
+#
+# crtseed-galena: seeds the Galena example administrative objects in the database
+#
+# == Usage
+#
+# catissue-seed-galena.rb [options] file
+#
+# --help, -h:
+# print this help message and exit
+#
+# --log file, -l file:
+# log file (default ./log/migration.log)
+#
+# --debug, -d:
+# print debug messages to log (optional)
+#
+# == License
+#
+# This program is licensed under the terms of the +LEGAL+ file in
+# the source distribution.
+
+# load the required gems
+require 'rubygems'
+begin
+  gem 'caruby-tissue'
+rescue LoadError
+  # if the gem is not available, then try a local source
+  $:.unshift File.join(File.dirname(__FILE__), '..', '..', '..', 'lib')
+end
+
+# add the Galena lib to the path
+$:.unshift File.join(File.dirname(__FILE__), '..', 'lib')
+
+# the default log file
+DEF_LOG_FILE = 'log/migration.log'
+
+require 'catissue'
+require 'catissue/cli/command'
+
+require 'galena/seed/defaults'
+
+CaTissue::Command.new.start { Galena::Seed.defaults.ensure_exists }
data/conf/migration/filter_fields.yaml
ADDED
@@ -0,0 +1,7 @@
+Protocol: CollectionProtocol.short_title
+MRN: ParticipantMedicalIdentifier.medical_record_number
+Initials: Participant.first_name, Participant.last_name
+Frozen?: TissueSpecimen.specimen_type
+SPN: SpecimenCollectionGroup.surgical_pathology_number
+Collection Date: ReceivedEventParameters.timestamp
+Quantity: TissueSpecimen.initial_quantity
File without changes
data/conf/migration/frozen_fields.yaml
ADDED
@@ -0,0 +1,11 @@
+# This is the Galena frozen example migration field mapping file.
+# This example extends the simple migration with storage fields
+# and the frozen defaults.
+
+MRN: ParticipantMedicalIdentifier.medical_record_number, Participant.last_name
+SPN: SpecimenCollectionGroup.surgical_pathology_number
+Collection Date: ReceivedEventParameters.timestamp
+Quantity: TissueSpecimen.initial_quantity
+Box: SpecimenPosition.container.name
+X: SpecimenPosition.position_dimension_one
+Y: SpecimenPosition.position_dimension_two
File without changes
data/conf/migration/general_fields.yaml
ADDED
@@ -0,0 +1,44 @@
+# This is the Galena general example migration field mapping file in the format:
+# input field: caTissue property
+
+Protocol: CollectionProtocol.short_title
+
+Collection Site: Site.name
+
+SPN: SpecimenCollectionGroup.surgical_pathology_number
+
+PPI: CollectionProtocolRegistration.protocol_participant_identifier
+
+MRN: ParticipantMedicalIdentifier.medical_record_number
+
+First Name: Participant.first_name
+
+Last Name: Participant.last_name
+
+Collection Point: CollectionProtocolEvent.event_point
+
+Receiver: ReceivedEventParameters.user.login_name
+
+Received Timestamp: ReceivedEventParameters.timestamp
+
+Collector: CollectionEventParameters.user.login_name
+
+Collected Timestamp: CollectionEventParameters.timestamp
+
+Diagnosis: SpecimenCollectionGroup.clinical_diagnosis
+
+Anatomic Site: SpecimenCharacteristics.tissue_site
+
+Development: TissueSpecimen.pathological_status
+
+Label: TissueSpecimen.label
+
+Type: TissueSpecimen.specimen_type
+
+Quantity: TissueSpecimen.initial_quantity
+
+Box: SpecimenPosition.container.name
+
+X: SpecimenPosition.position_dimension_one
+
+Y: SpecimenPosition.position_dimension_two
File without changes
data/conf/migration/simple_fields.yaml
ADDED
@@ -0,0 +1,30 @@
+# This is the Galena simple example migration field mapping file.
+# Each entry is in the format:
+# heading: paths
+# The heading is the CSV file label in the leading heading record.
+# The paths is a comma-separated list of CaTissue domain object attribute paths specifying
+# how the CSV heading maps to a CaTissue attribute value. Each attribute path is in the format:
+# class.attribute[.attribute]
+# where class is a CaTissue class and attribute is a Ruby accessor method defined in the CaTissue class.
+# The accessor method can be one of the following:
+# * a Java property name, e.g. lastName
+# * the pre-defined Rubyized underscore form of the Java property, e.g. last_name
+
+# The collection protocol title for the specimen
+Protocol: CollectionProtocol.short_title
+
+# The input MRN field is used for the PMI MRN, CPR PPI and Participant last name.
+# Since the migration source is for a single collection site, the MRN is unique within the protocol.
+# The simple input does not have a Participant name field. It is a useful practice to set the caTissue
+# name to some value, since caTissue uses the name for display and sorting. For want of a better
+# value, the name is set to the MRN.
+MRN: ParticipantMedicalIdentifier.medical_record_number, Participant.last_name
+
+# The input SPN field is the SCG SPN value.
+SPN: SpecimenCollectionGroup.surgical_pathology_number
+
+# The input Collection Date is the Specimen received timestamp.
+Collection Date: ReceivedEventParameters.timestamp
+
+# The input Quantity is the target Specimen initial quantity.
+Quantity: Specimen.initial_quantity
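Reviewer note: the mapping file comments describe each entry as `heading: paths`, where every path has the form `class.attribute[.attribute]` and an attribute may be given as a Java property name or its underscore form. The sketch below illustrates that format with plain string handling. It is not the caRuby migrator's parser; the `underscore` and `parse_entry` helpers are hypothetical, and the example entry uses `lastName` only to show the conversion.

```ruby
# Converts a Java property name such as lastName to the underscore form
# last_name. Names already in underscore form are returned unchanged.
def underscore(name)
  name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
end

# Parses one "heading: paths" entry into [heading, paths], where each path
# is a hash holding the class name and the list of attribute names.
# This helper is an illustration only, not a caRuby method.
def parse_entry(line)
  heading, paths = line.split(':', 2)
  parsed = paths.split(',').map do |path|
    klass, *attributes = path.strip.split('.')
    { class: klass, attributes: attributes.map { |a| underscore(a) } }
  end
  [heading.strip, parsed]
end

entry = 'MRN: ParticipantMedicalIdentifier.medical_record_number, Participant.lastName'
heading, paths = parse_entry(entry)
puts heading
paths.each { |path| puts "#{path[:class]} -> #{path[:attributes].join('.')}" }
```

Splitting on the first colon only keeps headings such as `Collection Date` intact, and a path with more than one attribute, for example `SpecimenPosition.container.name`, yields a multi-element attribute list.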
File without changes
File without changes
File without changes
data/examples/galena/README.md
CHANGED
@@ -27,8 +27,8 @@ Setup
 3. Configure a caTissue client to connect to a test caTissue instance, as described in the
 caTissue Technical Guide.

-4. Define the caRuby Tissue access property file as described in
-[
+4. Define the caRuby Tissue access property file as described in the configuration
+[FAQ](how-do-i-configure-caruby-to-work-with-catissue).

 Migration
 ---------
@@ -45,9 +45,49 @@ Migrate the Galena `simple` example as follows:

 2. Run the following:

-
+bin/seed
+
+This command initializes the administrative objects in the Galena test database,
+including the Galena collection protocol, site, cancer center, tissue bank and coordinator.
+
+3. Run the following:

-
-
+crtmigrate --target TissueSpecimen --mapping conf/migration/simple_fields.yaml data/simple.csv
+
+This command migrates the CSV record in the `simple.csv` input file into a caTissue
+`TissueSpecimen` based on the `simple_fields.yaml` mapping file.
+The command will take a couple of minutes to finish, since the less information
+you provide caRuby the more it works to fill in the missing bits. In the meantime,
+peruse the configuration and data files to see which data are migrated and
+where this data ends up in caTissue.
+
+4. Open the caTissue application on the test server and verify the content of the
+Galena CP collection protocol.

-The other examples are run in a similar manner.
+The other examples are run in a similar manner. Each example demonstrates different
+features of the caRuby Migration utility as follows:
+
+* simple - a good starting point with limited input fields
+* minimal - the fewest possible input fields without writing custom Ruby shim code
+* general - lots of input fields, no custom Ruby code
+* filter - a smattering of custom Ruby shim code to convert input values to caTissue values
+* frozen - an example demonstrating how to import storage locations
+
+Try running an example with the `--debug` flag and look at the `log/migration.log` file to see
+what caRuby is up to behind the scenes (hint: a lot!).
+
+Input data
+----------
+The sample Galena Tissue Bank CSV input files hold one row for each specimen.
+Common fields are as follows:
+
+* MRN - Patient Medical Record Number
+* Initials - Patient name initials
+* Frozen? - Flag indicating whether the specimen is frozen
+* SPN - Surgical Pathology Number
+* Collection Date - Date the specimen was acquired by the tissue bank
+* Received Date - Date the specimen was donated by the participant
+* Quantity - Amount collected
+* Box - Tissue storage container
+* X - the tissue box column
+* Y - the tissue box row
data/examples/galena/bin/seed
ADDED
@@ -0,0 +1,26 @@
+#!/usr/bin/env jruby
+#
+# seed: seeds the Galena example administrative objects in the database
+#
+
+# load the caRuby Tissue gem
+require 'rubygems'
+begin
+  gem 'caruby-tissue'
+rescue LoadError
+  # if the gem is not available, then try a local source
+  $:.unshift File.join(File.dirname(__FILE__), '..', '..', '..', 'lib')
+end
+
+# add the Galena lib to the path
+$:.unshift File.join(File.dirname(__FILE__), '..', 'lib')
+
+# the default log file
+DEF_LOG_FILE = 'log/migration.log'
+
+require 'catissue'
+require 'catissue/cli/command'
+
+require 'galena/seed/defaults'
+
+CaTissue::CLI::Command.new.start { Galena::Seed.defaults.ensure_exists }
data/examples/galena/conf/migration/frozen_fields.yaml
CHANGED
@@ -2,6 +2,7 @@
 # This example extends the simple migration with storage fields
 # and the frozen defaults.

+Protocol: CollectionProtocol.short_title
 MRN: ParticipantMedicalIdentifier.medical_record_number, Participant.last_name
 SPN: SpecimenCollectionGroup.surgical_pathology_number
 Collection Date: ReceivedEventParameters.timestamp