ncs_mdes 0.5.0 → 0.6.0
- data/.rspec +2 -0
- data/.yardopts +1 -1
- data/CHANGELOG.md +7 -1
- data/HEURISTICS.md +111 -0
- data/bin/mdes-console +2 -0
- data/documents/2.1/NCS_Transmission_Schema_2.1.00.00.xsd +27737 -0
- data/documents/2.1/disposition_codes.yml +1758 -0
- data/documents/2.1/extract_disposition_codes.rb +56 -0
- data/documents/2.1/heuristic_overrides.yml +417 -0
- data/documents/2.2/NCS_Transmission_Schema_2.2.01.00.xsd +51765 -0
- data/documents/2.2/disposition_codes.yml +1758 -0
- data/documents/2.2/extract_disposition_codes.rb +57 -0
- data/documents/2.2/heuristic_overrides.yml +452 -0
- data/lib/ncs_navigator/mdes/source_documents.rb +4 -0
- data/lib/ncs_navigator/mdes/specification.rb +1 -1
- data/lib/ncs_navigator/mdes/variable.rb +6 -4
- data/lib/ncs_navigator/mdes/version.rb +1 -1
- data/spec/ncs_navigator/mdes/source_documents_spec.rb +41 -17
- data/spec/ncs_navigator/mdes/specification_spec.rb +65 -20
- data/spec/ncs_navigator/mdes/variable_spec.rb +5 -4
- metadata +32 -23
data/.rspec
ADDED
data/.yardopts
CHANGED
data/CHANGELOG.md
CHANGED
@@ -1,6 +1,12 @@
|
|
1
1
|
NCS Navigator MDES Module history
|
2
2
|
=================================
|
3
3
|
|
4
|
+
0.6.0
|
5
|
+
-----
|
6
|
+
|
7
|
+
- Add built-in support for MDES 2.1 (#8).
|
8
|
+
- Add built-in support for MDES 2.2 (#9).
|
9
|
+
|
4
10
|
0.5.0
|
5
11
|
-----
|
6
12
|
|
@@ -8,7 +14,7 @@ NCS Navigator MDES Module history
|
|
8
14
|
flag tables as one or the other.
|
9
15
|
- Fix reference definition for `spec_blood.equip_id`. According to the
|
10
16
|
corresponding instrument, it is not a reference to `spec_equipment`,
|
11
|
-
but rather a manually-filled field.
|
17
|
+
but rather a manually-filled field (#7).
|
12
18
|
|
13
19
|
0.4.2
|
14
20
|
-----
|
data/HEURISTICS.md
ADDED
@@ -0,0 +1,111 @@
|
|
1
|
+
# Why `ncs_mdes` tells you the things it tells you
|
2
|
+
|
3
|
+
`ncs_mdes` derives its view of the NCS Master Data Element
|
4
|
+
Specification primarily from the XML Schema defining the Vanguard
|
5
|
+
Data Repository submission format. However, that file does not contain
|
6
|
+
the full semantics that the gem exposes. This document discusses how
|
7
|
+
the remaining attributes are derived.
|
8
|
+
|
9
|
+
# Gem overview
|
10
|
+
|
11
|
+
`ncs_mdes` exposes data in three major categories:
|
12
|
+
|
13
|
+
* Tables
|
14
|
+
* Types
|
15
|
+
* Disposition codes
|
16
|
+
|
17
|
+
Types are fairly simple, and are mostly interesting insofar as they
|
18
|
+
are the mechanism whereby you can look up a code list. Disposition
|
19
|
+
codes are extracted from the Master Data Element Specification
|
20
|
+
spreadsheet instead of the VDR schema (unlike the tables and
|
21
|
+
types, they are pre-processed rather than coming from the source
|
22
|
+
document at runtime) but are otherwise simple. This document
|
23
|
+
is mainly concerned with tables and their children, variables.
|
24
|
+
|
25
|
+
# Tables
|
26
|
+
|
27
|
+
The table name attribute is taken directly from the VDR schema.
|
28
|
+
|
29
|
+
## Instrument or operational?
|
30
|
+
|
31
|
+
`ncs_mdes` can also tell you if a table is an operational or
|
32
|
+
instrument table (this is an XOR relationship) and, if it is an
|
33
|
+
instrument table, whether it is a "primary" instrument table.
|
34
|
+
|
35
|
+
Definitions:
|
36
|
+
|
37
|
+
* An operational table is a table that collects study execution
|
38
|
+
information.
|
39
|
+
|
40
|
+
* An instrument table is a table that contains data collected about a
|
41
|
+
study participant.
|
42
|
+
|
43
|
+
* A "primary" instrument table is a table for which there is exactly
|
44
|
+
one record for each time the instrument is collected for a
|
45
|
+
participant. (The MDES is a relational model; non-primary tables
|
46
|
+
contain the results of repeating instrument sections or multivalued
|
47
|
+
questions and are always associated with a primary table, though
|
48
|
+
sometimes the association is indirect.)
|
49
|
+
|
50
|
+
These distinctions are derived using the following heuristic:
|
51
|
+
|
52
|
+
* If the table contains a variable named `instrument_version` and is
|
53
|
+
not the table named `instrument`, it is a primary instrument table
|
54
|
+
(and therefore an instrument table). (The table `instrument` is
|
55
|
+
itself an operational table since it records the execution of an
|
56
|
+
instrument rather than any of the data collected in the instrument.)
|
57
|
+
|
58
|
+
* If the table contains a foreign key to a table which is an
|
59
|
+
instrument table, then it is an instrument table.
|
60
|
+
|
61
|
+
* Otherwise, the table is an operational table.
|
62
|
+
|
63
|
+
This heuristic works in all cases for MDES 2.0.
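
The three rules above can be sketched in Ruby. This is a minimal illustration, not the gem's implementation: `SketchTable`, `classify_tables`, and the sample table names are hypothetical, and each table is assumed to already know its variable names and the tables its foreign keys point to.

```ruby
# Hypothetical stand-in for a table; not part of the ncs_mdes API.
SketchTable = Struct.new(:name, :variable_names, :foreign_key_targets)

# Returns a Hash of table name => :primary_instrument, :instrument,
# or :operational, following the three rules in order.
def classify_tables(tables)
  kinds = {}

  # Rule 1: a table with `instrument_version`, other than `instrument`
  # itself, is a primary instrument table.
  tables.each do |table|
    if table.variable_names.include?('instrument_version') && table.name != 'instrument'
      kinds[table.name] = :primary_instrument
    end
  end

  # Rule 2: a table with a foreign key to an instrument table is an
  # instrument table. Repeat until nothing changes so that indirect
  # associations are picked up as well.
  changed = true
  while changed
    changed = false
    tables.each do |table|
      next if kinds.key?(table.name)
      if table.foreign_key_targets.any? { |target| kinds.key?(target) }
        kinds[table.name] = :instrument
        changed = true
      end
    end
  end

  # Rule 3: everything else is operational.
  tables.each { |table| kinds[table.name] ||= :operational }
  kinds
end

# Made-up tables for illustration only.
sample_tables = [
  SketchTable.new('instrument', %w[instrument_id instrument_version], []),
  SketchTable.new('sample_visit', %w[sample_visit_id instrument_version], []),
  SketchTable.new('sample_visit_detail', %w[detail_id], ['sample_visit'])
]
sample_kinds = classify_tables(sample_tables)
```

The repeat-until-stable loop is one simple way to handle the indirect associations mentioned in the definitions; it does not depend on the order in which tables are listed.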
|
64
|
+
|
65
|
+
# Variables
|
66
|
+
|
67
|
+
The following attributes of a variable are taken directly from the XML
|
68
|
+
schema:
|
69
|
+
|
70
|
+
* name
|
71
|
+
* pii?
|
72
|
+
* required?
|
73
|
+
* omittable?
|
74
|
+
* nillable?
|
75
|
+
* status (active, etc.)
|
76
|
+
* type
|
77
|
+
|
78
|
+
## Table references
|
79
|
+
|
80
|
+
`ncs_mdes` can also tell you if a variable is a foreign key reference
|
81
|
+
and if so, to which table it refers. While the XML schema indicates
|
82
|
+
that a variable is of one of a couple of foreign key types, it does
|
83
|
+
not indicate the associated table. That information is derived using
|
84
|
+
the following heuristic:
|
85
|
+
|
86
|
+
* If the variable is not of foreign key type, it's not a foreign key.
|
87
|
+
|
88
|
+
* Otherwise, find all the tables in the MDES whose primary key is
|
89
|
+
named the same as the candidate foreign key variable.
|
90
|
+
|
91
|
+
* If there is exactly one such table, the variable refers to that
|
92
|
+
table.
|
93
|
+
|
94
|
+
* Otherwise fail.
|
95
|
+
|
96
|
+
This heuristic succeeds for 399 of the foreign keys in MDES
|
97
|
+
2.0. Another 155 are mapped manually for a total of 554.
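
A sketch of this lookup in Ruby, assuming each table's primary key name is available up front. `SketchVariable`, `resolve_reference`, and `UnresolvedReferenceError` are hypothetical names for illustration, not part of the `ncs_mdes` API, and the manual mappings are left out.

```ruby
# Hypothetical stand-in for a variable; `foreign_key_type` is true when
# the schema types the variable as one of the foreign key types.
SketchVariable = Struct.new(:name, :foreign_key_type)

# Raised when zero or several tables match (the "otherwise fail" case).
class UnresolvedReferenceError < StandardError; end

# primary_keys_by_table: Hash of table name => primary key variable name.
# Returns the referenced table name, or nil if the variable is not a
# foreign key at all.
def resolve_reference(variable, primary_keys_by_table)
  return nil unless variable.foreign_key_type

  # Find every table whose primary key shares the variable's name.
  candidates = primary_keys_by_table.select { |_table, pk| pk == variable.name }.keys

  unless candidates.size == 1
    raise UnresolvedReferenceError,
          "#{variable.name}: #{candidates.size} candidate tables"
  end
  candidates.first
end
```

Raising on anything other than a single match keeps ambiguous cases visible, which is where a manual mapping would have to take over.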
|
98
|
+
|
99
|
+
There are also three variables which are typed as foreign keys in the
|
100
|
+
XML schema but which for a couple of different reasons are not treated
|
101
|
+
as foreign keys by ncs_mdes. These are described in comments in
|
102
|
+
`documents/2.0/heuristic_overrides.yml` in the ncs_mdes source.
|
103
|
+
|
104
|
+
# Heuristics not used
|
105
|
+
|
106
|
+
## Type coercion
|
107
|
+
|
108
|
+
The MDES VDR schema considers nearly all variables to be strings, usually
|
109
|
+
strings of a set length or conforming to a particular
|
110
|
+
pattern. `ncs_mdes` does not attempt to infer a stronger type for
|
111
|
+
these.
|
data/bin/mdes-console
CHANGED
@@ -19,6 +19,7 @@ require 'ncs_navigator/mdes'
|
|
19
19
|
|
20
20
|
$mdes12 = NcsNavigator::Mdes::Specification.new('1.2')
|
21
21
|
$mdes20 = NcsNavigator::Mdes::Specification.new('2.0')
|
22
|
+
$mdes21 = NcsNavigator::Mdes::Specification.new('2.1')
|
22
23
|
|
23
24
|
expected_loc = ENV[NcsNavigator::Mdes::SourceDocuments::BASE_ENV_VAR] ?
|
24
25
|
ENV[NcsNavigator::Mdes::SourceDocuments::BASE_ENV_VAR].inspect :
|
@@ -27,5 +28,6 @@ expected_loc = ENV[NcsNavigator::Mdes::SourceDocuments::BASE_ENV_VAR] ?
|
|
27
28
|
puts "Documents are expected to be in #{expected_loc}."
|
28
29
|
puts "$mdes12 is a Specification for 1.2"
|
29
30
|
puts "$mdes20 is a Specification for 2.0"
|
31
|
+
puts "$mdes21 is a Specification for 2.1"
|
30
32
|
|
31
33
|
IRB.start(__FILE__)
|