ncs_mdes 0.5.0 → 0.6.0
- data/.rspec +2 -0
- data/.yardopts +1 -1
- data/CHANGELOG.md +7 -1
- data/HEURISTICS.md +111 -0
- data/bin/mdes-console +2 -0
- data/documents/2.1/NCS_Transmission_Schema_2.1.00.00.xsd +27737 -0
- data/documents/2.1/disposition_codes.yml +1758 -0
- data/documents/2.1/extract_disposition_codes.rb +56 -0
- data/documents/2.1/heuristic_overrides.yml +417 -0
- data/documents/2.2/NCS_Transmission_Schema_2.2.01.00.xsd +51765 -0
- data/documents/2.2/disposition_codes.yml +1758 -0
- data/documents/2.2/extract_disposition_codes.rb +57 -0
- data/documents/2.2/heuristic_overrides.yml +452 -0
- data/lib/ncs_navigator/mdes/source_documents.rb +4 -0
- data/lib/ncs_navigator/mdes/specification.rb +1 -1
- data/lib/ncs_navigator/mdes/variable.rb +6 -4
- data/lib/ncs_navigator/mdes/version.rb +1 -1
- data/spec/ncs_navigator/mdes/source_documents_spec.rb +41 -17
- data/spec/ncs_navigator/mdes/specification_spec.rb +65 -20
- data/spec/ncs_navigator/mdes/variable_spec.rb +5 -4
- metadata +32 -23
data/.rspec
ADDED
data/.yardopts
CHANGED
data/CHANGELOG.md
CHANGED
@@ -1,6 +1,12 @@
 NCS Navigator MDES Module history
 =================================
 
+0.6.0
+-----
+
+- Add built-in support for MDES 2.1 (#8).
+- Add built-in support for MDES 2.2 (#9).
+
 0.5.0
 -----
 
@@ -8,7 +14,7 @@ NCS Navigator MDES Module history
 flag tables as one or the other.
 - Fix reference definition for `spec_blood.equip_id`. According to the
 corresponding instrument, it is not a reference to `spec_equipment`,
-but rather a manually-filled field.
+but rather a manually-filled field (#7).
 
 0.4.2
 -----
data/HEURISTICS.md
ADDED
@@ -0,0 +1,111 @@
+# Why `ncs_mdes` tells you the things it tells you
+
+`ncs_mdes` derives its view of the NCS Master Data Element
+Specification primarily from the XML Schema defining the Vanguard
+Data Repository submission format. However, that file does not contain
+the full semantics that the gem exposes. This document discusses how
+the remaining attributes are derived.
+
+# Gem overview
+
+`ncs_mdes` exposes data in three major categories:
+
+* Tables
+* Types
+* Disposition codes
+
+Types are fairly simple, and are mostly interesting insofar as they
+are the mechanism whereby you can look up a code list. Disposition
+codes are extracted from the Master Data Element Specification
+spreadsheet instead of the VDR schema &mdash; unlike the tables and
+types, they are pre-processed rather than coming from the source
+document at runtime &mdash; but are otherwise simple. This document
+is mainly concerned with tables and their children, variables.
+
+# Tables
+
+The table name attribute is taken directly from the VDR schema.
+
+## Instrument or operational?
+
+`ncs_mdes` can also tell you if a table is an operational or
+instrument table (this is an XOR relationship) and, if it is an
+instrument table, whether it is a "primary" instrument table.
+
+Definitions:
+
+* An operational table is a table that collects study execution
+information.
+
+* An instrument table is a table that contains data collected about a
+study participant.
+
+* A "primary" instrument table is a table for which there is exactly
+one record for each time the instrument is collected for a
+participant. (The MDES is a relational model; non-primary tables
+contain the results of repeating instrument sections or multivalued
+questions and are always associated with a primary table, though
+sometimes the association is indirect.)
+
+These distinctions are derived using the following heuristic:
+
+* If the table contains a variable named `instrument_version` and is
+not the table named `instrument`, it is a primary instrument table
+(and therefore an instrument table). (The table `instrument` is
+itself an operational table since it records the execution of an
+instrument rather than any of the data collected in the instrument.)
+
+* If the table contains a foreign key to a table which is an
+instrument table, then it is an instrument table.
+
+* Otherwise, the table is an operational table.
+
+This heuristic works in all cases for MDES 2.0.
+
+# Variables
+
+The following attributes of a variable are taken directly from the XML
+schema:
+
+* name
+* pii?
+* required?
+* omittable?
+* nillable?
+* status (active, etc.)
+* type
+
+## Table references
+
+`ncs_mdes` can also tell you if a variable is a foreign key reference
+and if so, to which table it refers. While the XML schema indicates
+that a variable is of one of a couple of foreign key types, it does
+not indicate the associated table. That information is derived using
+the following heuristic:
+
+* If the variable is not of foreign key type, it's not a foreign key.
+
+* Otherwise, find all the tables in the MDES whose primary key is
+named the same as the candidate foreign key variable.
+
+* If there is exactly one such table, the variable refers to that
+table.
+
+* Otherwise fail.
+
+This heuristic does not fail for 399 of the foreign keys in MDES
+2.0. Another 155 are mapped manually for a total of 554.
+
+There are also three variables which are typed as foreign keys in the
+XML schema but which for a couple of different reasons are not treated
+as foreign keys by ncs_mdes. These are described in comments in
+`documents/2.0/heuristic_overrides.yml` in the ncs_mdes source.
+
+# Heuristics not used
+
+## Type coercion
+
+The MDES VDR schema considers nearly all variables to be strings; usually
+strings of a set length or conforming to a particular
+pattern. `ncs_mdes` does not attempt to infer a stronger type for
+these.
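The two heuristics that HEURISTICS.md describes (foreign-key resolution and instrument/operational classification) can be illustrated with a minimal Ruby sketch. This is not the gem's implementation: the `Table` and `Variable` structs, the `foreign_key` and `primary_key` flags, the `UnresolvedReference` error, and the sample table names are all hypothetical stand-ins, and the manual overrides kept in `heuristic_overrides.yml` are not modeled.

```ruby
require 'set'

# Hypothetical stand-ins for illustration only; not the gem's classes.
Variable = Struct.new(:name, :foreign_key, :primary_key, keyword_init: true)
Table    = Struct.new(:name, :variables, keyword_init: true)

# Raised when the name-matching heuristic finds zero or several candidates.
class UnresolvedReference < StandardError; end

# Foreign-key heuristic: a variable of foreign key type refers to the one
# table whose primary key has the same name as the variable.
def referenced_table(variable, tables)
  return nil unless variable.foreign_key

  candidates = tables.select do |t|
    t.variables.any? { |v| v.primary_key && v.name == variable.name }
  end
  unless candidates.size == 1
    raise UnresolvedReference,
          "#{variable.name}: #{candidates.size} candidate tables"
  end
  candidates.first
end

# Table heuristic: returns :primary_instrument, :instrument or :operational.
# `visiting` is an assumption of this sketch: it guards against reference
# cycles by treating a table already on the current path as operational.
def classify(table, tables, visiting = Set.new)
  has_version = table.variables.any? { |v| v.name == 'instrument_version' }
  return :primary_instrument if has_version && table.name != 'instrument'
  return :operational if visiting.include?(table.name)

  path = visiting | [table.name]
  points_at_instrument = table.variables.any? do |v|
    target = referenced_table(v, tables)
    target && classify(target, tables, path) != :operational
  end
  points_at_instrument ? :instrument : :operational
end

# Illustrative data; the survey table names are made up.
tables = [
  Table.new(name: 'person', variables: [
    Variable.new(name: 'person_id', primary_key: true)
  ]),
  Table.new(name: 'instrument', variables: [
    Variable.new(name: 'instrument_id', primary_key: true),
    Variable.new(name: 'instrument_version')
  ]),
  Table.new(name: 'sample_survey', variables: [
    Variable.new(name: 'sample_survey_id', primary_key: true),
    Variable.new(name: 'instrument_version'),
    Variable.new(name: 'person_id', foreign_key: true)
  ]),
  Table.new(name: 'sample_survey_detail', variables: [
    Variable.new(name: 'sample_survey_detail_id', primary_key: true),
    Variable.new(name: 'sample_survey_id', foreign_key: true)
  ])
]

tables.each { |t| puts "#{t.name}: #{classify(t, tables)}" }
```

In this sketch an unresolvable foreign key simply raises, which corresponds to the "Otherwise fail" step; the gem, per the text above, covers those cases with manually maintained mappings instead.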
data/bin/mdes-console
CHANGED
@@ -19,6 +19,7 @@ require 'ncs_navigator/mdes'
 
 $mdes12 = NcsNavigator::Mdes::Specification.new('1.2')
 $mdes20 = NcsNavigator::Mdes::Specification.new('2.0')
+$mdes21 = NcsNavigator::Mdes::Specification.new('2.1')
 
 expected_loc = ENV[NcsNavigator::Mdes::SourceDocuments::BASE_ENV_VAR] ?
 ENV[NcsNavigator::Mdes::SourceDocuments::BASE_ENV_VAR].inspect :
@@ -27,5 +28,6 @@ expected_loc = ENV[NcsNavigator::Mdes::SourceDocuments::BASE_ENV_VAR] ?
 puts "Documents are expected to be in #{expected_loc}."
 puts "$mdes12 is a Specification for 1.2"
 puts "$mdes20 is a Specification for 2.0"
+puts "$mdes21 is a Specification for 2.1"
 
 IRB.start(__FILE__)