ncs_mdes 0.5.0 → 0.6.0

data/.rspec ADDED
@@ -0,0 +1,2 @@
1
+ --color
2
+ --format=nested
data/.yardopts CHANGED
@@ -1,5 +1,5 @@
1
1
  --no-private
2
2
  --markup markdown
3
3
  --hide-void-return
4
- --files CHANGELOG.md
4
+ --files CHANGELOG.md,HEURISTICS.md
5
5
  --readme README.md
data/CHANGELOG.md CHANGED
@@ -1,6 +1,12 @@
1
1
  NCS Navigator MDES Module history
2
2
  =================================
3
3
 
4
+ 0.6.0
5
+ -----
6
+
7
+ - Add built-in support for MDES 2.1 (#8).
8
+ - Add built-in support for MDES 2.2 (#9).
9
+
4
10
  0.5.0
5
11
  -----
6
12
 
@@ -8,7 +14,7 @@ NCS Navigator MDES Module history
8
14
  flag tables as one or the other.
9
15
  - Fix reference definition for `spec_blood.equip_id`. According to the
10
16
  corresponding instrument, it is not a reference to `spec_equipment`,
11
- but rather a manually-filled field.
17
+ but rather a manually-filled field (#7).
12
18
 
13
19
  0.4.2
14
20
  -----
data/HEURISTICS.md ADDED
@@ -0,0 +1,111 @@
1
+ # Why `ncs_mdes` tells you the things it tells you
2
+
3
+ `ncs_mdes` derives its view of the NCS Master Data Element
4
+ Specification primarily from the XML Schema defining the Vanguard
5
+ Data Repository submission format. However, that file does not contain
6
+ the full semantics that the gem exposes. This document discusses how
7
+ the remaining attributes are derived.
8
+
9
+ # Gem overview
10
+
11
+ `ncs_mdes` exposes data in three major categories:
12
+
13
+ * Tables
14
+ * Types
15
+ * Disposition codes
16
+
17
+ Types are fairly simple, and are mostly interesting insofar as they
18
+ are the mechanism whereby you can look up a code list. Disposition
19
+ codes are extracted from the Master Data Element Specification
20
+ spreadsheet instead of the VDR schema — unlike the tables and
21
+ types, they are pre-processed rather than coming from the source
22
+ document at runtime — but are otherwise simple. This document
23
+ is mainly concerned with tables and their children, variables.
24
+
25
+ # Tables
26
+
27
+ The table name attribute is taken directly from the VDR schema.
28
+
29
+ ## Instrument or operational?
30
+
31
+ `ncs_mdes` can also tell you if a table is an operational or
32
+ instrument table (this is an XOR relationship) and, if it is an
33
+ instrument table, whether it is a "primary" instrument table.
34
+
35
+ Definitions:
36
+
37
+ * An operational table is a table that collects study execution
38
+ information.
39
+
40
+ * An instrument table is a table that contains data collected about a
41
+ study participant.
42
+
43
+ * A "primary" instrument table is a table for which there is exactly
44
+ one record for each time the instrument is collected for a
45
+ participant. (The MDES is a relational model; non-primary tables
46
+ contain the results of repeating instrument sections or multivalued
47
+ questions and are always associated with a primary table, though
48
+ sometimes the association is indirect.)
49
+
50
+ These distinctions are derived using the following heuristic:
51
+
52
+ * If the table contains a variable named `instrument_version` and is
53
+ not the table named `instrument`, it is a primary instrument table
54
+ (and therefore an instrument table). (The table `instrument` is
55
+ itself an operational table since it records the execution of an
56
+ instrument rather than any of the data collected in the instrument.)
57
+
58
+ * If the table contains a foreign key to a table which is an
59
+ instrument table, then it is an instrument table.
60
+
61
+ * Otherwise, the table is an operational table.
62
+
63
+ This heuristic works in all cases for MDES 2.0.
64
+
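Reviewer note: the three rules of the instrument/operational heuristic above can be sketched in Ruby as follows. This is a minimal illustration under stated assumptions, not the gem's implementation. `SketchTable`, `SketchVariable`, the `table_reference` attribute, and both predicate methods are hypothetical names introduced here. The sketch also assumes that foreign key references have already been resolved to table names.

```ruby
# Hypothetical stand-ins for a table and its variables. These are NOT the
# classes that ncs_mdes defines; they exist only to make the sketch runnable.
SketchVariable = Struct.new(:name, :table_reference)
SketchTable = Struct.new(:name, :variables)

# Rule 1: a table with an `instrument_version` variable is a primary
# instrument table, unless it is the table named `instrument` itself.
def primary_instrument_table?(table)
  table.name != 'instrument' &&
    table.variables.any? { |v| v.name == 'instrument_version' }
end

# Rule 2: a table is an instrument table if it is primary, or if one of its
# foreign keys points at an instrument table (possibly indirectly).
# `seen` guards against reference cycles; that guard is an assumption of
# this sketch, not something HEURISTICS.md describes.
def instrument_table?(table, tables_by_name, seen = [])
  return true if primary_instrument_table?(table)
  return false if seen.include?(table.name)

  table.variables.any? do |v|
    target = tables_by_name[v.table_reference]
    target && instrument_table?(target, tables_by_name, seen + [table.name])
  end
end

# Rule 3: everything else is operational (the XOR relationship).
def operational_table?(table, tables_by_name)
  !instrument_table?(table, tables_by_name)
end
```

Because rule 2 is applied recursively, a table that reaches a primary instrument table only through another non-primary table is still classified as an instrument table, which matches the "sometimes the association is indirect" remark in the definitions.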
65
+ # Variables
66
+
67
+ The following attributes of a variable are taken directly from the XML
68
+ schema:
69
+
70
+ * name
71
+ * pii?
72
+ * required?
73
+ * omittable?
74
+ * nillable?
75
+ * status (active, etc.)
76
+ * type
77
+
78
+ ## Table references
79
+
80
+ `ncs_mdes` can also tell you if a variable is a foreign key reference
81
+ and if so, to which table it refers. While the XML schema indicates
82
+ that a variable is of one of a couple of foreign key types, it does
83
+ not indicate the associated table. That information is derived using
84
+ the following heuristic:
85
+
86
+ * If the variable is not of foreign key type, it's not a foreign key.
87
+
88
+ * Otherwise, find all the tables in the MDES whose primary key is
89
+ named the same as the candidate foreign key variable.
90
+
91
+ * If there is exactly one such table, the variable refers to that
92
+ table.
93
+
94
+ * Otherwise fail.
95
+
96
+ This heuristic does not fail for 399 of the foreign keys in MDES
97
+ 2.0. Another 155 are mapped manually for a total of 554.
98
+
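Reviewer note: the name-matching rule for table references described above can be sketched in Ruby as follows. This is an illustration under stated assumptions rather than the gem's own code. `FkVariable`, `FkTable`, `UnresolvedReference`, and `resolve_table_reference` are hypothetical names, and the sketch assumes each table exposes the name of its primary key.

```ruby
# Hypothetical stand-ins; ncs_mdes does not necessarily define these.
FkVariable = Struct.new(:name, :foreign_key_type)
FkTable = Struct.new(:name, :primary_key_name)

# Raised when the name match is not unique (zero or several candidates).
class UnresolvedReference < StandardError; end

# Returns the name of the referenced table, or nil when the variable is not
# of a foreign key type. Raises UnresolvedReference for the "otherwise fail"
# case, which is where a manual mapping would have to take over.
def resolve_table_reference(variable, tables)
  return nil unless variable.foreign_key_type

  candidates = tables.select { |t| t.primary_key_name == variable.name }
  unless candidates.size == 1
    raise UnresolvedReference,
          "#{variable.name}: #{candidates.size} candidate tables"
  end

  candidates.first.name
end
```

Raising on an ambiguous or missing match, instead of guessing, keeps the automatic result trustworthy and makes the set of variables that need a manual mapping easy to enumerate.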
99
+ There are also three variables which are typed as foreign keys in the
100
+ XML schema but which for a couple of different reasons are not treated
101
+ as foreign keys by `ncs_mdes`. These are described in comments in
102
+ `documents/2.0/heuristic_overrides.yml` in the ncs_mdes source.
103
+
104
+ # Heuristics not used
105
+
106
+ ## Type coercion
107
+
108
+ The MDES VDR schema considers nearly all variables to be strings, usually
109
+ strings of a set length or conforming to a particular
110
+ pattern. `ncs_mdes` does not attempt to infer a stronger type for
111
+ these.
data/bin/mdes-console CHANGED
@@ -19,6 +19,7 @@ require 'ncs_navigator/mdes'
19
19
 
20
20
  $mdes12 = NcsNavigator::Mdes::Specification.new('1.2')
21
21
  $mdes20 = NcsNavigator::Mdes::Specification.new('2.0')
22
+ $mdes21 = NcsNavigator::Mdes::Specification.new('2.1')
22
23
 
23
24
  expected_loc = ENV[NcsNavigator::Mdes::SourceDocuments::BASE_ENV_VAR] ?
24
25
  ENV[NcsNavigator::Mdes::SourceDocuments::BASE_ENV_VAR].inspect :
@@ -27,5 +28,6 @@ expected_loc = ENV[NcsNavigator::Mdes::SourceDocuments::BASE_ENV_VAR] ?
27
28
  puts "Documents are expected to be in #{expected_loc}."
28
29
  puts "$mdes12 is a Specification for 1.2"
29
30
  puts "$mdes20 is a Specification for 2.0"
31
+ puts "$mdes21 is a Specification for 2.1"
30
32
 
31
33
  IRB.start(__FILE__)