conceptql 0.0.9 → 0.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 5750de5f196ce44a4f679c8c28613bd7476824a1
4
- data.tar.gz: 66c448563f934f4a64e5a030e343eceef5a691aa
3
+ metadata.gz: 297853bf4ddbd97f096bd42b929843cf25b88774
4
+ data.tar.gz: b6ec45dc7694ffdd8e205a2fa0c911501cb8ebb4
5
5
  SHA512:
6
- metadata.gz: 286d1046bab811f33f2a83d0066cc608a2d205c46beb188bc3cd9227db36a4cad2b23bbc0ee968ddb49ba34394800bd2d312661787d77bb8467f6bb80a13d09e
7
- data.tar.gz: b6978826328af7dd3d1cc3af40447a20ccd3afe83074c0ecbce3528f56dbc6cb48621131e784444194a050f10549dc523dccd0a9228a453892facb9faa099bb1
6
+ metadata.gz: c858413d3020e1168b793c43d54cf1a9e7a9574cc5ee0a1edd50237e71e4abc07ba2a6a9036fe8a82249b3908dce5ff6ea8d4c9500f05665223dd36c6b08bd7c
7
+ data.tar.gz: 3cd9354b86045d6d66061d131b32fb2451f1b9fdd582e905bdb7fbf0ea7d28b1657663e0b63c261b2454af30f234e81aaa509e04a49793c1150032d4de2e6a60
data/CHANGELOG.md CHANGED
@@ -1,6 +1,24 @@
1
1
  # Changelog
2
2
  All notable changes to this project will be documented in this file.
3
3
 
4
+ ## 0.1.0 - 2014-09-04
5
+
6
+ ### Added
7
+ - Support for numeric, string, and concept_ids returned in results.
8
+ - Many updates to the ConceptQL Specification document.
9
+ - Added doc/implementation_notes.md to capture thoughts and bad ideas.
10
+
11
+ ### Deprecated
12
+ - Nothing.
13
+
14
+ ### Removed
15
+ - Nothing.
16
+
17
+ ### Fixed
18
+ - "Fake" graphs are now drawn correctly.
19
+ - bin/conceptql doesn't bomb out drawing "fake" graphs
20
+
21
+
4
22
  ## 0.0.9 - 2014-09-03
5
23
 
6
24
  ### Added
@@ -0,0 +1,39 @@
1
+ # ConceptQL Implementation Notes
2
+ And here is where I record information about some of the decisions I made.
3
+
4
+ ## Define/Recall
5
+ - Sequel's create table statement runs out-of-band with the rest of the ConceptQL statement
6
+ - Gets executed immediately
7
+ - This is going to be very slow for large datasets :-(
8
+ - I had to retool Tree and Query and Graph to expect an array of concepts in a ConceptQL statement
9
+ - I'm not sure I like this
10
+ - A ConceptQL statement perhaps should be only a single statement at the end
11
+ - Defines need to occur before they are used
12
+ - Most languages have a "forward definition" ability
13
+ - I have no use cases for when we might need those?
14
+ - Perhaps a definition that uses a definition that doesn't exist?
15
+ - Is that recursive?
16
+ - Is that something we want/need in ConceptQL?
17
+
18
+
19
+ ## Values
20
+ - I had considered comparing a result set with a between operator, but this implies that the R stream needs a range somehow
21
+ - I don't think supporting a range is a good idea right now
22
+
23
+
24
+ ## Bad Ideas
25
+ Here is where I intend to record bad ideas that appear to be blind alleys
26
+
27
+ ### Removing person-only rows from a stream after ANDing
28
+ - Theoretically, any result set that passes up through an AND node carries all the valid person IDs with it, so it is redundant to carry results that are just person-only. It would make sense to eliminate them.
29
+ - NO BAD IDEA
30
+ - We have to check to see if the stream is exclusively person-only. In that case, it is unsafe to eliminate those results because we'd end up removing all results from the stream
31
+ - We could implement this check, but let's do that later when we want to optimize things
32
+ - Also, what happens if we cast to visit later and we've removed all person-only results? This could cause odd behavior
33
+
34
+ ### Treating Patient Streams as "Eternal" when they encounter a temporal node
35
+ I had created a rather sophisticated system of how to handle a patient stream entering a temporal node.
36
+
37
+ The basic premise is that patient information (gender, race, etc) is "timeless" or "eternal" and so if a patient stream is the R stream in a temporal node and the L stream is, say, a stream of MIs, what is the result? I proposed that the MI would be filtered down to only those patients that appear in the R stream.
38
+
39
+ Likewise, if the MI stream is the R stream and the person stream is the L stream, what's the result? Same thing. Only patients common to both streams are passed through the L stream. But patients DO have a date associated with them: their date of birth. And, by passing a patient stream through a time_shift node of say, +50yr, we can use that patient stream in the R stream of a temporal node to filter the L stream by the patient being 50 years old.
data/doc/spec.md CHANGED
@@ -478,7 +478,7 @@ For situations where we need to represent pre-defined date ranges, we can use "d
478
478
  - *Not yet implemented*
479
479
 
480
480
 
481
- #### What is \<date-format\>?
481
+ #### What is <date-format\>?
482
482
  Dates follow these formats:
483
483
 
484
484
  - "YYYY-MM-DD"
@@ -848,6 +848,90 @@ And don't forget the left-hand side can have multiple types of streams:
848
848
  }
849
849
  ```
850
850
 
851
+
852
+ ## Sub-concepts within a Larger Concept
853
+ If a concept is particularly complex, or has a stream of results that are used more than once, it can be helpful to break the concept into a set of sub-concepts. This can be done using two nodes: define and recall
854
+
855
+ #### define
856
+ - Takes 2 arguments
857
+ - First argument is a string of arbitrary length that describes the stream to be saved. This is the "name" assigned to the stream for later recall
858
+ - Second argument is the stream to save under the name specified
859
+
860
+
861
+ #### recall
862
+ - Takes 1 argument
863
+ - The "name" of the stream previously saved using the `define` node
864
+
865
+
866
+ A stream must be `define`d before `recall` can use it.
867
+
868
+ ```ConceptQL
869
+ # Save away a stream of results to build the 1 inpatient, 2 outpatient pattern used in claims data algorithms
870
+ [
871
+ {
872
+ define: [
873
+ 'Heart Attack Visit',
874
+ { visit_occurrence: { icd9: '412' } }
875
+ ]
876
+ },
877
+
878
+ {
879
+ define: [
880
+ 'Inpatient Heart Attack',
881
+ {
882
+ intersect: [
883
+ { recall: 'Heart Attack Visit'},
884
+ { place_of_service_code: 21 }
885
+ ]
886
+ }
887
+ ]
888
+ },
889
+
890
+ {
891
+ define: [
892
+ 'Outpatient Heart Attack',
893
+ {
894
+ intersect: [
895
+ { recall: 'Heart Attack Visit'},
896
+ {
897
+ complement: {
898
+ place_of_service_code: 21
899
+ }
900
+ }
901
+ ]
902
+ }
903
+ ]
904
+ },
905
+
906
+ {
907
+ define: [
908
+ 'Earlier of Two Outpatient Heart Attacks',
909
+ {
910
+ before: {
911
+ left: { recall: 'Outpatient Heart Attack' },
912
+ right: {
913
+ time_window: [
914
+ { recall: 'Outpatient Heart Attack' },
915
+ { start: '-30d', end: '0' }
916
+ ]
917
+ }
918
+ }
919
+ }
920
+ ]
921
+ },
922
+
923
+ {
924
+ first: {
925
+ union: [
926
+ { recall: 'Inpatient Heart Attack' },
927
+ { recall: 'Earlier of Two Outpatient Heart Attacks'}
928
+ ]
929
+ }
930
+ }
931
+ ]
932
+ ```
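
The define-before-recall rule described above can be sketched in plain Ruby. This is only an illustration under assumptions: `StreamRegistry` is a hypothetical class that keeps streams in a hash, whereas the gem itself stores them in temporary tables.

```ruby
# Hypothetical registry used only for illustration; not part of the ConceptQL gem.
class StreamRegistry
  def initialize
    @streams = {}
  end

  # Save a stream (here just an array of rows) under a descriptive name.
  def define(name, stream)
    @streams[name] = stream
  end

  # Fetch a previously defined stream; fail loudly when the name is unknown.
  def recall(name)
    @streams.fetch(name) do
      raise KeyError, "stream #{name.inspect} must be defined before it is recalled"
    end
  end
end

registry = StreamRegistry.new
registry.define('Heart Attack Visit', [{ person_id: 1 }, { person_id: 2 }])
visits = registry.recall('Heart Attack Visit')
```

Recalling a name that was never defined raises a `KeyError` in this sketch, which mirrors the requirement that a `define` must come first.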
933
+
934
+
851
935
  ## Concepts within Concepts
852
936
  One of the main motivations behind keeping ConceptQL so flexible is to allow users to build ConceptQL statements from other ConceptQL statements. This section loosely describes how this feature will work. Its actual execution and implementation will differ from what is presented here.
853
937
 
@@ -894,6 +978,155 @@ In the actual implementation of the concept node, each ConceptQL statement will
894
978
  }
895
979
  ```
896
980
 
981
+
982
+ ## Values
983
+ A result can carry forward three different types of values, modeled after the behavior of the observation table:
984
+
985
+ - value_as_numeric
986
+ - For values like lab values, counts of occurrence of results, cost information
987
+ - value_as_string
988
+ - For value_as_string from observation table, or notes captured in EHR data
989
+ - value_as_concept_id
990
+ - For values that are like factors from the observation value_as_concept_id column
991
+
992
+
993
+ By default, all value fields are set to NULL, unless a criterion node is explicitly written to populate one or more of those fields.
994
+
995
+ There are many operations that can be performed on the value_as\_\* columns and as those operations are implemented, this section will grow.
996
+
997
+ For now we'll cover some of the general behavior of the value_as_numeric column and its associated nodes.
998
+
999
+ #### numeric
1000
+ - Takes 2 arguments
1001
+ - A stream
1002
+ - And a numeric value or a symbol representing the name of a column in CDM
1003
+
1004
+ Passing streams through a `numeric` node changes the number stored in the value column:
1005
+
1006
+ ```ConceptQL
1007
+ # All MIs, setting value_as_numeric to 2
1008
+ {
1009
+ numeric: [
1010
+ { icd9: '412' },
1011
+ 2
1012
+ ]
1013
+ }
1014
+ ```
1015
+
1016
+ `numeric` can also take a column name instead of a number. It will derive the results row's value from the value stored in the column specified.
1017
+ ```ConceptQL
1018
+ # All copays for 99214s
1019
+ {
1020
+ numeric: [
1021
+ { procedure_cost: { cpt: '99214' } },
1022
+ :paid_copay
1023
+ ]
1024
+ }
1025
+ ```
1026
+
1027
+ If something nonsensical happens, like the column specified isn't present in the table pointed to by a result row, value_as_numeric in the result row will be unaffected:
1028
+ ```ConceptQL
1029
+ # Still all MIs with value_as_numeric defaulted to NULL. condition_occurrence table doesn't have a "paid_copay" column
1030
+ {
1031
+ numeric: [
1032
+ { icd9: '412' },
1033
+ :paid_copay
1034
+ ]
1035
+ }
1036
+ ```
1037
+
1038
+ Or if the column specified exists, but refers to a non-numerical column, we'll set the value to 0
1039
+ ```ConceptQL
1040
+ # All MIs, with value_as_numeric set to 0 since the column specified by the numeric node is a non-numerical column
1041
+ {
1042
+ numeric: [
1043
+ { icd9: '412' },
1044
+ :stop_reason
1045
+ ]
1046
+ }
1047
+ ```
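
The four cases described above (literal number, numeric column, missing column, non-numerical column) can be summarized in a small plain-Ruby sketch. `apply_numeric` is a hypothetical helper that works on hashes; it is not the gem's SQL-generating implementation, and the `origin` hash stands in for the source-table row that a result points to.

```ruby
# Hypothetical helper for illustration only.
# result: a result row (hash); origin: the source-table row (hash);
# arg: a number or a symbol naming a column in the origin row.
def apply_numeric(result, origin, arg)
  value =
    if arg.is_a?(Numeric)
      arg                        # literal number: use it directly
    elsif !origin.key?(arg)
      result[:value_as_numeric]  # column missing: leave the value unaffected
    elsif origin[arg].is_a?(Numeric)
      origin[arg]                # numeric column: copy its value
    else
      0                          # column exists but is non-numerical
    end
  result.merge(value_as_numeric: value)
end

mi = { person_id: 1, value_as_numeric: nil }
literal     = apply_numeric(mi, {}, 2)
from_column = apply_numeric(mi, { paid_copay: 20.0 }, :paid_copay)
missing     = apply_numeric(mi, { condition_start_date: '2014-01-01' }, :paid_copay)
non_numeric = apply_numeric(mi, { stop_reason: 'moved' }, :stop_reason)
```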
1048
+
1049
+ With a `numeric` node defined, we could introduce a sum node that will sum by patient and type. This allows us to implement the Charlson comorbidity algorithm:
1050
+ ```ConceptQL
1051
+ {
1052
+ sum: [
1053
+ {
1054
+ union: [
1055
+ {
1056
+ numeric: [
1057
+ { person: { icd9: '412' } },
1058
+ 1
1059
+ ]
1060
+ },
1061
+ {
1062
+ numeric: [
1063
+ { person: { icd9: '278.02' } },
1064
+ 2
1065
+ ]
1066
+ }
1067
+ ]
1068
+ }
1069
+ ]
1070
+ }
1071
+ ```
1072
+
1073
+ ### Counting
1074
+ It might be helpful to count the number of occurrences of a result row in a stream. A simple "count" node could group identical rows and store the number of occurrences in the value_as_numeric column.
1075
+
1076
+ I need examples of algorithms that could benefit from this node. I'm concerned that we'll want to roll up occurrences by person most of the time and that would require us to first cast streams to person before passing the person stream to count.
1077
+ ```ConceptQL
1078
+ # Count the number of times each person was irritable
1079
+ {
1080
+ count: { person: { icd9: '799.22' } }
1081
+ }
1082
+ ```
1083
+
1084
+ We could do dumb things like count the number of times a row shows up in a union:
1085
+ ```ConceptQL
1086
+ # All rows with a value of 2 would be rows that were both MI and Primary
1087
+ {
1088
+ count: {
1089
+ union: [
1090
+ { icd9: '412' },
1091
+ { primary_diagnosis: true}
1092
+ ]
1093
+ }
1094
+ }
1095
+ ```
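
As a rough illustration of the grouping idea behind "count", here is a plain-Ruby sketch. `count_rows` is a hypothetical helper that treats rows as hashes; the gem does the equivalent with a SQL `GROUP BY`.

```ruby
# Hypothetical helper for illustration only.
# Groups identical rows and stores the number of occurrences in value_as_numeric.
def count_rows(rows)
  rows.group_by { |row| row }.map do |row, group|
    row.merge(value_as_numeric: group.size)
  end
end

mi      = { person_id: 1, criterion_id: 10, criterion_type: 'condition_occurrence', value_as_numeric: nil }
other   = { person_id: 2, criterion_id: 11, criterion_type: 'condition_occurrence', value_as_numeric: nil }
counted = count_rows([mi, mi.dup, other])
```

In this sketch a row that appears on both sides of a union ends up with a count of 2, which is the situation the union example above describes.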
1096
+
1097
+ #### Numeric Value Comparison
1098
+ Acts like any other binary node. L and R streams, joined by person. Any L that pass comparison go downstream. R is thrown out. Comparison based on result row's value column.
1099
+
1100
+ - Less than
1101
+ - Less than or equal
1102
+ - Equal
1103
+ - Greater than or equal
1104
+ - Greater than
1105
+ - Not equal
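
The comparison behavior listed above can be sketched in plain Ruby. `compare_streams` is a hypothetical helper operating on arrays of hashes; it assumes, as in SQL, that a comparison against a NULL (`nil`) value never passes.

```ruby
# Hypothetical helper for illustration only.
# Keeps each L row for which some R row of the same person satisfies
# `l.value_as_numeric <operator> r.value_as_numeric`. R rows are discarded.
def compare_streams(left, right, operator)
  right_by_person = right.group_by { |row| row[:person_id] }
  left.select do |l|
    right_by_person.fetch(l[:person_id], []).any? do |r|
      l_val = l[:value_as_numeric]
      r_val = r[:value_as_numeric]
      # Mirror SQL: a comparison involving NULL never passes.
      next false if l_val.nil? || r_val.nil?
      l_val.public_send(operator, r_val)
    end
  end
end

left  = [{ person_id: 1, value_as_numeric: 3 }, { person_id: 2, value_as_numeric: 1 }]
right = [{ person_id: 1, value_as_numeric: 1 }, { person_id: 2, value_as_numeric: 1 }]
more_than_one = compare_streams(left, right, :>)
```

Passing `:<`, `:<=`, `:==`, `:>=`, `:>` or `:!=` as the operator covers the six comparisons in the list.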
1106
+
1107
+
1108
+ ### numeric as criterion node
1109
+ Numeric doesn't have to take a stream. If it doesn't have a stream as an argument, it acts like a criterion node much like date_range
1110
+ ```ConceptQL
1111
+ # People with more than 1 MI
1112
+ {
1113
+
1114
+ greater_than: {
1115
+ left: { count: { person: { icd9: '412' }}},
1116
+ right: { numeric: 1 }
1117
+ }
1118
+ }
1119
+ ```
1120
+
1121
+ #### sum
1122
+ - Takes a stream of results and does some wild things
1123
+ - Groups all results by person and type
1124
+ - Sums the value_as_numeric column within that grouping
1125
+ - Sets start_date to the earliest start_date in the group
1126
+ - Sets the end_date to the most recent end_date in the group
1127
+ - Sets criterion_id to 0 since there is no particular single row that the result refers to anymore
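
The steps in the list above can be sketched in plain Ruby. `sum_stream` is a hypothetical helper that works on arrays of hashes and groups only by person and type, as the list describes; it treats a missing value as 0 and is not the gem's SQL implementation.

```ruby
require 'date'

# Hypothetical helper for illustration only.
def sum_stream(rows)
  rows.group_by { |r| [r[:person_id], r[:criterion_type]] }.map do |(person_id, type), group|
    {
      person_id: person_id,
      criterion_id: 0,                                    # no single source row anymore
      criterion_type: type,
      start_date: group.map { |r| r[:start_date] }.min,   # earliest start
      end_date: group.map { |r| r[:end_date] }.max,       # most recent end
      value_as_numeric: group.sum { |r| r[:value_as_numeric].to_f }
    }
  end
end

rows = [
  { person_id: 1, criterion_type: 'person', start_date: Date.new(2014, 1, 1),
    end_date: Date.new(2014, 1, 2), value_as_numeric: 1 },
  { person_id: 1, criterion_type: 'person', start_date: Date.new(2013, 6, 1),
    end_date: Date.new(2014, 3, 1), value_as_numeric: 2 }
]
summed = sum_stream(rows)
```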
1128
+
1129
+
897
1130
  # Appendix A - Criterion Nodes
898
1131
 
899
1132
  | Node Name | Stream Type | Arguments | Returns |
@@ -1031,15 +1264,14 @@ ConceptQL is not yet fully specified. These are modifications/enhancements that
1031
1264
  5. How do we want to look up standard vocab concepts?
1032
1265
  - I think Marc’s approach is a bit heavy-handed
1033
1266
 
1034
-
1035
- ### Slots and Variables
1036
1267
  Some statements may be very useful and it would be handy to reuse the bulk of the statement, but perhaps vary just a few things about it. ConceptQL supports the idea of using variables to represent sub-expressions. The variable node is used as a placeholder to say "some criteria set belongs here". That variable can be defined in another part of the criteria set and will be used in all places the variable node appears.
1037
1268
 
1038
- If a variable node is used, but not defined, the concept is still valid, but will fail to run until a definition for all missing variables is provided.
1039
1269
 
1040
- I don't have a good feel for:
1270
+ ### Future Work for Define and Recall
1271
+ I'd like to make it so if a variable node is used, but not defined, the concept is still valid, but will fail to run until a definition for all missing variables is provided.
1272
+
1273
+ But I don't have a good feel for:
1041
1274
 
1042
- - How to represent a variable node in a diagram
1043
1275
  - Whether we should have users name the variables, or auto-assign a name?
1044
1276
  - We risk name collisions if a concept includes a sub-concept with the same variable name
1045
1277
  - Probably need to name space all variables
@@ -1048,173 +1280,26 @@ I don't have a good feel for:
1048
1280
  - We'll need to do a pass through a concept to find all variables and prompt a user, then do another pass through the concept before attempting to execute it to ensure all variables have values
1049
1281
  - Do we throw an exception if not?
1050
1282
  - Do we require calling programs to invoke a check on the concept before generating the query?
1283
+ - Perhaps slot is a different node from "define"
1051
1284
 
1052
1285
 
1053
- ##### Update 2014-08-13
1054
- I've hacked some variable support into an experimental branch of ConceptQL. So far, here's how it works:
1055
- - Define node
1056
- - Give it two params, a name, and a stream
1057
- - Stream is used to populate a temporary table that is named after the name
1058
- - name is plain english sentence that is then turned into a hexdigest for a name
1059
- - Avoids collisions with other names
1060
- - Avoids truncation issues if name is WAY long
1061
- - Recall node
1062
- - Takes a single argument: the name used in a define node
1063
- - Re-written to fetch results from the temp table that has the name provided
1064
-
1065
- Current issues:
1066
-
1067
- - Sequel's create table statement runs out-of-band with rest of ConceptQL statemnt
1068
- - Gets executed immediately
1069
- - Type information in "define" needs to be made available to "from"
1070
- - Currently attempting to pass this information from define to from using an attribute tacked onto the shared db connection
1071
- - Recall may not have access to this information until #query is called
1072
- - This is bad and needs to be fixed/rethought
1073
-
1074
-
1075
- Considerations for the future:
1076
- - Probably want to rename these nodes to something better
1077
- - It would still be nice to drop a concept into a concept that has "slots" waiting
1078
- - Perhaps slot is a different node from "define"
1079
- - I had to retool Tree and Query and Graph to expect an array of concepts in a ConceptQL statement
1080
- - I'm not sure I like this
1081
- - A ConceptQL statement perhaps should be only a single statement at the end
1082
- - If an array of sub-concepts is fed into Query, maybe we only execute the last one after parsing the others
1083
- - This is consistent with how Sequel wants to live and would yield a single set of results
1084
- - I think I like this
1085
- - Defines need to occur before they are used
1086
- - Most languages have a "forward definition" ability
1087
- - I have no use cases for when we might need those?
1088
- - Perhaps a definition that uses a definition that doesn't exist?
1089
- - Is that recursive?
1090
- - Is that something we want/need in ConceptQL?
1091
-
1092
-
1093
- ### Value Nodes
1094
- So far, we can’t recreate the Charlson comorbidity index using ConceptQL. If we added a “value” node, we could.
1095
-
1096
- By default each result row will carry a value column, set to 1. Some examples:
1097
- ```ConceptQL
1098
- # All MIs, defaulting value to 1
1099
- { icd9: '412' }
1100
- ```
1101
-
1102
- Passing streams through a value node changes the number stored in the value column:
1103
-
1104
- ```ConceptQL
1105
- # All MIs, changing value to 2
1106
- {
1107
- value: [
1108
- { icd9: '412' },
1109
- 2
1110
- ]
1111
- }
1112
- ```
1113
-
1114
- Value can also take a column name instead of a number. It will derive the results row's value from the value stored in the column specified.
1115
- ```ConceptQL
1116
- # All copays for 99214s
1117
- {
1118
- value: [
1119
- { procedure_cost: { cpt: '99214' } },
1120
- :paid_copay
1121
- ]
1122
- }
1123
- ```
1124
-
1125
- If something nonsensical happens, like the column specified isn't present in the table pointed to by a result row, the value in the result row will be unaffected:
1126
- ```ConceptQL
1127
- # Still all MIs with value defaulted to 1. condition_occurrence table doesn't have a "paid_copay" column
1128
- {
1129
- value: [
1130
- { icd9: '412' },
1131
- :paid_copay
1132
- ]
1133
- }
1134
- ```
1135
-
1136
- Or if the column specified exists, but refers to a non-numerical column, we'll set the value to 0
1137
- ```ConceptQL
1138
- # All MIs, with value set to 0 since the column specified by value node is a non-numerical column
1139
- {
1140
- value: [
1141
- { icd9: '412' },
1142
- :stop_reason
1143
- ]
1144
- }
1145
- ```
1146
-
1147
- With a value node defined, we could introduce a sum node that will sum by patient. This allows us to implement the Charlson comorbidity algorithm:
1148
- ```ConceptQL
1149
- {
1150
- sum: [
1151
- {
1152
- union: [
1153
- {
1154
- value: [
1155
- { person: { icd9: '412' } },
1156
- 1
1157
- ]
1158
- },
1159
- {
1160
- value: [
1161
- { person: { icd9: '278.02' } },
1162
- 2
1163
- ]
1164
- }
1165
- ]
1166
- }
1167
- ]
1168
- }
1169
- ```
1170
-
1171
- ### Counting
1172
- It might be helpful to count the number of occurrences of a result row in a stream. A simple "count" node could group identical rows and store the number of occurrences in the value column.
1286
+ ### Considerations for Values
1287
+ I'm considering defaulting each value_as\_\* column to some value.
1288
+ - numeric => 1
1289
+ - concept_id => 0
1290
+ - Or maybe the concept_id of the main concept_id value from the row?
1291
+ - This would be confusing when pulling from the observation table
1292
+ - What's the "main" concept_id of a person?
1293
+ - Hm. This feels a bit less like a good idea now
1294
+ - string
1295
+ - source_value?
1296
+ - Boy, this one is even harder to default
1173
1297
 
1174
- I need examples of algorithms that could benefit from this node. I'm concerned that we'll want to roll up occurrences by person most of the time and that would require us to first cast streams to person before passing the person stream to count.
1175
1298
  ```ConceptQL
1176
- # Count the number of times each person was irritable
1177
- {
1178
- count: { person: { icd9: '799.22' } }
1179
- }
1180
- ```
1181
-
1182
- We could do dumb things like count the number of times a row shows up in a union:
1183
- ```ConceptQL
1184
- # All rows with a value of 2 would be rows that were both MI and Primary
1185
- {
1186
- count: {
1187
- union: [
1188
- { icd9: '412' },
1189
- { primary_diagnosis: true}
1190
- ]
1191
- }
1192
- }
1299
+ # All MIs, defaulting value_as_numeric to 1, concept_id to concept id for 412, string to condition_source_value
1300
+ { icd9: '412' }
1193
1301
  ```
1194
1302
 
1195
- ### Value Comparison
1196
- Acts like any other binary node. L and R streams, joined by person. Any L that pass comparison go downstream. R is thrown out. Comparison based on result row's value column.
1197
-
1198
- - Less than
1199
- - Less than or equal
1200
- - Equal
1201
- - Greater than or equal
1202
- - Greater than
1203
- - Not equal
1204
- - Between
1205
-
1206
-
1207
- ### value_literal
1208
- ```ConceptQL
1209
- # People with more than 1 MI
1210
- {
1211
-
1212
- greater_than: {
1213
- left: { count: { person: { icd9: '412' }}},
1214
- right: { value_literal: 1 }
1215
- }
1216
- }
1217
- ```
1218
1303
 
1219
1304
  ### Filter Node
1220
1305
  Inspired by person_filter, why not just have a "filter" node that filters L by R. Takes L, R, and an "as" option. As option temporarily casts the L and R streams to the type specified by :as and then does person by person comparison, only keeping rows that occur on both sides. Handy for keeping procedures that coincide with conditions without fully casting the streams:
data/lib/conceptql/cli.rb CHANGED
@@ -89,7 +89,7 @@ module ConceptQL
89
89
  puts 'JSON'
90
90
  puts JSON.pretty_generate(q.statement)
91
91
  STDIN.gets
92
- graph_it(statement, my_db, title)
92
+ graph_it(statement, title)
93
93
  STDIN.gets
94
94
  puts q.query.sql
95
95
  STDIN.gets
@@ -59,7 +59,7 @@ module ConceptQL
59
59
  attr :values, :name
60
60
  def initialize(name, values)
61
61
  @name = name.to_s
62
- super(nil, values)
62
+ super(values)
63
63
  end
64
64
 
65
65
  def display_name
@@ -61,7 +61,7 @@ module ConceptQL
61
61
  wheres << Sequel.expr(person_id: uncastable_person_ids)
62
62
  end
63
63
 
64
- destination_type_id = type_id(my_type)
64
+ destination_type_id = make_type_id(my_type)
65
65
 
66
66
  unless to_me_types.empty?
67
67
  # For each castable type in the stream, setup a query that
@@ -72,7 +72,7 @@ module ConceptQL
72
72
  .where(criterion_type: source_type.to_s)
73
73
  .select_group(:criterion_id)
74
74
  source_table = make_table_name(source_type)
75
- source_type_id = type_id(source_type)
75
+ source_type_id = make_type_id(source_type)
76
76
 
77
77
  db.from(source_table)
78
78
  .where(source_type_id => source_ids)
@@ -85,7 +85,7 @@ module ConceptQL
85
85
 
86
86
  unless from_me_types.empty?
87
87
  from_me_types.each do |from_me_type|
88
- fk_type_id = type_id(from_me_type)
88
+ fk_type_id = make_type_id(from_me_type)
89
89
  wheres << Sequel.expr(fk_type_id => db.from(stream_query).where(criterion_type: from_me_type.to_s).select_group(:criterion_id))
90
90
  end
91
91
  end
@@ -11,7 +11,7 @@ module ConceptQL
11
11
  .exclude(:criterion_id => nil)
12
12
  .where(:criterion_type => type.to_s)
13
13
  query = db.from(make_table_name(type))
14
- .exclude(type_id(type) => positive_query)
14
+ .exclude(make_type_id(type) => positive_query)
15
15
  db.from(select_it(query, type))
16
16
  end.inject do |union_query, q|
17
17
  union_query.union(q, all: true)
@@ -0,0 +1,23 @@
1
+ require_relative 'pass_thru'
2
+
3
+ module ConceptQL
4
+ module Nodes
5
+ class Count < PassThru
6
+ def query(db)
7
+ db.from(unioned(db))
8
+ .group(*COLUMNS)
9
+ .select(*(COLUMNS - [:value_as_numeric]))
10
+ .select_append{count(1).as(:value_as_numeric)}
11
+ .from_self
12
+ end
13
+
14
+ def unioned(db)
15
+ children.map { |c| c.evaluate(db) }.inject do |uni, q|
16
+ uni.union(q)
17
+ end
18
+ end
19
+ end
20
+ end
21
+ end
22
+
23
+
@@ -42,7 +42,13 @@ module ConceptQL
42
42
  # Also, things will blow up if you try to use a variable that hasn't been
43
43
  # defined yet.
44
44
  def query(db)
45
- db.create_table!(table_name, temp: true, as: stream.evaluate(db))
45
+ # We'll wrap the creation of the temp table in memoization
46
+ # That way we can call #query multiple times, but only suffer the
47
+ # cost of creating the temp table just once
48
+ @_run ||= begin
49
+ db.create_table!(table_name, temp: true, as: stream.evaluate(db))
50
+ true
51
+ end
46
52
  db.from(table_name)
47
53
  end
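
The `@_run ||= begin ... end` idiom used in the hunk above can be tried in isolation. This is a minimal sketch, assuming a hypothetical `TempTableBuilder` class in which a counter stands in for the `db.create_table!` call.

```ruby
# Hypothetical class for illustration only; the counter replaces the real
# temp-table creation so the one-time behavior is observable.
class TempTableBuilder
  attr_reader :create_calls

  def initialize
    @create_calls = 0
  end

  def query
    # The begin block runs only while @_run is nil or false. Its final
    # expression (true) is what gets memoized, so later calls skip it.
    @_run ||= begin
      @create_calls += 1
      true
    end
    :temp_table
  end
end

builder = TempTableBuilder.new
3.times { builder.query }
```

Ending the block with `true` matters: if the last expression were `nil` or `false`, `||=` would run the block again on every call.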
48
54
 
@@ -0,0 +1,11 @@
1
+ require_relative 'temporal_node'
2
+
3
+ module ConceptQL
4
+ module Nodes
5
+ class Equal < TemporalNode
6
+ def where_clause
7
+ { r__value_as_numeric: :l__value_as_numeric }
8
+ end
9
+ end
10
+ end
11
+ end
@@ -3,6 +3,16 @@ require 'active_support/core_ext/hash'
3
3
  module ConceptQL
4
4
  module Nodes
5
5
  class Node
6
+ COLUMNS = [
7
+ :person_id,
8
+ :criterion_id,
9
+ :criterion_type,
10
+ :start_date,
11
+ :end_date,
12
+ :value_as_numeric,
13
+ :value_as_string,
14
+ :value_as_concept_id
15
+ ]
6
16
  attr :values, :options
7
17
  attr_accessor :tree
8
18
  def initialize(*args)
@@ -44,13 +54,15 @@ module ConceptQL
44
54
  end
45
55
 
46
56
  def columns(query, local_type = nil)
47
- criterion_type = Sequel.expr(:criterion_type)
57
+ criterion_type = :criterion_type
48
58
  if local_type
49
- criterion_type = Sequel.cast_string(local_type.to_s)
59
+ criterion_type = Sequel.cast_string(local_type.to_s).as(:criterion_type)
50
60
  end
51
- [:person_id___person_id,
52
- Sequel.expr(type_id(local_type)).as(:criterion_id),
53
- criterion_type.as(:criterion_type)] + date_columns(query, local_type)
61
+ columns = [:person_id,
62
+ type_id(local_type),
63
+ criterion_type]
64
+ columns += date_columns(query, local_type)
65
+ columns += value_columns(query)
54
66
  end
55
67
 
56
68
  private
@@ -77,6 +89,10 @@ module ConceptQL
77
89
  def type_id(type = nil)
78
90
  return :criterion_id if type.nil?
79
91
  type = :person if type == :death
92
+ Sequel.expr(make_type_id(type)).as(:criterion_id)
93
+ end
94
+
95
+ def make_type_id(type)
80
96
  (type.to_s + '_id').to_sym
81
97
  end
82
98
 
@@ -84,7 +100,31 @@ module ConceptQL
84
100
  "#{table}___tab".to_sym
85
101
  end
86
102
 
103
+ def value_columns(query)
104
+ [
105
+ numeric_value(query),
106
+ string_value(query),
107
+ concept_id_value(query)
108
+ ]
109
+ end
110
+
111
+ def numeric_value(query)
112
+ return :value_as_numeric if query.columns.include?(:value_as_numeric)
113
+ Sequel.cast_numeric(nil, Float).as(:value_as_numeric)
114
+ end
115
+
116
+ def string_value(query)
117
+ return :value_as_string if query.columns.include?(:value_as_string)
118
+ Sequel.cast_string(nil).as(:value_as_string)
119
+ end
120
+
121
+ def concept_id_value(query)
122
+ return :value_as_concept_id if query.columns.include?(:value_as_concept_id)
123
+ Sequel.cast_numeric(nil).as(:value_as_concept_id)
124
+ end
125
+
87
126
  def date_columns(query, type = nil)
127
+ return [:start_date, :end_date] if (query.columns.include?(:start_date) && query.columns.include?(:end_date))
88
128
  return [:start_date, :end_date] unless type
89
129
  sd = start_date_column(query, type)
90
130
  sd = Sequel.expr(sd).cast(:date).as(:start_date) unless sd == :start_date
@@ -0,0 +1,40 @@
1
+ require_relative 'pass_thru'
2
+
3
+ module ConceptQL
4
+ module Nodes
5
+ # Represents a node that will either:
6
+ # - create a value_as_numeric value for every person in the database
7
+ # - change the value_as_numeric value for every result passed in
8
+ # - either to a numeric
9
+ # - or a value from a column in the origin row
10
+ #
11
+ # Accepts two params:
12
+ # - Either a numeric value or a symbol representing a column name
13
+ # - An optional stream
14
+ class Numeric < PassThru
15
+ def query(db)
16
+ stream.nil? ? as_criterion(db) : with_kids(db)
17
+ end
18
+
19
+ def types
20
+ stream.nil? ? [:person] : super
21
+ end
22
+
23
+ private
24
+ def with_kids(db)
25
+ db.from(stream.evaluate(db))
26
+ .select(*(COLUMNS - [:value_as_numeric]))
27
+ .select_append(Sequel.lit('?', arguments.first).cast(Float).as(:value_as_numeric))
28
+ .from_self
29
+ end
30
+
31
+ def as_criterion(db)
32
+ db.from(select_it(db.from(:person), :person))
33
+ .select(*(COLUMNS - [:value_as_numeric]))
34
+ .select_append(Sequel.lit('?', arguments.first).cast(Float).as(:value_as_numeric))
35
+ .from_self
36
+ end
37
+ end
38
+ end
39
+ end
40
+
@@ -4,7 +4,7 @@ module ConceptQL
4
4
  module Nodes
5
5
  class PassThru < Node
6
6
  def types
7
- values.map(&:types).flatten.uniq
7
+ children.map(&:types).flatten.uniq
8
8
  end
9
9
  end
10
10
  end
@@ -20,17 +20,26 @@ module ConceptQL
20
20
  # before we call #query. Probably time to reevaluate how we're caching
21
21
  # the type information.
22
22
  def query(db)
23
+ # We're going to call evaluate on definition to ensure the definition
24
+ # has been created. We were running into odd timing issues when
25
+ # drawing graphs where the recall node was being drawn before definition
26
+ # was drawn.
27
+ definition.evaluate(db)
23
28
  db.from(table_name)
24
29
  end
25
30
 
26
31
  def types
27
- tree.defined[table_name].types
32
+ definition.types
28
33
  end
29
34
 
30
35
  private
31
36
  def table_name
32
37
  @table_name ||= namify(arguments.first)
33
38
  end
39
+
40
+ def definition
41
+ tree.defined[table_name]
42
+ end
34
43
  end
35
44
  end
36
45
  end
@@ -0,0 +1,24 @@
1
+ require_relative 'pass_thru'
2
+
3
+ module ConceptQL
4
+ module Nodes
5
+ class Sum < PassThru
6
+ def query(db)
7
+ db.from(unioned(db))
8
+ .select_group(*(COLUMNS - [:start_date, :end_date, :criterion_id, :value_as_numeric]))
9
+ .select_append(Sequel.lit('?', 0).as(:criterion_id))
10
+ .select_append{ min(start_date).as(:start_date) }
11
+ .select_append{ max(end_date).as(:end_date) }
12
+ .select_append{sum(value_as_numeric).as(:value_as_numeric)}
13
+ .from_self
14
+ end
15
+
16
+ def unioned(db)
17
+ children.map { |c| c.evaluate(db) }.inject do |uni, q|
18
+ uni.union(q)
19
+ end
20
+ end
21
+ end
22
+ end
23
+ end
24
+
@@ -1,3 +1,3 @@
1
1
  module ConceptQL
2
- VERSION = "0.0.9"
2
+ VERSION = "0.1.0"
3
3
  end
@@ -10,6 +10,6 @@ describe ConceptQL::Nodes::Complement do
10
10
  it 'generates complement for single criteria' do
11
11
  double1 = QueryDouble.new(1)
12
12
  double1.must_behave_like(:evaluator)
13
- ConceptQL::Nodes::Complement.new(double1).query(Sequel.mock).sql.must_equal "SELECT * FROM (SELECT person_id AS person_id, visit_occurrence_id AS criterion_id, CAST('visit_occurrence' AS varchar(255)) AS criterion_type, CAST(visit_start_date AS date) AS start_date, CAST(visit_end_date AS date) AS end_date FROM visit_occurrence AS tab WHERE (visit_occurrence_id NOT IN (SELECT criterion_id FROM (SELECT * FROM table1) AS t1 WHERE ((criterion_id IS NOT NULL) AND (criterion_type = 'visit_occurrence'))))) AS t1"
13
+ ConceptQL::Nodes::Complement.new(double1).query(Sequel.mock).sql.must_equal "SELECT * FROM (SELECT person_id, visit_occurrence_id AS criterion_id, CAST('visit_occurrence' AS varchar(255)) AS criterion_type, CAST(visit_start_date AS date) AS start_date, CAST(visit_end_date AS date) AS end_date, CAST(NULL AS numeric) AS value_as_numeric, CAST(NULL AS varchar(255)) AS value_as_string, CAST(NULL AS integer) AS value_as_concept_id FROM visit_occurrence AS tab WHERE (visit_occurrence_id NOT IN (SELECT criterion_id FROM (SELECT * FROM table1) AS t1 WHERE ((criterion_id IS NOT NULL) AND (criterion_type = 'visit_occurrence'))))) AS t1"
14
14
  end
15
15
  end
@@ -9,7 +9,7 @@ describe ConceptQL::Nodes::TimeWindow do
9
9
  end
10
10
 
11
11
  def query(db)
12
- db
12
+ db.from(:table)
13
13
  end
14
14
  end
15
15
 
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: conceptql
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.9
4
+ version: 0.1.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Ryan Duryea
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2014-09-03 00:00:00.000000000 Z
11
+ date: 2014-09-05 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: activesupport
@@ -156,6 +156,7 @@ files:
156
156
  - conceptql.gemspec
157
157
  - doc/ConceptQL Specification (alpha).pdf
158
158
  - doc/diagram_0.png
159
+ - doc/implementation_notes.md
159
160
  - doc/spec.md
160
161
  - lib/conceptql.rb
161
162
  - lib/conceptql/behaviors/dottable.rb
@@ -172,12 +173,14 @@ files:
172
173
  - lib/conceptql/nodes/complement.rb
173
174
  - lib/conceptql/nodes/concept.rb
174
175
  - lib/conceptql/nodes/condition_type.rb
176
+ - lib/conceptql/nodes/count.rb
175
177
  - lib/conceptql/nodes/cpt.rb
176
178
  - lib/conceptql/nodes/date_range.rb
177
179
  - lib/conceptql/nodes/death.rb
178
180
  - lib/conceptql/nodes/define.rb
179
181
  - lib/conceptql/nodes/drug_type_concept.rb
180
182
  - lib/conceptql/nodes/during.rb
183
+ - lib/conceptql/nodes/equal.rb
181
184
  - lib/conceptql/nodes/except.rb
182
185
  - lib/conceptql/nodes/first.rb
183
186
  - lib/conceptql/nodes/from.rb
@@ -190,6 +193,7 @@ files:
190
193
  - lib/conceptql/nodes/last.rb
191
194
  - lib/conceptql/nodes/loinc.rb
192
195
  - lib/conceptql/nodes/node.rb
196
+ - lib/conceptql/nodes/numeric.rb
193
197
  - lib/conceptql/nodes/occurrence.rb
194
198
  - lib/conceptql/nodes/pass_thru.rb
195
199
  - lib/conceptql/nodes/person.rb
@@ -203,6 +207,7 @@ files:
203
207
  - lib/conceptql/nodes/source_vocabulary_node.rb
204
208
  - lib/conceptql/nodes/standard_vocabulary_node.rb
205
209
  - lib/conceptql/nodes/started_by.rb
210
+ - lib/conceptql/nodes/sum.rb
206
211
  - lib/conceptql/nodes/temporal_node.rb
207
212
  - lib/conceptql/nodes/time_window.rb
208
213
  - lib/conceptql/nodes/union.rb