conceptql 0.0.9 → 0.1.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +18 -0
- data/doc/ConceptQL Specification (alpha).pdf +0 -0
- data/doc/implementation_notes.md +39 -0
- data/doc/spec.md +252 -167
- data/lib/conceptql/cli.rb +1 -1
- data/lib/conceptql/graph_nodifier.rb +1 -1
- data/lib/conceptql/nodes/casting_node.rb +3 -3
- data/lib/conceptql/nodes/complement.rb +1 -1
- data/lib/conceptql/nodes/count.rb +23 -0
- data/lib/conceptql/nodes/define.rb +7 -1
- data/lib/conceptql/nodes/equal.rb +11 -0
- data/lib/conceptql/nodes/node.rb +45 -5
- data/lib/conceptql/nodes/numeric.rb +40 -0
- data/lib/conceptql/nodes/pass_thru.rb +1 -1
- data/lib/conceptql/nodes/recall.rb +10 -1
- data/lib/conceptql/nodes/sum.rb +24 -0
- data/lib/conceptql/version.rb +1 -1
- data/spec/conceptql/nodes/complement_spec.rb +1 -1
- data/spec/conceptql/nodes/time_window_spec.rb +1 -1
- metadata +7 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA1:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 297853bf4ddbd97f096bd42b929843cf25b88774
|
4
|
+
data.tar.gz: b6ec45dc7694ffdd8e205a2fa0c911501cb8ebb4
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: c858413d3020e1168b793c43d54cf1a9e7a9574cc5ee0a1edd50237e71e4abc07ba2a6a9036fe8a82249b3908dce5ff6ea8d4c9500f05665223dd36c6b08bd7c
|
7
|
+
data.tar.gz: 3cd9354b86045d6d66061d131b32fb2451f1b9fdd582e905bdb7fbf0ea7d28b1657663e0b63c261b2454af30f234e81aaa509e04a49793c1150032d4de2e6a60
|
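The digests recorded in checksums.yaml cover the two archives packed inside the .gem file. A minimal sketch of checking one of them by hand follows. It assumes the archive bytes are already in memory; the `verify_digest` helper and the sample payload are hypothetical and not part of RubyGems.

```ruby
require 'digest'

# Hypothetical helper: compare the bytes of an archive against the hex
# digest recorded for it in checksums.yaml.
def verify_digest(bytes, expected_hex, algorithm: Digest::SHA512)
  algorithm.hexdigest(bytes) == expected_hex
end

# Stand-in payload; a real check would read metadata.gz or data.tar.gz
# out of the unpacked .gem archive instead.
payload  = 'example bytes'
recorded = Digest::SHA512.hexdigest(payload)

puts verify_digest(payload, recorded)               # digest matches
puts verify_digest(payload + 'tampered', recorded)  # digest no longer matches
```

Passing `algorithm: Digest::SHA1` would check the shorter SHA1 entries the same way.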
data/CHANGELOG.md
CHANGED
@@ -1,6 +1,24 @@
|
|
1
1
|
# Changelog
|
2
2
|
All notable changes to this project will be documented in this file.
|
3
3
|
|
4
|
+
## 0.1.0 - 2014-09-04
|
5
|
+
|
6
|
+
### Added
|
7
|
+
- Support for numeric, string, and concept_ids returned in results.
|
8
|
+
- Many updates to the ConceptQL Specification document.
|
9
|
+
- Added doc/implementation_notes.md to capture thoughts and bad ideas.
|
10
|
+
|
11
|
+
### Deprecated
|
12
|
+
- Nothing.
|
13
|
+
|
14
|
+
### Removed
|
15
|
+
- Nothing.
|
16
|
+
|
17
|
+
### Fixed
|
18
|
+
- "Fake" graphs are now drawn correctly.
|
19
|
+
- bin/conceptql doesn't bomb out drawing "fake" graphs
|
20
|
+
|
21
|
+
|
4
22
|
## 0.0.9 - 2014-09-03
|
5
23
|
|
6
24
|
### Added
|
data/doc/ConceptQL Specification (alpha).pdf
CHANGED
Binary file
|
data/doc/implementation_notes.md
ADDED
@@ -0,0 +1,39 @@
|
|
1
|
+
# ConceptQL Implementation Notes
|
2
|
+
And here is where I record information about some of the decisions I made.
|
3
|
+
|
4
|
+
## Define/Recall
|
5
|
+
- Sequel's create table statement runs out-of-band with the rest of the ConceptQL statement
|
6
|
+
- Gets executed immediately
|
7
|
+
- This is going to be very slow for large datasets :-(
|
8
|
+
- I had to retool Tree and Query and Graph to expect an array of concepts in a ConceptQL statement
|
9
|
+
- I'm not sure I like this
|
10
|
+
- A ConceptQL statement perhaps should be only a single statement at the end
|
11
|
+
- Defines need to occur before they are used
|
12
|
+
- Most languages have a "forward definition" ability
|
13
|
+
- I have no use cases for when we might need those?
|
14
|
+
- Perhaps a definition that uses a definition that doesn't exist?
|
15
|
+
- Is that recursive?
|
16
|
+
- Is that something we want/need in ConceptQL?
|
17
|
+
|
18
|
+
|
19
|
+
## Values
|
20
|
+
- I had considered comparing a result set with a between operator, but this implies that the R stream needs a range somehow
|
21
|
+
- I don't think supporting a range is a good idea right now
|
22
|
+
|
23
|
+
|
24
|
+
## Bad Ideas
|
25
|
+
Here is where I intend to record bad ideas that appear to be blind alleys
|
26
|
+
|
27
|
+
### Removing person-only rows from a stream after ANDing
|
28
|
+
- Theoretically, any result set that passes up through an AND node carries all the valid person IDs with it, so it is redundant to carry results that are just person-only. It would make sense to eliminate them.
|
29
|
+
- NO BAD IDEA
|
30
|
+
- We have to check to see if the stream is exclusively person-only. In that case, it is unsafe to eliminate those results because we'd end up removing all results from the stream
|
31
|
+
- We could implement this check, but let's do that later when we want to optimize things
|
32
|
+
- Also, what happens if we cast to visit later and we've removed all person-only results? This could cause odd behavior
|
33
|
+
|
34
|
+
### Treating Patient Streams as "Eternal" when they encounter a temporal node
|
35
|
+
I had created a rather sophisticated system of how to handle a patient stream entering a temporal node.
|
36
|
+
|
37
|
+
The basic premise is that patient information (gender, race, etc) is "timeless" or "eternal" and so if a patient stream is the R stream in a temporal node and the L stream is, say, a stream of MIs, what is the result? I proposed that the MI would be filtered down to only those patients that appear in the R stream.
|
38
|
+
|
39
|
+
Likewise, if the MI stream is the R stream and the person stream is the L stream, what's the result? Same thing. Only patients common to both streams are passed through the L stream. But patients DO have a date associated with them: their date of birth. And, by passing a patient stream through a time_shift node of say, +50yr, we can use that patient stream in the R stream of a temporal node to filter the L stream by the patient being 50 years old.
|
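The time_shift idea in the last note can be sketched in plain Ruby. This is only an illustration under stated assumptions: the row hashes, the `time_shift_years` helper, and the choice of an "on or after" comparison are hypothetical and do not come from the gem.

```ruby
require 'date'

# Hypothetical person stream: each row is dated by the date of birth.
people = [
  { person_id: 1, start_date: Date.new(1950, 3, 1) },
  { person_id: 2, start_date: Date.new(1980, 7, 15) }
]

# Hypothetical L stream of MI events.
mis = [
  { person_id: 1, start_date: Date.new(2005, 6, 1) },
  { person_id: 2, start_date: Date.new(2010, 1, 1) }
]

# Stand-in for a time_shift of +Nyr: Date#>> advances a date by months.
def time_shift_years(rows, years)
  rows.map { |r| r.merge(start_date: r[:start_date] >> (12 * years)) }
end

# Each person's 50th birthday, keyed by person_id.
fiftieth = time_shift_years(people, 50)
           .map { |r| [r[:person_id], r[:start_date]] }
           .to_h

# Keep only the MIs that happened on or after that person's 50th birthday.
after_50 = mis.select do |mi|
  cutoff = fiftieth[mi[:person_id]]
  cutoff && mi[:start_date] >= cutoff
end

puts after_50.map { |r| r[:person_id] }.inspect
```

Person 1 turned 50 in 2000, so the 2005 MI is kept; person 2 does not turn 50 until 2030, so the 2010 MI is dropped.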
data/doc/spec.md
CHANGED
@@ -478,7 +478,7 @@ For situations where we need to represent pre-defined date ranges, we can use "d
|
|
478
478
|
- *Not yet implemented*
|
479
479
|
|
480
480
|
|
481
|
-
#### What is
|
481
|
+
#### What is <date-format\>?
|
482
482
|
Dates follow these formats:
|
483
483
|
|
484
484
|
- "YYYY-MM-DD"
|
@@ -848,6 +848,90 @@ And don't forget the left-hand side can have multiple types of streams:
|
|
848
848
|
}
|
849
849
|
```
|
850
850
|
|
851
|
+
|
852
|
+
## Sub-concepts within a Larger Concept
|
853
|
+
If a concept is particularly complex, or has a stream of results that is used more than once, it can be helpful to break the concept into a set of sub-concepts. This can be done using two nodes: `define` and `recall`.
|
854
|
+
|
855
|
+
#### define
|
856
|
+
- Takes 2 arguments
|
857
|
+
- First argument is a string of arbitrary length that describes the stream to be saved. This is the "name" assigned to the stream for later recall
|
858
|
+
- Second argument is the stream to save under the name specified
|
859
|
+
|
860
|
+
|
861
|
+
#### recall
|
862
|
+
- Takes 1 argument
|
863
|
+
- The "name" of the stream previously saved using the `define` node
|
864
|
+
|
865
|
+
|
866
|
+
A stream must be `define`d before `recall` can use it.
|
867
|
+
|
868
|
+
```ConceptQL
|
869
|
+
# Save away a stream of results to build the 1 inpatient, 2 outpatient pattern used in claims data algorithms
|
870
|
+
[
|
871
|
+
{
|
872
|
+
define: [
|
873
|
+
'Heart Attack Visit',
|
874
|
+
{ visit_occurrence: { icd9: '412' } }
|
875
|
+
]
|
876
|
+
},
|
877
|
+
|
878
|
+
{
|
879
|
+
define: [
|
880
|
+
'Inpatient Heart Attack',
|
881
|
+
{
|
882
|
+
intersect: [
|
883
|
+
{ recall: 'Heart Attack Visit'},
|
884
|
+
{ place_of_service_code: 21 }
|
885
|
+
]
|
886
|
+
}
|
887
|
+
]
|
888
|
+
},
|
889
|
+
|
890
|
+
{
|
891
|
+
define: [
|
892
|
+
'Outpatient Heart Attack',
|
893
|
+
{
|
894
|
+
intersect: [
|
895
|
+
{ recall: 'Heart Attack Visit'},
|
896
|
+
{
|
897
|
+
complement: {
|
898
|
+
place_of_service_code: 21
|
899
|
+
}
|
900
|
+
}
|
901
|
+
]
|
902
|
+
}
|
903
|
+
]
|
904
|
+
},
|
905
|
+
|
906
|
+
{
|
907
|
+
define: [
|
908
|
+
'Earlier of Two Outpatient Heart Attacks',
|
909
|
+
{
|
910
|
+
before: {
|
911
|
+
left: { recall: 'Outpatient Heart Attack' },
|
912
|
+
right: {
|
913
|
+
time_window: [
|
914
|
+
{ recall: 'Outpatient Heart Attack' },
|
915
|
+
{ start: '-30d', end: '0' }
|
916
|
+
]
|
917
|
+
}
|
918
|
+
}
|
919
|
+
}
|
920
|
+
]
|
921
|
+
},
|
922
|
+
|
923
|
+
{
|
924
|
+
first: {
|
925
|
+
union: [
|
926
|
+
{ recall: 'Inpatient Heart Attack' },
|
927
|
+
{ recall: 'Earlier of Two Outpatient Heart Attacks'}
|
928
|
+
]
|
929
|
+
}
|
930
|
+
}
|
931
|
+
]
|
932
|
+
```
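The define-before-recall rule can be illustrated outside of ConceptQL with a small registry. This is a sketch only: `StreamRegistry` is a hypothetical class, streams are plain arrays of rows, and none of it reflects how the gem stores definitions.

```ruby
# Hypothetical registry illustrating the define-before-recall rule.
class StreamRegistry
  def initialize
    @streams = {}
  end

  # Save a stream (here just an array of rows) under a name.
  def define(name, rows)
    @streams[name] = rows
  end

  # Fetch a previously defined stream; fail loudly if it was never defined.
  def recall(name)
    @streams.fetch(name) do
      raise KeyError, "stream #{name.inspect} recalled before it was defined"
    end
  end
end

registry = StreamRegistry.new
registry.define('Heart Attack Visit',
                [{ person_id: 1, criterion_type: 'visit_occurrence' }])

# Fine: the name was defined above.
inpatient = registry.recall('Heart Attack Visit')

begin
  # Never defined, so this raises instead of returning an empty stream.
  registry.recall('Outpatient Heart Attack')
rescue KeyError => e
  puts e.message
end
```

Raising on an unknown name is one possible design; it surfaces ordering mistakes early rather than silently producing no results.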
|
933
|
+
|
934
|
+
|
851
935
|
## Concepts within Concepts
|
852
936
|
One of the main motivations behind keeping ConceptQL so flexible is to allow users to build ConceptQL statements from other ConceptQL statements. This section loosely describes how this feature will work. Its actual execution and implementation will differ from what is presented here.
|
853
937
|
|
@@ -894,6 +978,155 @@ In the actual implementation of the concept node, each ConceptQL statement will
|
|
894
978
|
}
|
895
979
|
```
|
896
980
|
|
981
|
+
|
982
|
+
## Values
|
983
|
+
A result can carry forward three different types of values, modeled after the behavior of the observation table:
|
984
|
+
|
985
|
+
- value_as_numeric
|
986
|
+
- For values like lab values, counts of occurrence of results, cost information
|
987
|
+
- value_as_string
|
988
|
+
- For value_as_string from observation table, or notes captured in EHR data
|
989
|
+
- value_as_concept_id
|
990
|
+
- For values that are like factors from the observation value_as_concept_id column
|
991
|
+
|
992
|
+
|
993
|
+
By default, all value fields are set to NULL, unless a criterion node is explicitly written to populate one or more of those fields.
|
994
|
+
|
995
|
+
There are many operations that can be performed on the value_as\_\* columns and as those operations are implemented, this section will grow.
|
996
|
+
|
997
|
+
For now we'll cover some of the general behavior of the value_as_numeric column and its associated nodes.
|
998
|
+
|
999
|
+
#### numeric
|
1000
|
+
- Takes 2 arguments
|
1001
|
+
- A stream
|
1002
|
+
- And a numeric value or a symbol representing the name of a column in CDM
|
1003
|
+
|
1004
|
+
Passing streams through a `numeric` node changes the number stored in the value column:
|
1005
|
+
|
1006
|
+
```ConceptQL
|
1007
|
+
# All MIs, setting value_as_numeric to 2
|
1008
|
+
{
|
1009
|
+
numeric: [
|
1010
|
+
{ icd9: '412' },
|
1011
|
+
2
|
1012
|
+
]
|
1013
|
+
}
|
1014
|
+
```
|
1015
|
+
|
1016
|
+
`numeric` can also take a column name instead of a number. It will derive the results row's value from the value stored in the column specified.
|
1017
|
+
```ConceptQL
|
1018
|
+
# All copays for 99214s
|
1019
|
+
{
|
1020
|
+
numeric: [
|
1021
|
+
{ procedure_cost: { cpt: '99214' } },
|
1022
|
+
:paid_copay
|
1023
|
+
]
|
1024
|
+
}
|
1025
|
+
```
|
1026
|
+
|
1027
|
+
If something nonsensical happens, like the column specified isn't present in the table pointed to by a result row, value_as_numeric in the result row will be unaffected:
|
1028
|
+
```ConceptQL
|
1029
|
+
# Still all MIs with value_as_numeric defaulted to NULL. condition_occurrence table doesn't have a "paid_copay" column
|
1030
|
+
{
|
1031
|
+
value: [
|
1032
|
+
{ icd9: '412' },
|
1033
|
+
:paid_copay
|
1034
|
+
]
|
1035
|
+
}
|
1036
|
+
```
|
1037
|
+
|
1038
|
+
Or if the column specified exists, but refers to a non-numerical column, we'll set the value to 0
|
1039
|
+
```ConceptQL
|
1040
|
+
# All MIs, with value set to 0 since the column specified by value node is a non-numerical column
|
1041
|
+
{
|
1042
|
+
value: [
|
1043
|
+
{ icd9: '412' },
|
1044
|
+
:stop_reason
|
1045
|
+
]
|
1046
|
+
}
|
1047
|
+
```
|
1048
|
+
|
1049
|
+
With a `numeric` node defined, we could introduce a sum node that will sum by patient and type. This allows us to implement the Charlson comorbidity algorithm:
|
1050
|
+
```ConceptQL
|
1051
|
+
{
|
1052
|
+
sum: [
|
1053
|
+
{
|
1054
|
+
union: [
|
1055
|
+
{
|
1056
|
+
numeric: [
|
1057
|
+
{ person: { icd9: '412' } },
|
1058
|
+
1
|
1059
|
+
]
|
1060
|
+
},
|
1061
|
+
{
|
1062
|
+
numeric: [
|
1063
|
+
{ person: { icd9: '278.02' } },
|
1064
|
+
2
|
1065
|
+
]
|
1066
|
+
}
|
1067
|
+
]
|
1068
|
+
}
|
1069
|
+
]
|
1070
|
+
}
|
1071
|
+
```
|
1072
|
+
|
1073
|
+
### Counting
|
1074
|
+
It might be helpful to count the number of occurrences of a result row in a stream. A simple "count" node could group identical rows and store the number of occurrences in the value_as_numeric column.
|
1075
|
+
|
1076
|
+
I need examples of algorithms that could benefit from this node. I'm concerned that we'll want to roll up occurrences by person most of the time and that would require us to first cast streams to person before passing the person stream to count.
|
1077
|
+
```ConceptQL
|
1078
|
+
# Count the number of times each person was irritable
|
1079
|
+
{
|
1080
|
+
count: { person: { icd9: '799.22' } }
|
1081
|
+
}
|
1082
|
+
```
|
1083
|
+
|
1084
|
+
We could do dumb things like count the number of times a row shows up in a union:
|
1085
|
+
```ConceptQL
|
1086
|
+
# All rows with a value of 2 would be rows that were both MI and Primary
|
1087
|
+
{
|
1088
|
+
count: {
|
1089
|
+
union: [
|
1090
|
+
{ icd9: '412' },
|
1091
|
+
{ primary_diagnosis: true}
|
1092
|
+
]
|
1093
|
+
}
|
1094
|
+
}
|
1095
|
+
```
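The grouping described for "count" can be sketched in plain Ruby, using hashes as stand-ins for result rows. The `count_rows` helper and the sample rows are hypothetical; the gem itself expresses this as a grouped SQL query.

```ruby
# Hypothetical stand-in for the count node: group identical rows and
# store the size of each group in value_as_numeric.
def count_rows(rows)
  rows.group_by { |row| row }.map do |row, duplicates|
    row.merge(value_as_numeric: duplicates.size)
  end
end

irritable = { person_id: 1, criterion_id: 1, criterion_type: 'person' }
other     = { person_id: 2, criterion_id: 2, criterion_type: 'person' }

# Person 1 appears twice in the stream, person 2 once.
counted = count_rows([irritable, irritable, other])

counted.each { |row| puts row.inspect }
```

Because the rows were cast to person first, the two identical person rows collapse into a single row carrying a count of 2.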
|
1096
|
+
|
1097
|
+
#### Numeric Value Comparison
|
1098
|
+
Acts like any other binary node. L and R streams, joined by person. Any L that pass comparison go downstream. R is thrown out. Comparison based on result row's value column.
|
1099
|
+
|
1100
|
+
- Less than
|
1101
|
+
- Less than or equal
|
1102
|
+
- Equal
|
1103
|
+
- Greater than or equal
|
1104
|
+
- Greater than
|
1105
|
+
- Not equal
|
1106
|
+
|
1107
|
+
|
1108
|
+
### numeric as criterion node
|
1109
|
+
Numeric doesn't have to take a stream. If it doesn't have a stream as an argument, it acts like a criterion node much like date_range
|
1110
|
+
```ConceptQL
|
1111
|
+
# People with more than 1 MI
|
1112
|
+
{
|
1113
|
+
|
1114
|
+
greater_than: {
|
1115
|
+
left: { count: { person: { icd9: '412' }}},
|
1116
|
+
right: { numeric: 1 }
|
1117
|
+
}
|
1118
|
+
}
|
1119
|
+
```
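A rough Ruby sketch of how the comparison above might behave, assuming that `numeric` used as a criterion yields one row per person and that an L row passes when it exceeds at least one matching R row. Both helpers are hypothetical stand-ins, not the gem's implementation.

```ruby
# Hypothetical stand-in for `numeric: 1` used as a criterion node:
# one row per person, each carrying the literal value.
def numeric_criterion(person_ids, value)
  person_ids.map { |id| { person_id: id, value_as_numeric: value.to_f } }
end

# Hypothetical stand-in for greater_than: join L to R on person_id and
# keep only the L rows whose value exceeds a matching R row's value.
# R rows themselves are never passed downstream.
def greater_than(left, right)
  right_by_person = right.group_by { |r| r[:person_id] }
  left.select do |l|
    right_by_person.fetch(l[:person_id], []).any? do |r|
      l[:value_as_numeric] > r[:value_as_numeric]
    end
  end
end

# Pretend output of `count: { person: { icd9: '412' } }`.
mi_counts = [
  { person_id: 1, value_as_numeric: 3 },
  { person_id: 2, value_as_numeric: 1 }
]

more_than_one_mi = greater_than(mi_counts, numeric_criterion([1, 2], 1))
puts more_than_one_mi.map { |r| r[:person_id] }.inspect
```

Only person 1 survives, since a count of 1 is not greater than 1.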
|
1120
|
+
|
1121
|
+
#### sum
|
1122
|
+
- Takes a stream of results and does some wild things
|
1123
|
+
- Groups all results by person and type
|
1124
|
+
- Sums the value_as_numeric column within that grouping
|
1125
|
+
- Sets start_date to the earliest start_date in the group
|
1126
|
+
- Sets the end_date to the most recent end_date in the group
|
1127
|
+
- Sets criterion_id to 0 since there is no particular single row that the result refers to anymore
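The bullets above can be sketched in plain Ruby. This follows the list as written (grouping by person and type only); the `sum_rows` helper and the sample rows are hypothetical, and the gem performs the equivalent work in SQL.

```ruby
require 'date'

# Hypothetical stand-in for the sum node, following the bullets above.
def sum_rows(rows)
  rows.group_by { |r| [r[:person_id], r[:criterion_type]] }
      .map do |(person_id, type), group|
    {
      person_id: person_id,
      criterion_id: 0, # no single source row anymore
      criterion_type: type,
      start_date: group.map { |r| r[:start_date] }.min,
      end_date: group.map { |r| r[:end_date] }.max,
      value_as_numeric: group.sum { |r| r[:value_as_numeric] }
    }
  end
end

# Two weighted conditions for person 1, one for person 2.
rows = [
  { person_id: 1, criterion_id: 1, criterion_type: 'person',
    start_date: Date.new(2010, 1, 1), end_date: Date.new(2010, 1, 5),
    value_as_numeric: 1 },
  { person_id: 1, criterion_id: 1, criterion_type: 'person',
    start_date: Date.new(2011, 3, 1), end_date: Date.new(2011, 3, 2),
    value_as_numeric: 2 },
  { person_id: 2, criterion_id: 2, criterion_type: 'person',
    start_date: Date.new(2012, 5, 1), end_date: Date.new(2012, 5, 1),
    value_as_numeric: 2 }
]

scores = sum_rows(rows)
scores.each { |row| puts row.inspect }
```

Person 1 ends up with a single row whose value is 3, spanning the earliest start date and the latest end date of the group.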
|
1128
|
+
|
1129
|
+
|
897
1130
|
# Appendix A - Criterion Nodes
|
898
1131
|
|
899
1132
|
| Node Name | Stream Type | Arguments | Returns |
|
@@ -1031,15 +1264,14 @@ ConceptQL is not yet fully specified. These are modifications/enhancements that
|
|
1031
1264
|
5. How do we want to look up standard vocab concepts?
|
1032
1265
|
- I think Marc’s approach is a bit heavy-handed
|
1033
1266
|
|
1034
|
-
|
1035
|
-
### Slots and Variables
|
1036
1267
|
Some statements may be very useful and it would be handy to reuse the bulk of the statement, but perhaps vary just a few things about it. ConceptQL supports the idea of using variables to represent sub-expressions. The variable node is used as a placeholder to say "some criteria set belongs here". That variable can be defined in another part of the criteria set and will be used in all places the variable node appears.
|
1037
1268
|
|
1038
|
-
If a variable node is used, but not defined, the concept is still valid, but will fail to run until a definition for all missing variables is provided.
|
1039
1269
|
|
1040
|
-
|
1270
|
+
### Future Work for Define and Recall
|
1271
|
+
I'd like to make it so if a variable node is used, but not defined, the concept is still valid, but will fail to run until a definition for all missing variables is provided.
|
1272
|
+
|
1273
|
+
But I don't have a good feel for:
|
1041
1274
|
|
1042
|
-
- How to represent a variable node in a diagram
|
1043
1275
|
- Whether we should have users name the variables, or auto-assign a name?
|
1044
1276
|
- We risk name collisions if a concept includes a sub-concept with the same variable name
|
1045
1277
|
- Probably need to name space all variables
|
@@ -1048,173 +1280,26 @@ I don't have a good feel for:
|
|
1048
1280
|
- We'll need to do a pass through a concept to find all variables and prompt a user, then do another pass through the concept before attempting to execute it to ensure all variables have values
|
1049
1281
|
- Do we throw an exception if not?
|
1050
1282
|
- Do we require calling programs to invoke a check on the concept before generating the query?
|
1283
|
+
- Perhaps slot is a different node from "define"
|
1051
1284
|
|
1052
1285
|
|
1053
|
-
|
1054
|
-
I'
|
1055
|
-
-
|
1056
|
-
|
1057
|
-
-
|
1058
|
-
|
1059
|
-
-
|
1060
|
-
-
|
1061
|
-
-
|
1062
|
-
-
|
1063
|
-
-
|
1064
|
-
|
1065
|
-
Current issues:
|
1066
|
-
|
1067
|
-
- Sequel's create table statement runs out-of-band with the rest of the ConceptQL statement
|
1068
|
-
- Gets executed immediately
|
1069
|
-
- Type information in "define" needs to be made available to "from"
|
1070
|
-
- Currently attempting to pass this information from define to from using an attribute tacked onto the shared db connection
|
1071
|
-
- Recall may not have access to this information until #query is called
|
1072
|
-
- This is bad and needs to be fixed/rethought
|
1073
|
-
|
1074
|
-
|
1075
|
-
Considerations for the future:
|
1076
|
-
- Probably want to rename these nodes to something better
|
1077
|
-
- It would still be nice to drop a concept into a concept that has "slots" waiting
|
1078
|
-
- Perhaps slot is a different node from "define"
|
1079
|
-
- I had to retool Tree and Query and Graph to expect an array of concepts in a ConceptQL statement
|
1080
|
-
- I'm not sure I like this
|
1081
|
-
- A ConceptQL statement perhaps should be only a single statement at the end
|
1082
|
-
- If an array of sub-concepts is fed into Query, maybe we only execute the last one after parsing the others
|
1083
|
-
- This is consistent with how Sequel wants to live and would yield a single set of results
|
1084
|
-
- I think I like this
|
1085
|
-
- Defines need to occur before they are used
|
1086
|
-
- Most languages have a "forward definition" ability
|
1087
|
-
- I have no use cases for when we might need those?
|
1088
|
-
- Perhaps a definition that uses a definition that doesn't exist?
|
1089
|
-
- Is that recursive?
|
1090
|
-
- Is that something we want/need in ConceptQL?
|
1091
|
-
|
1092
|
-
|
1093
|
-
### Value Nodes
|
1094
|
-
So far, we can’t recreate the Charlson comorbidity index using ConceptQL. If we added a “value” node, we could.
|
1095
|
-
|
1096
|
-
By default each result row will carry a value column, set to 1. Some examples:
|
1097
|
-
```ConceptQL
|
1098
|
-
# All MIs, defaulting value to 1
|
1099
|
-
{ icd9: '412' }
|
1100
|
-
```
|
1101
|
-
|
1102
|
-
Passing streams through a value node changes the number stored in the value column:
|
1103
|
-
|
1104
|
-
```ConceptQL
|
1105
|
-
# All MIs, changing value to 2
|
1106
|
-
{
|
1107
|
-
value: [
|
1108
|
-
{ icd9: '412' },
|
1109
|
-
2
|
1110
|
-
]
|
1111
|
-
}
|
1112
|
-
```
|
1113
|
-
|
1114
|
-
Value can also take a column name instead of a number. It will derive the results row's value from the value stored in the column specified.
|
1115
|
-
```ConceptQL
|
1116
|
-
# All copays for 99214s
|
1117
|
-
{
|
1118
|
-
value: [
|
1119
|
-
{ procedure_cost: { cpt: '99214' } },
|
1120
|
-
:paid_copay
|
1121
|
-
]
|
1122
|
-
}
|
1123
|
-
```
|
1124
|
-
|
1125
|
-
If something nonsensical happens, like the column specified isn't present in the table pointed to by a result row, the value in the result row will be unaffected:
|
1126
|
-
```ConceptQL
|
1127
|
-
# Still all MIs with value defaulted to 1. condition_occurrence table doesn't have a "paid_copay" column
|
1128
|
-
{
|
1129
|
-
value: [
|
1130
|
-
{ icd9: '412' },
|
1131
|
-
:paid_copay
|
1132
|
-
]
|
1133
|
-
}
|
1134
|
-
```
|
1135
|
-
|
1136
|
-
Or if the column specified exists, but refers to a non-numerical column, we'll set the value to 0
|
1137
|
-
```ConceptQL
|
1138
|
-
# All MIs, with value set to 0 since the column specified by value node is a non-numerical column
|
1139
|
-
{
|
1140
|
-
value: [
|
1141
|
-
{ icd9: '412' },
|
1142
|
-
:stop_reason
|
1143
|
-
]
|
1144
|
-
}
|
1145
|
-
```
|
1146
|
-
|
1147
|
-
With a value node defined, we could introduce a sum node that will sum by patient. This allows us to implement the Charlson comorbidity algorithm:
|
1148
|
-
```ConceptQL
|
1149
|
-
{
|
1150
|
-
sum: [
|
1151
|
-
{
|
1152
|
-
union: [
|
1153
|
-
{
|
1154
|
-
value: [
|
1155
|
-
{ person: { icd9: '412' } },
|
1156
|
-
1
|
1157
|
-
]
|
1158
|
-
},
|
1159
|
-
{
|
1160
|
-
value: [
|
1161
|
-
{ person: { icd9: '278.02' } },
|
1162
|
-
2
|
1163
|
-
]
|
1164
|
-
}
|
1165
|
-
]
|
1166
|
-
}
|
1167
|
-
]
|
1168
|
-
}
|
1169
|
-
```
|
1170
|
-
|
1171
|
-
### Counting
|
1172
|
-
It might be helpful to count the number of occurrences of a result row in a stream. A simple "count" node could group identical rows and store the number of occurrences in the value column.
|
1286
|
+
### Considerations for Values
|
1287
|
+
I'm considering defaulting each value_as\_\* column to some value.
|
1288
|
+
- numeric => 1
|
1289
|
+
- concept_id => 0
|
1290
|
+
- Or maybe the concept_id of the main concept_id value from the row?
|
1291
|
+
- This would be confusing when pulling from the observation table
|
1292
|
+
- What's the "main" concept_id of a person?
|
1293
|
+
- Hm. This feels a bit less like a good idea now
|
1294
|
+
- string
|
1295
|
+
- source_value?
|
1296
|
+
- Boy, this one is even harder to default
|
1173
1297
|
|
1174
|
-
I need examples of algorithms that could benefit from this node. I'm concerned that we'll want to roll up occurrences by person most of the time and that would require us to first cast streams to person before passing the person stream to count.
|
1175
1298
|
```ConceptQL
|
1176
|
-
#
|
1177
|
-
{
|
1178
|
-
count: { person: { icd9: '799.22' } }
|
1179
|
-
}
|
1180
|
-
```
|
1181
|
-
|
1182
|
-
We could do dumb things like count the number of times a row shows up in a union:
|
1183
|
-
```ConceptQL
|
1184
|
-
# All rows with a value of 2 would be rows that were both MI and Primary
|
1185
|
-
{
|
1186
|
-
count: {
|
1187
|
-
union: [
|
1188
|
-
{ icd9: '412' },
|
1189
|
-
{ primary_diagnosis: true}
|
1190
|
-
]
|
1191
|
-
}
|
1192
|
-
}
|
1299
|
+
# All MIs, defaulting value_as_numeric to 1, concept_id to concept id for 412, string to condition_source_value
|
1300
|
+
{ icd9: '412' }
|
1193
1301
|
```
|
1194
1302
|
|
1195
|
-
### Value Comparison
|
1196
|
-
Acts like any other binary node. L and R streams, joined by person. Any L that pass comparison go downstream. R is thrown out. Comparison based on result row's value column.
|
1197
|
-
|
1198
|
-
- Less than
|
1199
|
-
- Less than or equal
|
1200
|
-
- Equal
|
1201
|
-
- Greater than or equal
|
1202
|
-
- Greater than
|
1203
|
-
- Not equal
|
1204
|
-
- Between
|
1205
|
-
|
1206
|
-
|
1207
|
-
### value_literal
|
1208
|
-
```ConceptQL
|
1209
|
-
# People with more than 1 MI
|
1210
|
-
{
|
1211
|
-
|
1212
|
-
greater_than: {
|
1213
|
-
left: { count: { person: { icd9: '412' }}},
|
1214
|
-
right: { value_literal: 1 }
|
1215
|
-
}
|
1216
|
-
}
|
1217
|
-
```
|
1218
1303
|
|
1219
1304
|
### Filter Node
|
1220
1305
|
Inspired by person_filter, why not just have a "filter" node that filters L by R. Takes L, R, and an "as" option. As option temporarily casts the L and R streams to the type specified by :as and then does person by person comparison, only keeping rows that occur on both sides. Handy for keeping procedures that coincide with conditions without fully casting the streams:
|
data/lib/conceptql/cli.rb
CHANGED
data/lib/conceptql/nodes/casting_node.rb
CHANGED
@@ -61,7 +61,7 @@ module ConceptQL
|
|
61
61
|
wheres << Sequel.expr(person_id: uncastable_person_ids)
|
62
62
|
end
|
63
63
|
|
64
|
-
destination_type_id =
|
64
|
+
destination_type_id = make_type_id(my_type)
|
65
65
|
|
66
66
|
unless to_me_types.empty?
|
67
67
|
# For each castable type in the stream, setup a query that
|
@@ -72,7 +72,7 @@ module ConceptQL
|
|
72
72
|
.where(criterion_type: source_type.to_s)
|
73
73
|
.select_group(:criterion_id)
|
74
74
|
source_table = make_table_name(source_type)
|
75
|
-
source_type_id =
|
75
|
+
source_type_id = make_type_id(source_type)
|
76
76
|
|
77
77
|
db.from(source_table)
|
78
78
|
.where(source_type_id => source_ids)
|
@@ -85,7 +85,7 @@ module ConceptQL
|
|
85
85
|
|
86
86
|
unless from_me_types.empty?
|
87
87
|
from_me_types.each do |from_me_type|
|
88
|
-
fk_type_id =
|
88
|
+
fk_type_id = make_type_id(from_me_type)
|
89
89
|
wheres << Sequel.expr(fk_type_id => db.from(stream_query).where(criterion_type: from_me_type.to_s).select_group(:criterion_id))
|
90
90
|
end
|
91
91
|
end
|
data/lib/conceptql/nodes/complement.rb
CHANGED
@@ -11,7 +11,7 @@ module ConceptQL
|
|
11
11
|
.exclude(:criterion_id => nil)
|
12
12
|
.where(:criterion_type => type.to_s)
|
13
13
|
query = db.from(make_table_name(type))
|
14
|
-
.exclude(
|
14
|
+
.exclude(make_type_id(type) => positive_query)
|
15
15
|
db.from(select_it(query, type))
|
16
16
|
end.inject do |union_query, q|
|
17
17
|
union_query.union(q, all: true)
|
data/lib/conceptql/nodes/count.rb
ADDED
@@ -0,0 +1,23 @@
|
|
1
|
+
require_relative 'pass_thru'
|
2
|
+
|
3
|
+
module ConceptQL
|
4
|
+
module Nodes
|
5
|
+
class Count < PassThru
|
6
|
+
def query(db)
|
7
|
+
db.from(unioned(db))
|
8
|
+
.group(*COLUMNS)
|
9
|
+
.select(*(COLUMNS - [:value_as_numeric]))
|
10
|
+
.select_append{count(1).as(:value_as_numeric)}
|
11
|
+
.from_self
|
12
|
+
end
|
13
|
+
|
14
|
+
def unioned(db)
|
15
|
+
children.map { |c| c.evaluate(db) }.inject do |uni, q|
|
16
|
+
uni.union(q)
|
17
|
+
end
|
18
|
+
end
|
19
|
+
end
|
20
|
+
end
|
21
|
+
end
|
22
|
+
|
23
|
+
|
data/lib/conceptql/nodes/define.rb
CHANGED
@@ -42,7 +42,13 @@ module ConceptQL
|
|
42
42
|
# Also, things will blow up if you try to use a variable that hasn't been
|
43
43
|
# defined yet.
|
44
44
|
def query(db)
|
45
|
-
|
45
|
+
# We'll wrap the creation of the temp table in memoization
|
46
|
+
# That way we can call #query multiple times, but only suffer the
|
47
|
+
# cost of creating the temp table just once
|
48
|
+
@_run ||= begin
|
49
|
+
db.create_table!(table_name, temp: true, as: stream.evaluate(db))
|
50
|
+
true
|
51
|
+
end
|
46
52
|
db.from(table_name)
|
47
53
|
end
|
48
54
|
|
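The memoization idiom used in the hunk above, `@_run ||= begin ... true end`, can be seen in isolation in the following sketch. `ExpensiveDefine` is a hypothetical class; the counter stands in for creating the temp table and the returned symbol stands in for `db.from(table_name)`.

```ruby
# Hypothetical node showing the `@_run ||= begin ... true end` idiom.
class ExpensiveDefine
  attr_reader :builds

  def initialize
    @builds = 0
  end

  def query
    # The block runs only while @_run is nil or false. Ending it with
    # `true` guarantees that later calls skip the expensive step.
    @_run ||= begin
      @builds += 1 # stands in for creating the temp table
      true
    end
    :table_reference # stands in for db.from(table_name)
  end
end

node = ExpensiveDefine.new
3.times { node.query }
puts node.builds # prints 1
```

The explicit `true` matters: if the block ended in a falsy value, `||=` would re-run it on every call.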
data/lib/conceptql/nodes/node.rb
CHANGED
@@ -3,6 +3,16 @@ require 'active_support/core_ext/hash'
|
|
3
3
|
module ConceptQL
|
4
4
|
module Nodes
|
5
5
|
class Node
|
6
|
+
COLUMNS = [
|
7
|
+
:person_id,
|
8
|
+
:criterion_id,
|
9
|
+
:criterion_type,
|
10
|
+
:start_date,
|
11
|
+
:end_date,
|
12
|
+
:value_as_numeric,
|
13
|
+
:value_as_string,
|
14
|
+
:value_as_concept_id
|
15
|
+
]
|
6
16
|
attr :values, :options
|
7
17
|
attr_accessor :tree
|
8
18
|
def initialize(*args)
|
@@ -44,13 +54,15 @@ module ConceptQL
|
|
44
54
|
end
|
45
55
|
|
46
56
|
def columns(query, local_type = nil)
|
47
|
-
criterion_type =
|
57
|
+
criterion_type = :criterion_type
|
48
58
|
if local_type
|
49
|
-
criterion_type = Sequel.cast_string(local_type.to_s)
|
59
|
+
criterion_type = Sequel.cast_string(local_type.to_s).as(:criterion_type)
|
50
60
|
end
|
51
|
-
[:
|
52
|
-
|
53
|
-
|
61
|
+
columns = [:person_id,
|
62
|
+
type_id(local_type),
|
63
|
+
criterion_type]
|
64
|
+
columns += date_columns(query, local_type)
|
65
|
+
columns += value_columns(query)
|
54
66
|
end
|
55
67
|
|
56
68
|
private
|
@@ -77,6 +89,10 @@ module ConceptQL
|
|
77
89
|
def type_id(type = nil)
|
78
90
|
return :criterion_id if type.nil?
|
79
91
|
type = :person if type == :death
|
92
|
+
Sequel.expr(make_type_id(type)).as(:criterion_id)
|
93
|
+
end
|
94
|
+
|
95
|
+
def make_type_id(type)
|
80
96
|
(type.to_s + '_id').to_sym
|
81
97
|
end
|
82
98
|
|
@@ -84,7 +100,31 @@ module ConceptQL
|
|
84
100
|
"#{table}___tab".to_sym
|
85
101
|
end
|
86
102
|
|
103
|
+
def value_columns(query)
|
104
|
+
[
|
105
|
+
numeric_value(query),
|
106
|
+
string_value(query),
|
107
|
+
concept_id_value(query)
|
108
|
+
]
|
109
|
+
end
|
110
|
+
|
111
|
+
def numeric_value(query)
|
112
|
+
return :value_as_numeric if query.columns.include?(:value_as_numeric)
|
113
|
+
Sequel.cast_numeric(nil, Float).as(:value_as_numeric)
|
114
|
+
end
|
115
|
+
|
116
|
+
def string_value(query)
|
117
|
+
return :value_as_string if query.columns.include?(:value_as_string)
|
118
|
+
Sequel.cast_string(nil).as(:value_as_string)
|
119
|
+
end
|
120
|
+
|
121
|
+
def concept_id_value(query)
|
122
|
+
return :value_as_concept_id if query.columns.include?(:value_as_concept_id)
|
123
|
+
Sequel.cast_numeric(nil).as(:value_as_concept_id)
|
124
|
+
end
|
125
|
+
|
87
126
|
def date_columns(query, type = nil)
|
127
|
+
return [:start_date, :end_date] if (query.columns.include?(:start_date) && query.columns.include?(:end_date))
|
88
128
|
return [:start_date, :end_date] unless type
|
89
129
|
sd = start_date_column(query, type)
|
90
130
|
sd = Sequel.expr(sd).cast(:date).as(:start_date) unless sd == :start_date
|
data/lib/conceptql/nodes/numeric.rb
ADDED
@@ -0,0 +1,40 @@
|
|
1
|
+
require_relative 'pass_thru'
|
2
|
+
|
3
|
+
module ConceptQL
|
4
|
+
module Nodes
|
5
|
+
# Represents a node that will either:
|
6
|
+
# - create a value_as_numeric value for every person in the database
|
7
|
+
# - change the value_as_numeric value for every result passed in
|
8
|
+
# - either to a numeric
|
9
|
+
# - or a value from a column in the origin row
|
10
|
+
#
|
11
|
+
# Accepts two params:
|
12
|
+
# - Either a numeric value or a symbol representing a column name
|
13
|
+
# - An optional stream
|
14
|
+
class Numeric < PassThru
|
15
|
+
def query(db)
|
16
|
+
stream.nil? ? as_criterion(db) : with_kids(db)
|
17
|
+
end
|
18
|
+
|
19
|
+
def types
|
20
|
+
stream.nil? ? [:person] : super
|
21
|
+
end
|
22
|
+
|
23
|
+
private
|
24
|
+
def with_kids(db)
|
25
|
+
db.from(stream.evaluate(db))
|
26
|
+
.select(*(COLUMNS - [:value_as_numeric]))
|
27
|
+
.select_append(Sequel.lit('?', arguments.first).cast(Float).as(:value_as_numeric))
|
28
|
+
.from_self
|
29
|
+
end
|
30
|
+
|
31
|
+
def as_criterion(db)
|
32
|
+
db.from(select_it(db.from(:person), :person))
|
33
|
+
.select(*(COLUMNS - [:value_as_numeric]))
|
34
|
+
.select_append(Sequel.lit('?', arguments.first).cast(Float).as(:value_as_numeric))
|
35
|
+
.from_self
|
36
|
+
end
|
37
|
+
end
|
38
|
+
end
|
39
|
+
end
|
40
|
+
|
data/lib/conceptql/nodes/recall.rb
CHANGED
@@ -20,17 +20,26 @@ module ConceptQL
|
|
20
20
|
# before we call #query. Probably time to reevaluate how we're caching
|
21
21
|
# the type information.
|
22
22
|
def query(db)
|
23
|
+
# We're going to call evaluate on definition to ensure the definition
|
24
|
+
# has been created. We were running into odd timing issues when
|
25
|
+
# drawing graphs where the recall node was being drawn before definition
|
26
|
+
# was drawn.
|
27
|
+
definition.evaluate(db)
|
23
28
|
db.from(table_name)
|
24
29
|
end
|
25
30
|
|
26
31
|
def types
|
27
|
-
|
32
|
+
definition.types
|
28
33
|
end
|
29
34
|
|
30
35
|
private
|
31
36
|
def table_name
|
32
37
|
@table_name ||= namify(arguments.first)
|
33
38
|
end
|
39
|
+
|
40
|
+
def definition
|
41
|
+
tree.defined[table_name]
|
42
|
+
end
|
34
43
|
end
|
35
44
|
end
|
36
45
|
end
|
data/lib/conceptql/nodes/sum.rb
ADDED
@@ -0,0 +1,24 @@
|
|
1
|
+
require_relative 'pass_thru'
|
2
|
+
|
3
|
+
module ConceptQL
|
4
|
+
module Nodes
|
5
|
+
class Sum < PassThru
|
6
|
+
def query(db)
|
7
|
+
db.from(unioned(db))
|
8
|
+
.select_group(*(COLUMNS - [:start_date, :end_date, :criterion_id, :value_as_numeric]))
|
9
|
+
.select_append(Sequel.lit('?', 0).as(:criterion_id))
|
10
|
+
.select_append{ min(start_date).as(:start_date) }
|
11
|
+
.select_append{ max(end_date).as(:end_date) }
|
12
|
+
.select_append{sum(value_as_numeric).as(:value_as_numeric)}
|
13
|
+
.from_self
|
14
|
+
end
|
15
|
+
|
16
|
+
def unioned(db)
|
17
|
+
children.map { |c| c.evaluate(db) }.inject do |uni, q|
|
18
|
+
uni.union(q)
|
19
|
+
end
|
20
|
+
end
|
21
|
+
end
|
22
|
+
end
|
23
|
+
end
|
24
|
+
|
data/lib/conceptql/version.rb
CHANGED
data/spec/conceptql/nodes/complement_spec.rb
CHANGED
@@ -10,6 +10,6 @@ describe ConceptQL::Nodes::Complement do
|
|
10
10
|
it 'generates complement for single criteria' do
|
11
11
|
double1 = QueryDouble.new(1)
|
12
12
|
double1.must_behave_like(:evaluator)
|
13
|
-
ConceptQL::Nodes::Complement.new(double1).query(Sequel.mock).sql.must_equal "SELECT * FROM (SELECT person_id
|
13
|
+
ConceptQL::Nodes::Complement.new(double1).query(Sequel.mock).sql.must_equal "SELECT * FROM (SELECT person_id, visit_occurrence_id AS criterion_id, CAST('visit_occurrence' AS varchar(255)) AS criterion_type, CAST(visit_start_date AS date) AS start_date, CAST(visit_end_date AS date) AS end_date, CAST(NULL AS numeric) AS value_as_numeric, CAST(NULL AS varchar(255)) AS value_as_string, CAST(NULL AS integer) AS value_as_concept_id FROM visit_occurrence AS tab WHERE (visit_occurrence_id NOT IN (SELECT criterion_id FROM (SELECT * FROM table1) AS t1 WHERE ((criterion_id IS NOT NULL) AND (criterion_type = 'visit_occurrence'))))) AS t1"
|
14
14
|
end
|
15
15
|
end
|
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: conceptql
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.0
|
4
|
+
version: 0.1.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Ryan Duryea
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date: 2014-09-
|
11
|
+
date: 2014-09-05 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: activesupport
|
@@ -156,6 +156,7 @@ files:
|
|
156
156
|
- conceptql.gemspec
|
157
157
|
- doc/ConceptQL Specification (alpha).pdf
|
158
158
|
- doc/diagram_0.png
|
159
|
+
- doc/implementation_notes.md
|
159
160
|
- doc/spec.md
|
160
161
|
- lib/conceptql.rb
|
161
162
|
- lib/conceptql/behaviors/dottable.rb
|
@@ -172,12 +173,14 @@ files:
|
|
172
173
|
- lib/conceptql/nodes/complement.rb
|
173
174
|
- lib/conceptql/nodes/concept.rb
|
174
175
|
- lib/conceptql/nodes/condition_type.rb
|
176
|
+
- lib/conceptql/nodes/count.rb
|
175
177
|
- lib/conceptql/nodes/cpt.rb
|
176
178
|
- lib/conceptql/nodes/date_range.rb
|
177
179
|
- lib/conceptql/nodes/death.rb
|
178
180
|
- lib/conceptql/nodes/define.rb
|
179
181
|
- lib/conceptql/nodes/drug_type_concept.rb
|
180
182
|
- lib/conceptql/nodes/during.rb
|
183
|
+
- lib/conceptql/nodes/equal.rb
|
181
184
|
- lib/conceptql/nodes/except.rb
|
182
185
|
- lib/conceptql/nodes/first.rb
|
183
186
|
- lib/conceptql/nodes/from.rb
|
@@ -190,6 +193,7 @@ files:
|
|
190
193
|
- lib/conceptql/nodes/last.rb
|
191
194
|
- lib/conceptql/nodes/loinc.rb
|
192
195
|
- lib/conceptql/nodes/node.rb
|
196
|
+
- lib/conceptql/nodes/numeric.rb
|
193
197
|
- lib/conceptql/nodes/occurrence.rb
|
194
198
|
- lib/conceptql/nodes/pass_thru.rb
|
195
199
|
- lib/conceptql/nodes/person.rb
|
@@ -203,6 +207,7 @@ files:
|
|
203
207
|
- lib/conceptql/nodes/source_vocabulary_node.rb
|
204
208
|
- lib/conceptql/nodes/standard_vocabulary_node.rb
|
205
209
|
- lib/conceptql/nodes/started_by.rb
|
210
|
+
- lib/conceptql/nodes/sum.rb
|
206
211
|
- lib/conceptql/nodes/temporal_node.rb
|
207
212
|
- lib/conceptql/nodes/time_window.rb
|
208
213
|
- lib/conceptql/nodes/union.rb
|