@datagrok/hit-triage 1.1.4 → 1.1.5
- package/README.md +3 -20
- package/README_HD.md +46 -2
- package/README_HT.md +74 -2
- package/css/hit-triage.css +1 -1
- package/dist/package.js +1 -1
- package/dist/package.js.map +1 -1
- package/images/icons/hit-triage-icon.png +0 -0
- package/package.json +1 -1
- package/src/app/accordeons/new-hit-design-template-accordeon.ts +10 -0
- package/src/app/accordeons/new-template-accordeon.ts +10 -0
- package/src/app/consts.ts +2 -0
- package/src/app/dialogs/functions-dialog.ts +55 -5
- package/src/app/hit-design-app.ts +22 -3
- package/src/app/hit-triage-app.ts +68 -6
- package/src/app/hit-triage-views/info-view.ts +11 -13
- package/src/app/types.ts +8 -0
- package/src/app/utils/calculate-single-cell.ts +26 -3
- package/src/app/utils.ts +18 -0
package/README.md
CHANGED
|
@@ -1,26 +1,9 @@
|
|
|
1
1
|
# HitTriage
|
|
2
2
|
|
|
3
|
-
The HitTriage package is a powerful tool designed for molecule analysis and campaign management within the Datagrok environment. It consists of two applications: HitTriage and HitDesign.
|
|
3
|
+
The **HitTriage** package is a powerful tool designed for molecule analysis and campaign management within the Datagrok environment. It consists of two applications: [HitTriage](https://github.com/datagrok-ai/public/blob/master/packages/HitTriage/README_HT.md) and [HitDesign](https://github.com/datagrok-ai/public/blob/master/packages/HitTriage/README_HD.md).
|
|
4
4
|
|
|
5
|
-
|
|
5
|
+
- The **HitTriage** application is designed for molecule analysis and filtering. It allows users to upload a dataset, filter it, calculate molecular properties, submit the results to any chosen function or query and share campaigns between users. More about HitTriage can be found [here](https://github.com/datagrok-ai/public/blob/master/packages/HitTriage/README_HT.md).
|
|
6
6
|
|
|
7
|
-
|
|
7
|
+
- The **HitDesign** application offers similar campaign management, but instead of uploading a dataset, users sketch molecules, calculate molecular properties, and filter and organize them in stages. More about HitDesign can be found [here](https://github.com/datagrok-ai/public/blob/master/packages/HitTriage/README_HD.md).
|
|
8
8
|
|
|
9
|
-
1. **Template Creation:**
|
|
10
|
-
|
|
11
|
-
Define a template specifying the data source for molecules, name, key, additional needed information and compute functions. This source can be a file upload or a query in any other package tagged with `HitTriageDataSource` tag.
|
|
12
|
-
The Compute functions are collected from any package with a tag `HitTriageFunction`.
|
|
13
|
-
|
|
14
|
-

|
|
15
|
-
|
|
16
|
-
2. **Campaign Building**:
|
|
17
|
-
|
|
18
|
-
Create campaigns based on the template.
|
|
19
|
-
Provide a campaign name, select the data source, provide additional information and initiate the campaign.
|
|
20
|
-
During the campaign run, the specified compute functions are executed, and their results are appended to the dataframe. For example, you can compute molecular descriptors, toxicity risks, structural alerts and more.
|
|
21
|
-
|
|
22
|
-

|
|
23
|
-
|
|
24
|
-
After running a campaign, you can submit the dataframe to any chosen function or query, or
|
|
25
|
-
save the campaign for later use. Saved campaigns can be reloaded and run again by any user on the platform using the link or the campaigns table on the first page.
|
|
26
9
|
|
package/README_HD.md
CHANGED
|
@@ -68,7 +68,7 @@ Hit design campaign consists of two views, a main design view and a tiles view.
|
|
|
68
68
|
|
|
69
69
|
HitDesign allows users to define custom compute and submit functions, and these functions can be written in any Datagrok package that is installed in the environment.
|
|
70
70
|
|
|
71
|
-
|
|
71
|
+
### Compute functions
|
|
72
72
|
|
|
73
73
|
Compute functions are used to calculate molecular properties, for example mass, solubility, mutagenicity, partial charges, or toxicity risks. By default, Hit design includes the compute functions from the `Chem` package: molecular descriptors, structural alerts, toxicity risks and chemical properties. Users can add further compute functions by tagging them with the `HitDesignFunction` tag and writing them in the usual Datagrok style. The first two inputs of such a function should be `Dataframe` `table` and `Column` `molecule`; the rest can be any other inputs. The function should perform its task, modify the dataframe as needed and return the modified dataframe. For example, we can create a function that retrieves the `Chembl` molecule registration number by SMILES string:
|
|
74
74
|
|
|
@@ -98,7 +98,51 @@ export async function chemblMolregno(table: DG.DataFrame, molecules: DG.Column):
|
|
|
98
98
|
|
|
99
99
|
This function goes through every molecule in the dataframe, converts it to canonical SMILES and calls a query against the Chembl database that retrieves the molregno number. The result is added to the dataframe as a new column. If this function is defined in the `Chembl` package, then after the package is built and deployed to the stand, it will automatically appear in the compute functions list in HitDesign.
|
|
100
100
|
|
|
101
|
-
|
|
101
|
+
Datagrok scripts can also be used as compute functions. For example, you can create a JavaScript script that adds a new column to the dataframe. The script also needs the `HitTriageFunction` tag and should accept `Dataframe` `table` and `Column` `molecules` as its first two inputs:
|
|
102
|
+
|
|
103
|
+
```
|
|
104
|
+
//name: Demo script HT
|
|
105
|
+
//description: Hello world script
|
|
106
|
+
//language: javascript
|
|
107
|
+
//input: dataframe df
|
|
108
|
+
//input: column col
|
|
109
|
+
//input: int a
|
|
110
|
+
//tags: HitTriageFunction
|
|
111
|
+
//output: dataframe res
|
|
112
|
+
|
|
113
|
+
df.columns.addNewInt('Some number col').init(() => a)
|
|
114
|
+
res = df
|
|
115
|
+
|
|
116
|
+
```
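To make the header convention more concrete, here is a minimal sketch of how the annotation lines of such a script could be read and checked against the contract described above (the `HitTriageFunction` tag plus `dataframe` and `column` as the first two inputs). The names `parseHeader`, `looksLikeComputeScript`, `Header` and `Param` are hypothetical and only for illustration; this is not Datagrok's own parser, which supports more annotation keys and options.

```typescript
// Hypothetical, simplified model of a script or query header (not a Datagrok API).
interface Param { type: string; name: string; }
interface Header { name: string; tags: string[]; inputs: Param[]; outputs: Param[]; }

// Reads the leading `//key: value` (script) or `--key: value` (query) lines.
function parseHeader(source: string): Header {
  const header: Header = {name: '', tags: [], inputs: [], outputs: []};
  for (const line of source.split('\n')) {
    const m = /^\s*(?:\/\/|--)\s*(\w+):\s*(.*)$/.exec(line);
    if (!m) break; // the header ends at the first non-annotation line
    const key = m[1] ?? '';
    const value = (m[2] ?? '').trim();
    if (key === 'name') header.name = value;
    else if (key === 'tags') header.tags = value.split(',').map((t) => t.trim());
    else if (key === 'input' || key === 'output') {
      // The first token is the type, the second is the parameter name.
      const parts = value.split(/\s+/);
      const param: Param = {type: parts[0] ?? '', name: parts[1] ?? ''};
      (key === 'input' ? header.inputs : header.outputs).push(param);
    }
    // Other keys (description, language, connection, ...) are ignored here.
  }
  return header;
}

// Checks the contract from the text: the tag plus the first two input types.
function looksLikeComputeScript(h: Header): boolean {
  return h.tags.indexOf('HitTriageFunction') >= 0 &&
    h.inputs.length >= 2 &&
    h.inputs[0]?.type === 'dataframe' &&
    h.inputs[1]?.type === 'column';
}

// The demo script header from the README, followed by its body.
const demoScript = [
  '//name: Demo script HT',
  '//description: Hello world script',
  '//language: javascript',
  '//input: dataframe df',
  '//input: column col',
  '//input: int a',
  '//tags: HitTriageFunction',
  '//output: dataframe res',
  '',
  'df.columns.addNewInt(\'Some number col\').init(() => a)',
  'res = df',
].join('\n');

const demoHeader = parseHeader(demoScript);
const demoOk = looksLikeComputeScript(demoHeader);
```

In this sketch the parameter names (`df`, `col`) play no role in the check; only the tag and the order and types of the first two inputs are inspected.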
|
|
117
|
+
|
|
118
|
+
Similarly, queries with the same `HitTriageFunction` tag are added to the compute functions list. A query needs at least one input, and its first input must be a `list<string>` representing the list of molecules. The query must return a dataframe containing a `molecules` column, which is used as the key for joining the result with the initial dataframe. For example, you can create a query that looks up each molecule in the Chembl database and returns its molregno number:
|
|
119
|
+
|
|
120
|
+
```
|
|
121
|
+
--name: ChemblMolregNoBySmiles
|
|
122
|
+
--friendlyName: Chembl Molregno by smiles
|
|
123
|
+
--input: list<string> molecules
|
|
124
|
+
--tags: HitTriageFunction
|
|
125
|
+
--connection: Chembl
|
|
126
|
+
select molregno, molecules from compound_structures c
|
|
127
|
+
INNER JOIN unnest(@molecules) molecules
|
|
128
|
+
ON molecules.molecules
|
|
129
|
+
= c.canonical_smiles
|
|
130
|
+
```
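The join on the `molecules` column can be pictured with a small, dependency-free sketch. It assumes rows are plain objects rather than Datagrok dataframes, and the helper `joinByMolecules`, the `smiles` column name and the sample molregno value are all made up for illustration; the actual join is performed by the application.

```typescript
// Hypothetical row shape; a real campaign works with Datagrok dataframes instead.
type Row = Record<string, string | number | null>;

// Left-joins `result` onto `source`, matching the `molecules` column of `result`
// against the column named `moleculeCol` in `source`.
function joinByMolecules(source: Row[], moleculeCol: string, result: Row[]): Row[] {
  // Index result rows by molecule string; the first match wins on duplicates.
  const byMolecule: Record<string, Row> = Object.create(null);
  // Names of the columns the query contributes (everything except the key).
  const added: string[] = [];
  for (const r of result) {
    const key = r['molecules'];
    if (typeof key === 'string' && !(key in byMolecule)) byMolecule[key] = r;
    for (const name of Object.keys(r))
      if (name !== 'molecules' && added.indexOf(name) < 0) added.push(name);
  }
  return source.map((row) => {
    const key = row[moleculeCol];
    const match = typeof key === 'string' ? byMolecule[key] : undefined;
    const out: Row = {...row};
    // Rows without a match get nulls, so every row ends up with the same columns.
    for (const name of added) out[name] = match ? (match[name] ?? null) : null;
    return out;
  });
}

// Made-up sample data: three campaign rows, one of which the query recognises.
const campaignRows: Row[] = [{smiles: 'CCO'}, {smiles: 'c1ccccc1'}, {smiles: null}];
const queryResult: Row[] = [{molecules: 'CCO', molregno: 101}];
const joinedRows = joinByMolecules(campaignRows, 'smiles', queryResult);
```

A left join keeps every campaign row, which is why a query that returns fewer rows than it was given (for example, only the molecules found in the database) still produces a full-length column.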
|
|
131
|
+
|
|
132
|
+
Or a query that calculates the fraction of sp3-hybridized carbons in each molecule using the RDKit SQL cartridge:
|
|
133
|
+
|
|
134
|
+
```
|
|
135
|
+
--name: SP3Fraction
|
|
136
|
+
--friendlyName: SP3 fraction of carbons
|
|
137
|
+
--input: list<string> molecules
|
|
138
|
+
--tags: HitTriageFunction
|
|
139
|
+
--connection: Chembl
|
|
140
|
+
select molecules, mol_fractioncsp3(Cast(molecules as mol))
|
|
141
|
+
from unnest(@molecules) as molecules
|
|
142
|
+
where is_valid_smiles(Cast(molecules as cstring))
|
|
143
|
+
```
|
|
144
|
+
|
|
145
|
+
### Submit functions
|
|
102
146
|
|
|
103
147
|
Submit functions are used to save or submit the filtered and computed dataset, for example by saving it to a private database or running additional calculations. They are defined in the same way as compute functions but are tagged with the `HitTriageSubmitFunction` tag. A submit function should accept only two inputs, `Dataframe` `df` and `String` `molecules`: the resulting dataframe and the name of the molecules column, respectively. For example, we can create a function that saves the filtered and computed dataset to a database:
|
|
104
148
|
|
package/README_HT.md
CHANGED
|
@@ -40,7 +40,79 @@ The application will detect that the function/query requires an input parameter
|
|
|
40
40
|
|
|
41
41
|
- **Additional fields** : Users can configure additional fields for the template, which will be prompted for input during campaign creation. Each field has a name, a type, and a flag indicating whether it is required. For example, an additional field for a campaign can be a target protein name, head scientist name, deadline, etc.
|
|
42
42
|
|
|
43
|
-
- **Compute functions**
|
|
43
|
+
- **Compute functions**
|
|
44
|
+
|
|
45
|
+
Compute functions are used to calculate molecular properties, for example mass, solubility, mutagenicity, partial charges, or toxicity risks. By default, Hit design includes the compute functions from the `Chem` package: molecular descriptors, structural alerts, toxicity risks and chemical properties. Users can add further compute functions by tagging them with the `HitDesignFunction` tag and writing them in the usual Datagrok style. The first two inputs of such a function should be `Dataframe` `table` and `Column` `molecule`; the rest can be any other inputs. The function should perform its task, modify the dataframe as needed and return the modified dataframe. For example, we can create a function that retrieves the `Chembl` molecule registration number by SMILES string:
|
|
46
|
+
|
|
47
|
+
```
|
|
48
|
+
//name: Chembl molregno
|
|
49
|
+
//tags: HitTriageFunction
|
|
50
|
+
//input: dataframe table [Input data table] {caption: Table}
|
|
51
|
+
//input: column molecules {caption: Molecules; semType: Molecule}
|
|
52
|
+
//output: dataframe result
|
|
53
|
+
export async function chemblMolregno(table: DG.DataFrame, molecules: DG.Column): Promise<DG.DataFrame> {
|
|
54
|
+
const name = table.columns.getUnusedName('CHEMBL molregno');
|
|
55
|
+
table.columns.addNewInt(name);
|
|
56
|
+
for (let i = 0; i < molecules.length; i++) {
|
|
57
|
+
const smile = molecules.get(i);
|
|
58
|
+
if (!smile) {
|
|
59
|
+
table.set(name, i, null);
|
|
60
|
+
continue;
|
|
61
|
+
}
|
|
62
|
+
const canonical = grok.chem.convert(smile, DG.chem.Notation.Unknown, DG.chem.Notation.Smiles);
|
|
63
|
+
const resDf: DG.DataFrame = await grok.data.query('Chembl:ChemblMolregNoBySmiles', {smiles: canonical});
|
|
64
|
+
const res: number = resDf.getCol('molregno').toList()[0];
|
|
65
|
+
table.set(name, i, res);
|
|
66
|
+
}
|
|
67
|
+
return table;
|
|
68
|
+
}
|
|
69
|
+
```
|
|
70
|
+
|
|
71
|
+
This function goes through every molecule in the dataframe, converts it to canonical SMILES and calls a query against the Chembl database that retrieves the molregno number. The result is added to the dataframe as a new column. If this function is defined in the `Chembl` package, then after the package is built and deployed to the stand, it will automatically appear in the compute functions list in HitDesign.
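The shape of a compute function (take the table and the molecule column, add a uniquely named column, leave a null for empty molecules, return the same table) can also be shown without the Datagrok API. The sketch below assumes a table is a plain object of equal-length column arrays; `Table`, `addSmilesLength` and the "SMILES length" property are hypothetical stand-ins chosen only because they need no chemistry library.

```typescript
// Minimal stand-in for a dataframe: column name -> values, all of the same length.
type Table = Record<string, Array<string | number | null>>;

// Hypothetical compute function: adds a column holding the length of each SMILES string.
function addSmilesLength(table: Table, moleculeCol: string): Table {
  const molecules = table[moleculeCol];
  if (!molecules) throw new Error('Column not found: ' + moleculeCol);

  // Pick a column name that is not taken yet, so existing data is never overwritten.
  const existing = Object.keys(table);
  let name = 'SMILES length';
  for (let n = 2; existing.indexOf(name) >= 0; n++) name = 'SMILES length (' + n + ')';

  // Empty or missing molecules produce null rather than a misleading number.
  table[name] = molecules.map((smiles) =>
    typeof smiles === 'string' && smiles !== '' ? smiles.length : null);

  // The table is modified in place and returned, as the contract requires.
  return table;
}

// Made-up sample table that already contains a column with the default name.
const sampleTable: Table = {
  'smiles': ['CCO', null, 'c1ccccc1'],
  'SMILES length': [0, 0, 0],
};
const updatedTable = addSmilesLength(sampleTable, 'smiles');
```

Because the default name is already in use here, the new values land in `SMILES length (2)`, mirroring the role of `getUnusedName` in the function above.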
|
|
72
|
+
|
|
73
|
+
Datagrok scripts can also be used as compute functions. For example, you can create a JavaScript script that adds a new column to the dataframe. The script also needs the `HitTriageFunction` tag and should accept `Dataframe` `table` and `Column` `molecules` as its first two inputs:
|
|
74
|
+
|
|
75
|
+
```
|
|
76
|
+
//name: Demo script HT
|
|
77
|
+
//description: Hello world script
|
|
78
|
+
//language: javascript
|
|
79
|
+
//input: dataframe df
|
|
80
|
+
//input: column col
|
|
81
|
+
//input: int a
|
|
82
|
+
//tags: HitTriageFunction
|
|
83
|
+
//output: dataframe res
|
|
84
|
+
|
|
85
|
+
df.columns.addNewInt('Some number col').init(() => a)
|
|
86
|
+
res = df
|
|
87
|
+
|
|
88
|
+
```
|
|
89
|
+
|
|
90
|
+
Similarly, queries with the same `HitTriageFunction` tag are added to the compute functions list. A query needs at least one input, and its first input must be a `list<string>` representing the list of molecules. The query must return a dataframe containing a `molecules` column, which is used as the key for joining the result with the initial dataframe. For example, we can create a query that looks up each molecule in the Chembl database and returns its molregno number:
|
|
91
|
+
|
|
92
|
+
```
|
|
93
|
+
--name: ChemblMolregNoBySmilesDirect
|
|
94
|
+
--friendlyName: Chembl Molregno by smiles direct
|
|
95
|
+
--input: list<string> molecules
|
|
96
|
+
--tags: HitTriageFunction
|
|
97
|
+
--connection: Chembl
|
|
98
|
+
select molregno, molecules from compound_structures c
|
|
99
|
+
INNER JOIN unnest(@molecules) molecules
|
|
100
|
+
ON molecules.molecules
|
|
101
|
+
= c.canonical_smiles
|
|
102
|
+
```
|
|
103
|
+
|
|
104
|
+
Or a query that calculates the fraction of sp3-hybridized carbons in each molecule using the RDKit SQL cartridge:
|
|
105
|
+
|
|
106
|
+
```
|
|
107
|
+
--name: SP3Fraction
|
|
108
|
+
--friendlyName: SP3 fraction of carbons
|
|
109
|
+
--input: list<string> molecules
|
|
110
|
+
--tags: HitTriageFunction
|
|
111
|
+
--connection: Chembl
|
|
112
|
+
select molecules, mol_fractioncsp3(Cast(molecules as mol))
|
|
113
|
+
from unnest(@molecules) as molecules
|
|
114
|
+
where is_valid_smiles(Cast(molecules as cstring))
|
|
115
|
+
```
|
|
44
116
|
|
|
45
117
|
- **Submit function** : Users can define custom submit functions (tagged with `HitTriageSubmitFunction`) to further process or save the filtered and computed dataset. This could include saving to a private database or additional calculations.
|
|
46
118
|
|
|
@@ -72,6 +144,6 @@ Users can start a new campaign by choosing a template and filling out the requir
|
|
|
72
144
|
|
|
73
145
|

|
|
74
146
|
|
|
75
|
-
After the campaign starts,
|
|
147
|
+
After the campaign starts, new calculated columns will be added. Users can filter the data, modify it or add viewers to the campaign and then save the campaign. Once saved, reloading the campaign restores the saved state.
|
|
76
148
|
|
|
77
149
|

|