python-katlas 0.1.3__tar.gz → 0.2.0__tar.gz

Files changed (38)
  1. {python_katlas-0.1.3 → python_katlas-0.2.0}/LICENSE +0 -0
  2. {python_katlas-0.1.3 → python_katlas-0.2.0}/MANIFEST.in +0 -0
  3. {python_katlas-0.1.3/python_katlas.egg-info → python_katlas-0.2.0}/PKG-INFO +152 -112
  4. {python_katlas-0.1.3 → python_katlas-0.2.0}/README.md +121 -102
  5. python_katlas-0.2.0/katlas/__init__.py +1 -0
  6. python_katlas-0.2.0/katlas/_modidx.py +216 -0
  7. python_katlas-0.2.0/katlas/clustering.py +142 -0
  8. python_katlas-0.2.0/katlas/common.py +4 -0
  9. python_katlas-0.2.0/katlas/core.py +6 -0
  10. python_katlas-0.2.0/katlas/data.py +455 -0
  11. python_katlas-0.2.0/katlas/dnn.py +384 -0
  12. {python_katlas-0.1.3 → python_katlas-0.2.0}/katlas/feature.py +136 -111
  13. python_katlas-0.2.0/katlas/pathway.py +170 -0
  14. python_katlas-0.2.0/katlas/plot.py +924 -0
  15. python_katlas-0.2.0/katlas/pssm.py +844 -0
  16. python_katlas-0.2.0/katlas/score.py +322 -0
  17. python_katlas-0.2.0/katlas/statistics.py +102 -0
  18. {python_katlas-0.1.3 → python_katlas-0.2.0}/katlas/train.py +51 -77
  19. python_katlas-0.2.0/katlas/utils.py +189 -0
  20. python_katlas-0.2.0/pyproject.toml +11 -0
  21. {python_katlas-0.1.3 → python_katlas-0.2.0/python_katlas.egg-info}/PKG-INFO +152 -112
  22. {python_katlas-0.1.3 → python_katlas-0.2.0}/python_katlas.egg-info/SOURCES.txt +10 -2
  23. python_katlas-0.2.0/python_katlas.egg-info/requires.txt +27 -0
  24. python_katlas-0.2.0/settings.ini +40 -0
  25. {python_katlas-0.1.3 → python_katlas-0.2.0}/setup.py +0 -0
  26. python_katlas-0.1.3/katlas/__init__.py +0 -1
  27. python_katlas-0.1.3/katlas/_modidx.py +0 -109
  28. python_katlas-0.1.3/katlas/core.py +0 -816
  29. python_katlas-0.1.3/katlas/dl.py +0 -357
  30. python_katlas-0.1.3/katlas/imports.py +0 -7
  31. python_katlas-0.1.3/katlas/plot.py +0 -665
  32. python_katlas-0.1.3/python_katlas.egg-info/requires.txt +0 -19
  33. python_katlas-0.1.3/settings.ini +0 -44
  34. {python_katlas-0.1.3 → python_katlas-0.2.0}/python_katlas.egg-info/dependency_links.txt +0 -0
  35. {python_katlas-0.1.3 → python_katlas-0.2.0}/python_katlas.egg-info/entry_points.txt +0 -0
  36. {python_katlas-0.1.3 → python_katlas-0.2.0}/python_katlas.egg-info/not-zip-safe +0 -0
  37. {python_katlas-0.1.3 → python_katlas-0.2.0}/python_katlas.egg-info/top_level.txt +0 -0
  38. {python_katlas-0.1.3 → python_katlas-0.2.0}/setup.cfg +0 -0
File without changes
File without changes
@@ -1,6 +1,6 @@
- Metadata-Version: 2.1
+ Metadata-Version: 2.4
  Name: python-katlas
- Version: 0.1.3
+ Version: 0.2.0
  Summary: tools for predicting kinome specificities
  Home-page: https://github.com/sky1ove/katlas
  Author: lily
@@ -18,36 +18,52 @@ Classifier: License :: OSI Approved :: Apache Software License
  Requires-Python: >=3.7
  Description-Content-Type: text/markdown
  License-File: LICENSE
+ Requires-Dist: pandas
+ Requires-Dist: gdown
  Requires-Dist: statsmodels
+ Requires-Dist: statannotations
  Requires-Dist: fastparquet
+ Requires-Dist: pyarrow
  Requires-Dist: tqdm
+ Requires-Dist: logomaker-kinase
+ Requires-Dist: seaborn
+ Requires-Dist: bokeh
+ Requires-Dist: reactome2py
+ Requires-Dist: adjustText
+ Requires-Dist: scikit-learn
+ Requires-Dist: umap-learn
+ Requires-Dist: ipywidgets
+ Requires-Dist: biopython
  Provides-Extra: dev
  Requires-Dist: nbdev; extra == "dev"
  Requires-Dist: pyngrok; extra == "dev"
- Requires-Dist: fastai>=2.7.12; extra == "dev"
- Requires-Dist: fastbook; extra == "dev"
+ Requires-Dist: fastai; extra == "dev"
  Requires-Dist: fairscale; extra == "dev"
  Requires-Dist: fair-esm; extra == "dev"
- Requires-Dist: logomaker; extra == "dev"
- Requires-Dist: seaborn; extra == "dev"
  Requires-Dist: rdkit; extra == "dev"
- Requires-Dist: umap-learn; extra == "dev"
- Requires-Dist: adjustText; extra == "dev"
- Requires-Dist: bokeh; extra == "dev"
- Requires-Dist: scikit-learn>=1.3.0; extra == "dev"
  Requires-Dist: openpyxl; extra == "dev"
+ Requires-Dist: transformers; extra == "dev"
+ Requires-Dist: sentencepiece; extra == "dev"
+ Dynamic: author
+ Dynamic: author-email
+ Dynamic: classifier
+ Dynamic: description
+ Dynamic: description-content-type
+ Dynamic: home-page
+ Dynamic: keywords
+ Dynamic: license
+ Dynamic: license-file
+ Dynamic: provides-extra
+ Dynamic: requires-dist
+ Dynamic: requires-python
+ Dynamic: summary

  # KATLAS


  <!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

- <img alt="Katlas logo" width="600" caption="Katlas logo" src="https://github.com/sky1ove/katlas/raw/main/dataset/images/logo.png" id="logo"/>
-
- <a target="_blank" href="https://colab.research.google.com/github/sky1ove/katlas/blob/main/nbs/index.ipynb">
- <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
- </a> <a href="https://pypi.org/project/python-katlas/">
- <img src="https://img.shields.io/pypi/v/python-katlas?link=https%3A%2F%2Fpypi.org%2Fproject%2Fpython-katlas%2F" alt="PyPI"></a>
+ <img alt="Katlas logo" width="600" caption="Katlas logo" src="https://github.com/sky1ove/katlas/raw/main/logo.png" id="logo"/>

  KATLAS is a repository containing python tools to predict kinases given
  a substrate sequence. It also contains datasets of kinase substrate
@@ -83,8 +99,6 @@ helpful to your research.
  Follow the instructions in katlas_raw:
  https://github.com/sky1ove/katlas_raw

- Need to install the package via: `pip install 'python-katlas[dev]' -U`
-
  ## Web applications

  Users can now run the analysis directly on the web without needing to
@@ -93,26 +107,27 @@ code.
  Check out our latest web platform:
  [kinase-atlas.com](https://kinase-atlas.com/)

- ## Tutorials on Colab
+ ## Install

- - 1. [Substrate scoring on a single substrate
- sequence](https://colab.research.google.com/github/sky1ove/katlas/blob/main/nbs/tutorial_01_sinlge_input.ipynb)
- - 2. [High throughput substrate scoring on phosphoproteomics
- dataset](https://colab.research.google.com/github/sky1ove/katlas/blob/main/nbs/tutorial_02_high_throughput.ipynb)
- - 3. [Kinase enrichment analysis for AKT
- inhibitor](https://colab.research.google.com/github/sky1ove/katlas/blob/main/nbs/tutorial_03a_enrichment_AKTi.ipynb)
+ UV:

- ## Install
+ ``` bash
+ uv add -U python-katlas
+ ```

- pip install python-katlas -U
+ pip:

- To use other modules besides the core, do
- `pip install 'python-katlas[dev]' -U`
+ ``` bash
+ pip install -U python-katlas
+ ```
+
+ If using machine-learning related modules, need to install development
+ verison: `pip install -U "python-katlas[dev]"`

  ## Import

  ``` python
- from katlas.core import *
+ from katlas.common import *
  ```

  # Quick start
@@ -132,93 +147,101 @@ For input sequences, we also consider it in two conditions:
  - all capital
  - contains lower cases indicating phosphorylation status

- ## Single sequence as input
+ ## Quick start
+
+ ### Site scoring

- ### CDDM, all capital
+ CDDM, all capital

  ``` python
- predict_kinase('AAAAAAASGGAGSDN',**param_CDDM_upper)
+ predict_kinase('AAAAAAASGAGSDN',**Params("CDDM_upper"))
  ```

- considering string: ['-7A', '-6A', '-5A', '-4A', '-3A', '-2A', '-1A', '0S', '1G', '2G', '3A', '4G', '5S', '6D', '7N']
+ considering string: ['-7A', '-6A', '-5A', '-4A', '-3A', '-2A', '-1A', '0S', '1G', '2A', '3G', '4S', '5D', '6N']

- kinase
- PAK6 2.032
- ULK3 2.032
- PRKX 2.012
- ATR 1.991
- PRKD1 1.988
- ...
- DDR2 0.928
- EPHA4 0.928
- TEK 0.921
- KIT 0.915
- FGFR3 0.910
- Length: 289, dtype: float64
+ GCN2 4.556
+ MPSK1 4.425
+ MEKK2 4.253
+ WNK3 4.213
+ WNK1 4.064
+ ...
+ PDK1 -25.077
+ PDHK3 -25.346
+ CLK2 -27.251
+ ROR2 -27.582
+ DDR1 -53.581
+ Length: 328, dtype: float64

- ### CDDM, with lower case indicating phosphorylation status
+ CDDM, with lower case indicating phosphorylation status

  ``` python
- predict_kinase('AAAAAAAsGGAGsDN',**param_CDDM)
+ predict_kinase('AAAAAAAsGGAGsDN',**Params("CDDM"))
  ```

  considering string: ['-7A', '-6A', '-5A', '-4A', '-3A', '-2A', '-1A', '0s', '1G', '2G', '3A', '4G', '5s', '6D', '7N']

- kinase
- ULK3 1.987
- PAK6 1.981
- PRKD1 1.946
- PIM3 1.944
- PRKX 1.939
- ...
- EPHA4 0.905
- EGFR 0.900
- TEK 0.898
- FGFR3 0.894
- KIT 0.882
- Length: 289, dtype: float64
-
- ### PSPA, with lower case indicating phosphorylation status
+ ROR1 8.355
+ WNK1 4.907
+ WNK2 4.782
+ ERK5 4.466
+ RIPK2 4.045
+ ...
+ DDR1 -29.393
+ TNNI3K -29.884
+ CHAK1 -31.775
+ VRK1 -45.287
+ BRAF -49.403
+ Length: 328, dtype: float64
+
+ PSPA, with lower case indicating phosphorylation status

  ``` python
- predict_kinase('AEEKEyHsEGG',**param_PSPA).head()
+ predict_kinase('AEEKEyHsEGG',**Params("PSPA"))
  ```

  considering string: ['-5A', '-4E', '-3E', '-2K', '-1E', '0y', '1H', '2s', '3E', '4G', '5G']

  kinase
- EGFR 4.013
- FGFR4 3.568
- ZAP70 3.412
- CSK 3.241
- SYK 3.209
- dtype: float64
-
- ### To replicate the results from The Kinase Library (PSPA)
+ EGFR 4.013
+ FGFR4 3.568
+ ZAP70 3.412
+ CSK 3.241
+ SYK 3.209
+ ...
+ JAK1 -3.837
+ DDR2 -4.421
+ TNK2 -4.534
+ TNNI3K_TYR -4.651
+ TNK1 -5.320
+ Length: 93, dtype: float64
+
+ To replicate the results from The Kinase Library (PSPA)

  Check this link: [The Kinase
- Library](https://kinase-library.phosphosite.org/site?s=AEEKEy*HsEGG&pp=false&scp=true),
+ Library](https://kinase-library.mit.edu/site?s=AEEKEy*HSEGG&pp=false&scp=true),
  and use log2(score) to rank, it shows same results with the below (with
  slight differences due to rounding).

  ``` python
- predict_kinase('AEEKEyHSEGG',**param_PSPA).head(10)
+ out = predict_kinase('AEEKEyHSEGG',**Params("PSPA"))
+ out
  ```

  considering string: ['-5A', '-4E', '-3E', '-2K', '-1E', '0y', '1H', '2S', '3E', '4G', '5G']

  kinase
- EGFR 3.181
- FGFR4 2.390
- CSK 2.308
- ZAP70 2.068
- SYK 1.998
- PDHK1_TYR 1.922
- RET 1.732
- MATK 1.688
- FLT1 1.627
- BMPR2_TYR 1.456
- dtype: float64
+ EGFR 3.181
+ FGFR4 2.390
+ CSK 2.308
+ ZAP70 2.068
+ SYK 1.998
+ ...
+ EPHA1 -3.501
+ FES -3.699
+ TNK1 -4.269
+ TNK2 -4.577
+ DDR2 -4.920
+ Length: 93, dtype: float64

  - So far [The kinase Library](https://kinase-library.phosphosite.org)
  considers all ***tyr sequences*** in capital regardless of whether or
@@ -234,8 +257,10 @@ sheet.
  ``` python
  # Percentile reference sheet
  y_pct = Data.get_pspa_tyr_pct()
+ ```

- get_pct('AEEKEyHSEGG',**param_PSPA_y, pct_ref = y_pct)
+ ``` python
+ get_pct('AEEKEyHSEGG',pct_ref = y_pct,**Params("PSPA_y"))
  ```

  considering string: ['-5A', '-4E', '-3E', '-2K', '-1E', '0Y', '1H', '2S', '3E', '4G', '5G']
@@ -270,15 +295,15 @@ get_pct('AEEKEyHSEGG',**param_PSPA_y, pct_ref = y_pct)
  <p>93 rows × 2 columns</p>
  </div>

- ## High-throughput substrate scoring on a dataframe
+ ### Site scoring in a df

- ### Load your csv
+ Load your csv:

  ``` python
  # df = pd.read_csv('your_file.csv')
  ```

- ### Load a demo df
+ Or load a demo df

  ``` python
  # Load a demo df with phosphorylation sites
@@ -309,22 +334,21 @@ df.iloc[:,-2:]

  </div>

- ### Set the column name and param to calculate
+ Set the column name and param to calculate

  Here we choose param_CDDM_upper, as the sequences in the demo df are all
  in capital. You can also choose other params.

  ``` python
- results = predict_kinase_df(df,'site_seq',**param_CDDM_upper)
+ results = predict_kinase_df(df,'site_seq',**Params("CDDM_upper"))
  results
  ```

  input dataframe has a length 5
  Preprocessing
  Finish preprocessing
- Calculating position: [-7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7]
-
- 100%|██████████| 289/289 [00:05<00:00, 56.64it/s]
+ Merging reference
+ Finish merging

  <div>
  <style scoped>
@@ -339,18 +363,35 @@ results
  }
  </style>

- | kinase | SRC | EPHA3 | FES | NTRK3 | ALK | EPHA8 | ABL1 | FLT3 | EPHB2 | FYN | ... | MEK5 | PKN2 | MAP2K7 | MRCKB | HIPK3 | CDK8 | BUB1 | MEKK3 | MAP2K3 | GRK1 |
+ | | SRC | EPHA3 | FES | NTRK3 | ALK | ABL1 | FLT3 | EPHA8 | EPHB2 | EPHB1 | ... | VRK1 | PKMYT1 | GRK3 | CAMK1B | CDC7 | SMMLCK | ROR1 | GAK | MAST2 | BRAF |
  |----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|
- | 0 | 0.991760 | 1.093712 | 1.051750 | 1.067134 | 1.013682 | 1.097519 | 0.966379 | 0.982464 | 1.054986 | 1.055910 | ... | 1.314859 | 1.635470 | 1.652251 | 1.622672 | 1.362973 | 1.797155 | 1.305198 | 1.423618 | 1.504941 | 1.872020 |
- | 1 | 0.910262 | 0.953743 | 0.942327 | 0.950601 | 0.872694 | 0.932586 | 0.846899 | 0.826662 | 0.915020 | 0.942713 | ... | 1.175454 | 1.402006 | 1.430392 | 1.215826 | 1.569373 | 1.716455 | 1.270999 | 1.195081 | 1.223082 | 1.793290 |
- | 2 | 0.849866 | 0.899910 | 0.848895 | 0.879652 | 0.874959 | 0.899414 | 0.839200 | 0.836523 | 0.858040 | 0.867269 | ... | 1.408003 | 1.813739 | 1.454786 | 1.084522 | 1.352556 | 1.524663 | 1.377839 | 1.173830 | 1.305691 | 1.811849 |
- | 3 | 0.803826 | 0.836527 | 0.800759 | 0.894570 | 0.839905 | 0.781001 | 0.847847 | 0.807040 | 0.805877 | 0.801402 | ... | 1.110307 | 1.703637 | 1.795092 | 1.469653 | 1.549936 | 1.491344 | 1.446922 | 1.055452 | 1.534895 | 1.741090 |
- | 4 | 0.822793 | 0.796532 | 0.792343 | 0.839882 | 0.810122 | 0.781420 | 0.805251 | 0.795022 | 0.790380 | 0.864538 | ... | 1.062617 | 1.357689 | 1.485945 | 1.249266 | 1.456078 | 1.422782 | 1.376471 | 1.089629 | 1.121309 | 1.697524 |
+ | 0 | -2.440640 | -0.818753 | -1.663990 | -0.738991 | -2.047628 | -3.602344 | -3.200998 | -0.935176 | -1.388444 | -1.859450 | ... | -17.103237 | -113.698143 | -16.848783 | -41.520172 | -41.646187 | 1.284159 | -26.566362 | -69.165062 | -17.706400 | -87.763214 |
+ | 1 | -3.838486 | -2.735969 | -2.533986 | -2.150399 | -3.792498 | -4.725527 | -5.711791 | -4.534240 | -3.148449 | -2.511518 | ... | -67.889053 | -68.652641 | -45.833855 | -64.171600 | -39.465572 | -65.061722 | -109.561707 | -85.911224 | -60.105064 | -63.889122 |
+ | 2 | -2.610423 | -2.370090 | -3.235637 | -1.508413 | -2.571347 | -3.740941 | -3.025596 | -3.373504 | -2.776297 | -3.060740 | ... | -15.798462 | -45.905319 | -61.440742 | -67.695694 | -55.047962 | -42.135216 | -38.501572 | -62.624382 | -56.119389 | -107.060989 |
+ | 3 | -5.180541 | -4.201880 | -5.766463 | -3.038421 | -3.836897 | -4.249900 | -5.029885 | -5.411311 | -4.713308 | -4.827825 | ... | -96.978317 | -83.419777 | -22.559393 | -110.611588 | -63.283070 | -37.240440 | -24.497492 | -112.878151 | -43.538158 | -60.348518 |
+ | 4 | -2.844254 | -3.322700 | -3.681745 | -1.766435 | -2.666579 | -3.748774 | -4.083619 | -3.912834 | -3.724181 | -3.948160 | ... | -35.824612 | -87.983566 | -83.312317 | -107.162407 | -61.478374 | -85.793571 | -43.738819 | -47.004211 | -42.281624 | -59.518513 |

- <p>5 rows × 289 columns</p>
+ <p>5 rows × 328 columns</p>
  </div>

- ## Phosphorylation sites
+ ``` python
+ results.iloc[0].sort_values(ascending=False)
+ ```
+
+ TLK2 8.264621
+ GCN2 8.101542
+ TLK1 7.693897
+ HRI 6.691402
+ PLK3 6.579368
+ ...
+ NIK -64.605148
+ SRPK2 -67.300667
+ GAK -69.165062
+ BRAF -87.763214
+ PKMYT1 -113.698143
+ Name: 0, Length: 328, dtype: float32
+
+ ## Dataset

  Besides calculating sequence scores, we also provides multiple datasets
  of phosphorylation sites.
@@ -405,9 +446,9 @@ df.head(3)

  | | uniprot | position | residue | is_disopred | disopred_score | log10_hotspot_pval_min | isHotspot | uniprot_position | functional_score | current_uniprot | name | gene | Sequence | is_valid | site_seq | gene_site |
  |----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|
- | 0 | A0A075B6Q4 | 24 | S | True | 0.91 | 6.839384 | True | A0A075B6Q4_24 | 0.149257 | A0A075B6Q4 | A0A075B6Q4_HUMAN | None | MDIQKSENEDDSEWEDVDDEKGDSNDDYDSAGLLSDEDCMSVPGKT... | True | VDDEKGDSNDDYDSA | A0A075B6Q4_S24 |
- | 1 | A0A075B6Q4 | 35 | S | True | 0.87 | 9.192622 | False | A0A075B6Q4_35 | 0.136966 | A0A075B6Q4 | A0A075B6Q4_HUMAN | None | MDIQKSENEDDSEWEDVDDEKGDSNDDYDSAGLLSDEDCMSVPGKT... | True | YDSAGLLSDEDCMSV | A0A075B6Q4_S35 |
- | 2 | A0A075B6Q4 | 57 | S | False | 0.28 | 0.818834 | False | A0A075B6Q4_57 | 0.125364 | A0A075B6Q4 | A0A075B6Q4_HUMAN | None | MDIQKSENEDDSEWEDVDDEKGDSNDDYDSAGLLSDEDCMSVPGKT... | True | IADHLFWSEETKSRF | A0A075B6Q4_S57 |
+ | 0 | A0A075B6Q4 | 24 | S | 1.0 | 0.91 | 6.839384 | 1.0 | A0A075B6Q4_24 | 0.149257 | A0A075B6Q4 | A0A075B6Q4_HUMAN | None | MDIQKSENEDDSEWEDVDDEKGDSNDDYDSAGLLSDEDCMSVPGKT... | True | VDDEKGDSNDDYDSA | A0A075B6Q4_S24 |
+ | 1 | A0A075B6Q4 | 35 | S | 1.0 | 0.87 | 9.192622 | 0.0 | A0A075B6Q4_35 | 0.136966 | A0A075B6Q4 | A0A075B6Q4_HUMAN | None | MDIQKSENEDDSEWEDVDDEKGDSNDDYDSAGLLSDEDCMSVPGKT... | True | YDSAGLLSDEDCMSV | A0A075B6Q4_S35 |
+ | 2 | A0A075B6Q4 | 57 | S | 0.0 | 0.28 | 0.818834 | 0.0 | A0A075B6Q4_57 | 0.125364 | A0A075B6Q4 | A0A075B6Q4_HUMAN | None | MDIQKSENEDDSEWEDVDDEKGDSNDDYDSAGLLSDEDCMSVPGKT... | True | IADHLFWSEETKSRF | A0A075B6Q4_S57 |

  </div>

@@ -459,13 +500,12 @@ df.head(3)
  }
  </style>

- | | site_seq | gene_site | gene | source | num_site | acceptor | -7 | -6 | -5 | -4 | ... | -2 | -1 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
- |----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|
- | 0 | AAAAAAASGGAGSDN | PBX1_S136 | PBX1 | ochoa | 1 | S | A | A | A | A | ... | A | A | S | G | G | A | G | S | D | N |
- | 1 | AAAAAAASGGGVSPD | PBX2_S146 | PBX2 | ochoa | 1 | S | A | A | A | A | ... | A | A | S | G | G | G | V | S | P | D |
- | 2 | AAAAAAASGVTTGKP | CLASR_S349 | CLASR | ochoa | 1 | S | A | A | A | A | ... | A | A | S | G | V | T | T | G | K | P |
+ | | uniprot | gene | site | site_seq | source | AM_pathogenicity | CDDM_upper | CDDM_max_score |
+ |----|----|----|----|----|----|----|----|----|
+ | 0 | A0A024R4G9 | C19orf48 | S20 | ITGSRLLSMVPGPAR | psp | NaN | PRKX,AKT1,PKG1,P90RSK,HIPK4,AKT3,HIPK1,PKACB,H... | 2.407041 |
+ | 1 | A0A075B6Q4 | None | S24 | VDDEKGDSNDDYDSA | ochoa | NaN | CK2A2,CK2A1,GRK7,GRK5,CK1G1,CK1A,IKKA,CK1G2,CA... | 2.295654 |
+ | 2 | A0A075B6Q4 | None | S35 | YDSAGLLSDEDCMSV | ochoa | NaN | CK2A2,CK2A1,IKKA,ATM,IKKB,CAMK1D,MARK2,GRK7,IK... | 2.488683 |

- <p>3 rows × 21 columns</p>
  </div>

  ## Phosphorylation site sequence example