pisa-analysis 3.0.3__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- pisa_analysis-3.0.3/PKG-INFO +214 -0
- pisa_analysis-3.0.3/README.md +196 -0
- pisa_analysis-3.0.3/pisa_analysis.egg-info/PKG-INFO +214 -0
- pisa_analysis-3.0.3/pisa_analysis.egg-info/SOURCES.txt +23 -0
- pisa_analysis-3.0.3/pisa_analysis.egg-info/dependency_links.txt +1 -0
- pisa_analysis-3.0.3/pisa_analysis.egg-info/entry_points.txt +3 -0
- pisa_analysis-3.0.3/pisa_analysis.egg-info/requires.txt +7 -0
- pisa_analysis-3.0.3/pisa_analysis.egg-info/top_level.txt +1 -0
- pisa_analysis-3.0.3/pisa_utils/__init__.py +0 -0
- pisa_analysis-3.0.3/pisa_utils/analyze.py +268 -0
- pisa_analysis-3.0.3/pisa_utils/constants.py +4 -0
- pisa_analysis-3.0.3/pisa_utils/covariations_int.py +285 -0
- pisa_analysis-3.0.3/pisa_utils/dictionaries.py +289 -0
- pisa_analysis-3.0.3/pisa_utils/parsers.py +1452 -0
- pisa_analysis-3.0.3/pisa_utils/run.py +170 -0
- pisa_analysis-3.0.3/pisa_utils/run_pisa.py +249 -0
- pisa_analysis-3.0.3/pisa_utils/utils.py +208 -0
- pisa_analysis-3.0.3/pyproject.toml +49 -0
- pisa_analysis-3.0.3/setup.cfg +4 -0
- pisa_analysis-3.0.3/tests/test_analyze.py +178 -0
- pisa_analysis-3.0.3/tests/test_covariations_int.py +215 -0
- pisa_analysis-3.0.3/tests/test_dictionaries.py +98 -0
- pisa_analysis-3.0.3/tests/test_models.py +367 -0
- pisa_analysis-3.0.3/tests/test_parsers.py +1046 -0
- pisa_analysis-3.0.3/tests/test_utils.py +72 -0
@@ -0,0 +1,214 @@
Metadata-Version: 2.4
Name: pisa-analysis
Version: 3.0.3
Summary: This python package works with PISA to analyse data for macromolecular interfaces and interactions in assemblies.
Author-email: Grisell Diaz Leines <gdiazleines@ebi.ac.uk>
License: Apache 2.0
Project-URL: Homepage, https://github.com/PDBe-KB/pisa-analysis
Project-URL: Repository, https://github.com/PDBe-KB/pisa-analysis
Requires-Python: >=3.10
Description-Content-Type: text/markdown
Requires-Dist: gemmi>=0.7.3
Requires-Dist: jsonschema>=4.25.1
Requires-Dist: lxml>=6.0.2
Requires-Dist: pandas>=2.3.3
Requires-Dist: pydantic>=2.12.4
Requires-Dist: xmlschema>=4.2.0
Requires-Dist: xmltodict>=1.0.2

# Assembly interfaces analysis

## Basic information

This Python package works with PISA to analyse data for macromolecular interfaces and interactions in assemblies.

The code consists of the module `pisa_analysis`, which will:

- Analyse macromolecular interfaces with PISA
- Create a JSON dictionary with assembly interactions/interfaces information

```
git clone https://github.com/PDBe-KB/pisa-analysis

cd pisa-analysis
```
## Dependencies

The pisa_analysis process runs PISA as a subprocess, so PISA must be compiled beforehand.

To simplify running the process, you can set two environment variables for PISA:

Add the directory that contains the `pisa` binary to `PATH`:

```
export PATH="$PATH:your_path_to_pisa/pisa/build"
```

Set the path to the PISA setup directory:

```
export PISA_SETUP_DIR="/your_path_to_pisa/pisa/setup"
```

Additionally, the PISA setup directory must contain a PISA configuration template named [pisa_cfg_tmp](https://github.com/PDBe-KB/pisa/tree/main/setup/pisa_cfg_tmp)

<!-- Comment that config for CCP4 install can also be used. -->
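
Before launching the process, it can help to confirm that the environment described above is in place. The following is a minimal sketch, not part of the package: the helper name `check_pisa_environment` is hypothetical, and it only checks what this section states (a `pisa` binary reachable through `PATH`, `PISA_SETUP_DIR` set, and `pisa_cfg_tmp` present in that directory).

```python
import os
import shutil
from pathlib import Path


def check_pisa_environment(env=None):
    """Return a list of problems; an empty list means the setup looks usable.

    Hypothetical helper for illustration only, not part of pisa-analysis.
    """
    env = os.environ if env is None else env
    problems = []

    # The README expects the `pisa` binary to be discoverable through PATH.
    if shutil.which("pisa", path=env.get("PATH", "")) is None:
        problems.append("`pisa` binary not found on PATH")

    # PISA_SETUP_DIR should point at the PISA setup directory,
    # which in turn must contain the pisa_cfg_tmp template.
    setup_dir = env.get("PISA_SETUP_DIR")
    if not setup_dir:
        problems.append("PISA_SETUP_DIR is not set")
    elif not (Path(setup_dir) / "pisa_cfg_tmp").exists():
        problems.append(f"pisa_cfg_tmp not found in {setup_dir}")

    return problems


if __name__ == "__main__":
    for problem in check_pisa_environment():
        print("warning:", problem)
```

Returning a list of messages rather than raising on the first failure lets a caller report every missing piece in one pass.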

Other dependencies can be installed with:

```
pip install -r requirements.txt
```
See [requirements.txt](https://github.com/PDBe-KB/pisa-analysis/blob/main/requirements.txt)

For development:

**pre-commit usage**

```
pip install pre-commit
pre-commit
pre-commit install
```

## Usage

Follow the steps below to install the **pisa_analysis** module:

```
cd pisa-analysis/

python3 -m venv .venv
source .venv/bin/activate

python3 -m pip install .
```

To run the module from the command line:

**pisa_analysis**:
```
pisa_analysis [-h] \
-i <INPUT_CIF_FILE> \
--pdb_id <PDB_ID> \
--assembly_id <ASSEMBLY_CODE> \
-o <OUTPUT_JSON> \
--output_xml <OUTPUT_XML>
```

Required arguments are:

```
--input_cif (-i) : Assembly CIF file (a PDB file can also be read). Optional if --gen_full_results is used and --assembly_id is not specified.
--pdb_id : Entry ID
--assembly_id : Assembly code
--output_json (-o) : Output directory for JSON files
--output_xml : Output directory for XML files
```

Other optional arguments are:

```
--input_updated_cif : Updated CIF for the PDB entry
--force : Always runs PISA calculation
--pisa_setup_dir : Path to the 'setup' directory in PISA
--pisa_binary : Binary file for PISA
-h, --help : Show help message
```
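
When the tool is driven from a script, the command line can be assembled as an argument list. This is a sketch under assumptions: the function name `build_pisa_analysis_command` and the input file name are hypothetical, only flags listed above are used, and `--force` is assumed to be a boolean switch that takes no value.

```python
import shlex


def build_pisa_analysis_command(input_cif, pdb_id, assembly_id, output_json,
                                output_xml, input_updated_cif=None, force=False):
    """Assemble the argument list for the `pisa_analysis` entry point.

    Hypothetical helper for illustration only, not part of pisa-analysis.
    """
    cmd = [
        "pisa_analysis",
        "--input_cif", str(input_cif),
        "--pdb_id", pdb_id,
        "--assembly_id", str(assembly_id),
        "--output_json", str(output_json),
        "--output_xml", str(output_xml),
    ]
    if input_updated_cif is not None:
        cmd += ["--input_updated_cif", str(input_updated_cif)]
    if force:
        # Assumption: --force is a flag without a value.
        cmd.append("--force")
    return cmd


# The input file name below is a made-up example.
cmd = build_pisa_analysis_command(
    "1d2s-assembly1.cif", "1d2s", 1, "out/json", "out/xml", force=True
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once `pisa_analysis` is installed; keeping it as a list avoids shell quoting problems with paths that contain spaces.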

The process is as follows:

For the **pisa_analysis** module:

1. The process first runs PISA in a subprocess and generates two XML files:
   - interfaces.xml
   - assembly.xml

   The XML files are saved in the output directory defined by the `--output_xml` argument. If the XML files exist and are valid, the process skips running PISA unless `--force` is passed.

2. Next, the process parses the XML files generated by PISA and creates a dictionary that contains all assembly interfaces/interactions information.

3. While creating the interfaces dictionary for the entry, the process reads UniProt accessions and sequence numbers from an updated CIF file using Gemmi.

4. The process also parses the `assembly.xml` file generated by PISA and creates a simplified dictionary with some assembly information.

5. In the last step, the process dumps the dictionaries into JSON files. The JSON files are saved in the output directory defined by the `-o` or `--output_json` argument. The output JSON files are:

   *xxxx-assemX_interfaces.json* and *xxxx-assemblyX.json*

   where xxxx is the PDB entry ID and X is the assembly code.
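
The skip rule from step 1 and the file naming from the last step can be sketched as follows. This is an illustration under assumptions, not the package's implementation: the function names are hypothetical, the XML file names are taken literally from the list above, and "valid" is approximated here as "parses as well-formed XML" (the package may apply a stricter check).

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def is_well_formed_xml(path):
    """Stand-in validity check: the file exists and parses as XML."""
    try:
        ET.parse(path)
    except (OSError, ET.ParseError):
        return False
    return True


def should_run_pisa(xml_dir, force=False):
    """Run PISA unless both XML files are present and valid, or force is set."""
    if force:
        return True
    xml_dir = Path(xml_dir)
    names = ("interfaces.xml", "assembly.xml")
    return not all(is_well_formed_xml(xml_dir / name) for name in names)


def expected_json_paths(json_dir, pdb_id, assembly_id):
    """Build the two output paths following the naming pattern above."""
    json_dir = Path(json_dir)
    return (
        json_dir / f"{pdb_id}-assem{assembly_id}_interfaces.json",
        json_dir / f"{pdb_id}-assembly{assembly_id}.json",
    )
```

For entry `1d2s` and assembly `1`, `expected_json_paths` yields `1d2s-assem1_interfaces.json` and `1d2s-assembly1.json` inside the chosen directory.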

## Expected JSON files

Documentation on the assembly interfaces JSON file and schema can be found here:

https://pisalite.docs.apiary.io/#reference/0/pisaqualifierjson/interaction-interface-data-per-pdb-assembly-entry

The simplified assembly JSON output looks as follows:
```
{
    "PISA": {
        "pdb_id": "1d2s",
        "assembly_id": "1",
        "pisa_version": "2.0",
        "assembly": {
            "id": "1",
            "size": "8",
            "macromolecular_size": "2",
            "dissociation_energy": -3.96,
            "accessible_surface_area": 15146.45,
            "buried_surface_area": 3156.79,
            "entropy": 12.09,
            "dissociation_area": 733.07,
            "solvation_energy_gain": -41.09,
            "number_of_uc": "0",
            "number_of_dissociated_elements": "2",
            "symmetry_number": "2",
            "formula": "A(2)a(4)b(2)",
            "composition": "A-2A[CA](4)[DHT](2)"
        }
    }
}
```
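
In the sample above, count-like fields such as `size` are serialised as strings while energies and areas are numbers. A consumer may want to normalise them. The sketch below assumes only the structure shown in the sample; the helper name `load_assembly_summary`, the `COUNT_FIELDS` tuple and the trimmed `SAMPLE` string are illustrative and not part of the package.

```python
import json

# Trimmed copy of the sample output above, used only for this illustration.
SAMPLE = """
{
    "PISA": {
        "pdb_id": "1d2s",
        "assembly_id": "1",
        "assembly": {
            "size": "8",
            "macromolecular_size": "2",
            "dissociation_energy": -3.96,
            "symmetry_number": "2",
            "formula": "A(2)a(4)b(2)"
        }
    }
}
"""

# Keys that hold counts but appear as strings in the sample.
COUNT_FIELDS = (
    "size",
    "macromolecular_size",
    "number_of_uc",
    "number_of_dissociated_elements",
    "symmetry_number",
)


def load_assembly_summary(text):
    """Return the `assembly` block with count-like string fields as int."""
    assembly = dict(json.loads(text)["PISA"]["assembly"])
    for key in COUNT_FIELDS:
        if key in assembly:
            assembly[key] = int(assembly[key])
    return assembly


summary = load_assembly_summary(SAMPLE)
print(summary["size"], summary["dissociation_energy"], summary["formula"])
```

Fields outside `COUNT_FIELDS` are passed through unchanged, so `id`, `formula` and `composition` stay strings.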

## Setup with Docker

Build the docker image with:
```shell
docker build -t pisa-analysis .
```

Run the docker container with:
```
docker run -v <HOST_DIR>:/data_dir \
pisa-analysis \
pisa_analysis \
--input_cif /data_dir/<INPUT_CIF> \
--pdb_id <PDB_ID> \
--assembly_id <ASSEMBLY_CODE> \
--output_json /data_dir/<OUTPUT_JSON> \
--output_xml /data_dir/<OUTPUT_XML>
```
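
Because the container only sees files under the mounted directory, every path passed to `pisa_analysis` has to be expressed relative to `/data_dir`. The sketch below shows one way to do that translation. It assumes POSIX-style host paths, and the helper names `to_container_path` and `build_docker_command` are hypothetical.

```python
from pathlib import PurePosixPath

MOUNT_POINT = "/data_dir"  # container-side mount point used in the command above


def to_container_path(host_path, host_dir):
    """Map a path under the mounted host directory to its in-container path."""
    # relative_to raises ValueError when host_path is not inside host_dir,
    # which catches files the container would not be able to see.
    relative = PurePosixPath(host_path).relative_to(PurePosixPath(host_dir))
    return str(PurePosixPath(MOUNT_POINT) / relative)


def build_docker_command(host_dir, input_cif, pdb_id, assembly_id,
                         output_json, output_xml):
    """Assemble the `docker run` argument list shown above."""
    return [
        "docker", "run", "-v", f"{host_dir}:{MOUNT_POINT}",
        "pisa-analysis",
        "pisa_analysis",
        "--input_cif", to_container_path(input_cif, host_dir),
        "--pdb_id", pdb_id,
        "--assembly_id", str(assembly_id),
        "--output_json", to_container_path(output_json, host_dir),
        "--output_xml", to_container_path(output_xml, host_dir),
    ]
```

For example, with `/home/user/pisa` mounted, the host file `/home/user/pisa/in/1d2s.cif` becomes `/data_dir/in/1d2s.cif` inside the container.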

## Versioning

We use [SemVer](https://semver.org) for versioning.

## Authors
* [Grisell Diaz Leines](https://github.com/grisell) - Lead developer
* [Stephen Anyango](otienoanyango) - Review and productionising
* [Mihaly Varadi](https://github.com/mvaradi) - Review and management

See all contributors [here](https://github.com/PDBe-KB/pisa-analysis/graphs/contributors).

## License

See [LICENSE](https://github.com/PDBe-KB/pisa-analysis/blob/main/LICENSE)

## Acknowledgements
@@ -0,0 +1,23 @@
README.md
pyproject.toml
pisa_analysis.egg-info/PKG-INFO
pisa_analysis.egg-info/SOURCES.txt
pisa_analysis.egg-info/dependency_links.txt
pisa_analysis.egg-info/entry_points.txt
pisa_analysis.egg-info/requires.txt
pisa_analysis.egg-info/top_level.txt
pisa_utils/__init__.py
pisa_utils/analyze.py
pisa_utils/constants.py
pisa_utils/covariations_int.py
pisa_utils/dictionaries.py
pisa_utils/parsers.py
pisa_utils/run.py
pisa_utils/run_pisa.py
pisa_utils/utils.py
tests/test_analyze.py
tests/test_covariations_int.py
tests/test_dictionaries.py
tests/test_models.py
tests/test_parsers.py
tests/test_utils.py
@@ -0,0 +1 @@

@@ -0,0 +1 @@
pisa_utils
File without changes