PyPI - bio-analyze-docking - Versions diffs - 0.1.0a0__tar.gz - Mend

bio-analyze-docking 0.1.0a0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (77)

bio_analyze_docking-0.1.0a0/.gitignore ADDED

@@ -0,0 +1,26 @@
+__pycache__/
+*.py[cod]
+*.pyd
+*.pyo
+*.so
+.Python
+.venv/
+env/
+venv/
+build/
+dist/
+*.egg-info/
+.pytest_cache/
+.ruff_cache/
+.mypy_cache/
+.idea/
+.vscode/
+uv.lock
+*.log
+output
+.trae/
+*.xml

bio_analyze_docking-0.1.0a0/CHANGELOG.md ADDED

File without changes

bio_analyze_docking-0.1.0a0/PKG-INFO ADDED

@@ -0,0 +1,239 @@
+Metadata-Version: 2.4
+Name: bio-analyze-docking
+Version: 0.1.0a0
+Summary: Molecular docking module for bio-analyze.
+Author: qww
+License: GPL-3.0
+Requires-Python: <3.15,>=3.9
+Requires-Dist: bio-analyze-core>=0.1.0a0
+Requires-Dist: gemmi>=0.6.0
+Requires-Dist: meeko>=0.5.1
+Requires-Dist: numpy>=1.20.0
+Requires-Dist: openbabel-wheel
+Requires-Dist: openpyxl>=3.1.0
+Requires-Dist: pandas>=2.0.0
+Requires-Dist: pdbfixer>=1.9.0
+Requires-Dist: propka>=3.5.0
+Requires-Dist: rdkit>=2024.03.1
+Requires-Dist: scipy>=1.10.0
+Provides-Extra: dev
+Requires-Dist: pytest; extra == 'dev'
+Description-Content-Type: text/markdown
+# bio-analyze-docking
+An automated molecular docking module based on Vina, providing a full-pipeline solution from receptor/ligand preparation to docking simulation and result summarization. Supports both single docking and high-throughput batch docking.
+## ✨ Features
+- **Batch Processing**: Supports many-to-many (M receptors × N ligands) batch docking by specifying directories, with task scheduling handled automatically.
+- **Resumable**: Batch tasks support resuming; if interrupted, restarting will skip already completed tasks.
+- **Wide Format Support**:
+  - **Receptor**: Supports `.pdb`, `.cif`, `.mmcif` (automatically converted to PDB).
+  - **Ligand**: Supports `.sdf`, `.mol2`, `.pdb`, `.smi` (SMILES).
+- **Result Summarization**: Automatically generates `docking_summary.csv`, containing binding affinities, RMSD, and box parameters.
+- **Complex Generation**: Optionally generates the docked receptor-ligand complex structure (PDB format) for easy viewing in PyMOL.
+- **Automated Preparation**: Integrates `Meeko` and `PDBFixer` to automatically handle receptor protonation, missing atom completion, and ligand PDBQT conversion.
+## 🔧 Dependencies
+- **AutoDock Vina** (via Python `vina` package)
+- **Smina** (Advanced fork of Vina, must be installed in PATH)
+- **Gnina** (Deep learning-based docking, must be installed in PATH)
+- **Meeko** (Ligand/Receptor preparation)
+- **RDKit** (Cheminformatics)
+- **OpenBabel** (Backup for receptor preparation)
+- **Gemmi** (CIF/mmCIF support)
+- **PDBFixer** (Receptor repair)
+## 🚀 Usage
+### 1. Receptor Preparation
+Converts PDB/CIF files to PDBQT format, automatically adding polar hydrogens.
+```bash
+# Single file
+uv run bioanalyze docking prepare-receptor receptor.pdb -o receptor.pdbqt
+# CIF format support
+uv run bioanalyze docking prepare-receptor structure.cif -o structure.pdbqt
+```
+### 2. Ligand Preparation
+Converts SDF/SMILES/PDB files to PDBQT format, automatically generating 3D conformations and handling flexible bonds.
+```bash
+uv run bioanalyze docking prepare-ligand ligand.sdf -o ligand.pdbqt
+```
+### 3. Run Docking
+#### Scenario A: Single Docking
+```bash
+uv run bioanalyze docking run \
+    --receptor receptor.pdbqt \
+    --ligand ligand.pdbqt \
+    --output ./results \
+    --center-x 10.5 --center-y 20.0 --center-z 30.0 \
+    --size-x 20 --size-y 20 --size-z 20
+```
+#### Scenario B: Batch Docking
+Pass a directory to `--receptor`, `--ligand`, or both; the program scans each directory for all supported files and docks every receptor-ligand pair.
+```bash
+uv run bioanalyze docking run \
+    --receptor ./receptors_dir \
+    --ligand ./ligands_dir \
+    --output ./batch_results \
+    --padding 4.0  # Automatically calculate the box based on the receptor and add 4.0 Å padding
+```
+**Batch Docking Output Structure:**
+```
+batch_results/
+├── dock_results/
+│   ├── poses/          # Docked poses (PDBQT)
+│   │   └── receptor_name/
+│   │       └── ligand_name_docked.pdbqt
+│   └── complex/        # (Optional) Complex structures (PDB)
+├── docking_summary.csv # Summary table (contains Affinity, RMSD, etc.)
+├── logs/               # Independent logs for each task
+└── configs.json        # Run configuration record
+```
+#### Scenario C: Autoboxing based on Reference Ligand
+Use a co-crystallized ligand to automatically determine the docking center and extent.
+```bash
+uv run bioanalyze docking run \
+    --receptor receptor.pdbqt \
+    --ligand ligand.pdbqt \
+    --output ./results \
+    --autobox-ligand reference_ligand.sdf \
+    --padding 4.0
+```
+#### Scenario D: Using a Configuration File
+Manage complex parameters via `config.json` or `config.yaml`. The `engine` field accepts `"vina"`, `"smina"`, or `"gnina"`:
+```json
+{
+  "receptor": "./receptors_dir",
+  "ligand": "./ligands_dir",
+  "output_dir": "./results",
+  "exhaustiveness": 8,
+  "n_poses": 9,
+  "engine": "vina"
+}
+```
+```bash
+uv run bioanalyze docking run --config config.json
+```
+#### Scenario E: Using Smina or Gnina Engine
+If `smina` or `gnina` is installed in the system PATH, you can enable them via the `--engine` parameter:
+```bash
+uv run bioanalyze docking run \
+    --receptor receptor.pdbqt \
+    --ligand ligand.pdbqt \
+    --output ./results \
+    --engine gnina
+```
+## 📦 Python API
+### 1. Single Docking (`run_docking`)
+```python
+from bio_analyze_docking import run_docking
+from pathlib import Path
+result = run_docking(
+    receptor=Path("receptor.pdb"),       # Supports PDB/PDBQT/CIF
+    ligand=Path("ligand.sdf"),           # Supports SDF/MOL2/PDB/SMILES
+    output_dir=Path("./results"),
+    center=[10.5, 20.0, 30.0],           # Box center [x, y, z]
+    size=[20.0, 20.0, 20.0],             # Box size [x, y, z]
+    exhaustiveness=8,                    # Search exhaustiveness (default 8)
+    n_poses=9,                           # Number of poses to output
+    output_docked_lig_recep_struct=True, # Whether to save complex PDB (requires PyMOL)
+    charge_model="gasteiger"             # Charge model
+)
+print(f"Best Score: {result['best_score']}")
+```
+**Parameters:**
+- `receptor`: Receptor file path (PDB/PDBQT/CIF/MMCIF).
+- `ligand`: Ligand file path (SDF/MOL2/PDB/SMILES).
+- `output_dir`: Output directory.
+- `center`: Box center `[x, y, z]`.
+- `size`: Box size `[x, y, z]` (default `[20, 20, 20]`).
+- `autobox_ligand`: (Optional) Reference ligand path for automatic box definition (overrides `center` and `size`).
+- `padding`: (Optional) Automatic box padding (Angstroms).
+- `exhaustiveness`: Vina search exhaustiveness (default 8).
+- `n_poses`: Number of poses to generate (default 9).
+- `output_docked_lig_recep_struct`: Whether to generate complex PDB files (default False).
+- `charge_model`: Charge model used during receptor preparation (default 'gasteiger').
+### 2. Batch Docking (`run_docking_batch`)
+```python
+from bio_analyze_docking import run_docking_batch
+from pathlib import Path
+results = run_docking_batch(
+    receptors=Path("./receptors_dir"),  # Receptor directory
+    ligands=Path("./ligands_dir"),      # Ligand directory
+    output_dir=Path("./batch_results"),
+    padding=4.0,                        # Auto box padding
+    exhaustiveness=8
+)
+# results is a list containing the results of each docking task
+```
+**Parameters:**
+- `receptors`: Receptor directory path or list of files.
+- `ligands`: Ligand directory path or list of files.
+- `output_dir`: Base output directory.
+- `summary_filename`: Summary file name (default "docking_summary.csv").
+- Other parameters are the same as `run_docking`.
+### 3. Underlying Components
+You can also use the underlying preparation and engine classes independently:
+```python
+from pathlib import Path
+from bio_analyze_docking import prepare_receptor, prepare_ligand, DockingEngine
+# Prepare files
+rec_pdbqt = prepare_receptor("protein.pdb", "protein.pdbqt")
+lig_pdbqt = prepare_ligand("ligand.sdf", "ligand.pdbqt")
+# Initialize engine
+engine = DockingEngine(rec_pdbqt, lig_pdbqt, output_dir=Path("./out"))
+# Compute box
+engine.compute_box(center=[0, 0, 0], size=[20, 20, 20])
+# Run docking
+engine.dock()
+# Save results
+engine.save_results("docked.pdbqt")
+print(f"Best affinity: {engine.score()}")
+```
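
Scenario C in the description above derives the docking box from a reference ligand plus a padding value. The following is a minimal sketch of that idea, assuming the box is the axis-aligned bounding box of the ligand's atom coordinates, grown by `padding` Å on every side. `autobox_from_coords` is a hypothetical helper written for illustration; it is not part of the package's API, and the package's actual rule may differ.

```python
def autobox_from_coords(coords, padding=4.0):
    """Return (center, size) of an axis-aligned box around the given atoms.

    coords  : iterable of (x, y, z) positions in Angstroms
    padding : margin in Angstroms added on each side of the bounding box
    """
    coords = list(coords)
    if not coords:
        raise ValueError("at least one atom coordinate is required")

    # Split the points into per-axis tuples, then take the extremes per axis.
    xs, ys, zs = zip(*coords)
    lo = (min(xs), min(ys), min(zs))
    hi = (max(xs), max(ys), max(zs))

    # The center is the midpoint of the bounding box; the size is its
    # extent plus padding on both sides of each axis.
    center = [(a + b) / 2.0 for a, b in zip(lo, hi)]
    size = [(b - a) + 2.0 * padding for a, b in zip(lo, hi)]
    return center, size


# Illustrative coordinates only, not taken from a real structure.
ligand_xyz = [(9.0, 18.0, 29.0), (12.0, 22.0, 31.0), (10.0, 20.0, 30.0)]
center, size = autobox_from_coords(ligand_xyz, padding=4.0)
print(center, size)
```

The resulting `center` and `size` lists have the same shape as the `center` and `size` arguments documented for `run_docking`.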

bio_analyze_docking-0.1.0a0/README.md ADDED

@@ -0,0 +1,217 @@
+# bio-analyze-docking
+An automated molecular docking module based on Vina, providing a full-pipeline solution from receptor/ligand preparation to docking simulation and result summarization. Supports both single docking and high-throughput batch docking.
+## ✨ Features
+- **Batch Processing**: Supports many-to-many (M receptors × N ligands) batch docking by specifying directories, with task scheduling handled automatically.
+- **Resumable**: Batch tasks support resuming; if interrupted, restarting will skip already completed tasks.
+- **Wide Format Support**:
+  - **Receptor**: Supports `.pdb`, `.cif`, `.mmcif` (automatically converted to PDB).
+  - **Ligand**: Supports `.sdf`, `.mol2`, `.pdb`, `.smi` (SMILES).
+- **Result Summarization**: Automatically generates `docking_summary.csv`, containing binding affinities, RMSD, and box parameters.
+- **Complex Generation**: Optionally generates the docked receptor-ligand complex structure (PDB format) for easy viewing in PyMOL.
+- **Automated Preparation**: Integrates `Meeko` and `PDBFixer` to automatically handle receptor protonation, missing atom completion, and ligand PDBQT conversion.
+## 🔧 Dependencies
+- **AutoDock Vina** (via Python `vina` package)
+- **Smina** (Advanced fork of Vina, must be installed in PATH)
+- **Gnina** (Deep learning-based docking, must be installed in PATH)
+- **Meeko** (Ligand/Receptor preparation)
+- **RDKit** (Cheminformatics)
+- **OpenBabel** (Backup for receptor preparation)
+- **Gemmi** (CIF/mmCIF support)
+- **PDBFixer** (Receptor repair)
+## 🚀 Usage
+### 1. Receptor Preparation
+Converts PDB/CIF files to PDBQT format, automatically adding polar hydrogens.
+```bash
+# Single file
+uv run bioanalyze docking prepare-receptor receptor.pdb -o receptor.pdbqt
+# CIF format support
+uv run bioanalyze docking prepare-receptor structure.cif -o structure.pdbqt
+```
+### 2. Ligand Preparation
+Converts SDF/SMILES/PDB files to PDBQT format, automatically generating 3D conformations and handling flexible bonds.
+```bash
+uv run bioanalyze docking prepare-ligand ligand.sdf -o ligand.pdbqt
+```
+### 3. Run Docking
+#### Scenario A: Single Docking
+```bash
+uv run bioanalyze docking run \
+    --receptor receptor.pdbqt \
+    --ligand ligand.pdbqt \
+    --output ./results \
+    --center-x 10.5 --center-y 20.0 --center-z 30.0 \
+    --size-x 20 --size-y 20 --size-z 20
+```
+#### Scenario B: Batch Docking
+Pass a directory to `--receptor`, `--ligand`, or both; the program scans each directory for all supported files and docks every receptor-ligand pair.
+```bash
+uv run bioanalyze docking run \
+    --receptor ./receptors_dir \
+    --ligand ./ligands_dir \
+    --output ./batch_results \
+    --padding 4.0  # Automatically calculate the box based on the receptor and add 4.0 Å padding
+```
+**Batch Docking Output Structure:**
+```
+batch_results/
+├── dock_results/
+│   ├── poses/          # Docked poses (PDBQT)
+│   │   └── receptor_name/
+│   │       └── ligand_name_docked.pdbqt
+│   └── complex/        # (Optional) Complex structures (PDB)
+├── docking_summary.csv # Summary table (contains Affinity, RMSD, etc.)
+├── logs/               # Independent logs for each task
+└── configs.json        # Run configuration record
+```
+#### Scenario C: Autoboxing based on Reference Ligand
+Use a co-crystallized ligand to automatically determine the docking center and extent.
+```bash
+uv run bioanalyze docking run \
+    --receptor receptor.pdbqt \
+    --ligand ligand.pdbqt \
+    --output ./results \
+    --autobox-ligand reference_ligand.sdf \
+    --padding 4.0
+```
+#### Scenario D: Using a Configuration File
+Manage complex parameters via `config.json` or `config.yaml`. The `engine` field accepts `"vina"`, `"smina"`, or `"gnina"`:
+```json
+{
+  "receptor": "./receptors_dir",
+  "ligand": "./ligands_dir",
+  "output_dir": "./results",
+  "exhaustiveness": 8,
+  "n_poses": 9,
+  "engine": "vina"
+}
+```
+```bash
+uv run bioanalyze docking run --config config.json
+```
+#### Scenario E: Using Smina or Gnina Engine
+If `smina` or `gnina` is installed in the system PATH, you can enable them via the `--engine` parameter:
+```bash
+uv run bioanalyze docking run \
+    --receptor receptor.pdbqt \
+    --ligand ligand.pdbqt \
+    --output ./results \
+    --engine gnina
+```
+## 📦 Python API
+### 1. Single Docking (`run_docking`)
+```python
+from bio_analyze_docking import run_docking
+from pathlib import Path
+result = run_docking(
+    receptor=Path("receptor.pdb"),       # Supports PDB/PDBQT/CIF
+    ligand=Path("ligand.sdf"),           # Supports SDF/MOL2/PDB/SMILES
+    output_dir=Path("./results"),
+    center=[10.5, 20.0, 30.0],           # Box center [x, y, z]
+    size=[20.0, 20.0, 20.0],             # Box size [x, y, z]
+    exhaustiveness=8,                    # Search exhaustiveness (default 8)
+    n_poses=9,                           # Number of poses to output
+    output_docked_lig_recep_struct=True, # Whether to save complex PDB (requires PyMOL)
+    charge_model="gasteiger"             # Charge model
+)
+print(f"Best Score: {result['best_score']}")
+```
+**Parameters:**
+- `receptor`: Receptor file path (PDB/PDBQT/CIF/MMCIF).
+- `ligand`: Ligand file path (SDF/MOL2/PDB/SMILES).
+- `output_dir`: Output directory.
+- `center`: Box center `[x, y, z]`.
+- `size`: Box size `[x, y, z]` (default `[20, 20, 20]`).
+- `autobox_ligand`: (Optional) Reference ligand path for automatic box definition (overrides `center` and `size`).
+- `padding`: (Optional) Automatic box padding (Angstroms).
+- `exhaustiveness`: Vina search exhaustiveness (default 8).
+- `n_poses`: Number of poses to generate (default 9).
+- `output_docked_lig_recep_struct`: Whether to generate complex PDB files (default False).
+- `charge_model`: Charge model used during receptor preparation (default 'gasteiger').
+### 2. Batch Docking (`run_docking_batch`)
+```python
+from bio_analyze_docking import run_docking_batch
+from pathlib import Path
+results = run_docking_batch(
+    receptors=Path("./receptors_dir"),  # Receptor directory
+    ligands=Path("./ligands_dir"),      # Ligand directory
+    output_dir=Path("./batch_results"),
+    padding=4.0,                        # Auto box padding
+    exhaustiveness=8
+)
+# results is a list containing the results of each docking task
+```
+**Parameters:**
+- `receptors`: Receptor directory path or list of files.
+- `ligands`: Ligand directory path or list of files.
+- `output_dir`: Base output directory.
+- `summary_filename`: Summary file name (default "docking_summary.csv").
+- Other parameters are the same as `run_docking`.
+### 3. Underlying Components
+You can also use the underlying preparation and engine classes independently:
+```python
+from pathlib import Path
+from bio_analyze_docking import prepare_receptor, prepare_ligand, DockingEngine
+# Prepare files
+rec_pdbqt = prepare_receptor("protein.pdb", "protein.pdbqt")
+lig_pdbqt = prepare_ligand("ligand.sdf", "ligand.pdbqt")
+# Initialize engine
+engine = DockingEngine(rec_pdbqt, lig_pdbqt, output_dir=Path("./out"))
+# Compute box
+engine.compute_box(center=[0, 0, 0], size=[20, 20, 20])
+# Run docking
+engine.dock()
+# Save results
+engine.save_results("docked.pdbqt")
+print(f"Best affinity: {engine.score()}")
+```
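
The README describes batch docking as M receptors × N ligands with resumable runs and a `dock_results/poses/<receptor_name>/<ligand_name>_docked.pdbqt` output layout. A rough sketch of how such pairing and skipping could work is shown here. It assumes that a task counts as completed when its pose file already exists; that criterion, and the helper names `list_inputs` and `pending_tasks`, are illustrative assumptions rather than the package's actual implementation.

```python
from itertools import product
from pathlib import Path

# Extensions taken from the README's "Wide Format Support" list.
RECEPTOR_EXTS = {".pdb", ".cif", ".mmcif"}
LIGAND_EXTS = {".sdf", ".mol2", ".pdb", ".smi"}


def list_inputs(path, extensions):
    """Return a single file as a one-item list, or the supported files in a directory."""
    path = Path(path)
    if path.is_file():
        return [path]
    return sorted(p for p in path.iterdir() if p.suffix.lower() in extensions)


def pending_tasks(receptors, ligands, output_dir):
    """Pair every receptor with every ligand and drop pairs whose pose file exists."""
    poses_dir = Path(output_dir) / "dock_results" / "poses"
    tasks = []
    for rec, lig in product(receptors, ligands):
        pose = poses_dir / rec.stem / f"{lig.stem}_docked.pdbqt"
        if not pose.exists():  # assumed completion check: pose file already written
            tasks.append((rec, lig, pose))
    return tasks
```

With two receptors and three ligands, `pending_tasks` yields six tasks on a fresh output directory and fewer on a rerun, one less for each pose file already present.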

bio_analyze_docking-0.1.0a0/metadata/prepare-ligand.json ADDED

@@ -0,0 +1,39 @@
+{
+  "name": "prepare-ligand",
+  "description": {
+    "en": "",
+    "zh": ""
+  },
+  "params": [
+    {
+      "name": "--input-file",
+      "type": "path",
+      "required": true,
+      "default": null,
+      "description": {
+        "en": "Input ligand file (SDF, SMILES, PDB).",
+        "zh": "输入配体文件 (SDF, SMILES, PDB)。"
+      }
+    },
+    {
+      "name": "-o, --output",
+      "type": "path",
+      "required": true,
+      "default": null,
+      "description": {
+        "en": "Output PDBQT file.",
+        "zh": "输出 PDBQT 文件。"
+      }
+    },
+    {
+      "name": "--add-hydrogens",
+      "type": "bool",
+      "required": false,
+      "default": "True",
+      "description": {
+        "en": "Add hydrogens.",
+        "zh": "添加氢原子。"
+      }
+    }
+  ]
+}

bio_analyze_docking-0.1.0a0/metadata/prepare-ligand_cli.json ADDED

@@ -0,0 +1,40 @@
+{
+  "name": "prepare-ligand",
+  "type": "cli",
+  "description": {
+    "zh": "",
+    "en": ""
+  },
+  "params": [
+    {
+      "name": "--input-file",
+      "type": "path",
+      "required": true,
+      "default": null,
+      "description": {
+        "zh": "输入配体文件 (SDF, SMILES, PDB)。",
+        "en": "Input ligand file (SDF, SMILES, PDB)."
+      }
+    },
+    {
+      "name": "-o, --output",
+      "type": "path",
+      "required": true,
+      "default": null,
+      "description": {
+        "zh": "输出 PDBQT 文件。",
+        "en": "Output PDBQT file."
+      }
+    },
+    {
+      "name": "--add-hydrogens",
+      "type": "bool",
+      "required": false,
+      "default": "True",
+      "description": {
+        "zh": "添加氢原子。",
+        "en": "Add hydrogens."
+      }
+    }
+  ]
+}
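
Each parameter in these metadata files carries both an `en` and a `zh` description. A quick consistency check can flag `en` entries that still contain Chinese text. The sketch below assumes the structure shown above (`params[].description.en`) and treats any character in the CJK Unified Ideographs range as a sign of an untranslated entry. The helper names and the abridged sample are illustrative and are not part of the package.

```python
def has_cjk(text):
    """True if the text contains a character in the CJK Unified Ideographs block."""
    return any("\u4e00" <= ch <= "\u9fff" for ch in text)


def untranslated_params(metadata):
    """Return the names of parameters whose 'en' description contains CJK text."""
    return [
        param["name"]
        for param in metadata.get("params", [])
        if has_cjk(param.get("description", {}).get("en", ""))
    ]


# Abridged, illustrative sample in the same shape as the metadata files.
sample = {
    "name": "prepare-ligand",
    "params": [
        {
            "name": "--input-file",
            "description": {
                "en": "输入配体文件 (SDF, SMILES, PDB)。",
                "zh": "输入配体文件 (SDF, SMILES, PDB)。",
            },
        },
        {
            "name": "-o, --output",
            "description": {"en": "Output PDBQT file.", "zh": "输出 PDBQT 文件。"},
        },
    ],
}

flagged = untranslated_params(sample)
print(flagged)  # ['--input-file']
```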

bio_analyze_docking-0.1.0a0/metadata/prepare-receptor.json ADDED

@@ -0,0 +1,49 @@
+{
+  "name": "prepare-receptor",
+  "description": {
+    "en": "",
+    "zh": ""
+  },
+  "params": [
+    {
+      "name": "--input-file",
+      "type": "path",
+      "required": true,
+      "default": null,
+      "description": {
+        "en": "Input receptor file (PDB).",
+        "zh": "输入受体文件 (PDB)。"
+      }
+    },
+    {
+      "name": "-o, --output",
+      "type": "path",
+      "required": true,
+      "default": null,
+      "description": {
+        "en": "Output PDBQT file.",
+        "zh": "输出 PDBQT 文件。"
+      }
+    },
+    {
+      "name": "--add-hydrogens",
+      "type": "bool",
+      "required": false,
+      "default": "True",
+      "description": {
+        "en": "Add hydrogens.",
+        "zh": "添加氢原子。"
+      }
+    },
+    {
+      "name": "--charge-model",
+      "type": "string",
+      "required": false,
+      "default": "gasteiger",
+      "description": {
+        "en": "Charge model (gasteiger, zero, etc.).",
+        "zh": "电荷模型 (gasteiger, zero 等)。"
+      }
+    }
+  ]
+}

bio_analyze_docking-0.1.0a0/metadata/prepare-receptor_cli.json ADDED

@@ -0,0 +1,50 @@
+{
+  "name": "prepare-receptor",
+  "type": "cli",
+  "description": {
+    "zh": "",
+    "en": ""
+  },
+  "params": [
+    {
+      "name": "--input-file",
+      "type": "path",
+      "required": true,
+      "default": null,
+      "description": {
+        "zh": "输入受体文件 (PDB)。",
+        "en": "Input receptor file (PDB)."
+      }
+    },
+    {
+      "name": "-o, --output",
+      "type": "path",
+      "required": true,
+      "default": null,
+      "description": {
+        "zh": "输出 PDBQT 文件。",
+        "en": "Output PDBQT file."
+      }
+    },
+    {
+      "name": "--add-hydrogens",
+      "type": "bool",
+      "required": false,
+      "default": "True",
+      "description": {
+        "zh": "添加氢原子。",
+        "en": "Add hydrogens."
+      }
+    },
+    {
+      "name": "--charge-model",
+      "type": "string",
+      "required": false,
+      "default": "gasteiger",
+      "description": {
+        "zh": "电荷模型 (gasteiger, zero 等)。",
+        "en": "Charge model (gasteiger, zero, etc.)."
+      }
+    }
+  ]
+}
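
The `*_cli.json` files describe each command's flags, types, defaults, and help text declaratively. To illustrate how such a description could drive a command-line parser, the sketch below builds an `argparse` parser from a metadata dictionary. It is an assumption-laden illustration: the diff does not show how the package itself consumes these files, `build_parser` and `parse_bool` are hypothetical helpers, and the embedded JSON is an abridged copy of the `prepare-receptor` metadata.

```python
import argparse
import json
from pathlib import Path

# Maps the metadata "type" strings seen above to Python converters.
TYPE_MAP = {"path": Path, "string": str}


def parse_bool(value):
    """Convert 'True'/'False'-style strings (or real booleans) to bool."""
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in {"true", "1", "yes"}:
        return True
    if text in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"not a boolean: {value!r}")


def build_parser(metadata, lang="en"):
    """Create an ArgumentParser with one option per entry in metadata['params']."""
    parser = argparse.ArgumentParser(prog=metadata["name"])
    for param in metadata["params"]:
        # "-o, --output" becomes ["-o", "--output"].
        flags = [flag.strip() for flag in param["name"].split(",")]
        default = param["default"]
        if param["type"] == "bool":
            converter = parse_bool
            default = None if default is None else parse_bool(default)
        else:
            converter = TYPE_MAP[param["type"]]
        parser.add_argument(
            *flags,
            type=converter,
            required=param["required"],
            default=default,
            help=param["description"].get(lang, ""),
        )
    return parser


PREPARE_RECEPTOR = json.loads("""
{
  "name": "prepare-receptor",
  "params": [
    {"name": "--input-file", "type": "path", "required": true, "default": null,
     "description": {"en": "Input receptor file (PDB)."}},
    {"name": "-o, --output", "type": "path", "required": true, "default": null,
     "description": {"en": "Output PDBQT file."}},
    {"name": "--add-hydrogens", "type": "bool", "required": false, "default": "True",
     "description": {"en": "Add hydrogens."}},
    {"name": "--charge-model", "type": "string", "required": false, "default": "gasteiger",
     "description": {"en": "Charge model (gasteiger, zero, etc.)."}}
  ]
}
""")

parser = build_parser(PREPARE_RECEPTOR)
args = parser.parse_args(["--input-file", "receptor.pdb", "-o", "receptor.pdbqt"])
print(args.input_file, args.output, args.add_hydrogens, args.charge_model)
```

Keeping the flag definitions in data rather than code means the same file can feed both the parser and generated documentation, which is one plausible reason for shipping these metadata files alongside the package.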