Deep Site and Docking Pose (DSDP) is a GPU-accelerated blind docking strategy developed by the Gao Group. The site prediction part is based on PUResNet with several modifications. The pose sampling part is similar to AutoDock Vina, also with a number of modifications.
This repository contains code, instructions, dataset and model weights necessary to run the method.
The source code is available on Linux systems (tested on Ubuntu 20.04 and 22.04).
NVCC is required for compilation. Please install the CUDA Toolkit and make sure it is in the system path. The CUDA version needs to be compatible with g++ and torch. We used cuda_11.6 and gcc 9.4.0; if an older gcc version is used on your computer, please modify the Makefile by replacing sm_70 with sm_60.
Please set up the Python environment with Anaconda.
Create a new environment from DSDP.yml:
conda env create -f DSDP.yml
Check that the version of torch matches your CUDA environment. If needed, change the torch version directly in the DSDP.yml file.
Activate the environment:
conda activate DSDP
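With the environment active, a quick check like the one below can confirm that the installed torch build matches the local CUDA setup. This is only a sketch and not part of DSDP: the function name `describe_torch_cuda` is hypothetical, and the reported `sm_XX` string is simply the compute capability of GPU 0 as seen by torch, which may help when choosing the architecture flag in the Makefile.

```python
def describe_torch_cuda():
    """Summarize the installed torch build, or return None if torch is missing."""
    try:
        import torch
    except ImportError:
        return None
    info = {
        "torch_version": torch.__version__,
        # CUDA version torch was built against; None for CPU-only builds
        "built_for_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
        "compute_capability": None,
    }
    if info["cuda_available"]:
        # Compute capability of the first visible GPU, e.g. (7, 0) -> "sm_70"
        major, minor = torch.cuda.get_device_capability(0)
        info["compute_capability"] = f"sm_{major}{minor}"
    return info


if __name__ == "__main__":
    print(describe_torch_cuda())
```

If `built_for_cuda` differs from the toolkit reported by `nvcc --version`, adjusting the torch version in DSDP.yml is the likely fix.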
cd DSDP_redocking
make
cd ..
If you need to compile again, please run make clean && make.
cd protein_feature_tool
g++ protein_feature_tool.cpp -o protein_feature_tool
cd ..
cd surface_tool
make
cd ..
cd DSDP_blind_docking
make
cd ..
A Python wrapper for DSDP is available on PyPI (it works on Ubuntu and Windows). You can install it directly with
pip install DSDP
There are some differences in installation and usage between the source code and the Python wrapper of DSDP. See DSDP_in_pypi for details.
test_dataset contains three datasets: DSDP_dataset, the DUD-E dataset and the PDBBind time split dataset.
For each complex you want to predict, you need a directory containing the ligand and protein files. For example:
DSDP_dataset
└───name1
│ name1_protein.pdbqt
│ name1_ligand.pdbqt
└───name2
│ name2_protein.pdbqt
│ name2_ligand.pdbqt
...
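Before launching a run, it can be useful to verify that every complex directory follows the layout above. The sketch below is an illustration only and not part of DSDP; the helper name `find_incomplete_complexes` is hypothetical, and it assumes the `<name>_protein.pdbqt` / `<name>_ligand.pdbqt` naming shown in the example tree.

```python
from pathlib import Path


def find_incomplete_complexes(dataset_dir):
    """Map each complex directory name to the expected files it is missing."""
    missing = {}
    for complex_dir in sorted(Path(dataset_dir).iterdir()):
        if not complex_dir.is_dir():
            continue  # ignore stray files at the dataset root
        name = complex_dir.name
        expected = [f"{name}_protein.pdbqt", f"{name}_ligand.pdbqt"]
        absent = [f for f in expected if not (complex_dir / f).is_file()]
        if absent:
            missing[name] = absent
    return missing


if __name__ == "__main__":
    import sys

    # Usage: python check_dataset.py ./test_dataset/DSDP_dataset/
    for name, files in find_incomplete_complexes(sys.argv[1]).items():
        print(f"{name}: missing {', '.join(files)}")
```

An empty result means every subdirectory contains both expected files.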
Input files of DSDP are in pdbqt format, which can be generated from pdb files by AutoDock Tools.
DSDP is an integrated docking program developed for blind docking, which can also be used for redocking tasks. If you installed DSDP from PyPI, please see DSDP_in_pypi for details. If you installed it from source code, you can run DSDP as follows:
For the blind docking task, run:
python DSDP_blind_docking.py \
--dataset_path ./test_dataset/DSDP_dataset/ \
--dataset_name DSDP_dataset \
--site_path ./results/DSDP_dataset/site_output/ \
--exhaustiveness 384 --search_depth 40 --top_n 1 \
--out ./results/DSDP_dataset/docking_results/ \
--log ./results/DSDP_dataset/docking_results/
Options (see --help):
--dataset_path: Path to the dataset; put the pdbqt files of the protein and ligand in one folder
--dataset_name: Name of the test dataset
--site_path: Output path of the site
--exhaustiveness: Number of sampling threads
--search_depth: Number of sampling steps
--top_n: Top N results are exported
--out: Output path of DSDP
--log: Log path of DSDP
For redocking and conventional docking tasks, run:
./DSDP_redocking/DSDP \
--ligand ./test_dataset/DSDP_dataset/1a2b/1a2b_ligand.pdbqt \
--protein ./test_dataset/DSDP_dataset/1a2b/1a2b_protein.pdbqt \
--box_min 2.241 20.008 21.314 \
--box_max 24.744 35.470 38.495 \
--exhaustiveness 384 --search_depth 40 --top_n 1 \
--out ./results/DSDP_dataset/redocking/1a2b_out.pdbqt \
--log ./results/DSDP_dataset/redocking/1a2b_out.log
Note: for redocking, the box information (minima and maxima along the x, y and z axes) must be provided by the user. The box in this example is only suitable for the 1a2b protein.
--ligand: File name of ligand
--protein: File name of protein
--box_min: x y z minima of box
--box_max: x y z maxima of box
--exhaustiveness: Number of sampling threads, default 384
--search_depth: Number of sampling steps, default 40
--top_n: Top N results are exported, default 10
--out: Output file name of redocking, default 'OUT.pdbqt'
--log: Log file name of redocking, default 'OUT.log'
The --help option is also provided to print a message about the arguments. It is supported since the new version of 2023/9/26, in which we also renamed the arguments, e.g., -ligand to --ligand.
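Since the redocking box has to be supplied by the user, one possible way to obtain it is to take the bounding box of a reference ligand and pad it on every side. The sketch below illustrates that idea only; it is not necessarily how the box in the 1a2b example was derived. The function names `ligand_box` and `redocking_command` and the 5 Å padding are assumptions, and coordinates are read from the PDB-style fixed columns of ATOM/HETATM records.

```python
def ligand_box(pdbqt_lines, padding=5.0):
    """Return (box_min, box_max) enclosing all atoms, padded by `padding` angstroms."""
    coords = []
    for line in pdbqt_lines:
        if line.startswith(("ATOM", "HETATM")):
            # PDB-style fixed columns (1-based): x = 31-38, y = 39-46, z = 47-54
            coords.append(
                (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            )
    if not coords:
        raise ValueError("no ATOM/HETATM records found")
    box_min = tuple(round(min(c[i] for c in coords) - padding, 3) for i in range(3))
    box_max = tuple(round(max(c[i] for c in coords) + padding, 3) for i in range(3))
    return box_min, box_max


def redocking_command(ligand, protein, box_min, box_max, out, log):
    """Build the argument list for the redocking binary using the flags listed above."""
    return [
        "./DSDP_redocking/DSDP",
        "--ligand", ligand,
        "--protein", protein,
        "--box_min", *(str(v) for v in box_min),
        "--box_max", *(str(v) for v in box_max),
        "--out", out,
        "--log", log,
    ]


if __name__ == "__main__":
    ligand = "./test_dataset/DSDP_dataset/1a2b/1a2b_ligand.pdbqt"
    protein = "./test_dataset/DSDP_dataset/1a2b/1a2b_protein.pdbqt"
    with open(ligand) as handle:
        box_min, box_max = ligand_box(handle)
    # Print the command for inspection; pass the list to subprocess.run to execute it.
    print(" ".join(redocking_command(ligand, protein, box_min, box_max,
                                     "1a2b_out.pdbqt", "1a2b_out.log")))
```

The padding is a free parameter: a larger value widens the search space, so it is worth adjusting for each target rather than treating 5 Å as a recommendation.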
The binding site prediction part of DSDP is modified from PUResNet. train_example contains the script to train the model used in the present work. Note that the training dataset included there is just an example. The full training dataset is a subset of PDBBind, the one used in EquiBind (https://arxiv.org/abs/2202.05146). You can download this dataset from their website: https://zenodo.org/record/6408497.