Data-driven prediction of $\alpha_{IIb}\beta_{3}$ integrin activation pathways using nonlinear manifold learning and deep generative modeling
This repository demonstrates a general and transferrable approach to predict intermediate transition structures between known metastable states of large multi-molecular systems that are difficult to obtain using conventional techniques. The method comprises nonlinear manifold learning and deep generative modeling (DGM) methods to learn the forward mapping from high dimension (3D molecular structure) to low dimension (latent space) and a generative inverse mapping from low dimension (latent space) to high dimension (3D molecular structure), respectively. Density adaptive-diffusion maps and conditional Wasserstein generative adversarial networks (cWGAN) are used for nonlinear manifold learning and DGM, respectively. We use the prototypical and a major platelet integrin
Installation
INSTALL_LOCATION=<set path here for installation of conda env>
conda env create -f integrin_environment.yml --prefix $INSTALL_LOCATION
Note: git lfs is used for storing files in this repository. To download the intermediate all-atom PDB files, it may be better to simply download the entire repository using download zip
link under Code
if git lfs clone
or git clone
don't work because of file size restrictions.
Required codes
- https://github.com/Ferg-Lab/dMap_JAX is used for performing root mean square deviation (RMSD) calculations and to obtain diffusion maps (dMap). Both RMSD and dMap calculations are accelerated via JAX library. Follow installation instructions on dMap_JAX/README.md.
- DGM using cWGAN is performed using https://github.com/Ferg-Lab/molgen.git. Follow installation instructions on molgen/README.md.
Overview
-
datasets: Contains all-atom (~1770 residues) to 300 bead coarse-grained mapping of
$\alpha_{IIb}-\beta_{3}$ integrin in each metastable state and learning of the unified low-dimensional embedding of the gross structure relations between and within metastable states using density-adaptive dMaps. Please see datasets/README.md for getting started. - model: Trained cWGAN model to generate 300 bead coarse-grained structures of integrin using 4D latent space (leading four non-trivial dMap coordinates) as the conditional input. Please see model/README.md for further details.
- inference: Access to notebooks to generate intermediate structure at latent space points corresponding to training samples and intermediate regions between metastable states for which no training samples are available. All-atom intermediate structures in PDB format between metastable states that are generated via targeted MD simulations are also available along with template GROMACS and PLUMED files. Please see inference/README.md.
Cite
If you use the codes or notebooks from this repo in your work, please cite:
S. Dasetty, T. C. Bidone, and A. L. Ferguson. "Data-driven prediction of
@article{ferglab2023Integrin,
title={Data-driven prediction of $\alpha_{IIb}-\beta_{3}$ integrin heterodimer activation pathways using nonlinear manifold learning and deep generative modeling,
author={Dasetty, S. and Bidone, T.C. and Ferguson, A.L.},
journal={XXXX},
volume={XXXX},
number={XXXX},
pages={XXXX-XXXX},
year={2023},
publisher={XXXX}
}