
Posts Tagged ‘graph neural networks’

@MIT Artificial intelligence system rapidly predicts how two proteins will attach: The model, called EquiDock, focuses on rigid-body docking, which occurs when two proteins attach by rotating or translating in 3D space, but their shapes don’t squeeze or bend

Reporter: Aviva Lev-Ari, PhD, RN

This paper introduces a novel SE(3)-equivariant graph matching network, along with a keypoint discovery and alignment approach, for the problem of protein-protein docking, with a novel loss based on optimal transport. The overall consensus is that this is an impactful solution to an important problem: competitive results are achieved without the need for templates or refinement, and with substantially faster run times.
28 Sept 2021 (modified: 18 Nov 2021) | ICLR 2022 Spotlight | Readers: Everyone
 
Keywords: protein complexes, protein structure, rigid body docking, SE(3) equivariance, graph neural networks
Abstract: Protein complex formation is a central problem in biology, being involved in most of the cell’s processes, and essential for applications such as drug design or protein engineering. We tackle rigid body protein-protein docking, i.e., computationally predicting the 3D structure of a protein-protein complex from the individual unbound structures, assuming no three-dimensional flexibility during binding. We design a novel pairwise-independent SE(3)-equivariant graph matching network to predict the rotation and translation to place one of the proteins at the right location and the right orientation relative to the second protein. We mathematically guarantee that the predicted complex is always identical regardless of the initial placements of the two structures, avoiding expensive data augmentation. Our model approximates the binding pocket and predicts the docking pose using keypoint matching and alignment through optimal transport and a differentiable Kabsch algorithm. Empirically, we achieve significant running time improvements over existing protein docking software and predict qualitatively plausible protein complex structures despite not using heavy sampling, structure refinement, or templates.
One-sentence Summary: We perform rigid protein docking using a novel independent SE(3)-equivariant message passing mechanism that guarantees the same resulting protein complex independent of the initial placement of the two 3D structures.
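
The abstract mentions aligning matched keypoints with a Kabsch algorithm. As a rough illustration only, the NumPy sketch below shows the classical Kabsch solution for the rotation and translation that best superimpose one set of 3D points onto another. The function name `kabsch`, the synthetic keypoints, and the random test transform are assumptions made for this example; this is not EquiDock's code, which uses a differentiable variant inside a neural network.

```python
import numpy as np

def kabsch(source, target):
    """Return (R, t) minimizing sum_i ||R @ source[i] + t - target[i]||^2.

    source, target: arrays of shape (n, 3) with row-wise correspondences.
    """
    src_mean = source.mean(axis=0)
    tgt_mean = target.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets
    H = (source - src_mean).T @ (target - tgt_mean)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_mean - R @ src_mean
    return R, t

# Synthetic check: hypothetical "ligand" keypoints moved by a known rigid transform.
rng = np.random.default_rng(0)
ligand_keypoints = rng.normal(size=(8, 3))
R_true, _ = np.linalg.qr(rng.normal(size=(3, 3)))
if np.linalg.det(R_true) < 0:
    R_true[:, 0] *= -1.0  # make it a rotation rather than a reflection
t_true = rng.normal(size=3)
receptor_keypoints = ligand_keypoints @ R_true.T + t_true

R_est, t_est = kabsch(ligand_keypoints, receptor_keypoints)
aligned = ligand_keypoints @ R_est.T + t_est
rmsd = np.sqrt(np.mean(np.sum((aligned - receptor_keypoints) ** 2, axis=1)))
```

Because the target points here are an exact rigid transform of the source points, the recovered rotation and translation should match the known ones up to floating-point error. With noisy or imperfectly matched keypoints, the result is the least-squares best fit.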
 
SOURCE
 

MIT researchers created a machine-learning model that can directly predict the complex that will form when two proteins bind together. Their technique is between 80 and 500 times faster than state-of-the-art software methods, and often predicts protein structures that are closer to actual structures that have been observed experimentally.

This technique could help scientists better understand some biological processes that involve protein interactions, like DNA replication and repair; it could also speed up the process of developing new medicines.

“Deep learning is very good at capturing interactions between different proteins that are otherwise difficult for chemists or biologists to write experimentally. Some of these interactions are very complicated, and people haven’t found good ways to express them. This deep-learning model can learn these types of interactions from data,” says Octavian-Eugen Ganea, a postdoc in the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) and co-lead author of the paper.

Ganea’s co-lead author is Xinyuan Huang, a graduate student at ETH Zurich. MIT co-authors include Regina Barzilay, the School of Engineering Distinguished Professor for AI and Health in CSAIL, and Tommi Jaakkola, the Thomas Siebel Professor of Electrical Engineering in CSAIL and a member of the Institute for Data, Systems, and Society. The research will be presented at the International Conference on Learning Representations.

Significance of the Scientific Development by the @MIT Team

EquiDock wide applicability:

  • Our method can be integrated end-to-end to boost the quality of other models (see above discussion on runtime importance). Examples are predicting functions of protein complexes [3] or their binding affinity [5], de novo generation of proteins binding to specific targets (e.g., antibodies [6]), modeling backbone and side-chain flexibility [4], or devising methods for non-binary multimers. See the updated discussion in the “Conclusion” section of our paper.

 

Advantages over previous methods:

  • Our method does not rely on templates or heavy candidate sampling [7], aiming at the ambitious goal of predicting the complex pose directly. This should be interpreted in terms of generalization (to unseen structures) and scalability capabilities of docking models, as well as their applicability to various other tasks (discussed above).

 

  • Our method obtains competitive quality without explicitly using previous geometric (e.g., 3D Zernike descriptors [8]) or chemical (e.g., hydrophilic information) features [3]. Future EquiDock extensions could find creative ways to leverage these different signals and thus obtain further improvements.

   

Novelty of theory:

  • Our work is the first to formalize the notion of pairwise-independent SE(3)-equivariance. Previous work (e.g., [9,10]) has incorporated only single-object Euclidean equivariances into deep learning models. For tasks such as docking and binding of biological objects, it is crucial that models understand the concept of multiple independent Euclidean equivariances.

  • All propositions in Section 3 are our novel theoretical contributions.

  • We have rewritten the Contribution and Related Work sections to clarify this aspect.
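
One way to read the pairwise-independent SE(3)-equivariance property is sketched below. This is a paraphrase under assumed notation, not a quotation of the paper's propositions: $X_1$ and $X_2$ are the coordinate matrices of the ligand and receptor (one column per point), $R(\cdot\,|\,\cdot)$ and $t(\cdot\,|\,\cdot)$ are the predicted rotation and translation applied to the ligand, and the receptor is kept fixed.

```latex
% Arbitrary, independent rigid motions of the two inputs
% (the translation g_l is added to every column of X_l):
Z_l = Q_l X_l + g_l, \qquad Q_l \in SO(3),\; g_l \in \mathbb{R}^3,\; l \in \{1, 2\}.

% Requirement: the docked ligand follows the receptor's motion
% and does not depend on where the ligand started.
R(Z_1 \mid Z_2)\, Z_1 + t(Z_1 \mid Z_2)
  = Q_2 \bigl[ R(X_1 \mid X_2)\, X_1 + t(X_1 \mid X_2) \bigr] + g_2 .

% One sufficient pair of conditions on the predicted transform:
R(Z_1 \mid Z_2) = Q_2\, R(X_1 \mid X_2)\, Q_1^{\top}, \qquad
t(Z_1 \mid Z_2) = Q_2\, t(X_1 \mid X_2) - R(Z_1 \mid Z_2)\, g_1 + g_2 .
```

Substituting the last two conditions into the left-hand side of the requirement cancels $Q_1$ and $g_1$, which is the sense in which the predicted complex is the same for any initial placement of the two structures.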

   


Footnote [a]: We have fixed an important bug in the cross-attention code. We have done a more extensive hyperparameter search and understood that layer normalization is crucial in the layers used in Eqs. 5 and 9, but not on the h embeddings as originally shown in Eq. 10. We have seen benefits from training our models with a longer patience in the early-stopping criterion (30 epochs for DIPS and 150 epochs for DB5). Increasing the learning rate to 2e-4 is important to speed up training. Using an intersection loss weight of 10 leads to improved results compared to the default of 1.
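
For readers who want the settings from this footnote in one place, here is a minimal sketch that records them as a Python dictionary. Only the values come from the footnote; every key name and the helper `patience_for` are hypothetical and do not correspond to the actual EquiDock configuration format.

```python
# Hypothetical key names; only the values are taken from Footnote [a].
TRAINING_CONFIG = {
    "learning_rate": 2e-4,
    "intersection_loss_weight": 10.0,          # the stated default was 1
    "layer_norm_in_eq5_and_eq9_layers": True,  # described as crucial
    "layer_norm_on_h_embeddings": False,       # dropped relative to the original Eq. 10
    "early_stopping_patience_epochs": {"DIPS": 30, "DB5": 150},
}

def patience_for(dataset: str) -> int:
    """Look up the early-stopping patience (in epochs) for a dataset name."""
    return TRAINING_CONFIG["early_stopping_patience_epochs"][dataset]
```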

 

Bibliography:

[1] Protein-ligand blind docking using QuickVina-W with inter-process spatio-temporal integration, Hassan et al., 2017

[2] GNINA 1.0: molecular docking with deep learning, McNutt et al., 2021

[3] Protein-protein and domain-domain interactions, Kangueane and Nilofer, 2018

[4] Side-chain Packing Using SE(3)-Transformer, Jindal et al., 2022

[5] Contacts-based prediction of binding affinity in protein–protein complexes, Vangone et al., 2015

[6] Iterative refinement graph neural network for antibody sequence-structure co-design, Jin et al., 2021

[7] Hierarchical, rotation-equivariant neural networks to select structural models of protein complexes, Eismann et al., 2020

[8] Protein-protein docking using region-based 3D Zernike descriptors, Venkatraman et al., 2009

[9] SE(3)-transformers: 3D roto-translation equivariant attention networks, Fuchs et al., 2020

[10] E(n) equivariant graph neural networks, Satorras et al., 2021

[11] Fast end-to-end learning on protein surfaces, Sverrisson et al., 2020

SOURCE

https://openreview.net/forum?id=GQjaI9mLet
