How Artificial Intelligence Is Advancing Structural Proteomics

This article is based on research findings that are yet to be peer-reviewed. Results are therefore considered as preliminary and should be interpreted as such. For further information, please contact the cited source.

Understanding protein complex formation is crucial in drug design and the development of therapeutic proteins such as antibodies. However, proteins can attach to each other in millions of different combinations and current docking solutions used to predict these interactions can be very slow. Faster and more accurate solutions are needed to streamline the process.

In a pre-print published earlier this year, a new machine-learning model, EquiDock, was introduced that can rapidly predict how two proteins will interact. Unlike other approaches, the model doesn’t rely on heavy candidate sampling and was shown to reach predictions 80–500 times faster than popular docking software.

To learn more about EquiDock and how artificial intelligence (AI) methods are advancing the field of structural proteomics, Technology Networks spoke to co-lead author of the paper, Octavian-Eugen Ganea, a postdoctoral researcher in the MIT Computer Science and Artificial Intelligence Laboratory.

Molly Campbell (MC): For our readers who may be unfamiliar, please can you describe your current research focus in proteomics?

Octavian Ganea (OG): My research uses AI (specifically, deep learning) to model aspects of molecules that are important in various applications such as drug discovery.

Proteins are involved in most of the biological processes in our bodies. Two or more proteins with different functions interact and form larger machines, ie, complexes. They also bind to smaller molecules such as those found in drugs. These processes change the biological functions of individual proteins; for instance, an ideal drug would inhibit a cancer-causing protein by attaching to specific parts of its surface. I am interested in using deep learning to model these interactions and to assist and speed up the research of chemists and biologists by providing better and faster computational tools.

MC: How are AI-based methods advancing the field of proteomics and specifically structural proteomics?

OG: Biological processes are inherently very complicated and have their own mysteries, even for domain experts. For instance, to understand how interacting proteins attach to each other, humans or computers have to try out all possible attachment combinations in order to find the most plausible one. Intuitively, given two three-dimensional objects with very irregular surfaces, one has to rotate them and try to dock them in all possible ways until one finds two complementary regions, one on each surface, that match very well in terms of their geometric and chemical patterns. This is a very time-consuming process for both manual and computational approaches. Moreover, biologists are interested in discovering new interactions across a very large set of proteins, such as the human proteome of ~20,000 proteins. This is important, for instance, for automatically discovering unexpected side effects of new treatments. The problem then becomes similar to an extremely large 3D puzzle in which one has to simultaneously scan the pieces for matching ones and understand how each pairwise attachment happens by trying out all possible combinations and rotations.
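To give a rough sense of scale for the proteome-wide scan described above, the short Python sketch below counts the unordered pairs among ~20,000 proteins and converts a per-pair docking time into total sequential compute time. The per-pair timings are hypothetical placeholders chosen only for illustration; they are not measurements from the paper, and the count ignores a protein pairing with a copy of itself.

```python
from math import comb

PROTEOME_SIZE = 20_000  # approximate figure quoted in the interview
SECONDS_PER_DAY = 86_400


def pair_count(n: int) -> int:
    """Number of unordered pairs of distinct proteins among n proteins."""
    return comb(n, 2)


def scan_days(n: int, seconds_per_pair: float) -> float:
    """Days needed to dock every pair one after another on a single machine."""
    return pair_count(n) * seconds_per_pair / SECONDS_PER_DAY


pairs = pair_count(PROTEOME_SIZE)
print(f"{pairs:,} protein pairs")  # 199,990,000 protein pairs

# Hypothetical per-pair docking times (assumptions, not benchmark results).
for label, seconds in [("slow docking tool", 500.0), ("fast docking tool", 1.0)]:
    print(f"{label}: {scan_days(PROTEOME_SIZE, seconds):,.0f} days of sequential compute")
```

Because the total grows with the square of the number of proteins, any reduction in per-pair docking time carries over directly to the whole scan.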

MC: Can you explain how you created EquiDock?

OG: EquiDock takes the 3D structures of two proteins and directly identifies which areas are likely to interact, which would otherwise be a complicated problem even for a biology expert. Discovering this information is then enough to work out how to rotate and orient the two proteins into their attached positions. EquiDock learns to capture complex docking patterns from a large set of ~41,000 protein structures using a geometrically constrained model with thousands of parameters that are dynamically and automatically adjusted until they solve the task very well.
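The step from matched interface points to a docked pose can be illustrated with the Kabsch algorithm, a standard technique for finding the rigid rotation and translation that best superimpose two sets of corresponding 3D points. The sketch below assumes NumPy is available and uses made-up points; the interview does not describe EquiDock's internals at this level, so it should be read as an illustration of the idea rather than the model's implementation. The function name `kabsch` and the demo values are hypothetical.

```python
import numpy as np


def kabsch(P: np.ndarray, Q: np.ndarray):
    """Best-fit rigid transform mapping points P onto matched points Q.

    P and Q are (N, 3) arrays whose rows correspond one-to-one.
    Returns (R, t) such that R @ p + t is close to q for each matched pair.
    """
    p_mean = P.mean(axis=0)
    q_mean = Q.mean(axis=0)
    # 3x3 cross-covariance of the centred point sets.
    H = (P - p_mean).T @ (Q - q_mean)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so R is a proper rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q_mean - R @ p_mean
    return R, t


# Demo with made-up data: rotate and shift random points, then recover the motion.
rng = np.random.default_rng(0)
P = rng.normal(size=(10, 3))
angle = np.pi / 3
R_true = np.array([
    [np.cos(angle), -np.sin(angle), 0.0],
    [np.sin(angle),  np.cos(angle), 0.0],
    [0.0,            0.0,           1.0],
])
t_true = np.array([1.0, -2.0, 0.5])
Q = P @ R_true.T + t_true

R, t = kabsch(P, Q)
print("rotation recovered:", np.allclose(R, R_true))
print("translation recovered:", np.allclose(t, t_true))
```

Once the corresponding points are known, the transform comes from a single small matrix decomposition, with no search over rotations.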

MC: What are the potential applications of EquiDock?

OG: As already mentioned, EquiDock can enable fast computational scanning of drug side effects. This goes along with massive-scale virtual screening of drugs and other types of molecules (eg, antibodies, nanobodies, peptides). This is needed in order to significantly reduce an astronomical search space that would otherwise be infeasible to explore with all our current experimental capabilities, even aggregated worldwide. A fast protein-protein docking method such as EquiDock, combined with a fast protein structure prediction model (such as AlphaFold2, developed by DeepMind), would help drug design, protein engineering, antibody generation, or understanding a drug’s mechanism of action, among many other exciting applications critically needed in our search for better disease treatments.

Octavian Ganea was speaking to Molly Campbell, Senior Science Writer for Technology Networks.