Deep Learning Driven Adaptive Sampling Methods for Targeting Cancer and COVID-19

Anda Trifan, University of Illinois at Urbana-Champaign


As data sets grow to the scale of hundreds of terabytes, artificial intelligence and machine learning (ML) techniques have become indispensable analysis tools. We have leveraged these tools to address the rapidly emerging COVID-19 pandemic by combining adaptive all-atom simulations of key coronavirus (SARS-CoV-2) proteins with deep learning models to understand important transitional events and conformational changes. We studied the intrinsic motions of the spike protein using adversarial autoencoders (AAEs). We also used this approach to derive an adaptive sampling technique, DeepDriveMD, which interleaves stages of simulation with ML techniques to identify compounds that can inhibit the papain-like protease (PLPro), a protein crucial to viral replication. Taken together, our approaches demonstrate at least two orders of magnitude faster sampling while providing quantitative insights into the biophysical mechanisms of important SARS-CoV-2 drug targets.
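The interleaving of simulation and ML stages described above can be illustrated with a minimal toy sketch. It is not the DeepDriveMD implementation or its API: the "simulation" is assumed to be a 2D random walk rather than all-atom molecular dynamics, and the learned AAE embedding is replaced by a simple k-nearest-neighbour novelty score on the raw coordinates. All function names and parameters (`simulate`, `novelty`, `adaptive_sampling`, `n_walkers`, and so on) are hypothetical.

```python
import random


def simulate(start, n_steps, rng, step=0.1):
    """Toy stand-in for an MD run: a 2D random walk beginning at `start`."""
    x, y = start
    frames = []
    for _ in range(n_steps):
        x += rng.gauss(0.0, step)
        y += rng.gauss(0.0, step)
        frames.append((x, y))
    return frames


def novelty(frame, frames, k=5):
    """Distance to the k-th nearest neighbour; larger means sparsely sampled.

    Assumption: this replaces the learned latent-space analysis of the
    real method with a plain geometric outlier score.
    """
    dists = sorted(
        ((frame[0] - f[0]) ** 2 + (frame[1] - f[1]) ** 2) ** 0.5 for f in frames
    )
    # dists[0] is the frame's distance to itself (0.0), so index k is the
    # k-th neighbour; clamp the index for very small data sets.
    return dists[min(k, len(dists) - 1)]


def adaptive_sampling(n_rounds=4, n_walkers=3, n_steps=50, seed=0):
    """Alternate simulation stages with an analysis stage that picks restarts."""
    rng = random.Random(seed)
    starts = [(0.0, 0.0)] * n_walkers
    all_frames = []
    for _ in range(n_rounds):
        # Simulation stage: run one short trajectory per restart point.
        for start in starts:
            all_frames.extend(simulate(start, n_steps, rng))
        # Analysis stage: rank every frame seen so far by novelty and
        # restart the next round from the least-explored ones.
        ranked = sorted(
            all_frames, key=lambda f: novelty(f, all_frames), reverse=True
        )
        starts = ranked[:n_walkers]
    return all_frames


if __name__ == "__main__":
    frames = adaptive_sampling()
    print(len(frames), "frames collected")
```

Restarting from the most novel frames steers compute toward poorly sampled regions instead of re-simulating well-explored ones, which is the general idea behind adaptive sampling; the speedup reported in the abstract refers to the authors' actual workflow, not to this toy.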

Abstract Author(s): Anda Trifan, Heng Ma, Alexander Brace, Lorenzo Casalino, Austin Clyde, Agastya Bhati, Shunzhou Wan, Matteo Turilli, Li Tan, Hyungro Lee, Peter Coveney, Shantenu Jha, Rick Stevens, Rommie Amaro, Arvind Ramanathan