The Known Unknown: Computational Identification of Promising Enzymatic Reactions and Associated Genes
Industrial biotechnology, which uses enzymes or whole organisms to produce chemical compounds, is increasingly being explored for use in industrial processes. However, the discovery of new biochemical pathways remains challenging, which hinders the application of industrial biotechnology to new processes. Often, many potential biochemical pathways for producing a compound are conceivable, and identifying them all manually is difficult and time-consuming. Many computational techniques for biochemical pathway prediction rely on databases of known enzymatic chemistry to identify promising pathways. Such methods are inherently restricted to previously discovered enzymatic reactions. Because our knowledge of enzymatic reactions is far from complete, limiting exploration to known reactions precludes the discovery of pathways that involve likely enzymatic reactions that have simply not yet been observed. The Broadbelt lab has developed a program, the Biochemical Network Integrated Computational Explorer (BNICE), to assist such research by using generalized reaction rules to predict probable enzymatic reactions. In this work, we describe a novel dynamic programming algorithm for automatically generating these reaction rules from databases of enzymatic reactions. We further demonstrate the use of these rules to study the biosynthesis of coenzyme Q, a key component of the electron transport chain, and present a novel network-based approach that uses these rules to identify gene candidates for the reactions in the predicted biochemical pathways. We believe these new computational techniques have the potential to greatly accelerate the discovery of new biochemical pathways and their application to industrial processes.