
AI delivers Michaelis constants as fuel for genome-scale metabolic models.

Albert A Antolin^1,2, Marta Cascante^3,4.

Abstract

Michaelis constants (Km) are essential to predict the catalytic rate of enzymes, but are not widely available. A new study in PLOS Biology uses artificial intelligence (AI) to accurately predict Km on a proteome-wide scale, paving the way for dynamic, genome-wide modeling of metabolism.

Entities: Chemical

MeSH:
Artificial Intelligence

Year: 2021 PMID: 34669692 PMCID: PMC8528274 DOI: 10.1371/journal.pbio.3001415

Source DB: PubMed Journal: PLoS Biol ISSN: 1544-9173 Impact factor: 8.029

The Michaelis–Menten equation was derived by Leonor Michaelis and Maud Menten to quantify the velocity of an enzymatic reaction from measurable concentrations of enzyme and substrate, even before the exact nature of enzymes was elucidated (Fig 1) [1]. Despite its limitations, its broad applicability, simplicity, and elegance have made it a cornerstone of biochemistry over the last century [1].
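For reference, the equation in its standard textbook form is v = Vmax·[S] / (Km + [S]), where v is the initial reaction velocity, [S] the substrate concentration, Vmax the maximum rate, and Km the Michaelis constant. The following is a minimal sketch of that relationship; the function name and the numerical values are illustrative assumptions, not taken from the article.

```python
def michaelis_menten_rate(substrate_conc, v_max, k_m):
    """Return the initial velocity v = v_max * [S] / (k_m + [S])."""
    if substrate_conc < 0:
        raise ValueError("substrate concentration must be non-negative")
    if k_m <= 0:
        raise ValueError("k_m must be positive")
    return v_max * substrate_conc / (k_m + substrate_conc)


# Illustrative values only (assumptions, not taken from the article).
V_MAX = 1.0   # maximum rate, arbitrary units
K_M = 1e-4    # Michaelis constant, in M

# Evaluate the rate below, at, and above the Michaelis constant.
for s in (K_M / 10, K_M, 10 * K_M):
    v = michaelis_menten_rate(s, V_MAX, K_M)
    print(f"[S] = {s:.1e} M -> v / Vmax = {v / V_MAX:.3f}")
```

When the substrate concentration equals Km, the computed rate is half of Vmax, and the rate approaches Vmax as the substrate concentration rises well above Km.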

Fig 1

Impact of Km availability on metabolic modeling.

Accurate and comprehensive AI prediction of Km values, the key parameters related to enzyme substrate saturation, for 47 model organisms can be used to simulate dynamic metabolic flux changes at genome scale, facilitating the full exploitation of metabolomics data and opening new avenues in drug target discovery and metabolic engineering. AI, artificial intelligence; Km, Michaelis constant; RNA-seq, RNA sequencing.

The Michaelis constant (Km) in the equation is a pseudo-equilibrium constant that corresponds to the substrate concentration at which an enzyme operates at half of its maximum catalytic rate (Fig 1) [2]. Moreover, under certain assumptions, Km is also an inverse measure of the affinity between the enzyme and its substrate [2]. Km values can vary widely, often between 10^-1 and 10^-7 M [2]. Therefore, determining Km is essential to predict the catalytic rate of product formation and ideal substrate concentrations. This is important not only for fundamental research in enzymology but also for modern industrial biocatalysis, among other applications.

Unfortunately, the experimental characterization of Km values is laborious and time-consuming, as it requires expressing and purifying enzymes and measuring their initial reaction rate at several substrate concentrations. Accordingly, Km values in public repositories exist for only a small fraction of enzymatic reactions (Fig 1) [3]. For example, Km values have been experimentally determined for less than 30% of Escherichia coli’s natural substrates (Fig 1) [3]. In turn, this lack of experimental data heavily limits the use of Km in systems biology and metabolic modeling.

Artificial intelligence (AI), empowered by the increasing availability of Big Data, is transforming many aspects of our lives and multiple research fields [4]. AI, a field rooted in the 1950s, can be broadly defined as an algorithm that can “learn” patterns from training datasets and apply this learning to make new predictions [4].
We often subdivide the field by type of “learning.” Machine learning (ML) uses hundreds of parameters that remain fully transparent to the researcher, but the ways in which they are combined are not always obvious. Deep learning (DL), in contrast, uses layered abstraction to identify key patterns in much more complex, sparse, and multidimensional data [4]. As recently illustrated by the impressive advances of Google’s DeepMind AlphaFold2 in protein structure prediction, AI holds great potential to transform areas of research by releasing large-scale predictions that empower researchers worldwide [5].

Now, a new study published in PLOS Biology by Kroll and colleagues uses AI to predict Km purely from protein and substrate information [6]. Their generalizable, organism-independent algorithm and predictions could have a transformative impact in several research fields.

The authors used Km values from public databases to train AI models with an increasing amount of additional substrate and protein information [6]. First, they compared 4 different molecular fingerprints, which are vectors commonly used to numerically represent small molecules. Interestingly, a task-specific molecular fingerprint of the substrate generated using a graph neural network outperformed 3 traditional predefined molecular fingerprints. This result illustrates how DL can also be used to identify the best molecular representation [6]. The authors then compared linear regression, an ML method, and a DL method to train the models. Perhaps surprisingly, the ML method (gradient boosting) outperformed the other approaches, illustrating that more complex models are not necessarily better. Finally, the authors used a cutting-edge deep numerical representation of the enzyme’s amino acid sequence, termed the UniRep vector, to provide information on the enzyme.
Interestingly, while the best model is the one using both enzyme and substrate information, the model using only substrate information outperforms the model using only enzyme information. The fact that information on the exact residues comprising the catalytic site could not be provided probably helps explain this discrepancy, but it is interesting to speculate that this information is partially encoded in the substrate, because the catalytic site has been optimized throughout evolution to fit the transition state of the substrate. The final model was appropriately validated using an independent dataset, and predicted Km values deviated from experimental values by only 4-fold on average. However, model performance was still increasing with the size of the training dataset, and, therefore, it will be important to continue improving the model as more experimental data become available, particularly regarding extreme values poorly represented in public datasets. Overall, Kroll and colleagues provide a very significant step forward that outperforms previous attempts at predicting Km.

Importantly, the authors not only provide the code in a public repository, but they also make available genome-scale Km predictions for 47 model organisms. We foresee that these invaluable predictions will open new avenues of research in multiple fields. In particular, we think they could be an important step toward dynamic, genome-scale metabolic models (GSMMs). GSMMs emerged in the last decade as powerful constraint-based modeling platforms to achieve quantitative predictions of metabolic fluxes through multiomics data integration [7,8]. GSMMs have been successfully used for metabolic engineering and to identify cancer drug targets [9,10], but they are limited by the use of reconstructed metabolic reaction maps based on stoichiometric linear equations and pseudo-steady-state assumptions.
One of the main bottlenecks is that kinetic parameters related to enzyme substrate saturation are not comprehensively available to be included in the equations describing enzyme reactions. This significantly limits model accuracy and yields a static model. A second, related bottleneck is that metabolomics data can only be integrated qualitatively, as metabolite concentrations cannot be calculated with current stoichiometric models. The proteome-wide Km predictions deposited by Kroll and colleagues could fuel a new generation of GSMMs that accurately predict dynamic metabolic flux maps to uncover new drug targets and boost our ability to quantitatively and accurately model metabolism.
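To illustrate why saturation parameters matter for dynamic modeling, the sketch below simulates a single hypothetical metabolite pool that is fed at a constant rate and consumed by one enzyme obeying Michaelis–Menten kinetics. It is a toy model under stated assumptions, not a GSMM and not the method of Kroll and colleagues: the function names, the forward-Euler integration scheme, and all parameter values are introduced here purely for illustration.

```python
def simulate_pool(v_in, v_max, k_m, m0=0.0, dt=0.01, steps=20000):
    """Forward-Euler integration of dM/dt = v_in - v_max * M / (k_m + M).

    Returns the list of metabolite concentrations at each time step.
    """
    m = m0
    trajectory = [m]
    for _ in range(steps):
        consumption = v_max * m / (k_m + m)  # Michaelis-Menten rate law
        m += dt * (v_in - consumption)
        trajectory.append(m)
    return trajectory


def steady_state_concentration(v_in, v_max, k_m):
    """Analytic steady state of the toy model: M* = k_m * v_in / (v_max - v_in)."""
    if not 0 <= v_in < v_max:
        raise ValueError("a steady state requires 0 <= v_in < v_max")
    return k_m * v_in / (v_max - v_in)


# Hypothetical parameters in arbitrary units (assumptions for illustration).
V_IN, V_MAX = 0.5, 1.0

# Compare two Km values that differ 4-fold.
for k_m in (0.1, 0.4):
    final = simulate_pool(V_IN, V_MAX, k_m)[-1]
    analytic = steady_state_concentration(V_IN, V_MAX, k_m)
    print(f"Km = {k_m}: simulated M = {final:.4f}, analytic M* = {analytic:.4f}")
```

In this toy model the steady-state flux equals the influx regardless of Km, which is all a purely stoichiometric description would report, whereas the metabolite concentration and its time course depend directly on Km. A 4-fold change in Km shifts the steady-state concentration 4-fold here; how such differences propagate through a real network would depend on the full model.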

8 in total

1. Transforming cancer drug discovery with Big Data and AI.

Authors: Paul Workman; Albert A Antolin; Bissan Al-Lazikani
Journal: Expert Opin Drug Discov Date: 2019-07-08 Impact factor: 6.098

2. Creation and analysis of biochemical constraint-based models using the COBRA Toolbox v.3.0.

Authors: Laurent Heirendt; Sylvain Arreckx; Thomas Pfau; Sebastián N Mendoza; Anne Richelle; Almut Heinken; Hulda S Haraldsdóttir; Jacek Wachowiak; Sarah M Keating; Vanja Vlasov; Stefania Magnusdóttir; Chiam Yu Ng; German Preciat; Alise Žagare; Siu H J Chan; Maike K Aurich; Catherine M Clancy; Jennifer Modamio; John T Sauls; Alberto Noronha; Aarash Bordbar; Benjamin Cousins; Diana C El Assal; Luis V Valcarcel; Iñigo Apaolaza; Susan Ghaderi; Masoud Ahookhosh; Marouen Ben Guebila; Andrejs Kostromins; Nicolas Sompairac; Hoai M Le; Ding Ma; Yuekai Sun; Lin Wang; James T Yurkovich; Miguel A P Oliveira; Phan T Vuong; Lemmer P El Assal; Inna Kuperstein; Andrei Zinovyev; H Scott Hinton; William A Bryant; Francisco J Aragón Artacho; Francisco J Planes; Egils Stalidzans; Alejandro Maass; Santosh Vempala; Michael Hucka; Michael A Saunders; Costas D Maranas; Nathan E Lewis; Thomas Sauter; Bernhard Ø Palsson; Ines Thiele; Ronan M T Fleming
Journal: Nat Protoc Date: 2019-03 Impact factor: 13.491

3. A guide to the Michaelis-Menten equation: steady state and beyond. (Review)

Authors: Bharath Srinivasan
Journal: FEBS J Date: 2021-07-31 Impact factor: 5.622

4. Harnessing synthetic lethality to predict the response to cancer treatment.

Authors: Joo Sang Lee; Avinash Das; Livnat Jerby-Arnon; Rand Arafeh; Noam Auslander; Matthew Davidson; Lynn McGarry; Daniel James; Arnaud Amzallag; Seung Gu Park; Kuoyuan Cheng; Welles Robinson; Dikla Atias; Chani Stossel; Ella Buzhor; Gidi Stein; Joshua J Waterfall; Paul S Meltzer; Talia Golan; Sridhar Hannenhalli; Eyal Gottlieb; Cyril H Benes; Yardena Samuels; Emma Shanks; Eytan Ruppin
Journal: Nat Commun Date: 2018-06-29 Impact factor: 14.919

5. BRENDA in 2019: a European ELIXIR core data resource.

Authors: Lisa Jeske; Sandra Placzek; Ida Schomburg; Antje Chang; Dietmar Schomburg
Journal: Nucleic Acids Res Date: 2019-01-08 Impact factor: 16.971

6. Integrating systemic and molecular levels to infer key drivers sustaining metabolic adaptations.

Authors: Pedro de Atauri; Míriam Tarrado-Castellarnau; Josep Tarragó-Celada; Carles Foguet; Effrosyni Karakitsou; Josep Joan Centelles; Marta Cascante
Journal: PLoS Comput Biol Date: 2021-07-23 Impact factor: 4.475

7. Highly accurate protein structure prediction with AlphaFold.

Authors: John Jumper; Richard Evans; Alexander Pritzel; Tim Green; Michael Figurnov; Olaf Ronneberger; Kathryn Tunyasuvunakool; Russ Bates; Augustin Žídek; Anna Potapenko; Alex Bridgland; Clemens Meyer; Simon A A Kohl; Andrew J Ballard; Andrew Cowie; Bernardino Romera-Paredes; Stanislav Nikolov; Rishub Jain; Demis Hassabis; Jonas Adler; Trevor Back; Stig Petersen; David Reiman; Ellen Clancy; Michal Zielinski; Martin Steinegger; Michalina Pacholska; Tamas Berghammer; Sebastian Bodenstein; David Silver; Oriol Vinyals; Andrew W Senior; Koray Kavukcuoglu; Pushmeet Kohli
Journal: Nature Date: 2021-07-15 Impact factor: 49.962


1 in total

1. renz: An R package for the analysis of enzyme kinetic data.

Authors: Juan Carlos Aledo
Journal: BMC Bioinformatics Date: 2022-05-16 Impact factor: 3.307
