David B Sauer (1), Da-Neng Wang (1).
(1) Department of Cell Biology, and The Helen L. and Martin S. Kimmel Center for Biology and Medicine, Skirball Institute of Biomolecular Medicine, New York University School of Medicine, New York, New York, USA.
Abstract
MOTIVATION: Optimal growth temperature is a fundamental characteristic of all living organisms. Knowledge of this temperature is central to the study of a prokaryote, the thermal stability and temperature-dependent activity of its genes, and the bioprospecting of its genome for thermally adapted proteins. While high-throughput sequencing methods have dramatically increased the availability of genomic information, the growth temperatures of the source organisms are often unknown. This limits the study and technological application of these species and their genomes. Here, we present a novel method for the prediction of growth temperatures of prokaryotes using only genomic sequences.
RESULTS: By applying the reverse ecology principle that an organism's genome includes identifiable adaptations to its native environment, we can predict a species' optimal growth temperature with a root-mean-square error of 5.17°C and a coefficient of determination of 0.835. The accuracy can be further improved for specific taxonomic clades or by excluding psychrophiles. This method provides a valuable tool for the rapid calculation of organism growth temperature when only the genome sequence is known.
AVAILABILITY AND IMPLEMENTATION: Source code, genomes analyzed and features calculated are available at: https://github.com/DavidBSauer/OGT_prediction.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
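For readers unfamiliar with the two accuracy metrics quoted in the results, the following is a minimal sketch of how a root-mean-square error and a coefficient of determination can be computed from paired observed and predicted growth temperatures. It is not taken from the authors' repository: the function names `rmse` and `r_squared` and the temperature values are hypothetical, and the sketch assumes the common definition R² = 1 - SS_res / SS_tot, which may differ from the definition used in the paper (for example, a squared Pearson correlation).

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the inputs (here, degrees C)."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Made-up optimal growth temperatures (degrees C), for illustration only.
observed_ogt = [10.0, 30.0, 37.0, 60.0, 85.0]
predicted_ogt = [14.0, 28.0, 35.0, 65.0, 80.0]

print(f"RMSE: {rmse(observed_ogt, predicted_ogt):.2f} degrees C")
print(f"R^2:  {r_squared(observed_ogt, predicted_ogt):.3f}")
```

A lower RMSE means predictions sit closer to the measured temperatures on average, while an R² nearer to 1 means the predictions account for more of the variation in growth temperature across species.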