Syed Shujaat Ali Zaidi1,2,3, Masood Ur Rehman Kayani4, Xuegong Zhang1, Younan Ouyang5, Imran Haider Shamsi6.
1. Bioinformatics Division, Beijing National Research Institute for Information Science and Technology (BNRIST), Department of Automation, Tsinghua University, Beijing, 100084, People's Republic of China.
2. Bioscience Department, COMSATS Institute of Information Technology, Islamabad, 44000, Pakistan.
3. Center for Innovation in Brain Science, University of Arizona, Tucson, 85719, USA.
4. Center for Microbiota and Immunological Diseases, Shanghai General Hospital, Shanghai Institute of Immunology, Shanghai Jiao Tong University, School of Medicine, Shanghai, 2000025, People's Republic of China.
5. China National Rice Research Institute (CNRRI), 28 Shuidaosuo Rd, Fuyang, Hangzhou, 311400, People's Republic of China.
6. Department of Agronomy, College of Agriculture and Biotechnology, Key Laboratory of Crop Germplasm Resource, Zhejiang University, Hangzhou, 310058, People's Republic of China. drimran@zju.edu.cn.
Abstract
BACKGROUND: Efficient regulation of bacterial genes in response to environmental stimuli results in unique gene clusters known as operons. The lack of a complete operonic reference and of functional information makes the prediction of metagenomic operons a challenging task; addressing it opens new perspectives on the interpretation of host-microbe interactions.
RESULTS: In this work, we identified whole-genome and metagenomic operons via MetaRon (Metagenome and whole-genome opeRon prediction pipeline). MetaRon identifies operons without any experimental or functional information. It was applied to datasets with different levels of complexity and information: a whole genome, a simulated mixture of three whole genomes (E. coli MG1655, Mycobacterium tuberculosis H37Rv and Bacillus subtilis str. 16), an E. coli c20 draft genome extracted from chicken gut and, finally, 145 whole-metagenome samples from human gut. MetaRon consistently achieved high operon prediction sensitivity, specificity and accuracy on the E. coli whole genome (97.8, 94.1 and 92.4%), the simulated genome (93.7, 75.5 and 88.1%) and E. coli c20 (87, 91 and 88%), respectively. Finally, we identified 1,232,407 unique operons from 145 paired-end human gut metagenome samples. We also report a strong association of type 2 diabetes with maltose phosphorylase (K00691), 3-deoxy-D-glycero-D-galacto-nononate 9-phosphate synthase (K21279) and an uncharacterized protein (K07101).
CONCLUSION: With MetaRon, we were able to remove two notable limitations of existing whole-genome operon prediction methods: (1) generalizability (the ability to predict operons in unrelated bacterial genomes), and (2) whole-genome and metagenomic data management. We also demonstrate the use of operons as a subset to represent the trends of secondary metabolites in whole-metagenome data and the role of secondary metabolites in the occurrence of disease conditions.
Using operonic data from a metagenome to study secondary metabolic trends significantly reduces the data volume to a smaller, more precise subset. Furthermore, the identification of metabolic pathways associated with the occurrence of type 2 diabetes (T2D) presents another dimension for analyzing the human gut metagenome. To our knowledge, this study is the first organized effort to predict metagenomic operons and perform a detailed analysis in association with a disease, in this case type 2 diabetes. Applying MetaRon to metagenomic data at diverse scales will help in understanding gene regulation and in therapeutic metagenomics.
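The abstract reports sensitivity, specificity and accuracy but does not state how they were computed. As a minimal sketch, assuming the standard confusion-matrix definitions (the paper's own definitions may differ), the three metrics can be derived from counts of true/false positive and negative operon calls. The class name, function names and the counts in the usage example are hypothetical illustrations, not MetaRon code or data from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical tally of operon calls compared against a reference set."""
    tp: int  # predicted operonic, reference operonic
    fp: int  # predicted operonic, reference non-operonic
    tn: int  # predicted non-operonic, reference non-operonic
    fn: int  # predicted non-operonic, reference operonic


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of reference positives that were recovered: TP / (TP + FN)."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of reference negatives that were rejected: TN / (TN + FP)."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all calls that agree with the reference."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.tn + c.fn)


# Illustrative counts only; these are NOT results from the study.
example = ConfusionCounts(tp=90, fp=5, tn=95, fn=10)
```

With the illustrative counts above, `sensitivity(example)` is 90/100, `specificity(example)` is 95/100 and `accuracy(example)` is 185/200.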