Yuanqiang Zou, Wenbin Xue, Guangwen Luo, Ziqing Deng, Panpan Qin, Ruijin Guo, Haipeng Sun, Yan Xia, Suisha Liang, Ying Dai, Daiwei Wan, Rongrong Jiang, Lili Su, Qiang Feng, Zhuye Jie, Tongkun Guo, Zhongkui Xia, Chuan Liu, Jinghong Yu, Yuxiang Lin, Shanmei Tang, Guicheng Huo, Xun Xu, Yong Hou, Xin Liu, Jian Wang, Huanming Yang, Karsten Kristiansen, Junhua Li, Huijue Jia, Liang Xiao.
Abstract
Reference genomes are essential for metagenomic analyses and functional characterization of the human gut microbiota. We present the Culturable Genome Reference (CGR), a collection of 1,520 nonredundant, high-quality draft genomes generated from >6,000 bacteria cultivated from fecal samples of healthy humans. Of the 1,520 genomes, which were chosen to cover all major bacterial phyla and genera in the human gut, 264 are not represented in existing reference genome catalogs. We show that this increase in the number of reference bacterial genomes improves the rate of mapping metagenomic sequencing reads from 50% to >70%, enabling higher-resolution descriptions of the human gut microbiome. We use the CGR genomes to annotate functions of 338 bacterial species, showing the utility of this resource for functional studies. We also carry out a pan-genome analysis of 38 important human gut species, which reveals the diversity and specificity of functional enrichment between their core and dispensable genomes.
Year: 2019 PMID: 30718868 PMCID: PMC6784896 DOI: 10.1038/s41587-018-0008-8
Source DB: PubMed Journal: Nat Biotechnol ISSN: 1087-0156 Impact factor: 54.908
Fig. 1. Phylogenetic tree of 1,520 isolated gut bacteria based on whole-genome sequences.
The 1,520 high-quality genomes in CGR are classified into 338 species-level clusters (ANI ≥ 95%) based on their whole-genome sequences. Bacterial species from Firmicutes are colored in orange; Bacteroidetes, blue; Proteobacteria, green; Actinobacteria, violet; Fusobacteria, gray. Novel genera and species are highlighted by red and orange branches, respectively. The bar on the outermost layer indicates the number of genomes archived in each cluster. Rhizobium selenitireducens ATCC BAA 1503 was used as an outgroup for phylogenetic analysis.
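To make the ANI ≥ 95% grouping concrete, here is a minimal sketch of species-level clustering from pairwise ANI values. It assumes single-linkage clustering, which is an illustrative choice and not necessarily the procedure used in the study; the genome names and ANI values are hypothetical.

```python
from itertools import combinations


def cluster_by_ani(genomes, ani, threshold=95.0):
    """Group genomes so that any pair with ANI >= threshold shares a cluster.

    Single-linkage via union-find; `ani` maps (genome_a, genome_b) to a
    percentage. Pairs missing from `ani` are treated as unlinked.
    """
    parent = {g: g for g in genomes}

    def find(g):
        # Walk to the cluster root, compressing the path on the way.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in combinations(genomes, 2):
        value = ani.get((a, b), ani.get((b, a)))
        if value is not None and value >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for g in genomes:
        clusters.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical genomes and ANI percentages, for illustration only.
genomes = ["g1", "g2", "g3", "g4"]
ani = {
    ("g1", "g2"): 98.7,
    ("g2", "g3"): 96.1,
    ("g1", "g3"): 94.8,
    ("g3", "g4"): 82.0,
}
clusters = cluster_by_ani(genomes, ani)
print(clusters)
```

With single linkage, g1 and g3 end up in the same cluster through g2 even though their direct ANI is below 95%. A stricter scheme such as complete linkage would split them.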
Fig. 2. Contribution of CGR to metagenomic and SNP analyses.
a, The read mapping ratio of a previous metagenomic analysis (IGCR) was significantly improved by CGR (IGCR + CGR) in fecal samples from Chinese (n = 368, P = 6 × 10^−78), American (n = 139, P = 2 × 10^−17), Spanish (n = 320, P = 4 × 10^−50) and Danish (n = 109, P = 4 × 10^−17) individuals. The significance of improvement was determined by a two-sided Wilcoxon rank-sum test. IGCR, 3,449 reference genomes used in the IGC study[6]; CGR, 1,520 reference genomes generated in this study. Each box plot illustrates the estimated median (center line), upper and lower quartiles (box limits), 1.5 × interquartile range (whiskers), and outliers (points) of the read mapping ratio. b, Reference genomes for SNP analysis generated in a previous study[17] (IGCR, green) and in the current study (CGR, blue). The unclassified species of reference genomes in CGR are highlighted in violet.
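As an illustration of the test named in panel a, the following is a minimal sketch of a two-sided Wilcoxon rank-sum test using the normal approximation with a tie correction. The mapping ratios are hypothetical values invented for the example, not data from the study, and the study's own software and exact settings are not stated here.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, tie-corrected).

    Returns (z, p), where z is the standardized U statistic of sample x.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n1, n2 = len(x), len(y)
    n = n1 + n2
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t**3 - t
        i = j
    r1 = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u1 = r1 - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u1 - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p


# Hypothetical read mapping ratios for one cohort (illustrative only).
igcr = [0.48, 0.52, 0.50, 0.47, 0.55, 0.51, 0.49, 0.53]
igcr_cgr = [0.71, 0.74, 0.69, 0.72, 0.76, 0.70, 0.73, 0.75]
z, p = rank_sum_test(igcr, igcr_cgr)
print(f"z = {z:.3f}, two-sided P = {p:.2e}")
```

The normal approximation is reasonable for cohort sizes like those in the caption (n in the hundreds). For very small samples an exact test would be preferable.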
Fig. 3. Functional landscape of gut microbiota.
The gene abundance of listed functions in 1,520 genomes of CGR is indicated by the color depth in the heat map. The listed functions are enriched in specific phyla or genera (a) or might have deleterious or beneficial effects on human health (b). The bacterial species are ordered according to the phylogenetic tree in Fig. 1. The relative positions of phyla and genera in the phylogenetic tree are indicated by the colored ribbons and dots, respectively.
Fig. 4. Pan-genome analysis of 38 representative clusters.
a, The distribution of genes involved in the butyrate biosynthesis pathway in the core genomes (pink) and dispensable genomes (cyan). The two pathways for butyrate biosynthesis from acetyl-CoA are shown below. The species with a complete butyrate biosynthesis pathway in the core genome and pan-genome are highlighted in pink and cyan, respectively. Thl, thiolase; Hdb, β-hydroxybutyryl-CoA dehydrogenase; Cro, crotonase; Bcd, butyryl-CoA dehydrogenase (including electron transfer protein α and β subunits); But, butyryl-CoA:acetate CoA transferase; Ptb, phosphate butyryltransferase; Buk, butyrate kinase. b, The distribution of ARGs in the core genomes (pink) and dispensable genomes (cyan).
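The core/dispensable split in panel a can be sketched as follows. This sketch assumes that a core gene is one present in every genome of a cluster, that the dispensable genome is the rest of the pan-genome, and that the two butyrate routes share Thl, Hdb, Cro and Bcd and differ only in the final step (But, or Ptb plus Buk). The strain names and gene contents are hypothetical, and the study's exact criteria may differ.

```python
# Enzyme abbreviations follow the figure caption.
SHARED_STEPS = {"Thl", "Hdb", "Cro", "Bcd"}
TERMINAL_ROUTES = [{"But"}, {"Ptb", "Buk"}]  # assumed alternative final steps


def split_pan_genome(presence):
    """Return (core, dispensable) gene sets for one species cluster.

    `presence` maps each genome name to the set of genes it carries.
    """
    gene_sets = list(presence.values())
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return core, pan - core


def has_complete_pathway(genes):
    """True if the shared steps and at least one terminal route are present."""
    return SHARED_STEPS <= genes and any(r <= genes for r in TERMINAL_ROUTES)


# Hypothetical cluster of three strains (illustrative only).
presence = {
    "strain_A": {"Thl", "Hdb", "Cro", "Bcd", "But"},
    "strain_B": {"Thl", "Hdb", "Cro", "Bcd", "But", "Ptb"},
    "strain_C": {"Thl", "Hdb", "Cro", "Bcd", "Buk"},
}
core, dispensable = split_pan_genome(presence)
complete_in_core = has_complete_pathway(core)
complete_in_pan = has_complete_pathway(core | dispensable)
print(sorted(core), sorted(dispensable), complete_in_core, complete_in_pan)
```

In this toy cluster the pathway is complete only at the pan-genome level, which corresponds to the species highlighted in cyan in the figure.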