Guangyang Wang 1, Shenghui Li 2, Qiulong Yan 1, Ruochun Guo 3, Yue Zhang 3, Fang Chen 1, Xiangge Tian 4, Qingbo Lv 3, Hao Jin 3, Xiaochi Ma 5, Yufang Ma 6.
1. Department of Microbiology, College of Basic Medical Sciences, Dalian Medical University, Dalian 116044, China.
2. Puensum Genetech Institute, Wuhan 430076, China; Key Laboratory of Precision Nutrition and Food Quality, Department of Nutrition and Health, China Agricultural University, Beijing 100083, China.
3. Puensum Genetech Institute, Wuhan 430076, China.
4. Department of Microbiology, College of Basic Medical Sciences, Dalian Medical University, Dalian 116044, China; Pharmaceutical Research Center, Second Affiliated Hospital, Dalian Medical University, Dalian, China.
5. Pharmaceutical Research Center, Second Affiliated Hospital, Dalian Medical University, Dalian, China.
6. Department of Microbiology, College of Basic Medical Sciences, Dalian Medical University, Dalian 116044, China. Electronic address: yufangma@dmu.edu.cn.
Abstract
INTRODUCTION: Viruses in the human gut have been linked to health and disease. Deciphering the gut virome depends on metagenomic sequencing of virus-like particles (VLPs) purified from fecal specimens. A major limitation of conventional viral metagenomic sequencing is the low recovery of viral genomes from the metagenomic dataset.
OBJECTIVES: To develop an optimal method for viral amplification and metagenomic sequencing that maximizes the recovery of viral genomes.
METHODS: We performed parallel virus enrichment and DNA extraction to generate ~30 viral DNA samples from each of 5 fresh fecal specimens and conducted experiments that included 1) optimizing the cycle number for high-fidelity enzyme-based PCR amplification, 2) evaluating the reproducibility of the whole optimized viral metagenomic experimental process, 3) evaluating the reliability of multiple displacement amplification (MDA), 4) testing the capability of long-read sequencing to improve viral metagenomic assembly, and 5) comparing viral metagenomic and bulk metagenomic approaches.
RESULTS: The optimal cycle number for PCR amplification was 15. We verified the reliability of MDA and the effectiveness of long-read sequencing. Using the optimized protocol, we recovered 151 high-quality viral genomes from the combined short-read and long-read sequencing dataset. Genomic analysis of these viruses showed that most (60.3%) were previously unknown and revealed a remarkable diversity of viral functions, notably 206 viral auxiliary metabolic genes. Finally, we found significant differences in the efficiency and coverage of viral identification between the viral metagenomic and bulk metagenomic approaches.
CONCLUSIONS: Our study demonstrates the potential of optimized experimental and sequencing strategies for recovering viral genomes from fecal specimens, which will facilitate future research on the genome-level characterization of complex viral communities.