Xiwei Sun (1), Yi Han (1), Liyuan Zhou (1), Enguo Chen (1), Bingjian Lu (2), Yong Liu (3), Xiaoqing Pan (3), Allen W Cowley (3), Mingyu Liang (3), Qingbiao Wu (4), Yan Lu (2), Pengyuan Liu (1).
1. Department of Respiratory Medicine, Sir Run Run Shaw Hospital and Institute of Translational Medicine.
2. Center for Uterine Cancer Diagnosis & Therapy Research of Zhejiang Province, Women's Reproductive Health Key Laboratory of Zhejiang Province, Department of Gynecologic Oncology, Women's Hospital and Institute of Translational Medicine, School of Medicine, Zhejiang University, Hangzhou, Zhejiang, China.
3. Department of Physiology, Center of Systems Molecular Medicine, Medical College of Wisconsin, Milwaukee, USA.
4. Department of Mathematics, Zhejiang University, Hangzhou, Zhejiang, China.
Abstract
Motivation: The rapid development of next-generation sequencing technology provides an opportunity to study genome-wide DNA methylation at single-base resolution. However, depletion of unmethylated cytosines makes it challenging to align bisulfite-converted sequencing reads to a large reference. Software tools for aligning methylation reads have not yet been comprehensively evaluated, especially for the widely used reduced representation bisulfite sequencing (RRBS), which involves enrichment for CpG islands (CGIs).
Results: We developed a simulator, RRBSsim, specifically for benchmarking analysis of RRBS data. We extensively compared seven mapping algorithms for methylation analysis in both real and simulated RRBS data. Eighteen lung tumors and matched adjacent tissues were sequenced using RRBS protocols. Our empirical evaluation found that methylation results were less consistent between software tools for CpG sites with low sequencing depth or intermediate methylation levels, and for CpG sites on CGI shores or in gene bodies. These observations were confirmed by simulations, which indicated that software tools generally had lower recall in detecting these vulnerable CpG sites and lower precision in estimating their methylation levels. Among the software tools tested, bwa-meth and BS-Seeker2 (bowtie2) are currently our preferred aligners for RRBS data in terms of recall, precision and speed. Existing aligners cannot efficiently handle moderately methylated CpG sites or CpG sites on CGI shores or in gene bodies. Methylation results from these vulnerable CpG sites should be interpreted with caution. Our study reveals several important features inherent in methylation data, and RRBSsim provides guidance to advance sequence-based methylation data analysis and methodological development.
Availability and implementation: RRBSsim is a simulator for benchmarking analysis of RRBS data and its source code is available at https://github.com/xwBio/RRBSsim or https://github.com/xwBio/Docker-RRBSsim.
Supplementary information: Supplementary data are available at Bioinformatics online.
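To make the two evaluation notions in the abstract concrete, the following is a minimal sketch of how recall of CpG detection and accuracy of methylation-level estimates might be scored against simulated ground truth. It is not taken from RRBSsim, and the abstract does not give the paper's exact metric definitions. The function names, the `min_depth` cutoff of 5 reads and the use of mean absolute error as a stand-in for "precision of estimating methylation levels" are all assumptions made for illustration.

```python
from typing import Dict, Tuple

Site = Tuple[str, int]    # (chromosome, position); hypothetical site key
Counts = Tuple[int, int]  # (methylated reads, total reads) from an aligner


def methylation_level(methylated: int, total: int) -> float:
    """Fraction of reads at a CpG site that support methylation."""
    return methylated / total


def evaluate(truth: Dict[Site, float],
             called: Dict[Site, Counts],
             min_depth: int = 5) -> Tuple[float, float]:
    """Return (recall, mean absolute error) against simulated truth.

    A site counts as detected only if its depth is at least min_depth
    (an assumed cutoff, not one stated in the abstract).
    """
    covered = {
        site: methylation_level(m, n)
        for site, (m, n) in called.items()
        if n >= min_depth
    }
    detected = [site for site in truth if site in covered]
    recall = len(detected) / len(truth) if truth else 0.0
    if detected:
        mae = sum(abs(covered[s] - truth[s]) for s in detected) / len(detected)
    else:
        mae = float("nan")
    return recall, mae


# Toy data: four simulated CpG sites with known methylation levels.
truth = {
    ("chr1", 100): 1.0,
    ("chr1", 200): 0.5,
    ("chr1", 300): 0.0,
    ("chr1", 400): 0.5,
}
# Hypothetical aligner output; the last site has too few reads to count.
called = {
    ("chr1", 100): (10, 10),
    ("chr1", 200): (3, 10),
    ("chr1", 300): (0, 8),
    ("chr1", 400): (1, 2),
}

recall, mae = evaluate(truth, called)
print(f"recall={recall:.2f}, mean absolute error={mae:.3f}")
```

In this toy data the low-depth site is missed and the intermediate-methylation site carries all of the estimation error, which mirrors the kinds of sites the abstract calls vulnerable.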