Jing Wang, Leon Raskin, David C Samuels, Yu Shyr, Yan Guo
Center for Quantitative Sciences and Department of Medicine and Center for Human Genetics Research, Vanderbilt University, Nashville, TN 37212, USA.
Abstract
MOTIVATION: The transition/transversion (Ti/Tv) ratio and the heterozygous/nonreference-homozygous (het/nonref-hom) ratio are commonly computed in genetic studies as quality control (QC) measures. These two ratios also help in understanding the patterns of DNA sequence evolution.

RESULTS: To thoroughly understand these two genomic measures, we performed a study using genotype data released by the 1000 Genomes Project (1000G) (N=1092). Two additional datasets (N=581 and N=6) were used to validate our findings from the 1000G dataset. We compared the two ratios across continental ancestries, genome regions and gene functionality. We found that the Ti/Tv ratio can be used as a quality indicator for single nucleotide polymorphisms inferred from high-throughput sequencing data. The Ti/Tv ratio varies greatly by genome region and functionality, but not by ancestry. The het/nonref-hom ratio varies greatly by ancestry, but not by genome region or functionality. Furthermore, extreme guanine + cytosine content (either high or low) is negatively associated with the magnitude of the Ti/Tv ratio. Thus, when performing QC assessment using these two measures, care must be taken to apply the correct thresholds based on ancestry and genome region. Failure to take these considerations into account at the QC stage will bias any downstream analysis.

CONTACT: yan.guo@vanderbilt.edu

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
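To make the two QC measures concrete, the following is a minimal sketch of how each ratio could be computed from a list of SNP calls. It is not the authors' pipeline: the function names, the (ref, alt) tuple input, and the VCF-style genotype strings such as `0/1` are assumptions chosen for illustration. A transition is a purine-to-purine (A/G) or pyrimidine-to-pyrimidine (C/T) change; every other single-base substitution is a transversion.

```python
from typing import Iterable, Tuple

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def is_transition(ref: str, alt: str) -> bool:
    """Return True for A<->G or C<->T substitutions, False for transversions."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def ti_tv_ratio(snps: Iterable[Tuple[str, str]]) -> float:
    """Ti/Tv ratio over an iterable of (ref, alt) SNP alleles."""
    ti = tv = 0
    for ref, alt in snps:
        if is_transition(ref, alt):
            ti += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return ti / tv


def het_nonref_hom_ratio(genotypes: Iterable[str]) -> float:
    """het/nonref-hom ratio over diploid genotype strings (assumed VCF-style).

    Assumption: '0' denotes the reference allele, and any genotype with two
    different alleles (including e.g. '1/2') is counted as heterozygous.
    """
    het = nonref_hom = 0
    for gt in genotypes:
        a, b = gt.replace("|", "/").split("/")
        if "." in (a, b):
            continue  # skip missing calls
        if a != b:
            het += 1
        elif a != "0":
            nonref_hom += 1
        # homozygous-reference calls (0/0) contribute to neither count
    if nonref_hom == 0:
        raise ValueError("no nonreference-homozygous calls; ratio is undefined")
    return het / nonref_hom


if __name__ == "__main__":
    example_snps = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C")]
    example_gts = ["0/1", "0|1", "1/1", "0/0", "./."]
    print(ti_tv_ratio(example_snps))
    print(het_nonref_hom_ratio(example_gts))
```

In practice the ratios would be computed per sample and, following the findings above, compared against thresholds appropriate to the genome region (for Ti/Tv) and the ancestry (for het/nonref-hom) rather than against a single global cutoff.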