MOTIVATION: Copy number variation (CNV) is a type of structural variation, usually defined as a genomic segment of 1 kb or larger that is present in a variable number of copies relative to a reference genome. The screening and ranking algorithm (SaRa) was recently proposed as an efficient approach to multiple change-point detection and can be applied to CNV detection. However, some practical issues arise when SaRa is applied to single nucleotide polymorphism data. RESULTS: In this study, we propose a modified SaRa for CNV detection to address these issues. First, we apply quantile normalization to the original intensities to guarantee that SaRa, which is based on a normal mean model, is robust. Second, we propose a novel normal mixture model, coupled with a modified Bayesian information criterion, for selecting candidate change-points and then clustering the potential CNV segments into copy number states. Simulations showed that the modified SaRa is robust in identifying change-points and performs better than the circular binary segmentation (CBS) method. By applying the modified SaRa to real data from the HapMap project, we illustrate its performance in detecting CNV segments. In conclusion, our modified SaRa method improves SaRa theoretically and numerically for identifying CNVs with high-throughput genotyping data. AVAILABILITY AND IMPLEMENTATION: The modSaRa package is implemented in R and is freely available at http://c2s2.yale.edu/software/modSaRa. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
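The two screening ingredients named in the abstract, quantile normalization followed by a SaRa-style local scan, can be illustrated with a minimal Python sketch. This is not the modSaRa implementation: the function names, the rank-based mapping to standard normal quantiles, the window size `h` and the fixed `threshold` are all assumptions made for illustration, and the normal mixture model with the modified Bayesian information criterion described in the abstract is not reproduced here. The local diagnostic is assumed to be the difference between the mean of the `h` points to the right of a position and the mean of the `h` points to its left.

```python
import random
from statistics import NormalDist, mean


def quantile_normalize(values):
    """Map values to standard normal quantiles by rank (assumed variant).

    Ties are broken by position; the smallest value gets the lowest quantile.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    nd = NormalDist()
    out = [0.0] * n
    for rank, idx in enumerate(order):
        out[idx] = nd.inv_cdf((rank + 0.5) / n)
    return out


def local_diagnostic(y, h):
    """D[x] = mean(y[x:x+h]) - mean(y[x-h:x]); 0 where a window does not fit."""
    n = len(y)
    d = [0.0] * n
    for x in range(h, n - h + 1):
        d[x] = mean(y[x:x + h]) - mean(y[x - h:x])
    return d


def sara_candidates(y, h, threshold):
    """Positions where |D| exceeds threshold and is the maximum within +/- h.

    A returned position x means the mean is assumed to change between
    y[x-1] and y[x]. The fixed threshold is an illustrative stand-in for
    a model-based selection step.
    """
    d = [abs(v) for v in local_diagnostic(y, h)]
    n = len(y)
    hits = []
    for x in range(h, n - h + 1):
        lo, hi = max(0, x - h), min(n, x + h + 1)
        if d[x] > threshold and d[x] == max(d[lo:hi]):
            hits.append(x)
    return hits


if __name__ == "__main__":
    # Hypothetical simulated intensities: one gained segment in positions 100-149.
    rng = random.Random(1)
    signal = [0.0] * 100 + [1.5] * 50 + [0.0] * 100
    noisy = [s + rng.gauss(0.0, 0.3) for s in signal]
    normalized = quantile_normalize(noisy)
    print(sara_candidates(normalized, h=10, threshold=1.0))
```

On noisy data the reported positions depend on `h` and `threshold`, which is one reason a model-based selection step such as the one proposed in the abstract is preferable to a hand-picked cutoff.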