Optimal Sparse Segment Identification with Application in Copy Number Variation Analysis.

X Jessie Jeng, T Tony Cai, Hongzhe Li.

Abstract

Motivated by DNA copy number variation (CNV) analysis based on high-density single nucleotide polymorphism (SNP) data, we consider the problem of detecting and identifying sparse short segments in a long one-dimensional sequence of data with additive Gaussian white noise, where the number, length and location of the segments are unknown. We present a statistical characterization of the identifiable region of a segment where it is possible to reliably separate the segment from noise. An efficient likelihood ratio selection (LRS) procedure for identifying the segments is developed, and the asymptotic optimality of this method is presented in the sense that the LRS can separate the signal segments from the noise as long as the signal segments are in the identifiable regions. The proposed method is demonstrated with simulations and analysis of a real data set on identification of copy number variants based on high-density SNP data. The results show that the LRS procedure can yield greater gain in power for detecting the true segments than some standard signal identification methods.
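
The abstract does not spell out the LRS steps, so the following is only a minimal sketch of a likelihood-ratio-style scan under stated assumptions, not the authors' implementation. It assumes unit-variance Gaussian noise, scores every interval up to a maximum length by its standardized sum, keeps intervals whose absolute score exceeds a threshold, and then greedily selects the strongest non-overlapping ones. The function name `lrs_sketch`, the parameter `max_len`, and the default threshold sqrt(2 log n) are illustrative choices, not details taken from this record.

```python
import math


def lrs_sketch(x, max_len, threshold=None):
    """Greedy interval scan (illustrative sketch, not the published procedure).

    Returns a sorted list of (start, end, stat) with end exclusive.
    Assumes the noise is Gaussian with mean 0 and variance 1.
    """
    n = len(x)
    if n == 0:
        return []
    if threshold is None:
        # Assumed default: a universal-type threshold for Gaussian noise.
        threshold = math.sqrt(2.0 * math.log(n))

    # Prefix sums let each interval sum be computed in O(1).
    prefix = [0.0]
    for value in x:
        prefix.append(prefix[-1] + value)

    # Score every interval of length <= max_len by sum / sqrt(length).
    candidates = []
    for start in range(n):
        for length in range(1, min(max_len, n - start) + 1):
            end = start + length
            stat = (prefix[end] - prefix[start]) / math.sqrt(length)
            if abs(stat) > threshold:
                candidates.append((abs(stat), start, end, stat))

    # Strongest first; keep an interval only if it overlaps nothing kept so far.
    candidates.sort(reverse=True)
    selected = []
    for _, start, end, stat in candidates:
        if all(end <= s or start >= e for s, e, _ in selected):
            selected.append((start, end, stat))
    return sorted(selected)


if __name__ == "__main__":
    # Noise-free toy input: one segment of height 3 on positions 10..14.
    toy = [0.0] * 50
    for i in range(10, 15):
        toy[i] = 3.0
    print(lrs_sketch(toy, max_len=10))
```

The scan costs O(n × `max_len`) interval evaluations, which stays practical when the segments of interest are short relative to the sequence, the setting the abstract describes.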

Keywords:  DNA copy number; Likelihood ratio selection; multiple testing; signal detection

Year:  2012        PMID: 23543902      PMCID: PMC3610602          DOI: 10.1198/jasa.2010.tm10083

Source DB:  PubMed          Journal:  J Am Stat Assoc        ISSN: 0162-1459            Impact factor:   5.033


  15 in total

Review 1.  Structural variation in the human genome.

Authors:  Lars Feuk; Andrew R Carson; Stephen W Scherer
Journal:  Nat Rev Genet       Date:  2006-02       Impact factor: 53.242

2.  Increase in GSK3beta gene copy number variation in bipolar disorder.

Authors:  Herbert M Lachman; Erika Pedrosa; Oriana A Petruolo; Melissa Cockerham; Alexander Papolos; Tomas Novak; Demitri F Papolos; Pavla Stopkova
Journal:  Am J Med Genet B Neuropsychiatr Genet       Date:  2007-04-05       Impact factor: 3.568

Review 3.  Copy number variation in the human genome and its implications for cardiovascular disease.

Authors:  Rebecca L Pollex; Robert A Hegele
Journal:  Circulation       Date:  2007-06-19       Impact factor: 29.690

4.  A modified Bayes information criterion with applications to the analysis of comparative genomic hybridization data.

Authors:  Nancy R Zhang; David O Siegmund
Journal:  Biometrics       Date:  2007-03       Impact factor: 2.571

5.  Completing the map of human genetic variation.

Authors:  Evan E Eichler; Deborah A Nickerson; David Altshuler; Anne M Bowcock; Lisa D Brooks; Nigel P Carter; Deanna M Church; Adam Felsenfeld; Mark Guyer; Charles Lee; James R Lupski; James C Mullikin; Jonathan K Pritchard; Jonathan Sebat; Stephen T Sherry; Douglas Smith; David Valle; Robert H Waterston
Journal:  Nature       Date:  2007-05-10       Impact factor: 49.962

6.  Copy number variation at 1q21.1 associated with neuroblastoma.

Authors:  Sharon J Diskin; Cuiping Hou; Joseph T Glessner; Edward F Attiyeh; Marci Laudenslager; Kristopher Bosse; Kristina Cole; Yaël P Mossé; Andrew Wood; Jill E Lynch; Katlyn Pecor; Maura Diamond; Cynthia Winter; Kai Wang; Cecilia Kim; Elizabeth A Geiger; Patrick W McGrady; Alexandra I F Blakemore; Wendy B London; Tamim H Shaikh; Jonathan Bradfield; Struan F A Grant; Hongzhe Li; Marcella Devoto; Eric R Rappaport; Hakon Hakonarson; John M Maris
Journal:  Nature       Date:  2009-06-18       Impact factor: 49.962

7.  Higher criticism thresholding: Optimal feature selection when useful features are rare and weak.

Authors:  David Donoho; Jiashun Jin
Journal:  Proc Natl Acad Sci U S A       Date:  2008-09-24       Impact factor: 11.205

8.  PennCNV: an integrated hidden Markov model designed for high-resolution copy number variation detection in whole-genome SNP genotyping data.

Authors:  Kai Wang; Mingyao Li; Dexter Hadley; Rui Liu; Joseph Glessner; Struan F A Grant; Hakon Hakonarson; Maja Bucan
Journal:  Genome Res       Date:  2007-10-05       Impact factor: 9.043

9.  Detecting simultaneous changepoints in multiple sequences.

Authors:  Nancy R Zhang; David O Siegmund; Hanlee Ji; Jun Z Li
Journal:  Biometrika       Date:  2010-06-16       Impact factor: 2.445

10.  Strong association of de novo copy number mutations with autism.

Authors:  Jonathan Sebat; B Lakshmi; Dheeraj Malhotra; Jennifer Troge; Christa Lese-Martin; Tom Walsh; Boris Yamrom; Seungtai Yoon; Alex Krasnitz; Jude Kendall; Anthony Leotta; Deepa Pai; Ray Zhang; Yoon-Ha Lee; James Hicks; Sarah J Spence; Annette T Lee; Kaija Puura; Terho Lehtimäki; David Ledbetter; Peter K Gregersen; Joel Bregman; James S Sutcliffe; Vaidehi Jobanputra; Wendy Chung; Dorothy Warburton; Mary-Claire King; David Skuse; Daniel H Geschwind; T Conrad Gilliam; Kenny Ye; Michael Wigler
Journal:  Science       Date:  2007-03-15       Impact factor: 47.728

  20 in total

1.  A Statistical Method for Identifying Trait-Associated Copy Number Variants.

Authors:  Jessie Jeng; Qian Wu; Hongzhe Li
Journal:  Hum Hered       Date:  2015-07-28       Impact factor: 0.444

2.  Multiple Change-Point Detection via a Screening and Ranking Algorithm.

Authors:  Ning Hao; Yue Selena Niu; Heping Zhang
Journal:  Stat Sin       Date:  2013-07-01       Impact factor: 1.261

3.  Parametric modeling of whole-genome sequencing data for CNV identification.

Authors:  Saran Vardhanabhuti; X Jessie Jeng; Yinghua Wu; Hongzhe Li
Journal:  Biostatistics       Date:  2014-01-28       Impact factor: 5.899

4.  Simultaneous Discovery of Rare and Common Segment Variants.

Authors:  X Jessie Jeng; T Tony Cai; Hongzhe Li
Journal:  Biometrika       Date:  2013       Impact factor: 2.445

5.  CONSISTENT SELECTION OF THE NUMBER OF CHANGE-POINTS VIA SAMPLE-SPLITTING.

Authors:  Changliang Zou; Guanghui Wang; Runze Li
Journal:  Ann Stat       Date:  2020-02-17       Impact factor: 4.028

6.  Robust Detection and Identification of Sparse Segments in Ultra-High Dimensional Data Analysis.

Authors:  T Tony Cai; X Jessie Jeng; Hongzhe Li
Journal:  J R Stat Soc Series B Stat Methodol       Date:  2012-11       Impact factor: 4.488

7.  Hypothesis testing for phylogenetic composition: a minimum-cost flow perspective.

Authors:  Shulei Wang; T Tony Cai; Hongzhe Li
Journal:  Biometrika       Date:  2020-07-11       Impact factor: 2.445

8.  A super scalable algorithm for short segment detection.

Authors:  Ning Hao; Yue Selena Niu; Feifei Xiao; Heping Zhang
Journal:  Stat Biosci       Date:  2020-04-18

9.  Detecting local genetic correlations with scan statistics.

Authors:  Hanmin Guo; James J Li; Qiongshi Lu; Lin Hou
Journal:  Nat Commun       Date:  2021-04-01       Impact factor: 14.919

10.  STRUCTURED CORRELATION DETECTION WITH APPLICATION TO COLOCALIZATION ANALYSIS IN DUAL-CHANNEL FLUORESCENCE MICROSCOPIC IMAGING.

Authors:  Shulei Wang; Jianqing Fan; Ginger Pocock; Ellen T Arena; Kevin W Eliceiri; Ming Yuan
Journal:  Stat Sin       Date:  2021-01       Impact factor: 1.261
