EnsembleCNV: an ensemble machine learning algorithm to identify and genotype copy number variation using SNP array data.

Zhongyang Zhang^1,2, Haoxiang Cheng^1,2, Xiumei Hong^3, Antonio F Di Narzo^1,2, Oscar Franzen^4, Shouneng Peng^1,2, Arno Ruusalepp^5, Jason C Kovacic^6, Johan L M Bjorkegren^1,2,4, Xiaobin Wang^3,7, Ke Hao^1,2,8,9.

Abstract

The associations between diseases/traits and copy number variants (CNVs) have not been systematically investigated in genome-wide association studies (GWASs), primarily due to a lack of robust and accurate tools for CNV genotyping. Herein, we propose a novel ensemble learning framework, ensembleCNV, to detect and genotype CNVs using single nucleotide polymorphism (SNP) array data. EnsembleCNV (a) identifies and eliminates batch effects at the raw data level; (b) assembles individual CNV calls from multiple existing callers with complementary strengths into CNV regions (CNVRs) by a heuristic algorithm; (c) re-genotypes each CNVR with a local likelihood model adjusted by global information across multiple CNVRs; (d) refines CNVR boundaries using the local correlation structure in copy number intensities; (e) provides direct CNV genotyping accompanied by a confidence score, directly accessible for downstream quality control and association analysis. Benchmarked on two large datasets, ensembleCNV outperformed competing methods and achieved a high call rate (93.3%) and reproducibility (98.6%), while concurrently achieving high sensitivity by capturing 85% of common CNVs documented in the 1000 Genomes Project. Given this CNV call rate and accuracy, which are comparable to those of SNP genotyping, we suggest ensembleCNV holds significant promise for performing genome-wide CNV association studies and investigating how CNVs predispose to human diseases.
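Step (b) above, assembling per-sample calls from several callers into CNVRs, can be illustrated with a small sketch. The abstract does not specify the heuristic, so the code below is not ensembleCNV's algorithm: it assumes a generic greedy merge based on reciprocal overlap, and the names `CnvCall`, `reciprocal_overlap`, `merge_into_cnvrs`, the 0.5 threshold, and the caller labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CnvCall:
    """One CNV call from one caller in one sample (hypothetical record layout)."""
    sample: str
    caller: str
    chrom: str
    start: int  # inclusive, base pairs
    end: int    # inclusive, base pairs


def reciprocal_overlap(a_start, a_end, b_start, b_end):
    """Overlap length divided by the length of the longer interval."""
    overlap = min(a_end, b_end) - max(a_start, b_start) + 1
    if overlap <= 0:
        return 0.0
    longer = max(a_end - a_start + 1, b_end - b_start + 1)
    return overlap / longer


def merge_into_cnvrs(calls, min_overlap=0.5):
    """Greedily group calls into regions (assumed heuristic, not the paper's).

    Calls are sorted by position. A call joins the most recent region when it
    is on the same chromosome and its reciprocal overlap with that region's
    running span is at least `min_overlap`; otherwise it opens a new region.
    """
    regions = []
    for call in sorted(calls, key=lambda c: (c.chrom, c.start, c.end)):
        last = regions[-1] if regions else None
        if (
            last is not None
            and last["chrom"] == call.chrom
            and reciprocal_overlap(last["start"], last["end"],
                                   call.start, call.end) >= min_overlap
        ):
            # Extend the region to the union of its span and the new call.
            last["start"] = min(last["start"], call.start)
            last["end"] = max(last["end"], call.end)
            last["calls"].append(call)
        else:
            regions.append({"chrom": call.chrom, "start": call.start,
                            "end": call.end, "calls": [call]})
    return regions


# Toy input: made-up coordinates and caller labels, for illustration only.
calls = [
    CnvCall("s1", "callerA", "chr1", 1000, 2000),
    CnvCall("s2", "callerB", "chr1", 1100, 2100),
    CnvCall("s1", "callerB", "chr1", 5000, 6000),
    CnvCall("s3", "callerA", "chr2", 1000, 2000),
]
regions = merge_into_cnvrs(calls)
```

On this toy input the first two calls overlap heavily and fall into one region, while the remaining two each form their own. A real pipeline would still need the batch-effect correction, re-genotyping, and boundary refinement steps that the abstract lists, none of which are sketched here.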

Year: 2019 PMID: 30722045 PMCID: PMC6468244 DOI: 10.1093/nar/gkz068

Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
