Marcel Kucharík, Jaroslav Budiš, Michaela Hýblová, Gabriel Minárik, Tomáš Szemes.
Abstract
Copy number variations (CNVs) are a type of structural variant in which specific regions of DNA are either deleted or duplicated, altering their number of copies. CNVs contribute substantially to normal population variability; however, abnormal CNVs cause numerous genetic disorders. At present, several methods for CNV detection are applied, ranging from conventional cytogenetic analysis, through microarray-based methods (aCGH), to next-generation sequencing (NGS). In this paper, we present GenomeScreen, an NGS-based CNV detection method for low-coverage, whole-genome sequencing. We determined the theoretical limits of its accuracy and obtained confirmation in an extensive in silico study and in real patient samples with known genotypes. In theory, at least 6 M uniquely mapped reads are required to detect a CNV with a length of 100 kilobases (kb) or more with high confidence (Z-score > 7). In practice, the in silico analysis required at least 8 M reads to obtain >99% accuracy (for 100 kb deviations). We compared GenomeScreen with one of the aCGH methods currently used in diagnostic laboratories, which has a mean resolution of 200 kb. GenomeScreen and aCGH both detected 59 deviations, while GenomeScreen additionally detected 134 other, usually smaller, variations. When compared to aCGH, the overall performance of the proposed GenomeScreen tool is comparable or superior in terms of accuracy, turn-around time, and cost-effectiveness, thus providing reasonable benefits, particularly in a prenatal diagnosis setting.
Keywords: CNV detection; CNV detection comparison; aCGH replacement; low-coverage WGS
Year: 2021 PMID: 33920867 PMCID: PMC8071346 DOI: 10.3390/diagnostics11040708
Source DB: PubMed Journal: Diagnostics (Basel) ISSN: 2075-4418
Figure 1. Visualization of the detected deviations on chromosome 8. Chromosome location is on the X-axis. Normalized bin count is on the Y-axis. Green lines represent normal bin count segments (normalized around zero), magenta lines visualize aberrations (one deletion at the start of the chromosome, one duplication on p22–p12). Filtered bins are depicted as black bars on the zero line on the Y-axis. The unmapped region around the centromere is visualized with the grey bar. Grey dots represent the normalized individual bin counts for each bin.
Figure 2. Theoretical minimal read count for successful estimation of copy number variation (CNV) with specified variation length. Different lines represent different Z-score confidence levels.
Figure 3. Prediction accuracy computed with in silico analysis based on the length of variation and read count. Each cell number is generated from 8300 simulations (100 randomly generated aberrations; 83 samples).
Figure 4. Deviations detected by GenomeScreen (all) and by array-based comparative genomic hybridization (aCGH) (red), plotted by variation length and by the number of aCGH probes in the interval detected by GenomeScreen. Deletions and duplications are visualized by downward and upward triangles, respectively.
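The theoretical read-count limit quoted in the abstract and plotted in Figure 2 can be approximated with a simple back-of-the-envelope model. The sketch below is not the authors' derivation. It assumes that reads fall uniformly on a genome of about 3.1 Gb, that the read count in a region is Poisson-distributed, and that a single-copy deletion or duplication shifts the expected count in that region by 50%. The function names `min_reads` and `expected_z`, the genome length, and the 50% shift are all illustrative assumptions.

```python
import math

# Assumed values, not taken from the paper:
GENOME_LENGTH_BP = 3_100_000_000  # approximate human genome length
RELATIVE_CHANGE = 0.5             # single-copy loss or gain in a diploid genome


def expected_z(read_count, cnv_length_bp,
               genome_length_bp=GENOME_LENGTH_BP,
               relative_change=RELATIVE_CHANGE):
    """Expected Z-score of a CNV region under a Poisson read-count model."""
    # Expected number of reads in a normal (two-copy) region of this length.
    lam = read_count * cnv_length_bp / genome_length_bp
    # Shift of the mean divided by the Poisson standard deviation sqrt(lam).
    return relative_change * lam / math.sqrt(lam)


def min_reads(cnv_length_bp, z_threshold,
              genome_length_bp=GENOME_LENGTH_BP,
              relative_change=RELATIVE_CHANGE):
    """Smallest total read count whose expected Z-score reaches z_threshold."""
    # Solve relative_change * sqrt(lam) = z_threshold for lam.
    lam_required = (z_threshold / relative_change) ** 2
    # Convert the per-region expectation back to a genome-wide read count.
    return math.ceil(lam_required * genome_length_bp / cnv_length_bp)


if __name__ == "__main__":
    for length_kb in (100, 200, 500, 1000):
        reads = min_reads(length_kb * 1000, z_threshold=7)
        print(f"{length_kb:>5} kb: {reads / 1e6:.2f} M reads")
```

Under these assumptions, a 100 kb variation at Z = 7 needs roughly 6.1 M reads, which is consistent in magnitude with the "at least 6 M" figure in the abstract. Because the model ignores mappability, GC bias, bin filtering, and mosaicism, practical requirements should be higher, in line with the 8 M reads reported for the in silico analysis.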