Daniel A King, Alejandro Sifrim, Tomas W Fitzgerald, Raheleh Rahbari, Emma Hobson, Tessa Homfray, Sahar Mansour, Sarju G Mehta, Mohammed Shehla, Susan E Tomkins, Pradeep C Vasudevan, Matthew E Hurles.
Abstract
Structural mosaic abnormalities are large post-zygotic mutations present in a subset of cells and have been implicated in developmental disorders and cancer. Such mutations have been conventionally assessed in clinical diagnostics using cytogenetic or microarray testing. Modern disease studies rely heavily on exome sequencing, yet an adequate method for the detection of structural mosaicism using targeted sequencing data is lacking. Here, we present a method, called MrMosaic, to detect structural mosaic abnormalities using deviations in allele fraction and read coverage from next-generation sequencing data. Whole-exome sequencing (WES) and whole-genome sequencing (WGS) simulations were used to calculate detection performance across a range of mosaic event sizes, types, clonalities, and sequencing depths. The tool was applied to 4911 patients with undiagnosed developmental disorders, and 11 events among nine patients were detected. For eight of these 11 events, mosaicism was observed in saliva but not blood, suggesting that assaying blood alone would miss a large fraction, possibly >50%, of mosaic diagnostic chromosomal rearrangements.
Year: 2017 PMID: 28855261 PMCID: PMC5630034 DOI: 10.1101/gr.212373.116
Source DB: PubMed Journal: Genome Res ISSN: 1088-9051 Impact factor: 9.043
Figure 1. Detecting structural mosaicism using MrMosaic. (A) Exome data are stored in a BAM file from which allele fraction (left) and coverage (right) are measured at polymorphic positions within or near target regions. A simulated mosaic deletion is depicted. (B) The raw data, consisting of BAFs (y-axis: B allele frequency) and normalized coverage (y-axis: log ratio of normalized coverage), are plotted across chromosome space (x-axis) for a simulated mosaic deletion. (C) Absolute deviation of BAF (y-axis: Bdev) and normalized coverage (y-axis: Cdev) at heterozygous sites are analyzed. A smoothed median is shown. (D) Mann-Whitney U tests were performed separately for Bdev and Cdev, comparing the signal in sliding windows on this chromosome with randomly selected sites from other chromosomes, generating a test statistic (y-axis). A smoothed median is shown. (E) The test statistics are depicted in log scale. The P-values of the Mann-Whitney U tests were combined and segmented (black lines). Segments passing the Mscore significance threshold are plotted in blue.
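The steps in panels C to E can be illustrated with a minimal Python sketch. It is not the MrMosaic implementation. It assumes that Bdev is |BAF - 0.5| and Cdev is |log ratio| at heterozygous sites, uses a normal-approximation Mann-Whitney U test without tie correction, and combines the two P-values with Fisher's method. The caption only says the P-values were "combined", so Fisher's method is a stand-in, and it treats the two tests as independent. The function names, window width, and step are hypothetical, and segmentation and the Mscore threshold are not modeled.

```python
import math

def abs_dev(values, center):
    """Absolute deviation of each value from its expected center."""
    return [abs(v - center) for v in values]

def mann_whitney_upper_p(window, background):
    """One-sided Mann-Whitney U test via the normal approximation
    (no tie correction). A small P-value means the window values
    tend to be larger than the background values."""
    n1, n2 = len(window), len(background)
    pooled = sorted([(v, 0) for v in window] + [(v, 1) for v in background])
    rank_sum = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j for tied values
        rank_sum += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum - n1 * (n1 + 1) / 2.0
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail probability

def fisher_combine(p1, p2):
    """Fisher's method for two P-values (assumed combination rule).
    With 4 degrees of freedom the chi-square tail has a closed form."""
    tiny = 1e-300  # avoid log(0)
    s = -(math.log(max(p1, tiny)) + math.log(max(p2, tiny)))
    return math.exp(-s) * (1.0 + s)

def scan(baf, log_ratio, bg_bdev, bg_cdev, width=30, step=10):
    """Slide a window along one chromosome's heterozygous sites and
    return (window start index, combined P-value) pairs."""
    bdev = abs_dev(baf, 0.5)
    cdev = abs_dev(log_ratio, 0.0)
    results = []
    for start in range(0, len(bdev) - width + 1, step):
        pb = mann_whitney_upper_p(bdev[start:start + width], bg_bdev)
        pc = mann_whitney_upper_p(cdev[start:start + width], bg_cdev)
        results.append((start, fisher_combine(pb, pc)))
    return results
```

Here `bg_bdev` and `bg_cdev` stand for deviations at sites sampled from other chromosomes, mirroring the comparison described in panel D. Windows overlapping a mosaic event should receive much smaller combined P-values than windows elsewhere.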
Figure 2. Simulation performance summarized by AUC. We measured the average precision (area under the precision-recall curve [AUC]) for MrMosaic implemented on whole-exome (WE) simulations (A,C,E) and MrMosaic and MAD implemented on whole-genome (WG) simulations (B,D,F). The depth, size, and coverage measured for WES and WGS simulations were selected to accentuate informative differences in performance. AUC across size: Simulated events of 50% clonality were studied for WES (A) and WGS (B) simulations. Simulated depth was 75× for WES simulations and 30× for WGS simulations. MrMosaic on whole-genome data (WG-MrM) outperforms MrMosaic on exome data (WE-MrM), which outperforms MAD on exome data (WE-MAD). AUC across clonality: Simulated event size and coverage were 5 Mb and 75× for WES (C) simulations and 100 kb and 30× for WGS (D) simulations. AUC across average coverage: Simulated events of 50% clonality were studied for both WES (E) and WGS (F) simulations. Simulated event size was 5 Mb for WES simulations and 100 kb for WGS simulations.
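As an illustration of the summary metric named in this caption, the sketch below computes average precision from ranked detections. It uses the common step-wise definition: the mean of the precision values at the rank of each true positive. Whether the study computed its AUC in exactly this way is not stated, so this is an assumption, and the function name and inputs are hypothetical.

```python
def average_precision(scores, labels):
    """Step-wise average precision.

    scores: detection scores, higher means more confident.
    labels: 1 for a true simulated event, 0 for a false detection.
    Ties in score are broken by input order (a simplification).
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positives")
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos
```

For example, with scores `[0.9, 0.8, 0.7, 0.6]` and labels `[1, 0, 1, 0]`, the true positives sit at ranks 1 and 3, giving precisions 1 and 2/3 and an average precision of 5/6.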
Figure 3. Structural mosaicism detected by MrMosaic from exome data in the Deciphering Developmental Disorders (DDD) study. Black and red dots represent copy number and allele fraction, respectively. Cdev and Bdev are plotted as black and red trend lines, respectively. The blue line represents statistically significant segmented detections passing a threshold. Different classes of events are found: (A–C) Mosaic gains; (D–F) mosaic losses; (G) mixed copy number; (H,I) loss-of-heterozygosity events.
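The event classes in this figure leave different signatures in allele fraction and coverage. The sketch below gives the idealized expected values at a heterozygous site under a simple model. It assumes a diploid baseline, a single-copy gain or loss, and no noise or reference bias, and the function name is hypothetical. It is a textbook-style approximation rather than the model used by MrMosaic, and the mixed copy number class (G) is not covered.

```python
import math

def expected_signal(event, clonality):
    """Expected (BAF deviation from 0.5, log2 coverage ratio) at a
    heterozygous site, given the fraction of cells carrying the event.

    event: "gain", "loss", or "loh" (copy-neutral loss of heterozygosity).
    clonality: fraction of cells with the event, between 0 and 1.
    """
    c = clonality
    if event == "loss":
        total, major = 2.0 - c, 1.0      # one allele lost in affected cells
    elif event == "gain":
        total, major = 2.0 + c, 1.0 + c  # one allele duplicated
    elif event == "loh":
        total, major = 2.0, 1.0 + c      # one allele replaces the other
    else:
        raise ValueError("unknown event type: " + event)
    bdev = major / total - 0.5
    log_ratio = math.log2(total / 2.0)
    return bdev, log_ratio
```

At 50% clonality this model gives a BAF deviation of 1/6 for a loss, 0.1 for a gain, and 0.25 for loss of heterozygosity, with the coverage ratio shifting only for the gain and the loss. The allele fraction signal distinguishes copy-neutral events that coverage alone cannot detect.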
Detections by exome and validation by SNP microarray
Phenotypes for children with identified structural mosaicism