Michael M Khayat, Sayed Mohammad Ebrahim Sahraeian, Samantha Zarate, Andrew Carroll, Huixiao Hong, Bohu Pan, Leming Shi, Richard A Gibbs, Marghoob Mohiyuddin, Yuanting Zheng, Fritz J Sedlazeck.
Abstract
BACKGROUND: Genomic structural variations (SV) are important determinants of genotypic and phenotypic changes in many organisms. However, the detection of SV from next-generation sequencing data remains challenging.
Keywords: Genomic variability; Next-generation sequencing; Structural variations
Year: 2021 PMID: 34930391 PMCID: PMC8686633 DOI: 10.1186/s13059-021-02558-x
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Fig. 1 Overall study design and variability. A Sequencing and analysis overview of the Chinese quartet. The samples were sequenced in three replicates at three different centers. The files were then analyzed by four different mappers including base quality score recalibration (recal) and duplicate read marking (dedup). B Overlap and level of support between the different centers, replicates, and SV mappers. C Heatmap of the percent overlap between the different samples (red = high, yellow = low). D Assessing Mendelian consistency in identical twins compared to parents by mapper. The x-axis shows SVs that were called in replicates from identical twins (3 replicates × 3 centers × 2 twins = 18 replicates). The y-axis shows the percentage of SVs that were called in at least one of the replicates in parents (LCL7 and LCL8). E Distribution of SV events along the differently generated sample files for LCL5.
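The quantity plotted in panel D can be illustrated with a minimal sketch. It assumes SV calls have already been matched across call sets and reduced to shared identifiers; that matching step, the function name `parental_support_by_twin_count`, and the input layout are all hypothetical and not taken from the study.

```python
from collections import defaultdict


def parental_support_by_twin_count(twin_calls, parent_calls):
    """Percent of SVs seen in at least one parental replicate,
    grouped by how many twin replicates support each SV.

    twin_calls:   list of sets of SV IDs, one set per twin replicate
    parent_calls: list of sets of SV IDs, one set per parental replicate
    (Both layouts are assumptions made for this illustration.)
    """
    # Every SV called in any parental replicate.
    in_parents = set().union(*parent_calls)

    # Count the number of twin replicates supporting each SV.
    support = defaultdict(int)
    for calls in twin_calls:
        for sv in calls:
            support[sv] += 1

    # For each support level, tally SVs and those also seen in a parent.
    totals = defaultdict(int)
    hits = defaultdict(int)
    for sv, n in support.items():
        totals[n] += 1
        if sv in in_parents:
            hits[n] += 1

    return {n: 100.0 * hits[n] / totals[n] for n in sorted(totals)}
```

With the 18 twin replicates described in the caption, the returned keys would range from 1 to 18, one per position on the x-axis.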
Fig. 2 Analysis strategy and comparisons of mapper, center, and replicate SVs. A Analysis strategy for examining variability attributed to different factors including centers, replicates, mappers, and dedup/recal. To examine variability from each factor independently, SVs had to be concordant between the call sets of the other factors to pass for downstream analysis (e.g., to examine variable SVs attributed to mappers, SVs must be filtered to only include SV calls present in all replicates, centers, and dedup/recal). B Comparison of variability due to different SV mapping methods in LCL5. C Comparison of variability due to different centers in LCL5. D Comparison of variability across different replicates in LCL5.
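The filtering strategy in panel A can be sketched as follows. This is one possible reading of the caption, not the authors' pipeline: call sets are assumed to be keyed by a `(center, replicate, mapper, processing)` tuple and to hold pre-matched SV identifiers, and the names `FACTORS`, `concordant_by_level`, and `variable_svs` are hypothetical.

```python
from collections import defaultdict

# Assumed order of the factors inside each call-set key.
FACTORS = ("center", "replicate", "mapper", "processing")


def concordant_by_level(call_sets, factor):
    """For each level of `factor`, keep only the SVs present in every
    call set that shares that level, i.e. concordant across all
    combinations of the other factors."""
    idx = FACTORS.index(factor)
    grouped = defaultdict(list)
    for key, svs in call_sets.items():
        grouped[key[idx]].append(set(svs))
    return {level: set.intersection(*sets) for level, sets in grouped.items()}


def variable_svs(per_level):
    """SVs found at some but not all levels of the examined factor."""
    if not per_level:
        return set()
    union = set().union(*per_level.values())
    shared = set.intersection(*per_level.values())
    return union - shared
```

For example, `variable_svs(concordant_by_level(call_sets, "mapper"))` would return the SVs whose presence depends on the mapper after requiring agreement across centers, replicates, and dedup/recal.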
Fig. 3 Comparison of singleton (unique) SV calls across family members for SV mapper (A), sequencing center (B), replicate (C), and dedup/recal (D) strategies. E Examining clustering of singleton SVs in LCL5 across 100 kbp windows genome-wide for each of the variability sources. F Scatter plots of total SVs identified per sample compared to mean and standard deviation of coverage and insert size, respectively. Each dot represents an SV call set with red representing Center 1, green representing Center 2, and blue representing Center 3.
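The windowed view in panel E amounts to binning singleton SV positions into fixed 100 kbp windows. A minimal sketch, assuming each singleton is given as a `(chromosome, 0-based start position)` pair; the function name and input layout are hypothetical:

```python
from collections import Counter

WINDOW = 100_000  # 100 kbp, the window size named in the caption


def singleton_window_counts(singletons, window=WINDOW):
    """Count singleton SVs per (chromosome, window index).

    singletons: iterable of (chrom, pos) pairs with 0-based positions
    (an assumption made for this illustration).
    """
    counts = Counter()
    for chrom, pos in singletons:
        counts[(chrom, pos // window)] += 1
    return counts
```

Windows with unusually high counts would then point to regions where singleton calls cluster.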