Caleb F Davis, Deborah I Ritter, David A Wheeler, Hongmei Wang, Yan Ding, Shannon P Dugan, Matthew N Bainbridge, Donna M Muzny, Pulivarthi H Rao, Tsz-Kwong Man, Sharon E Plon, Richard A Gibbs, Ching C Lau.
Abstract
BACKGROUND: Genomic deletions, inversions, and other rearrangements known collectively as structural variations (SVs) are implicated in many human disorders. Technologies for sequencing DNA provide a potentially rich source of information in which to detect breakpoints of structural variations at base-pair resolution. However, accurate prediction of SVs remains challenging, and existing informatics tools predict rearrangements with significant rates of false positives or negatives.
Keywords: Algorithm; Cancer; Genome; Genotype; Sequencing; Structural variation; Translocation
Year: 2016 PMID: 27330550 PMCID: PMC4913042 DOI: 10.1186/s13029-016-0051-0
Source DB: PubMed Journal: Source Code Biol Med ISSN: 1751-0473
Fig. 1 Use of chimeric and split reads to detect structural variation. Structural variation in the sample is depicted in box 1a as a fusion between genomic regions A (green) and B (blue). Sequence differences in the sample come from structural variation, repetitive sequence (orange), and base substitutions due to sequencing errors and SNPs (black). In box 1b, each group of partially aligned reads, or “stack,” corresponds to a candidate breakpoint located at the shared end (left: orange, black, and blue; right: green, yellow, and black). Pairwise combinations of breakpoints form a library of candidate junctions (box 1c). All stacked reads are aligned to the library, and the alignments are used to assess each read's support for the candidate junctions. A read aligned to a candidate provides support equal to the product of the length of its “tail” and its total alignment quality. The total support for each candidate junction (box 1d) is the sum of the supports from the stacked reads aligned to it.
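The scoring described in the Fig. 1 caption can be illustrated with a minimal Python sketch. It is not the SV-STAT implementation: the `AlignedRead` record, the junction identifier strings, and the choice to pair each left-side breakpoint with each right-side breakpoint are assumptions made for illustration only. The two things taken from the caption are that a read's support is its tail length multiplied by its total alignment quality, and that a junction's total support is the sum over the reads aligned to it.

```python
from collections import defaultdict
from itertools import product
from typing import NamedTuple


class AlignedRead(NamedTuple):
    """Hypothetical record for one stacked read aligned to a candidate junction."""
    junction: str             # ID of the candidate junction the read aligned to
    tail_length: int          # length of the read's "tail"
    alignment_quality: float  # total alignment quality of the read


def candidate_junctions(left_breakpoints, right_breakpoints):
    """Build a library of candidate junctions from pairwise breakpoint combinations.

    Assumption: every left-side breakpoint is paired with every right-side one.
    """
    return [f"{left}|{right}" for left, right in product(left_breakpoints, right_breakpoints)]


def read_support(read):
    """Support from one read: tail length times total alignment quality."""
    return read.tail_length * read.alignment_quality


def total_support(reads):
    """Sum per-read support for each candidate junction."""
    totals = defaultdict(float)
    for read in reads:
        totals[read.junction] += read_support(read)
    return dict(totals)


# Illustrative values only; these are not data from the paper.
library = candidate_junctions(["A:100"], ["B:200", "B:250"])
reads = [
    AlignedRead("A:100|B:200", tail_length=20, alignment_quality=30.0),
    AlignedRead("A:100|B:200", tail_length=10, alignment_quality=40.0),
    AlignedRead("A:100|B:250", tail_length=5, alignment_quality=10.0),
]
support = total_support(reads)
```

In this toy input the junction `A:100|B:200` accumulates more support than `A:100|B:250` because two longer-tailed, higher-quality reads align to it. How SV-STAT turns such totals into a prediction threshold is not specified in the caption and is not modeled here.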
Fig. 2 SV-STAT is more accurate than alternative methods for determining base-pair resolved breakpoints of translocations given unpaired Roche/454 sequencing data simulated from DNA fusions previously reported in pre-B ALL cases. Samples are arrayed in rows colored for translocations t(4;11) (green), t(1;19) (purple), t(9;22) (orange), and t(12;21) (blue). The first three columns are predictions of SVs from R453Plus1Toolbox [15], CREST [14], and SV-STAT. Grey indicates a false negative, or non-predicted translocation. A color with an “X” through it indicates a false positive, or wrongly predicted translocation. Columns of boxplots indicate support (log10(S)) for candidate junctions, one column per type of SV. Black vertical dashes indicate the median, rectangles indicate the interquartile (25–75 %) range, and upper and lower whiskers represent the 90th and 10th percentiles, respectively. Shaded regions indicate sufficient support for SV-STAT to predict SVs.