Abstract
BACKGROUND: Somatically acquired translocations may serve as important markers for assessing the cause and nature of diseases like cancer. Algorithms to locate translocations may use next-generation sequencing (NGS) platform data. However, paired-end strategies do not accurately predict precise translocation breakpoints, and "split-read" methods may lose sensitivity if a translocation boundary is not captured by many sequenced reads. To address these challenges, we have developed "Bellerophon", a method that uses discordant read pairs to identify potential translocations, and subsequently uses "soft-clipped" reads to predict the location of the precise breakpoints. Furthermore, for each chimeric breakpoint, our method attempts to classify it as a participant in an unbalanced translocation, balanced translocation, or interchromosomal insertion.
Year: 2013 PMID: 23734783 PMCID: PMC3622635 DOI: 10.1186/1471-2105-14-S5-S6
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Translocation captured by a cluster of three chimeric read pairs. The first set of reads maps to the forward strand of chromosome i, and the second set maps to the reverse strand of chromosome j. The distances from the outermost reads to the breakpoint are D1 and D2 for chromosomes i and j, respectively. Each distance must be less than or equal to mean + k*stdev.
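The distance criterion in the Figure 1 caption can be illustrated with a minimal sketch. The function and parameter names (`max_span`, `within_cluster_bounds`, `insert_mean`, `insert_stdev`) are hypothetical, and reading "mean" and "stdev" as statistics of the library's insert-size distribution is an assumption, since the caption does not define them. The numeric values are illustrative only and are not taken from the paper.

```python
def max_span(insert_mean: float, insert_stdev: float, k: float) -> float:
    """Upper bound on D1 and D2 from the caption: mean + k * stdev."""
    return insert_mean + k * insert_stdev


def within_cluster_bounds(d1: float, d2: float, insert_mean: float,
                          insert_stdev: float, k: float) -> bool:
    """Return True if both outermost-read distances satisfy the bound."""
    limit = max_span(insert_mean, insert_stdev, k)
    return d1 <= limit and d2 <= limit


# Illustrative values (assumed, not from the paper): the bound is 300 + 3 * 50 = 450.
print(within_cluster_bounds(d1=320, d2=410, insert_mean=300, insert_stdev=50, k=3))
```

A cluster with either distance above the bound would fail this check under these assumptions.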
Figure 2. The three types of translocations and the mapping orientations that result when a read pair spans the breakpoint. These orientations assume that Illumina technologies were used for sequencing. Bellerophon deduces the type of fusion from the mapping orientation of the pairs in a cluster.
Figure 3. The formation of soft-clipped reads. Soft-clipped reads span the translocation boundary between chromosomes i and j. As a result, these reads may align partially to chromosome i and partially to chromosome j.
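In SAM-format alignments, soft-clipped bases are recorded with the `S` operation in the CIGAR string. The sketch below shows one way to read the leading and trailing soft-clip lengths from a CIGAR string. It is an illustration only, not Bellerophon's implementation, and the helper name `soft_clip_lengths` is hypothetical.

```python
import re
from typing import Tuple

# One CIGAR operation: a length followed by an operation code (SAM format).
_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clip_lengths(cigar: str) -> Tuple[int, int]:
    """Return (leading, trailing) soft-clipped base counts for a CIGAR string."""
    ops = [(int(n), op) for n, op in _CIGAR_OP.findall(cigar)]
    # Hard clips (H) may sit outside soft clips at either end; ignore them.
    core = [(n, op) for n, op in ops if op != "H"]
    if not core:
        return 0, 0
    leading = core[0][0] if core[0][1] == "S" else 0
    trailing = core[-1][0] if len(core) > 1 and core[-1][1] == "S" else 0
    return leading, trailing


# A 100 bp read whose last 40 bases did not align at the primary location.
print(soft_clip_lengths("60M40S"))
```

A read such as `60M40S` aligns its first 60 bases to one location while the remaining 40 are clipped, which is the pattern expected when a read crosses a translocation boundary.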
Structural variants inserted into the first simulated dataset
| Chr1 | Bkpt1 | Strand1 | Chr2 | Bkpt2 | Strand2 | Type |
|---|---|---|---|---|---|---|
| 9 | 73,000,000 | + | 11 | 63,000,000 | + | U |
| 5 | 40,000,000 | + | 2 | 140,000,000 | + | U |
| 7 | 11,000,000 | + | 12 | 45,000,000 | - | U |
| 10 | 5,000,000 | - | 20 | 15,000,000 | + | U |
| 16 | 6,000,000 | - | 18 | 12,000,000 | + | U |
| 4 | 9,000,000 | + | 17 | 17,000,000 | + | U |
| 3 | 35,000,000 | + | 6 | 14,000,000 | + | B |
| 6 | 14,001,000 | + | 3 | 35,000,001 | + | B |
| 13 | 45,000,000 | + | | | + | II |
| | | + | 13 | 45,000,001 | + | II |
| | | - | 22 | 25,000,000 | + | II |
| 22 | 25,000,001 | + | | | - | II |
Structural variants inserted into the simulated dataset. U = unbalanced translocation, II = interchromosomal insertion, and B = balanced translocation. For the "II" and "B" variants, the partner breakpoints are listed consecutively. For "II" variants, the donor chromosome and its breakpoint are bolded. Note that the chr3 and chr6 balanced translocation contains a 1000-bp duplication, so it is not entirely reciprocal.
Structural variants inserted into the second simulated dataset
| Chr1 | Bkpt1 | Strand1 | Chr2 | Bkpt2 | Strand2 | Type |
|---|---|---|---|---|---|---|
| 15 | 41,000,000 | + | 18 | 50,000,000 | + | U |
| 13 | 31,000,000 | + | 20 | 43,000,000 | + | U |
| 9 | 21,000,000 | - | 17 | 60,000,000 | + | U |
| 21 | 30,000,000 | + | 2 | 35,000,000 | - | U |
| 11 | 11,000,000 | + | 12 | 67,000,000 | + | U |
| 16 | 23,000,000 | + | 7 | 44,000,001 | + | B |
| 7 | 44,000,000 | + | 16 | 23,000,001 | + | B |
| 6 | 92,000,000 | - | 10 | 65,000,000 | + | U |
| 19 | 35,000,000 | + | 14 | 55,000,000 | - | U |
Simulated dataset 1 results (100 bp reads)
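| Program | 40X SE | 40X SP | 40X ABE | 30X SE | 30X SP | 30X ABE | 20X SE | 20X SP | 20X ABE | 10X SE | 10X SP | 10X ABE | 4X SE | 4X SP | 4X ABE |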
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| B-phon | 12/12 | 12/12 | 0.96 | 12/12 | 12/12 | 1.0 | 12/12 | 12/12 | 0.96 | 11/12 | 11/11 | 1.36 | 5/12 | 5/5 | 0.5 |
| SVD | 11/12 | 11/11 | 2.7-417 | 11/12 | 11/11 | 4.8-405 | 11/12 | 11/11 | 7.4-389 | 11/12 | 11/11 | 14.7-370 | 8/12 | 8/9 | 29-336 |
| BD | 12/12 | 16/16 | 185.3 | 12/12 | 16/16 | 189.1 | 11/12 | 14/14 | 171.3 | 10/12 | 15/15 | 137.6 | 4/12 | 4/4 | 153 |
| GASV | 12/12 | 12/12 | 100-226 | 12/12 | 12/12 | 103-245 | 12/12 | 12/12 | 106-257 | 12/12 | 12/12 | 114-270 | 10/12 | 10/10 | 129-315 |
| CREST | 12/12 | 25/25 | 1.2 | 12/12 | 23/23 | 1.2 | 11/12 | 15/15 | 0.9 | 9/12 | 11/11 | 0.8 | 5/12 | 5/5 | 1.1 |
Sensitivity, specificity, and breakpoint error of interchromosomal breakpoint calls across all five coverage levels for all five programs (B-phon = Bellerophon, SVD = SVDetect, BD = BreakDancer). SE = sensitivity (# true events captured / # true events), SP = specificity (# predictions that capture true events / # predictions), ABE = average breakpoint error. Since SVDetect and GASV predict a range of breakpoints, we reported their average breakpoint errors as ranges. BreakDancer and CREST tended to produce redundant predictions, but we did not count this against specificity, since redundant predictions typically supported a single common event and could easily be merged into a single prediction. The CREST package includes a program to remove redundancies, but BreakDancer does not address these redundant predictions.
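The SE and SP definitions in this caption amount to two simple ratios, sketched below. The function names are hypothetical, and returning `None` when a program makes no predictions is an assumption about how to handle an undefined ratio. Note that "specificity" as defined here (predictions that capture true events divided by all predictions) corresponds to what is often called precision elsewhere.

```python
from typing import Optional


def sensitivity(true_events_captured: int, true_events: int) -> float:
    """SE = true events captured / total true events."""
    return true_events_captured / true_events


def specificity(predictions_capturing_true: int, predictions: int) -> Optional[float]:
    """SP = predictions that capture true events / total predictions.

    Returns None when there are no predictions (assumed convention).
    """
    if predictions == 0:
        return None
    return predictions_capturing_true / predictions


# Bellerophon at 4X on simulated dataset 1: SE = 5/12, SP = 5/5.
print(round(sensitivity(5, 12), 3), specificity(5, 5))
```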
Simulated dataset 2 results (75 bp reads)
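| Program | 40X SE | 40X SP | 40X ABE | 30X SE | 30X SP | 30X ABE | 20X SE | 20X SP | 20X ABE | 10X SE | 10X SP | 10X ABE | 4X SE | 4X SP | 4X ABE |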
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| B-phon | 9/9 | 9/9 | 3.5 | 9/9 | 9/9 | 3.4 | 8/9 | 8/8 | 1.6 | 9/9 | 9/9 | 0.6 | 4/9 | 4/4 | 0.5 |
| SVD | 8/9 | 8/9 | 87-451 | 8/9 | 8/9 | 72-436 | 8/9 | 8/9 | 84-422 | 8/9 | 8/9 | 86-416 | 6/9 | 6/6 | 88-344 |
| BD | 9/9 | 14/14 | 169.7 | 9/9 | 14/14 | 170.8 | 9/9 | 13/13 | 155.5 | 7/9 | 10/10 | 129.7 | 1/9 | 1/1 | 199.2 |
| GASV | 9/9 | 9/9 | 73-177 | 9/9 | 9/9 | 75-189 | 9/9 | 9/9 | 75-201 | 8/9 | 8/8 | 81-212 | 7/9 | 7/7 | 101-285 |
| CREST | 9/9 | 15/15 | 1.2 | 9/9 | 14/14 | 1.1 | 8/9 | 8/8 | 0.8 | 7/9 | 7/7 | 1.1 | 1/9 | 1/1 | 1.5 |
Results on the PR-0508 dataset
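| Program | 30X SE | 30X SP | 30X ABE | 22.5X SE | 22.5X SP | 22.5X ABE | 15X SE | 15X SP | 15X ABE | 7.5X SE | 7.5X SP | 7.5X ABE | 3X SE | 3X SP | 3X ABE |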
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| B-phon | 6/8 | 6/246 | 1.5 | 6/8 | 6/180 | 4.2 | 6/8 | 6/132 | 6.0 | 4/8 | 4/63 | 3.25 | 1/8 | 1/16 | 4.5 |
| SVD | 7/8 | 7/738 | 7.6-293 | 7/8 | 7/544 | 12.4-282 | 7/8 | 7/367 | 27-265 | 4/8 | 4/169 | 35.3-247 | 2/8 | 2/49 | 30.3-275 |
| BD | 7/8 | 9/490 | 141.2 | 7/8 | 8/343 | 165.1 | 5/8 | 5/233 | 131 | 2/8 | 2/112 | 161 | 0/8 | N/A | N/A |
| GASV | 7/8 | 7/538 | 106-393 | 7/8 | 7/392 | 111-402 | 7/8 | 7/270 | 126-418 | 4/8 | 4/133 | 135-439 | 2/8 | 2/43 | 130-425 |
| CREST | 5/8 | 5/55 | 1.1 | 5/8 | 5/43 | 1.1 | 3/8 | 3/37 | 1.0 | 2/8 | 2/17 | 0.5 | 0/8 | N/A | N/A |
Results on the PR-1783 dataset
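| Program | 30X SE | 30X SP | 30X ABE | 22.5X SE | 22.5X SP | 22.5X ABE | 15X SE | 15X SP | 15X ABE | 7.5X SE | 7.5X SP | 7.5X ABE | 3X SE | 3X SP | 3X ABE |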
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| B-phon | 9/9 | 9/290 | 7.7 | 9/9 | 9/212 | 8.3 | 7/9 | 7/156 | 2.6 | 5/9 | 5/86 | 3.7 | 1/9 | 1/39 | 0 |
| SVD | 9/9 | 9/936 | 19-301 | 9/9 | 9/651 | 24-293 | 8/9 | 8/413 | 27.4-276 | 5/9 | 5/193 | 31.5-281 | 1/9 | 1/80 | 43-273 |
| BD | 9/9 | 12/408 | 171.0 | 8/9 | 8/305 | 180 | 6/9 | 6/210 | 167 | 2/9 | 2/114 | 137 | 0/9 | N/A | N/A |
| GASV | 9/9 | 9/450 | 114-278 | 9/9 | 9/331 | 120-285 | 8/9 | 8/226 | 127-303 | 5/9 | 5/107 | 128-291 | 1/9 | 1/29 | 142-331 |
| CREST | 5/9 | 5/60 | 3.2 | 5/9 | 5/49 | 3.2 | 3/9 | 3/38 | 2.2 | 1/9 | 1/15 | 1.5 | 0/9 | N/A | N/A |