Miika J Ahdesmäki, Brad A Chapman, Pablo Cingolani, Oliver Hofmann, Aleksandr Sidoruk, Zhongwu Lai, Gennadii Zakharov, Mikhail Rodichenko, Mikhail Alperovich, David Jenkins, T Hedley Carr, Daniel Stetson, Brian Dougherty, J Carl Barrett, Justin H Johnson.
Abstract
The sensitivity of short-read DNA sequencing for gene fusion detection is improving, but it is hampered by a significant amount of noise in the data, composed of uninteresting or false positive hits. In this paper we describe a tiered prioritisation approach to extract high-impact gene fusion events from existing structural variant calls. Using cell line and patient DNA sequence data, we improve the annotation and interpretation of structural variant calls to best highlight likely cancer-driving fusions. We also considerably improve the automated visualisation of high-impact structural variants to highlight the effects of the variants on the resulting transcripts. The resulting framework makes clinically actionable structural variants considerably easier to detect.
Keywords: Annotation; Gene fusion; Oncology; Prioritisation; Structural variation; Visualisation
Year: 2017 PMID: 28392986 PMCID: PMC5382922 DOI: 10.7717/peerj.3166
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Figure 1. Binning of structural variants into 3 priorities.
Collection of structural variants leading to oncogenic fusions in different sample types.
All events are ranked into the highest category (1) by the prioritisation scheme.
| Sample | Panel or WGS | Manta call | Lumpy call | Fusion(s) |
|---|---|---|---|---|
| HDC134P rep. 1 | Panel with intronic probes | INV | INV | EML4-ALK |
| HDC134P rep. 2 | Panel with intronic probes | INV | INV | EML4-ALK |
| HDC134P rep. 3 | Panel with intronic probes | INV | BND | EML4-ALK |
| HDC140P rep. 1 | Panel with intronic probes | INV | BND | CCDC6-RET |
| HDC140P rep. 2 | Panel with intronic probes | INV | BND | CCDC6-RET |
| HDC140P rep. 3 | Panel with intronic probes | INV | BND | CCDC6-RET |
| HDC141P rep. 1 | Panel with intronic probes | BND | BND | ROS1-SLC34A2 |
| HDC141P rep. 2 | Panel with intronic probes | BND | BND | ROS1-SLC34A2 |
| HDC141P rep. 3 | Panel with intronic probes | BND | BND | ROS1-SLC34A2 |
| MCF7 | WGS | DUP | DUP | ESR1-CCDC170 |
| RT4 | Panel with intronic probes | DUP | DUP | TACC3-FGFR3 |
| PDX model | WES | DUP | DUP | TACC3-FGFR3 |
| Prostate cancer patient sample | FMI Panel with intronic probes | DEL | DEL | TMPRSS2-ERG |
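As a small worked check of the table above, the sketch below lists the samples for which the two callers report a different SV type for the same fusion. The rows are transcribed from the table; the names `CALLS` and `discordant_types` are illustrative and are not part of the paper's tooling.

```python
# Rows transcribed from the table above: (sample, Manta call, Lumpy call, fusion).
CALLS = [
    ("HDC134P rep. 1", "INV", "INV", "EML4-ALK"),
    ("HDC134P rep. 2", "INV", "INV", "EML4-ALK"),
    ("HDC134P rep. 3", "INV", "BND", "EML4-ALK"),
    ("HDC140P rep. 1", "INV", "BND", "CCDC6-RET"),
    ("HDC140P rep. 2", "INV", "BND", "CCDC6-RET"),
    ("HDC140P rep. 3", "INV", "BND", "CCDC6-RET"),
    ("HDC141P rep. 1", "BND", "BND", "ROS1-SLC34A2"),
    ("HDC141P rep. 2", "BND", "BND", "ROS1-SLC34A2"),
    ("HDC141P rep. 3", "BND", "BND", "ROS1-SLC34A2"),
    ("MCF7", "DUP", "DUP", "ESR1-CCDC170"),
    ("RT4", "DUP", "DUP", "TACC3-FGFR3"),
    ("PDX model", "DUP", "DUP", "TACC3-FGFR3"),
    ("Prostate cancer patient sample", "DEL", "DEL", "TMPRSS2-ERG"),
]


def discordant_types(rows):
    """Return the rows where Manta and Lumpy report different SV types."""
    return [row for row in rows if row[1] != row[2]]


if __name__ == "__main__":
    for sample, manta, lumpy, fusion in discordant_types(CALLS):
        print(f"{sample}: Manta={manta}, Lumpy={lumpy} ({fusion})")
```

On these rows, the discordant entries are the four replicates where Manta reports INV and Lumpy reports BND; the fusion itself is still reported by both callers in each case.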
Raw SV call numbers for Manta and Lumpy are given in the DUP, DEL, INV and BND columns.
The prioritised calls are shown in the last three columns. The Primary priority column corresponds to the number of detected fusions that have been reported previously in the literature. All samples except MCF7 are from small hybrid capture panels, which explains the relatively low numbers of calls per sample.
| Sample | Algorithm | DUP, DEL, INV | BND | Primary priority | Secondary priority | Tertiary priority |
|---|---|---|---|---|---|---|
| HDC134P rep. 1 | Manta | 5 | 0 | 2 | 2 | 0 |
| HDC134P rep. 1 | Lumpy | 43 | 9 | 1 | 0 | 1 |
| HDC134P rep. 2 | Manta | 2 | 0 | 2 | 0 | 0 |
| HDC134P rep. 2 | Lumpy | 41 | 4 | 1 | 0 | 0 |
| HDC134P rep. 3 | Manta | 3 | 0 | 2 | 0 | 0 |
| HDC134P rep. 3 | Lumpy | 41 | 2 | 1 | 2 | 1 |
| HDC140P rep. 1 | Manta | 1 | 1 | 1 | 1 | 0 |
| HDC140P rep. 1 | Lumpy | 48 | 20 | 1 | 2 | 0 |
| HDC140P rep. 2 | Manta | 1 | 1 | 1 | 1 | 0 |
| HDC140P rep. 2 | Lumpy | 30 | 12 | 1 | 1 | 0 |
| HDC140P rep. 3 | Manta | 1 | 1 | 1 | 1 | 0 |
| HDC140P rep. 3 | Lumpy | 56 | 9 | 1 | 3 | 0 |
| HDC141P rep. 1 | Manta | 0 | 2 | 2 | 0 | 0 |
| HDC141P rep. 1 | Lumpy | 24 | 7 | 1 | 0 | 0 |
| HDC141P rep. 2 | Manta | 0 | 2 | 2 | 0 | 0 |
| HDC141P rep. 2 | Lumpy | 17 | 1 | 1 | 0 | 0 |
| HDC141P rep. 3 | Manta | 0 | 2 | 2 | 0 | 0 |
| HDC141P rep. 3 | Lumpy | 25 | 6 | 1 | 1 | 1 |
| MCF7 (WGS) | Manta | 8,239 | 1,990 | 16 | 48 | 2,814 |
| MCF7 (WGS) | Lumpy | 4,277 | 2,750 | 21 | 38 | 1,683 |
| RT4 | Manta | 169 | 158 | 1 | 20 | 135 |
| RT4 | Lumpy | 1,509 | 14,659 | 1 | 302 | 10,300 |
| PDX model | Manta | 248 | 65 | 5 | 2 | 164 |
| PDX model | Lumpy | 143 | 862 | 5 | 8 | 292 |
| Patient sample | Manta | 30 | 51 | 1 | 6 | 37 |
| Patient sample | Lumpy | 1,034 | 3,621 | 1 | 13 | 177 |
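To make the scale of the reduction concrete, the following sketch totals the raw and prioritised counts for the four larger samples in the table above. It is illustrative arithmetic only: the names `ROWS` and `summarise` are hypothetical, and the computed fraction assumes that each prioritised entry corresponds to one raw call, which the table does not state.

```python
# Rows transcribed from the table above:
# (sample, caller, DUP+DEL+INV, BND, primary, secondary, tertiary)
ROWS = [
    ("MCF7 (WGS)", "Manta", 8239, 1990, 16, 48, 2814),
    ("MCF7 (WGS)", "Lumpy", 4277, 2750, 21, 38, 1683),
    ("RT4", "Manta", 169, 158, 1, 20, 135),
    ("RT4", "Lumpy", 1509, 14659, 1, 302, 10300),
    ("PDX model", "Manta", 248, 65, 5, 2, 164),
    ("PDX model", "Lumpy", 143, 862, 5, 8, 292),
    ("Patient sample", "Manta", 30, 51, 1, 6, 37),
    ("Patient sample", "Lumpy", 1034, 3621, 1, 13, 177),
]


def summarise(row):
    """Total the raw and prioritised counts for one table row."""
    sample, caller, non_bnd, bnd, primary, secondary, tertiary = row
    raw = non_bnd + bnd
    return {
        "sample": sample,
        "caller": caller,
        "raw": raw,
        "prioritised": primary + secondary + tertiary,
        "primary": primary,
        # Assumption: one prioritised entry per raw call (not stated in the table).
        "primary_fraction": primary / raw,
    }


if __name__ == "__main__":
    for row in ROWS:
        s = summarise(row)
        print(f"{s['sample']:<15} {s['caller']:<6} raw={s['raw']:>6} "
              f"primary={s['primary']:>3} ({s['primary_fraction']:.4%})")
```

For the MCF7 Manta row, for example, the raw total is 8,239 + 1,990 = 10,229 calls, of which 16 fall into the primary priority bin.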
Figure 2. Prioritised SV call concordance.
True positives are detected concordantly, whereas the false positives are private (non-replicable). (A), (B) and (C) correspond to HDC134P (EML4-ALK), HDC140P (CCDC6-RET and RET-chr13), and HDC141P (SLC34A2-ROS1), respectively.
Figure 3. Svviz output for the FGFR3-TACC3 fusion (tandem duplication) in the RT4 cell line.
Read evidence is shown both for the fusion of the last intron of FGFR3 to an exon of TACC3 and for the reference alleles.
Figure 4. FGFR3-TACC3 tandem duplication fusion exon level visualisation in the New Genome Browser.
Protein domains and exons affected by the structural variant are highlighted in colours. (A) shows the effect of the fusion and (B) the read evidence for the event at both breakpoints.
Figure 5. ROS1-SLC34A2 interchromosomal translocation fusion.
(A) shows the effect of the fusion and (B) the read evidence for the event at both breakpoints.
Figure 6. EML4-ALK inversion fusion.
(A) shows the effect of the fusion and (B) the read evidence for the event at both breakpoints.