Rocco Piazza, Alessandra Pirola, Roberta Spinelli, Simona Valletta, Sara Redaelli, Vera Magistroni, Carlo Gambacorti-Passerini.
Abstract
Gene fusions are common driver events in leukaemias and solid tumours; here we present FusionAnalyser, a tool dedicated to the identification of driver fusion rearrangements in human cancer through the analysis of paired-end high-throughput transcriptome sequencing data. We initially tested FusionAnalyser on a set of randomly generated in silico sequencing data from 20 known human translocations occurring in cancer, and subsequently on transcriptome data from three chronic and three acute myeloid leukaemia samples. In all cases, our tool detected the correct driver fusion event(s) with high specificity. In one of the acute myeloid leukaemia samples, FusionAnalyser identified a novel, cryptic, in-frame ETS2-ERG fusion. A fully event-driven graphical interface and a flexible filtering system allow complex analyses to be run without any prior programming or scripting knowledge. We therefore propose FusionAnalyser as an efficient and robust graphical tool for the identification of functional rearrangements in high-throughput transcriptome sequencing data.
Year: 2012 PMID: 22570408 PMCID: PMC3439881 DOI: 10.1093/nar/gks394
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Molecular characteristics of the human fusions analysed using simulated in silico data
| Fusion | Translocation | Exon1 Chr | Exon1 Start | Exon1 End | Exon2 Chr | Exon2 Start | Exon2 End | Disease |
|---|---|---|---|---|---|---|---|---|
| BCR–ABL1 (p210) | t(9;22)(q34;q11) | Chr22 | 21 962 525 | 21 962 600 | Chr9 | 132 719 271 | 132 719 445 | CML |
| BCR–ABL1 (p190) | t(9;22)(q34;q11) | Chr22 | 21 852 551 | 21 854 425 | Chr9 | 132 719 271 | 132 719 445 | ALL |
| CBFB–MYH11 | inv(16)(p13q22) | Chr16 | 65 673 616 | 65 673 712 | Chr16 | 15 728 205 | 15 728 412 | AML |
| CEP110–FGFR1 | t(8;9)(p12;q33) | Chr9 | 122 975 773 | 122 975 836 | Chr8 | 38 398 471 | 38 398 616 | 8p12 MPD |
| ETV6–JAK2 | t(9;12)(p24;p13) | Chr12 | 11 913 624 | 11 914 170 | Chr9 | 5 071 724 | 5 071 861 | ALL |
| NCOA4–RET | inv(10)(q11.2;q11.2) | Chr10 | 51 251 275 | 51 251 384 | Chr10 | 42 932 037 | 42 932 185 | PTC |
| NPM1–ALK | t(2;5)(p23;q35) | Chr5 | 170 751 314 | 170 751 408 | Chr2 | 29 299 711 | 29 299 898 | ALCL |
| NUP98–HOXD13 | t(2;11)(q31;p15) | Chr11 | 3 722 314 | 3 722 455 | Chr2 | 176 667 453 | 176 668 912 | AML |
| PICALM–MLLT10 | t(10;11)(p13–14;q14–21) | Chr11 | 85 365 313 | 85 365 373 | Chr10 | 21 941 282 | 21 941 386 | ALL/AML |
| PML–RARA | t(15;17)(q24;q21) | Chr15 | 72 112 549 | 72 112 808 | Chr17 | 35 758 093 | 35 758 242 | AML |
| ETV6–NTRK3 | t(12;15)(p13;q25) | Chr12 | 11 913 624 | 11 914 170 | Chr15 | 86 284 857 | 86 284 988 | AML |
| ETV6–RUNX1 | t(12;21)(p13;q22) | Chr12 | 11 913 624 | 11 914 170 | Chr21 | 35 187 091 | 35 187 130 | ALL |
| EWSR1–ERG | t(21;22)(q22;q12) | Chr22 | 28 012 911 | 28 013 123 | Chr21 | 38 696 348 | 38 696 429 | ES |
| MLL–MLLT1 | t(11;19)(q23;p13.3) | Chr11 | 117 857 639 | 117 858 017 | Chr19 | 6 213 238 | 6 213 321 | ALL/AML |
| MLL–MLLT3 | t(9;11)(p22;q23) | Chr11 | 117 857 639 | 117 858 017 | Chr9 | 20 353 473 | 20 353 603 | AML |
| RUNX1–RUNX1T1 | t(8;21)(q22;q22) | Chr21 | 35 153 640 | 35 153 745 | Chr8 | 93 098 629 | 93 098 767 | AML |
| SFRS3–BCL6 | t(3;6)(q27;p21) | Chr6 | 36 672 515 | 36 672 723 | Chr3 | 188 932 190 | 188 932 412 | NHL–FL |
| TCF3–PBX1 | t(1;19)(q23;p13) | Chr19 | 1 570 109 | 1 570 233 | Chr1 | 163 028 354 | 163 028 599 | ALL |
| TRIP11–PDGFRB | t(5;14)(q33;q32) | Chr14 | 91 524 380 | 91 524 480 | Chr5 | 149 486 275 | 149 486 370 | AML |
| ZBTB16–RARA | t(11;17)(q23;q21) | Chr11 | 113 532 268 | 113 532 366 | Chr17 | 35 758 093 | 35 758 242 | AML |
The fusion name, translocation, genomic coordinates of the two breakpoint exons and the disorder most commonly associated with each lesion are shown.
CML = chronic myeloid leukaemia, ALL = acute lymphoblastic leukaemia, AML = acute myeloid leukaemia, MPD = myeloproliferative disorder, PTC = papillary thyroid carcinoma, ALCL = anaplastic large cell lymphoma, ES = Ewing sarcoma, NHL = non-Hodgkin lymphoma, FL = follicular lymphoma.
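The coordinates in the table above are printed with space-separated digit groups, so they need normalising before any numeric use. The following is a minimal sketch, not part of FusionAnalyser: the helper names `parse_coord` and `exon_length` are hypothetical, and the length calculation assumes that start and end are both inclusive, which the source does not state.

```python
def parse_coord(text: str) -> int:
    """Convert a coordinate such as '132 719 271' to an integer,
    however the digits happen to be grouped."""
    return int("".join(text.split()))


def exon_length(start: str, end: str) -> int:
    """Length of an exon, ASSUMING inclusive start and end coordinates."""
    s, e = parse_coord(start), parse_coord(end)
    if e < s:
        raise ValueError("end coordinate precedes start coordinate")
    return e - s + 1


# BCR breakpoint exon of BCR-ABL1 (p210), values taken from the table.
bcr_len = exon_length("21 962 525", "21 962 600")
# ABL1 breakpoint exon shared by both BCR-ABL1 rows.
abl1_len = exon_length("132 719 271", "132 719 445")
print(bcr_len, abl1_len)
```

Under the inclusive-coordinate assumption this gives 76 and 175 bases for the two exons.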
Figure 1. Analysis of artificial alignment data for four translocations: RUNX1–RUNX1T1 (a), EWSR1–ERG (b), MLLT10–PICALM (c) and PML–RARA (d), simulating the presence of 1, 2 or 3 randomly generated single nucleotide variants within the breakpoint region. The upper part of each panel shows the standard graphical FusionAnalyser output, a circular diagram reproducing the identified rearrangement. The lower part of each panel shows three representative junction regions. The upper sequence in each box is the reference breakpoint sequence generated by the Junction Prediction/Projection modules; the lower sequence is part of an anchor read successfully mapped to the breakpoint region despite the presence of 1 (upper box), 2 (middle box) or 3 (lower box) variants. Each variant is marked with a yellow asterisk (variant occurring in the first gene of the fusion) or a red asterisk (variant occurring in the second gene of the fusion).
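The idea behind the Figure 1 test, placing an anchor read on a predicted junction sequence while tolerating a few single nucleotide variants, can be illustrated with a simple substitution-only search. This is a hedged sketch and not FusionAnalyser's actual mapping method, which the text here does not describe; the function names, the toy sequences and the default threshold of three mismatches are all assumptions made for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def find_anchor(read: str, junction: str, max_mismatches: int = 3):
    """Return (offset, mismatches) for the best ungapped placement of
    `read` on `junction` within the mismatch budget, or None."""
    best = None
    for offset in range(len(junction) - len(read) + 1):
        d = hamming(read, junction[offset:offset + len(read)])
        if d <= max_mismatches and (best is None or d < best[1]):
            best = (offset, d)
    return best


# Toy junction (made-up bases): 12 bases of "gene A" joined to 12 of "gene B".
junction = "ACGTACGTTGCA" + "GGATCCTTAGCA"
read = junction[6:18]                # spans the breakpoint at position 12
mutated = read[:2] + "A" + read[3:]  # introduce one substitution

print(find_anchor(read, junction))
print(find_anchor(mutated, junction))
print(find_anchor(mutated, junction, max_mismatches=0))
```

The exact read is placed at offset 6 with no mismatches, the mutated read at the same offset with one mismatch, and with a zero-mismatch budget the mutated read is not placed at all. A real aligner would also handle indels, base qualities and indexing for speed.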
Summary of clinical details of the three CML patients included in this study
| Patient ID | Age at diagnosis | Sokal Score | WBC at diagnosis (per µl) | Platelets at diagnosis (per µl) | Additional cytogenetic abnormalities | Q-PCR at diagnosis/100 copies of ABL (IS) |
|---|---|---|---|---|---|---|
| CML–CP-001 | 23 | 0.8 | 74.5 × 10³ | 748 × 10³ | No | 59.5 |
| CML–CP-002 | 52 | 0.66 | 55.7 × 10³ | 281 × 10³ | No | 60.5 |
| CML–CP-003 | 45 | 0.91 | 34.4 × 10³ | 1068 × 10³ | Loss of der (9) | 44.2 |
Summary of clinical details of the three AML patients included in this study
| Patient ID | Age at diagnosis | Sex | WBC at diagnosis (per µl) | Platelets at diagnosis (per µl) | Haemoglobin at diagnosis (g/dl) |
|---|---|---|---|---|---|
| AML-001 | 34 | Male | 74.5 × 10³ | 748 × 10³ | 10.9 |
| AML-002 | 18 | Male | 55.7 × 10³ | 281 × 10³ | 6.3 |
| AML-003 | 64 | Female | 34.4 × 10³ | 1068 × 10³ | 7.1 |
Figure 2. Analysis of transcriptome sequencing data of patient AML002. (a) The red curved line highlights the presence of the PML–RARA translocation; the blue lines indicate bona fide read-through events; the thick green line points to the intrachromosomal ETS2–ERG fusion. (b) Schematic model of the ETS2–ERG fusion: the ETS2 exons are shown as thick green arrows; the 3′ ERG exon is shown as a thick red arrow. The thin green arrow shows the open reading frame of the fusion. The blue and yellow boxes indicate the PNT domain of ETS and the ETS domain of ERG, respectively. The two black lines indicate the positions of the two primers used to amplify the breakpoint region. The bottom panel shows the result of the ETS2–ERG amplification in patients AML001 (1), AML002 (2) and AML003 (3). (c) Sequence of the ETS2–ERG breakpoint region. The solid black line highlights the PNT domain of ETS, the dotted line the ETS domain of ERG. The black arrow indicates the breakpoint site.
Comparison of three fusion discovery tools
| CRITERIA | FUSIONANALYSER | FUSIONHUNTER | FUSIONSEQ |
|---|---|---|---|
| FUSIONS DETECTED (a) | 2 | 1 | 0 |
| INSTALLATION (b) | EASY (0/1 dep.) | EASY (0/1 dep.) | COMPLEX (≥ 4 dep.) |
| CONFIGURATION (c) | EASY | NORMAL | COMPLEX |
| MULTIPLE SPECIES (d) | NO | YES | YES |
| HARDWARE | DUAL/QUAD CORE PC, 4 GB RAM | MULTICORE SERVER | MULTICORE SERVER |
| ALIGNMENT TOOL (e) | OPEN (SAM/BAM) | CLOSED (Bowtie) | OPEN (.mrf) |
(a) Expressed as the number of validated fusions identified in the AML002 data set.
(b) The complexity of the installation was scored in proportion to the number of dependencies typically required to complete the installation.
(c) Configuration scores the complexity and hands-on time required to configure a standard analysis.
(d) The ‘Multiple species’ field indicates whether the tool can analyse transcriptomes from species other than human.
(e) The ‘Alignment tool’ field indicates whether the fusion discovery tool depends on a specific aligner. ‘OPEN (SAM/BAM)’ means that any aligner generating correct SAM/BAM alignments can be used to perform the analysis. ‘CLOSED (Bowtie)’ means that only the Bowtie aligner can be used. ‘OPEN (.mrf)’ means that any aligner can be used, but its output must be converted into .mrf files.
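Because the input is plain SAM/BAM, candidate evidence for a rearrangement can be inspected with very little code. The sketch below is not FusionAnalyser's algorithm; it only shows, using standard SAM columns (FLAG, RNAME, RNEXT), how paired-end reads whose mates map to different reference sequences could be listed. The function name and the toy records are hypothetical, and this simple filter would miss intrachromosomal events such as the ETS2–ERG fusion described above.

```python
def interchromosomal_pairs(sam_lines):
    """Yield (read_name, chrom, mate_chrom) for properly mapped pairs
    whose two mates align to different reference sequences."""
    for line in sam_lines:
        if line.startswith("@"):          # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # SAM has 11 mandatory columns
            continue
        qname, flag = fields[0], int(fields[1])
        rname, rnext = fields[2], fields[6]
        if not flag & 0x1:                # read is not part of a pair
            continue
        if flag & 0x4 or flag & 0x8:      # read or its mate is unmapped
            continue
        if not flag & 0x40:               # report each pair once (first mate)
            continue
        if rnext in ("=", "*") or rnext == rname:
            continue                      # same reference, or mate unknown
        yield qname, rname, rnext


# Toy SAM records (made up): r1 spans chr15/chr17, r2 maps within chr9.
toy_sam = [
    "@HD\tVN:1.6",
    "r1\t65\tchr15\t100\t60\t10M\tchr17\t200\t0\tACGTACGTAC\tIIIIIIIIII",
    "r1\t129\tchr17\t200\t60\t10M\tchr15\t100\t0\tACGTACGTAC\tIIIIIIIIII",
    "r2\t65\tchr9\t100\t60\t10M\t=\t300\t210\tACGTACGTAC\tIIIIIIIIII",
]
pairs = list(interchromosomal_pairs(toy_sam))
print(pairs)
```

Only the `r1` pair is reported here. For BAM input, the records would first need to be decoded to text, for example with `samtools view`.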