Shifu Chen1,2, Ming Liu1, Tanxiao Huang1, Wenting Liao1, Mingyan Xu1, Jia Gu2.
Abstract
In recent years, gene fusion detection has become increasingly important for cancer treatment, as more therapeutic agents have been developed to suppress fusion kinases. Although a number of tools have been developed to detect gene fusions from DNA sequencing data, most of them are not sensitive enough to process data from samples with low tumor DNA content, such as cell-free tumor DNA. In this paper, we introduce GeneFuse, a tool to detect and visualize gene fusions with high sensitivity and specificity. GeneFuse focuses on curated gene fusions, which are available in the COSMIC (Catalogue of Somatic Mutations in Cancer) database. For each detected fusion, GeneFuse reports its genome locus, inferred protein forms, and supporting sequencing reads. The fusion detection results are visualized in an HTML page for cloud-friendly validation. GeneFuse is an open source tool available on GitHub: https://github.com/OpenGene/GeneFuse.
Keywords: GeneFuse; fusion detection; fusion visualization; gene fusion
Year: 2018 PMID: 29989075 PMCID: PMC6036752 DOI: 10.7150/ijbs.24626
Source DB: PubMed Journal: Int J Biol Sci ISSN: 1449-2288 Impact factor: 6.580
Fig 1. The program flow of GeneFuse. The workflow has four steps: indexing, matching, filtering, and reporting. In the indexing step, a hashmap for mapping reads to genes is computed. In the matching step, reads are mapped to the genes using the computed hashmap, and those that can be mapped to two genes are saved as fusion matches. In the filtering step, each fusion match is filtered by its read complexity, match quality, and other factors. Finally, in the reporting step, the detected fusions are validated, and the supporting reads for each fusion are piled up and rendered to an HTML page. The input and output files are highlighted in grey.
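The indexing, matching, and filtering steps described in the Fig 1 caption can be illustrated with a minimal Python sketch. It is not GeneFuse's implementation: the k-mer length `K`, the `min_hits` and `max_fraction` thresholds, the function names, and the two toy sequences are all assumptions made for illustration, and the reporting step is omitted.

```python
from collections import Counter, defaultdict

K = 8  # hypothetical k-mer length for this toy example


def build_index(genes, k=K):
    """Indexing: map each k-mer to the set of genes that contain it."""
    index = defaultdict(set)
    for name, seq in genes.items():
        for pos in range(len(seq) - k + 1):
            index[seq[pos:pos + k]].add(name)
    return index


def match_read(read, index, k=K):
    """Matching: count, per gene, how many k-mers of the read hit that gene."""
    hits = Counter()
    for pos in range(len(read) - k + 1):
        for gene in index.get(read[pos:pos + k], ()):
            hits[gene] += 1
    return hits


def is_low_complexity(read, max_fraction=0.8):
    """Filtering: flag reads dominated by one base (illustrative threshold)."""
    top_count = Counter(read).most_common(1)[0][1]
    return top_count / len(read) > max_fraction


def fusion_candidate(read, index, k=K, min_hits=3):
    """Return the two matched gene names, or None if the read is not a candidate."""
    if is_low_complexity(read):
        return None
    hits = match_read(read, index, k)
    genes = sorted(g for g, n in hits.items() if n >= min_hits)
    return genes if len(genes) == 2 else None


# Made-up sequences for demonstration; these are not real gene sequences.
GENES = {
    "GENE_A": "ATGGCGTACGTTAGCCTAGGATCCGATTACA",
    "GENE_B": "TTGACCGGTATCGCAAGTCTGGAACTCGGTA",
}
INDEX = build_index(GENES)
FUSED_READ = GENES["GENE_A"][:15] + GENES["GENE_B"][15:]  # spans both genes
NORMAL_READ = GENES["GENE_A"][:20]                        # one gene only

print(fusion_candidate(FUSED_READ, INDEX))
print(fusion_candidate(NORMAL_READ, INDEX))
```

In this sketch a read is kept as a fusion match only when exactly two genes each receive at least `min_hits` k-mer hits; a real tool would also need to handle reverse complements, sequencing errors, and breakpoint localization.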
Fig 2. A screenshot of a GeneFuse pile-up result. The demonstrated fusion is CD74-ROS1, which is an important druggable target for lung cancer. The title shows that it is the third fusion detected in this report. The inferred fusion protein below the title has 3 exons from the CD74 gene and 33 exons from the ROS1 gene. The supporting reads are presented in a table: the fusion breakpoint is given in the first row, and the reference sequences are given in the second row. For each supporting read, the color of its bases indicates the quality score (green and blue indicate high quality, red indicates low quality). An online report can be found at http://opengene.org/GeneFuse/report.html
Detection of EML4-ALK fusion events in 10 cfDNA samples by GeneFuse, compared with DELLY and FACTERA. With the ddPCR result as the gold standard, GeneFuse had the highest sensitivity.
| Sample ID | Fusion type | ddPCR | GeneFuse | DELLY | FACTERA |
|---|---|---|---|---|---|
| cfDNA_001 | EML4:exon6-ALK exon20 | detected | detected | detected | detected |
| cfDNA_001 | EML4:exon13-ALK exon20 | detected | detected | detected | detected |
| cfDNA_002 | Wild Type | Not detected | Not detected | Not detected | Not detected |
| cfDNA_003 | Wild Type | Not detected | Not detected | Not detected | Not detected |
| cfDNA_004 | Wild Type | Not detected | Not detected | Not detected | Not detected |
| cfDNA_005 | Wild Type | Not detected | Not detected | Not detected | Not detected |
| cfDNA_006 | EML4:exon6-ALK exon20 | detected | detected | detected | detected |
| cfDNA_006 | EML4:exon13-ALK exon20 | detected | detected | detected | |
| cfDNA_007 | EML4:exon6-ALK exon20 | detected | detected | detected | detected |
| cfDNA_007 | EML4:exon13-ALK exon20 | detected | detected | detected | |
| cfDNA_008 | EML4:exon6-ALK exon20 | detected | detected | detected | detected |
| cfDNA_008 | EML4:exon13-ALK exon20 | detected | detected | detected | |
| cfDNA_009 | EML4:exon6-ALK exon20 | detected | detected | detected | |
| cfDNA_009 | EML4:exon13-ALK exon20 | detected | detected | | |
| cfDNA_010 | EML4:exon6-ALK exon20 | detected | detected | detected | detected |
| cfDNA_010 | EML4:exon13-ALK exon20 | detected | detected | detected | detected |
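The sensitivity comparison can be recomputed from the table with a short Python sketch. It rests on one assumption that the table itself does not confirm: a blank cell is treated as "not detected". The row encoding and the function name `sensitivity` are illustrative. Wild-type rows are left out because sensitivity only considers ddPCR-positive events.

```python
# One tuple per ddPCR-positive row: (ddPCR, GeneFuse, DELLY, FACTERA).
# None marks a blank cell in the table; it is ASSUMED to mean "not detected".
ROWS = [
    (True, True, True, True),   # cfDNA_001, EML4 exon6
    (True, True, True, True),   # cfDNA_001, EML4 exon13
    (True, True, True, True),   # cfDNA_006, EML4 exon6
    (True, True, True, None),   # cfDNA_006, EML4 exon13
    (True, True, True, True),   # cfDNA_007, EML4 exon6
    (True, True, True, None),   # cfDNA_007, EML4 exon13
    (True, True, True, True),   # cfDNA_008, EML4 exon6
    (True, True, True, None),   # cfDNA_008, EML4 exon13
    (True, True, True, None),   # cfDNA_009, EML4 exon6
    (True, True, None, None),   # cfDNA_009, EML4 exon13
    (True, True, True, True),   # cfDNA_010, EML4 exon6
    (True, True, True, True),   # cfDNA_010, EML4 exon13
]
TOOLS = {"GeneFuse": 1, "DELLY": 2, "FACTERA": 3}


def sensitivity(rows, col):
    """Fraction of ddPCR-positive events that the tool in column `col` detected."""
    positives = [r for r in rows if r[0]]
    detected = sum(1 for r in positives if r[col] is True)
    return detected / len(positives)


for tool, col in TOOLS.items():
    print(f"{tool}: {sensitivity(ROWS, col):.2f}")
```

If blank cells mean something other than "not detected", the DELLY and FACTERA figures from this sketch do not apply.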
Speed evaluation of GeneFuse against FACTERA and DELLY. The file size in the first column is the total number of bases in read1 and read2. Running times are given as h:mm:ss. Both BWA-MEM and GeneFuse were run with 4 threads. The druggable targets are available in the GeneFuse GitHub repository.
| File size (bases) | BWA-MEM | Picard sort | FACTERA step | DELLY step | BWA+Picard+FACTERA | BWA+Picard+DELLY | GeneFuse with druggable targets |
|---|---|---|---|---|---|---|---|
| 6.48 G | 0:28:44 | 0:11:15 | 0:05:38 | 0:04:52 | 0:45:37 | 0:45:51 | 0:05:30 |
| 7.26 G | 0:30:10 | 0:11:59 | 0:06:00 | 0:05:09 | 0:48:09 | 0:48:18 | 0:05:39 |
| 9.13 G | 0:49:37 | 0:17:51 | 0:05:43 | 0:08:23 | 1:13:11 | 1:16:51 | 0:08:26 |
| 7.48 G | 0:41:32 | 0:13:54 | 0:06:47 | 0:11:18 | 1:02:13 | 1:07:44 | 0:07:25 |
| 7.33 G | 0:40:52 | 0:14:04 | 0:06:46 | 0:10:59 | 1:01:42 | 1:06:55 | 0:07:34 |
| 7.19 G | 0:43:11 | 0:14:46 | 0:03:35 | 0:05:22 | 1:01:32 | 1:04:19 | 0:07:00 |
| 7.38 G | 1:01:12 | 0:13:39 | 0:09:51 | 0:10:50 | 1:24:42 | 1:26:41 | 0:09:47 |
| 7.80 G | 1:00:45 | 0:14:46 | 0:06:42 | 0:07:17 | 1:22:13 | 1:23:48 | 0:10:07 |
| 7.46 G | 0:54:54 | 0:14:08 | 0:06:34 | 0:08:45 | 1:15:36 | 1:18:47 | 0:09:50 |
| 8.14 G | 1:05:04 | 0:14:56 | 0:08:20 | 0:08:42 | 1:28:20 | 1:29:42 | 0:10:45 |
| 8.53 G | 0:52:06 | 0:15:58 | 0:03:43 | 0:03:19 | 1:11:47 | 1:12:23 | 0:07:57 |
| 9.75 G | 0:48:30 | 0:18:04 | 0:04:27 | 0:04:11 | 1:11:01 | 1:11:45 | 0:08:55 |
| 9.42 G | 0:47:52 | 0:17:46 | 0:06:03 | 0:04:25 | 1:11:41 | 1:11:03 | 0:09:27 |
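To make the running-time comparison concrete, the following Python sketch converts the h:mm:ss values to seconds and computes how many times faster GeneFuse is than each combined pipeline. Only the first three rows of the table are transcribed, and the helper names `to_seconds` and `speedup` are illustrative.

```python
def to_seconds(hms):
    """Convert an 'h:mm:ss' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def speedup(pipeline_time, genefuse_time):
    """Ratio of a pipeline's running time to GeneFuse's running time."""
    return to_seconds(pipeline_time) / to_seconds(genefuse_time)


# First three table rows:
# (file size, BWA+Picard+FACTERA, BWA+Picard+DELLY, GeneFuse with druggable targets)
ROWS = [
    ("6.48 G", "0:45:37", "0:45:51", "0:05:30"),
    ("7.26 G", "0:48:09", "0:48:18", "0:05:39"),
    ("9.13 G", "1:13:11", "1:16:51", "0:08:26"),
]

for size, factera, delly, genefuse in ROWS:
    print(f"{size}: {speedup(factera, genefuse):.1f}x vs BWA+Picard+FACTERA, "
          f"{speedup(delly, genefuse):.1f}x vs BWA+Picard+DELLY")
```

The remaining rows can be added to `ROWS` in the same format to extend the comparison to the full table.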