Alessio Somaschini1, Sebastiano Di Bella1, Carlo Cusi1, Laura Raddrizzani1, Antonella Leone1, Giovanni Carapezza1, Tommaso Mazza2, Antonella Isacchi1, Roberta Bosotti3.
Abstract
Inhibition of kinase gene fusions (KGFs) has proven successful in cancer treatment and continues to represent an attractive research area, due to kinase druggability and clinical validation. Indeed, literature and public databases report a remarkable number of KGFs as potential drug targets, often identified by in vitro characterization of tumor cell line models and confirmed also in clinical samples. However, KGF molecular and experimental information can sometimes be sparse and partially overlapping, suggesting the need for a specific annotation database of KGFs, conveniently condensing all the molecular details that can support targeted drug development pipelines and diagnostic approaches. Here, we describe KuNG FU (KiNase Gene FUsion), a manually curated database collecting detailed annotations on KGFs that were identified and experimentally validated in human cancer cell lines from multiple sources, exclusively focusing on in-frame KGF events retaining an intact kinase domain, representing potentially active driver kinase targets. To our knowledge, KuNG FU represents to date the largest freely accessible homogeneous and curated database of kinase gene fusions in cell line models.
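The inclusion rule stated in the abstract (in-frame events that retain an intact kinase domain) can be illustrated with a minimal Python sketch. The record type, its field names, the example coordinates, and the rule used to decide whether the domain is retained are all assumptions made for illustration; they are not the KuNG FU schema or the authors' curation procedure.

```python
from dataclasses import dataclass


@dataclass
class FusionCandidate:
    """Hypothetical record; field names are illustrative, not the KuNG FU schema."""
    label: str
    in_frame: bool          # reading frame preserved across the junction
    kinase_is_3prime: bool  # True if the kinase is the 3' partner
    breakpoint_aa: int      # breakpoint position in the kinase protein
    domain_start_aa: int    # first residue of the kinase domain
    domain_end_aa: int      # last residue of the kinase domain


def retains_kinase_domain(c: FusionCandidate) -> bool:
    """Assumed rule: the whole domain lies on the retained side of the breakpoint."""
    if c.kinase_is_3prime:
        # The fusion keeps kinase residues downstream of the breakpoint.
        return c.breakpoint_aa <= c.domain_start_aa
    # The fusion keeps kinase residues upstream of the breakpoint.
    return c.breakpoint_aa >= c.domain_end_aa


def meets_inclusion_criteria(c: FusionCandidate) -> bool:
    """In-frame AND intact kinase domain, as described in the abstract."""
    return c.in_frame and retains_kinase_domain(c)


# Made-up coordinates, chosen only to exercise each branch.
candidates = [
    FusionCandidate("kept", True, True, 400, 500, 780),
    FusionCandidate("out-of-frame", False, True, 400, 500, 780),
    FusionCandidate("domain-truncated", True, True, 600, 500, 780),
]
selected = [c.label for c in candidates if meets_inclusion_criteria(c)]
print(selected)
```

Only the first candidate passes both checks. Whether the breakpoint residue itself counts as retained is a boundary choice of this sketch.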
Year: 2020 PMID: 33257674 PMCID: PMC7705673 DOI: 10.1038/s41597-020-00761-2
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Fig. 1 KuNG FU Workflow. Schematic illustration of the workflow followed for the implementation and feeding of the KuNG FU database: automated and manually curated ‘Pre-Processing’ and ‘Processing’; database structure and interface construction (‘Output’). See detailed description in M&M section.
Fig. 2 KuNG FU user interface. Representative screenshots of the KuNG FU user interface showing an example query for the NTRK1 gene. (a) KuNG FU query module (top) and summary table (bottom). The query type can be selected from the scroll-down menu or typed in the ‘Filter’ box; the query immediately provides a summary table with links to the ‘Details’ sections for each KGF found in the database. (b) ‘Fusion’ sub-section, reporting the chromosomal coordinates and the RefSeq transcript IDs of the kinase gene and its respective fusion partner, together with the exons/introns involved in the fusion and the chimeric transcript nucleotide sequence at the breakpoint, with details about the position of the predicted breakpoint in the kinase, the predicted length of the resulting chimeric protein, the type of chromosomal rearrangement event and the experimental methods supporting the KGF. For KGFs that were also identified in clinical samples, links to TCGA patient ID/s in Tumor Fusion Gene Data Portal[27] are provided. (c) AGFusion plot section, displaying a diagram of the protein domain architecture of the queried chimeric fusion transcript generated with the AGFusion plot tool[26].
KuNG FU database fields.
| KuNG FU section | Field | Source |
|---|---|---|
| | | dataset |
| | | dataset |
| | | dataset |
| | | dataset |
| | | dataset |
| | | dataset |
| | | dataset |
| | | dataset |
| | | AGFusion |
| | | HUGO Gene Nomenclature Committee (HGNC) |
| | | HUGO Gene Nomenclature Committee (HGNC) |
| | | HUGO Gene Nomenclature Committee (HGNC) |
| | | HUGO Gene Nomenclature Committee (HGNC) |
| | | manually curated |
| | | manually curated |
| | | manually curated or RefSeq ID by HGNC |
| | | manually curated or RefSeq ID by HGNC |
| | | Predicted |
| | | Predicted |
| | | Predicted |
| | | Predicted |
| | | Literature or manually curated |
| | | Manually curated |
| | | Tumor Fusion Gene Data Portal |
| | | Wikinome (curated) |
| | | Wikinome (curated) |
| | | National Center for Biotechnology Information (NCBI) |
| | | UniProt |
| | | UniProt |
| | | PubMed |
| | | PubMed |
| | | PubMed |
| | | PubMed |
| | | PubMed |
| | | PubMed |
| | | PubMed |
Summary of fields and associated sources included in each section of the KuNG FU database.
Fig. 3 KuNG FU statistics and KGFs. Graphical representation of the distribution of the KuNG FU database content. (a) Cell line tissue of origin; (b) Number of cell lines harboring a specific kinase involved in a KGF; (c) Kinase groups (RTK = Receptor Tyrosine Kinase, nRTK = non-Receptor Tyrosine Kinase, TKL = Tyrosine Kinase-Like); (d) 5′- and 3′- kinase fusions; (e) Type of aberrant chromosomal event (‘scramble’: complex genetic event, as in Klijn et al.[13]); (f) Matrix showing the combinations of the 16 kinases (left) with their associated fusion gene partners (top) for the KGFs included in KuNG FU, color-coded by detection in cell lines only (red) or in both cell lines and TCGA samples (blue). For each kinase/gene partner combination, the numbers in the blue and red boxes indicate the total number of individual KGFs currently reported in KuNG FU.
KuNG FU kinase breakpoint sites.
| Kinase | No. of breakpoint sites | Breakpoint (aa) |
|---|---|---|
| | 4 | 682 (lung), 719 (lung/NMC), 1056 (thymus/NMC), 1093 (NMC) |
| | 4 | 991 (breast), 1749 (lung), 1852 (lung), 1880 (brain) |
| | 3 | 5′-UTR (brain), 620 (uterus), 1009 (brain/bone) |
| | 3 | 141 (blood), 226 (pancreas), 278 (pancreas/brain) |
| | 2 | 208 (stomach), 429 (stomach) |
| | 2 | 759 (urinary bladder), intron 18 (urinary bladder) |
| | 2 | 398 (colon), 451 (lung) |
| | 2 | 465 (blood), 528 (blood) |
| | 2 | 725 (skin), 872 (blood) |
| | 1 | 26 (blood) |
| | 1 | 1057 (blood, colon, lung) |
| | 1 | 380 (brain) |
| | 1 | 428 (blood) |
| | 1 | 828 (stomach) |
| | 1 | 551 (blood) |
| | 1 | 712 (lung, thyroid) |
Number of KGF breakpoint sites for each kinase included in the KuNG FU database. The breakpoint column gives the amino acid (aa) position of each breakpoint, followed in brackets by the tumor tissue type(s) in which that breakpoint occurs.
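For readers who want to work with the breakpoint entries programmatically, the "position (tissue)" notation used in the table can be split into structured pairs. The following is a minimal Python sketch; the function name and the decision to leave slash-joined labels such as "lung/NMC" untouched are assumptions of this example and are not part of KuNG FU.

```python
import re

# One "position (tissues)" entry. The position may be non-numeric,
# e.g. "intron 18" or a UTR label, so it is captured as free text.
ENTRY = re.compile(r"([^,()]+?)\s*\(([^)]*)\)")


def parse_breakpoints(cell: str):
    """Split a breakpoint cell into (position, [tissue labels]) pairs."""
    sites = []
    for match in ENTRY.finditer(cell):
        position = match.group(1).strip()
        # Tissue labels are split on commas only; slash-joined labels
        # such as "lung/NMC" are kept verbatim (assumption of this sketch).
        tissues = [t.strip() for t in match.group(2).split(",") if t.strip()]
        # Numeric amino acid positions become int, anything else stays text.
        sites.append((int(position) if position.isdigit() else position, tissues))
    return sites


print(parse_breakpoints("759 (urinary bladder), intron 18 (urinary bladder)"))
print(parse_breakpoints("1057 (blood, colon, lung)"))
```

Matching on the bracket pairs rather than splitting the whole cell on commas keeps entries such as "1057 (blood, colon, lung)" together as a single site.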