Silvia Gioiosa, Marco Bolis, Tiziano Flati, Annalisa Massini, Enrico Garattini, Giovanni Chillemi, Maddalena Fratelli, Tiziana Castrignanò.
Abstract
Background: Gene fusions derive from chromosomal rearrangements, and the resulting chimeric transcripts are often endowed with oncogenic potential. They also serve as diagnostic tools for the clinical classification of cancer subgroups with different prognoses and, in some cases, provide specific drug targets. To date, considerable effort has been devoted to studying gene fusion events in tumor samples. In recent years, the availability of a comprehensive next-generation sequencing dataset for all existing human tumor cell lines has made it possible to mine these data further for novel and still uncharacterized gene fusion events.
Year: 2018 PMID: 29860514 PMCID: PMC6207142 DOI: 10.1093/gigascience/giy062
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
State-of-the-art of databases reporting gene fusions
| Database name | Short description |
|---|---|
| Tumor Fusion Gene Data Portal | A collection of fusion genes in the TCGA samples |
| TICdb | A collection of 1,374 fusion sequences extracted either from public databases or from published papers (last update: 2013) |
| chimerDB3.0 | A catalog of fusion genes encompassing analysis of TCGA data and manual curations from the literature |
| COSMIC Cell Lines | Gene fusions are manually curated from peer-reviewed publications. Currently, COSMIC includes information on fusions involved in solid tumors but not yet leukemias and lymphomas |
| Mitelman | Reports hundreds of gene fusions associated with clinical reports but does not contain sequence data |
| ChiTaRs-3.1 | A collection of 34,922 chimeric transcripts identified by expressed sequence tags and mRNAs from the GenBank, ChimerDB, dbCRID, TICdb, and Mitelman collections of cancer fusions for several organisms |
| FusionCancer | Includes 591 samples, both single-end and paired-end RNA-seq, published on the Sequence Read Archive database |
| ONGene | Literature-derived database of oncogenes |
Figure 1: (A) Venn diagram showing the intersection of the pGFEs identified by the four algorithms. (B) Distribution of pGFEs in the CCS; 43% (purple) of the CCS has not been previously described in any other database or scientific publication; 10% (red) and 20% (green) of the CCS have been reported in databases from healthy/tumoral samples, thus representing the false-positive/true-positive subset of our analysis; 1% of the CCS (orange) reports tags that classify the pGFE as a false-positive couple with medium probability; 25% (gray) of the results represent novel pGFEs tagged with values that classify them as both false positives and true positives. (C) Venn diagram showing the intersection between the LiGeA CCS and other databases.
Example of possible queries on the LiGeA portal
| Search by | Question | Query |
|---|---|---|
| Disease | What are the gene fusion events present in stomach adenocarcinoma cell lines? | Select "stomach adenocarcinoma" under the "disease" menu. |
| Cell line | What are the novel pGFEs affecting the RH30 (sarcoma) cell line? | Select "RH30" under the "cell line" menu and check the box "show only novel results." |
| Chromosome | What are the most suitable fusion partners for chromosome 8? | Select "Chr8" either under the "5’ Chromosome" or the "3’ Chromosome" tab and leave the other forms blank. |
| Gene | How many human cell lines show the PML-RARA fusion event? | Select "PML" under the "5’ gene" menu; select "RARA" under the "3’ gene" menu; leave the "cell line" query form blank. |
| Fusion information | What are all the in-frame pGFEs in the Jurkat cell line? | Select "Jurkat" under the "cell line" menu; select "in-frame" under the "predicted effect" menu. |
| Fusion information | What are the known GFEs predicted to be in-frame in the Jurkat cell line? | Select "Jurkat" under the "cell line" menu; select "in-frame" under the "predicted effect" menu; select "known" under the "fusion description" menu. |
| Algorithm | Which GFEs in the RH30 cell line are supported by both FC and TF? | Select "RH30" under the "cell line" query form and check the boxes for FC and TF. |
| Viruses | Which cell lines are most affected by hepatitis C virus genome integration? | Select "hepatitis C virus" under the "virus" query form and leave the "cell line" query form blank. |
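The menu-based queries in the table above amount to combining optional filters with a logical AND, where a blank form field means "no constraint". The following is a minimal Python sketch of that idea, not the portal's actual implementation: the `FusionRecord` fields, the `query_fusions` function, the placeholder cell lines (`CL1`, `CL2`), gene names (`GENE_A`, ...), and the `ALG3` algorithm label are all hypothetical and carry no real results. Only the abbreviations FC and TF come from the table.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class FusionRecord:
    """One predicted gene fusion event (hypothetical schema)."""
    cell_line: str
    gene5: str                   # 5' partner gene
    gene3: str                   # 3' partner gene
    predicted_effect: str        # e.g. "in-frame"
    description: str             # "known" or "novel"
    algorithms: FrozenSet[str]   # callers supporting the event


def query_fusions(
    records: Iterable[FusionRecord],
    cell_line: Optional[str] = None,
    gene5: Optional[str] = None,
    gene3: Optional[str] = None,
    predicted_effect: Optional[str] = None,
    description: Optional[str] = None,
    algorithms: Iterable[str] = (),
) -> List[FusionRecord]:
    """Return records matching every filter that is set.

    A filter left as None (or an empty algorithm list) is ignored,
    mirroring a query form that is left blank.
    """
    required = frozenset(algorithms)
    exact = {
        "cell_line": cell_line,
        "gene5": gene5,
        "gene3": gene3,
        "predicted_effect": predicted_effect,
        "description": description,
    }
    matches = []
    for rec in records:
        if any(v is not None and getattr(rec, k) != v for k, v in exact.items()):
            continue
        # Every requested algorithm must support the event.
        if not required <= rec.algorithms:
            continue
        matches.append(rec)
    return matches


# Invented placeholder data, for illustration only.
TOY_RECORDS = [
    FusionRecord("CL1", "GENE_A", "GENE_B", "in-frame", "known", frozenset({"FC", "TF"})),
    FusionRecord("CL1", "GENE_C", "GENE_D", "out-of-frame", "novel", frozenset({"FC"})),
    FusionRecord("CL2", "GENE_A", "GENE_B", "in-frame", "novel", frozenset({"TF", "ALG3"})),
    FusionRecord("CL1", "GENE_E", "GENE_F", "in-frame", "novel", frozenset({"FC", "TF", "ALG3"})),
]

# "Known, in-frame events in one cell line" (three filters combined).
known_in_frame = query_fusions(
    TOY_RECORDS, cell_line="CL1", predicted_effect="in-frame", description="known"
)

# "Which cell lines show a given 5'/3' gene pair?" (cell line left blank).
lines_with_pair = {r.cell_line for r in query_fusions(TOY_RECORDS, gene5="GENE_A", gene3="GENE_B")}
```

Leaving a parameter unset plays the role of leaving a form blank, so the gene-pair query returns matches across all cell lines, while the algorithm filter keeps only events supported by every selected caller.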
Figure 2: An overview of the LiGeA portal. (A) A "Search by Cell Line" example and the corresponding output. (B) An overview of the input dataset. (C) A circos diagram showing the graphical outcome of a "Query by Cell Line" and the corresponding related table. (D) An extract from the Download web page.