Shamsuddin A Bhuiyan1,2,3, Sophia Ly1, Minh Phan1, Brandon Huntington1, Ellie Hogan1, Chao Chun Liu1, James Liu1, Paul Pavlidis4,5.
Abstract
BACKGROUND: Although most genes in mammalian genomes have multiple isoforms, an ongoing debate is whether these isoforms are all functional as well as the extent to which they increase the functional repertoire of the genome. To ground this debate in data, it would be helpful to have a corpus of experimentally-verified cases of genes which have functionally distinct splice isoforms (FDSIs).
Keywords: Alternative splicing; Functional diversity; Isoform function; Literature curation
Year: 2018 PMID: 30153812 PMCID: PMC6114036 DOI: 10.1186/s12864-018-5013-2
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1. Non-mutually exclusive types of functional distinctness for literature reported genes with FDSIs. a Generally, the distinctness of FDSIs of the same gene can be attributed to expression-pattern distinctness or biochemical distinctness. Expression-pattern distinctness is defined as a gene having specific splice isoforms necessary in distinct conditions. The depletion of the splice isoform in its distinct condition causes a phenotype. Biochemical distinctness is defined as a protein structure difference between splice isoforms of the same genes. While the FDSIs of the gene can be expressed in the same condition, the depletion of either splice isoform causes a phenotype. b For genes with FDSIs, we categorized the specific subtypes of functional distinctness which contributed to the distinctness between the splice isoforms of the gene (summarized in Table 4). Expression-pattern distinctness can be further categorized as “cell-type-specific”, “tissue-specific”, “developmental-stage-specific”, “subcellular localization-specific” and “other condition-specific”. Biochemical distinctness can be further categorized as “dominant-negative”, “protein domain”, “UTR change” and “protein terminus change”.
Most genes with FDSIs have biochemically distinct splice isoforms
| Type of distinctness | Subtype | Human genes | Mouse genes |
|---|---|---|---|
| Distinct expression patterns | Cell-type-specific | | |
| | Developmental-stage-specific | | |
| | Cellular localization | | |
| | Tissue-specific | | |
| | Other-condition-specific | | |
| Biochemically distinct | Protein domain | | |
| | Dominant negative | | |
| | Protein terminus change | | |
| | UTR change | | |
Genes with FDSIs were categorized by functional type, based on the literature that reported the FDSIs, using the scheme outlined in Fig. 1. Genes categorized as “distinct expression patterns” express FDSIs in specific conditions. Genes categorized as “biochemically distinct” have FDSIs whose functional distinctness is a consequence of biochemical differences in their final protein product. A gene can be categorized as both “distinct expression patterns” and “biochemically distinct”; examples are Myh10 and Robo3.
Curation of alternative splicing literature has revealed 23 human genes and 20 mouse genes with functionally distinct splice isoforms (FDSIs)
| Species | Curated genes | Genes with FDSIs | Studies curated | Isoform removal studies | Overexpression studies | Localization studies | Other study types |
|---|---|---|---|---|---|---|---|
| Human | 555 | 23 | 903 | 149 | 294 | 80 | 380 |
| Mouse | 227 | 20 | 272 | 82 | 70 | 37 | 83 |
| Total | 782 | 43 | 1127 | 222 | 353 | 109 | 443 |
The 23 human genes with FDSIs accounted for about 4% of the human genes annotated in this knowledgebase, while the 20 mouse genes accounted for 9% of all mouse genes annotated. The majority of curated studies could be classified into three types: “isoform removal”, “overexpression” and “localization”. Isoform removal studies have experiments where expression of at least one splice isoform is eliminated and a phenotypic change is evaluated. Overexpression studies have experiments where at least one splice isoform is overexpressed. This “abundance” of the splice isoform can cause a phenotype (not necessarily a distinct one). Localization studies have experiments that characterize where in the cell or organism the splice isoform is expressed. A single study can report experiments of multiple study types. The human and mouse study counts sum to more than the 1127 studies in the total row because some publications investigated both human and mouse forms of a single gene.
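The percentages and the study overlap described in this caption can be checked from the table values. The following is a minimal sketch: the dictionary layout and function name are illustrative and not from the original work, and it assumes that the total row counts each publication once.

```python
# Counts copied from the table above; the structure and names are illustrative.
table = {
    "Human": {"curated_genes": 555, "genes_with_fdsis": 23, "studies": 903},
    "Mouse": {"curated_genes": 227, "genes_with_fdsis": 20, "studies": 272},
}
total_studies = 1127  # "Total" row; assumed to count each publication once


def fdsi_percentage(row):
    """Percentage of curated genes that have FDSIs."""
    return 100.0 * row["genes_with_fdsis"] / row["curated_genes"]


# Percentage of curated genes with FDSIs, per species, to one decimal place.
percentages = {
    species: round(fdsi_percentage(row), 1) for species, row in table.items()
}

# Publications counted under both species, under the assumption stated above.
summed_studies = sum(row["studies"] for row in table.values())
shared_studies = summed_studies - total_studies

print(percentages)
print(summed_studies, shared_studies)
```

Under that assumption, the difference between the per-species sum and the total row is the number of publications that appear in both the human and the mouse counts.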
Genes with FDSIs identified
| Gene | Number of FDSIs | Number of Ensembl transcripts | Number of studies | Mappable to Ensembl? | PULSE | Gene | Number of FDSIs | Number of Ensembl transcripts | Number of studies | Mappable to Ensembl? | PULSE |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Human | | | | | | | | | | | |
| | 3 | 9 | 31 | Yes | NA | | 3 | 17 | 1 | No | Missed |
| | 3 | 19 | 12 | No | NA | | 3 | 11 | 34 | Yes | NA |
| | 2 | 2 | 1 | No | Missed | | 2 | 39 | 58 | Yes | Predicted |
| | 2 | 25 | 6 | Yes | Predicted | | 2 | 7 | 1 | Yes | NA |
| | 2 | 16 | 2 | Yes | NA | | 2 | 38 | 6 | No | NA |
| | 3 | 32 | 3 | No | NA | | 2 | 14 | 1 | Yes | Predicted |
| | 2 | 7 | 16 | No | Missed | | 2 | 23 | 2 | Yes | Predicted |
| | 2 | 15 | 11 | Yes | Predicted | | 2 | 22 | 12 | Yes | Predicted |
| | 2 | 4 | 1 | Yes | Missed | | 2 | 20 | 4 | Yes | Missed |
| | 2 | 12 | 2 | No | NA | | 2 | 35 | 2 | No | NA |
| | 2 | 2 | 1 | No | NA | | 2 | 2 | 1 | No | NA |
| | 2 | 14 | 27 | Yes | NA | | | | | | |
| Mouse | | | | | | | | | | | |
| | 2 | 10 | 2 | No | Missed | | 2 | 7 | 10 | Yes | NA |
| | 2 | 2 | 8 | Yes | NA | | 2 | 1 | 3 | No | NA |
| | 2 | 12 | 7 | Yes | NA | | 2 | 7 | 4 | Yes | Training |
| | 2 | 8 | 5 | No | Missed | | 2 | 12 | 13 | No | Predicted |
| | 2 | 6 | 9 | Yes | Missed | | 2 | 10 | 7 | No | Predicted |
| | 2 | 9 | 5 | Yes | Missed | | 2 | 3 | 2 | Yes | NA |
| | 3 | 31 | 20 | Yes | Predicted | | 2 | 2 | 22 | No | Predicted |
| | 2 | 6 | 5 | Yes | Missed | | 2 | 12 | 4 | No | Missed |
| | 2 | 12 | 5 | No | NA | | 2 | 10 | 3 | No | Missed |
| | 2 | 3 | 12 | Yes | NA | | 2 | 8 | 9 | Yes | NA |
Studies have provided positive evidence of functional distinctness for these genes in experiments where individual splice isoforms were eliminated and a phenotypic change was observed. See Additional file 3 for the studies demonstrating functional distinctness. “Number of FDSIs” indicates the number of splice isoforms whose depletion causes a phenotype. “Number of Ensembl Transcripts” indicates the number of transcripts found in the Ensembl entry for the gene. “Number of studies” indicates the number of studies associated with the gene retrieved with the term “alternative splicing” on PubMed. The highest number of FDSIs found in a single gene is three. “Mappable to Ensembl” indicates genes where we successfully linked all FDSIs back to Ensembl. “PULSE” indicates whether the gene was used at all by Hao and colleagues in their computational predictions. “Training” in this column means that the gene was used as part of PULSE’s training set. “Predicted” means that PULSE predicted that the gene has multiple functional splice isoforms. “Missed” means that PULSE failed to predict that the gene has multiple functional splice isoforms. “NA” means that the gene was not an input for PULSE.
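The PULSE column can be tallied per species to see how often PULSE predicted multiple functional isoforms for the genes it evaluated. This is a minimal sketch, not an analysis from the original work: the values are transcribed from the table rows in order, the variable and function names are illustrative, and treating “Predicted” plus “Missed” as the set of evaluated genes (excluding “Training” and “NA”) is an assumption of the sketch.

```python
from collections import Counter

# PULSE column values transcribed from the table above, in row order.
pulse = {
    "Human": [
        "NA", "Missed", "NA", "NA", "Missed", "Predicted",
        "Predicted", "NA", "NA", "NA", "NA", "Predicted",
        "Missed", "Predicted", "Predicted", "Predicted", "Missed", "Missed",
        "NA", "NA", "NA", "NA", "NA",
    ],
    "Mouse": [
        "Missed", "NA", "NA", "NA", "NA", "Training",
        "Missed", "Predicted", "Missed", "Predicted", "Missed", "NA",
        "Predicted", "Predicted", "Missed", "Missed", "NA", "Missed",
        "NA", "NA",
    ],
}

tallies = {species: Counter(values) for species, values in pulse.items()}


def predicted_of_evaluated(counts):
    """Return (predicted, evaluated), where evaluated = Predicted + Missed."""
    predicted = counts["Predicted"]
    return predicted, predicted + counts["Missed"]


for species, counts in tallies.items():
    predicted, evaluated = predicted_of_evaluated(counts)
    print(f"{species}: {predicted} of {evaluated} evaluated genes predicted")
```

Counting only “Predicted” and “Missed” keeps genes that PULSE never scored, or that it was trained on, out of the denominator.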
Genes with evidence failing to support FDSIs (negative results)
| Gene | Experimental method | Tissue/Cell type | Reference (PubMed ID) |
|---|---|---|---|
| | Isoform-specific rescue | Neuron | 25552556 |
| | Knockdown | Prostate cancer cell line | 20823238 |
| | Isoform-specific rescue | Embryonic fibroblast | 21200149 |
| | Isoform-specific rescue | Neuron | 28968791 |
| | Isoform-specific rescue | Bone marrow | 11136823 |
| | Isoform-specific rescue | Breast cancer cell line | 26277624 |
| | Isoform-specific rescue | MDCK cell line | 26063734 |
| | Isoform-specific rescue | Brain | 18973563 |
| | Knockdown | Kidney | 16030021, 17673687 |
| | Isoform-specific rescue | Fibroblast | 11883941 |
| | Knockdown and isoform-specific rescue | Adipose | 11782442 |
| | Knockdown | Breast cancer cell line | 24197117 |
| | Knockdown | Bladder | 21703425 |
| | Knockdown | Colon cancer cell line | 22124156 |
| | Isoform-specific rescue | Embryonic stem cells | 15630024 |
| | Knockdown | Embryonic cells | 21914475 |
These genes had multiple isoforms tested; however, only one splice isoform caused a change in phenotype.
Fig. 2. Overview of literature curation scheme. We sought papers which study the functional distinctness of a single human or mouse gene’s splice isoforms. Positive studies are those that provide evidence where multiple splice isoforms of a single gene are depleted and at least two isoforms show a phenotype. We annotated studies as providing negative evidence for functional distinctness when investigators deplete multiple splice isoforms of the same gene but only one produces an observable phenotype. The numbers in bold represent the number of studies in each category. Clip art designed from Flaticon (free license with attribution).
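The positive and negative evidence rule in this caption can be written as a small decision function. The sketch below is illustrative only: the `IsoformResult` record, the `classify_study` function and the "not applicable" label are hypothetical names introduced here, and the handling of studies where no depleted isoform shows a phenotype is an assumption, since the caption does not cover that case.

```python
from dataclasses import dataclass


@dataclass
class IsoformResult:
    """One splice isoform's outcome in a single study (hypothetical record)."""
    isoform: str
    depleted: bool   # was this isoform specifically depleted?
    phenotype: bool  # did its depletion produce an observable phenotype?


def classify_study(results):
    """Classify a study as 'positive', 'negative' or 'not applicable'."""
    depleted = [r for r in results if r.depleted]
    # The rule only applies when multiple isoforms of the gene were depleted.
    if len(depleted) < 2:
        return "not applicable"
    with_phenotype = sum(r.phenotype for r in depleted)
    if with_phenotype >= 2:
        return "positive"   # at least two depleted isoforms show a phenotype
    if with_phenotype == 1:
        return "negative"   # only one depleted isoform shows a phenotype
    return "not applicable"  # assumption: no phenotype at all is left unclassified


example = [
    IsoformResult("isoform A", depleted=True, phenotype=True),
    IsoformResult("isoform B", depleted=True, phenotype=False),
]
print(classify_study(example))
```

Returning a separate label for studies outside the rule keeps them from being counted as negative evidence by default.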