Kuo-Feng Tung, Chao-Yu Pan, Wen-Chang Lin.
Abstract
Discovering and quantifying mRNA transcripts from short-read next-generation sequencing (NGS) data is a complicated task. There are far more alternative mRNA transcripts expressed by human genes than can be identified from NGS transcriptome data and various bioinformatic pipelines, while the number of annotated human protein-coding genes has gradually declined in recent years. It is essential to characterize the tissue expression profiles of alternative transcripts thoroughly in order to understand their molecular modulation and actual functional significance. In this report, we present a bioinformatic database for interrogating the representative tissues of human protein-coding transcripts. The database allows researchers to visually explore the top-ranked transcript expression profiles in particular tissue types. Most transcripts of protein-coding genes were found to have particular tissue expression patterns. This observation demonstrates that many alternative transcripts are specifically modulated in different cell types. This user-friendly tool visually represents transcript expression profiles in a tissue-specific manner. Identifying tissue-specific protein-coding genes and transcripts is a substantial advance towards interpreting their biological functions and supporting further functional genomics studies.
Year: 2022 PMID: 35484179 PMCID: PMC9050722 DOI: 10.1038/s41598-022-10619-9
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Numbers of tissue-representing transcripts, interrogated by their tissue expression Z-scores.
| Tissues* | Numbers of transcripts | Numbers of transcripts (TPM ≥ 1) | Numbers of transcripts (TPM ≥ 10) | Numbers of transcripts (TPM ≥ 100) |
|---|---|---|---|---|
| Zero | 28,457 | 14,558 | 3273 | 231 |
| One | 77,606 | 22,576 | 4447 | 423 |
| Two | 35,865 | 10,354 | 1663 | 112 |
| Three | 3615 | 912 | 145 | 9 |
| Four | 28 | 12 | 4 | 0 |
| Total | 145,571 | 48,412 (33.2%) | 9532 (6.5%) | 775 (0.5%) |
Z-score values were calculated in each tissue for the expressed transcripts of protein-coding genes, as described in the “Methods”.
*Number of tissues with a Z-score ≥ 3 for each transcript (zero to four tissues).
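The “Methods” section is not reproduced in this record, so the exact Z-score procedure is not shown here. As a minimal sketch, assuming the conventional definition (a transcript's TPM in one tissue minus its mean TPM across tissues, divided by the population standard deviation across tissues), the per-transcript tissue count in the table's first column could be derived as follows. The function names, the tissue labels, and the example profile are hypothetical and are not taken from the database.

```python
from statistics import mean, pstdev


def tissue_z_scores(tpm_by_tissue):
    """Return {tissue: Z-score} for one transcript.

    Assumption: Z = (TPM - mean) / SD, with the mean and the population SD
    taken over all tissues of the same transcript.
    """
    values = list(tpm_by_tissue.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A flat profile has no tissue that stands out.
        return {tissue: 0.0 for tissue in tpm_by_tissue}
    return {tissue: (tpm - mu) / sigma for tissue, tpm in tpm_by_tissue.items()}


def count_distinctive_tissues(tpm_by_tissue, threshold=3.0):
    """Count the tissues in which the transcript reaches Z >= threshold."""
    scores = tissue_z_scores(tpm_by_tissue)
    return sum(1 for z in scores.values() if z >= threshold)


# Made-up profile: silent in 19 tissues, expressed in one.
profile = {f"tissue_{i:02d}": 0.0 for i in range(1, 20)}
profile["tissue_20"] = 100.0

print(count_distinctive_tissues(profile))
```

A transcript counted this way would fall into the "One" row of the table if exactly one tissue passes the threshold. Whether the authors used the population or the sample standard deviation, or transformed TPM values beforehand, is not stated in this record.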
Figure 1. Tissue expression distribution of the PCP2 gene. The human PCP2 gene, which encodes Purkinje cell protein-2, has two transcripts. The Rank1 transcript (ENST00000598935) is the major transcript, expressed in the cerebellar regions of the brain (cerebellar hemisphere and cerebellum), where all Purkinje neurons are located. The Rank2 transcript (ENST00000311069) is the minor isoform, which is highly expressed in testis tissue. In the RTTPG user interface, the upper table provides additional information on gene and transcript IDs, gene name, transcript length, ORF length, TPM value, transcript Rank, and the represented tissue type for each transcript. In the tissue expression illustration panel below it, users can view the tissue expression profile and switch the expression scale among raw TPM values, log10 TPM values, and Z-score values.
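The three expression scales named in the caption can be illustrated with a short sketch. It is not the RTTPG implementation: the function name `rescale_profile`, the pseudocount added before taking log10 (needed so that a TPM of zero stays defined), and the use of the population standard deviation for Z-scores are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def rescale_profile(tpm_by_tissue, scale="raw", pseudocount=1.0):
    """Return one transcript's tissue profile on the requested scale.

    scale: "raw" (TPM as given), "log10" (log10 of TPM + pseudocount),
    or "zscore" ((TPM - mean) / population SD across tissues).
    The pseudocount and the SD convention are assumptions.
    """
    if scale == "raw":
        return dict(tpm_by_tissue)
    if scale == "log10":
        return {t: math.log10(v + pseudocount) for t, v in tpm_by_tissue.items()}
    if scale == "zscore":
        values = list(tpm_by_tissue.values())
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:
            return {t: 0.0 for t in tpm_by_tissue}
        return {t: (v - mu) / sigma for t, v in tpm_by_tissue.items()}
    raise ValueError(f"unknown scale: {scale!r}")


# Placeholder TPM values, not taken from the database.
example = {"cerebellum": 99.0, "testis": 9.0, "liver": 0.0}
for name in ("raw", "log10", "zscore"):
    print(name, rescale_profile(example, name))
```

The log10 scale compresses the range so that weakly expressed tissues remain visible next to a dominant one, while the Z-score scale shows how far each tissue deviates from the transcript's own average.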
Figure 2. Tissue expression distribution of distinctive tissue expression transcripts. (A) The number of distinctive tissue expression transcripts (Z-score ≥ 3) in human tissues. (B) The average TPM expression values of distinctive tissue expression transcripts in human tissues.
Figure 3. Pathway enrichment analysis for tissue-representative protein-coding genes. The top 100 expressed genes from each of the following tissues were chosen for the FunRich enrichment analysis, as described in “Methods”: (A) liver, (B) skeletal muscle, and (C) spleen. We used the GO-term biological process function for comparison in this study.
Figure 4. Web interface of the RTTPG database. We generated a graphic display interface for selecting representatively expressed transcripts in human tissues. By default, the home page interrogates Rank1 transcripts with a Z-score of ≥ 3. Hovering the mouse over a tissue label shows the number of representative transcripts for that tissue. Users can click on the icon label of any tissue of interest, and a new table page lists the transcripts with a Z-score of ≥ 3 in that tissue. Users can then select any gene of interest to interrogate its alternative transcript expression in different tissue types, as shown in Supplementary Fig. 2.
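The default selection described in the caption (Rank1 transcripts with a Z-score of ≥ 3 in the chosen tissue) amounts to a simple filter. The sketch below assumes Z-scores have already been computed; the `TranscriptRecord` type, the function name, the tissue keys, and the Z-score values are hypothetical placeholders. Only the two PCP2 transcript IDs and their ranks come from the Figure 1 caption.

```python
from typing import Dict, List, NamedTuple


class TranscriptRecord(NamedTuple):
    transcript_id: str
    gene: str
    rank: int                      # 1 = Rank1 transcript of the gene
    z_by_tissue: Dict[str, float]  # precomputed tissue Z-scores


def representative_transcripts(
    records: List[TranscriptRecord],
    tissue: str,
    rank: int = 1,
    z_threshold: float = 3.0,
) -> List[TranscriptRecord]:
    """Select transcripts of the given rank with Z >= z_threshold in `tissue`,
    ordered from the highest Z-score to the lowest."""
    hits = [
        r for r in records
        if r.rank == rank and r.z_by_tissue.get(tissue, float("-inf")) >= z_threshold
    ]
    return sorted(hits, key=lambda r: r.z_by_tissue[tissue], reverse=True)


# Placeholder Z-scores; only the IDs and ranks follow the Figure 1 caption.
records = [
    TranscriptRecord("ENST00000598935", "PCP2", 1, {"cerebellum": 4.2, "testis": 0.1}),
    TranscriptRecord("ENST00000311069", "PCP2", 2, {"cerebellum": -0.3, "testis": 3.8}),
]

for record in representative_transcripts(records, "cerebellum"):
    print(record.transcript_id, record.gene)
```

Passing `rank=2` or a different tissue key to the same function gives the kind of non-default view a user would reach by changing the selection in the interface.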