Kuo-Feng Tung1, Wen-Chang Lin1,2.
Abstract
Recently, a new reference transcript dataset [Matched Annotation from the NCBI and EMBL-EBI (MANE) select] was released by NCBI and EMBL-EBI to provide a unified representative transcript for each human protein-coding gene. While the main purpose of the MANE project is to provide a harmonized standard for gene and transcript information, it offers no explicit tissue expression information for the MANE select transcripts. In this report, we provide expression profiles of MANE select transcripts in various normal human tissues to allow further interrogation of their molecular modulation and functional significance. We obtained the new V9 transcript expression dataset from the Genotype-Tissue Expression (GTEx) web portal. This new GTEx dataset, based on a long-read sequencing platform, affords better assessment of the expression of alternatively spliced transcripts. The tissue expression profiles of MANE select transcripts (TEx-MST) database provides not only the basic information on MANE select transcripts but also tissue expression profiles of alternative transcripts of protein-coding genes. Users can start by searching for a gene symbol or by browsing the MANE genes by various criteria (such as genome location or expression ranking). We further used the GENCODE biotype feature to identify the top-ranked protein-coding transcript of each gene, defined as the most expressed protein-coding transcript in the GTEx datasets (both V8 and V9). In summary, 18 083 genes matched between MANE and GTEx. Among them, 13 245 MANE select transcripts matched the top-ranked protein-coding transcripts in the GTEx V9 dataset, which underlines the dominant expression of MANE select transcripts. The TEx-MST web database provides a visual user interface for the normal tissue expression patterns of MANE select transcripts using the newly released GTEx dataset.
Database URL: TEx-MST is available at https://texmst.ibms.sinica.edu.tw/.
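The top-ranked transcript selection described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the record layout, the gene and transcript names, the expression values, and the function names `top_protein_coding` and `classify` are all hypothetical. Only the idea comes from the text: keep transcripts whose GENCODE biotype is protein-coding, take the most expressed one per gene, and compare it with the MANE select transcript. The single `absent` label is a simplification of the database's finer categories.

```python
# Hypothetical toy records: (gene, transcript_id, GENCODE biotype, mean expression).
# All names and values are invented for illustration; they are not GTEx data.
TRANSCRIPTS = [
    ("GENE_A", "tx_A1", "protein_coding", 12.0),
    ("GENE_A", "tx_A2", "retained_intron", 30.0),
    ("GENE_A", "tx_A3", "protein_coding", 4.5),
    ("GENE_B", "tx_B1", "protein_coding", 2.0),
    ("GENE_B", "tx_B2", "protein_coding", 9.0),
]

# Hypothetical MANE select assignment, one transcript per gene.
MANE_SELECT = {"GENE_A": "tx_A1", "GENE_B": "tx_B1", "GENE_C": "tx_C1"}


def top_protein_coding(records):
    """Return {gene: transcript_id} for the most expressed protein-coding transcript."""
    best = {}
    for gene, tx, biotype, expr in records:
        if biotype != "protein_coding":
            continue  # non-coding biotypes are ignored, however highly expressed
        if gene not in best or expr > best[gene][1]:
            best[gene] = (tx, expr)
    return {gene: tx for gene, (tx, _) in best.items()}


def classify(mane_select, top_ranked):
    """Label each MANE gene as matched, not_matched, or absent from the expression table."""
    labels = {}
    for gene, mane_tx in mane_select.items():
        if gene not in top_ranked:
            labels[gene] = "absent"
        elif top_ranked[gene] == mane_tx:
            labels[gene] = "matched"
        else:
            labels[gene] = "not_matched"
    return labels


top = top_protein_coding(TRANSCRIPTS)
labels = classify(MANE_SELECT, top)
print(labels)
```

In this toy input, `tx_A2` has the highest expression for `GENE_A` but is skipped because its biotype is not protein-coding, which is the role the biotype filter plays in the ranking.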
Year: 2022 PMID: 36170113 PMCID: PMC9518666 DOI: 10.1093/database/baac089
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 4.462
Figure 1. Distribution of expression ranks (Rank 1–Rank 10) of MANE select transcripts. Most MANE select transcripts are the dominantly expressed ones (Rank 1 transcripts) according to the GTEx expression dataset. The number of genes is shown on top of each column.
Figure 2. Illustration of the TEx-MST gene information webpage for the tachykinin receptor 2 (TACR2) protein-coding gene. Basic gene and transcript information for the protein-coding gene is provided. The main transcript expression data table displays the GTEx and GENCODE information. The top-ranked protein-coding transcript is marked by a red circle and the MANE select transcript is marked by a star symbol. An expression graph for the Rank 1 to Rank 10 transcripts is provided in the chart at the bottom of the webpage. (A) GTEx V9 (long-read) expression information; (B) GTEx V8 (short-read) expression information. Note that the V8 dataset contains more tissue types. The TACR2 gene is mostly expressed in the digestive system.
Figure 3. The TEx-MST database web page. We have established a web resource for accessible interrogation of individual MANE select transcripts. There are 19 062 protein-coding gene records in the current V1.0 release of the MANE project. We further classified them into four categories: (i) matched with GTEx V9 top-ranked protein-coding transcripts (13 245); (ii) not matched with GTEx V9 top-ranked protein-coding transcripts (3153); (iii) not included in the GTEx V9 transcript list (1685) and (iv) genes not found in the GTEx dataset. A simple user guide is provided for easy access, and users can study their genes of interest by searching with the gene symbol.
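The category counts in this caption can be checked against the totals in the abstract with simple arithmetic. The sketch below assumes that the four categories partition the 19 062 MANE records. Under that assumption the first three categories sum to the 18 083 genes reported as matched between MANE and GTEx, and the count for category (iv), which the caption does not state, would be the remainder. The variable names are illustrative, and the derived value is an inference, not a reported figure.

```python
# Counts taken from the caption and abstract.
total_mane_genes = 19062   # protein-coding gene records in MANE V1.0
matched = 13245            # (i) MANE select is the top-ranked protein-coding transcript
not_matched = 3153         # (ii) MANE select is not the top-ranked transcript
not_in_list = 1685         # (iii) not included in the GTEx V9 transcript list

# Categories (i) to (iii) together should equal the genes shared by MANE and GTEx.
in_gtex = matched + not_matched + not_in_list

# Assumption: category (iv) is whatever remains of the MANE records.
# This value is derived here, not reported in the source.
not_found = total_mane_genes - in_gtex

# Share of the shared genes whose MANE select transcript is top-ranked.
matched_fraction = matched / in_gtex

print(in_gtex, not_found, round(matched_fraction, 3))
```

On these numbers, roughly 73% of the genes shared between MANE and GTEx have the MANE select transcript as their top-ranked protein-coding transcript in V9.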