Wei Wang, Dong-Hui Fang, Jia Gan, Yi Shi, Hui Tang, Huai Wang, Mao-Zhong Fu, Jun Yi.
Abstract
BACKGROUND: In eukaryotic organisms, the 3′ untranslated regions (3′ UTRs) of mRNAs are well known to be actively involved in the post-transcriptional regulation of gene expression. Although both shortening and lengthening of the 3′ UTRs of specific candidate genes have been explicitly documented to have functional consequences, the landscape of 3′ UTR lengths in relation to evolutionary dynamics and biological meaning remains to be elucidated now that large-scale data have become available.
Keywords: 3′ untranslated regions; Eukaryotic mRNAs; Functional implications; Sequence length
Year: 2019 PMID: 30900191 PMCID: PMC6560010 DOI: 10.1007/s13258-019-00808-8
Source DB: PubMed Journal: Genes Genomics ISSN: 1976-9571 Impact factor: 1.839
Length and range of 3′ UTRs among various gene sets
| Species | Length | Length-based Q1 (bp) | Length-based Q2 (bp) | Length-based Q3 (bp) | Length-based Q4 (bp) | Random R1 (bp) | Random R2 (bp) | Random R3 (bp) | Random R4 (bp) |
|---|---|---|---|---|---|---|---|---|---|
| Human | Mean | 220 | 642 | 1428 | 3173 | 964 | 1002 | 986 | 933 |
| Human | Min | 50 | 404 | 974 | 2058 | 50 | 50 | 50 | 50 |
| Human | Max | 404 | 974 | 2058 | 24,506 | 13,669 | 17,893 | 13,761 | 24,506 |
| Mouse | Mean | 183 | 552 | 1238 | 2692 | 860 | 854 | 831 | 850 |
| Mouse | Min | 50 | 338 | 848 | 1782 | 50 | 50 | 50 | 50 |
| Mouse | Max | 338 | 848 | 1781 | 15,948 | 11,756 | 15,948 | 15,587 | 14,333 |
| Zebrafish | Mean | 156 | 364 | 650 | 1256 | 489 | 494 | 492 | 496 |
| Zebrafish | Min | 50 | 59 | 493 | 868 | 50 | 52 | 50 | 50 |
| Zebrafish | Max | 259 | 493 | 868 | 7622 | 6083 | 5052 | 7622 | 6793 |
| Fruit fly | Mean | 87 | 182 | 421 | 1270 | 284 | 277 | 263 | 259 |
| Fruit fly | Min | 50 | 126 | 273 | 679 | 50 | 50 | 50 | 50 |
| Fruit fly | Max | 126 | 273 | 679 | 18,495 | 11,538 | 8558 | 12,542 | 9011 |
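
The contrast between the length-based sets (Q1–Q4) and the random sets (R1–R4) in the table above can be illustrated with a minimal Python sketch. It assumes that Q1–Q4 are four equal-sized groups obtained by sorting genes by 3′ UTR length and that R1–R4 are four equal-sized groups drawn by random shuffling; the record does not state the exact procedure, so the function names, the seed and the equal-size split are all illustrative assumptions.

```python
import random
import statistics


def _quarters(values):
    """Cut a list into four consecutive, near-equal-sized slices."""
    n = len(values)
    bounds = [round(i * n / 4) for i in range(5)]
    return [values[bounds[i]:bounds[i + 1]] for i in range(4)]


def split_by_length(lengths):
    """Assumed Q1..Q4: sort by 3' UTR length, then cut into quarters."""
    parts = _quarters(sorted(lengths))
    return {f"Q{i + 1}": part for i, part in enumerate(parts)}


def split_at_random(lengths, seed=0):
    """Assumed R1..R4: shuffle the same lengths, then cut into quarters."""
    shuffled = list(lengths)
    random.Random(seed).shuffle(shuffled)
    parts = _quarters(shuffled)
    return {f"R{i + 1}": part for i, part in enumerate(parts)}


def summarize(groups):
    """Mean, minimum and maximum length (bp) for each gene set."""
    return {
        name: {
            "mean": statistics.mean(vals),
            "min": min(vals),
            "max": max(vals),
        }
        for name, vals in groups.items()
    }
```

Under these assumptions the length-based sets have non-overlapping ranges, whereas each random set spans nearly the full range and has a similar mean, which is the pattern the table shows.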
Numbers of the retrieved and processed mRNA sequences among four species
| Steps | Human | Mouse | Zebrafish | Fruit fly |
|---|---|---|---|---|
| Raw retrieval | 99,884 | 78,183 | 47,758 | 30,246 |
| Prefix of ‘NM_’ | 39,272 | 29,686 | 14,906 | 30,246 |
| Filtered by length and completeness | 36,456 | 25,420 | 11,539 | 26,908 |
| Analyzed mRNA reference sequences | 17,351 | 16,903 | 11,220 | 11,661 |
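
The first two filtering steps in the table above can be sketched as follows. This is only an illustration: the `Transcript` record, the accession values used with it, the 50 bp lower bound (inferred from the minimum lengths reported in the first table) and the definition of "complete" as having a non-empty 5′ UTR, coding region and 3′ UTR are all assumptions, not the authors' documented criteria. The final step that yields the analyzed reference sequences is not described in this record and is not modelled.

```python
from dataclasses import dataclass


@dataclass
class Transcript:
    """Hypothetical record holding an accession and region lengths in bp."""
    accession: str
    utr5_len: int
    cds_len: int
    utr3_len: int


def has_nm_prefix(t):
    """Keep only accessions with the 'NM_' prefix named in the table."""
    return t.accession.startswith("NM_")


def is_complete(t):
    """Assumed completeness check: all three regions are present."""
    return t.utr5_len > 0 and t.cds_len > 0 and t.utr3_len > 0


def filter_transcripts(records, min_utr3_len=50):
    """Apply the prefix, completeness and (assumed) length filters in order."""
    return [
        t for t in records
        if has_nm_prefix(t) and is_complete(t) and t.utr3_len >= min_utr3_len
    ]
```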
Fig. 1 Length distributions and pairwise correlations among 3′ UTRs, coding regions and 5′ UTRs. Box-and-whisker plots along the diagonal show the length distributions of 3′ UTRs, coding regions and 5′ UTRs, with fill colours representing the different species. The coloured points in the lower triangle show the pairwise comparisons, and the species-specific Pearson correlation coefficients are displayed in the upper triangle. The lengths of UTRs and coding regions (bp) are log2 transformed for clearer graphical display. Throughout all figures, the three-letter abbreviations ‘Dme’ (fruit fly), ‘Dre’ (zebrafish), ‘Mmu’ (mouse) and ‘Hsa’ (human) are used
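
The log2 transformation and Pearson correlation mentioned in the Fig. 1 caption can be sketched in plain Python. Whether the published coefficients were computed on raw or log2-transformed lengths is not stated in this record; the sketch below applies the transformation first, which is an assumption, and the function names are illustrative.

```python
import math


def log2_lengths(lengths):
    """Log2-transform region lengths given in bp (all values must be > 0)."""
    return [math.log2(v) for v in lengths]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def length_correlation(region_a, region_b):
    """Correlate two regions' lengths after log2 transformation."""
    return pearson(log2_lengths(region_a), log2_lengths(region_b))
```

Calling `length_correlation` once per species and per pair of regions (3′ UTR, coding region, 5′ UTR) would give one coefficient for each cell of the upper triangle.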
Fig. 2 Distributions of the relative length of 3′ UTRs to 5′ UTRs for each species
Fig. 3 Variation patterns of 3′ UTR lengths for orthologous genes among the four species. The gray lines show the gene-wise variation of 3′ UTR lengths, and the bold black lines denote the cluster centers. The number of genes included in each cluster is shown in brackets
Fig. 4 Two-dimensional scaling plots of gene sets according to PCA. The 3′ UTR length-based gene sets (Q1–Q4) are plotted in colour along with their respective labels, whereas the random sets (R1–R4) are in grey. Definitions of the different gene sets are detailed in the main text
Fig. 5 Comparisons of 3′ UTR lengths among different categories of human genes. Genes were categorized into five groups according to tissue expression patterns (a) or into four groups according to subcellular localization (b). The notches in the boxes represent 95% confidence intervals of the median estimates