Hiro Takahashi, Shido Miyaki, Hitoshi Onouchi, Taichiro Motomura, Nobuo Idesako, Anna Takahashi, Masataka Murase, Shuichi Fukuyoshi, Toshinori Endo, Kenji Satou, Satoshi Naito, Motoyuki Itoh.
Abstract
Upstream open reading frames (uORFs) are present in the 5'-untranslated regions of many eukaryotic mRNAs, and some peptides encoded by these regions play important regulatory roles in controlling main ORF (mORF) translation. We previously developed a novel pipeline, ESUCA, to comprehensively identify plant uORFs encoding functional peptides, based on genome-wide identification of uORFs with conserved peptide sequences (CPuORFs). Here, we applied ESUCA to diverse animal genomes, because animal CPuORFs have been identified only by comparing uORF sequences between a limited number of species, and how many previously identified CPuORFs encode regulatory peptides is unclear. By using ESUCA, 1517 (1373 novel and 144 known) CPuORFs were extracted from four evolutionarily divergent animal genomes. We examined the effects of 17 human CPuORFs on mORF translation using transient expression assays. Through these analyses, we identified seven novel regulatory CPuORFs that repressed mORF translation in a sequence-dependent manner, including one conserved only among Eutheria. We discovered a much higher number of animal CPuORFs than previously identified. Since most human CPuORFs identified in this study are conserved across a wide range of Eutheria or a wider taxonomic range, many CPuORFs encoding regulatory peptides are expected to be found in the identified CPuORFs.
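To make the uORF definition in the abstract concrete, the following is a minimal Python sketch of scanning a transcript for ORFs whose start codon lies in the 5'-untranslated region. It is an illustration only, not the ESUCA implementation: the function name `find_uorfs`, the 0-based coordinate convention, the restriction to ATG start codons and the standard stop codons, and the toy sequence are all assumptions made here.

```python
# Illustrative sketch only; not the ESUCA pipeline.
# Assumptions: ATG start codons, standard stop codons, 0-based coordinates.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(transcript: str, morf_start: int) -> list[tuple[int, int]]:
    """Return (start, end) pairs for ORFs whose ATG lies upstream of the mORF.

    `morf_start` is the 0-based index of the main ORF's start codon.
    `end` is exclusive and includes the stop codon. ORFs without an
    in-frame stop codon in the transcript are skipped.
    """
    seq = transcript.upper()
    uorfs = []
    # Only consider ATGs that fit entirely inside the 5' UTR.
    for start in range(0, morf_start - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon until the first in-frame stop codon.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                uorfs.append((start, pos + 3))
                break
    return uorfs


# Toy transcript: a 3-codon uORF (ATG AAA TAG) followed by a short mORF.
toy = "GGATGAAATAGCCATGGCTTGA"
print(find_uorfs(toy, morf_start=13))  # [(2, 11)]
```

A conservation-based pipeline would then compare the peptides encoded by such uORFs across species; that step is not sketched here.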
Year: 2020 PMID: 33004976 PMCID: PMC7530721 DOI: 10.1038/s41598-020-73307-6
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Numbers of uORFs, protein-coding genes, and assembled EST/TSA and RefSeq sequences extracted at each step of ESUCA.
| Step | Fruit fly: uORF^a | Fruit fly: Gene | Fruit fly: EST/TSA + RefSeq | Zebrafish: uORF^a | Zebrafish: Gene | Zebrafish: EST/TSA + RefSeq | Chicken: uORF^a | Chicken: Gene | Chicken: EST/TSA + RefSeq | Human: uORF^a | Human: Gene | Human: EST/TSA + RefSeq |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Before selection | – | 13,938 | – | – | 25,206 | – | – | 14,697 | – | – | 19,956 | – |
| Step 1 | 17,035 | 7066 | – | 39,616 | 14,453 | – | 8929 | 3535 | – | 44,085 | 12,321 | – |
| Step 2 | 5040 | 2343 | – | 3599 | 2323 | – | 1320 | 767 | – | 15,069 | 6568 | – |
| Step 3.1 | 4900 | 2308 | 1,854,900 | 3494 | 2271 | 1,822,408 | 1275 | 751 | 668,417 | 14,529 | 6408 | 7,577,191 |
| Step 3.2 | 4882 | 2297 | 873,484 | 3479 | 2261 | 846,829 | 1271 | 750 | 314,665 | 14,499 | 6399 | 3,711,515 |
| Step 4.1 | 4307 | 2076 | 40,982 | 2549 | 1689 | 37,125 | 1122 | 668 | 42,622 | 13,993 | 6217 | 383,797 |
| Step 4.2 | 4294 | 2067 | 40,894 | 2543 | 1688 | 36,434 | 1119 | 665 | 41,306 | 13,970 | 6215 | 378,480 |
| Step 4.3 | 49 | 40 | 1212 | 408 | 343 | 4082 | 774 | 485 | 8171 | 5262 | 3067 | 33,776 |
| Step 5 | 49 | 40 | 1212 | 192 | 180 | 2798 | 261 | 221 | 4074 | 1495 | 1201 | 12,402 |
| Step 7 | 37 | 36 | 1072 | 156 | 154 | 2729 | 230 | 209 | 3945 | 1094 | 969 | 9964 |
^a When multiple uORFs in a transcript shared the same stop or start codon, they were counted as one.
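The counting rule in the table footnote can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' code: the record layout `(transcript_id, start, stop)`, the function name `count_uorfs`, and the choice to merge transitively (A shares a stop with B, B shares a start with C, so all three count once) are hypothetical.

```python
# Illustrative sketch of the footnote's counting rule; not the authors' code.
# Assumption: each uORF is a (transcript_id, start, stop) tuple, and uORFs
# linked by a chain of shared start or stop codons are counted as one.

def count_uorfs(uorfs):
    """Count uORFs, merging those in one transcript that share a start or stop."""
    records = list(dict.fromkeys(uorfs))  # drop exact duplicates, keep order
    parent = {r: r for r in records}      # union-find forest

    def find(x):
        # Find the group representative, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # (transcript, "start"/"stop", position) -> first uORF seen
    for rec in records:
        tid, start, stop = rec
        for key in ((tid, "start", start), (tid, "stop", stop)):
            if key in first_seen:
                # Same codon already used by another uORF: merge the groups.
                parent[find(rec)] = find(first_seen[key])
            else:
                first_seen[key] = rec
    return len({find(r) for r in records})


example = [
    ("tx1", 10, 40), ("tx1", 22, 40),  # share a stop codon -> one uORF
    ("tx1", 50, 80),                   # independent uORF
    ("tx2", 10, 40),                   # different transcript -> counted separately
]
print(count_uorfs(example))  # 3
```

Keying the lookup by transcript keeps uORFs with identical coordinates in different transcripts apart, which matches the footnote's "in a transcript" wording.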
Figure 1. Numbers of CPuORFs extracted by ESUCA in each taxonomic category. (a) Cladogram showing the relationship among the 19 taxonomic categories defined in this study. Fruit fly, zebrafish, chicken, and human belong to Diptera, Cypriniformes, Galliformes, and Primates, respectively. Diptera, Cypriniformes, Galliformes, and Primates belong to Insecta, Ostarioclupeomorpha, Aves, and Euarchontoglires, respectively. (b) Graphs showing the numbers of CPuORFs extracted by ESUCA analyses of the genomes of the indicated species.
Figure 2. Taxonomic conservation and experimental validation of 17 selected human CPuORFs. (a) Taxonomic ranges of conservation of CPuORFs examined in transient assays. Filled cells in each taxonomic category indicate the presence of mORF-tBLASTn hits for CPuORFs of the indicated genes. (b) Reporter constructs used for transient assays. The hatched box in the frameshift (fs) mutant CPuORF indicates the frame-shifted region. Dotted boxes represent the first five nucleotides of the mORFs associated with the 17 human CPuORFs. See Supplementary Fig. S6 for the exact position and the length of each CPuORF and the exact frame-shifted region. (c) Relative luciferase (Fluc) activities of WT-aa (white) or fs (gray) CPuORF reporter plasmids. Means ± SDs of at least three biological replicates are shown. *p < 0.05.