Masoud Arabfard, Kaveh Kavousi, Ahmad Delbari, Mina Ohadi.
Abstract
BACKGROUND: Despite their vast biological implications, the relevance of short tandem repeats (STRs)/microsatellites to the translation initiation sites (TISs) of protein-coding genes remains largely unknown.
Keywords: Genome-scale; Human-specific; Selection; Short tandem repeat; Translation initiation site
Year: 2018 PMID: 30373661 PMCID: PMC6206671 DOI: 10.1186/s40246-018-0181-3
Source DB: PubMed Journal: Hum Genomics ISSN: 1473-9542 Impact factor: 4.639
Fig. 1. Genome-scale landscape of STRs in the 120 bp genomic DNA sequence upstream of human TISs. STRs are sorted by abundance in ascending order.
Fig. 2. Genome-scale landscape of STRs in the 120 bp cDNA sequence upstream of human TISs. STRs are sorted by abundance in ascending order.
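The captions above describe scanning a 120 bp window upstream of each TIS for STRs. A minimal sketch of such a scan follows. The function name `find_strs`, the `MIN_REPEATS` thresholds, and the regex-based approach are illustrative assumptions; the record does not state how the authors detected STRs or which repeat thresholds they used.

```python
import re

# Hypothetical thresholds: minimum number of repeats per motif length (bp).
# These values are illustrative assumptions, not taken from the paper.
MIN_REPEATS = {1: 6, 2: 3, 3: 3, 4: 3, 5: 3, 6: 3}


def find_strs(upstream_seq, min_repeats=MIN_REPEATS):
    """Return (motif, repeat_count, start) tuples for perfect tandem repeats."""
    seq = upstream_seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # One motif of motif_len bases followed by at least (min_rep - 1) copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit, e.g. "TT" or "ATAT";
            # those runs are already reported under the shorter motif length.
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            hits.append((motif, len(m.group(0)) // motif_len, m.start()))
    return hits


# Toy upstream sequence (invented for illustration), containing (T)22 and (CT)5.
example = "GCA" + "T" * 22 + "GAG" + "CT" * 5 + "AAG"
for motif, count, start in find_strs(example):
    print(f"({motif}){count} at offset {start}")
```

The sketch reports only perfect repeats. Interrupted or imperfect repeats would need a dedicated tool or a more permissive matching rule.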
Table 1. The 1st percentile of human protein-coding genes that contain human-specific STRs (length-wise) in their TIS-flanking genomic DNA sequence
| Gene symbol | Gene Ensembl ID | Transcript ID | STR | GO term |
|---|---|---|---|---|
| | ENSG00000143748 | ENST00000436927 | (T)22 | ATP binding |
| | ENSG00000165762 | ENST00000641885 | (T)20 | Olfactory receptor activity |
| | ENSG00000102858 | ENST00000591895 | (A)18 | – |
| | ENSG00000261052 | ENST00000338971 | | Sulfotransferase activity |
| | | ENST00000395138 | | |
| | ENSG00000057608 | ENST00000380127 | (T)17 | GTPase activator activity |
| | | ENST00000609712 | | |
| | ENSG00000213648 | ENST00000360423 | (A)17 | Sulfotransferase activity |
| | ENSG00000167637 | ENST00000618787 | (T)17 | Regulation of transcription |
| | | ENST00000593268 | | |
| | ENSG00000184060 | ENST00000581548 | (A)16 | GTPase activator activity |
| | ENSG00000064703 | ENST00000475700 | | Nucleic acid binding |
| | ENSG00000118473 | ENST00000435165 | (A)16 | Clathrin-dependent endocytosis |
| | ENSG00000157578 | ENST00000288350 | | Intraciliary transport |
| | | ENST00000485895 | | |
| | | ENST00000418018 | | |
| | | ENST00000448288 | | |
| | | ENST00000434281 | | |
| | | ENST00000438404 | | |
| | | ENST00000411566 | | |
| | | ENST00000415863 | | |
| | | ENST00000426783 | | |
| | | ENST00000456017 | | |
| | | ENST00000451131 | | |
| | ENSG00000159708 | ENST00000569499 | (T)14 | – |
| | | ENST00000568804 | | |
| | ENSG00000127515 | ENST00000641129 | (CT)14 | G protein-coupled receptor activity |
| | ENSG00000100142 | ENST00000492213 | (T)14 | Transcription, DNA templated |
| | ENSG00000120451 | ENST00000528555 | (T)14 | Integral component of membrane |
| | | ENST00000530356 | | |
| | ENSG00000120498 | ENST00000395889 | (TTCC)14 | Meiotic cell cycle |
| | ENSG00000075239 | ENST00000527942 | (T)13 | Transferring acyl groups |
| | ENSG00000166664 | ENST00000299847 | (T)13 | Ion transmembrane transport |
| | | ENST00000562729 | | |
| | ENSG00000156958 | ENST00000560654 | (TG)13 | Phosphotransferase activity |
| | | ENST00000396509 | | |
| | | ENST00000558145 | | |
| | | ENST00000544523 | | |
| | | ENST00000560138 | | |
Fig. 3. Distribution of the human-specific STRs in the TIS-flanking genomic DNA sequence. A significant skewing was observed between this compartment and the compartment containing the overall (human-specific and non-specific) STRs. STRs are sorted by abundance in ascending order.
Table 2. The 1st percentile of human protein-coding genes that contain human-specific STRs (length-wise) in their TIS-flanking cDNA sequence
| Gene symbol | Gene Ensembl ID | Transcript ID | STR | GO term |
|---|---|---|---|---|
| | ENSG00000168676 | ENST00000566295 | (A)20 | Protein homooligomerization |
| | ENSG00000081923 | ENST00000585322 | (A)17 | Magnesium ion binding |
| | ENSG00000173918 | ENST00000583904 | | Collagen trimer |
| | ENSG00000140612 | ENST00000558196 | | Peptidase activity |
| | ENSG00000144736 | ENST00000463369 | | – |
| | ENSG00000164056 | ENST00000505319 | | Multicellular organism development |
| | | ENST00000610581 | | |
| | ENSG00000064703 | ENST00000475700 | (A)16 | ATP binding |
| | ENSG00000138386 | ENST00000409641 | (T)16 | Negative regulation of transcription |
| | ENSG00000118473 | ENST00000435165 | (A)16 | – |
| | ENSG00000110693 | ENST00000528252 | (A)14 | Multicellular organism development |
| | ENSG00000130226 | ENST00000406326 | (T)13 | Proteolysis |
| | | ENST00000377770 | | |
| | ENSG00000178053 | ENST00000482628 | (G)13 | – |
| | ENSG00000185634 | ENST00000558220 | (T)13 | Stem cell differentiation |
| | ENSG00000147166 | ENST00000538820 | (T)12 | Calcium ion binding |
| | ENSG00000184613 | ENST00000548531 | | Calcium ion binding |
| | ENSG00000123159 | ENST00000393028 | (GCG)11 | – |
| | | ENST00000345425 | | |
| | | ENST00000587210 | | |
| | ENSG00000112339 | ENST00000527578 | (T)11 | GTPase activity |
| | ENSG00000171476 | ENST00000556376 | (A)11 | Cell differentiation |
| | ENSG00000188000 | ENST00000642043 | (T)11 | G protein-coupled receptor activity |
| | ENSG00000145860 | ENST00000520638 | | Integral component of membrane |
| | ENSG00000141759 | ENST00000592837 | | mRNA splicing, via spliceosome |
| | ENSG00000204574 | ENST00000468958 | (A)10 | ATP binding |
| | ENSG00000104880 | ENST00000359920 | (T)10 | Rho guanyl-nucleotide exchange factor activity |
| | ENSG00000179674 | ENST00000320767 | (A)10 | GTP binding |
| | ENSG00000070669 | ENST00000448127 | (T)10 | Asparagine biosynthetic process |
| | ENSG00000134001 | ENST00000466499 | | Translation initiation factor activity |
Fig. 4. Distribution of the human-specific STR compartment in the TIS-flanking cDNA sequence. A significant skewing was observed between this compartment and the compartment containing the overall (human-specific and non-specific) STRs. STRs are sorted by abundance in ascending order.
Fig. 5. Evaluation of a link between STRs and TIS selection on the genomic DNA (a) and cDNA (b) platforms (Fisher exact test, p < 0.00001). The counts are the number of times human-specific and non-specific STRs co-occurred with homologous and non-homologous TISs in other species. %Similarity was checked over the first five amino acids (excluding the initiating methionine) of all annotated proteins for the orthologous genes in 46 species. TIS = translation initiation site. STR = short tandem repeat.
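The two computations named in the Fig. 5 caption can be sketched as follows. The snippet assumes that %similarity is position-wise identity over the five residues after the initiating methionine, and that the test is a two-sided Fisher exact test on a 2x2 table of STR class versus TIS homology. Both are assumptions, as are the function names; the record does not give the authors' exact scoring scheme or their counts.

```python
from math import comb


def n_terminal_similarity(protein_a, protein_b, k=5):
    """Percent identity over the first k residues after the initiating methionine.

    Assumes both sequences begin with the initiating methionine at index 0.
    """
    a = protein_a[1:k + 1]
    b = protein_b[1:k + 1]
    if len(a) < k or len(b) < k:
        raise ValueError("both proteins need at least k residues after the start codon")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / k


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= observed * (1 + 1e-9))


# Invented toy inputs, for illustration only (not data from the paper).
print(n_terminal_similarity("MAAAAAK", "MAAAGAR"))
# Rows: human-specific vs non-specific STRs; columns: homologous vs non-homologous TIS.
print(fisher_exact_two_sided(3, 1, 1, 3))
```

For genome-scale counts, a library routine such as `scipy.stats.fisher_exact` is likely a better choice than this pure-Python version, which is kept dependency-free here for readability.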
Fig. 6. Sample conservation analysis of the proteins encoded by the genes listed in Table 1 (a) and Table 2 (b), and of several randomly selected proteins whose codons were flanked by non-specific STRs (c), between human and three other species. Maximum identity scores were annotated for each gene in every species. Identity scores were considerably higher for the proteins in the non-specific STR compartment.