N Nazaripanah1, F Adelirad1, A Delbari1, R Sahaf1, T Abbasi-Asl2, M Ohadi3.
Abstract
BACKGROUND: While there is an ongoing trend to identify single nucleotide substitutions (SNSs) that are linked to inter/intra-species differences and disease phenotypes, short tandem repeats (STRs)/microsatellites may be of equal, if not greater, importance in these processes. Genes that contain STRs in their promoters have higher expression divergence than genes with fixed or no STRs in their promoters. In line with this, recent reports indicate a role for repetitive sequences in the rise of young transcription start sites (TSSs) in human evolution.
Keywords: Core promoter; Human-specific; Short tandem repeat; Tetranucleotide; Trinucleotide
Year: 2018 PMID: 29622039 PMCID: PMC5887250 DOI: 10.1186/s40246-018-0149-3
Source DB: PubMed Journal: Hum Genomics ISSN: 1473-9542 Impact factor: 4.639
Fig. 1 Genome-scale prevalence of human protein-coding core promoter trinucleotide STRs and significant skewing of the human-specific STR compartment
Fig. 2 Genome-scale prevalence of human protein-coding core promoter tetranucleotide STRs and significant skewing of the human-specific STR compartment
Genome-scale human-specific core promoter trinucleotide STRs
| Human gene symbol | Ensembl transcript ID | Variant no. | STR formula |
|---|---|---|---|
| | ENST00000307885.4 | 201 | −48 (CCA)3 |
| | ENST00000370079.3 | 201 | −79 (ATT)3 |
| | ENST00000311051.7 | 202 | −11 (GCC)3 |
| | ENST00000361539.4 | 201 | −26 (GGC)5 |
| | ENST00000446408.2 | 203 | −35 (CCT)3 |
| | ENST00000300227.12 | 201 | −79 (AGC)3; −57 (CGC)5 |
| | ENST00000543233.2 | 201 | −31 (CCT)3 |
| | ENST00000563341.1 | 202 | −39 (CCT)3 |
| | ENST00000361727.7 | 201 | −98 (TGC)3 |
| | ENST00000217423.3 | 201 | −83 (GGA)3 |
| | ENST00000310638.8 | 201 | −86 (CCT)3 |
| | ENST00000368102.5 | 201 | −11 (AAG)3 |
| | NM_015372.2.1 | | −98 (GCA)3 |
| | ENST00000546241.1 | 202 | −70 (AGA)3 |
| | ENST00000379868.5 | 201 | −33 (CCT)3 |
| | NM_001077693.3.1 | | −121 (CCA)3 |
| | ENST00000263269.3 | 201 | −67 (GCC)3 |
| | ENST00000394175.6 | 203 | −55 (GGC)3 |
| | ENST00000503927.5 | 202 | −88 (CGC)3 |
| | ENST00000407609.7 | 204 | −57 (CCT)3 |
| | ENST00000504228.5 | 203 | −81 (AAG)3 |
| | ENST00000311936.7 | 202 | −89 (GAA)3 |
| | ENST00000395308.5 | 202 | −70 (GCG)9 |
| | ENST00000426928.6 | 201 | −58 (GAA)3 |
| | ENST00000368780.3 | 201 | −79 (CCT)3 |
| | ENST00000356016.7 | 202 | −97 (CCT)3; −108 (GCC)3 |
| | ENST00000466186.2 | 209 | −88 (GCA)11 |
| | ENST00000374885.5 | 201 | −25 (GCC)7 |
| | ENST00000439365.6 | 201 | −135 (GAA)9 |
| | ENST00000320048.1 | 201 | −100 (GAT)3 |
| | ENST00000373521.3 | 201 | −60 (GCC)3 |
| | ENST00000498470.1 | 203 | −29 (TGC)3 |
| | ENST00000492062.1 | 205 | −96 (GAT)3 |
| | ENST00000455695.1 | 205 | −70 (GGC)5 |
| | ENST00000215798.10 | 201 | −123 (GCT)5 |
| | ENST00000448732.1 | 208 | −27 (GCC)3 |
| | ENST00000299333.7 | 201 | −29 (GGT)3 |
| | ENST00000380698.4 | 201 | −12 (GCA)3 |
| | ENST00000305628.7 | 201 | −80 (TTC)3 |
| | ENST00000330205.10 | 202 | −37 (TGG)4; −90 (TGG)4 |
| | ENST00000219548.8 | 201 | −91 (GCC)3 |
| | ENST00000272902.9 | 201 | −102 (AGC)3 |
| | ENST00000280358.4 | 201 | −105 (TGG)3 |
| | ENST00000301665.7 | 201 | −32 (CCG)3; −47 (CCG)3; −59 (CCG)3; −83 (CCG)3; −110 (CCG)3; −125 (CCG)3; −48 (GCC)3; −60 (GCC)3; −84 (GCC)3; −126 (GCC)3 |
| | ENST00000372557.1 | 202 | −53 (GCC)3 |
| | ENST00000427445.6 | 201 | −110 (GCG)3 |
| | ENST00000390419.1 | 201 | −120 (GGC)3 |
| | NC_000007.14:TRGV5:u_t_1.1 | | −48 (CTC)3 |
| | ENST00000376656.8 | 201 | −58 (CCT)4 |
| | ENST00000367926.8 | 204 | −9 (CGT)3 |
| | ENST00000326499.10 | 201 | −31 (GAA)10 |
| | NM_000553.4.1 | | −67 (GCC)3; −92 (GCC)3 |
| | ENST00000618555.4 | 205 | −67 (CCG)3 |
| | XM_006716760.1.1 | | −32 (AGG)3 |
| | ENST00000639929.1 | 212 | −57 (GAA)3 |
The number before the brackets is the start position of the STR relative to the corresponding transcription start site. “Variant no.” corresponds to the Ensembl isoform number.
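The STR formula notation explained in the note above (start position relative to the TSS, repeat motif in brackets, repeat count after the brackets) can be read programmatically. The following is a minimal Python sketch, not part of the original study; the names `StrEntry` and `parse_str_formulas` are hypothetical, and the parser assumes every position is upstream of the TSS (negative), as is the case for all entries in this table.

```python
import re
from typing import List, NamedTuple


class StrEntry(NamedTuple):
    """One STR from the 'STR formula' column (hypothetical helper type)."""
    start: int    # start position relative to the TSS; negative means upstream
    motif: str    # repeat unit, e.g. "CCA"
    repeats: int  # number of tandem copies of the motif

    @property
    def sequence(self) -> str:
        # Expanded repeat tract, e.g. (CCA)3 -> "CCACCACCA"
        return self.motif * self.repeats


# Accept the Unicode minus sign (U+2212) or an ASCII hyphen, with optional
# whitespace between the sign and the digits ("− 48" and "−48" both occur).
_FORMULA = re.compile(r"[−-]\s*(\d+)\s*\(([ACGT]+)\)\s*(\d+)")


def parse_str_formulas(cell: str) -> List[StrEntry]:
    """Return every STR found in one table cell, in order of appearance."""
    return [
        StrEntry(-int(pos), motif, int(count))
        for pos, motif, count in _FORMULA.findall(cell)
    ]


# Example using the two STRs listed for ENST00000300227.12 in the table.
for entry in parse_str_formulas("−79 (AGC)3; −57 (CGC)5"):
    print(entry.start, entry.motif, entry.repeats, entry.sequence)
```

Because the pattern matches each formula independently, a cell holding several STRs for one transcript yields one `StrEntry` per STR.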
Genome-scale human-specific core promoter tetranucleotide STRs
| Human gene symbol | Ensembl transcript ID | Variant no. | STR formula |
|---|---|---|---|
| | ENST00000345122.7 | 202 | −110 (GGGA)4 |
| | ENST00000378115.2 | 201 | −22 (TCCC)3 |
| | ENST00000622877.4 | 201 | −99 (CTCC)3 |
| | ENST00000343533.9 | 201 | −27 (GAGG)3 |
| | ENST00000542616.1 | 207 | −54 (GCCC)3 |
| | ENST00000565211.1 | 203 | −144 (GGCT)6 |
| | ENST00000388995.10 | 202 | −80 (TCTG)3 |
| | ENST00000614064.4 | 206 | −17 (GAAA)3 |
| | ENST00000375377.1 | 201 | −13 (CCGG)3 |
| | ENST00000451018.7 | 203 | −102 (GTTT)3 |
| | ENST00000267273.6 | 201 | −69 (CAGT)3 |
| | ENST00000307002.3 | 201 | −123 (GATA)13 |
| | ENST00000308941.9 | 201 | −107 (TTTA)3 |
| | ENST00000269724.5 | 201 | −75 (CCGC)3 |
| | ENST00000540314.1 | 206 | −51 (CTCC)3 |
| | ENST00000390488.1 | 201 | −124 (GCCT)7 |
| | ENST00000390475.1 | 201 | −86 (CCAC)3 |
| | ENST00000390464.2 | 201 | −109 (CACC)3 |
| | ENST00000390429.3 | 201 | −111 (GACA)3 |
The number before the brackets is the start position of the STR relative to the corresponding transcription start site. “Variant no.” corresponds to the Ensembl isoform number.
Fig. 3 Multiple sequence alignment of the TSS-flanking 5′UTRs. Examples of Clustal Omega sequence alignment are shown for the tri- (a) and tetranucleotide (b) categories. Species inclusion was based on the information available in the Ensembl database
Fig. 4 Pair-wise sequence comparison of the TSS-flanking 5′UTRs. Percent identity scoring was performed between human and other species. Asterisks represent sequence identity
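The percent identity scoring and asterisk marking mentioned in the Fig. 4 caption can be illustrated with a short Python sketch. This is not the authors' pipeline: the function names are hypothetical, the two sequences are toy examples rather than data from the study, and the denominator convention (all alignment columns except those where both sequences have a gap) is an assumption, since the record does not state which convention was used.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumed convention: columns where both sequences carry a gap ('-') are
    ignored; a gap opposite a base counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" and b == "-":
            continue  # uninformative column, skipped entirely
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_line(seq_a: str, seq_b: str) -> str:
    """Return a marker line with '*' under identical, non-gap columns."""
    return "".join(
        "*" if a == b and a != "-" else " "
        for a, b in zip(seq_a.upper(), seq_b.upper())
    )


# Toy aligned sequences (hypothetical, for illustration only).
human = "GCCGCCGCCAT"
other = "GCCGCC---AT"
print(human)
print(other)
print(identity_line(human, other))
print(f"{percent_identity(human, other):.1f}% identity")
```

Other conventions, such as dividing by the length of the shorter ungapped sequence, give different percentages for the same alignment, so the convention should be stated alongside any reported value.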