Jungnam Lee, Seyoung Mun, Thomas J Meyer, Kyudong Han.
Abstract
Approximately 80 long interspersed element (LINE-1 or L1) copies are able to retrotranspose actively in the human genome, and these are termed retrotransposition-competent L1s. The 5' untranslated region (UTR) of the human-specific L1 contains an internal promoter and several transcription factor binding sites. To better understand the effect of the L1 5' UTR on the evolution of human-specific L1s, we examined this population of elements, focusing on the sequence diversity and accumulated substitutions within their 5' UTRs. Using network analysis, we estimated the age of each L1 component (the 5' UTR, ORF1, ORF2, and 3' UTR). Through the comparison of the L1 components based on their estimated ages, we found that the 5' UTR of human-specific L1s accumulates mutations at a faster rate than the other components. To further investigate the L1 5' UTR, we examined the substitution frequency per nucleotide position among them. The results showed that the L1 5' UTRs shared relatively conserved transcription factor binding sites, despite their high sequence diversity. Thus, we suggest that the high level of sequence diversity in the 5' UTRs could be one of the factors controlling the number of retrotransposition-competent L1s in the human genome during the evolutionary battle between L1s and their host genomes.
Year: 2012 PMID: 22400009 PMCID: PMC3286893 DOI: 10.1155/2012/129416
Source DB: PubMed Journal: Comp Funct Genomics ISSN: 1531-6912
Figure 1. Substitution frequencies along the 5′ UTR sequences of human-specific L1 elements. The percentage of substitutions per nucleotide position along the 5′ UTR sequence was calculated. The structure of the L1 5′ UTR is shown at the top. The TF binding sites, sense promoter (SP), antisense promoter (ASP), CpG sites, and CpG islands are indicated by colored boxes and lines. (a) 443 full-length human-specific L1 elements. (b) 36 L1 elements active in vitro. (c) 41 L1 elements dead in vitro.
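The per-position percentage plotted in Figure 1 can be illustrated with a minimal sketch. It assumes that the 5′ UTR sequences are already aligned to a consensus of equal length and that gap characters are skipped rather than counted. The function name, the gap handling, and the toy sequences are illustrative assumptions, not the procedure used in the study.

```python
from typing import List


def substitution_percent(consensus: str, aligned: List[str], gap: str = "-") -> List[float]:
    """Percentage of sequences that differ from the consensus at each position.

    Assumptions (not from the paper): every sequence in `aligned` has the same
    length as `consensus`, and gap characters are ignored instead of being
    counted as substitutions.
    """
    percents = []
    for i, ref in enumerate(consensus):
        # Collect the non-gap bases observed in this alignment column
        column = [seq[i] for seq in aligned if seq[i] != gap]
        if not column:
            percents.append(0.0)
            continue
        mismatches = sum(1 for base in column if base.upper() != ref.upper())
        percents.append(100.0 * mismatches / len(column))
    return percents


# Toy alignment (hypothetical data, not taken from the study)
toy_consensus = "ACGT"
toy_alignment = ["ACGT", "ACGA", "TCGA", "ACGT"]
print(substitution_percent(toy_consensus, toy_alignment))  # [25.0, 0.0, 0.0, 50.0]
```

Averaging these per-position values over the coordinates of a binding site would give a region-level figure of the kind reported in the table of average substitution frequencies.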
Average frequency of substitutions across the promoter and TF binding sites of the L1 5′ UTR.
| Region | Active L1 elements (%) | Dead L1 elements (%) | All L1 elements in the human genome (%) |
|---|---|---|---|
| Sense promoter | 0.66 | 0.99 | 2.14 |
| YY1 | 0.15 | 0.27 | 1.38 |
| Runx3 | 0.00 | 0.26 | 0.53 |
| SRY-1 | 0.00 | 0.49 | 0.81 |
| Runx3 ASP | 0.29 | 0.26 | 0.86 |
| SRY-2 | 0.00 | 0.00 | 0.41 |
| 5′ UTR overall | 0.53 | 0.78 | 1.62 |
Age estimation of human-specific L1 elements based on each L1 component.
| Subfamily (a) | Subfamily (b) | No. of full-length L1 elements | 5′ UTR age ± SD (myrs) | ORF1 age ± SD (myrs) | ORF2 age ± SD (myrs) | 3′ UTR only age ± SD (myrs) | pORF2 and 3′ UTR age ± SD (myrs) (b) |
|---|---|---|---|---|---|---|---|
| L1PA3 | L1PA3-1A | 106 | 20.30 ± 0.65 | 12.83 ± 0.71 | 11.31 ± 0.73 | 15.32 ± 2.11 | 12.71 ± 0.63 |
| L1PA2 | L1PA2-1B | 147 | 13.85 ± 0.38 | 9.15 ± 0.59 | 9.63 ± 0.69 | 11.32 ± 1.14 | 7.62 ± 0.47 |
| L1HS-1AB | L1HS-1A | 32 | 8.74 ± 0.53 | 5.50 ± 0.57 | 4.88 ± 0.63 | 10.48 ± 1.87 | 5.09 ± 0.56 |
| L1HS-preTa | L1HS-preTa | 62 | 7.22 ± 0.41 | 3.95 ± 0.41 | 3.79 ± 0.89 | 4.15 ± 0.84 | 3.13 ± 0.25 |
| L1HS-Ta0 | L1HS-Ta0 | 38 | 4.90 ± 0.33 | 3.25 ± 0.42 | 3.29 ± 0.47 | 2.96 ± 0.51 | 2.73 ± 0.22 |
| L1HS-Ta1 | L1HS-Ta1 | 58 | 4.38 ± 0.65 | 2.35 ± 0.43 | 2.28 ± 0.44 | 2.29 ± 0.41 | 1.94 ± 0.20 |
| Average | | 443 (c) | 9.90 ± 0.49 | 6.17 ± 0.52 | 5.86 ± 0.64 | 7.75 ± 1.15 | 5.54 ± 0.39 |

(a) In this study.
(b) Source: Lee et al. [23].
(c) Total number of L1 elements.
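As a quick consistency check on the table above, the sketch below recomputes the "Average" row as the unweighted mean of the six subfamily age estimates. The values are copied from the table. The unweighted mean is an assumption about how the row was derived, although it reproduces the published figures to two decimals.

```python
# Age estimates (myrs) copied from the table, one tuple per subfamily, in the
# order: 5' UTR, ORF1, ORF2, 3' UTR only, pORF2 and 3' UTR.
ages = {
    "L1PA3":      (20.30, 12.83, 11.31, 15.32, 12.71),
    "L1PA2":      (13.85, 9.15, 9.63, 11.32, 7.62),
    "L1HS-1AB":   (8.74, 5.50, 4.88, 10.48, 5.09),
    "L1HS-preTa": (7.22, 3.95, 3.79, 4.15, 3.13),
    "L1HS-Ta0":   (4.90, 3.25, 3.29, 2.96, 2.73),
    "L1HS-Ta1":   (4.38, 2.35, 2.28, 2.29, 1.94),
}
components = ("5' UTR", "ORF1", "ORF2", "3' UTR only", "pORF2 and 3' UTR")


def component_means(table):
    """Unweighted mean age per component across all subfamilies."""
    rows = list(table.values())
    return [sum(row[i] for row in rows) / len(rows) for i in range(len(rows[0]))]


means = component_means(ages)
for name, value in zip(components, means):
    print(f"{name}: {value:.2f} myrs")
```

The recomputed means (9.90, 6.17, 5.86, 7.75, and 5.54 myrs) match the Average row, and the 5′ UTR estimate is the largest of the five components.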
Figure 2. Average age estimates of human-specific L1 elements. N represents the number of samples. The pORF2 + 3′ UTR data are from Lee et al. [23]. * indicates a significant difference (P < 0.0001, Welch's t-test).
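For readers unfamiliar with the test named in the Figure 2 caption, the following is a minimal sketch of Welch's unequal-variance t statistic and the Welch-Satterthwaite degrees of freedom. The function name and the two toy samples are hypothetical; the per-element age estimates behind Figure 2 are not reproduced here.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance t-test on two samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (sample variance / n)
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Toy samples (hypothetical, not data from the study)
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
t_stat, dof = welch_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {dof:.2f}")
```

Converting the statistic to a P value requires the Student t distribution; in practice a library routine such as `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the full test.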
Figure 3. CpG-to-GpC ratio in L1 components. The vertical axis represents the ratio of CpG to GpC dinucleotide loci in each L1 component. The highest ratio is observed in the 5′ UTR.
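The ratio shown in Figure 3 can be sketched as a simple dinucleotide count. The function below counts CpG (`CG`) and GpC (`GC`) occurrences on a single strand and divides one by the other. The function name and the toy sequence are assumptions for illustration, and the counting rules used in the study may differ.

```python
def cpg_to_gpc_ratio(sequence: str) -> float:
    """Ratio of CpG to GpC dinucleotide counts in a single-stranded sequence."""
    seq = sequence.upper()
    # Slide a 2-base window along the sequence and tally each dinucleotide
    cpg = sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG")
    gpc = sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == "GC")
    if gpc == 0:
        raise ValueError("sequence contains no GpC dinucleotides")
    return cpg / gpc


# Toy sequence (hypothetical): one CpG and two GpC dinucleotides
print(cpg_to_gpc_ratio("AAGCTTGCCG"))  # 0.5
```

Because GpC has the same base composition as CpG, it is commonly used as a baseline of this kind: a ratio well below 1 is consistent with depletion of CpG sites.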