Abigail L Savage, Vivien J Bubb, Gerome Breen, John P Quinn.
Abstract
BACKGROUND: Retrotransposons are a major component of the human genome, constituting as much as 45% of it. The hominid-specific SINE-VNTR-Alus (SVAs) are the youngest of these elements and constitute 0.13% of the genome; they are therefore a practical and amenable group for analysis of their global integration, their polymorphic variation and their potential contribution to the modulation of genome regulation.
Year: 2013 PMID: 23692647 PMCID: PMC3667099 DOI: 10.1186/1471-2148-13-101
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
Figure 1. Distribution of SVAs is associated with genic regions. A - The SVA density of each human chromosome was plotted against the gene density of that chromosome, showing a positive relationship between the two variables (correlation coefficient = 0.74). The correlation coefficient was calculated with a bootstrap 95% confidence interval. B - The number of observed retrotransposons in defined regions of the human genome compared with the number expected (based on the size of the region), expressed as a percentage (SVAs χ² = 339.5, df = 2, P < 0.001; SINEs χ² = 170647, df = 2, P < 0.001; LINEs χ² = 44320, df = 2, P < 0.001; LTRs χ² = 77018, df = 2, P < 0.001). C - The distribution of SVAs within genes, intergenic regions and gene deserts, broken down by subtype and compared with their distribution across the whole human genome (genes χ² = 0.71, df = 6, P = 0.99; intergenic χ² = 0.47, df = 6, P = 0.99; gene deserts χ² = 13.91, df = 6, P < 0.05). D - The number of SVAs located within set distances upstream of a transcriptional start site (1 kb, 10 kb, 20 kb and 100 kb) (χ² = 506.8, df = 3, P < 0.001).
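The observed-versus-expected comparison in panel B of Figure 1 can be illustrated with a minimal sketch of a chi-square goodness-of-fit calculation, in which the expected count for each region is taken to be proportional to the region's size. The counts, region sizes and function names below are hypothetical placeholders rather than the study's data or code, and the percentage is assumed to mean observed divided by expected.

```python
import math


def chi_square_vs_region_size(observed, region_sizes):
    """Compare observed counts with counts expected from region size.

    Returns the chi-square statistic, the degrees of freedom and the
    observed count expressed as a percentage of the expected count.
    """
    total_count = sum(observed.values())
    total_size = sum(region_sizes.values())
    chi2 = 0.0
    percent_of_expected = {}
    for region, obs in observed.items():
        # Expected count if elements were spread evenly by region size.
        expected = total_count * region_sizes[region] / total_size
        chi2 += (obs - expected) ** 2 / expected
        percent_of_expected[region] = 100.0 * obs / expected
    df = len(observed) - 1
    return chi2, df, percent_of_expected


def p_value_df2(chi2):
    """Upper-tail P value; valid only for exactly 2 degrees of freedom,
    where the chi-square survival function reduces to exp(-x / 2)."""
    return math.exp(-chi2 / 2.0)


# Hypothetical example values (NOT taken from the paper).
observed = {"genes": 60, "intergenic": 30, "gene_deserts": 10}
region_sizes = {"genes": 40.0, "intergenic": 40.0, "gene_deserts": 20.0}

chi2, df, pct = chi_square_vs_region_size(observed, region_sizes)
p = p_value_df2(chi2)
print(f"chi2 = {chi2:.1f}, df = {df}, P = {p:.2g}")
for region, value in pct.items():
    print(f"{region}: {value:.0f}% of expected")
```

Three regions give df = 2, which matches the degrees of freedom reported for panel B; for other values of df a statistics library would be needed to obtain the P value.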
Figure 2. The primary sequence of SVAs has the potential to form G-quadruplex (G4) DNA. A - Potential G4 DNA formation was analysed in silico. The fold difference in the relative contribution of each element compared with its proportion of the whole human genome was calculated and is displayed. B - The percentage of sequence from each SVA subtype that could potentially form G4 DNA in the human genome according to Quadparser software; it was further sub-divided into the following elements: CCCTCT hexamer repeat, VNTRs and the remainder of the sequence (other). C - The relationship between VNTR and hexamer repeat length during evolution of the SVA subtypes. The average lengths are shown in base pairs. D - The fold difference in size of each of the central VNTRs from the SVA subtypes in the human genome, and their percentage contribution to potential G4 formation, compared with the values for SVA subtype F1, which has the highest value for both central VNTR length and G4 potential of the central VNTR.
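As an illustration of how an in silico G4 prediction of this kind can work, the sketch below searches both strands for the folding rule generally associated with Quadparser: four runs of at least three guanines separated by loops of 1 to 7 nucleotides. This is a simplified stand-in, not the Quadparser program itself; the parameters actually used in the study are not given in the caption, and the function names are illustrative.

```python
import re

# Four G-runs (>= 3 G each) separated by three loops of 1-7 nucleotides.
G4_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_g4_motifs(seq):
    """Return (strand, start, end, motif) for each non-overlapping match.

    Coordinates for '-' strand hits refer to the reverse-complemented
    sequence, not to the input sequence.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for match in G4_PATTERN.finditer(s):
            hits.append((strand, match.start(), match.end(), match.group()))
    return hits


# The CCCTCT hexamer contains no G itself, but its reverse complement
# (AGAGGG) supplies one G-run per repeat unit, so four units are the
# minimum needed to satisfy the rule on the opposite strand.
four_units = find_g4_motifs("CCCTCT" * 4)
three_units = find_g4_motifs("CCCTCT" * 3)
print(four_units)
print(three_units)
```

Searching the reverse complement matters here because a C-rich repeat such as CCCTCT can only contribute G-runs on the opposite strand.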
Figure 3. Primary sequence of an allele of the PARK7 SVA, identifying its different components. The human-specific PARK7 SVA located 8 kb upstream of the PARK7 gene (chr1:8012112–8013618) contains a CCCTCT hexamer VNTR, Alu-like sequence, TR, VNTR, SINE and poly-A tail. Sequences of DNA predicted by Quadparser software to have the potential to form G4 DNA are in italics; potential sites of methylation (CpGs) are underlined.
Frequency of each allelotype for the SVA in the HapMap cohort
| Allelotype | Number of individuals | Frequency (%) |
| | 19 | 21.8 |
| | 4 | 4.6 |
| | 35 | 40.2 |
| | 4 | 4.6 |
| | 4 | 4.6 |
| | 3 | 3.4 |
| | 1 | 1.1 |
| | 16 | 18.4 |
| | 1 | 1.1 |
| | 0 | 0.0 |
| Total | 87 | |
The alleles are numbered 1–4 from shortest to longest.
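The percentages in the allelotype table follow directly from the counts and the cohort total. The short sketch below reproduces that arithmetic and also checks that four alleles give ten possible unordered allele pairs, which matches the number of rows; the variable names are illustrative, and the allelotype labels are not available in this record, so rows are identified by position only.

```python
# Counts as listed in the allelotype table, in row order.
counts = [19, 4, 35, 4, 4, 3, 1, 16, 1, 0]

total = sum(counts)

# Frequency of each allelotype as a percentage of the cohort, to 1 d.p.
percentages = [round(100 * c / total, 1) for c in counts]

# With n alleles there are n * (n + 1) / 2 unordered allele pairs
# (including homozygotes).
n_alleles = 4
n_allelotypes = n_alleles * (n_alleles + 1) // 2

print(total)
print(percentages)
print(n_allelotypes)
```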
Figure 4. The SVA showed the ability to affect expression in a reporter gene construct. A - Schematic showing the genomic structure of the PARK7 SVA and its relationship to the fragments tested in the reporter gene constructs. B - The average fold activity of the different fragments from the SVA, tested in both forward and reverse orientation, over the minimal SV40 promoter alone (pGL3P) in the SK-N-AS cell line. Data were normalised to compensate for transfection efficiency. N = 4. C - The average fold activity in the MCF-7 cell line of the different fragments of the SVA, in forward and reverse orientation, over the minimal SV40 promoter alone (pGL3P), normalised to the internal control to account for transfection efficiency. N = 4. A one-tailed t-test was used to measure the significance of the fold activity of PARK7 SVA fragments over the SV40 minimal promoter alone (pGL3P) and to compare the fold activity of forward and reverse orientations. * P < 0.05, ** P < 0.01, *** P < 0.001, # P < 0.05, ## P < 0.01, ### P < 0.001.
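Fold activity over pGL3P, as plotted in panels B and C of Figure 4, can be sketched as a two-step normalisation: each well's reporter signal is divided by its internal-control signal, and the result is divided by the mean normalised signal of the promoter-only construct. The readings below are hypothetical values chosen for illustration, not measurements from the study, and the function name and data layout are assumptions.

```python
from statistics import mean


def fold_activity(construct_wells, baseline_wells):
    """Fold activity of a construct over a baseline construct.

    Each well is a (reporter_signal, internal_control_signal) pair.
    Dividing by the internal control compensates for differences in
    transfection efficiency between wells.
    """
    baseline = mean(rep / ctrl for rep, ctrl in baseline_wells)
    return [(rep / ctrl) / baseline for rep, ctrl in construct_wells]


# Hypothetical readings, N = 4 wells per construct (NOT the paper's data).
pgl3p_wells = [(100.0, 50.0), (110.0, 55.0), (90.0, 45.0), (120.0, 60.0)]
fragment_wells = [(300.0, 50.0), (280.0, 56.0), (350.0, 50.0), (360.0, 60.0)]

folds = fold_activity(fragment_wells, pgl3p_wells)
average_fold = mean(folds)
print(folds)
print(average_fold)
```

A significance test such as the one-tailed t-test named in the caption would then be applied to the per-well values; it is left out here because the caption does not say which form of the test was used.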
Sequence analysis of the four alleles identified in the SVA
| Allele | Hexamer VNTR | TR | VNTR |
| 1 | 7 | 12 | 10 |
| 2 | 10 | 12 | 11 |
| 3 | 10 | 12 | 12 |
| 4 | 13 | 12 | 12 |
Genomic DNA from individuals in the CEU (Utah residents with Northern and Western European ancestry from the CEPH collection) HapMap cohort was analysed. The length variation detected occurred in the CCCTCT hexamer repeat (termed the hexamer VNTR) and within a second repetitive VNTR region further downstream. A further repetitive domain, here termed the TR and located upstream of the second VNTR, was not found to vary between individuals in this cohort. The alleles were numbered 1–4 from shortest to longest. One example of each type of allele was sequenced.