Cristina Aguado1, Magdalena Gayà-Vidal1, Sergi Villatoro1, Meritxell Oliva1, David Izquierdo1, Carla Giner-Delgado2, Víctor Montalvo1, Judit García-González1, Alexander Martínez-Fundichely1, Laia Capilla1, Aurora Ruiz-Herrera3, Xavier Estivill4, Marta Puig1, Mario Cáceres5.
Abstract
In recent years, different types of structural variants (SVs) have been discovered in the human genome and their functional impact has become increasingly clear. Inversions, however, are poorly characterized and more difficult to study, especially those mediated by inverted repeats or segmental duplications. Here, we describe the results of a simple and fast inverse PCR (iPCR) protocol for high-throughput genotyping of a wide variety of inversions using a small amount of DNA. In particular, we analyzed 22 inversions predicted in humans ranging from 5.1 kb to 226 kb and mediated by inverted repeat sequences of 1.6-24 kb. First, we validated 17 of the 22 inversions in a panel of nine HapMap individuals from different populations, and we genotyped them in 68 additional individuals of European origin, with correct genetic transmission in ∼12 mother-father-child trios. Global inversion minor allele frequency varied between 1% and 49% and inversion genotypes were consistent with Hardy-Weinberg equilibrium. By analyzing the nucleotide variation and the haplotypes in these regions, we found that only four inversions have linked tag-SNPs and that in many cases there are multiple shared SNPs between standard and inverted chromosomes, suggesting an unexpectedly high degree of inversion recurrence during human evolution. iPCR was also used to check 16 of these inversions in four chimpanzees and two gorillas, and 10 showed both orientations either within or between species, providing additional support for their multiple origins. Finally, we have identified several inversions that include genes in the inverted or breakpoint regions, and at least one disrupts a potential coding gene. Thus, these results represent a significant advance in our understanding of inversion polymorphism in human populations and challenge the common view of a single origin of inversions, with important implications for inversion analysis in SNP-based studies.
Year: 2014 PMID: 24651690 PMCID: PMC3961182 DOI: 10.1371/journal.pgen.1004208
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Figure 1. Diagram of iPCR validation of inversions mediated by inverted repeats (IRs).
Standard and inverted arrangements are represented by unique regions A, B, C and D, which are separated by IR1 and IR2 at each inversion breakpoint (BP). The iPCR involves three steps: digestion at the restriction enzyme target sites (RE), circularization of the digested fragments by self-ligation, and PCR amplification with primers (small arrows) flanking the restriction site to generate products of different size depending on the orientation of the segment between both repeats.
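The readout described in this caption can be sketched as a small genotype-calling function. This is only an illustration, not the authors' pipeline: it assumes the standard arrangement has the order A-IR1-B ... C-IR2-D, so that AB and CD products indicate the standard orientation and AC and BD products indicate the inverted one. The function name and the string labels are hypothetical.

```python
# Hypothetical sketch: the junction-to-orientation mapping below is an
# assumption based on the A/B/C/D layout described in the caption.
STD_JUNCTIONS = {"AB", "CD"}  # products expected from the standard arrangement
INV_JUNCTIONS = {"AC", "BD"}  # products expected from the inverted arrangement


def call_genotype(products):
    """Call an inversion genotype from the set of detected iPCR products."""
    found = set(products)
    has_std = bool(found & STD_JUNCTIONS)
    has_inv = bool(found & INV_JUNCTIONS)
    if has_std and has_inv:
        return "Std/Inv"
    if has_std:
        return "Std/Std"
    if has_inv:
        return "Inv/Inv"
    return "no call"


if __name__ == "__main__":
    print(call_genotype(["AB", "BD"]))
    print(call_genotype(["AC"]))
```

For hemizygous samples (chromosome X in males, chromosome Y) a single product corresponds to a single chromosome, so the diploid labels returned here would need to be adapted.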
Summary of the characteristics of the analyzed inversion predictions and iPCR validation results.
| Inversion | Chr. | Breakpoint 1 (NCBI36/HG18) | Breakpoint 2 (NCBI36/HG18) | Inverted repeat (IR) | IR1/IR2 size (kb) | Inversion size (kb) | iPCR fragments (kb) | Validation status | iPCR/PEM consistency |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| HsInv0114 | chr9 | 125778473–125781206 | 125793139–125795872 | TEs/SDs | 2.7/2.7 | 14.7 | 13.8–17.8 | Inversion | Yes (9/9) |
| HsInv0124 | chr11 | 300225–301836 | 307945–309555 | SDs | 4.5/8.9 | 7.7 | 2.3–14.8 | Inversion | Yes (9/9) |
| HsInv0209 | chr11 | 70953485–70960513 | 70965419–70972446 | SDs | 7.0/7.0 | 11.9 | 9.2–11.7 | Inversion | Yes (9/9) |
| HsInv0241 | chr2 | 241264234–241270541 | 241280391–241286847 | SDs | 13.3/12.8 | 16.2 | 17–20.1 | Inversion | Yes (7/7) |
| HsInv0272 | chr5 | 178802777–178811616 | 178857371–178867555 | SDs | 8.8/10.2 | 55.3 | 8.8–17.6 | Not validated | Yes (9/9) |
| HsInv0278 | chr5 | 180455356–180458186 | 180460457–180463248 | SDs | 2.8/2.8 | 5.1 | 3.3–13.8 | Inversion | Yes (8/8) |
| HsInv0286 | chr7 | 54258468–54259261 | 54353522–54354315 | TEs/SDs | 15.9/15.9 | 95.1 | 23.3–41.5 | Inversion | Yes (9/9) |
| HsInv0306 | chr8 | 2272127–2276815 | 2311945–2316621 | L1PA7, AluSc | 4.7/4.7 | 39.8 | 9.7–25.6 | Complex region | - |
| HsInv0340 | chr13 | 63188985–63193242 | 63237334–63241591 | SDs | 10.6/10.6 | 48.4 | 31.6–39.9 | Inversion | Yes (9/9) |
| HsInv0341 | chr13 | 79291903–79294413 | 79312733–79315243 | L1PA3 | 6.2/6.2 | 20.8 | 7.2–17.1 | Inversion | Yes (9/9) |
| HsInv0344 | chr14 | 34079802–34086813 | 34094204–34101227 | SDs | 7.2/7.2 | 14.4 | 9.8–28.7 | Inversion | Yes (9/9) |
| HsInv0347 | chr14 | 60141001–60142582 | 60148138–60149719 | THE1C | 1.6/1.6 | 7.1 | 4.3–9.2 | Inversion | Yes (8/8) |
| HsInv0389 | chrX | 153219602–153228808 | 153266422–153275629 | SDs | 11.3/11.3 | 46.8 | 12–30.8 | Inversion | Yes (9/9) |
| HsInv0393 | chrX | 100739178–100743833 | 100753236–100757891 | SDs | 4.7/4.7 | 14.1 | 7.3–8.5 | Inversion | Yes (9/9) |
| HsInv0396 | chrX | 72132652–72141803 | 72214354–72223499 | SDs | 9.5/9.5 | 81.7 | 12.3–17.9 | Inversion | Yes (9/9) |
| HsInv0397 | chrX | 105396369–105408238 | 105417867–105429736 | SDs | 32.2/15.1 | 21.5 | 18.4–20.7 | Inversion | Yes (8/8) |
| HsInv0403 | chrX | 75278106–75282269 | 75285318–75289479 | TEs/SDs | 4.3/4.3 | 7.2 | 8.2–10.1 | Inversion | Yes (5/5) |
| HsInv0414 | chrX | 151985038–151993457 | 151994653–152003088 | SDs | 8.4/8.4 | 9.6 | - | Complex region | - |
| HsInv0526 | chr11 | 116509421–116514763 | 116581672–116587255 | SDs | 5.3/5.6 | 72.4 | 9.1–27.6 | Not validated | Yes (9/9) |
| HsInv0710 | chr8 | 2167585–2182892 | 2316618–2331389 | SDs | 15.3/14.8 | 148.8 | 25.6–33.2 | Complex region | - |
| HsInv0832 | chrY | 16496132–16504854 | 16517493–16526218 | SDs | 8.7/8.7 | 21.4 | 15.5–37.4 | Inversion | Yes (1/1) |
| HsInv1051 | chr17 | 18442024–18466271 | 18667812–18692134 | SDs | 24.2/24.3 | 225.8 | 46.4–72.7 | Inversion | Yes (9/9) |
Inversion size is calculated as the distance between the middle positions of the two breakpoint intervals.
Consistency between the iPCR and paired-end mapping (PEM) results is indicated by the number of individuals in which the iPCR genotype is consistent with the presence of fosmids supporting the Std or Inv orientation after excluding those mapping within the IRs (see Table S2).
HsInv0306 includes this inversion prediction and HsInv0312, whereas HsInv0710 includes this inversion prediction and HsInv0311 [54].
The inverted region of inversion HsInv0403 shows the opposite orientation in the GRCh37 (HG19) assembly.
For inversion predictions in which no inverted sequence was available, breakpoint coordinates correspond to the whole length of the segmental duplications (SDs) annotated in HG18 or the IRs defined by BLAST alignment of the two breakpoints.
IRs include sequences of transposable elements (TEs) L2, MLT1K, MLT1C, AluSq, L3, and MIRb, although the whole region has been annotated as SD in HG19.
IRs are annotated as SDs in HG18, but not in HG19 since they are completely formed by multiple partial copies of different TEs.
IRs are not annotated as SDs in HG18 or HG19, but contain unique sequences and parts of different TEs (L2, L4, MER41C, MER4A, and MamGyp).
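The inversion sizes in the table above are described as the distance between the midpoints of the two breakpoint intervals. A minimal sketch of that arithmetic, using the HsInv0114 and HsInv0278 coordinates from the table, is shown here; the function name is hypothetical.

```python
def inversion_size_kb(bp1, bp2):
    """Distance in kb between the midpoints of two (start, end) breakpoint intervals."""
    mid1 = (bp1[0] + bp1[1]) / 2
    mid2 = (bp2[0] + bp2[1]) / 2
    return abs(mid2 - mid1) / 1000


if __name__ == "__main__":
    # HsInv0114 (chr9), coordinates taken from the table
    size = inversion_size_kb((125778473, 125781206), (125793139, 125795872))
    print(round(size, 1))  # 14.7
    # HsInv0278 (chr5)
    size = inversion_size_kb((180455356, 180458186), (180460457, 180463248))
    print(round(size, 1))  # 5.1
```

Both values agree with the sizes listed in the table for these two inversions.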
Figure 2. Quantification of self-ligation efficiency in iPCR with different DNA dilutions and different reagents.
A. Effect of DNA dilution during ligation on iPCR amplification of staggered-end fragments from HsInv0340 of 32 kb (AB) and 40 kb (BD) and the A–D concatamer. B. Effect of different reagent concentrations during ligation on iPCR amplification of blunt-end fragments from HsInv0286 of 33 kb (BD) and 41 kb (AB). The A–D fragment with 10 ng/µl of DNA and the 41 kb fragment with no additional reagent were selected as references to calculate the relative amplification in A and B, respectively. NA18517 DNA, which is heterozygous for both inversions, was used in the analysis.
Figure 3. Multiplex iPCR results of two validated inversions in nine human samples.
A. HsInv0403 ABD and ACD iPCRs. Band sizes are: AB, 364 bp; BD, 239 bp; AC, 350 bp; CD, 225 bp; and AD, 265 bp. B. HsInv0209 ABC and BCD iPCRs. Band sizes are AB, 435 bp; AC, 243 bp; BD, 543 bp; CD, 351 bp; and BC, 470 bp. For both panels the genomic DNA samples are: 1, negative control; 2, NA12156; 3, NA12878; 4, NA15510; 5, NA18507; 6, NA18517; 7, NA18555; 8, NA18956; 9, NA19129; 10, NA19240; 11, DNA without restriction enzyme; 12, DNA without T4 DNA ligase; and L, 100 bp DNA Ladder (Invitrogen).
Summary information of the genotyping in CEU individuals, gene effect, and evolutionary history for the 17 validated polymorphic inversions in the human genome.
| Inversion | Chr. | N | Inversion frequency | Observed heterozygosity | Valid families | Genes affected (breakpoints) | Genes affected (inverted region) | iPCR results (chimp) | iPCR results (gorilla) | Ancestral orientation |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| HsInv0114 | chr9 | 92 | 0.64 | 0.50 | 12/12 | no | no | | | |
| HsInv0124 | chr11 | 92 | 0.39 | 0.43 | 12/12 | | | | | |
| HsInv0209 | chr11 | 90 | 0.02 | 0.04 | 12/12 | | no | | | |
| HsInv0241 | chr2 | 92 | 0.16 | 0.28 | 12/12 | | no | | | |
| HsInv0278 | chr5 | 92 | 0.10 | 0.20 | 12/12 | | no | | | |
| HsInv0286 | chr7 | 72 | 0.47 | 0.50 | 3/3 | no | no | | | |
| HsInv0340 | chr13 | 90 | 0.01 | 0.02 | 11/11 | no | | | | |
| HsInv0341 | chr13 | 92 | 0.03 | 0.07 | 12/12 | no | no | | | |
| HsInv0344 | chr14 | 92 | 0.59 | 0.48 | 12/12 | | no | | ND | Unknown |
| HsInv0347 | chr14 | 92 | 0.10 | 0.15 | 12/12 | no | no | | | |
| HsInv0389 | chrX | 69/46 | 0.17 | 0.17 | 14/14 | no | | | | |
| HsInv0393 | chrX | 66/44 | 0.36 | 0.55 | 13/13 | | no | | | |
| HsInv0396 | chrX | 69/46 | 0.16 | 0.22 | 14/14 | | no | | | |
| HsInv0397 | chrX | 68/46 | 0.18 | 0.35 | 13/13 | no | no | | | |
| HsInv0403 | chrX | 69/46 | 0.26 | 0.30 | 14/14 | no | no | | | Unknown |
| HsInv0832 | chrY | 25 | 0 | 0 | 7/7 | no | no | | | Unknown |
| HsInv1051 | chr17 | 86 | 0 | 0 | 9/9 | | 7 genes | ND | ND | |

N, number of chromosomes from unrelated individuals.
For inversions located on chr. X, the first number refers to the chromosomes used to calculate the frequency of the inversion and the second number to the chromosomes in females used to calculate the heterozygosity.
iPCR results derive from the analysis of four chimpanzees and two gorillas and those not coinciding with the current orientation of the species genome (panTro4 or gorGor3) are marked with an asterisk.
For HsInv0389, iPCR data from chimpanzees is consistent with the experimental analysis of Cáceres et al. [59]. ND, not determined.
Estimation of the ancestral orientation is based mainly on the iPCR results for chimpanzee and gorilla.
Those cases in which the phylogenetic trees are informative and support the results from the iPCR are shown in boldface. For HsInv1051, the ancestral state is based on the chimpanzee genome and the disruption of the CCDC144B gene in the inverted orientation.
Genes completely included within the inversion are TBC1D28, ZNF286B, FOXO3B, TRIM16L, FBXW10, FAM18B1, and DKFZp434O1826.
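The abstract states that inversion genotypes were consistent with Hardy-Weinberg equilibrium. As a rough illustration of what that comparison involves, the sketch below computes the heterozygosity expected under Hardy-Weinberg proportions, 2p(1 - p), from an allele frequency p. It assumes that the value following N in each row of the table above is the inversion allele frequency; that reading and the function name are assumptions, and the authors' actual test is not described in this section.

```python
def expected_heterozygosity(p):
    """Expected heterozygote frequency 2p(1 - p) for a biallelic locus under HWE."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("allele frequency must be between 0 and 1")
    return 2.0 * p * (1.0 - p)


if __name__ == "__main__":
    # Assumed reading of the table: HsInv0114 frequency 0.64, observed heterozygosity 0.50
    p, observed = 0.64, 0.50
    print(f"expected = {expected_heterozygosity(p):.2f}, observed = {observed:.2f}")
```

A formal test would compare observed and expected genotype counts, which are not listed in the table.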
Nucleotide variation data from HapMap and 1000 Genomes Project (1000GP) for 14 polymorphic inversions with >2 inverted chromosomes in the CEU population.
| Inversion | HapMap N | HapMap SNPs | HapMap fixed | HapMap shared | 1000GP N | 1000GP SNPs | 1000GP fixed | 1000GP shared | 1000GP haplotype sites | Fst | πall | πStd | πInv |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Inverted region | | | | | | | | | | | | | |
| HsInv0114 | 46 | 16 | 0 | 1 | 28 | 43 | 2 | 2 | 11932 | | 0.00064 | 0.00061 | 0.00020 |
| HsInv0124 | 46 | 2 | 0 | 1 | 28 | 18 | 0 | 6 | 6109 | | 0.00066 | 0.00065 | 0.00039 |
| HsInv0209 | 45 | 6 | 0 | 1 | 28 | 17 | 0 | 0 | 4906 | 0.41 | 0.00095 | 0.00093 | - |
| HsInv0241 | 46 | 3 | 0 | 0 | 28 | 10 | 0 | 0 | 3178 | | 0.00098 | 0.00093 | 0.00063 |
| HsInv0278 | 46 | 2 | 0 | 2 | 28 | 17 | 0 | 0 | 2271 | | 0.00258 | 0.00255 | 0.00053 |
| HsInv0286 | 33 | 51 | 1 | 3 | 21 | 393 | 5 | 9 | 74342 | | 0.00077 | 0.00087 | 0.00035 |
| HsInv0341 | 45 | 19 | 0 | 12 | 28 | 55 | 0 | 28 | 16881 | 0.00 | 0.00101 | 0.00101 | 0.00122 |
| HsInv0344 | 46 | 3 | 0 | 1 | 28 | 36 | 0 | 20 | 7368 | | 0.00095 | 0.00069 | 0.00086 |
| HsInv0347 | 46 | 3 | 2 | 0 | 28 | 11 | 2 | 0 | 5543 | | 0.00025 | 0.00011 | 0.00036 |
| HsInv0389 | 46 | 16 | 0 | 14 | 28 | 60 | 0 | 35 | 37610 | | 0.00030 | 0.00015 | 0.00034 |
| HsInv0393 | 44 | 3 | 0 | 3 | 27 | 16 | 0 | 4 | 9399 | | 0.00030 | 0.00014 | 0.00025 |
| HsInv0396 | 46 | 29 | 0 | 0 | 28 | 192 | 12 | 5 | 71906 | | 0.00081 | 0.00060 | 0.00004 |
| HsInv0397 | 45 | 0 | 0 | 0 | 27 | 3 | 0 | 0 | 8906 | 0.00 | 0.00001 | 0.00001 | 0.00000 |
| HsInv0403 | 46 | 1 | 0 | 1 | 28 | 5 | 0 | 5 | 2796 | | 0.00050 | 0.00056 | 0.00039 |
| Flanking region | | | | | | | | | | | | | |
| HsInv0114 | 46 | 14 | 0 | 4 | 28 | 80 | 2 | 12 | 19949 | | 0.00071 | 0.00053 | 0.00030 |
| HsInv0124 | 46 | 16 | 0 | 9 | 28 | 110 | 0 | 43 | 19999 | | 0.00141 | 0.00108 | 0.00121 |
| HsInv0209 | 45 | 24 | 0 | 0 | 28 | 106 | 0 | 0 | 19998 | 0.16 | 0.00164 | 0.00163 | - |
| HsInv0241 | 46 | 14 | 0 | 3 | 28 | 60 | 0 | 7 | 19999 | | 0.00098 | 0.00099 | 0.00036 |
| HsInv0278 | 46 | 17 | 0 | 9 | 28 | 95 | 0 | 7 | 19993 | | 0.00121 | 0.00119 | 0.00074 |
| HsInv0286 | 33 | 11 | 1 | 0 | 21 | 121 | 1 | 2 | 19987 | | 0.00082 | 0.00095 | 0.00038 |
| HsInv0341 | 45 | 23 | 0 | 10 | 28 | 61 | 0 | 16 | 19992 | 0.00 | 0.00099 | 0.00100 | 0.00107 |
| HsInv0344 | 46 | 11 | 0 | 10 | 28 | 73 | 0 | 35 | 19991 | | 0.00081 | 0.00064 | 0.00069 |
| HsInv0347 | 46 | 14 | 5 | 0 | 28 | 32 | 10 | 2 | 19999 | | 0.00020 | 0.00002 | 0.00033 |
| HsInv0389 | 45 | 19 | 0 | 17 | 28 | 34 | 0 | 22 | 20000 | | 0.00031 | 0.00019 | 0.00048 |
| HsInv0393 | 43 | 13 | 0 | 7 | 27 | 32 | 0 | 10 | 19996 | | 0.00039 | 0.00022 | 0.00042 |
| HsInv0396 | 45 | 10 | 0 | 0 | 28 | 42 | 1 | 1 | 19998 | | 0.00049 | 0.00037 | 0.00010 |
| HsInv0397 | 45 | 1 | 0 | 1 | 27 | 15 | 0 | 1 | 20000 | | 0.00008 | 0.00008 | 0.00003 |
| HsInv0403 | 46 | 9 | 0 | 0 | 28 | 58 | 0 | 18 | 20000 | | 0.00088 | 0.00045 | 0.00064 |

N, number of unrelated individuals analyzed.
Fst values were calculated comparing the Inv and Std inferred haplotypes.
* P<0.05.
The inverted region includes the region within the IRs that mediated the inversion.
The flanking region corresponds to 10 kb outside each IR.
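The π columns in the table above report nucleotide diversity. A minimal sketch of the textbook definition (average number of pairwise differences between haplotypes, divided by the number of sites analyzed) is given below. The estimator actually used by the authors is not stated in this section, so this is an assumption, and the function name and toy haplotypes are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(haplotypes, n_sites):
    """Average pairwise differences per site among equal-length haplotype strings.

    `haplotypes` holds the alleles at the variable positions only;
    `n_sites` is the total number of sites analyzed in the region.
    """
    pairs = list(combinations(haplotypes, 2))
    if not pairs:
        raise ValueError("at least two haplotypes are required")
    total_diffs = sum(
        sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs
    )
    return total_diffs / len(pairs) / n_sites


if __name__ == "__main__":
    # Toy data, not taken from the paper: three haplotypes over three SNPs
    toy = ["AAT", "AAT", "GAT"]
    print(nucleotide_diversity(toy, n_sites=1000))
```

Computing the value separately for Std haplotypes, Inv haplotypes, and all haplotypes together would give the three π columns, under the same assumption about the estimator.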
Figure 4. Distribution of SNPs along 14 polymorphic inversions in CEU individuals.
SNP distribution was calculated according to the haplotypes inferred from the 1000 Genomes Project data by PHASE. Both the inverted region and the 10 kb flanking regions (indicated by black arrows) are represented. Polymorphic SNPs in the Std or Inv arrangement are shown in grey, fixed SNPs between arrangements are shown in red, and shared SNPs between arrangements are shown in blue.
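The three SNP classes used in this figure can be illustrated with a short classification sketch. It assumes biallelic SNPs and phased haplotypes already assigned to the Std or Inv arrangement; the function name and class labels are hypothetical and only mirror the wording of the caption.

```python
def classify_snp(std_alleles, inv_alleles):
    """Classify one biallelic SNP from the alleles seen on Std and Inv chromosomes."""
    std, inv = set(std_alleles), set(inv_alleles)
    if not std or not inv:
        raise ValueError("both arrangements need at least one chromosome")
    if len(std) == 1 and len(inv) == 1:
        # Each arrangement carries a single allele: fixed if the alleles differ
        return "fixed" if std != inv else "monomorphic"
    if len(std) > 1 and len(inv) > 1:
        # Both alleles segregate in both arrangements
        return "shared"
    # Variable in only one arrangement
    return "polymorphic in Std" if len(std) > 1 else "polymorphic in Inv"


if __name__ == "__main__":
    print(classify_snp("AAAA", "GG"))
    print(classify_snp("AAGA", "AG"))
    print(classify_snp("AAGA", "AA"))
```

Shared SNPs are the class the abstract interprets as evidence for recurrence, since a single inversion event without gene flux would not be expected to leave both alleles on both arrangements.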
Figure 5. Haplotype network of four human polymorphic inversions from phased HapMap SNP data showing unique or recurrent origins.
A. HsInv0114. B. HsInv0286. C. HsInv0341. D. HsInv0389. Circles correspond to the different haplotypes found for the region of the inversion and circle sizes are proportional to the frequency of each haplotype. Std and Inv haplotypes are represented in yellow and blue, respectively. Small red circles represent hypothetical haplotypes. Nucleotide changes between haplotypes are indicated as red numbers.