Beniamino Trombetta, Daniele Sellitto, Rosaria Scozzari, Fulvio Cruciani.
Abstract
It has long been believed that the male-specific region of the human Y chromosome (MSY) is genetically independent from the X chromosome. This idea has been recently dismissed due to the discovery that X-Y gametologous gene conversion may occur. However, the pervasiveness of this molecular process in the evolution of sex chromosomes has yet to be exhaustively analyzed. In this study, we explored how pervasive X-Y gene conversion has been during the evolution of the youngest stratum of the human sex chromosomes. By comparing about 0.5 Mb of human-chimpanzee gametologous sequences, we identified 19 regions in which extensive gene conversion has occurred. From our analysis, two major features of these emerged: 1) Several of them are evolutionarily conserved between the two species and 2) almost all of the 19 hotspots overlap with regions where X-Y crossing-over has been previously reported to be involved in sex reversal. Furthermore, in order to explore the dynamics of X-Y gametologous conversion in recent human evolution, we resequenced these 19 hotspots in 68 widely divergent Y haplogroups and used publicly available single nucleotide polymorphism data for the X chromosome. We found that at least ten hotspots are still active in humans. Hence, the results of the interspecific analysis are consistent with the hypothesis of widespread reticulate evolution within gametologous sequences in the differentiation of hominini sex chromosomes. In turn, intraspecific analysis demonstrates that X-Y gene conversion may modulate human sex-chromosome-sequence evolution to a greater extent than previously thought.
Keywords: X–Y gene conversion; human Y chromosome; recombination hotspots; sex chromosome evolution
Year: 2014 PMID: 24817545 PMCID: PMC4104316 DOI: 10.1093/molbev/msu155
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Figure. (A) Possible variant sites within a four-way alignment of the orthologous and gametologous sequences from human and chimpanzee sex chromosomes. Different types of sites are shown: C, C-site or conversion site; N, N-site or nonconversion site; S, S-site (or singleton); M, multiple mutation site; W, other complex site. Invariant sites are not shown. Classification according to Kijima and Innan (2010). (B) Different molecular mechanisms for the formation of variable sites. A type-S site may arise if a mutation occurs in one sex chromosome after human–chimpanzee speciation, whereas an N-site may be generated by a mutation occurring before speciation. If a gene conversion event involves a type-N nucleotide, it generates an S-site, whereas a conversion of an S-site may generate a C-nucleotide or an invariant site depending on the direction of conversion. M- and W-sites may arise only with multiple mutational events.
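The site categories described in the caption can be illustrated with a small classifier. This is a minimal sketch, not the authors' pipeline. It assumes the four aligned bases are given in the order human X, human Y, chimpanzee X, chimpanzee Y. Two further assumptions: any column with more than two distinct bases is an M-site, and the remaining two-by-two pattern (human X matching chimpanzee Y) is a W-site. Columns containing gaps or ambiguous bases are skipped. The function names `classify_site` and `count_sites` are hypothetical.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def classify_site(xh, yh, xp, yp):
    """Classify one alignment column (human X, human Y, chimp X, chimp Y)."""
    bases = (xh, yh, xp, yp)
    if any(b not in VALID_BASES for b in bases):
        return "skipped"            # gap or ambiguous base (assumption: ignore)
    distinct = set(bases)
    if len(distinct) == 1:
        return "invariant"
    if len(distinct) > 2:
        return "M"                  # assumption: >2 bases needs multiple mutations
    if 3 in Counter(bases).values():
        return "S"                  # singleton: one sequence differs from the rest
    # Only 2-2 splits remain from here on.
    if xh == yh:
        return "C"                  # gametologs match within species, differ between
    if xh == xp:
        return "N"                  # orthologs match, X differs from Y
    return "W"                      # assumption: the remaining 2-2 pattern


def count_sites(xh_seq, yh_seq, xp_seq, yp_seq):
    """Tally site classes over four aligned sequences of equal length."""
    seqs = (xh_seq, yh_seq, xp_seq, yp_seq)
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must have equal length")
    columns = zip(*(s.upper() for s in seqs))
    return Counter(classify_site(*col) for col in columns)


# Toy alignment: one column each of C, N, S, W, and invariant.
toy = count_sites("AAAAC", "AGAGC", "GAAGC", "GGTAC")
```

On the toy alignment, each of the five categories is counted once. Real alignments would additionally need a rule for indels, which this sketch deliberately leaves out.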
C-Site-Enriched Regions.
| CER ID | Human ChrX Positionᵃ | Human ChrY Positionᵃ | Chimpanzee ChrX Positionᵃ | Chimpanzee ChrY Positionᵃ | Lengthᵇ (bp) | C-Sites | N-Sites | SHYᶜ | SPYᶜ | SHXᶜ | SPXᶜ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CER1 | 3679860–3680460 | 7091216–7091814 | 3675948–3676546 | 22898996–22899593 | 601 | 4 | 0 | 10 | 0 | 30 | 0 |
| CER2 | 3671777–3672707 | 7096711–7097641 | 3667630–3668558 | 22893085–22894013 | 931 | 11 | 0 | 6 | 6 | 6 | 3 |
| CER3 | 3603266–3604004 | 7163099–7163862 | 3593349–3594110 | 22826827–22827584 | 776 | 9 | 8 | 15 | 8 | 11 | 4 |
| CER4 | 3600301–3601725 | 7165484–7166941 | 3590386–3591806 | 22823805–22825223 | 1,458 | 7 | 5 | 31 | 8 | 5 | 7 |
| CER5 | 3596033–3596381 | 7168440–7168782 | 3585803–3586149 | 22821952–22822298 | 351 | 4 | 0 | 27 | 1 | 0 | 0 |
| CER6 | 3594322–3594833 | 7169931–7170442 | 3584088–3584597 | 22820296–22820805 | 493 | 5 | 1 | 7 | 4 | 3 | 9 |
| CER7 | 3591725–3593007 | 7171778–7173059 | 3581500–3582777 | 22817686–22818962 | 1,283 | 8 | 5 | 8 | 12 | 22 | 10 |
| CER8 | 3559783–3560184 | 7208985–7209385 | AACZ03178410_random: 258–657 | 22781358–22781750 | 402 | 4 | 0 | 5 | 3 | 0 | 1 |
| CER9 | 3533911–3534399 | 7234931–7235419 | 3537792–3538276 | 22753816–22754299 | 489 | 4 | 0 | 2 | 7 | 4 | 2 |
| CER10 | 3529192–3530503 | 7239647–7240956 | 3532811–3534120 | 22748581–22749881 | 1,321 | 8 | 15 | 28 | 18 | 7 | 8 |
| CER11 | 3525449–3525876 | 7246240–7246670 | 3529302–3529727 | 22742867–22743271 | 441 | 4 | 0 | 5 | 16 | 22 | 6 |
| CER12 | 3455115–3455541 | 7319098–7319526 | 3454452–3454878 | 22663835–22664261 | 429 | 8 | 0 | 3 | 3 | 8 | 2 |
| CER13 | 3436059–3436907 | 7333693–7334559 | 3435069–3435917 | 22643731–22644600 | 884 | 10 | 60 | 8 | 11 | 9 | 2 |
| CER14 | 3362952–3363523 | 7418498–7419244 | 3353968–3354721 | 22552803–22553555 | 757 | 6 | 0 | 2 | 8 | 28 | 2 |
| CER15 | 3278881–3278953 | 14051551–14051623 | GL393313_random: 318610–318682 | 24848003–24848075 | 73 | 4 | 1 | 1 | 0 | 1 | 4 |
| CER16 | 3092315–3092940 | 14295656–14296272 | GL393313_random: 140911–141524 | 22203042–22203656 | 627 | 8 | 2 | 11 | 2 | 30 | 0 |
| CER17 | 2842671–2843504 | 14483248–14484067 | 2842671–2843504 | 17463306–17464121 | 884 | 12 | 0 | 14 | 1 | 20 | 0 |
| CER18 | 2841823–2842477 | 14484255–14484908 | 2833742–2834393 | 17464311–17464962 | 655 | 4 | 22 | 9 | 5 | 1 | 6 |
| CER19 | 2837016–2837544 | 14490130–14490661 | 2830934–2831463 | 17470174–17470703 | 535 | 6 | 0 | 4 | 26 | 8 | 4 |
| Non-CERᵈ | 3457797–3464821 | 7311695–7318744 | 3485445–3492520 | 22664624–22671623 | 7,498 | 2 | 625 | 44 | 55 | 25 | 27 |
ᵃ Genomic position is according to GRCh37/hg19 for Homo sapiens and CGSC 2.1.3 for Pan troglodytes.
ᵇ Number of base pairs of the four-way alignment.
ᶜ Number of singletons within the alignment. SHY, S-sites in the human Y chromosome; SPY, S-sites in the chimpanzee Y chromosome; SHX, S-sites in the human X chromosome; SPX, S-sites in the chimpanzee X chromosome.
ᵈ Non-CER portion of the four-way alignment representative of stratum 5.
Testing for Gene Conversion in CERs.
| CER ID | Lengthᵃ (bp) | Observed C-Sites | Expected C-Sites | P Value | Proportion of C-Sites (PC) | Tree Shapeᵇ | Evidence for Conversionᶜ (Hsa) | Evidence for Conversionᶜ (Ptr) | Evidence for Conversion by SD Analysisᵈ |
|---|---|---|---|---|---|---|---|---|---|
| CER1 | 601 | 4 | 0.43 | 1 × 10⁻³ | 1 | ((((Xp,Yp),Yh),Xh),Xo) | No | Yes | Yes |
| CER2 | 931 | 11 | 0.28 | <10⁻¹² | 1 | ((Xp,Yp),(Yh,Xh),Xo) | Yes | Yes | Yes |
| CER3 | 776 | 9 | 0.59 | 1.5 × 10⁻⁸ | 0.53 | ((Xp,Yp),(Yh,Xh),Xo) | Yes | Yes | Yes |
| CER4 | 1,458 | 7 | 0.54 | 1.6 × 10⁻⁶ | 0.58 | ((((Xp,Yp),Xh),Yh),Xo) | No | Yes | Yes |
| CER5 | 351 | 4 | 0.27 | 1.8 × 10⁻⁴ | 1 | (((Xp,Yp),Xh,Yh),Xo) | No | Yes | No |
| CER6 | 493 | 5 | 0.29 | 1.3 × 10⁻⁵ | 0.83 | ((((Xh,Yh),Yp),Xp),Xo) | Yes | No | Yes |
| CER7 | 1,283 | 8 | 0.48 | 4.7 × 10⁻⁸ | 0.61 | ((((Xp,Yp),Yh),Xh),Xg) | No | Yes | Yes |
| CER8 | 402 | 4 | 0.08 | 1.79 × 10⁻⁶ | 1 | ((((Xp,Yp),Xh),Yh),Xo) | No | Yes | Yes |
| CER9 | 489 | 4 | 0.15 | 2 × 10⁻⁵ | 1 | ((Xp,Yp),(Yh,Xh),Xg) | Yes | Yes | Yes |
| CER10 | 1,321 | 8 | 0.41 | 1.46 × 10⁻⁸ | 0.35 | ((((Yp,Yh),Xh),Xp),Xg) | No | No | Yes |
| CER11 | 441 | 4 | 0.59 | 3.3 × 10⁻³ | 1 | ((((Yp,Xp),Yh),Xh),Xo) | No | Yes | Yes |
| CER12 | 429 | 8 | 0.32 | 1.8 × 10⁻⁹ | 1 | ((Xp,Yp),(Xh,Yh),Xo) | Yes | Yes | Yes |
| CER13 | 884 | 10 | 0.42 | 3.2 × 10⁻¹¹ | 0.14 | ((Yh,Yp),(Xh,Xp),Xo) | No | No | Yes |
| CER14 | 757 | 6 | 1.25 | 1.8 × 10⁻³ | 1 | ((((Xp,Yp),Yh),Xh),Xg) | No | Yes | No |
| CER15 | 73 | 4 | 0.451 | 1.2 × 10⁻³ | 0.8 | ((Xp,Yp),(Xh,Yh),Xo) | Yes | Yes | Yes |
| CER16 | 627 | 8 | 0.88 | 4.13 × 10⁻⁶ | 0.8 | ((((Xp,Yp),Yh),Xh),Xo) | No | Yes | Yes |
| CER17 | 884 | 12 | 0.5 | <10⁻¹² | 1 | ((((Xp,Yp),Yh),Xh),Xo) | No | Yes | Yes |
| CER18 | 655 | 4 | 0.16 | 2.7 × 10⁻⁵ | 0.15 | ((((Yp,Yh),Xh),Xp),Xg) | No | No | Yes |
| CER19 | 535 | 6 | 0.59 | 3.4 × 10⁻⁵ | 1 | ((Xp,Yp),(Xh,Yh),Xo) | Yes | Yes | Yes |
| Non-CERᵉ | 7,498 | 2 | 0.51 | 9.3 × 10⁻² | 3 × 10⁻² | ((Yh,Yp),(Xh,Xp),Xg) | No | No | No |
ᵃ Length (bp) of the four-way alignment.
ᵇ Shape of the NJ tree in Newick format. Xp and Yp are the X and Y chromosomes of Pan troglodytes, respectively; Xh and Yh are the X and Y chromosomes of Homo sapiens, respectively; Xo and Xg indicate the X chromosome of Pongo pygmaeus and Gorilla gorilla, respectively.
ᶜ Evidence for gene conversion based on the tree shape analysis. Yes or no means the presence or absence of gene conversion in NJ-phylogenetic analyses, respectively.
ᵈ Evidence for gene conversion based on the Split Decomposition (SD) analysis. Yes or no means the presence or absence of a reticulation, respectively.
ᵉ Non-CER portion of the four-way alignment representative of stratum 5.
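The tree-shape calls in the table appear to follow one rule: a species is scored "Yes" when its own X and Y sequences form a two-leaf sister pair in the NJ tree. The sketch below applies that rule to the Newick strings. The rule is inferred from the table rows rather than quoted from the authors' methods. The helper names `cherries` and `tree_shape_evidence` are hypothetical, and the regular expression assumes Newick strings without branch lengths or whitespace, as printed in the table.

```python
import re

# Matches an innermost group holding exactly two leaf labels, e.g. "(Xp,Yp)".
CHERRY = re.compile(r"\(([^(),]+),([^(),]+)\)")


def cherries(newick):
    """Return the set of two-leaf sister pairs found in a Newick string."""
    return {frozenset(m.groups()) for m in CHERRY.finditer(newick)}


def tree_shape_evidence(newick):
    """Score each species by whether its X and Y are sister leaves."""
    pairs = cherries(newick)
    return {
        "Hsa": frozenset({"Xh", "Yh"}) in pairs,   # human X and Y cluster together
        "Ptr": frozenset({"Xp", "Yp"}) in pairs,   # chimpanzee X and Y cluster together
    }


cer1 = tree_shape_evidence("((((Xp,Yp),Yh),Xh),Xo)")
cer2 = tree_shape_evidence("((Xp,Yp),(Yh,Xh),Xo)")
non_cer = tree_shape_evidence("((Yh,Yp),(Xh,Xp),Xg)")
```

For CER5, `(((Xp,Yp),Xh,Yh),Xo)`, the human sequences sit in an unresolved three-way group rather than a sister pair, so the sketch returns `False` for Hsa, matching the "No" in the table.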
Figure. Distribution of CERs on the human X chromosome: (A) Schematic representation of the X chromosome. (B) The different tracks report the following features (from top to bottom): scale bar; genomic coordinates from the GRCh37 human genome reference sequence; regions of the X chromosomes used for the four-way alignment; position of CERs (red: CERs involved in recent X-to-Y gene conversion); previously known X–Y gene conversion hotspots (in green, ARSDP and HSA hotspots); UCSC genes.
Figure. Distribution of CERs on the human Y chromosome: (A) Schematic representation of the Y chromosome. (B) The different tracks report the following features (from top to bottom): scale bar; genomic coordinates from the GRCh37 human genome reference sequence; regions of the X chromosomes used for the four-way alignment; position of CERs (red: CERs involved in recent X-to-Y gene conversion); previously known X–Y gene conversion hotspots (in green, ARSDP and HSA hotspots); UCSC genes (the ARSDP position is inferred by BLAT analysis).
List of Y-Polymorphisms Identified in This Study.
| SNP | Y Positionᵃ | Mutation | Gametologous X Baseᵇ | Haplogroupᶜ | CER | dbSNP ID |
|---|---|---|---|---|---|---|
| V290 | 7091768 | A to T | A | A3b-M114 | 1 | — |
| V335 | 7163481 | C to T | C | E1b1a1-M2 | 3 | rs9785813 |
| V336 | 7163514 | A to T | A | A3a-M28 | 3 | — |
| V337 | 7163703 | Del TAG | Ins | E1b1c-M329 | 3 | — |
| V301 | 7169816 | C to T | C | A2-M6 | 6 | — |
| V302 | 7169899 | C to T | C | DE | 6 | — |
| V318.1 | 7170441 | A to C | C | E1b1a1g1a*-M290* | 6 | — |
| V318.2 | 7170441 | A to C | C | E2b*-M98* | 6 | — |
| V327 | 7171673 | Ins AC | del | E1b1b1a*-V68* | 7 | — |
| V328 | 7171834 | G to T | G | J1d-P56 | 7 | — |
| V329 | 7171957 | G to A | G | H1-M52 | 7 | rs35147341 |
| V330 | 7172251 | C to G | C | B2b4b-M211 | 7 | — |
| V331 | 7172768 | G to A | G | B1*-M236* | 7 | — |
| V332.1 | 7173143 | G to A | A | E1b1a1a-M58 | 7 | rs9786714 |
| V332.2 | 7173143 | G to A | A | F2-IJ-K | 7 | rs9786714 |
| V283 | 7209376 | G to A | G | E1b1a1*-M2* | 8 | — |
| V284 | 7209401 | G to T | G | E1a*-M33* | 8 | — |
| V334 | 7209078 | C to G | C | O2b-P49 | 8 | — |
| V297 | 7234782 | C to T | C | J2-M172 | 9 | — |
| V299 | 7235118 | C to A | C | B2b*-M112* | 9 | — |
| V300 | 7235379 | G to A | G | A3b-M144 | 9 | rs35195174 |
| V319 | 7239870 | C to T | (C/T) rs145173948 | B2a1*-M218* | 10 | — |
| V320 | 7240836 | A to C | A | O-M175 | 10 | — |
| V339 | 7246397 | A to G | A | A1b-V148 | 11 | — |
| V279 | 7319200 | T to C | T | E1b1b1b1-M34 | 12 | — |
| V280 | 7319490 | C to T | C | A3b-M144 | 12 | — |
| V281 | 7319576 | A to C | A | A2-V218 | 12 | — |
| V282 | 7319580 | T to C | T | E1b1b1-M35 | 12 | — |
| V340 | 7333516 | A to G | A | R1b1a*-V88* | 13 | — |
| V291 | 7333543 | C to T | C | J2b-M12 | 13 | — |
| V292 | 7334323 | T to C | T | O1a2-M50 | 13 | — |
| V293 | 7418611 | G to A | (C/G) rs187021908 | F*-M89* | 14 | — |
| V294 | 7419117 | C to T | C | B2b3-M30 | 14 | — |
| V295 | 7419266 | G to A | G | A3a-M28 | 14 | — |
| V296 | 7419349 | G to C | G | A2-M6 | 14 | — |
| V321 | 14051677 | C to T | C | A2-M114 | 15 | — |
| V322 | 14051686 | A to G | G | E*-M96* | 15 | — |
| V323 | 14051820 | C to T | C | A3a-M28 | 15 | — |
| V324 | 14051832 | C to A | C | E1b1b1a1d-V65 | 15 | — |
| V325 | 14051863 | C to T | T | B2b4b-M211 | 15 | — |
| V326 | 14051913 | G to A | G | A3b1b-V11 | 15 | — |
| V285 | 14295658 | C to T | C | O-M175 | 16 | — |
| V286 | 14295751 | C to G | C | C1-M8 | 16 | — |
| V287 | 14295977 | G to A | G | D2a1-M125 | 16 | — |
| V288 | 14296291 | G to A | NA | A3a-M28 | 16 | — |
| V289 | 14296306 | T to G | NA | B2a1a2a2a*-P50* | 16 | — |
| V270 | 14482977 | G to C | G | A3-M32 | 17 | rs34555243 |
| V271 | 14483622 | T to C | C | C3c-M48 | 17 | — |
| V272 | 14483887 | A to C | A | B2b4b-M211 | 17 | — |
| V333 | 14483992 | G to C | G | O1a2-M50 | 17 | — |
| V273 | 14484106 | G to T | G | B1*-M236* | 17 | — |
| V274 | 14484379 | A to C | C | I-M170 | 18 | rs113822196 |
| V275 | 14484394 | C to T | T | I*-M170* | 18 | rs113686211 |
| V257 | 14484596 | C to T | C | E1b1b1b-V257 | 18 | — |
| V276 | 14484897 | G to A | G | A2-T | 18 | — |
| V277 | 14490108 | G to A | G | D2a1-M125 | 19 | — |
ᵃ Position according to the February 2009 Y-chromosome reference sequence of the UCSC genome browser.
ᵇ Gametologous base on the X chromosome. NA, no X–Y alignment. The two X-SNPs have a MAF < 1%.
ᶜ Haplogroup nomenclature by lineage and last mutation. Nomenclature according to Karafet et al. (2008) and Scozzari et al. (2012).
Figure. SNPs identified in the sequenced regions. To the left, a simplified version of the Y-chromosome tree showing the phylogenetic relationships of the chromosomes analyzed (Karafet et al. 2008; Scozzari et al. 2012). SNP names and CER numbers are given at the top and at the bottom, respectively. Square colors represent the allelic state for each SNP: white, ancestral allele; red, the SNP has arisen in an X–Y GSV and the derived state is equal to the gametologous base on the X; green, the SNP has arisen in an identical site between X and Y; ×-marked squares, missing data. SNPs V288 and V289 are in regions that do not align with any region of the X chromosome (see table 3).
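The red and green categories in the caption can be derived from two columns of the Y-polymorphism table: the mutation (ancestral to derived) and the gametologous base on the X. The following is a minimal sketch under that reading. The function name `classify_y_snp` and the returned labels are hypothetical. Indels, polymorphic X positions, and entries without an X–Y alignment (NA) are deliberately left unclassified rather than guessed at.

```python
def classify_y_snp(mutation, x_base):
    """Compare a Y substitution ("A to C") with the gametologous X base."""
    parts = mutation.split(" to ")
    if len(parts) != 2 or x_base not in {"A", "C", "G", "T"}:
        return "unclassified"            # indel, polymorphic X site, or NA
    ancestral, derived = parts
    if x_base == derived:
        return "derived_matches_X"       # "red": candidate X-to-Y conversion
    if x_base == ancestral:
        return "ancestral_matches_X"     # "green": X and Y were identical here
    return "X_differs_from_both"


v318 = classify_y_snp("A to C", "C")     # V318.1 and V318.2 in the table
v290 = classify_y_snp("A to T", "A")     # V290
v288 = classify_y_snp("G to A", "NA")    # V288: no X-Y alignment
```

A derived Y allele that matches the X base is only compatible with X-to-Y conversion. An independent point mutation to the same base would produce the same label.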