Beniamino Trombetta, Eugenia D'Atanasio, Andrea Massaia, Natalie M Myres, Rosaria Scozzari, Fulvio Cruciani, Andrea Novelletto.
Abstract
Factors affecting the rate and pattern of the mutational process are being identified for human autosomes, but the same relationships for the male-specific portion of the Y chromosome (MSY) are not established. We considered 3,390 mutations occurring in 19 sequence bins identified by sequencing 1.5 Mb of the MSY from each of 104 present-day chromosomes. The occurrence of mutations was not proportional to the number of sequenced bases in each bin, with a 2-fold variation. The regression of the number of mutations per unit sequence against a number of indicators of the genomic features of each bin revealed the same fundamental patterns as in the autosomes. By considering the sequences of the same region from two precisely dated ancient specimens, we obtained a calibrated region-specific substitution rate of 0.716 × 10⁻⁹/site/year. Despite its lack of recombination and other peculiar features, the MSY thus resembles the autosomes in displaying a marked regional heterogeneity of the mutation rate. An immediate implication is that a given figure for the substitution rate only makes sense if bound to a specific DNA region. By strictly applying this principle we obtained an unbiased estimate of the antiquity of lineages relevant to the genetic history of the human Y chromosome. In particular, the two deepest nodes of the tree highlight the survival, in Central-Western Africa, of lineages whose coalescence (291 ky, 95% C.I. 253-343) predates the emergence of anatomically modern features in the fossil record.
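To give a feel for the scale of the calibrated rate, the short sketch below converts it into an expected number of substitutions over the sequenced region. It uses only figures quoted in the abstract; applying a single uniform rate to all 1.5 Mb and treating the rate as constant over time are simplifying assumptions made here for illustration, and the variable names are hypothetical.

```python
# Figures quoted in the abstract.
rate_per_site_per_year = 0.716e-9     # calibrated region-specific substitution rate
sequenced_sites = 1.5e6               # about 1.5 Mb of MSY sequenced per chromosome
deepest_coalescence_years = 291_000   # point estimate for the deepest coalescence

# Assumption (illustration only): one uniform, constant rate over all sequenced sites.
mutations_per_year = rate_per_site_per_year * sequenced_sites
years_per_mutation = 1 / mutations_per_year
expected_along_lineage = mutations_per_year * deepest_coalescence_years

print(f"Expected substitutions per year over the region: {mutations_per_year:.3e}")
print(f"Average years between substitutions on one lineage: {years_per_mutation:.0f}")
print(f"Expected substitutions on one lineage over 291 ky: {expected_along_lineage:.0f}")
```

Under these assumptions a single lineage accumulates roughly one substitution per nine centuries in the sequenced region, which is why a rate is only meaningful when tied to the specific stretch of DNA it was calibrated on.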
Year: 2015 PMID: 26226630 PMCID: PMC4520482 DOI: 10.1371/journal.pone.0134646
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Table 1. Features of the 5 genomic regions considered in this work, subdivided into 19 bins.
| Region | Bin n. | Initial pos. | Final pos. | Genomic span | N. of baited fragments | Effectively sequenced bases | Gene content | CpG content | Replication time score (BG02 cell line) |
|---|---|---|---|---|---|---|---|---|---|
| A | 1 | 2690918 | 2910156 | 219239 | 285 | 91454 | 6.20 | 1318 | 1.14 |
| B | 2 | 6655517 | 6673365 | 17849 | 33 | 8275 | 0 | 70 | -0.30 |
| C | 3 | 7540768 | 7671720 | 130953 | 289 | 79812 | 0.56 | 1028 | -0.77 |
| C | 4 | 7671760 | 7945341 | 273582 | 394 | 79655 | 0.31 | 584 | -0.87 |
| C | 5 | 7946953 | 8156861 | 209909 | 457 | 79642 | 0 | 1004 | -0.71 |
| C | 6 | 8156863 | 8382932 | 226070 | 397 | 79674 | 0 | 660 | -0.72 |
| C | 7 | 8383002 | 8489128 | 106127 | 186 | 78966 | 0 | 1090 | -0.93 |
| C | 8 | 8489176 | 8595654 | 106479 | 147 | 79820 | 0.62 | 1222 | -0.78 |
| C | 9 | 8595830 | 8739563 | 143734 | 257 | 80846 | 0.77 | 832 | -0.49 |
| D | 10 | 14629906 | 14887780 | 257875 | 256 | 83305 | 6.64 | 1184 | 1.28 |
| D | 11 | 14888375 | 15056945 | 168571 | 218 | 83295 | 13.54 | 1090 | 1.46 |
| D | 12 | 15056995 | 15436522 | 379528 | 332 | 82726 | 3.52 | 710 | 1.21 |
| D | 13 | 15436558 | 15727001 | 290444 | 282 | 83244 | 5.42 | 952 | 0.18 |
| D | 14 | 15730403 | 15957784 | 227382 | 291 | 83106 | 2.00 | 1292 | -0.16 |
| E | 15 | 18553309 | 18714071 | 160763 | 246 | 84781 | 0 | 614 | -0.62 |
| E | 16 | 18714094 | 18912763 | 198670 | 292 | 84043 | 0 | 550 | -0.56 |
| E | 17 | 18912843 | 19148792 | 235950 | 330 | 84363 | 0 | 634 | -0.66 |
| E | 18 | 19148988 | 19343219 | 194232 | 309 | 83940 | 0 | 622 | -0.85 |
| E | 19 | 19343595 | 19549929 | 206335 | 273 | 84565 | 0 | 638 | -0.68 |
1. GRCh37/hg19 coordinates
2. as % bases overlapping UCSC genes in the sequenced fragments
3. as positions residing in CpG's in the sequenced fragments
4. classified as non-palindromic ampliconic in ref. [18] with no highly similar paralogues on the Y chromosome
Fig 1. Maximum parsimony tree showing the phyletic relationships of 104 chromosomes.
The individual Id's (as in S1 Table) are reported on the right, aligned with the corresponding branch. Clades corresponding to major haplogroups are bracketed or indicated individually at the far right, following the nomenclature of van Oven et al. [64] (note the difference with ref. [4]). Branches are numbered (in italics) and mutations assigned to them are listed in S2 Table. Note that the tree is unrooted and variants defining branch 0 are identified solely as different from the reference sequence. The clade corresponding to Hg E1b1b-M35 has been collapsed since it is discussed in detail in a dedicated paper [65].
Table 2. Distribution of mutational events in the 19 bins, whole tree.
| Region | Bin n. | Effectively sequenced bases | All positions: Abs. | All positions: /100kb | All positions: Tr | All positions: Tv | CpG's: Abs. | CpG's: /100kb | CpG's: Tr | CpG's: Tv | Non-CpG's: Abs. | Non-CpG's: /100kb | Non-CpG's: Tr | Non-CpG's: Tv |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A | 1 | 91454 | 178 | 194.6 | 111 | 67 | 20 | 1441.2 | 19 | 1 | 158 | 175.3 | 92 | 66 |
| B | 2 | 8275 | 15 | 181.3 | 8 | 7 | 2 | 845.9 | 2 | 0 | 13 | 158.4 | 6 | 7 |
| C | 3 | 79812 | 199 | 249.3 | 126 | 73 | 27 | 1288.0 | 26 | 1 | 172 | 218.3 | 100 | 72 |
| C | 4 | 79655 | 178 | 223.5 | 108 | 70 | 20 | 733.2 | 17 | 3 | 158 | 199.8 | 91 | 67 |
| C | 5 | 79642 | 147 | 184.6 | 91 | 56 | 13 | 1260.6 | 11 | 2 | 134 | 170.4 | 80 | 54 |
| C | 6 | 79674 | 185 | 232.2 | 105 | 80 | 18 | 828.4 | 16 | 2 | 167 | 211.4 | 89 | 78 |
| C | 7 | 78966 | 279 | 353.3 | 172 | 107 | 55 | 1380.3 | 53 | 2 | 224 | 287.6 | 119 | 105 |
| C | 8 | 79820 | 201 | 251.8 | 141 | 60 | 29 | 1530.9 | 28 | 1 | 172 | 218.8 | 113 | 59 |
| C | 9 | 80846 | 196 | 242.4 | 123 | 73 | 16 | 1029.1 | 16 | 0 | 180 | 225.0 | 107 | 73 |
| D | 10 | 83305 | 165 | 198.1 | 114 | 51 | 19 | 1421.3 | 17 | 2 | 146 | 177.8 | 97 | 49 |
| D | 11 | 83295 | 157 | 188.5 | 98 | 59 | 17 | 1308.6 | 16 | 1 | 140 | 170.3 | 82 | 58 |
| D | 12 | 82726 | 150 | 181.3 | 92 | 58 | 16 | 858.3 | 16 | 0 | 134 | 163.4 | 76 | 58 |
| D | 13 | 83244 | 181 | 217.4 | 121 | 60 | 20 | 1143.6 | 18 | 2 | 161 | 195.6 | 103 | 58 |
| D | 14 | 83106 | 169 | 203.4 | 103 | 66 | 15 | 1554.6 | 13 | 2 | 154 | 188.2 | 90 | 64 |
| E | 15 | 84781 | 172 | 202.9 | 106 | 66 | 17 | 724.2 | 15 | 2 | 155 | 184.2 | 91 | 64 |
| E | 16 | 84043 | 164 | 195.1 | 109 | 55 | 19 | 654.4 | 18 | 1 | 145 | 173.7 | 91 | 54 |
| E | 17 | 84363 | 252 | 298.7 | 153 | 99 | 30 | 751.5 | 28 | 2 | 222 | 265.1 | 125 | 97 |
| E | 18 | 83940 | 202 | 240.6 | 133 | 69 | 22 | 741.0 | 20 | 2 | 180 | 216.0 | 113 | 67 |
| E | 19 | 84565 | 200 | 236.5 | 131 | 69 | 21 | 754.4 | 18 | 3 | 179 | 213.3 | 113 | 66 |
| Total | | 1495512 | 3390 | | 2145 | 1245 | 396 | | 367 | 29 | 2994 | | 1778 | 1216 |
| χ² (17 d.f.) | | | 115.6 | | 73.0 | 58.2 | 76.3 | | 80.4 | n.t. | 75.8 | | 37.1 | 58.2 |
| P | | | 2.20×10⁻¹⁶ | | 6.63×10⁻⁹ | 2.07×10⁻⁶ | 1.75×10⁻⁹ | | 3.31×10⁻¹⁰ | | 2.14×10⁻⁹ | | 0.003 | 2.05×10⁻⁶ |
1. Mutations at CpG dinucleotides every 100 kb of positions in CpG dinucleotides.
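The "/100kb" densities and the test of proportionality between mutation counts and sequenced bases can be illustrated with a minimal sketch. The counts below are transcribed from the "all positions" columns of the table above; the helper names are hypothetical, and the chi-square shown is a plain goodness-of-fit statistic over all 19 bins. How the paper arrives at 17 degrees of freedom is not described here, so this sketch is not expected to reproduce the tabulated χ² value.

```python
# Per-bin values transcribed from the table above (bins 1 to 19, in order).
sequenced_bases = [91454, 8275, 79812, 79655, 79642, 79674, 78966, 79820, 80846,
                   83305, 83295, 82726, 83244, 83106,
                   84781, 84043, 84363, 83940, 84565]
mutations = [178, 15, 199, 178, 147, 185, 279, 201, 196,
             165, 157, 150, 181, 169,
             172, 164, 252, 202, 200]


def density_per_100kb(count, bases):
    """Mutations per 100 kb of effectively sequenced bases."""
    return count / bases * 100_000


def chi_square_proportional(observed, sizes):
    """Goodness-of-fit statistic against counts proportional to bin size.

    Assumption: every bin is kept as its own class (no pooling).
    """
    total_observed = sum(observed)
    total_size = sum(sizes)
    statistic = 0.0
    for obs, size in zip(observed, sizes):
        expected = total_observed * size / total_size
        statistic += (obs - expected) ** 2 / expected
    return statistic


densities = [density_per_100kb(m, b) for m, b in zip(mutations, sequenced_bases)]
chi2 = chi_square_proportional(mutations, sequenced_bases)

for bin_number, density in enumerate(densities, start=1):
    print(f"bin {bin_number:2d}: {density:6.1f} mutations/100kb")
print(f"max/min density ratio: {max(densities) / min(densities):.2f}")
print(f"chi-square over 19 unpooled bins: {chi2:.1f}")
```

With the transcribed counts, the highest density (bin 7) is a little under twice the lowest, in line with the 2-fold variation noted in the abstract.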
Table 3. Distribution of mutational events in the 19 bins, by haplogroup.
| Reg. | Bin n. | A00: Abs. | A00: /100kb | A0: Abs. | A0: /100kb | A1: Abs. | A1: /100kb | A2'3: Abs. | A2'3: /100kb | B: Abs. | B: /100kb | DE: Abs. | DE: /100kb | CF: Abs. | CF: /100kb | R1: Abs. | R1: /100kb |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| A | 1 | 18 | 19.7 | 16 | 17.5 | 9 | 9.8 | 27 | 29.5 | 30 | 32.8 | 43 | 47.0 | 28 | 30.6 | 3 | 3.3 |
| B | 2 | 1 | 12.1 | 0 | 0.0 | 0 | 0.0 | 6 | 72.5 | 0 | 0.0 | 5 | 60.4 | 2 | 24.2 | 1 | 12.1 |
| C | 3 | 23 | 28.8 | 13 | 16.3 | 7 | 8.8 | 27 | 33.8 | 40 | 50.1 | 34 | 42.6 | 50 | 62.6 | 23 | 28.8 |
| C | 4 | 20 | 25.1 | 15 | 18.8 | 12 | 15.1 | 21 | 26.4 | 27 | 33.9 | 35 | 43.9 | 39 | 49.0 | 13 | 16.3 |
| C | 5 | 23 | 28.9 | 4 | 5.0 | 9 | 11.3 | 17 | 21.3 | 15 | 18.8 | 34 | 42.7 | 41 | 51.5 | 16 | 20.1 |
| C | 6 | 18 | 22.6 | 10 | 12.6 | 5 | 6.3 | 24 | 30.1 | 33 | 41.4 | 46 | 57.7 | 43 | 54.0 | 15 | 18.8 |
| C | 7 | 32 | 40.5 | 28 | 35.5 | 15 | 19.0 | 37 | 46.9 | 46 | 58.3 | 59 | 74.7 | 57 | 72.2 | 21 | 26.6 |
| C | 8 | 31 | 38.8 | 21 | 26.3 | 8 | 10.0 | 23 | 28.8 | 24 | 30.1 | 46 | 57.6 | 41 | 51.4 | 14 | 17.5 |
| C | 9 | 20 | 24.7 | 9 | 11.1 | 10 | 12.4 | 32 | 39.6 | 29 | 35.9 | 45 | 55.7 | 41 | 50.7 | 13 | 16.1 |
| D | 10 | 16 | 19.2 | 18 | 21.6 | 4 | 4.8 | 22 | 26.4 | 27 | 32.4 | 39 | 46.8 | 30 | 36.0 | 10 | 12.0 |
| D | 11 | 13 | 15.6 | 12 | 14.4 | 7 | 8.4 | 33 | 39.6 | 22 | 26.4 | 31 | 37.2 | 37 | 44.4 | 15 | 18.0 |
| D | 12 | 16 | 19.3 | 13 | 15.7 | 6 | 7.3 | 27 | 32.6 | 23 | 27.8 | 30 | 36.3 | 29 | 35.1 | 12 | 14.5 |
| D | 13 | 12 | 14.4 | 13 | 15.6 | 3 | 3.6 | 30 | 36.0 | 29 | 34.8 | 45 | 54.1 | 43 | 51.7 | 10 | 12.0 |
| D | 14 | 21 | 25.3 | 14 | 16.8 | 12 | 14.4 | 23 | 27.7 | 25 | 30.1 | 37 | 44.5 | 34 | 40.9 | 13 | 15.6 |
| E | 15 | 18 | 21.2 | 12 | 14.2 | 8 | 9.4 | 29 | 34.2 | 26 | 30.7 | 40 | 47.2 | 31 | 36.6 | 10 | 11.8 |
| E | 16 | 23 | 27.4 | 14 | 16.7 | 2 | 2.4 | 23 | 27.4 | 25 | 29.7 | 34 | 40.5 | 40 | 47.6 | 12 | 14.3 |
| E | 17 | 29 | 34.4 | 11 | 13.0 | 7 | 8.3 | 47 | 55.7 | 39 | 46.2 | 50 | 59.3 | 53 | 62.8 | 20 | 23.7 |
| E | 18 | 26 | 31.0 | 13 | 15.5 | 5 | 6.0 | 33 | 39.3 | 27 | 32.2 | 35 | 41.7 | 54 | 64.3 | 22 | 26.2 |
| E | 19 | 33 | 39.0 | 6 | 7.1 | 7 | 8.3 | 37 | 43.8 | 27 | 31.9 | 41 | 48.5 | 41 | 48.5 | 18 | 21.3 |
| Total | | 393 | | 242 | | 136 | | 518 | | 514 | | 729 | | 734 | | 261 | |
| χ² (17 d.f.) | | 33.3 | | 38.3 | | 26.3 | | 29.9 | | 33.6 | | 25.8 | | 36.2 | | 32.4 | |
| P | | 0.010 | | 0.002 | | 0.069 | | 0.027 | | 0.009 | | 0.079 | | 0.004 | | 0.013 | |
1. Nomenclature as in ref. [64]
2. Includes R1
3. The sum across haplogroups is not 3390 as some Hg's are not shown and R1 is also considered in CF
Fig 2. Dated tree including 104 subjects plus the Ust'-Ishim [27] and Loschbour [28] specimens (arrowed).
The latter were used as calibration points, by means of normally distributed priors with means 45,000 and 7,205 years ago, respectively. The 95% C.I.'s for the age of each node are represented as grey bars. The clade corresponding to Hg E1b1b-M35 has been collapsed since it is discussed in detail in a dedicated paper [65]. Groups of branches discussed in the text and in Table 4 are shadowed: a) deep branches; b) terminal branches with rho ≤ 20; c) terminal branches with length ≤ 10 mutations. Note that the positioning of the root is the result of the Bayesian process and not of the assessment of ancestral/derived states in branch 0 based on an outgroup (e.g. the chimpanzee).
Table 4. Distribution of mutational events in the 19 bins, by time windows.
| Reg. | Bin n. | Deep branches: Abs. | Deep branches: /100kb | Intermediate branches: Abs. | Intermediate branches: /100kb | Terminal branches (rho ≤ 20): Abs. | Terminal branches (rho ≤ 20): /100kb | Terminal branches (≤ 10 mut.): Abs. | Terminal branches (≤ 10 mut.): /100kb |
|---|---|---|---|---|---|---|---|---|---|
| A | 1 | 16 | 17.5 | 111 | 121.4 | 32 | 35.0 | 10 | 10.9 |
| B | 2 | 1 | 12.1 | 8 | 96.7 | 4 | 48.3 | 1 | 12.1 |
| C | 3 | 14 | 17.5 | 126 | 157.9 | 38 | 47.6 | 19 | 23.8 |
| C | 4 | 21 | 26.4 | 106 | 133.1 | 31 | 38.9 | 10 | 12.6 |
| C | 5 | 9 | 11.3 | 79 | 99.2 | 36 | 45.2 | 19 | 23.9 |
| C | 6 | 10 | 12.6 | 117 | 146.8 | 40 | 50.2 | 16 | 20.1 |
| C | 7 | 36 | 45.6 | 172 | 217.8 | 41 | 51.9 | 21 | 26.6 |
| C | 8 | 21 | 26.3 | 113 | 141.6 | 38 | 47.6 | 18 | 22.6 |
| C | 9 | 17 | 21.0 | 117 | 144.7 | 42 | 52.0 | 24 | 29.7 |
| D | 10 | 19 | 22.8 | 102 | 122.4 | 28 | 33.6 | 10 | 12.0 |
| D | 11 | 16 | 19.2 | 99 | 118.9 | 31 | 37.2 | 16 | 19.2 |
| D | 12 | 12 | 14.5 | 82 | 99.1 | 38 | 45.9 | 24 | 29.0 |
| D | 13 | 19 | 22.8 | 124 | 149.0 | 26 | 31.2 | 12 | 14.4 |
| D | 14 | 16 | 19.3 | 88 | 105.9 | 44 | 52.9 | 23 | 27.7 |
| E | 15 | 17 | 20.1 | 105 | 123.8 | 30 | 35.4 | 18 | 21.2 |
| E | 16 | 15 | 17.8 | 90 | 107.1 | 35 | 41.6 | 19 | 22.6 |
| E | 17 | 24 | 28.4 | 148 | 175.4 | 47 | 55.7 | 19 | 22.5 |
| E | 18 | 23 | 27.4 | 108 | 128.7 | 45 | 53.6 | 28 | 33.4 |
| E | 19 | 17 | 20.1 | 111 | 131.3 | 38 | 44.9 | 23 | 27.2 |
| Total | | 323 | | 2006 | | 664 | | 330 | |
| χ² (17 d.f.) | | 37.2 | | 87.2 | | 18.9 | | 26.8 | |
| P | | 0.003 | | 2.0×10⁻¹¹ | | 0.331 | | 0.061 | |