Henry R Johnston, David J Cutler.
Abstract
The patterns of male and female recombination vary greatly on a macro scale. A unique motif in each gender, triggering a double strand break at its location, much in the way Chi sites operate in E. coli, could logically explain this difference. As such, we have undertaken a comprehensive search of all small motifs in an attempt to identify one or more that match the available data. In the end, we conclude that no such motifs appear to exist in the human genome.
Year: 2013 PMID: 23626862 PMCID: PMC3633838 DOI: 10.1371/journal.pone.0062920
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Sizes of recombination mapping windows.
| Widths | Hotspots | Male Windows | Female Windows |
| <5kb | 22655 | 281 | 454 |
| 5–10kb | 8030 | 319 | 465 |
| 10–30kb | 2238 | 1344 | 1991 |
| 30–60kb | 73 | 1536 | 2341 |
| 60–90kb | 0 | 1141 | 1648 |
| 90–120kb | 0 | 804 | 1270 |
| >120kb | 0 | 4166 | 6335 |
| Total | 32996 | 9591 | 14504 |
| Total width | 181943 kb | 3001938968 bp | 4973275771 bp |
| Mean width | 5.51 kb | 312.9 kb | 342.8 kb |
This table shows the disparity in sizes between LD-identified hotspots and recombination mapping windows. Hotspots are finely localized, while mapping windows often span hundreds of kb.
6-mer Motif Analysis.
| Motif set | Motif | LD-Defined Hotspots: Count Excess (SD) | LD-Defined Hotspots: Clustering Score | Female Recombination Windows: Count Excess (SD) | Female Recombination Windows: Clustering Score | Male Recombination Windows: Count Excess (SD) | Male Recombination Windows: Clustering Score | deCODE Map: Female K-S Score | deCODE Map: Male K-S Score |
| Hotspots | GGGGGG | 68.5 | 1.43 | 0.31 | 0.85 | 0.4 | 0.93 | 5.46 | 15.19 |
| | GGGGGT | 44.5 | 1.23 | −0.2 | 0.83 | 0.1 | 0.92 | 4.95 | 14.89 |
| | AGGGGG | 41.8 | 1.20 | −0.12 | 0.83 | 0.1 | 0.92 | 4.75 | 14.81 |
| | GGGGGA | 35.4 | 1.19 | −0.11 | 0.85 | 0.3 | 0.92 | 4.42 | 14.48 |
| | TGGGGG | 31.2 | 1.18 | −0.14 | 0.84 | 0.2 | 0.92 | 4.84 | 14.8 |
| Windows in Females | TGGACG | 1 | 1.03 | 4.4 | 0.85 | 4 | 0.93 | 5.95 | 14.6 |
| | GACGTC | 1.7 | 1.04 | 3.9 | 0.87 | −1.2 | 0.89 | 5.62 | 14.24 |
| | GAGCTG | −0.3 | 1.00 | 2.9 | 0.85 | 0.1 | 0.92 | 4.19 | 14.46 |
| | GATCGG | 0.17 | 1.05 | 2.7 | 0.87 | 0.84 | 0.97 | 4.38 | 14.14 |
| | GAGCGC | 0.3 | 1.16 | 2.4 | 0.84 | 1.8 | 0.94 | 6.38 | 15.52 |
| Windows in Males | GGGACG | 5.5 | 1.17 | −2.5 | 0.86 | 5.5 | 0.92 | 7.12 | 15.81 |
| | GGACGT | 1.9 | 1.11 | 0.24 | 0.87 | 5.4 | 0.92 | 5.46 | 14.32 |
| | TGGACG | 1 | 1.03 | 4.4 | 0.85 | 4 | 0.93 | 5.95 | 14.6 |
| | GGACGG | 4.1 | 1.13 | 1.7 | 0.85 | 3.9 | 0.91 | 7.7 | 16.39 |
| | CGGATG | −1.7 | 0.98 | 0.08 | 0.86 | 3.8 | 0.92 | 5 | 15.51 |
| Matching deCODE Female Map | TTAGCG | 0.3 | 1.04 | −2.5 | 0.90 | 0.7 | 0.95 | 3.87 | 14.38 |
| | ATGCGA | 1.3 | 0.99 | 0.9 | 0.89 | −1.7 | 0.95 | 3.89 | 14.35 |
| | GGATGC | −1.4 | 0.97 | −0.1 | 0.87 | −0.2 | 0.92 | 3.9 | 14.29 |
| | TAAGCG | 0.1 | 1.05 | −0.8 | 0.88 | −1.2 | 0.94 | 3.9 | 14.32 |
| | ATCGGT | −0.1 | 0.95 | 0.01 | 0.89 | 0.2 | 0.96 | 3.91 | 14.42 |
| Matching deCODE Male Map | TGACGT | −0.7 | 0.98 | 0.6 | 0.88 | −0.2 | 0.92 | 4.05 | 13.44 |
| | TGCGTT | 0.2 | 1.01 | −1.1 | 0.87 | 0.4 | 0.94 | 4.03 | 13.53 |
| | ATGCGT | 0.1 | 1.01 | −0.1 | 0.86 | 0.4 | 0.94 | 4.03 | 13.57 |
| | GCGTTT | 1.6 | 1.03 | −2.6 | 0.87 | 1.5 | 0.95 | 4.15 | 13.58 |
| | TGAACG | 0.6 | 1.01 | −1.2 | 0.87 | 0.4 | 0.94 | 4.07 | 13.58 |
This table reports data for 25 6-mer motifs. Each row is a motif; each set of five was chosen because its motifs scored best in one of the available data sets. The columns cover each of the data sets and detail the performance of the motif as gauged in that data set. Count Excess is the excess of the motif count in real recombination windows compared to random windows, reported as the number of standard deviations above the mean. Clustering Score is the clustering statistic. K-S Score is the Kolmogorov-Smirnov test statistic for a comparison of the motif distribution to the global recombination map, summed over 39 chromosome arms.
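The Count Excess described above is a standardized score: the observed motif count minus the mean count in random windows, divided by the standard deviation of those random counts. The following is a minimal sketch of that arithmetic, not the authors' pipeline. The function names, the choice to count overlapping matches on a single strand, and the toy numbers are all assumptions made for illustration.

```python
import statistics

def count_motif(seq, motif):
    """Count occurrences of motif in seq, allowing overlapping matches.
    (Overlap handling and single-strand search are assumptions.)"""
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count

def count_excess(observed, random_counts):
    """Return (observed - mean) / SD, where mean and SD come from the
    motif counts in randomly placed windows."""
    mean = statistics.mean(random_counts)
    sd = statistics.stdev(random_counts)  # sample standard deviation
    return (observed - mean) / sd

# Toy, made-up numbers: counts from five random window sets, and one
# observed count in the real windows.
null_counts = [10, 12, 8, 11, 9]
z = count_excess(14, null_counts)
overlapping = count_motif("GGGGGGG", "GGGGGG")
```

A score near zero means the motif is no more common in the real windows than in random ones, which is the pattern most motifs in the table show outside the data set in which they were selected.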
Figure 1. Multiple 7-mer motifs compared to deCODE recombination maps.
Here, a selection of chromosome arms is offered. The motifs are labeled at the top of the figure. The rows represent each of six chromosome arms, 5p, 8p, 12q, 13q, 17q, and 18p from top to bottom. The collection demonstrates the broader concept that G-rich motifs are found nearly uniformly throughout the genome, matching well to the deCODE female map, but poorly to the deCODE male map. The y-axis displays the percentage of the genome. In each graph, the motif in question is in blue, with the line generated by treating each instance of the motif as though it were a recombination event and dividing the total to each point on the chromosome arm by the total for the entire arm. The female map is in green, while the male map is in red. The x-axis displays the percentage of recombination, the number of recombination events to that point on the arm divided by the total number on the arm.
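The motif curve described in this caption is a cumulative fraction: each motif instance is treated as one event, and the running total at each point on the arm is divided by the total for the arm. A minimal sketch of that construction follows, together with the largest vertical gap between two such curves, which is the quantity a Kolmogorov-Smirnov-style comparison is built on. The function names, grid, and toy positions are hypothetical, and the exact statistic the authors computed may differ.

```python
from bisect import bisect_right

def cumulative_fraction(positions, grid):
    """For each grid coordinate, return the fraction of events located
    at or before it. Positions and grid share the same units (e.g. bp)."""
    ordered = sorted(positions)
    total = len(ordered)
    if total == 0:
        raise ValueError("no events on this arm")
    return [bisect_right(ordered, g) / total for g in grid]

def max_gap(curve_a, curve_b):
    """Largest absolute difference between two curves sampled on the
    same grid."""
    return max(abs(a - b) for a, b in zip(curve_a, curve_b))

# Toy, made-up data: four motif instances on an arm of length 1000,
# and a made-up cumulative recombination map on the same grid.
motif_positions = [100, 250, 400, 900]
grid = [0, 250, 500, 750, 1000]
motif_curve = cumulative_fraction(motif_positions, grid)
map_curve = [0.0, 0.1, 0.3, 0.6, 1.0]
gap = max_gap(motif_curve, map_curve)
```

A motif that tracked the recombination map closely would give a small gap on every arm; a motif spread almost uniformly along the arm gives a curve close to a straight diagonal.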
Figure 2. The ideal 6-mer identified in hotspots.
Here we show the ideal 6-mer, identified as a composite of the top 100 overrepresented motifs in LD-identified hotspots.
Figure 3. The best fitting motif on male 17q (CCGTGCGG) shown on multiple chromosome arms.
Here we demonstrate that while a motif might match the deCODE male map well on some chromosome arms, on others it matches poorly. The y-axis displays the percentage of the genome. The motif in question is in blue, with the line generated by treating each instance of the motif as though it were a recombination event and dividing the total to each point on the chromosome arm by the total for the entire arm. The female map is in green, while the male map is in red. The x-axis displays the percentage of recombination, the number of recombination events to that point on the arm divided by the total number on the arm.
Comparisons Between Motifs in LD-Defined Hotspots.
| Motif | Spot type | % of spots containing motif | Average number of motifs per spot |
| 13-mer | LD Hotspot | 50 | 0.91 |
| | Random Spot | 37.6 | 0.60 |
| 7-mer | LD Hotspot | 52.5 | 1.19 |
| | Random Spot | 45.5 | 0.90 |
| 7-mer | LD Hotspot | 68.5 | 1.75 |
| | Random Spot | 60.1 | 1.36 |
| 7-mer | LD Hotspot | 56.7 | 1.18 |
| | Random Spot | 49.9 | 0.93 |
| 7-mer | LD Hotspot | 50.6 | 1.10 |
| | Random Spot | 43.6 | 0.81 |
| 7-mer | LD Hotspot | 58.3 | 1.61 |
| | Random Spot | 49.9 | 1.19 |
This table details the presence of 6 motifs, the previously identified 13-mer and top five 7-mers, in real and randomized LD-identified hotspots. The percentages of hotspots containing a motif as well as the average number of motifs per hotspot are reported.
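Both reported quantities can be derived from a single list holding the motif count in each spot. The sketch below shows that calculation under the assumption that a spot "contains" the motif when its count is at least one. The function name and the toy counts are hypothetical.

```python
def motif_presence(counts_per_spot):
    """Return (percent of spots with at least one motif,
    average number of motifs per spot)."""
    n = len(counts_per_spot)
    if n == 0:
        raise ValueError("no spots supplied")
    with_motif = sum(1 for c in counts_per_spot if c > 0)
    pct = 100.0 * with_motif / n
    avg = sum(counts_per_spot) / n
    return pct, avg

# Toy, made-up counts for four spots.
pct, avg = motif_presence([0, 2, 1, 0])
```

Running the same function on real and randomized spots gives the paired rows of the table; the average can exceed the fraction containing a motif because a single spot may hold several copies.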
Correlations between recombination rates and motif locations.
| Motif | Male Recombination Rate | Female Recombination Rate |
| 13-mer | −.006 | .031 |
| 7-mer | .285 | .197 |
| 7-mer | .304 | .326 |
| 7-mer | .254 | .222 |
| 7-mer | .289 | .209 |
| 7-mer | .331 | .289 |
This table presents the correlations between motif locations and the deCODE identified recombination rate in males and females for the previously identified 13-mer as well as the top five 7-mers.
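The record does not state which correlation coefficient or bin size was used. As one plausible reading, the sketch below computes a Pearson correlation between per-bin motif counts and per-bin recombination rates. The binning, the function name, and the toy values are assumptions, not details taken from the paper.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Toy, made-up data: motif counts and recombination rates in four bins.
motif_counts = [1, 2, 3, 4]
rates = [1.0, 3.0, 2.0, 4.0]
r = pearson(motif_counts, rates)
```

Coefficients in the 0.2 to 0.3 range, as reported for the 7-mers, indicate a weak positive association rather than a motif that determines where recombination occurs.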