Xiaoshen Yin, Alberto Arias-Pérez, Tevfik Hamdi Kitapci, Dennis Hedgecock.
Abstract
Studies of linkage and linkage mapping have advanced genetic and biological knowledge for over 100 years. In addition to their growing role today in mapping phenotypes to genotypes, dense linkage maps can help to validate genome assemblies. Previously, we showed that 40% of scaffolds in the first genome assembly for the Pacific oyster Crassostrea gigas were chimeric, containing single nucleotide polymorphisms (SNPs) mapping to different linkage groups. Here, we merge 14 linkage maps constructed of SNPs generated from genotyping-by-sequencing (GBS) methods with five previously constructed linkage maps to create a compendium of nearly 69 thousand SNPs mapped with high confidence. We use this compendium to assess a recently available chromosome-level assembly of the C. gigas genome, mapping SNPs in 275 of 301 contigs and comparing the ordering of these contigs, by linkage, to their assembly by Hi-C sequencing methods. We find that, while 26% of contigs contain chimeric blocks of SNPs, i.e., adjacent SNPs mapping to different linkage groups than the majority of SNPs in their contig, these apparent misassemblies amount to only 0.08% of the genome sequence. Furthermore, nearly 90% of the 275 contigs mapped by linkage and sequencing are assembled identically; inconsistencies between the two assemblies for the remaining 10% of contigs appear to result from insufficient linkage information. Thus, our compilation of linkage maps strongly supports this chromosome-level assembly of the oyster genome. Finally, we use this assembly to estimate, for the first time in a Lophotrochozoan, genome-wide recombination rates and causes of variation in this fundamental process.
Keywords: Pacific oyster Crassostrea gigas; genome assembly; genotyping-by-sequencing; linkage mapping; recombination rate
Year: 2020 PMID: 33144392 PMCID: PMC7718752 DOI: 10.1534/g3.120.401728
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. Correlations among rank orders of common markers on maps made using the regression (RG) and maximum likelihood (ML) methods of JoinMap 4.1 and Lep-MAP3 (LM3).
Figure 2. Decision tree for selecting high-confidence SNPs.
Table 1. Student's t-tests comparing sum of lengths (A), total no. of markers (B), average spacing (C), and genome coverage (D) between linkage maps constructed using JoinMap 4.1 and Lep-MAP3.
| | (A) Sum of lengths | | (B) Total no. of markers | | (C) Average spacing (cM) | | (D) Genome coverage | |
|---|---|---|---|---|---|---|---|---|
| Family | JoinMap 4.1 | Lep-MAP3 | JoinMap 4.1 | Lep-MAP3 | JoinMap 4.1 | Lep-MAP3 | JoinMap 4.1 | Lep-MAP3 |
| 23×31 | 454.6 | 943.0 | 1,032 | 1,660 | 0.445 | 0.572 | 0.862 | 0.863 |
| 23×40 | 466.3 | 826.2 | 636 | 1,504 | 0.745 | 0.553 | 0.860 | 0.863 |
| 31×23 | 585.8 | 640.9 | 760 | 999 | 0.781 | 0.648 | 0.861 | 0.862 |
| 40×92 | 540.7 | 806.7 | 790 | 1,679 | 0.693 | 0.483 | 0.861 | 0.863 |
| 47×92 | 497.1 | 791.3 | 665 | 1,119 | 0.759 | 0.714 | 0.861 | 0.862 |
| 92×40 | 589.7 | 585.8 | 699 | 885 | 0.856 | 0.670 | 0.861 | 0.862 |
| Mean | 522.367 | 765.667 | 763.667 | 1,307.667 | 0.713 | 0.607 | 0.861 | 0.863 |
| t | −3.201 | | −4.392 | | 2.02 | | −4.392 | |
| P | 0.024 | | 0.007 | | 0.099 | | 0.007 | |
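
The t statistic for (A) can be checked directly from the six family rows. The sketch below assumes the tests were paired by family, which the caption does not state (it says only "Student's t-tests"); the function name `paired_t` is illustrative and not taken from the source.

```python
from math import sqrt
from statistics import mean, stdev

# Sum of map lengths per family, copied from the (A) columns of the table above.
joinmap = [454.6, 466.3, 585.8, 540.7, 497.1, 589.7]
lepmap3 = [943.0, 826.2, 640.9, 806.7, 791.3, 585.8]

def paired_t(x, y):
    """Return the paired t statistic and its degrees of freedom."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)  # stdev uses the n - 1 denominator
    return mean(diffs) / standard_error, n - 1

t_stat, df = paired_t(joinmap, lepmap3)
print(f"t = {t_stat:.3f}, d.f. = {df}")
```

Under the pairing assumption, the statistic for (A) agrees with the tabulated value of −3.201 on 5 degrees of freedom. Converting t to a P value needs the t distribution, which the Python standard library does not provide, so that step is left out here.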
Table 2. Number of SNPs in each combination of mapping method (i.e., JoinMap 4.1 vs. Lep-MAP3), grouping accuracy (i.e., whether a SNP is assigned to the consensus linkage group for its contig, conLG, or not, non-conLG), and level of family support (i.e., whether a SNP is mapped in one or more than one family).
| | JoinMap 4.1 | | Lep-MAP3 | | |
|---|---|---|---|---|---|
| | 1 family | >1 family | 1 family | >1 family | totals |
| conLG | 1,607 | 846 | 45,795 | 16,919 | 65,167 |
| non-conLG | 44 | 5 | 2,163 | 256 | 2,468 |
| totals | 1,651 | 851 | 47,958 | 17,175 | 67,635 |
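
The conLG versus non-conLG distinction can be illustrated with a small sketch. It assumes the consensus linkage group of a contig is simply the group to which most of its SNPs map, as the abstract's wording suggests; the SNP records and the function name `classify_by_consensus` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical records: (SNP id, contig, linkage group the SNP mapped to).
snps = [
    ("snp1", "contigA", 3),
    ("snp2", "contigA", 3),
    ("snp3", "contigA", 7),  # maps away from the majority of contigA
    ("snp4", "contigB", 1),
    ("snp5", "contigB", 1),
]

def classify_by_consensus(records):
    """Label each SNP conLG or non-conLG against its contig's majority LG."""
    groups_per_contig = defaultdict(list)
    for _, contig, lg in records:
        groups_per_contig[contig].append(lg)
    # Assumption: the consensus LG is the most frequent LG on the contig.
    consensus = {
        contig: Counter(lgs).most_common(1)[0][0]
        for contig, lgs in groups_per_contig.items()
    }
    labels = {
        snp: "conLG" if lg == consensus[contig] else "non-conLG"
        for snp, contig, lg in records
    }
    return consensus, labels

consensus, labels = classify_by_consensus(snps)
print(consensus)
print(labels)
```

Ties are broken here by first occurrence, which is a property of `Counter.most_common`; a real pipeline would need an explicit rule for contigs without a clear majority.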
Table 3. Analysis of maximum likelihood parameter estimates from the three-way loglinear model. Sources are as defined in the caption to Table 2.
| Source | d.f. | Chi-square | P |
|---|---|---|---|
| grouping accuracy | 1 | 12,198.3 | <0.0001 |
| mapping method | 1 | 4,407.8 | <0.0001 |
| level of family support | 1 | 290.8 | <0.0001 |
| grouping accuracy × mapping method | 1 | 14.2 | 0.0002 |
| grouping accuracy × level of family support | 1 | 54.3 | <0.0001 |
| mapping method × level of family support | 1 | 0.4 | 0.529 |
| grouping accuracy × mapping method × level of family support | 1 | 0.8 | 0.386 |
Table 4. 2×2 contingency tables, testing whether grouping accuracy, mapping method, and level of family support (see Table 2) are independent within each layer of the three factors. The odds ratio is the product of the upper left cell and the lower right cell divided by the product of the upper right cell and the lower left cell.
| Factor 1, level 1 | Factor 2 | | Statistics | | Factor 1, level 2 | Factor 2 | | Statistics | |
|---|---|---|---|---|---|---|---|---|---|
| | Level 1 | Level 2 | | | | Level 1 | Level 2 | | |
| 1 family | non-conLG | conLG | P: | 0.0003 | >1 family | non-conLG | conLG | P: | 0.0314 |
| Lep-MAP3 | 2163 | 45795 | d.f.: | 1 | Lep-MAP3 | 256 | 16919 | d.f.: | 1 |
| JoinMap 4.1 | 44 | 1607 | odds ratio: | 1.725 | JoinMap 4.1 | 5 | 846 | odds ratio: | 2.560 |
| conLG | JoinMap 4.1 | Lep-MAP3 | P: | 2.52E-16 | non-conLG | JoinMap 4.1 | Lep-MAP3 | P: | 0.9320 |
| >1 family | 846 | 16919 | d.f.: | 1 | >1 family | 5 | 256 | d.f.: | 1 |
| 1 family | 1607 | 45795 | odds ratio: | 1.425 | 1 family | 44 | 2163 | odds ratio: | 0.960 |
| JoinMap 4.1 | 1 family | >1 family | P: | 0.0004 | Lep-MAP3 | 1 family | >1 family | P: | 4.22E-72 |
| non-conLG | 44 | 5 | d.f.: | 1 | non-conLG | 2163 | 256 | d.f.: | 1 |
| conLG | 1607 | 846 | odds ratio: | 4.633 | conLG | 45795 | 16919 | odds ratio: | 3.122 |
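
The odds-ratio definition in the caption can be applied to the first pair of 2×2 tables, whose counts are the 1-family and >1-family cells of Table 2. This is a minimal sketch: the source does not say which test of independence was used, so the Pearson chi-square without continuity correction is an assumption, and the function names are illustrative.

```python
from math import erfc, sqrt

def odds_ratio(a, b, c, d):
    """Odds ratio for the 2x2 table [[a, b], [c, d]]: (a*d) / (b*c)."""
    return (a * d) / (b * c)

def pearson_chi2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and P for d.f. = 1."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 d.f., the upper tail of the chi-square distribution is erfc(sqrt(x / 2)).
    return chi2, erfc(sqrt(chi2 / 2))

# Rows: Lep-MAP3, JoinMap 4.1; columns: non-conLG, conLG.
one_family = (2163, 45795, 44, 1607)
multi_family = (256, 16919, 5, 846)

for name, cells in [("1 family", one_family), (">1 family", multi_family)]:
    chi2, p = pearson_chi2(*cells)
    print(f"{name}: odds ratio = {odds_ratio(*cells):.3f}, "
          f"chi2 = {chi2:.2f}, P = {p:.4f}")
```

The odds ratios reproduce the tabulated 1.725 and 2.560. The P values from this assumed test land close to the tabulated ones, which is consistent with, but does not prove, a Pearson chi-square having been used.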
Figure 3. Recombination rate (cM/Mb) by family (A), chromosome (B), sex (C), and parent (i.e., family × sex) (D).
Figure 4. Standardized RG- and ML-based recombination rates (RR, cM/Mb) along ten chromosomes of the Chr_v1 genome assembly for six interrelated F2 families. Dashed lines indicate the nominal 10th percentile (−1.28) and 90th percentile (1.28) of the standardized RR. Arrows indicate the leftmost position of microsatellite markers linked to centromeres (Hubert ).
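
The ±1.28 thresholds in Figure 4 are the nominal 10th and 90th percentiles of a standard normal distribution. The sketch below shows one way a recombination rate in cM/Mb could be standardized and compared with those cutoffs. The window values are hypothetical, and standardizing as z-scores with the sample mean and standard deviation is an assumption, since the captions do not describe the procedure.

```python
from statistics import NormalDist, mean, stdev

def recombination_rate(genetic_cm, physical_mb):
    """Recombination rate in cM/Mb for one window."""
    return genetic_cm / physical_mb

def standardize(values):
    """Convert values to z-scores using the sample mean and standard deviation."""
    mu, sigma = mean(values), stdev(values)
    return [(v - mu) / sigma for v in values]

# Hypothetical windows along one chromosome: (cM spanned, Mb spanned).
windows = [(1.2, 1.0), (0.4, 1.0), (3.5, 1.0), (1.0, 1.0), (0.9, 1.0), (1.1, 1.0)]
rates = [recombination_rate(cm, mb) for cm, mb in windows]
z_scores = standardize(rates)

# Nominal 90th percentile of the standard normal; the 10th is its negative.
cutoff = NormalDist().inv_cdf(0.90)

for rate, z in zip(rates, z_scores):
    flag = "high" if z > cutoff else "low" if z < -cutoff else ""
    print(f"{rate:.2f} cM/Mb  z = {z:+.2f}  {flag}")
```

`NormalDist().inv_cdf(0.90)` returns about 1.28, matching the dashed lines described in the caption.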