Xiaoteng Fu, Jinzhuang Dou, Junxia Mao, Hailin Su, Wenqian Jiao, Lingling Zhang, Xiaoli Hu, Xiaoting Huang, Shi Wang, Zhenmin Bao.
Abstract
Genetic linkage maps are indispensable tools in genetic, genomic and breeding studies. As one of the genotyping-by-sequencing methods, RAD-Seq (restriction-site associated DNA sequencing) has gained particular popularity for the construction of high-density linkage maps. Current RAD analytical tools are predominantly used for typing codominant markers. However, no genotyping algorithm has been developed for dominant markers (resulting from recognition site disruption). Given their abundance in eukaryotic genomes, the use of dominant markers would greatly reduce the extensive sequencing effort required for large-scale marker development. In this study, we established, for the first time, a statistical framework for de novo dominant genotyping in mapping populations. An integrated package called RADtyping was developed by incorporating both de novo codominant and dominant genotyping algorithms. We demonstrated that RADtyping achieves high genotyping accuracy on both simulated and real mapping datasets. The RADtyping package is freely available at http://www2.ouc.edu.cn/mollusk/detailen.asp?id=727.
Year: 2013 PMID: 24278224 PMCID: PMC3836964 DOI: 10.1371/journal.pone.0079960
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. An overview of the RADtyping approach for de novo codominant and dominant genotyping in a mapping population.
Representative reference sites are obtained by assembling parental sequencing reads into “locus” clusters. These sites are further classified into parent-shared and parent-specific sites for subsequent codominant and dominant genotyping. Main principles of codominant and dominant genotyping algorithms are shown in flowcharts, and more details are described in the Methods section.
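The classification step described in this caption can be illustrated with a minimal sketch. It assumes that each reference site already has a read depth in each parent and uses a simple depth cutoff; the function name `classify_sites` and the `min_depth` threshold are hypothetical and are not taken from RADtyping, which relies on its own statistical framework described in the Methods section.

```python
from typing import Dict, List, Tuple


def classify_sites(
    depth_p1: Dict[str, int],
    depth_p2: Dict[str, int],
    min_depth: int = 3,  # hypothetical cutoff, for illustration only
) -> Tuple[List[str], List[str]]:
    """Split reference sites into parent-shared and parent-specific lists.

    A site counts as detected in a parent when its read depth reaches
    min_depth. Sites detected in neither parent are dropped.
    """
    shared: List[str] = []
    specific: List[str] = []
    for site in sorted(set(depth_p1) | set(depth_p2)):
        in_p1 = depth_p1.get(site, 0) >= min_depth
        in_p2 = depth_p2.get(site, 0) >= min_depth
        if in_p1 and in_p2:
            shared.append(site)    # candidate for codominant genotyping
        elif in_p1 or in_p2:
            specific.append(site)  # candidate for dominant genotyping
    return shared, specific


# Made-up depths for four sites in two parents.
shared, specific = classify_sites(
    {"s1": 20, "s2": 15, "s3": 1},
    {"s1": 18, "s3": 0, "s4": 12},
)
```

In this toy input, `s1` is parent-shared, `s2` and `s4` are parent-specific, and `s3` is discarded because it is not reliably detected in either parent.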
Figure 2. Evaluation of the performance of RADtyping using a pseudo F1 mapping population.
The simulated population was created by crossing two Arabidopsis plants with predefined SNPs in their genomes. The progeny and their parents were subjected to in silico sequencing at different sequencing depths with sequencing errors enabled. De novo codominant and dominant genotyping was evaluated in three key aspects: genotype coverage (a, d), removal of repetitive sites (b, e), and genotyping accuracy (c, f).
Summary of polymorphic markers obtained by 2b-RAD sequencing of a C. farreri mapping population.
| Marker type | Segregation pattern | Total marker no. | Marker no. in accord with Mendelian segregation | Mapped marker no. |
| Codominant marker | (AA×aa) or (aa×AA) | 203 | n.a. | n.a. |
| Codominant marker | (Aa×aa) or (aa×Aa) | 1882 | 1432 | 1166 |
| Codominant marker | (Aa×Aa) | 314 | 233 | 187 |
| Dominant marker | (AA×--) or (--×AA) | 413 | n.a. | n.a. |
| Dominant marker | (A-×--) or (--×A-) | n.a. | 3216 | 2453 |
| Dominant marker | (A-×A-) | n.a. | 1430 | n.a. |
Total marker no. refers to all polymorphic markers reported by RADtyping regardless of whether they follow Mendelian segregation in progeny.
For dominant markers, only those in accord with Mendelian segregation were scored to ensure the correct assignment of markers to different segregation patterns.
This segregation type was scored separately apart from the main pipeline.
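Whether a marker is "in accord with Mendelian segregation" is commonly checked with a chi-square goodness-of-fit test against the expected ratio, for example 1:1 presence to absence for an (A-×--) cross or 3:1 for an (A-×A-) cross. The sketch below is one way to do this for two-class markers. The test and significance level actually used by the authors are not stated in this section, so the 0.05 level and the progeny counts are assumptions made for illustration.

```python
from typing import Sequence

# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
# Valid only for two-class (e.g. presence/absence) comparisons.
CHI2_CRIT_DF1_P05 = 3.841


def chi_square_segregation(
    observed: Sequence[int], expected_ratio: Sequence[float]
) -> float:
    """Chi-square goodness-of-fit statistic against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    stat = 0.0
    for obs, ratio in zip(observed, expected_ratio):
        expected = total * ratio / ratio_sum
        stat += (obs - expected) ** 2 / expected
    return stat


def fits_mendelian(
    observed: Sequence[int],
    expected_ratio: Sequence[float],
    critical: float = CHI2_CRIT_DF1_P05,
) -> bool:
    """True when the observed counts do not deviate significantly."""
    return chi_square_segregation(observed, expected_ratio) <= critical


# Made-up counts: 52 progeny with the tag present, 48 with it absent.
counts = (52, 48)
fits_1_to_1 = fits_mendelian(counts, (1, 1))  # (A-×--) expectation
fits_3_to_1 = fits_mendelian(counts, (3, 1))  # (A-×A-) expectation
```

With these counts the 1:1 hypothesis is retained and the 3:1 hypothesis is rejected, which is how a dominant marker would be assigned to one segregation pattern rather than the other.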
Sanger validation of 2b-RAD genotypes.
| Marker type | Genotype class | 2b-RAD genotype | Validated by Sanger sequencing | Validation rate |
| Codominant markers | | | | |
| Parent (depth: 49–77×) | Heterozygote | 8 | 8 | 100% |
| | Homozygote | 8 | 8 | 100% |
| Progeny (depth: 14–21.2×) | Heterozygote | 12 | 10 | 84% |
| | Homozygote | 20 | 20 | 100% |
| Total | | 48 | 46 | 96% |
| Dominant markers | | | | |
| Parent (depth: 37–63×) | Presence | 8 | 8 | 100% |
| | Absence | 8 | 8 | 100% |
| Progeny (depth: 13.7–22×) | Presence | 13 | 12 | 92% |
| | Absence | 8 | 8 | 100% |
| Total | | 37 | 36 | 97% |
Codominant and dominant SNPs confirmed by Sanger-based amplicon sequencing.
| Marker | BsaXI tags | Forward primer (5′→3′) | Reverse primer (5′→3′) |
| Codominant markers | | | |
| m119628 | | | |
| f83678 | | | |
| m12011 | | | |
| f47186 | | | |
| f79797 | | | |
| m81459 | | | |
| m386 | | | |
| f12046 | | | |
| Dominant markers | | | |
| df33179 | | | |
| dm25086 | | | |
| df29520 | | | |
| dm27070 | | | |
| df4428 | | | |
| df12778 | | | |
| df9608 | | | |
| dm25622 | | | |
BsaXI restriction sites are highlighted in bold and SNP alleles are indicated in parentheses.
Consistency of codominant genotyping on replicate 2b-RAD libraries prepared from two parents and ten progeny.
| | Genotyped from Replicate 2 | | | |
| Genotyped from Replicate 1 | Homozygous (Parent) | Heterozygous (Parent) | Homozygous (Progeny) | Heterozygous (Progeny) |
| Same genotype | 1,527 | 1,578 | 6,813 | 5,307 |
| Different, homozygous | 0 | 8 | 0 | 401 |
| Different, heterozygous | 5 | 4 | 150 | 13 |
| Agreement (%) | 99.7% | 99.2% | 98.1% | 92.8% |
Note: average sequencing depths for the two parents were 181× and 185× in replicate 1 and 190× and 235× in replicate 2; for progeny, they were 37–46× in replicate 1 and 22–30× in replicate 2.
Consistency of dominant genotyping on replicate 2b-RAD libraries prepared from two parents and ten progeny.
| | Genotyped from Replicate 2 | | | |
| Genotyped from Replicate 1 | Absent (Parent) | Present (Parent) | Absent (Progeny) | Present (Progeny) |
| Same genotype | 3,972 | 3,972 | 14,133 | 12,915 |
| Different, absent | – | 0 | – | 316 |
| Different, present | 0 | – | 112 | – |
| Agreement (%) | 100% | 100% | 99.2% | 97.6% |
Note: average sequencing depths for the two parents were 181× and 185× in replicate 1 and 190× and 235× in replicate 2; for progeny, they were 37–46× in replicate 1 and 22–30× in replicate 2.
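The agreement percentages in the dominant genotyping table can be reproduced from the counts, assuming agreement is defined as the number of identical calls divided by all calls in that column. This definition is an inference from the reported values rather than a formula given in this section, and the helper name `agreement_percent` is hypothetical.

```python
def agreement_percent(same: int, different: int) -> float:
    """Share of replicate-1 calls that replicate 2 reproduced, in percent."""
    return 100.0 * same / (same + different)


# (same genotype, different genotype) counts from the dominant table above.
dominant_counts = {
    "Absent (Parent)": (3972, 0),
    "Present (Parent)": (3972, 0),
    "Absent (Progeny)": (14133, 112),
    "Present (Progeny)": (12915, 316),
}

for label, (same, different) in dominant_counts.items():
    print(f"{label}: {agreement_percent(same, different):.1f}%")
```

Under this definition the progeny columns come out at 99.2% and 97.6%, matching the table, and both parent columns at 100%.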