Juan Sui1,2, Sheng Luan1,2, Ping Dai1,2, Qiang Fu1,2, Xianhong Meng1,2, Kun Luo1,2, Baoxiang Cao1,2, Jie Kong1,2.
Abstract
Using pooled DNA genotyping to estimate the proportional contributions from multiple families in a pooled sample is of particular interest for selective breeding in aquaculture. We compared different pooled libraries with separate 2b-RAD sequencing of Litopenaeus vannamei individuals to assess the effect of different population structures (different numbers of individuals and families) on pooled DNA sequencing, the accuracy of parent sequencing of the DNA pools and the effect of SNP numbers on pooled DNA sequencing. We demonstrated that small pooled DNA genotyping of up to 53 individuals by 2b-RAD sequencing could provide a highly accurate assessment of population allele frequencies. The accuracy increased as the number of individuals and families increased. The allele frequencies of the parents from each pool were highly correlated with those of the pools or the corresponding individuals in the pool. We chose 500-28,000 SNPs to test the effect of SNP number on the accuracy of pooled sequencing, and no linear relationship was found between them. When the SNP number was fixed, increasing the number of individuals in the mixed pool resulted in higher accuracy of each pooled genotyping. Our data confirmed that pooled DNA genotyping by 2b-RAD sequencing could achieve higher accuracy than that of individual-based genotyping. The results will provide important information for shrimp breeding programs.
Year: 2020 PMID: 32730349 PMCID: PMC7392308 DOI: 10.1371/journal.pone.0236343
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Number of individuals and families included in each pool.
| Pool | F1 | F2 | F3 | F4 | F5 | F6 | F7 | F8 | F9 | F10 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Pool 1 | 5 | 5 | 5 | | | | | | | | 15 |
| Pool 2 | 10 | 10 | 10 | | | | | | | | 30 |
| Pool 3 | 5 | 5 | 5 | 5 | 5 | 5 | | | | | 30 |
| Pool 4 | 10 | 10 | 10 | 5 | 5 | 5 | 3 | 3 | 1 | 1 | 53 |
Comparison of allele frequency estimates between pool DNA and individuals.
| Pool | Allele | Pearson’s coefficient (4 pools vs. 53 individuals) | Relative error (4 pools vs. 53 individuals) | Pearson’s coefficient (4 pools vs. constituent individuals) | Relative error (4 pools vs. constituent individuals) |
|---|---|---|---|---|---|
| 1 | A | 0.9836 | 1.5099±13.0022 | 0.9876 | 0.3718±2.2273 |
| 1 | T | 0.9831 | 1.3733±11.2925 | 0.9872 | 0.4451±3.2501 |
| 1 | C | 0.9832 | 0.8099±6.7785 | 0.9863 | 0.2290±1.6713 |
| 1 | G | 0.9832 | 0.8660±9.5250 | 0.9870 | 0.2271±1.9190 |
| 2 | A | 0.9873 | 1.2037±8.2768 | 0.9908 | 0.4570±3.0972 |
| 2 | T | 0.9870 | 1.0190±6.4142 | 0.9909 | 0.4176±2.6806 |
| 2 | C | 0.9869 | 0.7160±5.9080 | 0.9902 | 0.2817±2.0893 |
| 2 | G | 0.9874 | 0.6616±6.0681 | 0.9909 | 0.2562±2.3916 |
| 3 | A | 0.9922 | 0.9675±8.1875 | 0.9936 | 0.4907±2.8691 |
| 3 | T | 0.9921 | 0.8381±5.7278 | 0.9936 | 0.4510±2.4208 |
| 3 | C | 0.9922 | 0.6149±5.0877 | 0.9934 | 0.3095±2.0098 |
| 3 | G | 0.9918 | 0.6426±5.3051 | 0.9933 | 0.3346±2.2224 |
| 4 | A | 0.9936 | 0.6970±4.9239 | 0.9936 | 0.6970±4.9239 |
| 4 | T | 0.9934 | 0.6876±4.6355 | 0.9934 | 0.6876±4.6355 |
| 4 | C | 0.9935 | 0.4877±4.3982 | 0.9935 | 0.4877±4.3982 |
| 4 | G | 0.9935 | 0.4538±4.2453 | 0.9935 | 0.4538±4.2453 |
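The two statistics reported in this table can be illustrated with a minimal Python sketch. It assumes that each allele's frequencies are available as two equal-length lists (one value per SNP) and that the relative error is |pool − individual| / individual, averaged over SNPs with a non-zero individual-based frequency. That definition, the function names and the toy numbers are assumptions for illustration and are not taken from the record.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant list")
    return sxy / math.sqrt(sxx * syy)


def relative_errors(pool_freq, indiv_freq):
    """Per-SNP relative error, ASSUMED here to be |pool - indiv| / indiv.

    SNPs whose individual-based frequency is 0 are skipped to avoid
    division by zero.
    """
    return [abs(p - q) / q for p, q in zip(pool_freq, indiv_freq) if q > 0]


def mean_sd(values):
    """Mean and sample standard deviation (n - 1 denominator)."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1) if n > 1 else 0.0
    return mean, math.sqrt(var)


if __name__ == "__main__":
    # Hypothetical frequencies of one allele at four SNPs.
    pool = [0.10, 0.25, 0.50, 0.80]
    indiv = [0.12, 0.25, 0.45, 0.80]
    r = pearson(pool, indiv)
    err_mean, err_sd = mean_sd(relative_errors(pool, indiv))
    print(f"r = {r:.4f}, relative error = {err_mean:.4f} ± {err_sd:.4f}")
```

Under this assumed definition, a few SNPs with very low individual-based frequencies can produce large per-SNP ratios, which is one possible reason for a standard deviation much larger than the mean.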
Correlation between allele frequencies of the four pools or of individuals in the pools and those of their corresponding parents.
| Pool | Allele | Pearson’s coefficient (4 pools vs. their parents) | Relative error (4 pools vs. their parents) | Pearson’s coefficient (individuals in the 4 pools vs. their parents) | Relative error (individuals in the 4 pools vs. their parents) |
|---|---|---|---|---|---|
| 1 | A | 0.9500 | 0.2350±0.9439 | 0.9560 | 0.1623±0.4107 |
| 1 | T | 0.9504 | 0.2345±0.9871 | 0.9566 | 0.1651±0.4094 |
| 1 | C | 0.9496 | 0.1566±0.7712 | 0.9556 | 0.1146±0.3508 |
| 1 | G | 0.9510 | 0.1526±0.7315 | 0.9561 | 0.1167±0.3606 |
| 2 | A | 0.9550 | 0.1738±0.5284 | 0.9600 | 0.1436±0.3756 |
| 2 | T | 0.9563 | 0.1747±0.4961 | 0.9708 | 0.1441±0.3441 |
| 2 | C | 0.9560 | 0.1263±0.5064 | 0.9600 | 0.1011±0.3106 |
| 2 | G | 0.9566 | 0.1208±0.4113 | 0.9604 | 0.1016±0.3101 |
| 3 | A | 0.9804 | 0.3596±1.9266 | 0.9848 | 0.2310±0.8351 |
| 3 | T | 0.9802 | 0.3558±1.3142 | 0.9847 | 0.2501±0.7928 |
| 3 | C | 0.9800 | 0.2517±1.4403 | 0.9841 | 0.1638±0.6778 |
| 3 | G | 0.9799 | 0.2442±1.0795 | 0.9844 | 0.1675±0.6624 |
| 4 | A | 0.9814 | 3.3075±55.0572 | 0.9874 | 1.6226±16.0181 |
| 4 | T | 0.9812 | 3.1723±41.3763 | 0.9873 | 1.7100±18.6187 |
| 4 | C | 0.9814 | 1.9864±25.1980 | 0.9871 | 0.8983±10.0046 |
| 4 | G | 0.9816 | 1.8871±26.9650 | 0.9873 | 0.9837±12.0227 |
Fig 1. Correlation between site number and Pearson’s coefficients of allele frequencies between pool and individual sequencing.
(a) Pool 1, which contains three families (F1-F3), each with five individuals. (b) Pool 2, which contains the same three families as Pool 1, each with 10 individuals, including those used in Pool 1. (c) Pool 3, which contains six families (F1-F6), each with five individuals; eight of the fifteen individuals from F1-F3 were the same as those in Pool 1. (d) Pool 4, which contains ten families and all 53 individuals, including those used in Pool 1, Pool 2 and Pool 3. The x-axis shows the site number and the y-axis shows Pearson’s coefficient of allele frequencies between pool and individual sequencing.
SNP number required for the standard deviation of the correlation coefficients of allele frequencies between pool and individual sequencing to fall below 0.0001.
| Pool | A | T | C | G |
|---|---|---|---|---|
| Pool 1 | 18,000 | 24,000 | 27,000 | 26,500 |
| Pool 2 | 22,500 | 12,500 | 24,000 | 24,500 |
| Pool 3 | 17,000 | 15,500 | 21,500 | 24,000 |
| Pool 4 | 16,500 | 13,500 | 13,500 | 24,000 |
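One way such a threshold could be found is sketched below in Python. The record does not describe the resampling scheme, so the following are all assumptions for illustration: the sketch draws random SNP subsets of a given size without replacement, computes Pearson's coefficient between pool and individual frequencies for each subset, and returns the smallest candidate size whose coefficients have a standard deviation below the threshold. The function names, the number of replicates and the seed are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant list")
    return sxy / math.sqrt(sxx * syy)


def sd_of_correlation(pool_freq, indiv_freq, n_snps, n_reps=100, seed=0):
    """SD of Pearson's r across random SNP subsets of size n_snps.

    ASSUMPTION: subsets are drawn without replacement; n_reps and seed
    are illustrative defaults, not values from the source.
    """
    rng = random.Random(seed)
    indices = range(len(pool_freq))
    coefficients = []
    for _ in range(n_reps):
        subset = rng.sample(indices, n_snps)
        coefficients.append(
            pearson([pool_freq[i] for i in subset],
                    [indiv_freq[i] for i in subset])
        )
    mean_r = sum(coefficients) / len(coefficients)
    var = sum((r - mean_r) ** 2 for r in coefficients) / (len(coefficients) - 1)
    return math.sqrt(var)


def smallest_snp_number(pool_freq, indiv_freq, candidates, threshold=1e-4,
                        n_reps=100, seed=0):
    """Smallest candidate SNP number whose SD of r is below threshold."""
    for n_snps in sorted(candidates):
        sd = sd_of_correlation(pool_freq, indiv_freq, n_snps, n_reps, seed)
        if sd < threshold:
            return n_snps
    return None  # no candidate met the threshold
```

With real data, `candidates` could be a grid such as `range(500, 28001, 500)`, which matches the 500-28,000 SNP range mentioned in the abstract; the step size is a guess.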
Correlations of allele frequency estimated between pool DNA and individuals for 3 repeats of pool 3 and pool 5.
| Pool | Allele | Pearson’s coefficient (2 pools vs. 53 individuals) | Relative error* (2 pools vs. 53 individuals) | Pearson’s coefficient (2 pools vs. constituent individuals) | Relative error* (2 pools vs. constituent individuals) |
|---|---|---|---|---|---|
| 3 | A | 0.9929±0.0006 | 0.9187±0.0672 | 0.9944±0.0007 | 0.4676±0.0219 |
| 3 | T | 0.9928±0.0006 | 0.8505±0.0108 | 0.9944±0.0007 | 0.4548±0.0066 |
| 3 | C | 0.9929±0.0006 | 0.6127±0.0197 | 0.9943±0.0008 | 0.3136±0.0184 |
| 3 | G | 0.9928±0.0008 | 0.5869±0.0487 | 0.9944±0.0009 | 0.3111±0.0303 |
| 5 | A | 0.9937±0.0004 | 0.7754±0.0385 | 0.9948±0.0004 | 0.4311±0.0222 |
| 5 | T | 0.9937±0.0002 | 0.7669±0.0608 | 0.9948±0.0002 | 0.4282±0.0180 |
| 5 | C | 0.9937±0.0003 | 0.6751±0.0195 | 0.9947±0.0003 | 0.3414±0.0187 |
| 5 | G | 0.9937±0.0003 | 0.5646±0.0445 | 0.9949±0.0003 | 0.3292±0.0199 |
* The standard deviation is that of the mean across the three repeats.
Correlations between allele frequencies of pool 3, pool 5 or of individuals in the pools and those of their corresponding parents.
| Pool | Allele | Pearson’s coefficient (2 pools vs. their parents) | Relative error (2 pools vs. their parents) | Pearson’s coefficient (individuals in the 2 pools vs. their parents) | Relative error (individuals in the 2 pools vs. their parents) |
|---|---|---|---|---|---|
| 3 | A | 0.9819±0.0013 | 0.3195±0.0353 | 0.9848 | 0.2310 |
| 3 | T | 0.9820±0.0015 | 0.3216±0.0306 | 0.9847 | 0.2501 |
| 3 | C | 0.9818±0.0015 | 0.2223±0.0263 | 0.9841 | 0.1638 |
| 3 | G | 0.9819±0.0017 | 0.2237±0.0183 | 0.9844 | 0.1675 |
| 5 | A | 0.9838±0.0005 | 1.1506±0.0580 | 0.9864 | 0.7794 |
| 5 | T | 0.9839±0.0002 | 1.2009±0.0412 | 0.9865 | 0.8438 |
| 5 | C | 0.9839±0.0004 | 0.7826±0.0370 | 0.9864 | 0.4625 |
| 5 | G | 0.9840±0.0004 | 0.7058±0.0256 | 0.9863 | 0.4505 |