Ximei Li, Wenhui Gao, Huanle Guo, Xianlong Zhang, David D Fang, Zhongxu Lin.
Abstract
BACKGROUND: Molecular markers have proven to be an efficient tool for accelerating progress in plant breeding, which is particularly important for less researched crops such as cotton. Given the advantages of single nucleotide polymorphisms (SNPs) and insertion-deletion polymorphisms (InDels), expressed sequence tags (ESTs) were analyzed in silico in this study to identify SNPs and InDels, with the aim of developing more molecular markers in cotton.
Year: 2014 PMID: 25442170 PMCID: PMC4265408 DOI: 10.1186/1471-2164-15-1046
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Polymorphic rates of the SNP and InDel markers
| Primers | No. markers | No. polymorphic markers/loci | Polymorphic rate (%) | Classes (polymorphic/total; rate) | Subclasses 5) | No. markers | No. polymorphic markers/loci | Polymorphic rate (%) |
|---|---|---|---|---|---|---|---|---|
| HAU-SNP 1) | 356 | 47/50 | 13.20 | inter (23/134; 17.16%) | 4 | 27 | 6/6 | 22.22 |
| | | | | inter | 5 | 21 | 4/5 | 19.05 |
| | | | | inter | 6 | 16 | 2/3 | 12.50 |
| | | | | inter | 7 | 11 | 0/0 | 0.00 |
| | | | | inter | ≥8 | 59 | 11/11 | 18.64 |
| | | | | hemi (24/222; 10.81%) | 4 | 0 | 0/0 | -- |
| | | | | hemi | 5 | 8 | 1/1 | 12.50 |
| | | | | hemi | 6 | 9 | 2/2 | 22.22 |
| | | | | hemi | 7 | 10 | 0/0 | 0.00 |
| | | | | hemi | ≥8 | 195 | 21/22 | 10.77 |
| HAU-SNP 2) | 455 | 43/43 | 9.45 | | 4 | 171 | 15/15 | 8.77 |
| | | | | | 5 | 89 | 12/12 | 13.48 |
| | | | | | 6 | 97 | 6/6 | 6.19 |
| | | | | | 7 | 38 | 3/3 | 7.89 |
| | | | | | ≥8 | 60 | 7/7 | 11.67 |
| HAU-InDel 3) | 415 | 41/42 | 9.88 | | | | | |
| HAU-InDel 4) | 123 | 6/7 | 4.88 | | | | | |
| Total | 1,349 | 137/142 | 10.16 | | | | | |
1)HAU-SNP001 ~ 356, which were developed by comparing ESTs between G. hirsutum and G. barbadense.
2)HAU-SNP357 ~ 811, which were developed by mining G. hirsutum unigenes.
3)HAU-InDel001 ~ 415, which were developed by mining the 3′UTRs of public G. hirsutum sequences.
4)HAU-InDel416 ~ 538, which were developed by blasting putative 3′UTRs of G. hirsutum against the 3′UTRs of Arabidopsis.
5)Subclasses mean different types of clusters/unigenes classified by the number of sequences contained in a cluster/unigene.
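
The polymorphic rates in the table above are consistent with dividing the number of polymorphic markers by the number of markers screened and expressing the result as a percentage (for example, 47/356 gives 13.20%). The following Python sketch reproduces the per-primer-set and overall rates under that assumption; the function name `polymorphic_rate` and the dictionary layout are hypothetical and chosen only for illustration.

```python
from typing import Optional


def polymorphic_rate(n_polymorphic: int, n_markers: int) -> Optional[float]:
    """Percentage of screened markers that are polymorphic.

    Returns None when no markers were screened (shown as "--" in the table).
    """
    if n_markers == 0:
        return None
    return round(100.0 * n_polymorphic / n_markers, 2)


# (markers screened, polymorphic markers) per primer set, copied from the table.
primer_sets = {
    "HAU-SNP 1)": (356, 47),
    "HAU-SNP 2)": (455, 43),
    "HAU-InDel 3)": (415, 41),
    "HAU-InDel 4)": (123, 6),
}

rates = {name: polymorphic_rate(p, n) for name, (n, p) in primer_sets.items()}

# Overall rate across all four primer sets.
total_markers = sum(n for n, _ in primer_sets.values())
total_polymorphic = sum(p for _, p in primer_sets.values())
overall = polymorphic_rate(total_polymorphic, total_markers)

for name, rate in rates.items():
    print(f"{name}: {rate:.2f}%")
print(f"Total: {overall:.2f}%")
```

The rate is computed from polymorphic markers rather than polymorphic loci, since one marker can yield more than one locus (47 markers but 50 loci for HAU-SNP 1)).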
Distribution of SNP and InDel markers on the interspecific BC linkage map
| Chromosome | Size (cM) | Marker interval (cM) | Total loci | SNP loci | HAU-SNP loci 1) | HAU-SNP loci 2) | HAU-InDel loci 3) | HAU-InDel loci 4) |
|---|---|---|---|---|---|---|---|---|
| Chr01 | 186.87 | 2.46 | 76 | 3 | 1 | 2 | 0 | 0 |
| Chr02 | 156.03 | 2.40 | 65 | 6 | 3 | 2 | 1 | 0 |
| Chr03 | 156.23 | 2.06 | 76 | 5 | 2 | 2 | 1 | 0 |
| Chr04 | 140.07 | 2.50 | 56 | 1 | 0 | 0 | 1 | 0 |
| Chr05 | 242.76 | 1.75 | 139 | 6 | 1 | 2 | 3 | 0 |
| Chr06 | 171.43 | 2.04 | 84 | 1 | 1 | 0 | 0 | 0 |
| Chr07 | 103.39 | 1.48 | 70 | 4 | 2 | 2 | 0 | 0 |
| Chr08 | 148.05 | 1.53 | 97 | 4 | 0 | 3 | 1 | 0 |
| Chr09 | 148.83 | 1.43 | 104 | 9 | 2 | 3 | 3 | 1 |
| Chr10 | 179.66 | 1.91 | 94 | 9 | 4 | 2 | 2 | 1 |
| Chr11 | 234.77 | 1.63 | 144 | 7 | 1 | 2 | 4 | 0 |
| Chr12 | 221.04 | 2.19 | 101 | 6 | 3 | 0 | 2 | 1 |
| Chr13 | 208.14 | 2.12 | 98 | 5 | 2 | 2 | 1 | 0 |
| AT genome | 2297.27 | 1.91 | 1204 | 66 | 22 | 22 | 19 | 3 |
| Chr14 | 156.15 | 1.63 | 96 | 4 | 1 | 0 | 2 | 1 |
| Chr15 | 189.00 | 1.60 | 118 | 5 | 2 | 0 | 3 | 0 |
| Chr16 | 94.32 | 0.99 | 95 | 5 | 4 | 1 | 0 | 0 |
| Chr17 | 149.43 | 2.13 | 70 | 5 | 3 | 2 | 0 | 0 |
| Chr18 | 146.95 | 1.47 | 100 | 6 | 3 | 2 | 0 | 1 |
| Chr19 | 252.27 | 1.64 | 154 | 8 | 1 | 4 | 3 | 0 |
| Chr20 | 107.50 | 1.00 | 108 | 6 | 3 | 0 | 2 | 1 |
| Chr21 | 256.03 | 1.82 | 141 | 5 | 1 | 3 | 1 | 0 |
| Chr22 | 166.03 | 1.84 | 90 | 3 | 1 | 0 | 2 | 0 |
| Chr23 | 193.19 | 1.82 | 106 | 4 | 1 | 2 | 1 | 0 |
| Chr24 | 187.04 | 1.64 | 114 | 3 | 1 | 1 | 1 | 0 |
| Chr25 | 151.25 | 1.41 | 107 | 5 | 3 | 1 | 1 | 0 |
| Chr26 | 197.09 | 1.70 | 116 | 8 | 1 | 3 | 3 | 1 |
| DT genome | 2246.24 | 1.59 | 1415 | 67 | 25 | 19 | 19 | 4 |
| Total | 4543.51 | 1.73 | 2619 | 133 | 47 | 41 | 38 | 7 |
1)HAU-SNP001 ~ 356.
2)HAU-SNP357 ~ 811.
3)HAU-InDel001 ~ 415.
4)HAU-InDel416 ~ 538.
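
The marker interval column is consistent with the chromosome size in cM divided by the total number of loci (for example, 186.87/76 gives 2.46 cM for Chr01). A minimal Python sketch under that assumption, using a few rows copied from the table; the helper name `marker_interval` is hypothetical.

```python
def marker_interval(size_cm: float, n_loci: int) -> float:
    """Average spacing between loci in cM, assumed here to be size / loci."""
    if n_loci <= 0:
        raise ValueError("n_loci must be positive")
    return round(size_cm / n_loci, 2)


# (size in cM, total loci) for selected rows of the linkage-map table.
rows = {
    "Chr01": (186.87, 76),
    "Chr16": (94.32, 95),
    "AT genome": (2297.27, 1204),
    "DT genome": (2246.24, 1415),
    "Total": (4543.51, 2619),
}

intervals = {name: marker_interval(size, loci) for name, (size, loci) in rows.items()}

for name, interval in intervals.items():
    print(f"{name}: {interval:.2f} cM")
```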
Summary of cotton base variations
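| Types of SNPs | Interspecific EST-SNPs: inter/hemi-SNPs | Interspecific EST-SNPs: hemi-SNPs | Interspecific EST-SNPs: subtotal | Intraspecific EST-SNPs: subclass 4 | Intraspecific EST-SNPs: subclass 5 | Intraspecific EST-SNPs: subclass 6 | Intraspecific EST-SNPs: subclass 7 | Intraspecific EST-SNPs: subclass ≥8 | Intraspecific EST-SNPs: subtotal | Total |
|---|---|---|---|---|---|---|---|---|---|---|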
| C → T | 265 (26.26%) | 1036 (29.21%) | 1301 (28.56%) | 158 (30.10%) | 153 (36.00%) | 249 (27.04%) | 103 (29.94%) | 170 (24.50%) | 833 (28.64%) | 2134 (28.59%) |
| G → A | 271 (26.86%) | 981 (27.66%) | 1252 (27.48%) | 147 (28.00%) | 120 (28.24%) | 244 (26.49%) | 97 (28.20%) | 170 (24.50%) | 778 (26.74%) | 2030 (27.19%) |
| All transitions | 536 (53.12%) | 2017 (56.86%) | 2553 (56.04%) | 305 (58.10%) | 273 (64.24%) | 493 (53.53%) | 200 (58.14%) | 340 (48.99%) | 1611 (55.38%) | 4164 (55.78%) |
| C → G | 74 (7.33%) | 282 (7.95%) | 356 (7.81%) | 40 (7.62%) | 12 (2.82%) | 61 (6.62%) | 17 (4.94%) | 47 (6.77%) | 177 (6.09%) | 533 (7.14%) |
| A → T | 82 (8.13%) | 410 (11.56%) | 492 (10.80%) | 45 (8.57%) | 41 (9.65%) | 96 (10.42%) | 33 (9.59%) | 61 (8.79%) | 276 (9.49%) | 768 (10.29%) |
| C → A | 88 (8.72%) | 304 (8.57%) | 392 (8.60%) | 33 (6.29%) | 31 (7.29%) | 67 (7.27%) | 23 (6.69%) | 53 (7.64%) | 207 (7.12%) | 599 (8.02%) |
| T → G | 97 (9.61%) | 304 (8.57%) | 401 (8.80%) | 42 (8.00%) | 32 (7.53%) | 79 (8.58%) | 19 (5.52%) | 73 (10.52%) | 245 (8.42%) | 646 (8.65%) |
| All transversions | 341 (33.80%) | 1300 (36.65%) | 1641 (36.02%) | 160 (30.48%) | 116 (27.29%) | 303 (32.90%) | 92 (26.74%) | 234 (33.72%) | 905 (31.11%) | 2546 (34.11%) |
| A/- | 52 (5.15%) | 56 (1.58%) | 108 (2.37%) | 18 (3.43%) | 13 (3.06%) | 40 (4.34%) | 25 (7.27%) | 47 (6.77%) | 143 (4.92%) | 251 (3.36%) |
| C/- | 22 (2.18%) | 36 (1.01%) | 58 (1.27%) | 13 (2.48%) | 5 (1.18%) | 29 (3.15%) | 4 (1.16%) | 16 (2.31%) | 67 (2.30%) | 125 (1.67%) |
| G/- | 28 (2.78%) | 37 (1.04%) | 65 (1.43%) | 16 (3.05%) | 3 (0.71%) | 27 (2.93%) | 11 (3.20%) | 22 (3.17%) | 79 (2.72%) | 144 (1.93%) |
| T/- | 30 (2.97%) | 101 (2.85%) | 131 (2.88%) | 13 (2.48%) | 15 (3.53%) | 29 (3.15%) | 12 (3.49%) | 35 (5.04%) | 104 (3.58%) | 235 (3.15%) |
| All InDels | 132 (13.08%) | 230 (6.48%) | 362 (7.95%) | 60 (11.43%) | 36 (8.47%) | 125 (13.57%) | 52 (15.12%) | 120 (17.29%) | 393 (13.51%) | 755 (10.11%) |
| Total | 1009 (100.00%) | 3547 (100.00%) | 4556 (100.00%) | 525 (100.00%) | 425 (100.00%) | 921 (100.00%) | 344 (100.00%) | 694 (100.00%) | 2909 (100.00%) | 7465 (100.00%) |
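
The grouping of rows into transitions, transversions, and InDels follows the standard definition: a transition exchanges two purines (A, G) or two pyrimidines (C, T), a transversion exchanges a purine for a pyrimidine, and an InDel pairs a base with a gap. The Python sketch below applies that rule to the counts in the Total column; the function name `classify_variation` and the data layout are hypothetical, and the code is only an illustration of the classification, not the pipeline used in the study.

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
VALID = PURINES | PYRIMIDINES | {"-"}


def classify_variation(allele_a: str, allele_b: str) -> str:
    """Return 'transition', 'transversion', or 'indel' ('-' denotes a gap)."""
    a, b = allele_a.upper(), allele_b.upper()
    if a not in VALID or b not in VALID:
        raise ValueError(f"unexpected allele: {allele_a!r} or {allele_b!r}")
    if a == b:
        raise ValueError("identical alleles are not a variation")
    if "-" in (a, b):
        return "indel"
    if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
        return "transition"
    return "transversion"


# Counts copied from the Total column of the table above.
total_counts = {
    ("C", "T"): 2134, ("G", "A"): 2030,
    ("C", "G"): 533, ("A", "T"): 768, ("C", "A"): 599, ("T", "G"): 646,
    ("A", "-"): 251, ("C", "-"): 125, ("G", "-"): 144, ("T", "-"): 235,
}

tally = Counter()
for (a, b), n in total_counts.items():
    tally[classify_variation(a, b)] += n

grand_total = sum(tally.values())
for kind in ("transition", "transversion", "indel"):
    print(f"{kind}: {tally[kind]} ({100 * tally[kind] / grand_total:.2f}%)")
```

The three tallies match the "All transitions", "All transversions", and "All InDels" rows of the Total column.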
GO analysis of consensus sequences used to design the HAU-SNP-prefixed markers on level 1
| Marker set | Functional categories | Number of genes | Number of SNPs | SNPs/gene |
|---|---|---|---|---|
| HAU-SNP 1) | Cellular component | 21 | 292 | 13.90 |
| HAU-SNP 1) | Molecular function | 107 | 1001 | 9.36 |
| HAU-SNP 1) | Biological process | 80 | 845 | 10.56 |
| HAU-SNP 2) | Cellular component | 7 | 43 | 6.14 |
| HAU-SNP 2) | Molecular function | 92 | 594 | 6.46 |
| HAU-SNP 2) | Biological process | 78 | 406 | 5.21 |
| Total | Cellular component | 28 | 335 | 11.96 |
| Total | Molecular function | 199 | 1595 | 8.02 |
| Total | Biological process | 158 | 1251 | 7.92 |
1)HAU-SNP001 ~ 356.
2)HAU-SNP357 ~ 811.
Figure 1. Sequence comparisons between the analyses and the actual Sanger sequencing results of PCR products from two cotton genotypes (Emian 22 and 3–79). Vector and primer sequences are removed, and the predicted SNPs are shown in boxes; Gh, Gb, DT and DW are from GenBank. a) Marker HAU-SNP248; b) Marker HAU-SNP304; and c) Marker HAU-SNP504.
Figure 2. Three types of clusters in the analysis of interspecific EST-SNPs. a: Clusters with only inter-homoeologue SNPs (both G. hirsutum and G. barbadense harbor two base types at a given position, and the base types do not differ between them); b: Clusters with inter/hemi-SNPs (G. hirsutum and G. barbadense harbor different base types at one or more positions); c: Clusters with only hemi-SNPs (one of the two species harbors a single base type at a given position while the other harbors two, so the base types differ only partially).
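
The three cluster types in the Figure 2 caption can be expressed as a small decision rule over the sets of bases each species shows at a position. The Python sketch below is one possible reading of the caption, not the authors' pipeline: the function names, the priority order (any fully different position makes the cluster type b, otherwise any partially different position makes it type c), and the "unclassified" fallback are all assumptions made for illustration.

```python
from typing import Iterable, Set, Tuple


def classify_site(gh: Set[str], gb: Set[str]) -> str:
    """Classify one position from the base types seen in each species."""
    if not gh or not gb:
        raise ValueError("each species needs at least one base type")
    if gh == gb:
        if len(gh) == 1:
            return "monomorphic"
        # Both species share the same two base types.
        return "inter-homoeologue" if len(gh) == 2 else "other"
    if gh.isdisjoint(gb):
        # No base type in common between the two species.
        return "inter"
    if sorted((len(gh), len(gb))) == [1, 2]:
        # One species has one base type, the other has that base plus one more.
        return "hemi"
    return "other"


def classify_cluster(sites: Iterable[Tuple[Set[str], Set[str]]]) -> str:
    """Assign a cluster to type a, b, or c of Figure 2 (assumed priority order)."""
    kinds = {classify_site(gh, gb) for gh, gb in sites} - {"monomorphic"}
    if "inter" in kinds:
        return "b: inter/hemi-SNPs"
    if "hemi" in kinds:
        return "c: only hemi-SNPs"
    if kinds == {"inter-homoeologue"}:
        return "a: only inter-homoeologue SNPs"
    return "unclassified"


# Toy clusters: each tuple is (G. hirsutum bases, G. barbadense bases) at one position.
cluster_a = [({"A", "G"}, {"A", "G"}), ({"C"}, {"C"})]
cluster_b = [({"A"}, {"G"}), ({"C"}, {"C", "T"})]
cluster_c = [({"C"}, {"C", "T"}), ({"A", "G"}, {"A", "G"})]

for name, cluster in (("a", cluster_a), ("b", cluster_b), ("c", cluster_c)):
    print(name, "->", classify_cluster(cluster))
```

Positions that fit none of the caption's patterns (for example, two base types in each species with only one shared) are reported as "other" and do not influence the cluster type in this sketch.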
Figure 3. Two types of unigenes in the analysis of intraspecific EST-SNPs. a: Unigenes with only putative SNPs; b: Unigenes with reliable SNPs.