Stacey L Lance, Cara N Love, Schyler O Nunziata, Jason R O'Bryhim, David E Scott, R Wesley Flynn, Kenneth L Jones.
Abstract
Development and optimization of novel species-specific microsatellites, or simple sequence repeats (SSRs), remains an important step for studies in ecology, evolution, and behavior. Numerous approaches exist for identifying new SSRs, and they vary widely in both time and cost investments. A recent approach that uses paired-end Illumina sequence data in conjunction with the bioinformatics pipeline PAL_FINDER has the potential to substantially reduce the cost and labor investment while also improving efficiency. However, the approach does not appear to have been widely adopted, perhaps due to concerns over its broad applicability across taxa. Therefore, to validate the utility of the approach, we developed SSRs for 32 species representing 30 families, 25 orders, 11 classes, and six phyla, and optimized SSRs for 13 of the species. Overall, the Illumina paired-end (IPE) method worked extremely well: we identified thousands of SSRs for all species (mean = 128,485), 17% of which were potentially amplifiable loci (PALs), and 25% of these PALs met our most stringent criteria, designed to avoid SSRs associated with repetitive elements. Approximately 61% of screened primers yielded strong amplification of a single locus.
Year: 2013 PMID: 24312368 PMCID: PMC3842982 DOI: 10.1371/journal.pone.0081853
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Taxonomic information for the 32 species sequenced.
| Sample | Kingdom | Phylum | Class | Order | Family |
|---|---|---|---|---|---|
| **1** | Animalia | Arthropoda | Insecta | Coleoptera | Dytiscidae |
| 2 | Animalia | Arthropoda | Insecta | Hemiptera | Plataspidae |
| **3** | Animalia | Arthropoda | Insecta | Lepidoptera | Nymphalidae |
| 4 | Animalia | Arthropoda | Insecta | Plecoptera | Capniidae |
| **5** | Animalia | Arthropoda | Malacostraca | Decapoda | Lithodidae |
| **6** | Animalia | Arthropoda | Malacostraca | Decapoda | Ocypodidae |
| 7 | Animalia | Arthropoda | Malacostraca | Decapoda | Ocypodidae |
| 8 | Animalia | Chordata | Actinopterygii | Cypriniformes | Cyprinidae |
| **9** | Animalia | Chordata | Actinopterygii | Salmoniformes | Salmonidae |
| **10** | Animalia | Chordata | Amphibia | Caudata | Ambystomatidae |
| **11** | Animalia | Chordata | Amphibia | Caudata | Plethodontidae |
| **12** | Animalia | Chordata | Aves | Charadriiformes | Alcidae |
| **13** | Animalia | Chordata | Aves | Charadriiformes | Alcidae |
| 14 | Animalia | Chordata | Aves | Passeriformes | Troglodytidae |
| **15** | Animalia | Chordata | Aves | Pelecaniformes | Pelecanidae |
| 16 | Animalia | Chordata | Aves | Pelecaniformes | Sulidae |
| **17** | Animalia | Chordata | Aves | Procellariiformes | Hydrobatidae |
| **18** | Animalia | Chordata | Mammalia | Cetacea | Delphinidae |
| 19 | Animalia | Chordata | Mammalia | Chiroptera | Phyllostomatidae |
| 20 | Animalia | Chordata | Mammalia | Didelphimorphia | Didelphidae |
| 21 | Animalia | Chordata | Mammalia | Rodentia | Cricetidae |
| 22 | Animalia | Chordata | Reptilia | Squamata | Colubridae |
| 23 | Animalia | Chordata | Reptilia | Squamata | Phrynosomatidae |
| 24 | Animalia | Chordata | Reptilia | Testudines | Geoemydidae |
| **25** | Animalia | Mollusca | Bivalvia | Unionoida | Unionidae |
| 26 | Plantae | Embryophyta | Equisetopsida | Asterales | Campanulaceae |
| **27** | Plantae | Magnoliophyta | Magnoliopsida | Asterales | Asteraceae |
| **28** | Plantae | Magnoliophyta | Magnoliopsida | Caryophyllales | Cactaceae |
| **29** | Plantae | Magnoliophyta | Magnoliopsida | Fabales | Fabaceae |
| 30 | Plantae | Magnoliophyta | Magnoliopsida | Rosales | Rosaceae |
| 31 | Plantae | Magnoliophyta | Magnoliopsida | Scrophulariales | Scrophulariaceae |
| 32 | Plantae | Tracheophyta | Coniferopsida | Coniferales | Cupressaceae |
Sample number in bold indicates a Nextera library preparation method was used instead of the standard Illumina preparation.
The number of paired-end reads, out of 5 million, that contain microsatellites and, within those, the number that contain suitable sequence for primers and are considered potentially amplifiable loci (PALs).
| Sample | Reads with SSRs | PALs | Hexanucleotide | Pentanucleotide | Tetranucleotide | Trinucleotide | Dinucleotide |
|---|---|---|---|---|---|---|---|
| **1** | 50,735 | 2,576 | 1,333 | 3,413 | 6,072 | 3,946 | 35,971 |
| 2 | 86,717 | 13,953 | 28 | 122 | 2,408 | 6,674 | 77,485 |
| **3** | 62,927 | 6,998 | 250 | 34,241 | 1,790 | 4,599 | 6,747 |
| 4 | 73,137 | 13,090 | 2,462 | 11,669 | 9,277 | 14,391 | 35,338 |
| **5** | 430,868 | 54,838 | 350 | 194,790 | 20,956 | 51,573 | 163,199 |
| **6** | 644,886 | 144,502 | 70 | 13,010 | 42,400 | 199,907 | 389,499 |
| 7 | 545,301 | 94,805 | 114 | 13,360 | 40,449 | 88,638 | 402,740 |
| 8 | 238,812 | 30,099 | 2,796 | 1,560 | 106,375 | 9,013 | 119,069 |
| **9** | 286,604 | 26,109 | 140 | 257 | 1,943 | 3,374 | 20,395 |
| **10** | 5,970 | 1,582 | 4 | 70 | 290 | 554 | 664 |
| **11** | 27,272 | 4,198 | 1,572 | 1,043 | 16,853 | 4,281 | 3,523 |
| **12** | 14,288 | 2,136 | 4,189 | 2,054 | 2,246 | 1,995 | 3,804 |
| **13** | 17,166 | 3,093 | 26 | 274 | 608 | 1,444 | 741 |
| 14 | 113,109 | 4,760 | 64,127 | 28,928 | 11,599 | 5,837 | 2,618 |
| **15** | 12,421 | 2,554 | 2,450 | 3,459 | 1,344 | 3,032 | 2,135 |
| 16 | 82,003 | 3,913 | 4,275 | 69,353 | 1,684 | 4,531 | 2,160 |
| **17** | 2,541 | 418 | 592 | 390 | 217 | 646 | 696 |
| **18** | 34,387 | 6,999 | 2,150 | 301 | 4,110 | 2,411 | 25,415 |
| 19 | 25,278 | 7,403 | 2,774 | 253 | 4,344 | 3,096 | 14,811 |
| 20 | 94,285 | 12,811 | 3,865 | 2,821 | 36,927 | 13,016 | 37,656 |
| 21 | 132,502 | 33,500 | 86 | 316 | 4,433 | 3,817 | 24,848 |
| 22 | 244,857 | 26,215 | 302 | 4,144 | 8,975 | 5,967 | 6,827 |
| 23 | 139,529 | 46,255 | 4,320 | 1,092 | 21,778 | 63,513 | 48,827 |
| 24 | 22,319 | 6,370 | 19 | 71 | 486 | 1,146 | 4,648 |
| **25** | 105,238 | 8,601 | 4,015 | 606 | 44,611 | 13,035 | 42,971 |
| 26 | 37,868 | 7,242 | 8 | 12 | 60 | 1,440 | 5,722 |
| **27** | 31,634 | 7,607 | 75 | 405 | 405 | 4,555 | 2,167 |
| **28** | 60,583 | 6,964 | 58 | 539 | 1,159 | 2,597 | 2,611 |
| **29** | 391,973 | 5,845 | 105 | 2,154 | 426 | 1,841 | 1,319 |
| 30 | 42,786 | 14,777 | 1,295 | 723 | 606 | 14,632 | 25,530 |
| 31 | 32,170 | 7,232 | 400 | 147 | 484 | 7,907 | 23,232 |
| 32 | 21,352 | 2,853 | 18 | 36 | 87 | 1,375 | 1,337 |
Also included are the number of those SSRs that contained hexanucleotide, pentanucleotide, tetranucleotide, trinucleotide, or dinucleotide repeats. Sample number in bold indicates a Nextera library preparation method was used instead of the standard Illumina preparation.
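
As a rough illustration of how the first two counts in each row relate, the share of SSR-containing reads that qualify as PALs can be computed directly. The sketch below uses the counts for samples 2, 4, and 7 from the table, assuming the first value is the number of reads with SSRs and the second is the number of PALs (the order given in the caption). The function and variable names are hypothetical and are not part of PAL_FINDER.

```python
# Counts copied from the table rows for samples 2, 4, and 7:
# (reads containing SSRs, potentially amplifiable loci).
# The column order is an assumption based on the caption.
counts = {
    2: (86_717, 13_953),
    4: (73_137, 13_090),
    7: (545_301, 94_805),
}


def pal_fraction(ssr_reads: int, pals: int) -> float:
    """Return the fraction of SSR-containing reads that are PALs."""
    if ssr_reads <= 0:
        raise ValueError("ssr_reads must be positive")
    return pals / ssr_reads


# Per-sample PAL share, keyed by sample number.
fractions = {sample: pal_fraction(*pair) for sample, pair in counts.items()}

for sample, frac in fractions.items():
    print(f"sample {sample}: {frac:.1%} of SSR-containing reads are PALs")
```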
Results of General Linear Model analysis examining role of taxonomy on the number of sequences that had microsatellites (No. msats), the number of PALs, the number of PALs that were different repeat types, the number of premium PALs (pPALs), the number of pPALs that were different repeat types, and the proportion of PALs that were pPALs.
| | | | |
|---|---|---|---|
| No. msats | NS | NS | <0.0001 |
| No. PALs | NS | NS | <0.0001 |
| 6mers | NS | NS | NS |
| 5mers | NS | NS | NS |
| 4mers | NS | NS | 0.0491 |
| 3mers | NS | NS | 0.0016 |
| 2mers | NS | 0.05 | <0.0001 |
| No. pPALs | NS | NS | 0.0003 |
| 6mers | NS | NS | NS |
| 5mers | NS | NS | NS |
| 4mers | 0.06 | NS | 0.0061 |
| 3mers | NS | NS | 0.0032 |
| 2mers | NS | NS | 0.0001 |
| Proportion pPALs | NS | 0.0207 | <0.0001 |
Figure 1. The mean and 95% upper confidence limit (values in parentheses are high values that go off the scale) for the number of SSRs (a), PALs (b), pPALs (c), and percent of PALs that were pPALs (d) observed across classes.
Sample number and, for each sample, the number of pPALs found and the number of those that contained hexanucleotide, pentanucleotide, tetranucleotide, trinucleotide, or dinucleotide repeats.
| Sample | pPALs | Hexanucleotide | Pentanucleotide | Tetranucleotide | Trinucleotide | Dinucleotide |
|---|---|---|---|---|---|---|
| **1** | 201 | 3 | 0 | 3 | 71 | 124 |
| 2 | 2,423 | 0 | 2 | 12 | 238 | 2,171 |
| **3** | 136 | 0 | 1 | 44 | 53 | 38 |
| 4 | 937 | 2 | 39 | 68 | 180 | 648 |
| **5** | 19,407 | 16 | 51 | 913 | 3,213 | 15,214 |
| **6** | 52,682 | 2 | 239 | 2,368 | 12,449 | 37,624 |
| 7 | 24,022 | 1 | 179 | 1,061 | 5,879 | 16,902 |
| 8 | 4,635 | 3 | 21 | 188 | 439 | 3,984 |
| **9** | 6,671 | 26 | 32 | 491 | 830 | 5,292 |
| **10** | 322 | 1 | 9 | 62 | 91 | 159 |
| **11** | 1,118 | 13 | 54 | 426 | 411 | 214 |
| **12** | 667 | 11 | 51 | 165 | 287 | 148 |
| **13** | 1,016 | 6 | 83 | 246 | 419 | 262 |
| 14 | 845 | 29 | 59 | 149 | 377 | 231 |
| **15** | 626 | 9 | 55 | 107 | 317 | 138 |
| 16 | 949 | 20 | 69 | 119 | 442 | 299 |
| **17** | 165 | 1 | 11 | 29 | 69 | 56 |
| **18** | 2,150 | 2 | 8 | 261 | 297 | 1,582 |
| 19 | 3,178 | 8 | 29 | 442 | 454 | 2,246 |
| 20 | 7,049 | 30 | 65 | 1,062 | 1,595 | 4,297 |
| 21 | 17,797 | 39 | 120 | 1,914 | 1,695 | 14,029 |
| 22 | 6,314 | 48 | 474 | 1,948 | 1,563 | 2,281 |
| 23 | 14,511 | 10 | 107 | 2,014 | 6,509 | 5,871 |
| 24 | 2,545 | 8 | 22 | 169 | 411 | 1,935 |
| **25** | 1,163 | 0 | 3 | 91 | 285 | 784 |
| 26 | 2,722 | 2 | 6 | 15 | 413 | 2,286 |
| **27** | 813 | 6 | 38 | 49 | 466 | 254 |
| **28** | 1,208 | 9 | 97 | 94 | 422 | 586 |
| **29** | 803 | 6 | 145 | 65 | 382 | 205 |
| 30 | 402 | 8 | 6 | 10 | 97 | 281 |
| 31 | 791 | 3 | 2 | 5 | 195 | 586 |
| 32 | 1,180 | 3 | 6 | 39 | 421 | 711 |
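
One way to sanity-check these counts is to confirm that, for a given sample, the five repeat-class counts add up to the pPAL total, and then to express pPALs as a share of that sample's PALs. The sketch below does this for samples 2, 4, and 7, using values copied from the rows above together with the PAL counts for the same samples from the PAL table. All names are illustrative, and the repeat-class order is assumed to follow the caption.

```python
# sample: (pPAL total, hexa, penta, tetra, tri, di), copied from the table.
ppal_rows = {
    2: (2_423, 0, 2, 12, 238, 2_171),
    4: (937, 2, 39, 68, 180, 648),
    7: (24_022, 1, 179, 1_061, 5_879, 16_902),
}

# PAL counts for the same samples, copied from the PAL table.
pal_totals = {2: 13_953, 4: 13_090, 7: 94_805}


def repeat_classes_match(row: tuple) -> bool:
    """Check that the per-repeat-class counts sum to the pPAL total."""
    total, *by_class = row
    return sum(by_class) == total


# True where the row is internally consistent.
consistent = {s: repeat_classes_match(r) for s, r in ppal_rows.items()}

# Share of each sample's PALs that met the stricter pPAL criteria.
ppal_share = {s: ppal_rows[s][0] / pal_totals[s] for s in ppal_rows}

for s in ppal_rows:
    print(f"sample {s}: consistent={consistent[s]}, pPAL share={ppal_share[s]:.1%}")
```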
Figure 2. Frequency histograms of forward primer sequence copy number within 5 million paired-end reads.
The proportion of all primers observed 1, 2-10, 11-100, 101-1,000, 1,001-10,000, 10,001-100,000, or >100,000 times is shown for Mammalia (a), Insecta (b), and Magnoliopsida (c).
Forty-eight primers were tested for amplification across 13 species.
| | | | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Number of loci with good amplification | 11 | 24 | 26 | 25 | 19 | 23 | 29 | 11 | 22 | 29 | 40 | 11 | 30 |
| Number of loci with good amplification but products that were too small (e.g., <100 bp) | 0 | 3 | 2 | 0 | 0 | 1 | 5 | 6 | 3 | 4 | 1 | 24 | 1 |
| Number of loci that would require further optimization | 14 | 12 | 10 | 9 | 11 | 15 | 3 | 16 | 13 | 5 | 5 | 9 | 8 |
| Number of loci that yielded zero amplification | 23 | 9 | 10 | 14 | 18 | 9 | 11 | 15 | 10 | 10 | 2 | 4 | 8 |
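
The four outcome rows can be tallied per species column to see how the tested primers were distributed. The following is a minimal sketch with the values copied from the table in column order; the species behind each column are not identified here, so columns are referred to by position only, and the variable names are hypothetical. Totals are computed from the data rather than assumed.

```python
# One entry per species column (13 columns), copied from the table rows.
good = [11, 24, 26, 25, 19, 23, 29, 11, 22, 29, 40, 11, 30]
too_small = [0, 3, 2, 0, 0, 1, 5, 6, 3, 4, 1, 24, 1]
needs_optimization = [14, 12, 10, 9, 11, 15, 3, 16, 13, 5, 5, 9, 8]
no_amplification = [23, 9, 10, 14, 18, 9, 11, 15, 10, 10, 2, 4, 8]

# Number of primer pairs accounted for in each species column.
column_totals = [
    sum(col)
    for col in zip(good, too_small, needs_optimization, no_amplification)
]

# Fraction of each column's primers that amplified well at a usable size.
good_rate = [g / total for g, total in zip(good, column_totals)]

for position, (total, rate) in enumerate(zip(column_totals, good_rate), start=1):
    print(f"column {position}: {total} primers tallied, {rate:.0%} good amplification")
```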