Xuan-Min Guang1, Jin-Quan Xia1, Jian-Qing Lin1, Jun Yu1, Qiu-Hong Wan1, Sheng-Guo Fang2.
Abstract
Simple sequence repeats (SSRs), also known as microsatellites, consist of tandemly repeated motifs of 1–6 bases. They have become one of the most popular classes of molecular markers and are widely used in molecular ecology, conservation biology, molecular breeding, and many other fields. Previously reported methods identify both monomorphic and polymorphic SSRs and rely on experimental validation to determine which are polymorphic, which is potentially time-consuming and costly. Herein, we present a new strategy, insertion/deletion (INDEL) SSR (IDSSR), that identifies polymorphic SSRs by integrating SSRs with nucleotide INDELs, based solely on a single genome sequence and the sequenced paired-end reads. The pipeline reports the INDEL indexes and polymorphic SSRs together with the number of repeats, repeat motifs, chromosome location, annealing temperature, and primer sequences, enabling subsequent experiments to confirm their correctness and polymorphism. Experimental validation in the giant panda demonstrated that the method has high reliability and stability. This efficient SSR pipeline should help researchers obtain high-quality genetic markers for plants and animals of interest, save labor, and reduce costly marker-screening experiments. IDSSR is freely available at https://github.com/Allsummerking/IDSSR.
Keywords: IDSSR; INDEL; efficient; high quality; polymorphic SSRs
Year: 2019 PMID: 31315288 PMCID: PMC6678329 DOI: 10.3390/ijms20143497
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Distribution of polymorphic simple sequence repeats (SSRs) in the giant panda and Gallus gallus.
| Species | Measure | Dinucleotide | Trinucleotide | Tetranucleotide | Pentanucleotide | Hexanucleotide | Total |
|---|---|---|---|---|---|---|---|
| Giant panda | Number | 4064 | 428 | 363 | 26 | 1 | 4882 |
| | Length (bp) | 75,324 | 6933 | 7204 | 580 | 24 | 90,065 |
| | Percentage of repeats | 83.24% | 8.77% | 7.44% | 0.53% | 0.02% | 100% |
| Gallus gallus | Number | 453 | 621 | 763 | 273 | 47 | 2157 |
| | Length (bp) | 9412 | 8067 | 19,176 | 8650 | 1482 | 46,787 |
| | Percentage of repeats | 21.00% | 28.79% | 35.37% | 12.66% | 2.17% | 100% |
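The percentage rows in the table above appear to be count-based: each motif class's number divided by the species total. The short Python sketch below recomputes them from the reported counts. The function name `motif_shares` and the dictionary layout are illustrative choices, not part of IDSSR. The recomputed values agree with the table to within about 0.01 percentage points.

```python
# Counts of polymorphic SSRs per motif class, copied from the table above.
COUNTS = {
    "Giant panda": {"di": 4064, "tri": 428, "tetra": 363, "penta": 26, "hexa": 1},
    "Gallus gallus": {"di": 453, "tri": 621, "tetra": 763, "penta": 273, "hexa": 47},
}


def motif_shares(counts):
    """Return (total, {motif class: percentage of total}) for one species."""
    total = sum(counts.values())
    shares = {motif: round(100 * n / total, 2) for motif, n in counts.items()}
    return total, shares


for species, counts in COUNTS.items():
    total, shares = motif_shares(counts)
    print(species, total, shares)
```

Under this reading, dinucleotide repeats dominate the giant panda set, while the Gallus gallus set is spread more evenly across tri-, tetra-, and dinucleotide motifs.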
Figure 1. Locations of simple sequence repeats in the giant panda genome.
The primers of the giant panda microsatellites.
| Locus | Number of Bases per Repeat Unit | Forward Primer Sequence | Reverse Primer Sequence | Label * |
|---|---|---|---|---|
| GP1 | 6 | CTCGTGCTGGGCTGAAGAGAGAAG | CCCCATCACAATGTCTGCAGCTG | 5′-TET |
| GP2 | 5 | GATGGGCCACCTTGACATGTACAT | ACTGAAGACCCAGGAGAGAGCTTT | 5′-FAM |
| GP3 | 5 | AACAAAAACCCCCAAACCAAACCC | GGTCGGTAGCTATGAAGTGTTGGG | 5′-FAM |
| GP4 | 5 | TCATTGTTACTCTGCCTGTATCTGTT | CTTGTGCTCTCTCTCCGTCAAATA | - |
| GP5 | 5 | ACCACAGCCAAGGGTTGTATTGTT | GGGTTGTGAGTTGAAGCCCTACAT | 5′-FAM |
| GP6 | 5 | CTCAAGGCAGTTGTTCCCACTCTT | TCCATATTGGAAAACCCTACACTGGAA | 5′-FAM |
| GP7 | 5 | TGGTGGTAATGAAATCCCTCAGCT | CTTCTATCCTCAGTGAAGCCGTCC | 5′-TET |
| GP8 | 5 | CTTACTTTCACATCTGGGCCCTCC | ACATGCAATGAAACAGGGACCACT | 5′-TET |
| GP9 | 5 | TTAACTGGGGGTGTACTGGATGGT | TAAGGGTGCTATTCTCGCCATTCC | 5′-FAM |
| GP10 | 5 | CTCGGAGGGCATCTGTTGGATTAA | CCATGAGCGTGGGGCCTATTTAAA | 5′-FAM |
| GP11 | 5 | TCTTCAACAAAACAATTCTTTTGCTTGT | TTAAAACCAGCGTGGCAGATTTTG | - |
| GP12 | 5 | CCAACTCACGGAGGGGATATCAAG | AACCACATCCTATTCTGACTGCCT | - |
| GP13 | 5 | CCTCAACTCCTTCCCCTGCAAAAT | GGTGTCGTCAAGTACATGGGTCTC | 5′-TET |
| GP14 | 5 | TCTGTCAGCTGAGTTGACCTTGAG | TTTGCAGCAAAAAGTTCTCTTGCC | - |
| GP15 | 5 | GAGACAGGCTATCTTACATTGGGCT | AATTGTAGCAGGGTCTCATGGCTG | 5′-HEX |
| GP16 | 5 | TATCTCTAAGTGCCCTGGGGTCAG | CGGACTCGTTCCTAGTGTGTGG | 5′-HEX |
| GP17 | 5 | TCGTTGAACGCCACATCAAAAACT | TTCAGGATTCTGGGCACTACTGGA | 5′-HEX |
| GP18 | 5 | TCGAGGGCTTGCGACTTTATTTCA | AGAGCTGGATTGGAGAAAGCTTGA | 5′-TET |
| GP19 | 5 | AGGAAGGGAAGGGAAGGGAAAGAA | TCCTCACAAACCAGAGAGTATGGGA | - |
| GP20 | 5 | TGCTCGAAAGGAAACTACCAGGAA | CCAAGGTCATGGAGGCACATTTTA | 5′-HEX |
| GP21 | 5 | ACAAATGCAATAGAAGGGAAAGTCTGT | ATGGTGCCCTGGGTGTTATACG | - |
| GP22 | 5 | TTTGGAGAGGCGGAAAGAGCTTTT | TTTTGCTGCGAGGAGGTGATAGTC | 5′-HEX |
| GP23 | 5 | GGCGTCCCAGTACGTAACTCTCTA | ATACACTTTGGAGGCACCTGGATG | 5′-TET |
| GP24 | 5 | GATATTCTCTCTCCCTCTCCCCTG | TTCCATTTTGAGCCAAAAGTTACTTAGT | 5′-TET |
| GP25 | 5 | CATCTGAGCACTTGAAAGCCAGT | GTCACTACAGCAATCATATAACCCTGT | 5′-HEX |
| GP26 | 5 | CTCAGGATCGTGAGTTTAAGCCCC | GGTTGTCTTATTTCCTGTGCATTTGGT | 5′-HEX |
| GP27 | 5 | TCCAGCTAAACAAACTGCCCTTCT | CTACTGGTCAGCTGCAAGGACTTG | 5′-TET |
* TET: tetrachloro-6-carboxyfluorescein; FAM: 6-carboxyfluorescein; HEX: hexachloro-6-carboxyfluorescein.
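Before ordering primers such as those listed above, a quick sanity check of length and GC content is common practice. The Python sketch below does this for the GP1 pair, with sequences copied from the table. The helper `gc_content` is a hypothetical name, and the check is not described in the source as part of IDSSR.

```python
def gc_content(primer):
    """Fraction of G and C bases in a primer sequence."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


# GP1 forward and reverse primers, copied from the primer table.
PRIMERS = {
    "GP1": ("CTCGTGCTGGGCTGAAGAGAGAAG", "CCCCATCACAATGTCTGCAGCTG"),
}

for locus, (forward, reverse) in PRIMERS.items():
    print(locus, "forward:", len(forward), "nt, GC =", round(gc_content(forward), 3))
    print(locus, "reverse:", len(reverse), "nt, GC =", round(gc_content(reverse), 3))
```

Extending `PRIMERS` with the remaining loci would give a per-locus overview of the whole panel.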
Figure 2. Flowchart of the insertion/deletion SSR (IDSSR) pipeline.
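The core idea stated in the abstract, intersecting SSR loci with INDEL calls to flag likely polymorphic SSRs, can be illustrated with a minimal Python sketch. It does not reproduce the IDSSR implementation. The minimum repeat counts in `MIN_REPEATS`, the restriction to perfect repeats with 2–6 bp motifs, the function names, and the rule that an INDEL position must fall inside the SSR span are all assumptions made for illustration.

```python
import re

# Assumed minimum repeat counts per motif length (illustrative, not IDSSR's settings).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq):
    """Return sorted (start, end, motif, repeats) tuples for perfect SSRs, 0-based half-open."""
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # One motif of `size` bases followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit, e.g. "ATAT" or "AAA".
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            repeats = (match.end() - match.start()) // size
            hits.append((match.start(), match.end(), motif, repeats))
    return sorted(hits)


def polymorphic_candidates(ssrs, indel_positions):
    """Keep SSRs whose span contains at least one INDEL position (assumed criterion)."""
    return [ssr for ssr in ssrs
            if any(ssr[0] <= pos < ssr[1] for pos in indel_positions)]


# Toy reference sequence with an (AT)8 and an (AAC)5 repeat.
reference = "GGC" + "AT" * 8 + "CCGT" + "AAC" * 5 + "TTG"
ssrs = find_ssrs(reference)
candidates = polymorphic_candidates(ssrs, indel_positions=[10, 40])
print(ssrs)
print(candidates)
```

In a real workflow the INDEL positions would come from variant calls made by aligning the paired-end reads back to the genome assembly, and primer design would follow for the retained loci.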