Fuli Liu, Zimin Hu, Wenhui Liu, Jingjing Li, Wenjun Wang, Zhourui Liang, Feijiu Wang, Xiutao Sun.
Abstract
Mining transcriptome data for microsatellites and developing markers from them has become increasingly common. However, the possible functions of these microsatellites are rarely characterized. In this study, we explored microsatellites in the transcriptome of the brown alga Sargassum thunbergii, characterized their frequency, distribution, function and evolution, and developed primers to validate them. Our results showed that tri-nucleotide repeats were the most abundant, followed by di- and mono-nucleotide repeats. Microsatellite length was significantly affected by repeat motif size. The density of microsatellites in the CDS region was significantly lower than that in the UTR region. Annotation of the microsatellite-containing transcripts showed that 573 transcripts had GO terms and could be categorized into 42 groups. Pathway enrichment analysis showed that microsatellites were significantly overrepresented in genes involved in pathways such as ubiquitin-mediated proteolysis, RNA degradation and the spliceosome. Primers flanking 961 microsatellite loci were designed, and of the 30 primer pairs randomly selected for testing, 23 proved to be effective. These findings provide new insight into the function and evolution of microsatellites in the transcriptome, and the microsatellite loci identified within annotated genes will be useful for developing functional markers in S. thunbergii.
Year: 2016 PMID: 26732855 PMCID: PMC4702172 DOI: 10.1038/srep18947
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
General information for microsatellite analysis.
| Items | Number |
|---|---|
| Total number of sequences examined | 36119 |
| Total size of examined sequences (bp) | 43180505 |
| Total number of identified SSRs | 2915 |
| Number of sequences containing SSRs | 2528 |
| Number of sequences containing more than one SSR | 322 |
| Number of SSRs present in compound formation | 93 |
| Number of SSRs per kbp | 0.068 |
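
The record does not say which tool or thresholds were used to identify the SSRs counted above. As a rough illustration only, perfect microsatellites can be located with a back-referencing regular expression. In the sketch below, the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are hypothetical choices, not values taken from the study.

```python
import re

# Hypothetical minimum repeat counts per motif length (NOT from the source).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for size, min_rep in sorted(min_repeats.items()):
        # One motif of `size` bases followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ACAC"),
            # so each repeat is reported once under its shortest motif.
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // size))
    return sorted(hits)
```

For example, `find_ssrs("GGT" + "AC" * 11 + "TTG")` reports a single (AC)11 repeat starting at index 3. Compound and imperfect repeats are not handled by this sketch.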
Figure 1. Number distribution of the different microsatellite motif types in the Sargassum thunbergii transcriptome.
Distribution and characteristics of microsatellites in different transcript regions.
| Region | Total number of base pairs (bp) | Number of SSRs | Number of SSRs per kbp | Mean SSR length (bp) |
|---|---|---|---|---|
| Coding | 761730 | 629 | 0.83 | 16.35 |
| UTR | 1753943 | 1571 | 0.90 | 16.49 |
| 5′UTR | 892254 | 832 | 0.93 | 16.53 |
| 3′UTR | 861689 | 739 | 0.86 | 16.44 |
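
The per-kbp densities in this table follow directly from the SSR counts and region sizes. The short sketch below recomputes them from the tabulated values; the helper name `ssr_per_kbp` is illustrative, not from the source.

```python
def ssr_per_kbp(n_ssr, total_bp):
    """SSR density expressed as the number of SSRs per 1000 bp."""
    return n_ssr / (total_bp / 1000)


# (total bp, number of SSRs) copied from the table above.
regions = {
    "Coding": (761730, 629),
    "UTR": (1753943, 1571),
    "5'UTR": (892254, 832),
    "3'UTR": (861689, 739),
}

# Round to two decimals, as in the table.
density = {name: round(ssr_per_kbp(n, bp), 2) for name, (bp, n) in regions.items()}

for name, value in density.items():
    print(f"{name}: {value:.2f} SSRs per kbp")
```

The 5′UTR and 3′UTR rows sum to the UTR row for both base pairs and SSR counts, so the UTR density is the length-weighted combination of the two.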
Figure 2. Classification of genes containing microsatellite loci based on Gene Ontology (GO) annotation.
Figure 3. The top 20 enriched pathways involving genes containing microsatellite loci.
Development of microsatellite markers and their application in a test population.
| Loci | Primer sequences (5′-3′) | Repeat motif | Ta (°C) | Size range (bp) | NA | PIC | HO | HE |
|---|---|---|---|---|---|---|---|---|
| SW1 | F: AACGGAAGCGCAATACGAC | (AC)11 | 60 | 411–421 | 5 | 0.653 | 0.633 | 0.720 |
| | R: GCAGACACGGTTGACGAAG | | | | | | | |
| SW6 | F: CAAAGTTGCTGCGTGATTCG | (CT)11 | 60 | 160–164 | 2 | 0.375 | 0.433 | 0.508 |
| | R: CACGATGTGTCGCCTTCTG | | | | | | | |
| SW9 | F: AAAGTTGCTGAGCCGTTCG | (ACC)8 | 60 | 435–456 | 4 | 0.694 | 0.433 | 0.754 |
| | R: CAGGAGGACCATCGATCCC | | | | | | | |
| SW10 | F: TGGCTGTGTGGATACGACC | (ACT)8 | 60 | 332–353 | 5 | 0.651 | 0.566 | 0.714 |
| | R: TGTCGCAATGCTCGTTGTAG | | | | | | | |
| SW16 | F: CCCAAATCAGCGAAAGGCG | (GTT)21 | 61 | 402–411 | 3 | 0.342 | 0.333 | 0.422 |
| | R: CGGTGCTACGATACTGCCC | | | | | | | |
| SW17 | F: GCCTTCGTTACGCTTGACC | (ATTT)6 | 60 | 364–372 | 5 | 0.420 | 0.100 | 0.473 |
| | R: TACCACCTGAGCAATCCCG | | | | | | | |
| SW18 | F: ACCCGACGAGCTCTACAAG | (CAGT)6 | 60 | 428–452 | 5 | 0.427 | 0.333 | 0.465 |
| | R: TGAGTGGGTTGAAGACGGG | | | | | | | |
| SW21 | F: ATGCCAGGAGCTACACAGG | (AAACC)7 | 60 | 383–400 | 4 | 0.418 | 0.600 | 0.532 |
| | R: AGATGGCTCAACCTCTGCC | | | | | | | |
| SW24 | F: TTGCCCGGGTATCCTGTTC | (AAACAT)4 | 60 | 310–316 | 2 | 0.339 | 0.300 | 0.440 |
| | R: TTTCGCGTTGAGCACTTCG | | | | | | | |
| SW35 | F: GCTATGTCAACAACCACCTCT | (ATC)4…(ATC)4 | 59 | 332–348 | 4 | 0.411 | 0.500 | 0.521 |
| | R: TTCTGATTCGAGGTATTGTGC | | | | | | | |
Ta: annealing temperature; NA: observed number of alleles; PIC: polymorphism information content; HO: mean observed heterozygosity; HE: mean expected heterozygosity.
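
The record does not state how HE and PIC were estimated. As a hedged illustration, the sketch below uses the standard textbook definitions, HE = 1 − Σ pᵢ² and PIC = 1 − Σ pᵢ² − Σ_{i<j} 2 pᵢ² pⱼ², computed from allele frequencies. The function names and the example frequencies are hypothetical, and the authors may have used a sample-size-corrected estimator instead.

```python
from itertools import combinations


def expected_heterozygosity(freqs):
    """Expected heterozygosity: 1 - sum(p_i^2). No sample-size correction."""
    return 1.0 - sum(p * p for p in freqs)


def pic(freqs):
    """Polymorphism information content:
    1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2.
    """
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


# Hypothetical allele frequencies for a two-allele locus (not measured data).
example_freqs = [0.5, 0.5]
print(expected_heterozygosity(example_freqs))  # 0.5
print(pic(example_freqs))                      # 0.375
```

PIC is never larger than the uncorrected HE for the same frequencies, because the cross term subtracted from it is non-negative.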