Akshay Kumar Avvaru, Saketh Saxena, Divya Tej Sowpati, Rakesh Kumar Mishra.
Abstract
Microsatellites, also known as Simple Sequence Repeats (SSRs), are short tandem repeats of 1-6 nt motifs present in all genomes, particularly eukaryotes. Besides their usefulness as genome markers, SSRs have been shown to perform important regulatory functions, and variations in their length at coding regions are linked to several disorders in humans. Microsatellites show a taxon-specific enrichment in eukaryotic genomes, and some may be functional. MSDB (Microsatellite Database) is a collection of >650 million SSRs from 6,893 species including Bacteria, Archaea, Fungi, Plants, and Animals. This database is by far the most exhaustive resource to access and analyze SSR data of multiple species. In addition to exploring data in a customizable tabular format, users can view and compare the data of multiple species simultaneously using our interactive plotting system. MSDB is developed using the Django framework and MySQL. It is freely available at http://tdb.ccmb.res.in/msdb.
Keywords: Django; JavaScript; database; genomics; microsatellites; simple sequence repeats
Year: 2017 PMID: 28854643 PMCID: PMC5533116 DOI: 10.1093/gbe/evx132
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Comparison of MSDB with Other SSR Databases: (A) Number of Species and (B) Database Features.
(A) Number of Species

| Kingdom/Group | Micro Organism Tandem Repeats Database | UgMicroSatDb | Kazusa Marker Database | Plant Microsatellite DNAs Database | Tandem Repeats Database | FishMicroSat | Polymorphic Simple Sequence Repeats Database | MICAS | EuMicroSatDb | MSDB |
|---|---|---|---|---|---|---|---|---|---|---|
| Bacteria | 1,109 | 0 | 0 | 0 | 1 | 0 | 85 | 4,772 | 0 | 5,732 |
| Archaea | 91 | 0 | 0 | 0 | 0 | 0 | 0 | 217 | 0 | 514 |
| Plants | 0 | 80 | 14 | 110 | 2 | 0 | 0 | 0 | 31 | 74 |
| Fungi | 0 | 80 | 0 | 0 | 1 | 0 | 0 | 0 | 31 | 191 |
| Protozoa | 0 | 80 | 0 | 0 | 0 | 0 | 0 | 0 | 31 | 72 |
| Invertebrates | 0 | 80 | 0 | 0 | 9 | 0 | 0 | 0 | 31 | 112 |
| Vertebrates | 0 | 80 | 0 | 0 | 9 | 190 | 0 | 0 | 31 | 198 |
| Viruses | 1,463 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Does not support dynamic filtering of the results. The filtering parameters should be selected initially.
Comparison only across different strains of same species.
Grouping only based on the kingdoms.
Only a tabular view of the data without dynamic filters.
Filtering only based on the type of repeat.
Only pie charts available.
Examples of Repeat Motif Classification Shown for a Normal Motif (ACT), a Palindrome (ACGT), and Cyclical Variation of a Palindrome (AATTCG, Variation of GAATTC)
| Repeat Class | Cyclical Variations (“+” Strand) | Reverse Complement (“−” Strand) | Number of Motifs in Class |
|---|---|---|---|
| ACT | ACT, CTA, TAC | AGT, GTA, TAG | 6 |
| ACGT | ACGT, CGTA, GTAC, TACG | ACGT, CGTA, GTAC, TACG | 4 |
| AATTCG | AATTCG, ATTCGA, TTCGAA, TCGAAT, CGAATT, GAATTC | CGAATT, GAATTC, AATTCG, ATTCGA, TTCGAA, TCGAAT | 6 |
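The grouping in the table above can be reproduced with a short sketch. It collects the cyclical variations of a motif on the "+" strand together with those of its reverse complement on the "−" strand, and treats the resulting set as one repeat class. The function names are hypothetical, and picking the alphabetically smallest member as the class label is an assumption that happens to match the three rows shown; it is not taken from the MSDB code.

```python
# Minimal sketch of repeat-class grouping; not the MSDB implementation.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Return the reverse complement of a DNA motif (uppercase A/C/G/T only)."""
    return motif.translate(COMPLEMENT)[::-1]


def cyclic_variations(motif):
    """Return every rotation of the motif, starting with the motif itself."""
    return [motif[i:] + motif[:i] for i in range(len(motif))]


def repeat_class(motif):
    """Return (label, members) for the class that contains `motif`.

    Members are the rotations of the motif and of its reverse complement.
    The label is the alphabetically smallest member (an assumed convention).
    """
    plus = cyclic_variations(motif)
    minus = cyclic_variations(reverse_complement(motif))
    members = sorted(set(plus) | set(minus))
    return members[0], members


if __name__ == "__main__":
    for motif in ("ACT", "ACGT", "GAATTC"):
        label, members = repeat_class(motif)
        print(label, len(members), ", ".join(members))
```

For palindromic motifs such as ACGT, and for their cyclical variations such as AATTCG, the two strands yield the same rotations, so the set union leaves 4 and 6 members respectively rather than double that number.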
Fig. 1.—An example of an AGAT repeat illustrating the details that were recorded by the custom repeat identification Python script.
Fig. 2.—View page of MSDB showing various interactive plots from the repeat data of Homo sapiens. (A) Bar plot of the 10 most frequent repeats. (B) Pie chart showing distribution of repeat classes grouped by motif length (mono-, di-, tri-, tetra-, penta-, and hexamers). (C and D) Relation between the frequency and length of AT, AGC, ACAT, and AGAT repeats depicted as line and stacked-bar charts, respectively.