Gaurav Sablok, G V Padma Raju, Suresh B Mudunuri, Ratna Prabha, Dhananjaya P Singh, Vesselin Baev, Galina Yahubyan, Peter J Ralph, Nicola La Porta.
Abstract
Organelle genomes evolve rapidly compared with nuclear genomes and have been widely used to develop microsatellite, or simple sequence repeat (SSR), markers for delineating phylogenomics. In our previous reports, we established the largest repository of organelle SSRs, ChloroMitoSSRDB, which provides access to 2161 organelle genomes (1982 mitochondrial and 179 chloroplast genomes) with a total of 5838 perfect chloroplast SSRs, 37 297 imperfect chloroplast SSRs, 5898 perfect mitochondrial SSRs and 50 355 imperfect mitochondrial SSRs. In the present work, we have updated ChloroMitoSSRDB by systematically analyzing and adding a further 191 chloroplast and 2102 mitochondrial genomes. With this update, ChloroMitoSSRDB 2.00 provides access to a total of 4454 organelle genomes with 40 653 IMEx perfect SSRs (11 802 chloroplast and 28 851 mitochondrial), 275 981 IMEx imperfect SSRs (78 972 chloroplast and 197 009 mitochondrial), 35 250 MISA (MIcroSAtellite identification tool) perfect SSRs and 3211 MISA compound SSRs, together with associated information such as the location of each repeat (coding or non-coding), repeat size, motif, length polymorphism and primer pairs. Additionally, we have integrated several in silico SSR mining tools into a unified web portal for repeat mining in assembled organelle genomes and in next-generation sequencing reads. ChloroMitoSSRDB 2.00 allows the end user to perform multiple SSR searches, browse SSRs identified by two repeat-detection algorithms, and obtain primer pair information for the identified SSRs for evolutionary genomics.

Database URL: http://www.mcr.org.in/chloromitossrdb
Year: 2015 PMID: 26412851 PMCID: PMC4584093 DOI: 10.1093/database/bav084
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. Updated, enhanced illustrated view of the flow of information in ChloroMitoSSRDB 2.00.
Structure of table ‘chloromitometa’ that stores the meta-information of all the mitochondrial and chloroplast genomes
| Information | Field | Data type | Key | Example |
|---|---|---|---|---|
| Accession number | acc_no | int(11) | | 5881414, 110189662 |
| Sequence ID | seq_id | varchar( | PRI | NC_000834, AC_000022 |
| Sequence name | seq_name | varchar(500) | | |
| Sequence type | seq_type | varchar(50) | | Complete genome, complete sequence |
| Sequence length | seq_length | int(11) | | 16 613 bp, 7686 bp |
| Nucleotide composition of A | a_per | Float | | 33.06% |
| Nucleotide composition of T | t_per | Float | | 41.87% |
| Nucleotide composition of G | g_per | Float | | 13.58% |
| Nucleotide composition of C | c_per | Float | | 11.49% |
| Organelle type | Organelle | char(1) | | M (mitochondrion), C (chloroplast) |
| Taxon ID | Taxon | int(11) | | 85636, 6334 |
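
As a rough illustration of how a row of this table could be populated, the sketch below recreates `chloromitometa` in an in-memory SQLite database and derives `seq_length` and the four composition fields from a sequence. It is not the database's own code: the SQLite types only approximate the listed MySQL types, and the helper names (`nucleotide_composition`, `build_meta_table`, `insert_genome`) and the toy record are assumptions made for this example.

```python
import sqlite3

def nucleotide_composition(seq):
    """Percentage of A, T, G and C over the full sequence length."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return {base: 100.0 * seq.count(base) / len(seq) for base in "ATGC"}

def build_meta_table(conn):
    # Field names follow the table above; SQLite types stand in for MySQL ones.
    conn.execute(
        """
        CREATE TABLE chloromitometa (
            acc_no     INTEGER,
            seq_id     TEXT PRIMARY KEY,
            seq_name   TEXT,
            seq_type   TEXT,
            seq_length INTEGER,
            a_per      REAL,
            t_per      REAL,
            g_per      REAL,
            c_per      REAL,
            Organelle  TEXT,
            Taxon      INTEGER
        )
        """
    )

def insert_genome(conn, acc_no, seq_id, seq_name, seq_type, organelle, taxon, seq):
    """Hypothetical helper: store one genome with derived length and composition."""
    comp = nucleotide_composition(seq)
    conn.execute(
        "INSERT INTO chloromitometa VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
        (acc_no, seq_id, seq_name, seq_type, len(seq),
         comp["A"], comp["T"], comp["G"], comp["C"], organelle, taxon),
    )

conn = sqlite3.connect(":memory:")
build_meta_table(conn)
# Toy record, not a real accession.
insert_genome(conn, 1, "TOY_000001", "toy mitochondrion", "complete genome",
              "M", 0, "AATTTTGC")
```

A query such as `SELECT seq_id, seq_length FROM chloromitometa WHERE Organelle = 'M'` would then list only the mitochondrial entries.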
Structure of the tables ‘chloromitoperfectmicrosatellite’ and ‘chloromitoimperfectmicrosatellite’, which store the repeat information detected by IMEx for all perfect and imperfect microsatellites of mitochondrial and chloroplast genomes
| Information | Field | Data type | Key | Example |
|---|---|---|---|---|
| Sequence ID | index_no | varchar(11) | PRI | NC_000834, AC_000022 |
| Starting co-ordinate of SSR | Start | int(11) | PRI | 172, 12843 |
| Ending co-ordinate of SSR | End | int(11) | PRI | 182, 12885 |
| Motif (repeating unit) | Motif | varchar(10) | | AT, G, CAAC |
| Number of repetitions | Iterations | int(5) | | 3, 7 |
| Length of repeat tract | tract_length | int(11) | | 12 bp, 18 bp |
| Nucleotide composition of A | a_per | Float | | 50.00% |
| Nucleotide composition of T | t_per | Float | | 0.00% |
| Nucleotide composition of G | g_per | Float | | 33.33% |
| Nucleotide composition of C | c_per | Float | | 16.67% |
| Repeat position Info | coding_info | varchar(50) | | Coding (if repeat in coding region) or NULL (if outside) |
| Protein ID (if repeat in coding region) | protein_id | int(11) | | 110189664 (if repeat in coding region) or 0 (if non-coding) |
| Imperfection % of the tract | Imperfection | Float | | 9%, 0% |
| Alignment Line 1 | Alignment_line1 | Text | | TTAA-TAATTAA |
| Alignment Line 2 | Alignment_line2 | Text | | **** ******* |
| Alignment Line 3 | Alignment_line3 | Text | | TTAATTAATTAA |
The last four columns (imperfection, alignment_line1, alignment_line2 and alignment_line3) are present only in the table storing imperfect microsatellites (chloromitoimperfectmicrosatellite).
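
The three alignment fields can be read as a small pairwise alignment: line 1 is the observed tract, line 3 the perfect repeat, and line 2 marks matching columns. The sketch below rebuilds line 2 from the other two and counts mismatching columns. The function names are hypothetical, and the percentage shown is a simple mismatches-per-column ratio chosen for illustration; IMEx may define its imperfection percentage differently.

```python
def match_line(line1, line3):
    """Return '*' for each aligned column where both lines agree, ' ' otherwise."""
    if len(line1) != len(line3):
        raise ValueError("alignment lines must have the same length")
    return "".join("*" if a == b else " " for a, b in zip(line1, line3))

def mismatch_percent(line1, line3):
    """Illustrative measure: share of aligned columns that do not match."""
    marks = match_line(line1, line3)
    if not marks:
        raise ValueError("empty alignment")
    return 100.0 * marks.count(" ") / len(marks)

# Example values taken from the table above.
observed = "TTAA-TAATTAA"
perfect = "TTAATTAATTAA"
print(repr(match_line(observed, perfect)))
```

For the example row, the reconstructed middle line is identical to the stored `Alignment_line2` value, with a single unmatched column at the gap.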
Structure of the table ‘misa_ssr_info’ that stores the repeat information detected by MISA of all perfect and compound microsatellites of mitochondrial and chloroplast genomes
| Information | Field | Data type | Key | Example |
|---|---|---|---|---|
| Accession number | acc_no | int(11) | | 5881414, 110189662 |
| Sequence ID | index_no | varchar(11) | PRI | NC_000834, AC_000022 |
| Motif with iteration count | SSR | varchar(255) | | (AT)4 |
| Type of repeat | SSR_type | varchar(5) | | p1 (mono), p2 (di), p3 (tri), etc.; c and c* (compound) |
| Size | int(4) | int(4) | | 31, 20 |
| Starting co-ordinate of SSR | SSR_start | int(7) | PRI | 172, 12843 |
| Ending co-ordinate of SSR | SSR_end | int(7) | PRI | 182, 12885 |
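
To make the `SSR` and `SSR_type` fields concrete, the following sketch parses a perfect-repeat string such as `(AT)4` into its motif and iteration count and derives a type label from the motif length (p1 for mono, p2 for di, p3 for tri, and so on, as listed above). The function name and the returned dictionary keys are assumptions for this example, and compound entries (types c and c*) are deliberately rejected rather than parsed.

```python
import re

# One parenthesised motif followed by an iteration count, e.g. "(AT)4".
SSR_PATTERN = re.compile(r"^\(([ACGT]+)\)(\d+)$")

def parse_perfect_ssr(ssr):
    """Split a perfect-SSR string into motif, iterations, type label and size."""
    match = SSR_PATTERN.match(ssr.strip().upper())
    if match is None:
        raise ValueError(f"not a single perfect SSR: {ssr!r}")
    motif = match.group(1)
    iterations = int(match.group(2))
    return {
        "motif": motif,
        "iterations": iterations,
        "ssr_type": f"p{len(motif)}",      # p1 = mono, p2 = di, p3 = tri, ...
        "size": len(motif) * iterations,   # repeat tract length in bp
    }

print(parse_perfect_ssr("(AT)4"))
```

Here the size is computed as motif length times iteration count, which is an assumption about how the stored size relates to the `SSR` string.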
Figure 2. Entity-relationship model diagram showing the layout of the database schema in ChloroMitoSSRDB 2.00.
Figure 3. Webpage of ChloroMitoSSRDB 2.00 showing the repeat summary of the Acidosasa purpurea chloroplast extracted by IMEx. (A) Details of chloroplast microsatellites. (B) Repeat summary of the Acidosasa purpurea chloroplast repeats extracted by IMEx and nucleotide composition of the Acidosasa purpurea chloroplast. (C) Summary of perfect and imperfect repeats in the Acidosasa purpurea chloroplast along with their graphical distribution. (D) Mono-nucleotide perfect repeats of the Acidosasa purpurea chloroplast, where coding repeats in the Protein ID column are linked to NCBI.
Figure 4. Repeat summary of the Acidosasa purpurea chloroplast repeats extracted by MISA. (A) Details of chloroplast microsatellites. (B) Repeat summary of the Acidosasa purpurea chloroplast repeats extracted by MISA and nucleotide composition of the Acidosasa purpurea chloroplast. (C) Summary of MISA perfect and compound SSRs in the Acidosasa purpurea chloroplast in tabular and graphical form. (D) Detailed information about perfect and compound SSRs in the Acidosasa purpurea chloroplast. (E) Primer list and associated information available for any particular SSR.
Structure of the table ‘misa_ssr_primer’ that stores the primer information of microsatellites of mitochondrial and chloroplast genomes detected by MISA
| Information | Field | Data Type | Key | Example |
|---|---|---|---|---|
| Accession number | acc_no | int(11) | PRI | 5881414, 110189662 |
| Motif with iteration count | SSR | varchar(255) | | (AT)4 |
| Type of repeat | SSR_type | varchar(5) | | p1 (mono), p2 (di), p3 (tri), etc.; c and c* (compound) |
| Size | int(4) | int(4) | | 31, 20 |
| Starting co-ordinate of SSR | SSR_start | int(7) | PRI | 172, 12843 |
| Ending co-ordinate of SSR | SSR_end | int(7) | PRI | 182, 12885 |
| Forward primer 1 | FORWARD_PRIMER_1 | varchar(30) | | AAAAAGGCCCCTTCCCCC |
| Melting temperature for forward primer 1 | Tm_F_1 | varchar(6) | | 59.463 |
| Size of forward primer 1 | size_F_1 | int(6) | | 18 |
| Reverse primer 1 | REVERSE_PRIMER_1 | varchar(30) | | GCGCCTAAGGATCCTGTGAG |
| Melting temperature for reverse primer 1 | Tm_R_1 | varchar(6) | | 60.25 |
| Size of reverse primer 1 | size_R_1 | int(6) | | 20 |
| Product size (in bp) | PRODUCT_size_bp_1 | | | 220 |
| Starting co-ordinate of primer 1 | start_bp_1 | | | 6256 |
| Ending co-ordinate of primer 1 | end_bp_1 | | | 6475 |
The last nine columns of the table (forward primer, reverse primer, product size and coordinates for primer pair 1) are repeated for primer pairs 2 and 3.
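
The example values in the primer columns are internally consistent, and a record can be sanity-checked along the same lines. The sketch below assumes that `start_bp_1` and `end_bp_1` are inclusive coordinates spanning the amplified product, an interpretation that fits the example values (6475 - 6256 + 1 = 220) but is not stated in the table. The function name and the dictionary layout are hypothetical.

```python
def check_primer_record(rec):
    """Return a list of inconsistencies found in one primer-pair record."""
    problems = []
    if len(rec["FORWARD_PRIMER_1"]) != rec["size_F_1"]:
        problems.append("size_F_1 does not match forward primer length")
    if len(rec["REVERSE_PRIMER_1"]) != rec["size_R_1"]:
        problems.append("size_R_1 does not match reverse primer length")
    # Assumption: coordinates are inclusive and delimit the PCR product.
    expected_product = rec["end_bp_1"] - rec["start_bp_1"] + 1
    if expected_product != rec["PRODUCT_size_bp_1"]:
        problems.append("PRODUCT_size_bp_1 does not match coordinates")
    return problems

# Example values taken from the table above.
example = {
    "FORWARD_PRIMER_1": "AAAAAGGCCCCTTCCCCC",
    "size_F_1": 18,
    "REVERSE_PRIMER_1": "GCGCCTAAGGATCCTGTGAG",
    "size_R_1": 20,
    "PRODUCT_size_bp_1": 220,
    "start_bp_1": 6256,
    "end_bp_1": 6475,
}
print(check_primer_record(example))
```

The same check could be applied to primer pairs 2 and 3 by changing the numeric suffix of the keys, assuming those columns follow the same naming pattern.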
Figure 5. Advanced search and SSR extraction options in ChloroMitoSSRDB. (A) Advanced search page. (B) Page for extracting SSRs from NGS reads. (C) Page for extracting SSRs from a user-provided FASTA sequence.
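
To give a feel for what extracting perfect SSRs from a user-provided FASTA sequence involves, here is a minimal scanning sketch. It is neither the IMEx nor the MISA algorithm, and the minimum repeat counts in `MIN_REPEATS` are illustrative thresholds chosen for this example, not the settings used by ChloroMitoSSRDB. All function names are hypothetical.

```python
# Illustrative thresholds: motif length -> minimum number of iterations.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}

def read_fasta(text):
    """Parse FASTA text into {record name: upper-case sequence}."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}

def is_primitive(motif):
    """False if the motif is itself a repeat of a shorter unit (e.g. 'AA', 'ATAT')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))

def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, iterations) tuples, 1-based inclusive."""
    hits = []
    for unit, min_rep in sorted(min_repeats.items()):
        i = 0
        while i + unit <= len(seq):
            motif = seq[i:i + unit]
            if set(motif) - set("ACGT") or not is_primitive(motif):
                i += 1
                continue
            n = 1  # count consecutive copies of the motif starting at i
            while seq[i + n * unit:i + (n + 1) * unit] == motif:
                n += 1
            if n >= min_rep:
                hits.append((i + 1, i + n * unit, motif, n))
                i += n * unit
            else:
                i += 1
    return sorted(hits)

toy_fasta = ">toy\nGGC" + "A" * 10 + "CC" + "AT" * 6 + "G\n"
for name, seq in read_fasta(toy_fasta).items():
    print(name, find_perfect_ssrs(seq))
```

Each reported tuple maps naturally onto the start, end, motif and iteration fields of the schema tables above; imperfect and compound repeats would need additional logic that this sketch does not attempt.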