Pankaj Kumar, Pasumarthy S Chaitanya, Hampapathalu A Nagarajaram.
Abstract
PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are tandem repeats of nucleotide motifs 1-6 bp in size and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes. Such changes underpin the phase variation and antigenic variation seen in some bacteria. Although SSR-mediated phase variation and antigenic variation have been well studied in some bacteria, many other prokaryotic species have yet to be investigated for SSR-mediated adaptive and other evolutionary advantages. As part of our ongoing studies on SSR polymorphism in prokaryotes, we compared the genome sequences of the various strains and isolates available for 85 different species of prokaryotes, extracted a number of SSRs showing length variation, and created a relational database called PSSRdb. The database provides information such as the location of PSSRs in genomes, their length variation across genomes and the regions harboring them. This information should be useful for further research and analysis of SSRs in prokaryotes.
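To make the SSR definition above concrete, the following is a minimal Python sketch that scans a nucleotide string for perfect tandem repeats of motifs 1-6 bp long. It is not the authors' PSSRFinder pipeline (which the record describes as Perl programs parsing BLAST output); the function name `find_ssrs` and the `min_repeats` threshold are assumptions introduced only for illustration.

```python
import re


def find_ssrs(sequence, min_repeats=3, max_motif_len=6):
    """Return sorted (start, end, motif, copies) tuples for perfect tandem repeats.

    Illustrative only: `min_repeats` is an assumed threshold, not a value
    taken from PSSRdb. Coordinates are 0-based, end-exclusive.
    """
    seq = sequence.upper()
    hits = []
    for motif_len in range(1, max_motif_len + 1):
        # A motif of motif_len bases followed by at least (min_repeats - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "GG" is a "G" tract).
            redundant = any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            )
            if redundant:
                continue
            copies = (match.end() - match.start()) // motif_len
            hits.append((match.start(), match.end(), motif, copies))
    return sorted(hits)
```

Running such a scan on the same locus in two strains and comparing the `copies` values is one simple way to picture the length variation that makes an SSR polymorphic.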
Year: 2010 PMID: 21112874 PMCID: PMC3013739 DOI: 10.1093/nar/gkq1198
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Schematic representation of PSSRFinder. C_PSSRF and NC_PSSRF are the two Perl programs that parse coding and non-coding PSSRs, respectively, from the BLAST output.
Structure of the MySQL table used to store coding PSSR information
| Information | Field | Type | Null | Key | Default | Extra |
|---|---|---|---|---|---|---|
| PSSR number | P_n | int(11) | NO | PRI | NULL | auto_increment |
| Strain name | Strn | varchar(90) | YES | | NULL | |
| PSSR | mf | varchar(8) | YES | | NULL | |
| Repeat length | rpt | int(11) | YES | | NULL | |
| Start of repeat | strt_rpt | varchar(20) | YES | | NULL | |
| End of repeat | end_rpt | varchar(20) | YES | | NULL | |
| Mutation point | mut_pnt | varchar(20) | YES | | NULL | |
| Sequence | seq | varchar(50) | YES | | NULL | |
| Strand type | strnd_type | varchar(5) | YES | | NULL | |
| Protein length | prtn_len | bigint(20) | YES | | NULL | |
| Protein ID | prtn_id | varchar(20) | YES | | NULL | |
| ORF | orf_name | varchar(20) | YES | | NULL | |
| Protein function | prtn_func | varchar(150) | YES | | NULL | |
| DNA sequence of length 400 nucleotides | seq_link | varchar(550) | YES | | NULL | |
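As a rough illustration of how a table with these fields could be created and queried, here is a sketch using Python's built-in `sqlite3` module in place of MySQL. The table name `coding_pssr`, the helper function names and the two demo rows are hypothetical placeholders, not PSSRdb data; the column types are SQLite approximations of the MySQL types listed above.

```python
import sqlite3

# Hypothetical table name; columns follow the field list in the table above.
CODING_PSSR_DDL = """
CREATE TABLE coding_pssr (
    P_n        INTEGER PRIMARY KEY AUTOINCREMENT,
    Strn       VARCHAR(90),
    mf         VARCHAR(8),
    rpt        INTEGER,
    strt_rpt   VARCHAR(20),
    end_rpt    VARCHAR(20),
    mut_pnt    VARCHAR(20),
    seq        VARCHAR(50),
    strnd_type VARCHAR(5),
    prtn_len   BIGINT,
    prtn_id    VARCHAR(20),
    orf_name   VARCHAR(20),
    prtn_func  VARCHAR(150),
    seq_link   VARCHAR(550)
)
"""


def build_demo_db():
    """Create an in-memory database holding two placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(CODING_PSSR_DDL)
    # Placeholder values for illustration only; not real PSSRdb records.
    rows = [
        ("strain_A", "G", 8, "orf_demo"),
        ("strain_B", "G", 9, "orf_demo"),
    ]
    conn.executemany(
        "INSERT INTO coding_pssr (Strn, mf, rpt, orf_name) VALUES (?, ?, ?, ?)",
        rows,
    )
    return conn


def repeat_length_range(conn, orf_name):
    """Return (mf, min rpt, max rpt) per motif for one ORF across strains."""
    cur = conn.execute(
        "SELECT mf, MIN(rpt), MAX(rpt) FROM coding_pssr "
        "WHERE orf_name = ? GROUP BY mf",
        (orf_name,),
    )
    return cur.fetchall()
```

For example, `repeat_length_range(build_demo_db(), "orf_demo")` summarises the spread of `rpt` values recorded for the placeholder ORF across the two placeholder strains.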
Structure of the MySQL table used to store non-coding PSSR information
| Information | Field | Type | Null | Key | Default | Extra |
|---|---|---|---|---|---|---|
| PSSR number | P_n | int(11) | NO | PRI | NULL | auto_increment |
| Strain name | Strn | varchar(90) | YES | | NULL | |
| PSSR | mf | varchar(8) | YES | | NULL | |
| Repeat length | rpt | int(11) | YES | | NULL | |
| Start of repeat | s_rpt | varchar(20) | YES | | NULL | |
| End of repeat | e_rpt | varchar(20) | YES | | NULL | |
| Mutation point | mut_pnt | varchar(20) | YES | | NULL | |
| Sequence | seq | varchar(50) | YES | | NULL | |
| Distance from left ORF | L_D | varchar(10) | YES | | NULL | |
| Left strand type | U_S_T | varchar(5) | YES | | NULL | |
| Left protein length | U_P_L | bigint(20) | YES | | NULL | |
| Left protein ID | U_P_I | varchar(20) | YES | | NULL | |
| Left ORF | U_orf | varchar(20) | YES | | NULL | |
| Distance from right ORF | R_D | varchar(10) | YES | | NULL | |
| Right strand type | D_S_T | varchar(5) | YES | | NULL | |
| Right protein length | D_P_L | bigint(20) | YES | | NULL | |
| Right protein ID | D_P_I | varchar(20) | YES | | NULL | |
| Right ORF | D_orf | varchar(20) | YES | | NULL | |
| DNA sequence of 400 nucleotide length | seq_link | varchar(550) | YES | | NULL | |
Figure 2. Overview of PSSRdb shown using screenshots of various pages. (A) Main page listing the species names that can be selected; (B) PSSRs found in the selected species; (C) Table containing details of the selected coding PSSRs found in the selected species; (D) Table containing details of the selected non-coding PSSRs found in the selected species; (E) Sequence alignment of a selected PSSR (in this case a G tract).