Jeff W Bizzaro, Kenneth A Marx.
Abstract
BACKGROUND: Simple sequence repeats (SSRs), also called microsatellites or polymeric sequences, are common in DNA and are biologically important. Ranging from mononucleotide to trinucleotide repeats and beyond, they can occur in long (> 6 repeating units) tracts and may be characterized by quantifying the frequencies with which they occur and their tract lengths. However, most existing computer programs that find SSR tracts do not include these methods.
Year: 2003 PMID: 12791171 PMCID: PMC165442 DOI: 10.1186/1471-2105-4-22
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Frequency plots (one per panel) for equivalent organism sequence data across a range of GC compositions: Dictyostelium discoideum (0.97 Mb at 26.% GC), Oryza sativa japonica (1.5 Mb at 45.% GC) and Chlamydomonas reinhardtii (0.35 Mb at 62.% GC), including a sequence produced by a random number generator with equal weights between bases (1.0 Mb at 50.% GC). The sequences are concatenated from GenBank document "source" sequences and analyzed by Poly as described in the text. These data are presented solely to illustrate the methods described here and not to describe new research.
Figure 2. Operation of the Poly algorithm on DNA sequences. Panel A illustrates the process for a window size of n = 1. Panel B illustrates the process for n = 3. The dotted lines show which bases are compared. White boxes represent bases that are part of an SSR tract, and black boxes represent those that are not.
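The comparison described in the Figure 2 caption can be illustrated with a minimal Python sketch. It is not the Poly source code: it assumes that each base is compared with the base n positions downstream, and that an unbroken run of matches marks an SSR tract. The names `find_tracts` and `tract_length_counts` are hypothetical, and the `min_units=7` default is an assumption taken from the abstract's definition of a long tract (> 6 repeating units).

```python
from collections import Counter


def find_tracts(seq, n, min_units=7):
    """Return (start, length_in_bases) for each tract of period n.

    Assumption: base i is compared with base i + n, and consecutive
    matches are merged into one tract.
    """
    seq = seq.upper()
    tracts = []
    run_start = None  # index where the current run of matches began
    for i in range(len(seq) - n + 1):
        # The final iteration has no partner base, so it only closes a run.
        matched = i + n < len(seq) and seq[i] == seq[i + n]
        if matched:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            # A run of m consecutive matches spans m + n bases.
            length = i - run_start + n
            if length >= min_units * n:
                tracts.append((run_start, length))
            run_start = None
    return tracts


def tract_length_counts(seq, n, min_units=7):
    """Count tracts by length in whole repeating units."""
    return Counter(length // n for _, length in find_tracts(seq, n, min_units))


if __name__ == "__main__":
    mono = "GC" + "A" * 8 + "GC"      # one mononucleotide tract of 8 units
    tri = "TT" + "CAG" * 7 + "TT"     # one trinucleotide tract of 7 units
    print(find_tracts(mono, 1))
    print(find_tracts(tri, 3))
    print(tract_length_counts(tri, 3))
```

Note that this sketch does not separate periods: a long mononucleotide run also satisfies the n = 3 comparison, so a fuller implementation would need a rule for assigning each tract to its shortest repeating unit.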