Renee S Arias, Linda L Ballard, Brian E Scheffler.
Abstract
We introduce the concept of Unique Pattern Informative Combinations (UPIC), a decision tool for the cost-effective design of DNA fingerprinting/genotyping experiments that use simple sequence repeat/short tandem repeat (SSR/STR) markers. After a first screening of SSR markers on a subset of DNA samples, the user can apply UPIC to find marker combinations that maximize the genetic information obtained with a minimum or desired number of markers, which allows cost-effective planning of future experiments. We have developed Perl scripts that calculate all possible subset combinations of SSR markers and determine, based on unique patterns or alleles, which combinations can discriminate among all DNA samples included in a test. This makes UPIC an essential tool for optimizing resources when working with microsatellites. An example using real data from eight markers and 12 genotypes shows that UPIC detected groups of as few as three markers sufficient to discriminate all 12 DNA samples. If markers for future experiments are chosen based only on polymorphism information content (PIC), the number of markers needed to discriminate all samples cannot be determined. We also show that, when markers are chosen using UPIC, an informative combination of four markers can provide similar information to a combination of six markers (23 vs. 25 patterns, respectively), allowing more efficient planning of experiments. Perl scripts with documentation are also included to calculate the percentage of heterozygous loci in the DNA samples tested and to calculate three PIC values depending on the type of fertilization and allele frequency of the organism. AVAILABILITY: Perl scripts are freely available for download from http://www.ars.usda.gov/msa/jwdsrc/gbru.
Keywords: GeneMapper; best SSR markers; microsatellites; simple sequence repeats; software
Year: 2009 PMID: 19707300 PMCID: PMC2720665 DOI: 10.6026/97320630003352
Source DB: PubMed Journal: Bioinformation ISSN: 0973-2063
Figure 1. Graphic representation of UPIC values for the 8 markers and 12 DNA samples in our example. •: unique patterns (UP; y-axis, left) that allow discrimination of the 12 DNA samples tested, corresponding to informative combinations (IC) of a variable number of polymorphic markers (x-axis). ○: optimum UPIC values for different numbers of markers in the combination. A: minimum number of markers (3) in an IC that can discriminate the 12 DNA samples; these 3 markers can detect up to 18 unique patterns (UP) or alleles. B and C: IC of 4 and 6 markers, respectively, both providing a similar amount of information in UP values. D: maximum number of UP (34) detectable by all 8 markers. Numbers on top of the histogram are the actual numbers of IC for the K markers used in the combinations; e.g., for combinations of 5 markers, there are 25 IC out of 70 possible combinations.
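The search summarized in the abstract and in Figure 1 can be illustrated with a minimal Python sketch. It is not the authors' Perl implementation. It assumes that a combination is "informative" when the combined marker profiles differ for every sample, and that the UP value of a combination is the sum of the distinct patterns seen at each of its markers. The function names, the sample and marker labels, and the toy genotypes are all hypothetical.

```python
from itertools import combinations

def unique_patterns(genotypes, marker):
    """Count the distinct patterns one marker shows across all samples."""
    return len({profile[marker] for profile in genotypes.values()})

def informative_combinations(genotypes, markers, k):
    """Return (combination, UP total) for every k-marker combination
    whose combined profile is different for every sample (assumed
    definition of an informative combination)."""
    results = []
    for combo in combinations(markers, k):
        profiles = {tuple(profile[m] for m in combo)
                    for profile in genotypes.values()}
        if len(profiles) == len(genotypes):  # every sample is separated
            up_total = sum(unique_patterns(genotypes, m) for m in combo)
            results.append((combo, up_total))
    return results

# Hypothetical toy data: allele sizes (in bp) for 4 samples at 3 markers.
genotypes = {
    "S1": {"M1": (100, 100), "M2": (200, 204), "M3": (150, 150)},
    "S2": {"M1": (100, 102), "M2": (200, 204), "M3": (150, 150)},
    "S3": {"M1": (100, 100), "M2": (204, 204), "M3": (150, 152)},
    "S4": {"M1": (100, 102), "M2": (204, 204), "M3": (152, 152)},
}
markers = ["M1", "M2", "M3"]

for k in range(1, len(markers) + 1):
    for combo, up_total in informative_combinations(genotypes, markers, k):
        print(k, combo, up_total)
```

In this toy data no single marker separates the four samples, two of the three marker pairs do, and the pair with the higher UP total would be the preferred choice under the assumed definition.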
Figure 2. Flow diagram of Perl scripts for calculation of three PIC values, percentage of heterozygous loci and UPIC values.
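The other two quantities named in the Figure 2 caption can be sketched in Python as well. This record does not say which three PIC formulas the scripts use, so the example shows only the widely used formula of Botstein et al. (1980), PIC = 1 - Σ p_i² - Σ_{i<j} 2 p_i² p_j², as an assumption. The percentage of heterozygous loci is assumed to be computed per sample over diploid calls. All function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def allele_frequencies(calls):
    """Allele frequencies at one marker from diploid calls such as (100, 102)."""
    counts = Counter(allele for call in calls for allele in call)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def pic_botstein(freqs):
    """PIC by the Botstein et al. (1980) formula (assumed here; the
    paper's three PIC variants are not specified in this record)."""
    p = list(freqs.values())
    homozygosity = sum(x * x for x in p)
    cross_term = sum(2 * (a * a) * (b * b) for a, b in combinations(p, 2))
    return 1 - homozygosity - cross_term

def percent_heterozygous(profile):
    """Percentage of loci in one sample whose two alleles differ."""
    het = sum(1 for a, b in profile.values() if a != b)
    return 100 * het / len(profile)

# Hypothetical calls at one marker for three samples.
freqs = allele_frequencies([(100, 102), (100, 102), (100, 102)])
print(pic_botstein(freqs))
print(percent_heterozygous({"M1": (100, 102), "M2": (200, 200)}))
```

For two equally frequent alleles this formula gives a PIC of 0.375, which is its known maximum for a biallelic marker.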