Ursula K Frei, Bernd Wollenweber, Thomas Lübberstedt.
Abstract
BACKGROUND: Analysis of allelic variation for relevant genes and monitoring of chromosome segment transmission during selection are important approaches in plant breeding and ecology. Minimizing the number of molecular markers required for this purpose is crucial because of cost and time constraints. To date, software for identifying the minimum number of required markers has been optimized for human genetics and only partly meets the needs of plant scientists and breeders. In addition, several software packages with insufficient interoperability must be combined to extract this information from available allele sequence data, resulting in an error-prone, multi-step data-handling process.
Year: 2009 PMID: 19515225 PMCID: PMC2707369 DOI: 10.1186/1471-2105-10-176
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Flowchart of haplotype differentiation (PolyMin phase I). Starting with the complete matrix of haplotypes × polymorphisms, the algorithm used in PolyMin divides the haplotypes into groups and subgroups until no further subdivision is possible. As long as a subgroup contains polymorphisms with a PIC value different from 0, new subgroups can be formed. The process stops either when no polymorphisms with PIC values different from 0 remain or when all haplotypes have been placed in separate subgroups. The results of the differentiation are recorded (see double-framed boxes). Pm: polymorphism, PIC: polymorphism information content.
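The flowchart logic can be sketched in a few lines of Python. This is a minimal illustration, not PolyMin's actual code: the function names `pic` and `differentiate`, the simplified PIC formula (1 minus the sum of squared allele frequencies), and the rule of always splitting on the polymorphism with the highest PIC are assumptions made here. The caption only states that a subgroup can be split while some polymorphism in it has a PIC value different from 0.

```python
from collections import Counter


def pic(alleles):
    """Simplified PIC (assumption): 1 - sum(p_i^2); 0 when all alleles are identical."""
    n = len(alleles)
    return 1.0 - sum((count / n) ** 2 for count in Counter(alleles).values())


def differentiate(haplotypes, polymorphisms):
    """Split haplotypes into subgroups until no polymorphism can split them further.

    haplotypes: dict mapping haplotype name -> dict mapping polymorphism -> allele.
    Returns (polymorphisms used, in order of first use; final subgroups).
    """
    selected = []
    final = []
    work = [sorted(haplotypes)]          # start with one group holding everything
    while work:
        group = work.pop()
        if len(group) == 1:              # fully differentiated haplotype
            final.append(group)
            continue
        scores = {pm: pic([haplotypes[h][pm] for h in group]) for pm in polymorphisms}
        best = max(polymorphisms, key=lambda pm: scores[pm])
        if scores[best] == 0:            # no informative polymorphism left in this group
            final.append(group)
            continue
        if best not in selected:
            selected.append(best)
        subgroups = {}
        for h in group:                  # one subgroup per allele of the chosen polymorphism
            subgroups.setdefault(haplotypes[h][best], []).append(h)
        work.extend(subgroups.values())
    return selected, final


if __name__ == "__main__":
    # Hypothetical toy data: four haplotypes scored at two polymorphisms.
    toy = {
        "H1": {"p1": "A", "p2": "C"},
        "H2": {"p1": "A", "p2": "T"},
        "H3": {"p1": "G", "p2": "C"},
        "H4": {"p1": "G", "p2": "C"},
    }
    print(differentiate(toy, ["p1", "p2"]))
```

In the toy data, H3 and H4 carry identical alleles at both polymorphisms, so they end up in a subgroup that cannot be split, which corresponds to the "no more polymorphisms with PIC values different from 0" stopping condition.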
Figure 2. Comparison of programs (the shaded areas show the steps of the analysis covered by the respective program).
Comparison of PolyMin phase I, PolyMin phase I+II and BEST in their ability to detect minimum sets of polymorphisms and the usability of these polymorphism sets for genotype differentiation.
| Gene | Number of different haplotypes | Number of possible genotypes | Program used | Minimum number of polymorphisms | Identified genotypes | Number of non-identifiable genotypes | Number of genotype groups left |
| CCoAOMT-1 | 12 | 78 | A1 | 6 | 67 | 11 | 5 |
| | | | B2 | 5 | 29 | 49 | 21 |
| | | | C3 | 6 | 73* | 5* | 2* |
| | | | D4 | 17 | 77 | 2 | 1 |
| CCoAOMT-2 | 16 | 136 | A1 | 7 | 128 | 8 | 4 |
| | | | B2 | 5 | 49 | 87 | 25 |
| | | | C3 | 6 | 110* | 26* | 12* |
| | | | D4 | 43 | 136 | 0 | 0 |
| COMT | 30 | 465 | A1 | 12 | 447 | 18 | 9 |
| | | | B2 | 8 | 234 | 231 | 75 |
| | | | C3 | 9 | 415* | 50* | 24* |
| | | | D4 | 74 | 465 | 0 | 0 |
| Lac1 | 16 | 136 | A1 | 8 | 122 | 14 | 7 |
| | | | B2 | 4 | 19 | 117 | 16 |
| | | | C3 | 8 | 124* | 12* | 6* |
| | | | D4 | 27 | 132 | 4 | 2 |
| Lac5-4 | 20 | 210 | A1 | 9 | 182 | 28 | 13 |
| | | | B2 | 6 | 51 | 169 | 45 |
| | | | C3 | 9 | 182* | 28* | 13* |
| | | | D4 | 47 | 203 | 7 | 3 |
| Lac5-6 | 8 | 36 | A1 | 6 | 34 | 2 | 1 |
| | | | B2 | 6 | 34 | 2 | 1 |
| | | | C3 | 6 | 34* | 2* | 1* |
| | | | D4 | 7 | 34 | 2 | 1 |
| Ra1 | 12 | 78 | A1 | 11 | 78 | 0 | 0 |
| | | | B2 | 11 | 78 | 0 | 0 |
| | | | C3 | 11 | 78* | 0 | 0 |
| | | | D4 | 11 | 78 | 0 | 0 |
A1 PolyMin phase I,
B2 PolyMin phase I+II (= non-informative polymorphisms excluded),
C3 BEST,
D4 PolyMin genotype differentiation optimized (= all non-redundant polymorphisms included),
*The genotype differentiation for the minimum polymorphism set generated by BEST was performed with PolyMin.
Example data sets were extracted from NCBI (CCoAOMT-1: AY323241–AY323271, CCoAOMT-2: AY279004–AY279035, COMT: AY323272–AY323305, Lac1: AY464016–AY464051, Ra1: DQ013174–DQ013202) or from Xing et al. [25].
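The "number of possible genotypes" column is consistent with n(n + 1)/2 for n haplotypes, that is, all unordered pairs of haplotypes including homozygotes (12 haplotypes give 78 genotypes, 30 give 465). The sketch below enumerates these pairs and groups them by their unphased marker profile. It is only one plausible reading of how identifiable and non-identifiable genotypes arise, not PolyMin's implementation; the function names and the toy haplotypes are hypothetical.

```python
from collections import defaultdict
from itertools import combinations_with_replacement


def genotype_profile(hap_a, hap_b):
    """Unphased observation: the unordered allele pair at each polymorphism."""
    return tuple(tuple(sorted(pair)) for pair in zip(hap_a, hap_b))


def differentiate_genotypes(haplotypes):
    """Classify diploid genotypes by whether their marker profile is unique.

    haplotypes: dict mapping haplotype name -> sequence of alleles at the
    selected polymorphisms (same order for every haplotype).
    Returns (identified genotypes, groups of indistinguishable genotypes).
    """
    by_profile = defaultdict(list)
    # All unordered haplotype pairs, homozygotes included: n * (n + 1) / 2 genotypes.
    for a, b in combinations_with_replacement(sorted(haplotypes), 2):
        by_profile[genotype_profile(haplotypes[a], haplotypes[b])].append((a, b))
    identified = [group[0] for group in by_profile.values() if len(group) == 1]
    ambiguous = [group for group in by_profile.values() if len(group) > 1]
    return identified, ambiguous


if __name__ == "__main__":
    # Hypothetical toy data: four haplotypes at two biallelic polymorphisms.
    toy = {"H1": "AC", "H2": "AT", "H3": "GC", "H4": "GT"}
    identified, ambiguous = differentiate_genotypes(toy)
    print(len(identified), "identified;", len(ambiguous), "group(s) left:", ambiguous)
```

In the toy data the two double heterozygotes H1/H4 and H2/H3 show the same alleles at both polymorphisms, so they form one group of genotypes that this marker set cannot tell apart, while the remaining genotypes are uniquely identified.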
Result from PolyMin phase I, showing 9 polymorphisms, selected for the differentiation of 13 haplotypes (reduced data set for demonstration).
| | | | | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 |
| no | pm | bp | rp | SEQ09 | SEQ10 | SEQ11 | SEQ14 | SEQ12 | SEQ13 | SEQ03 | SEQ02 | SEQ01 | SEQ04 | SEQ05 | SEQ06 | SEQ07 |
| 2 | 14 | 15 | 4 | A | G | G | G | G | G | A | A | A | A | A | G | G |
| 3 | 62 | 63 | 1 | - | - | - | - | - | - | - | A | A | A | A | A | - |
| 5 | 13 | 14 | 1 | - | - | - | - | - | - | - | - | - | A | A | A | A |
| 6 | 63 | 64 | 1 | C | C | C | C | C | C | T | C | T | C | T | C | C |
| 7 | 12 | 13 | 1 | A | A | A | - | - | A | A | A | A | A | A | A | A |
| 8 | 22 | 23 | 19 | - | - | - | - | A | A | - | - | - | - | - | - | - |
| 9 | 52 | 53 | 2 | A | - | A | A | A | A | A | A | A | A | A | A | A |
Lines in bold: polymorphisms that are excluded from the set by PolyMin phase II because they do not contribute to the differentiation of the 13 haplotypes.
no – polymorphism number,
pm – name of the polymorphism,
bp – position of the polymorphism within the sequence,
rp – number of redundant polymorphisms found in the whole sequence.
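The phase II idea of dropping polymorphisms that do not contribute to haplotype differentiation can be illustrated with a simple backward elimination. This is a sketch under assumptions, not the published procedure: `partition` and `prune` are hypothetical names, and the rule "drop a polymorphism if the remaining ones still produce the same haplotype groups" is one straightforward way to test whether a polymorphism contributes.

```python
def partition(haplotypes, pms):
    """Group haplotype names by their alleles at the given polymorphisms."""
    groups = {}
    for name, alleles in haplotypes.items():
        key = tuple(alleles[pm] for pm in pms)
        groups.setdefault(key, set()).add(name)
    return {frozenset(group) for group in groups.values()}


def prune(haplotypes, selected):
    """Remove polymorphisms whose removal leaves the haplotype groups unchanged."""
    target = partition(haplotypes, selected)
    kept = list(selected)
    for pm in list(selected):
        trial = [p for p in kept if p != pm]
        if partition(haplotypes, trial) == target:
            kept = trial                 # pm does not contribute, so drop it
    return kept


if __name__ == "__main__":
    # Hypothetical toy data: p3 separates H1 only, which p1 and p2 already achieve.
    toy = {
        "H1": {"p1": "A", "p2": "C", "p3": "-"},
        "H2": {"p1": "A", "p2": "T", "p3": "A"},
        "H3": {"p1": "G", "p2": "C", "p3": "A"},
        "H4": {"p1": "G", "p2": "T", "p3": "A"},
    }
    print(prune(toy, ["p3", "p1", "p2"]))
```

The result of such a greedy pass depends on the order in which polymorphisms are tested, so it yields a reduced set but not necessarily the smallest possible one.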