Jinho Yoo, Bonghee Seo, Yangseok Kim.
Abstract
SNPAnalyzer is software that performs four essential statistical analyses of SNPs in a common computational environment. It is composed of three main modules: (i) data manipulation, (ii) analysis and (iii) visualization. The data manipulation module is responsible for data input and output, and handles genotype, phenotype and genetic distance data. To ensure user convenience, the data format is simple. The analysis module performs statistical calculations and consists of four subcomponents: (i) Hardy-Weinberg equilibrium, (ii) haplotype estimation, (iii) linkage disequilibrium (LD) and (iv) quantitative trait locus analysis. The main feature of the analysis module is that it implements multiple algorithms and indices for haplotype estimation and for LD analysis. This enables users to compare the results generated by different algorithms, which helps to avoid the bias that can arise from applying a single statistical algorithm. The performance of all implemented algorithms has been validated using experimentally proven datasets. The visualization module presents most of the analyzed results as figures, rather than as simple text, which aids in the intuitive understanding of complex data. SNPAnalyzer has been developed using C and C++ and is available at http://www.istech.info/istech/board/login_form.jsp.
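To illustrate the first analysis subcomponent, the following is a minimal Python sketch of a Hardy-Weinberg equilibrium check for a single biallelic SNP, using a standard chi-square goodness-of-fit test with one degree of freedom. It is not SNPAnalyzer's implementation, which is written in C and C++ and whose exact test is not described here; the function names `hwe_chi_square` and `chi2_pvalue_1df` are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for Hardy-Weinberg equilibrium at one biallelic SNP.

    Arguments are the observed genotype counts: homozygote AA,
    heterozygote AB and homozygote BB (hypothetical naming).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype is required")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic SNP) to avoid dividing by zero.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def chi2_pvalue_1df(chi2):
    """Upper-tail p-value of a chi-square statistic with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))
```

A small p-value from `chi2_pvalue_1df(hwe_chi_square(...))` suggests that the genotype counts deviate from Hardy-Weinberg proportions; for small samples or rare alleles an exact test is generally preferred over this approximation.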
Year: 2005 PMID: 15980517 PMCID: PMC1160189 DOI: 10.1093/nar/gki428
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
The number of individuals in each ethnic group and the number of SNPs used for redefining haplotypes
| Ethnic group | Afr | Asi |
|---|---|---|
| Sample no. | 72 | 75 |
| SNP no. | 22 | 22 |
African American and Asian American ethnic groups are denoted as Afr and Asi, respectively.
Figure 1. A screenshot of the Haplotype Estimation component. Algorithm selection and data import are managed in the top panel, which is followed by the imported individuals' genotypes and the frequencies of the observed alleles. The middle histogram shows the distribution of the observed allele frequencies. Of the three bottom panels, the left panels show the haplotype frequencies estimated from the genotypes and their frequency histograms. The right panel displays the individuals' reconstructed haplotypes and their reconstruction accuracies.
The accuracies of haplotype estimation produced by SNPAnalyzer
| Algorithm | Afr DIS | Afr AER | Asi DIS | Asi AER |
|---|---|---|---|---|
| Gibbs | 0.021 | 0.028 | 0.017 | 0.027 |
| EM | 0.018 | 0.028 | 0.033 | 0.027 |
| Clark | 0.156 | 0.222 | 0.162 | 0.293 |
African American and Asian American ethnic groups are denoted as Afr and Asi, respectively. DIS and AER are the two error types reported for each group. Gibbs, EM and Clark represent the Gibbs sampling-based algorithm, the EM-based algorithm and Clark's algorithm, respectively.
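To give a sense of what an EM-based haplotype estimator does, here is a minimal Python sketch restricted to two biallelic SNPs, where the only phase-ambiguous genotype is the double heterozygote. This is a textbook simplification and not SNPAnalyzer's implementation, which handles many SNPs at once; the function name `em_two_snp_haplotypes` and the 3x3 genotype count layout are assumptions made for illustration.

```python
def em_two_snp_haplotypes(geno, max_iter=200, tol=1e-12):
    """Estimate haplotype frequencies for two biallelic SNPs by EM.

    geno[i][j] is the number of individuals carrying i copies of allele A
    at SNP 1 and j copies of allele B at SNP 2 (i, j in 0, 1, 2).
    Returns a dict with the frequencies of haplotypes AB, Ab, aB and ab.
    """
    n = sum(sum(row) for row in geno)
    if n == 0:
        raise ValueError("at least one individual is required")

    # Haplotype counts that are fixed by the unambiguous genotypes.
    base = {
        "AB": 2 * geno[2][2] + geno[2][1] + geno[1][2],
        "Ab": 2 * geno[2][0] + geno[2][1] + geno[1][0],
        "aB": 2 * geno[0][2] + geno[1][2] + geno[0][1],
        "ab": 2 * geno[0][0] + geno[1][0] + geno[0][1],
    }
    double_het = geno[1][1]  # phase-ambiguous: either AB/ab or Ab/aB

    freq = {h: 0.25 for h in base}  # uniform starting point
    for _ in range(max_iter):
        # E-step: expected share of double heterozygotes with phase AB/ab.
        cis = freq["AB"] * freq["ab"]
        trans = freq["Ab"] * freq["aB"]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5

        # M-step: re-estimate frequencies from the expected haplotype counts.
        new = {
            "AB": (base["AB"] + w * double_het) / (2 * n),
            "Ab": (base["Ab"] + (1 - w) * double_het) / (2 * n),
            "aB": (base["aB"] + (1 - w) * double_het) / (2 * n),
            "ab": (base["ab"] + w * double_het) / (2 * n),
        }
        converged = max(abs(new[h] - freq[h]) for h in new) < tol
        freq = new
        if converged:
            break
    return freq
```

Because EM can converge to a local optimum that depends on the starting point, comparing its output with that of other estimators, as the table above does, is a reasonable sanity check.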
Figure 2. Results of the LD and four-gamete tests on the two ethnic groups. (a) The |D′| LD pattern of the African American ethnic group reveals two small LD blocks in the specified genomic region, shown in red, where the value of |D′| is close to 1.0. (b) The |D′| LD pattern of the Asian American ethnic group reveals one small LD block and one relatively large LD block in the genomic region. The right-hand side of the LD pattern displays the results of the four-gamete test, whose patterns are similar to the LD patterns of |D′|.
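The two quantities shown in Figure 2 can be sketched for a single pair of biallelic loci as follows. This is a minimal Python illustration using the standard definitions of |D′| and the four-gamete test, assuming phased haplotype counts are already available; the function names `d_prime` and `four_gametes_observed` are hypothetical and this is not SNPAnalyzer's code.

```python
def d_prime(n11, n12, n21, n22):
    """Return |D'| for two biallelic loci from haplotype counts.

    nXY is the number of haplotypes carrying allele X at locus 1 and
    allele Y at locus 2. Returns None when either locus is monomorphic,
    because |D'| is undefined in that case.
    """
    n = n11 + n12 + n21 + n22
    if n == 0:
        raise ValueError("at least one haplotype is required")
    if (n11 + n12) in (0, n) or (n11 + n21) in (0, n):
        return None
    p1 = (n11 + n12) / n  # frequency of allele 1 at locus 1
    q1 = (n11 + n21) / n  # frequency of allele 1 at locus 2
    d = n11 / n - p1 * q1  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p1 * (1 - q1), (1 - p1) * q1)
    else:
        d_max = min(p1 * q1, (1 - p1) * (1 - q1))
    return abs(d) / d_max


def four_gametes_observed(n11, n12, n21, n22):
    """Four-gamete test: True when all four haplotypes are observed."""
    return all(count > 0 for count in (n11, n12, n21, n22))
```

When at least one of the four haplotypes is absent and both loci are polymorphic, |D′| equals 1, which is consistent with the caption's observation that the four-gamete patterns resemble the |D′| patterns.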