Taku Miyagawa, Nao Nishida, Jun Ohashi, Ryosuke Kimura, Akihiro Fujimoto, Minae Kawashima, Asako Koike, Tsukasa Sasaki, Hisashi Tanii, Takeshi Otowa, Yoshio Momose, Yasuo Nakahara, Jun Gotoh, Yuji Okazaki, Shoji Tsuji, Katsushi Tokunaga.
Abstract
Genome-wide association studies (GWAS) using a large number of single nucleotide polymorphisms (SNPs) have been applied successfully to identify genetic variants associated with common diseases. However, genotyping with the new array technologies often produces spurious results that can adversely affect GWAS analyses. Consequently, data cleaning is of paramount importance for excluding spurious genotyping results. In this study, we investigated the criteria required for the appropriate cleaning of data from 389 unrelated healthy Japanese samples analyzed using the GeneChip Human Mapping 500K Array Set for GWAS. The samples were randomly subdivided into two groups, and the allele frequencies in the two groups were compared for each SNP as a quasi-case-control study. The observed results were then filtered by four parameters (SNP call rate, confidence score obtained using the Bayesian Robust Linear Model with Mahalanobis (BRLMM) genotype-calling algorithm, Hardy-Weinberg equilibrium, and minor allele frequency) and assessed for deviation from the null hypothesis. We found that appropriate data cleaning could be achieved using these four parameters. Our findings offer an avenue for obtaining appropriate data from GWAS.
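The quasi-case-control design and the SNP-level filters described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline. All function names and the threshold defaults (`min_call_rate`, `min_hwe_p`, `min_maf`) are hypothetical, since the abstract does not state the cut-offs used. Genotypes are assumed to be coded 0/1/2 with `None` for a no-call, and the BRLMM confidence-score filter is assumed to have been applied upstream, so that low-confidence calls already appear as `None`. The Hardy-Weinberg and allelic tests here are simple 1-df chi-square tests, chosen for brevity.

```python
import math
import random


def chi2_p_1df(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


def genotype_counts(genotypes):
    """Count (AA, AB, BB) among called genotypes coded 0/1/2 (None = no-call)."""
    called = [g for g in genotypes if g is not None]
    return called.count(0), called.count(1), called.count(2)


def call_rate(genotypes):
    """Fraction of samples with a called genotype for this SNP."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_freq(genotypes):
    """Frequency of the less common allele among called genotypes."""
    n_aa, n_ab, n_bb = genotype_counts(genotypes)
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 0.0
    freq_b = (n_ab + 2 * n_bb) / (2 * n)
    return min(freq_b, 1.0 - freq_b)


def hwe_p(genotypes):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium."""
    observed = genotype_counts(genotypes)
    n = sum(observed)
    if n == 0:
        return 1.0
    p = (2 * observed[0] + observed[1]) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0  # monomorphic SNP: the test is uninformative
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2_p_1df(stat)


def allelic_p(group1, group2):
    """2x2 allelic chi-square test comparing allele counts in two groups."""
    def allele_counts(genotypes):
        n_aa, n_ab, n_bb = genotype_counts(genotypes)
        return 2 * n_aa + n_ab, 2 * n_bb + n_ab

    a1, b1 = allele_counts(group1)
    a2, b2 = allele_counts(group2)
    denom = (a1 + b1) * (a2 + b2) * (a1 + a2) * (b1 + b2)
    if denom == 0:
        return 1.0
    total = a1 + b1 + a2 + b2
    return chi2_p_1df(total * (a1 * b2 - a2 * b1) ** 2 / denom)


def passes_qc(genotypes, min_call_rate=0.95, min_hwe_p=0.001, min_maf=0.05):
    """Apply the SNP-level filters; the default thresholds are placeholders."""
    return (call_rate(genotypes) >= min_call_rate
            and hwe_p(genotypes) >= min_hwe_p
            and minor_allele_freq(genotypes) >= min_maf)


def quasi_case_control_p(genotypes, rng):
    """Randomly split the samples into two groups and test one SNP."""
    order = list(range(len(genotypes)))
    rng.shuffle(order)
    half = len(order) // 2
    group1 = [genotypes[i] for i in order[:half]]
    group2 = [genotypes[i] for i in order[half:]]
    return allelic_p(group1, group2)
```

Because both groups are drawn from the same healthy cohort, the p-values from `quasi_case_control_p` across SNPs that pass `passes_qc` should be close to uniform under the null hypothesis. An excess of small p-values would suggest that spurious genotype calls remain. In a genome-wide analysis the same random split would be reused for every SNP, rather than reshuffled per SNP as in this single-SNP sketch.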
Year: 2008 PMID: 18695938 DOI: 10.1007/s10038-008-0322-y
Source DB: PubMed Journal: J Hum Genet ISSN: 1434-5161 Impact factor: 3.172