Biao Li (1), Gao Wang, Suzanne M Leal.
(1) Center for Statistical Genetics, Department of Molecular and Human Genetics, One Baylor Plaza 700D, Baylor College of Medicine, Houston, TX 77030, USA.
Abstract
MOTIVATION: Advances in next-generation sequencing and other high-throughput technologies have generated great interest in detecting associations between complex traits and genetic variants. Phenotype selection, quality control (QC) and control of confounders are crucial and can strongly affect the ability to detect associations. Although there are programs to perform association analyses, e.g. PLINK and GenABEL, they cannot be used for comprehensive management and QC of phenotype data. To address this need, PhenoMan was developed to select individuals based on multiple phenotype criteria or population membership; control for missing covariate data; remove related individuals, duplicate samples and individuals with incorrect sex specification; recode primary traits and covariates; transform data; remove or winsorize outliers; select covariates for analysis; and create residuals. To ensure consistency and harmonization between analyses, a report is generated for every dataset. Summary statistics are also provided in graphical or text format. PhenoMan can be used for selection and manipulation of quantitative, disease and control data.

SUMMARY: PhenoMan is freeware that provides approaches for efficient exploration and management of phenotype data. Proper QC of phenotypes before proceeding to the association analysis is critical to ensure control of type I and II errors, reliable effect estimates and consistent results between studies. PhenoMan is highly beneficial for the preparation of qualitative and quantitative trait data for association studies using new datasets as well as those obtained from public repositories.

AVAILABILITY AND IMPLEMENTATION: code.google.com/p/phenoman
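Three of the QC steps named in the abstract (handling missing covariate data, winsorizing outliers and creating residuals) can be illustrated with a minimal sketch. This is not PhenoMan's code or interface, which the abstract does not describe; the function names `complete_cases`, `winsorize` and `residualize`, the percentile cutoffs and the use of NumPy with ordinary least squares are all assumptions made for illustration.

```python
import numpy as np


def complete_cases(trait, covariates):
    """Drop individuals with a missing (NaN) trait or covariate value.

    Hypothetical helper: one simple way to control for missing covariate data.
    """
    trait = np.asarray(trait, dtype=float)
    covariates = np.asarray(covariates, dtype=float)
    if covariates.ndim == 1:
        covariates = covariates[:, None]
    keep = ~np.isnan(trait) & ~np.isnan(covariates).any(axis=1)
    return trait[keep], covariates[keep]


def winsorize(values, lower_pct=1.0, upper_pct=99.0):
    """Pull extreme values in to the chosen percentiles instead of removing them.

    The 1st/99th percentile defaults are illustrative, not taken from the source.
    """
    values = np.asarray(values, dtype=float)
    low, high = np.percentile(values, [lower_pct, upper_pct])
    return np.clip(values, low, high)


def residualize(trait, covariates):
    """Return residuals from an ordinary least-squares fit of trait on covariates.

    An intercept column is added, so the residuals have mean zero.
    """
    trait = np.asarray(trait, dtype=float)
    covariates = np.asarray(covariates, dtype=float)
    if covariates.ndim == 1:
        covariates = covariates[:, None]
    design = np.column_stack([np.ones(len(trait)), covariates])
    coef, *_ = np.linalg.lstsq(design, trait, rcond=None)
    return trait - design @ coef
```

A typical order would be to filter to complete cases, winsorize the trait, and then regress it on covariates such as age and sex, carrying the residuals forward as the phenotype for association testing.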