Xiayuan Huang (1), Robert C Elston (2), Guilherme J Rosa (3), John Mayer (4), Zhan Ye (4), Terrie Kitchner (5), Murray H Brilliant (5,6), David Page (1,7), Scott J Hebbring (5,6).
1. Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, USA.
2. Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH 44106, USA.
3. Department of Animal Science, University of Wisconsin-Madison, Madison, WI 53706, USA.
4. Biomedical Informatics Research Center.
5. Center for Human Genetics, Marshfield Clinic Research Institute, Marshfield, WI 54449, USA.
6. Department of Medical Genetics.
7. Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI 53706, USA.
Abstract
Motivation: Pedigree analysis is a longstanding and powerful approach to gain insight into the underlying genetic factors in human health, but identifying, recruiting and genotyping families can be difficult, time-consuming and costly. Development of high-throughput methods to identify families and foster downstream analyses is necessary. Results: This paper describes simple methods that allowed us to identify 173 368 family pedigrees with high probability using basic demographic data available in most electronic health records (EHRs). We further developed and validated a novel statistical method that uses EHR data to identify families more likely to have a major genetic component to their disease risk. Lastly, we showed that incorporating EHR-linked family data into genetic association testing may provide added power for genetic mapping without additional recruitment or genotyping. The totality of these results suggests that EHR-linked families can enable classical genetic analyses in a high-throughput manner. Availability and implementation: Pseudocode is provided as supplementary information. Contact: HEBBRING.SCOTT@marshfieldresearch.org. Supplementary information: Supplementary data are available at Bioinformatics online.