Nathaniel D Mercaldo (1), Kyle B Brothers (2), David S Carrell (3), Ellen W Clayton (4), John J Connolly (5), Ingrid A Holm (6), Carol R Horowitz (7), Gail P Jarvik (8), Terrie E Kitchner (9), Rongling Li (10), Catherine A McCarty (11), Jennifer B McCormick (12), Valerie D McManus (13), Melanie F Myers (14), Joshua J Pankratz (15), Martha J Shrubsole (16), Maureen E Smith (17), Sarah C Stallings (18), Janet L Williams (19), Jonathan S Schildcrout (20).
1. Department of Radiology, Institute for Technology Assessment, Massachusetts General Hospital, Boston, Massachusetts, USA.
2. Department of Pediatrics, University of Louisville, Louisville, Kentucky, USA.
3. Kaiser Permanente Washington Health Research Institute, Seattle, Washington, USA.
4. Center for Biomedical Ethics and Society, Vanderbilt University, Nashville, Tennessee, USA.
5. Center for Applied Genomics, Children's Hospital of Philadelphia, Philadelphia, Pennsylvania, USA.
6. Division of Genetics and Genomics, Boston Children's Hospital, Boston, Massachusetts, USA.
7. Department of Population Health Science and Policy, Icahn School of Medicine at Mount Sinai, New York, New York, USA.
8. Department of Genome Sciences, University of Washington, Seattle, Washington, USA.
9. Center for Human Genetics, Marshfield Clinic Research Foundation, Marshfield, Wisconsin, USA.
10. Division of Genomic Medicine, National Human Genome Research Institute, Bethesda, Maryland, USA.
11. Department of Family Medicine and Biobehavioral Health, University of Minnesota Medical School, Duluth, Minnesota, USA.
12. Biomedical Ethics Program, Mayo Clinic, Rochester, Minnesota, USA.
13. Biomedical Informatics Research Center, Marshfield Clinic Research Foundation, Marshfield, Wisconsin, USA.
14. Division of Human Genetics, Cincinnati Children's Hospital, College of Medicine, University of Cincinnati, Cincinnati, Ohio, USA.
15. Department of Information Technology, Mayo Clinic, Rochester, Minnesota, USA.
16. Vanderbilt Epidemiology Center, Vanderbilt University, Nashville, Tennessee, USA.
17. Center for Genetic Medicine, Northwestern University, Chicago, Illinois, USA.
18. Division of Geriatric Medicine, Vanderbilt University, Nashville, Tennessee, USA.
19. Genomic Medicine Institute, Geisinger, Danville, Pennsylvania, USA.
20. Department of Biostatistics, Vanderbilt University, Nashville, Tennessee, USA.
Abstract
Objective: We describe a stratified sampling design that combines electronic health record (EHR) and United States Census (USC) data to construct the sampling frame, together with an algorithm that enriches the sample with individuals belonging to rarer strata.

Materials and Methods: This design was developed for a multi-site survey that sought to examine patient concerns about, and barriers to, participating in research studies, especially among understudied populations (eg, minorities and those with low educational attainment). We defined sampling strata by cross-tabulating several socio-demographic variables obtained from EHRs and augmented with census-block-level USC data. We oversampled rarer and historically underrepresented subpopulations.

Results: The sampling strategy, which included USC-supplemented EHR data, led to a far more diverse sample than would have been expected under random sampling (eg, 3-, 8-, 7-, and 12-fold increases in African Americans, Asians, Hispanics, and those with less than a high school degree, respectively). We observed that our EHR data tended to misclassify minority races more often than majority races, and that non-majority race, Latino ethnicity, younger adult age, lower education, and urban/suburban living were each associated with lower response rates to the mailed surveys.

Discussion: We observed substantial enrichment of rarer subpopulations. The magnitude of the enrichment depends on the accuracy of the variables that define the sampling strata and on the overall response rate.

Conclusion: EHR and USC data may be used to define sampling strata that in turn may be used to enrich the final study sample. This design may be of particular interest for studies of rarer and understudied populations.
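The general idea of cross-tabulating variables into strata and then oversampling the rarer ones can be illustrated with a minimal Python sketch. It is not the authors' algorithm, which the abstract does not specify. It assumes a simple capped equal-allocation rule; the function names (`build_strata`, `allocate`, `draw_sample`), the toy frame, and the variable names `race` and `education` are all hypothetical.

```python
import random
from collections import defaultdict

def build_strata(frame, keys):
    """Group frame records by the cross-tabulation of the given variables."""
    strata = defaultdict(list)
    for rec in frame:
        strata[tuple(rec[k] for k in keys)].append(rec)
    return dict(strata)

def allocate(strata, n_total):
    """Capped equal allocation (an assumed rule, not the published one):
    small strata are taken whole, and the leftover budget is spread
    evenly over the strata that still have unsampled members."""
    alloc = {s: 0 for s in strata}
    remaining = n_total
    open_strata = sorted(strata, key=lambda s: len(strata[s]))
    while remaining > 0 and open_strata:
        share = max(remaining // len(open_strata), 1)
        still_open = []
        for s in open_strata:
            take = min(share, len(strata[s]) - alloc[s], remaining)
            alloc[s] += take
            remaining -= take
            if alloc[s] < len(strata[s]):
                still_open.append(s)
        open_strata = still_open
    return alloc

def draw_sample(strata, alloc, seed=0):
    """Simple random sample within each stratum, with a design weight
    (stratum size / stratum sample size) attached to each record."""
    rng = random.Random(seed)
    sample = []
    for s, members in strata.items():
        for rec in rng.sample(members, alloc[s]):
            sample.append({**rec, "weight": len(members) / alloc[s]})
    return sample

# Toy frame (fabricated for illustration only): one common stratum, two rarer ones.
frame = (
    [{"race": "majority", "education": "HS+"} for _ in range(900)]
    + [{"race": "minority", "education": "HS+"} for _ in range(80)]
    + [{"race": "minority", "education": "<HS"} for _ in range(20)]
)
strata = build_strata(frame, keys=("race", "education"))
alloc = allocate(strata, n_total=90)
sample = draw_sample(strata, alloc)
```

In this toy frame the rarest stratum makes up 2% of the frame but a much larger share of the sample, and the attached weights allow frame-level estimates to be recovered despite the unequal sampling rates. In practice, as the abstract notes, the achieved enrichment also depends on how accurately the stratum variables are recorded and on response rates, neither of which this sketch models.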