Catherine A McCarty, Rex L Chisholm, Christopher G Chute, Iftikhar J Kullo, Gail P Jarvik, Eric B Larson, Rongling Li, Daniel R Masys, Marylyn D Ritchie, Dan M Roden, Jeffery P Struewing, Wendy A Wolf.
Abstract
INTRODUCTION: The eMERGE (electronic MEdical Records and GEnomics) Network is an NHGRI-supported consortium of five institutions exploring the utility of DNA repositories coupled to Electronic Medical Record (EMR) systems for advancing discovery in genome science. eMERGE also places special emphasis on the ethical, legal and social issues related to these endeavors.
ORGANIZATION: The five sites are supported by an Administrative Coordinating Center. Setting of network goals is initiated by working groups: (1) Genomics, (2) Informatics, (3) Consent & Community Consultation, which also includes active participation by investigators outside the eMERGE-funded sites, and (4) the Return of Results Oversight Committee. The Steering Committee, composed of site PIs and representatives and NHGRI staff, meets three times per year, once per year with the External Scientific Panel.
CURRENT PROGRESS: The primary site-specific phenotypes for which samples have undergone genome-wide association study (GWAS) genotyping are cataract and HDL, dementia, electrocardiographic QRS duration, peripheral arterial disease, and type 2 diabetes. A GWAS is also being undertaken for resistant hypertension in ≈ 2,000 additional samples identified across the network sites, to be added to data available for samples already genotyped. Funded by ARRA supplements, secondary phenotypes have been added at all sites to leverage the genotyping data, and hypothyroidism is being analyzed as a cross-network phenotype. Results are being posted in dbGaP. Other key eMERGE activities include evaluation of the issues associated with cross-site deployment of common algorithms to identify cases and controls in EMRs, privacy of genomic and clinically derived data, development of approaches for large-scale meta-analysis of GWAS data across the five sites, and a community consultation and consent initiative at each site.
FUTURE ACTIVITIES: Plans are underway to expand the network in both the diversity of its populations and the incorporation of GWAS findings into clinical care.
Year: 2011 PMID: 21269473 PMCID: PMC3038887 DOI: 10.1186/1755-8794-4-13
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.063
Figure 1. Organizational structure of the eMERGE network
Comparison of Biobanks and Phenotypes Across the eMERGE sites
| Site | Recruitment model | Cohort size and demographics | Phenotypes |
| --- | --- | --- | --- |
| Marshfield | Population-based | 20,000; 98% Caucasian, mean age 48, range 18-102 | Low HDL cholesterol, cataract (n = 3968); Secondary: diabetic retinopathy |
| Mayo Clinic | PAD cases identified from the Mayo non-invasive vascular laboratory database; control subjects without PAD identified from the Cardiovascular Health Clinic | 1687 cases (mean age 65) and 1725 controls (mean age 60) | Peripheral arterial disease (PAD) (n = 3412); Secondary: red blood cell indices |
| Northwestern | Outpatient clinic and hospital-based | Approximately 10,000; 12% African American, 8% Hispanic; mean age 50, range 18-90+ | Primary: type 2 diabetes (n = 3531); Secondary: lipids and height |
| Group Health | ACT Study: cohort of persons aged 65 and older, randomly sampled from an HMO, all known not to be demented at enrollment and followed for development of dementia and Alzheimer's disease (source of cases and controls); ADPR: Alzheimer's disease cases from a model incidence case registry (source of cases) | Approximately 4000 persons over age 65 from ACT Study | Alzheimer's disease (n = 3390), carotid artery stenosis; Secondary: statin adverse events |
| Vanderbilt | Use of discarded blood/non-human subjects linked to electronic medical records | Approximately 75,000; 70% Caucasian, 10% African American; mean age 53, range 18-100 | Electrocardiographic QRS duration (n = 3192); Secondary: PheWAS |
Sample size estimates for GWAS of diabetic retinopathy with power estimates for the individual sites and with combined data
| Site | Sample size | Power |
| --- | --- | --- |
| Marshfield | 367/569 | .424 |
| Northwestern | 150/1262 | .243 |
| Mayo | 83/3412 | .101 |
| Group Health | 324/667 | .449 |
| Vanderbilt | 260/500 | .270 |
| Combined | 1184/6410 | .999 |
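The "Combined" row can be checked by summing the per-site entries. The following is a minimal sketch in Python, assuming each "a/b" entry is a pair of counts (plausibly cases/controls, although the extracted table does not label the column). The names `site_counts` and `combine` are illustrative only and do not come from the source. The power values are not recomputed here, because the effect size, allele frequency, and significance threshold behind them are not given.

```python
# Per-site sample sizes copied from the table above.
# Assumption: each "a/b" entry is a (first, second) pair of counts,
# plausibly cases/controls; the extracted table does not label it.
site_counts = {
    "Marshfield": (367, 569),
    "Northwestern": (150, 1262),
    "Mayo": (83, 3412),
    "Group Health": (324, 667),
    "Vanderbilt": (260, 500),
}


def combine(counts):
    """Sum the first and second entries of each pair across all sites."""
    first = sum(a for a, _ in counts.values())
    second = sum(b for _, b in counts.values())
    return first, second


combined = combine(site_counts)
print(combined)  # (1184, 6410), matching the "Combined" row
```

The per-site entries therefore sum exactly to the combined figure of 1184/6410 reported in the table.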
Figure 2. Comparison of ICD-9 codes between two sites for patients with type 2 diabetes. The graphic shows the prevalence of ICD-9 codes, grouped into ICD-9 code sections (n = 120). The linear regression line drawn between these two populations shows that the two groups are similar, with a slope slightly favoring a higher prevalence of more codes at Northwestern (NU) than at Vanderbilt (VU).
Best practices being developed by the eMERGE sites
| Site | Best practices |
| --- | --- |
| Marshfield | Enhance internal EMRs to capture data in a structured format. This may involve changing existing input points in the record. Information validated against questionnaire data where applicable; Develop and evaluate a computer-based consenting process along with revisions to the current written informed consent document for our general biobank; Development of validated electronic algorithms for cataract, HDL, and diabetic retinopathy |
| Mayo Clinic | Manual abstraction vs. EMR-based algorithms: virtually all algorithms ultimately are dependent upon unstructured data; develop criteria for standardizing data dictionaries and best practices for handling missing data elements; community engagement survey instrument & educational video to educate community regarding biobank and community engagement processes; develop institutional policy and procedure for sharing of GWAS data; assess phenotyping heterogeneity from the EMR |
| Northwestern | Informatics: Identify shortcomings of data captured during routine clinical care and repurposed for research; Develop and implement common standards for formatting and sharing data; Community engagement: Develop model consent language; Summarize community engagement efforts around data sharing in our population; Genomics: Develop process for GWAS data certification review and approval; Other/general: Develop best practices for interacting with IRBs around biorepository formation and ongoing consultation |
| Group Health | Mapping the electronically derived cases vs. 'research quality' cases (e.g. dementia); how to handle cases from different sources; use of "low tech" methods to extract NLP information; identify participant-centered best practices regarding consent from existing cohorts; develop recommendations for institutions and investigators regarding consent, data sharing, and other issues with GWAS and related research (products from a consensus panel process) |
| Vanderbilt | Identify shortcomings and enhance internal EMRs to capture data in a structured format; develop methods for assessing/labeling certainty of data shared to public databases; create a description of the various analogs to human subjects biobanking in a non-human subjects model |
| Administrative Coordinating Center | Creation of a library of searchable phenotype algorithms plus associated metadata; creation of educational materials on genomic data privacy for IRBs and other regulatory decision makers; develop a re-identification risk framework for biomedical data to be shared to dbGaP |