Kenneth M Borthwick, Diane T Smelser, Jonathan A Bock, James R Elmore, Evan J Ryer, Zi Ye, Jennifer A Pacheco, David S Carrell, Michael Michalkiewicz, William K Thompson, Jyotishman Pathak, Suzette J Bielinski, Joshua C Denny, James G Linneman, Peggy L Peissig, Abel N Kho, Omri Gottesman, Harpreet Parmar, Iftikhar J Kullo, Catherine A McCarty, Erwin P Böttinger, Eric B Larson, Gail P Jarvik, John B Harley, Tanvir Bajwa, David P Franklin, David J Carey, Helena Kuivaniemi, Gerard Tromp.
Abstract
Keywords: Aortic aneurysm; Case-Control study; Computing methodologies; Electronic health records; Electronic medical record; ICD-9; KNIME
Year: 2015 PMID: 27054044 PMCID: PMC4820287
Source DB: PubMed Journal: Int J Biomed Data Min ISSN: 2090-4916
Case control counts and demographics at different sites.
| Cohort | Biobank participants (N) | Target diseases | AAA cases (N) | AAA cases, % male | AAA cases, mean age | AAA cases, age SD | Controls (N) | Controls, % male | Controls, mean age | Controls, age SD |
|---|---|---|---|---|---|---|---|---|---|---|
| Aurora | 17,902 | | 256 | 80 | 73 | 10.04 | 16,011 | 44 | 63 | 15.37 |
| GHS | 3,111 | AAA and obesity | 699 | 81 | 71 | 7.43 | 1,591 | 47 | 62 | 10.86 |
| Group Health | 3,528 | Alzheimer's disease | 178 | 72 | 80 | 5.16 | 2,490 | 44 | 78 | 8.94 |
| Marshfield | 4,987 | Eye diseases | 100 | 71 | 74 | 8.48 | 3,928 | 38 | 72 | 10.46 |
| Mayo | 10,062 | Cardiovascular diseases | 178 | 81 | 73 | 7.65 | 4,930 | 54 | 67 | 10.90 |
| Mount Sinai | 6,545 | Kidney diseases | 30 | 94 | 76 | 7.58 | 5,324 | 41 | 63 | 11.98 |
| Northwestern | 4,937 | Type 2 diabetes and cancer | 11 | 73 | 65 | 6.98 | 3,967 | 17 | 54 | 14.60 |
| Vanderbilt | 9,584 | Cardiovascular diseases | 38 | 58 | 78 | 7.71 | 152 | 76 | 78 | 8.59 |
| Total | 60,656 | | 1,490 | | | | 38,393 | | | |
For eMERGE sites, the number of biobank participants is the number with high-density genome-wide data [1]; for Aurora, it is the total number of consented patients with a blood sample in the biobank.
Appropriate controls for the target diseases listed were also included in the genome-wide data sets.
Age: age at diagnosis for cases and age at enrollment for controls. AAA, abdominal aortic aneurysm; GHS, Geisinger Health System.
Figure 1. Overview of ePhenotyping. The diagram outlines the processes for phenotype algorithm development, validation and implementation as a KNIME workflow, as well as the interactions between the various study sites and investigators outside the eMERGE Network. AAA, abdominal aortic aneurysm; EHR, Electronic Health Record; eMERGE, electronic MEdical Records and GEnomics Network (http://www.gwas.org); KNIME, Konstanz Information Miner (http://www.knime.org/); PheKB, Phenotype KnowledgeBase available at http://www.phekb.org, an online collaborative repository for building, validating, and sharing electronic phenotype algorithms and their performance characteristics.
Figure 2. ePhenotyping algorithm for the identification of cases with abdominal aortic aneurysms (AAA) and appropriate controls for research studies. For codes used in the algorithm, see Supplementary Material Table 1.
Figure 3. KNIME representation of the abdominal aortic aneurysm (AAA) ePhenotyping algorithm. (A) Global overview of the algorithm. The inputs on the left are abstract representations of the data required by the algorithm. The data fields are enumerated in each “Table Creator” node. Each site can supply the input data via any of KNIME's data reader nodes, provided all fields are present, named according to the templates and of the correct data type. The algorithm is encapsulated in the central meta node. (B) Expansion of the meta node, showing individual KNIME nodes with annotation and graphic background to facilitate comprehension of algorithm steps. (C) Enlarged top portion of (B) for readability.
Figure 4. Entity relational diagram (ERD) of the input for the AAA ePhenotyping algorithm.
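The Figure 3 caption requires that every input field be present, named according to the template, and of the correct data type. A site could run a pre-flight check along these lines before loading data into the workflow. This is a minimal sketch, not part of the published KNIME workflow: the field names (`patient_id`, `icd9_code`, `code_date`) and the helper `check_rows` are hypothetical, since the real fields are enumerated only in the "Table Creator" nodes and in Figure 4.

```python
from datetime import date

# Hypothetical template: field name -> required Python type.
# The actual field names and types are defined by the workflow's
# "Table Creator" nodes; these are placeholders for illustration.
DIAGNOSIS_TEMPLATE = {
    "patient_id": str,
    "icd9_code": str,
    "code_date": date,
}


def check_rows(rows, template):
    """Return a list of problems; an empty list means every row conforms."""
    problems = []
    for i, row in enumerate(rows):
        # All template fields must be present under the expected names.
        missing = sorted(set(template) - set(row))
        if missing:
            problems.append(f"row {i}: missing fields {missing}")
            continue
        # Each field must have the data type the template asks for.
        for name, expected in template.items():
            if not isinstance(row[name], expected):
                problems.append(
                    f"row {i}: field {name!r} is "
                    f"{type(row[name]).__name__}, expected {expected.__name__}"
                )
    return problems


if __name__ == "__main__":
    rows = [
        {"patient_id": "P1", "icd9_code": "441.4", "code_date": date(2010, 1, 5)},
        {"patient_id": "P2", "icd9_code": 441.4, "code_date": date(2011, 3, 9)},
    ]
    for problem in check_rows(rows, DIAGNOSIS_TEMPLATE):
        print(problem)
```

Storing the code as a string rather than a number is a deliberate choice in this sketch, because numeric parsing would drop trailing zeros that are significant in ICD-9 codes.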
Summary of chart review results at four participating sites
| EHR prediction (rows) vs. manual chart review (columns) | GHS: Case | GHS: Control | GHS: Total | Mayo Clinic: Case | Mayo Clinic: Control | Mayo Clinic: Total | Marshfield Clinic: Case | Marshfield Clinic: Control | Marshfield Clinic: Total | Aurora Health: Case | Aurora Health: Control | Aurora Health: Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Case | 47 | 3 | 50 | 44 | 6 | 50 | 25 | 0 | 25 | 48 | 2 | 50 |
| Control | 0 | 50 | 50 | 0 | 50 | 50 | 0 | 22 | 22 | 0 | 50 | 50 |
| Total | 47 | 53 | 100 | 44 | 56 | 100 | 25 | 22 | 47 | 48 | 52 | 100 |
| Case PPV (%) | 94 | | | 88 | | | 100 | | | 96 | | |
| Control PPV (%) | 100 | | | 100 | | | 100 | | | 100 | | |
Trained chart reviewers.
Clinician chart reviewers.
GHS, Geisinger Health System; EHR, electronic health record.
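The PPV rows in the chart review table follow directly from the counts: for each predicted class, PPV is the number of predictions confirmed by manual review divided by the number of predictions reviewed. The sketch below recomputes them from the table; the function name `ppv` and the dictionary layout are illustrative choices, not taken from the source.

```python
def ppv(confirmed, predicted):
    """Positive predictive value as a percentage."""
    return 100.0 * confirmed / predicted


# (confirmed by chart review, predicted by the EHR algorithm),
# copied from the chart review table.
chart_review = {
    "GHS": {"case": (47, 50), "control": (50, 50)},
    "Mayo Clinic": {"case": (44, 50), "control": (50, 50)},
    "Marshfield Clinic": {"case": (25, 25), "control": (22, 22)},
    "Aurora Health": {"case": (48, 50), "control": (50, 50)},
}

if __name__ == "__main__":
    for site, counts in chart_review.items():
        case_ppv = ppv(*counts["case"])
        control_ppv = ppv(*counts["control"])
        print(f"{site}: case PPV {case_ppv:.0f}%, control PPV {control_ppv:.0f}%")
```

For GHS, for example, 47 of the 50 predicted cases were confirmed, giving a case PPV of 94%.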
Distribution of AAA case types 1, 2 and 3 at GHS, Aurora Health System and Mayo Clinic biobanks.
| Case Type | GHS N | GHS % | Aurora N | Aurora % | Mayo N | Mayo % |
|---|---|---|---|---|---|---|
| 1 | 295 | 42.2 | 0 | 0 | 72 | 40.4 |
| 2 | 16 | 2.3 | 7 | 2.7 | 0 | 0 |
| 3 | 388 | 55.5 | 249 | 97.3 | 106 | 59.4 |
| All | 699 | 100 | 256 | 100 | 178 | 100 |
Breakdown of the AAA cases into those who were operated on for AAA (case Type 1), those who had a ruptured AAA (case Type 2), and those who had an ICD-9 code for AAA at least twice in their EHR (case Type 3).
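The three case types can be read as a simple decision rule. The sketch below is one possible reading, not the published algorithm (which is given in Figure 2): it assumes the types are checked in the order 1, 2, 3 so that each patient gets exactly one type, and the function and parameter names are hypothetical.

```python
from collections import Counter


def case_type(had_aaa_repair, had_ruptured_aaa, n_aaa_codes):
    """Assign an AAA case type, or None if no criterion is met.

    Assumption: criteria are evaluated in the order Type 1, 2, 3.
    """
    if had_aaa_repair:
        return 1  # operated on for AAA
    if had_ruptured_aaa:
        return 2  # ruptured AAA
    if n_aaa_codes >= 2:
        return 3  # AAA ICD-9 code recorded at least twice
    return None  # does not qualify as a case under this sketch


def distribution(types):
    """Map each case type to (count, percentage of all typed cases)."""
    counts = Counter(t for t in types if t is not None)
    total = sum(counts.values())
    return {t: (n, round(100 * n / total, 1)) for t, n in sorted(counts.items())}


if __name__ == "__main__":
    # Reproduce the GHS column of the table from its raw counts.
    ghs = [1] * 295 + [2] * 16 + [3] * 388
    print(distribution(ghs))
```

Applied to the GHS counts, `distribution` gives the same percentages as the table (42.2, 2.3 and 55.5).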
Distribution of false positive AAA case types 1, 2 and 3 at GHS, Aurora Health System and Mayo Clinic based on manual chart review validation.
| Case Type | GHS N | Aurora N | Mayo N | Total N |
|---|---|---|---|---|
| 1 | 0 | 0 | 2 | 2 |
| 2 | 1 | 0 | 0 | 1 |
| 3 | 2 | 2 | 4 | 8 |
| All | 3 | 2 | 6 | 11 |