Jean-Baptiste Escudié (1,2,3), Bastien Rance (4,5), Georgia Malamut (4), Sherine Khater (4), Anita Burgun (4,5), Christophe Cellier (4), Anne-Sophie Jannot (4,5). 1. Georges Pompidou European Hospital (HEGP), AP-HP, Paris, France. jean-baptiste.escudie@aphp.fr. 2. INSERM UMRS 1138, Paris Descartes University, Paris, France. 3. Pôle Informatique Médicale et Santé Publique, Hôpital Européen Georges Pompidou, 20 rue Leblanc, 75015, Paris, France. 4. Georges Pompidou European Hospital (HEGP), AP-HP, Paris, France. 5. INSERM UMRS 1138, Paris Descartes University, Paris, France.
Abstract
BACKGROUND: Data collected in EHRs have been widely used to identify specific conditions; however, there is still a need for methods to define comorbidities and for sources to identify comorbidity burden. We propose an approach to assess the comorbidity burden of a specific disease using the literature and EHR data sources, applied to autoimmune diseases in celiac disease (CD). METHODS: We generated a restricted set of comorbidities using the literature (via the MeSH® co-occurrence file). We extracted the 15 autoimmune diseases that co-occur most frequently with CD. We mapped these comorbidities to EHR terminologies: ICD-10 (billing codes), ATC (drugs) and UMLS (clinical reports). Finally, we extracted the concepts from the different data sources. We evaluated our approach using the correlation between prevalence estimates in our cohort and co-occurrence ranking in the literature. RESULTS: We retrieved the comorbidities for 741 patients with CD. 18.1% of patients had at least one of the 15 studied autoimmune disorders. Overall, 79.3% of the mapped concepts were detected only in text, 5.3% only in ICD codes and/or drug prescriptions, and 15.4% could be found in both sources. Prevalence estimates in our cohort were correlated with the literature (Spearman's coefficient 0.789, p = 0.0005). The three most prevalent comorbidities were thyroiditis 12.6% (95% CI 10.1-14.9), type 1 diabetes 2.3% (95% CI 1.2-3.4) and dermatitis herpetiformis 2.0% (95% CI 1.0-3.0). CONCLUSION: We introduced a process that leveraged the MeSH terminology to identify relevant autoimmune comorbidities of CD and several data sources from EHRs to phenotype a large population of CD patients. We achieved prevalence estimates comparable to the literature.
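The evaluation step described in the abstract (prevalence estimates with 95% confidence intervals, then a rank correlation against literature co-occurrence) can be illustrated with a minimal Python sketch. It rests on assumptions: the abstract does not say how the intervals were computed, so a normal-approximation (Wald) interval is used here, and the function names and the toy counts are hypothetical, not the study's data.

```python
import math

def wald_ci(count, n, z=1.96):
    """Prevalence and normal-approximation (Wald) 95% CI for a proportion.
    Assumption: the abstract does not state which interval method was used."""
    p = count / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)

def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's coefficient: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Hypothetical toy inputs (NOT the study's data): patients per comorbidity
# in a cohort, and literature co-occurrence counts for the same comorbidities.
cohort_size = 500
patients_with = [60, 12, 9, 4, 1]
literature_cooccurrence = [310, 280, 150, 40, 55]

for count in patients_with:
    prevalence, low, high = wald_ci(count, cohort_size)
    print(f"{prevalence:.1%} (95% CI {low:.1%}-{high:.1%})")

print("Spearman:", round(spearman(patients_with, literature_cooccurrence), 3))
```

A rank-based coefficient suits this comparison because cohort prevalence and literature co-occurrence counts are on different scales; only their ordering is compared.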
Authors: Wei-Qi Wei; Pedro L Teixeira; Huan Mo; Robert M Cronin; Jeremy L Warner; Joshua C Denny Journal: J Am Med Inform Assoc Date: 2015-09-02 Impact factor: 4.497
Authors: Eric I Benchimol; Astrid Guttmann; David R Mack; Geoffrey C Nguyen; John K Marshall; James C Gregor; Jenna Wong; Alan J Forster; Douglas G Manuel Journal: J Clin Epidemiol Date: 2014-04-26 Impact factor: 6.437
Authors: Graciela H Gonzalez; Tasnia Tahsin; Britton C Goodale; Anna C Greene; Casey S Greene Journal: Brief Bioinform Date: 2015-09-29 Impact factor: 11.622
Authors: Franck Diaz-Garelli; Roy Strowd; Virginia L Lawson; Maria E Mayorga; Brian J Wells; Thomas W Lycan; Umit Topaloglu Journal: JCO Clin Cancer Inform Date: 2020-06
Authors: Jose-Franck Diaz-Garelli; Roy Strowd; Tamjeed Ahmed; Brian J Wells; Rebecca Merrill; Javier Laurini; Boris Pasche; Umit Topaloglu Journal: JAMIA Open Date: 2019-08-05
Authors: Elena Díaz-Santiago; Fernando M Jabato; Elena Rojano; Pedro Seoane; Florencio Pazos; James R Perkins; Juan A G Ranea Journal: PLoS Genet Date: 2020-10-01 Impact factor: 5.917