Richard Jackson (1), Rashmi Patel (1,2), Sumithra Velupillai (1,3), George Gkotsis (1), David Hoyle (4), Robert Stewart (1,2)
1. Institute of Psychiatry, Psychology and Neuroscience, King's College London, London, SE5 8AF, UK.
2. South London and Maudsley NHS Foundation Trust, London, SE5 8AZ, UK.
3. School of Computer Science and Communication, KTH Royal Institute of Technology, Stockholm, SE-100 44, Sweden.
4. Independent Researcher, Manchester, UK.
Abstract
Background: Deep phenotyping is the precise and comprehensive analysis of phenotypic features in which the individual components of the phenotype are observed and described. In UK mental health clinical practice, most clinically relevant information is recorded as free text in the Electronic Health Record, and offers a granularity of information beyond what is expressed in most medical knowledge bases. The SNOMED CT nomenclature potentially offers the means to model such information at scale, yet in a large body of clinical text collected over many years, it is difficult to identify the language that clinicians favour to express concepts.
Methods: Using a large corpus of healthcare data, we applied semantic modelling and clustering techniques to represent the relationship between the clinical vocabulary of internationally recognised Serious Mental Illness (SMI) symptoms and the preferred language used by clinicians within a care setting. We explore how such models can be used to discover novel vocabulary relevant to the task of phenotyping SMI with only a small amount of prior knowledge.
Results: 20 403 terms were derived and curated via a two-stage methodology. The list was reduced to 557 putative concepts by eliminating redundant information content. These were then organised into 9 distinct categories pertaining to different aspects of psychiatric assessment. 235 concepts were found to be expressions of putative clinical significance. Of these, 53 were identified as having novel synonymy with existing SNOMED CT concepts, and 106 had no mapping to SNOMED CT.
Conclusions: We demonstrate a scalable approach to discovering new concepts of SMI symptomatology based on real-world clinical observation. Such approaches may offer the opportunity to consider broader manifestations of SMI symptomatology than are typically assessed via current diagnostic frameworks, and create the potential for enhancing nomenclatures such as SNOMED CT based on real-world expressions.
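The vocabulary-discovery idea summarised in the Methods can be illustrated with a minimal sketch. This is not the pipeline used in the study: it swaps in simple co-occurrence count vectors and cosine similarity in place of a trained embedding model, and the toy notes, the function names (`cooccurrence_vectors`, `cosine`, `nearest_terms`) and the window size are all hypothetical choices made for illustration. It shows only how a single seed symptom term can surface candidate terms that appear in similar contexts.

```python
from collections import Counter, defaultdict
from math import sqrt


def cooccurrence_vectors(sentences, window=2):
    """Map each token to a Counter of the tokens seen within `window` positions."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def nearest_terms(seed, vectors, top_n=3):
    """Rank all other terms by contextual similarity to the seed term."""
    seed_vec = vectors.get(seed, Counter())
    scores = [(cosine(seed_vec, vec), term)
              for term, vec in vectors.items() if term != seed]
    ranked = sorted(scores, key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in ranked[:top_n]]


# Hypothetical, hand-written toy text; not drawn from any clinical record.
TOY_NOTES = [
    "patient reports paranoid thoughts today".split(),
    "patient reports suspicious thoughts today".split(),
    "patient reports paranoid ideas again".split(),
    "patient reports suspicious ideas again".split(),
    "sleep was poor last night".split(),
]

vectors = cooccurrence_vectors(TOY_NOTES)
candidates = nearest_terms("paranoid", vectors)
print(candidates)
```

In this toy corpus, "suspicious" shares exactly the same contexts as the seed term "paranoid" and is therefore ranked first, whereas a term such as "sleep" shares none. At the scale described in the abstract, candidates of this kind would still need the curation and clinical review that the Results report.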
Keywords:
electronic health records; natural language processing; schizophrenia; serious mental illness; word2vec