Dialogue: High-throughput studies in rheumatology: time for unsupervised clustering?

George Bertsias

Keywords:  autoimmune diseases; health services research; lupus erythematosus, systemic

Year:  2021        PMID: 34952891      PMCID: PMC8710894          DOI: 10.1136/lupus-2021-000643

Source DB:  PubMed          Journal:  Lupus Sci Med        ISSN: 2053-8790


In complex autoimmune rheumatic diseases, high-throughput technologies that simultaneously analyse dozens, hundreds or thousands of biological cues (genes, metabolites, serum proteins, etc) have long been considered valuable for obtaining unique pathogenic insights and for facilitating the discovery of therapeutic targets and biomarkers for diagnosis, monitoring and prognosis [1]. In the current issue of Lupus Science and Medicine, Brunekreef et al [2] used a custom chip-based microarray to probe serum samples for a total of 57 known and new IgG autoantibodies and explore their diagnostic utility in SLE. By comparing the prevalence of each autoantibody in 483 patients with SLE and 1397 disease controls (including 361 healthy individuals), they found that anti-double-stranded (ds) DNA antibodies and antibodies against cytosine-phosphate-guanine (anti-CpG) DNA motifs best discriminated SLE from the control groups, with corresponding area under the receiver operating characteristic curve (AUC) values of 0.800 and 0.756, respectively [2]. Notably, 15.1% of patients with SLE who were negative for anti-dsDNA tested positive for anti-CpG DNA antibodies, suggesting added diagnostic value. Although the exact specificity of CpG-targeting antibodies was not explored and some cross-reactivity with anti-dsDNA antibodies cannot be entirely excluded, the results are biologically plausible given the abundance of nucleic acids containing unmethylated or hypomethylated CpG DNA in SLE [3–5].

Pending further standardisation of the CpG DNA detection methods and validation of these findings, certain methodological aspects of this work merit discussion. First, patients were designated as having SLE or another disease/condition by a text mining algorithm that searched for prespecified disease-related or symptom-related keywords in retrospectively collected electronic health records.
Although such strategies are generally considered valid and advantageous for large datasets [6], the algorithm-assigned diagnoses were not verified against the existing classification criteria or by other means. This might account for the lower-than-expected frequency of anti-nuclear antibodies (19 out of 147 first samples tested negative) in patients with SLE and also for the fact that about 30% of all patients received more than one diagnosis. Second, the researchers assigned patients without SLE to multiple control groups: one with mild, non-specific symptoms resembling healthy controls, a second with lupus-like (or incomplete lupus) presentations (eg, arthritis, nephritis, serositis) and a third with an autoimmune disease other than SLE [2]. Although this might reflect the ‘real-life’ situation in which patients do not always fit into exact diagnostic entities, one should consider that autoimmune rheumatic diseases like SLE often develop over time; therefore, some of the disease controls might represent early (or pre-) lupus forms [7, 8]. This is also supported by the between-group differences in the prevalence of autoantibodies reported by the authors [2].

These complexities in the definition and phenotypic heterogeneity of autoimmune rheumatic disorders raise the issue of how we can best use high-throughput studies and big data for disease diagnosis/classification and risk stratification. To date, the majority of studies have employed a conventional, ‘supervised’-type approach to analyse biological (input) data that are tagged with prespecified (output) ‘labels’ (diagnostic or endophenotypic groups). Although this method is straightforward and can yield accurate classification results, especially following implementation of sophisticated machine learning tools [9–11], it depends heavily on the accuracy of the available diagnostic information (considered to be ‘ground truth’) and on the pre-existing grouping of the dataset.
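To make the dependence on labels concrete, the following is a minimal sketch of how a single-marker AUC is computed from labelled samples, and how it shifts when one sample is relabelled. It is not the analysis performed by Brunekreef et al; the function name `auc` and all intensity values are hypothetical and chosen only for illustration.

```python
def auc(scores_pos, scores_neg):
    """Rank-based AUC: the probability that a randomly chosen positive
    sample scores higher than a randomly chosen negative one (ties = 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical antibody signal intensities (arbitrary units), NOT study data.
sle = [5.1, 3.8, 4.4, 2.0, 6.3]
controls = [1.2, 2.5, 0.9, 3.9, 1.7]
marker_auc = auc(sle, controls)

# Assume the control with intensity 3.9 was in fact an early (pre-)lupus case.
# Moving it to the SLE group changes the estimated performance of the marker.
sle_relabelled = sle + [3.9]
controls_relabelled = [c for c in controls if c != 3.9]
relabelled_auc = auc(sle_relabelled, controls_relabelled)

print(f"AUC with original labels:  {marker_auc:.3f}")
print(f"AUC after one relabelling: {relabelled_auc:.3f}")
```

In this toy example a single reassigned sample is enough to move the estimate, which is the sense in which a supervised result inherits any error in the assumed ‘ground truth’.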
When we have no accurate prior knowledge of the diagnostic groups for the samples, or when the output is not really ‘yes or no’ (eg, SLE or not) but rather behaves as a continuum of states (eg, ranging from healthy to pre-lupus, mild lupus and severe lupus), unsupervised clustering (or learning) might represent a more suitable solution. Indeed, these computational methods require no preconceived assumptions, work with unlabelled outputs and infer the inherent structure present within a dataset [10, 12]. Accordingly, they are useful for recognising hidden patterns or combinations of biological data, thereby providing a natural clustering of complex-structured samples. Interpreting the resulting clusters and characterising their distinctive features in a compact form may require additional steps as part of a decision-making process [13]; nonetheless, unsupervised approaches move closer to the current concept of revisiting autoimmune rheumatic diseases on the basis of their underlying molecular taxonomy [14]. To this end, high-throughput studies such as that by Brunekreef et al [2] represent notable contributions to the diagnostics of rheumatic diseases and the identification of sub-phenotypes with possibly distinct underlying pathophysiology. With accruing experience in the analysis of big data, the community should gradually move towards implementing less biased classification methods to ultimately ‘let the data speak for themselves’.
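As an illustration of what grouping samples without diagnostic labels can look like, the sketch below runs a bare-bones k-means (Lloyd's algorithm) on two hypothetical marker intensities per sample. The editorial does not prescribe any particular clustering method; k-means, the choice of three clusters, the deterministic initialisation from the first k samples and all data values are assumptions made only for this example.

```python
import math


def kmeans(points, k, iters=20):
    """Plain Lloyd's algorithm. Centroids start at the first k points, a
    simplification for reproducibility (practical use would typically rely
    on k-means++ seeding or several random restarts)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each sample joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids


# Hypothetical (marker A, marker B) intensities, NOT study data.
# No diagnosis is supplied; the algorithm sees only the measurements.
samples = [
    (0.5, 0.4), (2.6, 2.4), (4.8, 5.1), (0.7, 0.6),
    (2.4, 2.7), (5.2, 4.7), (0.4, 0.8), (4.9, 5.3),
]
labels, centroids = kmeans(samples, k=3)
print(labels)
```

The cluster indices carry no clinical meaning by themselves; relating each cluster to a phenotype (for example low, intermediate or high autoantibody burden) is the separate interpretation step mentioned above.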
  13 in total

Review 1.  Preclinical Rheumatoid Arthritis: Progress Toward Prevention.

Authors:  Kulveer Mankia; Paul Emery
Journal:  Arthritis Rheumatol       Date:  2016-04       Impact factor: 10.995

Review 2.  Moving towards a molecular taxonomy of autoimmune rheumatic diseases.

Authors:  Guillermo Barturen; Lorenzo Beretta; Ricard Cervera; Ronald Van Vollenhoven; Marta E Alarcón-Riquelme
Journal:  Nat Rev Rheumatol       Date:  2018-01-24       Impact factor: 20.543

Review 3.  Machine Learning in Rheumatic Diseases.

Authors:  Mengdi Jiang; Yueting Li; Chendan Jiang; Lidan Zhao; Xuan Zhang; Peter E Lipsky
Journal:  Clin Rev Allergy Immunol       Date:  2021-02       Impact factor: 8.667

Review 4.  Preclinical lupus.

Authors:  Rebecka Bourn; Judith A James
Journal:  Curr Opin Rheumatol       Date:  2015-09       Impact factor: 5.006

Review 5.  A Toll for lupus.

Authors:  H J Anders
Journal:  Lupus       Date:  2005       Impact factor: 2.911

Review 6.  An introduction to machine learning and analysis of its use in rheumatic diseases.

Authors:  Kathryn M Kingsmore; Christopher E Puglisi; Amrie C Grammer; Peter E Lipsky
Journal:  Nat Rev Rheumatol       Date:  2021-11-02       Impact factor: 20.543

7.  Lupus or not? SLE Risk Probability Index (SLERPI): a simple, clinician-friendly machine learning-based model to assist the diagnosis of systemic lupus erythematosus.

Authors:  Christina Adamichou; Irini Genitsaridi; Dionysis Nikolopoulos; Myrto Nikoloudaki; Argyro Repa; Alessandra Bortoluzzi; Antonis Fanouriakis; Prodromos Sidiropoulos; Dimitrios T Boumpas; George K Bertsias
Journal:  Ann Rheum Dis       Date:  2021-02-10       Impact factor: 19.103

Review 8.  Data Processing and Text Mining Technologies on Electronic Medical Records: A Review.

Authors:  Wencheng Sun; Zhiping Cai; Yangyang Li; Fang Liu; Shengqun Fang; Guoyan Wang
Journal:  J Healthc Eng       Date:  2018-04-08       Impact factor: 2.682

9.  Oxidized mitochondrial nucleoids released by neutrophils drive type I interferon production in human lupus.

Authors:  Simone Caielli; Shruti Athale; Bojana Domic; Elise Murat; Manjari Chandra; Romain Banchereau; Jeanine Baisch; Kate Phelps; Sandra Clayton; Mei Gong; Tracey Wright; Marilynn Punaro; Karolina Palucka; Cristiana Guiducci; Jacques Banchereau; Virginia Pascual
Journal:  J Exp Med       Date:  2016-04-18       Impact factor: 14.307

10.  Microarray testing in patients with systemic lupus erythematosus identifies a high prevalence of CpG DNA-binding antibodies.

Authors:  Tammo Brunekreef; Maarten Limper; Rowena Melchers; Linda Mathsson-Alm; Jorge Dias; Imo Hoefer; Saskia Haitjema; Jacob M van Laar; Henny Otten
Journal:  Lupus Sci Med       Date:  2021-10