A. Egli. Clinical Bacteriology and Mycology, University Hospital Basel, Basel, Switzerland; Applied Microbiology Research, Department of Biomedicine, University of Basel, Basel, Switzerland. Electronic address: adrian.egli@usb.ch.
In recent years, digitalization and artificial intelligence have made tremendous progress. In medicine, data-driven technologies are especially applicable in areas with a high degree of automation and standardization of data [1,2]. Substantial advances have also been reported in clinical microbiology, but their translation into routine application remains a long process with several technical and regulatory hurdles. Some of the low-hanging fruit for diagnostic scenarios includes (i) dashboards to interconnect and visualize microbiology data [3,4], (ii) automated analysis of images such as microscopy slides [5] or agar plates [6,7] and (iii) association of genome sequences and proteomic profiles with pathogen phenotypes [8,9]. Clinical applications require standardized data formats, ontologies with an interoperable information technology environment [10], infrastructure with sufficient storage and computational capacity, and technical expertise to address the needs of microbiologists and infectious diseases experts.

In the present themed issue, Luz et al. summarize machine learning algorithms for the analysis of routine electronic health records. The authors identified 52 studies covering various aspects of infectious disease management including sepsis, hospital-acquired and surgical site infections, and microbiological test results. The machine learning algorithms used were heterogeneous, ranging from logistic regression, random forests and support vector machines to artificial neural networks. A key gap is the lack of essential information on data handling [11]. Pfeiffer-Smadja et al. ask whether the time has come for machine learning in the routine practice of clinical microbiology [12]. Across 97 studies, the data sources used were highly diverse, ranging from genomic data and microscopic images to mass spectrometry.
Almost 40% of studies were from low- and middle-income countries, highlighting the opportunities that digitalization and digital biomarkers offer given decreasing costs and cloud-based services [13]. However, digital biomarkers also require validation in clinical studies to show their impact on relevant outcomes. The lack of standardized data and algorithms poses an important challenge for reproduction and validation studies [14]. As a result of issues in data handling, two prominent published coronavirus disease 2019 (COVID-19) articles were recently retracted [15,16]. Journals clearly need standards for data and code sharing. The FAIR principles provide excellent guidance [17]. Although software code and tools are often shared on GitHub (github.com) [18], the details provided are often limited, with explanatory code books or instructions missing. Proper data- and code-handling policies should be part of the new research quality standard and will allow independent validation of machine learning algorithms and data sets.

Smith and Kirby report on applications in modern image analysis [19]. Machine-learning-based image analysis may revolutionize microscopy for classical Gram stains, ova and parasite preparations, and histopathological slides. For example, a neural network could categorize Gram stains from positive blood cultures with remarkable precision into Gram positives/negatives and cocci/rods [5]. Of note, state-of-the-art infrastructure for generating high-quality images, and for data storage and processing, may be required. However, smartphone devices can bridge the technology gaps [20,21]. Similarly, based on pattern recognition, single bacterial colonies growing on agar plates can be categorized or even identified [6,7]. Both applications, automated microscopy and agar plate inspection, are likely to radically change the workflow in modern diagnostic laboratories [22].
This is perhaps parallel to how we have embraced matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry for identification, making biochemical tests almost superfluous [23]. However, there may be potential for extracting additional information from MALDI-TOF spectra. Weis and colleagues look into this key technology [24] and summarize algorithms that link spectral profiles to microbiological phenotypes. In their review, 36 studies using machine learning for species identification and antibiotic susceptibility testing were identified. The most commonly used machine learning techniques included support vector machines, genetic algorithms, artificial neural networks and quick classifiers. Study quality varied widely, and only four studies validated their findings [24].

All authors highlight the need for validated algorithms. Validation is also a key point in the regulatory process and impacts reimbursement. From May 2021, the medical device and in vitro medical device regulations of Europe will govern software with a diagnostic, monitoring or therapeutic purpose (http://ec.europa.eu/growth/sectors/medical-devices/regulatory-framework/), forming the basis for CE labelling, including that of machine-learning-based algorithms in clinical microbiology. Both academia and industry will benefit from standards in data and code handling, as these will support validation and further build trust in computational models and methods [25,26]. This process is additionally fuelled by (i) well-designed clinical studies and (ii) cross-validation against known and well-established statistical approaches. Ethical and legal aspects should also be raised if such algorithms are to be integrated into personalized and public health medicine [27]. As illustrated during the COVID-19 crisis, multiple models have predicted different outcomes [28,29], for example for fatality rates and the impact of the lockdown.
In public health emergencies, high-quality real-time data must be available to the scientific community in machine-readable formats. Such infrastructure for public health monitoring needs to be further developed. If public health decisions rely on such models, the models should in turn be validated in a similar way to algorithms in personalized medicine, because the impact on the general population and the economy is significant.

Clearly, an interesting and challenging time for clinical microbiology and infectious diseases lies ahead. Standards in data and code handling are a first step, which will allow us to use the opportunities of digitalization and machine learning to improve diagnostics and patient care.
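To make the point about data and code handling concrete, the following is a minimal sketch, not a prescribed standard, of how a shared data set could be accompanied by a machine-readable code book and a checksum so that others can verify they are validating against the same data. The function name `build_code_book`, the column names and the example records are all hypothetical and serve for illustration only; only the Python standard library is used.

```python
import hashlib
import json


def build_code_book(rows, columns):
    """Return a machine-readable description of a small tabular data set.

    `rows` is a list of dicts (one per record); `columns` maps each column
    name to a dict with a human-readable 'description' and a 'unit' (or None).
    Both structures are illustrative assumptions, not an established format.
    """
    # Every documented column must be present in every record.
    missing = [name for name in columns if any(name not in row for row in rows)]
    if missing:
        raise ValueError(f"columns absent from some rows: {missing}")
    # Serialize deterministically so the checksum is reproducible.
    payload = json.dumps(rows, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return {
        "n_rows": len(rows),
        "columns": columns,
        "sha256": hashlib.sha256(payload).hexdigest(),
    }


# Hypothetical example records (not real data).
rows = [
    {"isolate_id": "A001", "species": "Escherichia coli", "mic_mg_per_l": 0.5},
    {"isolate_id": "A002", "species": "Klebsiella pneumoniae", "mic_mg_per_l": 8.0},
]
columns = {
    "isolate_id": {"description": "Laboratory isolate identifier", "unit": None},
    "species": {"description": "Species name as reported", "unit": None},
    "mic_mg_per_l": {"description": "Minimum inhibitory concentration", "unit": "mg/L"},
}

code_book = build_code_book(rows, columns)
print(json.dumps(code_book, indent=2))
```

Publishing such a file next to the data and the analysis code gives a validation study two things that are often missing: a description of every variable and a way to confirm that the data set has not changed since the original analysis.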