Paresh C Giri, Anand M Chowdhury, Armando Bedoya, Hengji Chen, Hyun Suk Lee, Patty Lee, Craig Henriquez, Neil R MacIntyre, Yuh-Chin T Huang.
Abstract
Analysis of pulmonary function tests (PFTs) is an area where machine learning (ML) may benefit clinicians, researchers, and patients. PFTs include spirometry, lung volumes, and carbon monoxide diffusion capacity of the lung (DLCO). The results are usually interpreted by clinicians using discrete numeric data according to published guidelines. PFT interpretations by clinicians, however, are known to have inter-rater variability, and inaccurate interpretations can affect patient care. This variability may be caused by unfamiliarity with the guidelines, lack of training, inadequate understanding of lung physiology, or simple mental lapses. A rules-based automated interpretation system can recapitulate an expert's pattern recognition capability and decrease errors. ML can also be used to analyze continuous data or graphics, including the flow-volume loop and the DLCO and nitrogen washout curves. These analyses can discover novel physiological biomarkers. In the era of wearables and telehealth, particularly with the COVID-19 pandemic restricting the PFTs that can be performed in clinical laboratories, ML can also be used to combine mobile spirometry results with an individual's clinical profile to deliver precision medicine. There are, however, hurdles in the development and commercialization of ML-assisted PFT interpretation programs, including the need for high-quality representative data, the different formats that vendors' PFT software uses for data acquisition and sharing, and the need for collaboration among clinicians, biomedical engineers, and information technologists. Hurdles notwithstanding, the new developments would represent significant advances that could be the future of PFT, the oldest test still in use in clinical medicine.
Keywords: DLCO; artificial intelligence; flow-volume loop; lung volumes; machine learning; pulmonary function test; spirometry
Year: 2021 PMID: 34248665 PMCID: PMC8264499 DOI: 10.3389/fphys.2021.678540
Source DB: PubMed Journal: Front Physiol ISSN: 1664-042X Impact factor: 4.566
Machine learning methods that may be useful in the analysis of the numeric and graphic data of the pulmonary function tests.
| Methods | Description |
| --- | --- |
| Random forests | A method of decision tree analysis in which a supervised algorithm uses a “bagging” approach to create multiple decision trees, each trained on a random subset of the data. The predictions of these decision trees are then combined to give a more accurate and stable prediction. It is one of the most commonly used machine learning techniques and is well suited to classification and regression tasks. |
| Neural network | A set of algorithms that uses interconnected layers of computational units (analogous to neurons in the brain) to find relationships in data by iteratively adapting the weights between units. The network typically consists of an input layer that receives the data, several hidden layers, and an output layer. The network can learn using supervised training where an input/output relationship is known or through unsupervised training where no outputs are provided. |
| Convolutional neural network | A form of neural network that learns, through training, the filters (or kernels) that slide along the input features, producing translation-equivariant responses. It is most commonly applied to the analysis of visual images. |
| Fuzzy logic | A form of many-valued logic suited to handling partial truth, in which the truth value of a variable may be any real number between 0 and 1. The method can recognize, interpret, and use vague and imprecise data and information, and it outputs degrees of truth. The reasoning style of a fuzzy system can be combined with the learning structure of neural networks to form fuzzy-neural systems. Such a hybrid intelligent system draws on the universal approximation capability of neural networks to discover interpretable IF-THEN fuzzy rules. |
| Naïve Bayes | A probabilistic classifier based on Bayes’ theorem. It assumes that, given the class, the value of a particular feature is independent of the value of any other feature. It is a simple technique that requires only a small amount of training data to estimate the parameters necessary for classification. The naïve Bayes model can be used without accepting Bayesian probability or using any Bayesian methods. |
| Support vector machine | A supervised machine learning method that analyzes data for classification and regression. It builds a model that assigns new examples to one of two categories, making it a non-probabilistic binary linear classifier. It can also perform non-linear classification using the “kernel trick,” which implicitly maps the inputs into high-dimensional feature spaces. |
| k-means clustering | A common unsupervised machine learning method that groups input vectors into k clusters, each represented by the average of its points (i.e., its centroid), without referring to known or labeled outcomes. |
| Adaptive Boosting (AdaBoost) | A boosting meta-algorithm for statistical classification that is frequently combined with other “weaker” machine learning algorithms (e.g., decision trees) to improve their performance. AdaBoost used with decision trees is often referred to as the best out-of-the-box classifier. At each round, AdaBoost increases the weight of the training examples that earlier learners misclassified (their relative “hardness”), so that later learners focus on them, and then combines the weak learners into a weighted, stronger learner. |
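
As an illustration of one row of the table, the k-means procedure can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: the function name `k_means`, the deterministic seeding from the first k points, and the synthetic 2-D points are all assumptions made for this example, and the points are not patient data.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def k_means(points, k, iterations=100):
    """Lloyd's algorithm: alternate an assignment step and an update step."""
    # Assumption of this sketch: seed the centroids with the first k points so
    # the run is reproducible. Practical implementations usually seed randomly.
    centroids = [tuple(p) for p in points[:k]]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid moves to the mean of its assigned points.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster in place
        if new_centroids == centroids:
            break  # centroids stopped moving, so the assignments are stable
        centroids = new_centroids
    return centroids, labels


# Synthetic, unlabeled 2-D points (hypothetical values for illustration only).
points = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.2, 4.9), (0.8, 1.1), (4.8, 5.1)]
centroids, labels = k_means(points, k=2)
print(labels)
```

No outcome labels are passed to the function; the grouping comes only from the distances between points, which is what makes the method unsupervised. The result depends on the initial centroids, so a different seeding can give a different partition.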
FIGURE 1. The diagram shows how machine learning may be used in pulmonary function testing. The two main areas are to assist in the interpretation and to discover novel physiological biomarkers.