| Literature DB >> 33285483 |
Donglin Di1, Feng Shi2, Fuhua Yan3, Liming Xia4, Zhanhao Mo5, Zhongxiang Ding6, Fei Shan7, Bin Song8, Shengrui Li1, Ying Wei2, Ying Shao2, Miaofei Han2, Yaozong Gao2, He Sui5, Yue Gao9, Dinggang Shen10.
Abstract
The coronavirus disease, named COVID-19, has become the largest global public health crisis since its outbreak in early 2020. CT imaging has been used as a complementary tool to assist early screening, especially for the rapid identification of COVID-19 cases from community-acquired pneumonia (CAP) cases. The main challenge in early screening is how to model the confusing cases in the COVID-19 and CAP groups, which have very similar clinical manifestations and imaging features. To tackle this challenge, we propose an Uncertainty Vertex-weighted Hypergraph Learning (UVHL) method to identify COVID-19 from CAP using CT images. In particular, multiple types of features (including regional features and radiomics features) are first extracted from the CT image of each case. Then, the relationship among different cases is formulated by a hypergraph structure, with each case represented as a vertex in the hypergraph. The uncertainty of each vertex is further computed with an uncertainty score measurement and used as a weight in the hypergraph. Finally, a learning process on the vertex-weighted hypergraph is used to predict whether a new testing case belongs to COVID-19 or not. Experiments on a large multi-center pneumonia dataset, consisting of 2148 COVID-19 cases and 1182 CAP cases from five hospitals, are conducted to evaluate the prediction accuracy of the proposed method. Results demonstrate the effectiveness and robustness of our proposed method on the identification of COVID-19 in comparison to state-of-the-art methods.
Keywords: COVID-19 pneumonia; Hypergraph learning; Uncertainty calculation; Vertex-weighted
Year: 2020 PMID: 33285483 PMCID: PMC7690277 DOI: 10.1016/j.media.2020.101910
Source DB: PubMed Journal: Med Image Anal ISSN: 1361-8415 Impact factor: 8.545
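The abstract describes representing each case as a vertex and linking similar cases by hyperedges. A minimal sketch of one common way to build such a structure, a KNN-based incidence matrix in NumPy (the exact construction used by the authors may differ; `knn_hypergraph_incidence` is an illustrative helper, not from the paper):

```python
import numpy as np

def knn_hypergraph_incidence(features: np.ndarray, k: int = 3) -> np.ndarray:
    """Build an incidence matrix H (n_vertices x n_hyperedges).

    One hyperedge per vertex: the vertex itself plus its k nearest
    neighbors in Euclidean feature space.
    """
    n = features.shape[0]
    # Pairwise squared Euclidean distances.
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    H = np.zeros((n, n))
    for i in range(n):
        # argsort puts vertex i itself first (distance 0).
        neighbors = np.argsort(d2[i])[: k + 1]
        H[neighbors, i] = 1.0
    return H

X = np.random.rand(10, 5)  # 10 cases, 5 features each
H = knn_hypergraph_incidence(X, k=3)
print(H.shape)  # prints (10, 10): 10 vertices, 10 hyperedges
```

Each column of `H` marks the k+1 vertices belonging to one hyperedge; in the paper, separate hypergraphs or hyperedge groups could be built per feature type (regional, radiomics) and combined.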
Fig. 1 Illustration of lung CT image, infection, lung lobes, and pulmonary segments on a CAP case (left) and a COVID-19 case (right).
Fig. 2 Illustration of our proposed Uncertainty Vertex-weighted Hypergraph Learning (UVHL) method for COVID-19 identification. Given a set of patients, the “Data Uncertainty Measurement” stage calculates the uncertainty score for each CAP and COVID-19 case, denoted in green. The “Uncertainty-vertex Hypergraph Modelling” stage then constructs the hypergraph structure for both labeled and unknown cases, the former of which are embedded and denoted with color bars. Finally, the “Uncertainty-vertex Hypergraph Learning” stage learns to classify all of the cases into the two disease groups.
Fig. 3 Besides the hyperedge weights, the uncertainty-vertex hypergraph contains the uncertainty score of each vertex.
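Standard hypergraph learning (Zhou et al., 2007) propagates labels through the normalized operator Θ = Dv^(−1/2) H W De^(−1) Hᵀ Dv^(−1/2); the UVHL idea is to additionally weight each vertex by its uncertainty score. The sketch below computes the standard operator; folding the vertex weights in by scaling rows of H is our illustrative assumption, not necessarily the authors' exact formulation:

```python
import numpy as np

def hypergraph_theta(H, edge_w=None, vertex_w=None):
    """Normalized hypergraph propagation operator
    Theta = Dv^{-1/2} H W De^{-1} H^T Dv^{-1/2}.

    H:        (n, m) incidence matrix (vertices x hyperedges)
    edge_w:   (m,) hyperedge weights (default: all ones)
    vertex_w: (n,) per-vertex weights, e.g. uncertainty scores;
              applying them as row scalings of H is an
              illustrative choice.
    """
    n, m = H.shape
    w = np.ones(m) if edge_w is None else np.asarray(edge_w, float)
    if vertex_w is not None:
        H = np.asarray(vertex_w, float)[:, None] * H
    Dv = H @ w              # (weighted) vertex degrees
    De = H.sum(axis=0)      # hyperedge degrees
    inv_sqrt_Dv = np.diag(1.0 / np.sqrt(Dv))
    return inv_sqrt_Dv @ H @ np.diag(w / De) @ H.T @ inv_sqrt_Dv
```

With Θ in hand, transductive classification can be done by propagating the known labels, e.g. solving f = (I − λΘ)^(−1) y for a trade-off parameter λ.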
The definition of the confusion matrix for COVID-19 identification.

| | Classify as COVID-19 | Classify as CAP |
|---|---|---|
| COVID-19 | True Positive (TP) | False Negative (FN) |
| CAP | False Positive (FP) | True Negative (TN) |
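The metrics reported in the tables below all follow from this confusion matrix, with COVID-19 as the positive class. A small helper using the standard definitions (BAC is the mean of sensitivity and specificity; the counts in the example are made-up numbers):

```python
def confusion_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Standard metrics with COVID-19 as the positive class."""
    sen = tp / (tp + fn)             # sensitivity (recall)
    spec = tn / (tn + fp)            # specificity
    return {
        "ACC": (tp + tn) / (tp + fn + fp + tn),
        "SEN": sen,
        "SPEC": spec,
        "BAC": (sen + spec) / 2,     # balanced accuracy
        "PPV": tp / (tp + fp),       # precision
        "NPV": tn / (tn + fn),
    }

m = confusion_metrics(tp=90, fn=10, fp=20, tn=80)
print(m["ACC"])  # prints 0.85
```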
Fig. 4 The prediction accuracy of UVHL and the compared methods. The results show that UVHL outperforms the other methods on all metrics.
Prediction accuracy comparison of different methods on the pneumonia dataset. For each of the 10 folds, we compute the accuracy of each method on the testing data and compare it with that of UVHL via a paired t-test to generate the p-values for each metric. (“*” denotes that the significance level is reached, i.e., p-value < 0.05.)
| Methods | ACC | SEN | SPEC | BAC | PPV | NPV |
|---|---|---|---|---|---|---|
|  | 0.84084 | 0.85714 | 0.81034 | 0.83374 | 0.89423 | 0.75200 |
|  | 0.84685 | 0.86175 | 0.81897 | 0.84036 | 0.89904 | 0.76000 |
|  | 0.85135 | 0.86327 | 0.83052 | 0.84790 | 0.90256 | 0.76866 |
|  | 0.86486 | 0.89191 | 0.81743 | 0.85467 | 0.89898 | 0.80547 |
| UVHL |  |  |  |  |  |  |
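The p-values in these tables come from paired t-tests over the 10 folds. A minimal sketch using only the standard library: compute the paired t-statistic and compare it against the two-tailed critical value t(0.975, df=9) ≈ 2.262 for significance at p < 0.05 (the per-fold accuracies below are made-up numbers, not from the paper):

```python
import math

def paired_t_statistic(a, b):
    """Paired t-statistic for two sets of per-fold scores."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    mean = sum(d) / n
    var = sum((x - mean) ** 2 for x in d) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n)

# Hypothetical per-fold accuracies for UVHL vs. a baseline.
uvhl     = [0.89, 0.90, 0.88, 0.91, 0.90, 0.89, 0.92, 0.88, 0.90, 0.89]
baseline = [0.85, 0.86, 0.84, 0.87, 0.85, 0.84, 0.88, 0.85, 0.86, 0.85]

t = paired_t_statistic(uvhl, baseline)
# Two-tailed critical value for df = 9 at the 0.05 level.
print(abs(t) > 2.262)  # prints True: the difference is significant
```

In practice one would use `scipy.stats.ttest_rel`, which also returns the exact p-value.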
Experimental comparison on the data uncertainty measurement. For the “proposed uncertainty” strategy, we compute its accuracy on the testing data and compare it with the other settings via paired t-tests to generate the p-values. (“*” denotes that the significance level is reached, i.e., p-value < 0.05.)
| Weighting strategy | ACC | SEN | SPEC | BAC | PPV | NPV |
|---|---|---|---|---|---|---|
|  | 0.85586 | 0.88426 | 0.80342 | 0.84384 | 0.89252 | 0.789912 |
|  | 0.86066 | 0.87021 | 0.84442 | 0.85731 | 0.90983 | 0.78137 |
|  | 0.87387 | 0.918919 | 0.78378 | 0.85135 | 0.89474 | 0.82857 |
|  | 0.88589 | 0.90741 | 0.87678 | 0.83193 |  |  |
|  | 0.84000 | 0.90654 |  |  |  |  |
Experimental comparison on different feature types and their combination. For the “both” feature type, we compute its accuracy on the testing data and compare it with the single feature types via paired t-tests to generate the p-values. (“*” denotes that the significance level is reached, i.e., p-value < 0.05.)
| Feature types | ACC | SEN | SPEC | BAC | PPV | NPV |
|---|---|---|---|---|---|---|
|  | 0.85886 | 0.90323 | 0.77586 | 0.83954 | 0.88288 | 0.81081 |
|  | 0.85946 | 0.86982 | 0.85582 | 0.78012 |  |  |
|  | 0.84000 | 0.90654 |  |  |  |  |
Experimental comparison across different factors of the dataset. For each factor, we compute the accuracy of the proposed method on the testing data for each subgroup and compare the subgroups via paired t-tests to generate the p-values. (“*” denotes that the significance level is reached, i.e., p-value < 0.05.)
| Factors | % of subjects | ACC | SEN | SPEC | BAC | PPV | NPV |
|---|---|---|---|---|---|---|---|
|  | 50.48% | 0.88248 |  |  |  |  |  |
|  | 49.52% | 0.88963 | 0.92768 | 0.83302 | 0.88035 | 0.89209 |  |
|  | 54.20% | 0.89236 | 0.87611 |  |  |  |  |
|  | 45.80% | 0.93078 | 0.82680 | 0.87879 | 0.81877 |  |  |
|  | 59.91% | 0.93084 | 0.76660 |  |  |  |  |
|  | 40.09% | 0.87491 | 0.82962 | 0.88247 | 0.80451 |  |  |
Fig. 5 Prediction accuracy comparison with respect to different scales of training data.