Osama R Shahin, Hamoud H Alshammari, Ahmed I Taloba, Rasha M Abd El-Aziz.
Abstract
Because people all over the world are vulnerable to COVID-19, automatic detection of the virus is an important concern. This paper aims to detect and classify coronavirus infection using machine learning. A computer-aided diagnosis (CAD) system is proposed to detect and classify COVID-19 in CT lung screening, using clinical specimens obtained from coronavirus-infected patients together with machine learning techniques such as Decision Tree, Support Vector Machine, K-means clustering, and Radial Basis Function. While some specialists believe that the RT-PCR test is the best option for diagnosing COVID-19 patients, others believe that CT scans of the lungs can be more accurate in diagnosing coronavirus infection, as well as less expensive than the PCR test. The clinical specimens include serum specimens, respiratory secretions, and whole blood specimens. Overall, 15 factors are measured from these specimens as the result of previous clinical examinations. The proposed CAD system consists of four phases. It starts with the collection of CT lung screenings, followed by a pre-processing stage that enhances the appearance of ground-glass opacity (GGO) nodules, which originally look hazy with faint contrast. A modified K-means algorithm is then used to detect and segment these regions. Finally, the infected areas obtained in the detection phase, scaled to 50×50, together with the segmented solid false positives that resemble GGOs, are used as inputs and targets for the machine learning classifiers; here a support vector machine (SVM) and a radial basis function (RBF) have been used. Moreover, a GUI application is developed that helps doctors avoid confusion and obtain exact results by entering the 15 input factors obtained from the clinical specimens.
Keywords: CAD system; Clinical specimens; SVM; COVID-19 analysis; Radial basis function
Year: 2022 PMID: 35505976 PMCID: PMC9050589 DOI: 10.1016/j.compeleceng.2022.108055
Source DB: PubMed Journal: Comput Electr Eng ISSN: 0045-7906 Impact factor: 4.152
DT and LR for detection of positive COVID-19 patients
| i. Standardize the selected features using the StandardScaler() function. |
| ii. Apply DT to the specified features using the DTClassifier(criterion='entropy', max_depth=5, random_state=0) function. |
| iii. Train the model on the selected features. |
| iv. K-fold CV specifications: thresh=0.5, k_fold_seed=13, n_folds=10. |
| v. Using the test dataset, forecast the outcome. |
| vi. Predict the outcome using the test dataset. |
| vii. Evaluate FN, FP, TN, and TP using confusion_matrix(). |
| viii. Calculate recall, precision, and F1 score with the classification_report() function. |
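The steps above can be sketched with scikit-learn. This is only an illustration under stated assumptions: `DTClassifier` is taken to mean scikit-learn's `DecisionTreeClassifier`, `thresh=0.5` is read as a cut-off on the predicted probability of the positive class, and the data are a synthetic stand-in with 15 features rather than the clinical factors used in the paper.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic placeholder for the 15 clinical factors (not the paper's data).
X, y = make_classification(n_samples=400, n_features=15, n_informative=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Steps i-iii: standardize the features, then fit a decision tree with the listed parameters.
model = make_pipeline(
    StandardScaler(),
    DecisionTreeClassifier(criterion="entropy", max_depth=5, random_state=0),
)

# Step iv: 10-fold cross-validation with seed 13 on the training data.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=13)
cv_scores = cross_val_score(model, X_train, y_train, cv=cv, scoring="accuracy")

model.fit(X_train, y_train)

# Steps v-vi: predict on the test set; thresh=0.5 is assumed to be a probability cut-off.
positive_probability = model.predict_proba(X_test)[:, 1]
y_pred = (positive_probability >= 0.5).astype(int)

# Step vii: for binary labels 0/1, ravel() yields TN, FP, FN, TP in that order.
tn, fp, fn, tp = confusion_matrix(y_test, y_pred).ravel()

# Step viii: per-class precision, recall, and F1 score.
report = classification_report(y_test, y_pred, target_names=["negative", "positive"])
print(report)
```

Wrapping the scaler and the tree in one pipeline keeps the standardization inside each cross-validation fold, so the held-out fold never influences the scaling statistics.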
K-means Algorithm
| Step 1: Choose the number of clusters, k, needed to represent the objects spatially. |
| Step 2: Initialize a primary group of k centroids. |
| Step 3: Calculate the distance between each data point and each centroid. |
| Step 4: Label each point with the centroid closest to it. |
| Step 5: Recalculate each centroid from the points assigned to it. |
| Step 6: Repeat Steps 3 to 5 until the centroids no longer change. |
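A minimal pure-Python sketch of the standard K-means loop described in these steps follows. It is not the modified K-means mentioned in the abstract, whose modifications are not given here. Taking the first k points as initial centroids is an assumption made to keep the example deterministic, and the toy points are hypothetical.

```python
import math


def kmeans(points, k, max_iter=100):
    """Standard K-means on a list of equal-length numeric tuples."""
    # Initial centroids: simply the first k points (an assumption for this
    # sketch; random or k-means++ seeding is more common in practice).
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Measure the distance from each point to every centroid and
        # label the point with the index of the closest one.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        # Recompute each centroid as the mean of the points assigned to it.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # keep the centroid of an empty cluster
        # Stop once the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


# Hypothetical toy data: two well-separated groups of 2-D points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = kmeans(points, k=2)
print(labels)
```

For image segmentation, the same loop would run on pixel intensities (or intensity plus position) instead of 2-D toy points.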
Fig. 1. Proposed model of Clinical specimens and CT-lungs screening.
Fig. 2. Flowchart for the COVID-19 detection.
Naïve Bayes
| Input: COVID-19 dataset |
| Calculate the likelihood of each element by breaking it down into its constituent tokens. |
| Store the values of infected in the database; |
| For each patient data D do |
| While (D not end) do |
| Image sample for following token, Ox; |
| Query folder for the infectious samples D(Ox); |
| Calculate the collected samples probability, P [D] and N [D]; |
| Calculate the number of samples by: V [D]=h(D [D], J [D]); |
| If V [D] > threshold: |
| Infected; |
| else |
| Non-infected; |
| End if |
| End while |
| End for |
| Return the final classification (Infected/Non-infected); |
| end |
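The listing above does not fully specify how the probabilities are computed, so the following is a generic Gaussian Naive Bayes sketch rather than a reconstruction of it. It keeps only the final step of the listing: a sample is labeled Infected when its posterior probability exceeds a threshold. The two features, the toy values, and the function names are hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate the class prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, members in by_class.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # A small constant keeps every variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-9 for col, m in zip(columns, means)
        ]
        model[label] = (n / len(rows), means, variances)
    return model


def posterior(model, row):
    """Return P(class | row) for every class, computed from log-likelihoods."""
    log_scores = {}
    for label, (prior, means, variances) in model.items():
        log_p = math.log(prior)
        for v, m, var in zip(row, means, variances):
            log_p += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        log_scores[label] = log_p
    top = max(log_scores.values())  # subtract the maximum for numerical stability
    scaled = {label: math.exp(s - top) for label, s in log_scores.items()}
    total = sum(scaled.values())
    return {label: s / total for label, s in scaled.items()}


def classify(model, row, threshold=0.5):
    """Label a sample Infected when its posterior exceeds the threshold."""
    return "Infected" if posterior(model, row)["Infected"] > threshold else "Non-infected"


# Hypothetical toy data: (body temperature, a normalized lab marker).
rows = [(38.5, 0.9), (39.0, 0.8), (38.8, 1.0), (36.6, 0.2), (36.8, 0.1), (36.5, 0.3)]
labels = ["Infected"] * 3 + ["Non-infected"] * 3
model = fit_gaussian_nb(rows, labels)
print(classify(model, (38.9, 0.85)), classify(model, (36.7, 0.2)))
```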
Fig. 3. Decision Tree Algorithm.
Decision Tree
| Assume No. of Samples = S; |
| Data Points = p; |
| Target Inputs = q; |
| No. of Leaves = Ys; |
| Tree_Depth = T; |
| Criterion = R; |
| for y in test size; |
| do |
| for a in R do |
| test_size = p_test and q_test; |
| train_size = p_train and q_train; |
| for e < p do |
| Call function DT; |
| for f < T do |
| Evaluate the best_split; |
| Evaluate class (r); |
| Ys++; |
| Node, (r,R); |
| Return |
| Predicted_class (r); |
| Calculate the Accuracy; |
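The central operation in this listing is evaluating the best split under a criterion R. A minimal sketch of that single step follows, assuming R is entropy (information gain), as in the `criterion='entropy'` setting listed earlier. The function names and toy data are hypothetical, and a full tree would apply `best_split` recursively up to depth T.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature_index, threshold, information_gain) of the best binary split."""
    base = entropy(labels)
    best = (None, None, 0.0)
    for f in range(len(rows[0])):
        values = sorted(set(row[f] for row in rows))
        # Candidate thresholds are the midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            weighted = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
            gain = base - weighted
            if gain > best[2]:
                best = (f, t, gain)
    return best


# Hypothetical toy data: feature 0 separates the two classes, feature 1 does not.
rows = [(1.0, 5.0), (2.0, 1.0), (3.0, 4.0), (4.0, 2.0)]
labels = [0, 0, 1, 1]
feature, threshold, gain = best_split(rows, labels)
print(feature, threshold, gain)
```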
Fig. 4. Naïve Bayes Algorithm working.
Fig. 5. COVID-19 Detection Methodology.
Fig. 6. SVM Working Principle.
Fig. 7. Radial Basis Function Working.
Fig. 8. K-Means clustering Working.
Confusion Matrix.
| | Predicted (Positive) | Predicted (Negative) |
| Actual biopsy (True) | True Positive (TP) | False Negative (FN) |
| Actual biopsy (False) | False Positive (FP) | True Negative (TN) |
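The four cells of the confusion matrix determine the usual evaluation metrics. The short sketch below uses the standard definitions of precision, recall, F1 score, and accuracy. The counts in the example are made up for illustration and are not results from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute precision, recall, F1 score, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    )
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=40, fp=10, fn=5, tn=45)
print(metrics)
```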
Support Vector Machine (SVM).
| Training set = |
| Kernel function = { |
| Number of nearest neighbors = k; |
| set |
| set γ = γ; |
| Create a trained SVM classifier |
| Compare the classifier |
| Maintain the classifier with the better accuracy; |
| Final classification of the result (Infected/Non-infected); |
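Several formulas in the listing above are missing, so the sketch below covers only the part that can be read from it: train SVM classifiers for different values of γ, compare them, and keep the one with the better accuracy. It assumes γ is the width parameter of an RBF kernel and uses scikit-learn's `SVC`. The candidate γ grid, the value of `C`, and the synthetic 15-feature data are all hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: synthetic placeholder data with 15 features (not the paper's data).
X, y = make_classification(n_samples=300, n_features=15, n_informative=6, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

candidate_gammas = [0.001, 0.01, 0.1, 1.0]  # hypothetical grid of kernel widths
best_gamma, best_model, best_accuracy = None, None, -1.0

for gamma in candidate_gammas:
    # Train one RBF-kernel SVM per gamma value on standardized features.
    clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", gamma=gamma, C=1.0))
    clf.fit(X_train, y_train)
    accuracy = clf.score(X_val, y_val)
    # Compare classifiers and keep the one with the better validation accuracy.
    if accuracy > best_accuracy:
        best_gamma, best_model, best_accuracy = gamma, clf, accuracy

# Final classification of one sample with the retained classifier.
class_names = ["Non-infected", "Infected"]
prediction = class_names[int(best_model.predict(X_val[:1])[0])]
print(best_gamma, round(best_accuracy, 3), prediction)
```

A single held-out validation split keeps the sketch short; cross-validated accuracy would give a more stable comparison between candidates.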