
Factor analysis approach to classify COVID-19 datasets in several regions.

Mohammad Reza Mahmoudi, Dumitru Baleanu, Shahab S Band, Amir Mosavi.

Abstract

The aim of this research is to investigate the relationships between the counts of Covid-19 cases and the deaths due to the disease in seven countries that have been severely affected by the pandemic. First, Pearson's correlation is used to determine the relationships among these countries. Then, factor analysis is applied to categorize the countries based on these relationships.
© 2021 The Authors.

Keywords:  Coronaviruses; Correlation; Covid-19; Factor analysis

Year:  2021        PMID: 33777669      PMCID: PMC7982653          DOI: 10.1016/j.rinp.2021.104071

Source DB:  PubMed          Journal:  Results Phys        ISSN: 2211-3797            Impact factor:   4.476


Introduction

In the winter months of 2019–2020, a new coronavirus disease, Covid-19, was reported in Wuhan [1]. The virus has severe destructive effects on the respiratory system. From January to the time of writing (April 18, 2020), the outbreak has spread all over the world, and the numbers of Covid-19 cases and deaths are increasing rapidly day by day in most countries [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15]. Many techniques are available to analyze natural phenomena, including artificial intelligence and mathematical and statistical methods such as optimization, deep learning, time series analysis, machine learning, regression modeling, clustering and numerical analysis [16], [17], [18], [19], [20], [21], [22], [23], [24], [25], [26], [27], [28], [29], [30], [31], [32], [33], [34], [35], [36], [37], [38], [39]. Since Covid-19 affects the environment, health, society and the economy, studying the rate of spread of the disease and comparing it across countries is essential. Several studies have addressed the classification of Covid-19 datasets [40], [41], [42], [43]. These studies are based on time series analysis, principal component analysis and fuzzy clustering. The aim of this research is to study the relationships between the counts of Covid-19 cases and the deaths due to the disease in seven countries that have been severely affected by the pandemic. First, correlation coefficients are computed to determine the relationships between these countries. Then, factor analysis is applied to categorize the countries using the counts of cases and deaths.

Material and method

This section describes the research dataset and introduces factor analysis.

Dataset

In this work, the counts of Covid-19 cases and the deaths due to the disease in the United States of America, the United Kingdom, Spain, Italy, Iran, Germany, and France from February 22 to April 18, 2020 are considered [43], [44]. Table 1 summarizes the descriptive statistics of the dataset, namely the mean and the standard deviation. Iran and the United States of America have the lowest and the highest mean counts of cases, respectively. Likewise, Germany and the United States of America have the lowest and the highest mean counts of deaths. The counts and cumulative counts of cases and deaths are plotted in Fig. 1.
Table 1

The mean and standard deviation for the counts of the cases with Covid-19 and the deaths due to this pandemic disease.

| Variable | Country | Mean ± Standard Deviation |
|---|---|---|
| Cases | United States of America | 11900.8 ± 13327.5 |
| | Spain | 3187.6 ± 3016.8 |
| | Italy | 2922.6 ± 2056.1 |
| | Germany | 2329.2 ± 2245.2 |
| | France | 1851.5 ± 1876.0 |
| | United Kingdom | 1842.1 ± 2207.7 |
| | Iran | 1294.5 ± 921.5 |
| Deaths | United States of America | 628.0 ± 1013.4 |
| | Italy | 385.5 ± 309.5 |
| | Spain | 330.1 ± 340.5 |
| | France | 316.6 ± 447.3 |
| | United Kingdom | 247.1 ± 342.0 |
| | Iran | 80.7 ± 55.6 |
| | Germany | 69.7 ± 94.9 |

Fig. 1

Counts of the cases (top left), counts of the deaths (top right), cumulative counts of the cases (bottom left), and cumulative counts of the deaths (bottom right).
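The summary statistics in Table 1 and the cumulative curves in Fig. 1 can be obtained from a series of counts along the following lines. This is only a minimal sketch: the series `daily_counts` is hypothetical (it is not the study's data), and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from itertools import accumulate
from statistics import mean, stdev

# Hypothetical series of counts for one country (not the paper's data).
daily_counts = [2, 4, 6, 8]

avg = mean(daily_counts)                     # mean of the counts
sd = stdev(daily_counts)                     # sample standard deviation (assumed estimator)
cumulative = list(accumulate(daily_counts))  # running total, as in a cumulative plot

print(f"mean ± SD: {avg:.1f} ± {sd:.1f}")
print("cumulative counts:", cumulative)
```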

The relationships between the rates of spread of Covid-19 in these countries have been studied using Pearson's coefficient of correlation. As can be seen in Table 2, nearly all of the values exceed 0.5 (the smallest, 0.449, occurs among the death counts) and all are significant. Consequently, the relationships between the rates of spread of Covid-19 in these countries are positive and, in most cases, strong.
Table 2

Pearson’s coefficient of correlation between the rates of spread of Covid-19.

| Variable | Country | United States of America | Spain | Italy | Germany | United Kingdom | France | Iran |
|---|---|---|---|---|---|---|---|---|
| Patients | France | 1 | 0.586 | 0.850 | 0.766 | 0.851 | 0.856 | 0.673 |
| | Germany | 0.586 | 1 | 0.654 | 0.514 | 0.562 | 0.550 | 0.942 |
| | Iran | 0.850 | 0.654 | 1 | 0.848 | 0.884 | 0.866 | 0.718 |
| | Italy | 0.766 | 0.514 | 0.848 | 1 | 0.842 | 0.879 | 0.567 |
| | Spain | 0.851 | 0.562 | 0.884 | 0.842 | 1 | 0.929 | 0.666 |
| | United Kingdom | 0.856 | 0.550 | 0.866 | 0.879 | 0.929 | 1 | 0.654 |
| | United States of America | 0.673 | 0.942 | 0.718 | 0.567 | 0.666 | 0.654 | 1 |
| Deaths | France | 1 | 0.802 | 0.621 | 0.677 | 0.815 | 0.804 | 0.716 |
| | Germany | 0.802 | 1 | 0.565 | 0.629 | 0.764 | 0.927 | 0.885 |
| | Iran | 0.621 | 0.565 | 1 | 0.934 | 0.794 | 0.555 | 0.449 |
| | Italy | 0.677 | 0.629 | 0.934 | 1 | 0.892 | 0.626 | 0.508 |
| | Spain | 0.815 | 0.764 | 0.794 | 0.892 | 1 | 0.743 | 0.627 |
| | United Kingdom | 0.804 | 0.927 | 0.555 | 0.626 | 0.743 | 1 | 0.889 |
| | United States of America | 0.716 | 0.885 | 0.449 | 0.508 | 0.627 | 0.889 | 1 |

* p-value < 0.001.
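As an illustration of how a correlation matrix of this kind can be computed, the sketch below evaluates Pearson's coefficient for every pair of series. The region names `A`, `B`, `C` and their values are hypothetical, and treating each country's series of counts as the input is an assumption; the text does not specify exactly which series were correlated.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient between two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


def correlation_matrix(series):
    """Pairwise Pearson correlations, returned as a nested dictionary."""
    names = list(series)
    return {a: {b: pearson(series[a], series[b]) for b in names} for a in names}


# Hypothetical counts for three regions (not the paper's data).
counts = {
    "A": [1, 2, 3, 4],
    "B": [1, 3, 2, 4],
    "C": [4, 3, 2, 1],
}

corr = correlation_matrix(counts)
for name, row in corr.items():
    print(name, {other: round(r, 3) for other, r in row.items()})
```

In this toy input, A and B rise together (r = 0.8), whereas C decreases exactly as A increases (r = -1).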

Principles of factor analysis

Factor analysis (FA) is a popular multivariate statistical technique that transforms a set of correlated features into a set of new features, called factors, such that the first few factors carry most of the information in the original dataset [45]. In other words, FA converts a high-dimensional dataset into a lower-dimensional one by retaining as few factors as possible. FA focuses on the correlations among variables: the variables within a factor are highly correlated with each other, whereas variables in different factors are only weakly correlated. In applications, the number of main factors is usually taken as the number of eigenvalues of the correlation matrix that are larger than one. The Kaiser-Meyer-Olkin (KMO) index is used to assess the suitability of FA; a KMO value greater than 0.8 indicates that the data are well suited to FA.

Assume $X = (X_1, \ldots, X_p)'$ is a random vector. Denote $\mu$ and $\Sigma$ as the mean vector and covariance matrix of $X$. The factor analysis model with $m$ factors ($m \le p$) is given by

$$X - \mu = LF + \varepsilon,$$

such that $L = [l_{ij}]_{p \times m}$, $F = (F_1, \ldots, F_m)'$ and $\varepsilon = (\varepsilon_1, \ldots, \varepsilon_p)'$. The matrix $L$ and the vectors $F$ and $\varepsilon$ are called the factor loading matrix, the factors and the errors, respectively. This model can be rewritten as

$$X_i - \mu_i = l_{i1}F_1 + \cdots + l_{im}F_m + \varepsilon_i, \quad i = 1, \ldots, p,$$

such that $l_{ij}$ is called the loading of $X_i$ on the factor $F_j$. In orthogonal factor analysis, we have

$$E(F) = 0, \quad \mathrm{Cov}(F) = I,$$

and

$$E(\varepsilon) = 0, \quad \mathrm{Cov}(\varepsilon) = \Psi, \quad \mathrm{Cov}(\varepsilon, F) = 0,$$

where $\Psi = \mathrm{diag}(\psi_1, \ldots, \psi_p)$. Consequently,

$$\Sigma = LL' + \Psi,$$

and $h_i^2 = l_{i1}^2 + \cdots + l_{im}^2$ determines the portion of $\mathrm{Var}(X_i)$ that can be explained by the factors $F_1$, …, $F_m$.

The main aim of factor analysis is to find the values of the loadings. To compute the matrices $L$ and $\Psi$, different approaches such as principal component and maximum likelihood can be applied. The principal component approach uses the eigenvalues and eigenvectors of the matrix $\Sigma$ to decompose it and find the matrix $L$. The maximum likelihood approach computes the likelihood and then optimizes it to find the matrices $L$ and $\Psi$. Once the loading values are estimated, loading plots can be considered.

Loading plots can be used to:
- Study the correlations between variables
- Categorize and classify the variables
- Detect the number of factors

In a loading plot, the angle between two variables ($\theta$) determines the correlation ($\rho$) between them. An angle of $\theta = 90°$ indicates that the two variables are uncorrelated ($\rho = 0$). The cases $\theta = 0°$ and $\theta = 180°$ correspond to exact positive and exact negative linear relationships, respectively.
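The principal component approach and the reading of a loading plot can be sketched as follows. This is an illustrative example under stated assumptions, not the authors' implementation: the 4 × 4 correlation matrix and the variables A to D are hypothetical, the helper names `pc_factor_loadings` and `loading_angle_deg` are introduced here, and grouping variables by the sign of their second loading is one simple convention assumed for illustration. The angle between two loading vectors is used as a proxy for their correlation, since its cosine approximates the correlation when the retained factors explain most of the variance.

```python
import numpy as np


def pc_factor_loadings(corr):
    """Principal component estimate of the loading matrix L.

    Factors are retained when their eigenvalue exceeds one.
    """
    eigvals, eigvecs = np.linalg.eigh(corr)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # re-sort in descending order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0
    # Column j of L is sqrt(lambda_j) times the j-th eigenvector.
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    return eigvals, loadings


def loading_angle_deg(loadings, i, j):
    """Angle in degrees between the loading vectors of variables i and j."""
    a, b = loadings[i], loadings[j]
    cos_theta = (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0))))


# Hypothetical correlation matrix for variables A, B, C, D (not the paper's data):
# A-B and C-D correlate at 0.8, every cross pair at 0.3.
corr = np.array([
    [1.0, 0.8, 0.3, 0.3],
    [0.8, 1.0, 0.3, 0.3],
    [0.3, 0.3, 1.0, 0.8],
    [0.3, 0.3, 0.8, 1.0],
])

eigvals, loadings = pc_factor_loadings(corr)
communalities = (loadings ** 2).sum(axis=1)   # h_i^2 for each variable
# Assumed convention: variables whose second loading shares a sign form one group.
groups = np.sign(loadings[:, 1])

print("eigenvalues:", np.round(eigvals, 3))
print("loadings:\n", np.round(loadings, 3))
print("communalities:", np.round(communalities, 3))
print("angle A-B (deg):", round(loading_angle_deg(loadings, 0, 1), 2))
print("angle A-C (deg):", round(loading_angle_deg(loadings, 0, 2), 2))
```

For this matrix two eigenvalues exceed one, so two factors are retained. A and B point in the same direction in the loading plane, while A and C are separated by a wide angle, which places {A, B} and {C, D} in different groups.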

Results

This section reports the results of applying FA to classify the countries based on the research variables. The number of main factors was taken as the number of eigenvalues of the correlation matrix that are larger than one. Moreover, the KMO values were all greater than 0.8, which indicates that the data are suitable for FA.
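For reference, the KMO index can be computed from a correlation matrix as the ratio of the squared correlations to the sum of the squared correlations and the squared partial correlations, taken over all pairs of distinct variables. The sketch below follows that standard definition; the function name `kmo_index` and the 3 × 3 example matrix are hypothetical and do not come from the study.

```python
import numpy as np


def kmo_index(corr):
    """Overall Kaiser-Meyer-Olkin index of a correlation matrix."""
    inv = np.linalg.inv(corr)
    scale = np.sqrt(np.diag(inv))
    # Partial correlations given all other variables, from the inverse matrix.
    partial = -inv / np.outer(scale, scale)
    off_diag = ~np.eye(corr.shape[0], dtype=bool)
    r2 = (corr[off_diag] ** 2).sum()
    p2 = (partial[off_diag] ** 2).sum()
    return float(r2 / (r2 + p2))


# Hypothetical example: three variables, every pair correlated at 0.5.
example = np.array([
    [1.0, 0.5, 0.5],
    [0.5, 1.0, 0.5],
    [0.5, 0.5, 1.0],
])

kmo = kmo_index(example)
print(f"KMO = {kmo:.3f}")
```

Values close to one mean that the partial correlations are small relative to the ordinary correlations, which is the situation in which common factors are plausible.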

Counts of cases with Covid-19

The results of applying FA to categorize the countries on the basis of the counts of the cases with Covid-19 are provided in Fig. 2. The outputs demonstrate the statistical differences between the relationships among the countries, and the countries can be categorized into the following classes:
First class: Iran, France, Spain, Germany and Italy.
Second class: United Kingdom and United States of America.
Fig. 2

FA technique to categorize the countries on the basis of the counts of the cases with Covid-19.

Counts of the deaths due to Covid-19

The results of applying FA to categorize the countries on the basis of the counts of the deaths due to Covid-19 are provided in Fig. 3. The outputs demonstrate the statistical differences between the relationships among the countries, and the countries can be categorized into the following classes:
First class: France, United Kingdom, Germany and United States of America.
Second class: Iran, Italy and Spain.
Fig. 3

FA technique to categorize the countries on the basis of the counts of the deaths due to Covid-19.

Cumulative counts of the cases with Covid-19

The results of applying FA to categorize the countries on the basis of the cumulative counts of the cases with Covid-19 are provided in Fig. 4. The outputs demonstrate the statistical differences between the relationships among the countries, and the countries can be categorized into the following classes:
First class: France, Spain, Germany, Iran and Italy.
Second class: United Kingdom and United States of America.
Fig. 4

FA technique to categorize the countries on the basis of the cumulative counts of the cases with Covid-19.

Cumulative counts of the deaths due to Covid-19

The results of applying FA to categorize the countries on the basis of the cumulative counts of the deaths due to Covid-19 are provided in Fig. 5. The outputs demonstrate the statistical differences between the relationships among the countries, and the countries can be categorized into the following classes:
First class: France, United Kingdom, Germany and United States of America.
Second class: Iran, Italy and Spain.
Fig. 5

FA technique to categorize the countries on the basis of the cumulative counts of the deaths due to Covid-19.

Conclusion

Since Covid-19 affects the environment, health, society and the economy, studying the rate of spread of the disease and comparing it across countries is essential. The aim of this research was to study the Covid-19 cases and the deaths due to the disease in seven countries that have been severely affected by the pandemic. The cases and the deaths in the United States of America, the United Kingdom, Spain, Italy, Iran, Germany, and France from February 22 to April 18, 2020 were considered. First, correlation coefficients were computed to determine the relationships among these countries. The outputs showed positive and generally strong relationships between the rates of spread in all of the countries. Then, factor analysis was applied to categorize the countries on the basis of the counts of cases and deaths. For the cases, the United Kingdom and the United States of America were similar to each other and differed from the other countries. For the deaths, Iran, Italy and Spain were similar to each other and differed from the other countries. For future work, the authors suggest classifying the Covid-19 datasets of more regions using FA, or applying the technique to classify regions for other epidemic or pandemic diseases.

CRediT authorship contribution statement

Mohammad Reza Mahmoudi: Conceptualization, Investigation, Data curation, Validation, Methodology, Software, Writing - original draft. Dumitru Baleanu: Conceptualization, Supervision, Visualization, Writing - review & editing. Shahab S. Band: Validation, Visualization, Software, Writing - review & editing. Amir Mosavi: Visualization, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
  20 in total

1.  Modeling caffeine adsorption by multi-walled carbon nanotubes using multiple polynomial regression with interaction effects.

Authors:  Mehdi Bahrami; Mohammad Javad Amiri; Mohammad Reza Mahmoudi; Sara Koochaki
Journal:  J Water Health       Date:  2017-08       Impact factor: 1.744

2.  Covid-19 and the Stiff Upper Lip - The Pandemic Response in the United Kingdom.

Authors:  David J Hunter
Journal:  N Engl J Med       Date:  2020-03-20       Impact factor: 91.245

3.  Time series modelling to forecast the confirmed and recovered cases of COVID-19.

Authors:  Mohsen Maleki; Mohammad Reza Mahmoudi; Darren Wraith; Kim-Hung Pho
Journal:  Travel Med Infect Dis       Date:  2020-05-13       Impact factor: 6.211

4.  Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding.

Authors:  Roujian Lu; Xiang Zhao; Juan Li; Peihua Niu; Bo Yang; Honglong Wu; Wenling Wang; Hao Song; Baoying Huang; Na Zhu; Yuhai Bi; Xuejun Ma; Faxian Zhan; Liang Wang; Tao Hu; Hong Zhou; Zhenhong Hu; Weimin Zhou; Li Zhao; Jing Chen; Yao Meng; Ji Wang; Yang Lin; Jianying Yuan; Zhihao Xie; Jinmin Ma; William J Liu; Dayan Wang; Wenbo Xu; Edward C Holmes; George F Gao; Guizhen Wu; Weijun Chen; Weifeng Shi; Wenjie Tan
Journal:  Lancet       Date:  2020-01-30       Impact factor: 79.321

5.  Transmission of 2019-nCoV Infection from an Asymptomatic Contact in Germany.

Authors:  Camilla Rothe; Mirjam Schunk; Peter Sothmann; Gisela Bretzel; Guenter Froeschl; Claudia Wallrauch; Thorbjörn Zimmer; Verena Thiel; Christian Janke; Wolfgang Guggemos; Michael Seilmaier; Christian Drosten; Patrick Vollmar; Katrin Zwirglmaier; Sabine Zange; Roman Wölfel; Michael Hoelscher
Journal:  N Engl J Med       Date:  2020-01-30       Impact factor: 91.245

6.  First cases of coronavirus disease 2019 (COVID-19) in France: surveillance, investigations and control measures, January 2020.

Authors:  Sibylle Bernard Stoecklin; Patrick Rolland; Yassoungo Silue; Alexandra Mailles; Christine Campese; Anne Simondon; Matthieu Mechain; Laure Meurice; Mathieu Nguyen; Clément Bassi; Estelle Yamani; Sylvie Behillil; Sophie Ismael; Duc Nguyen; Denis Malvy; François Xavier Lescure; Scarlett Georges; Clément Lazarus; Anouk Tabaï; Morgane Stempfelet; Vincent Enouf; Bruno Coignard; Daniel Levy-Bruhl
Journal:  Euro Surveill       Date:  2020-02

7.  COVID-19 battle during the toughest sanctions against Iran.

Authors:  Amirhossein Takian; Azam Raoofi; Sara Kazempour-Ardebili
Journal:  Lancet       Date:  2020-03-18       Impact factor: 79.321

8.  Rapid viral diagnosis and ambulatory management of suspected COVID-19 cases presenting at the infectious diseases referral hospital in Marseille, France, - January 31st to March 1st, 2020: A respiratory virus snapshot.

Authors:  Sophie Amrane; Hervé Tissot-Dupont; Barbara Doudier; Carole Eldin; Marie Hocquart; Morgane Mailhe; Pierre Dudouet; Etienne Ormières; Lucie Ailhaud; Philippe Parola; Jean-Christophe Lagier; Philippe Brouqui; Christine Zandotti; Laetitia Ninove; Léa Luciani; Céline Boschi; Bernard La Scola; Didier Raoult; Matthieu Million; Philippe Colson; Philippe Gautret
Journal:  Travel Med Infect Dis       Date:  2020-03-20       Impact factor: 6.211

Review 9.  COVID-19 and Italy: what next?

Authors:  Andrea Remuzzi; Giuseppe Remuzzi
Journal:  Lancet       Date:  2020-03-13       Impact factor: 79.321

  2 in total

1.  Machine Learning-Based Prediction Models for Depression Symptoms Among Chinese Healthcare Workers During the Early COVID-19 Outbreak in 2020: A Cross-Sectional Study.

Authors:  Zhaohe Zhou; Dan Luo; Bing Xiang Yang; Zhongchun Liu
Journal:  Front Psychiatry       Date:  2022-04-29       Impact factor: 5.435

2.  Application of principal component analysis on temporal evolution of COVID-19.

Authors:  Ashadun Nobi; Kamrul Hasan Tuhin; Jae Woo Lee
Journal:  PLoS One       Date:  2021-12-02       Impact factor: 3.240
