
Transmission characteristic and dynamic analysis of COVID-19 on contact network with Tianjin city in China.

Mingtao Li¹, Jin Cui¹, Juan Zhang², Xin Pei¹, Guiquan Sun^2,3.

Abstract

The outbreak of 2019 novel coronavirus pneumonia (COVID-19) has had a profound impact on people's lives around the world, and the spread of COVID-19 between individuals was mainly caused by contact transmission through social networks. To analyze the network transmission of COVID-19, we constructed a case contact network using the available contact data of 136 early diagnosed cases in Tianjin. Based on this network, we first analyzed its structural characteristics and then analyzed the centrality of the nodes to find the key nodes. In addition, since the constructed network may contain missing edges and false edges, link prediction algorithms were used to reconstruct the network. Finally, to understand the spread of COVID-19 in the network, an individual-based susceptible-latent-exposed-infected-recover (SLEIR) model was established and simulated on the network. The results showed that the peak disease scale caused by the nodes with the highest centrality is larger, and that reducing the contact infection rate of infected persons during the incubation period has a greater impact on the peak disease scale.


Keywords: COVID-19; Centrality; Contact network; Link prediction; Simulated propagation

Year: 2022 PMID: 36267652 PMCID: PMC9561412 DOI: 10.1016/j.physa.2022.128246

Source DB: PubMed Journal: Physica A ISSN: 0378-4371 Impact factor: 3.778

Introduction

The outbreak of 2019 novel coronavirus pneumonia (COVID-19), caused by the coronavirus SARS-CoV-2, has had a profound impact on people's lives around the world. Since the end of 2019, the spread of COVID-19 has brought enormous challenges to public health, and how to control the epidemic effectively and quickly has become an urgent problem. The launch of multiple vaccines has brought hope to people's lives [1]. However, while vaccines are being developed, the virus is constantly evolving and mutating [2], which exposes the world to potentially huge risks. Since the outbreak, China has adopted various non-drug interventions (closure of cities and villages, tracking and quarantine, etc.) to control the spread of the epidemic. These measures brought the epidemic in parts of China other than Hubei Province basically under control by early to mid-March 2020. Because the travel trajectory [3] of each confirmed case was announced in the early stage of the epidemic in China, the contacts of a case can be identified from its travel trajectory. This yields an almost complete case contact network, which provides a more realistic network for studying the spread of COVID-19. Studying the effectiveness of the early prevention and control strategies can provide more reasonable suggestions for China's current epidemic prevention and control.

Many studies have used mathematical models to assess the effectiveness of prevention and control strategies for COVID-19 [4], [5], [6], [7], [8], [9]. Tang et al. [6] devised an SEIR model to estimate the transmission risk of COVID-19 and showed the effectiveness of intensive contact tracing followed by quarantine and isolation in mainland China. Li et al. [7] developed an SEIQR difference-equation model of COVID-19 in Shanxi province that takes into account transmission with discrete-time imported cases. Sun et al. [8] presented a dynamical model to show the propagation of COVID-19 in Wuhan and the effects of lock-down and medical resources. Li et al. [9] developed a differential equation model with a tracing and isolation strategy for close contacts of newly confirmed cases and discrete-time imported cases, and used it to make predictions and perform assessment and risk analysis for COVID-19 outbreaks in Tianjin and Chongqing. In addition, many scholars have studied individual-based network dynamics models. Xu et al. [10] established an individual-based transmission model to simulate the impact of different levels of non-drug interventions on controlling the spread of COVID-19. Their results showed that isolating the infected and their first-level close contacts could not effectively control the spread of the disease, whereas also isolating the second-level close contacts can reduce the size of the outbreak. Firth et al. [11] simulated prevention and control strategies for COVID-19 in a real social network; their results showed that contact tracing reduced the outbreak scale, and that maintaining social distancing while adopting the tracing and isolation strategy was very effective for epidemic control. Therefore, understanding and studying a detailed case contact network can reveal the main factors influencing COVID-19 outbreaks.

Using information published by the Health Commission of Tianjin [12], we first constructed a case contact network data set based on the 136 confirmed COVID-19 patients. The detailed information includes the cumulative and daily laboratory-confirmed cases and the life tracks of these cases. We then analyzed the topological features of this contact network and used link prediction algorithms to improve the original case contact network. Finally, we established a susceptible-latent-exposed-infected-recover (SLEIR) transmission model based on individual transmission to study the influence of the central nodes on COVID-19 transmission.

The data and network

The data

The first confirmed COVID-19 case appeared in Tianjin on January 21, 2020 [12]. Once a confirmed case is found, epidemiological investigators trace the case's movements and close contacts, and statistical analysis of these data then gives the relevant departments and governments a scientific basis for preventing and controlling the epidemic. The cumulative number of confirmed cases in Tianjin had reached 136 by March 14, 2020. Based on the diagnosis time and residence area of the 136 confirmed cases, the chart of daily newly confirmed cases and the map of confirmed cases in Tianjin are shown in Fig. 1. The time series of cases in Tianjin (Fig. 1(a)) shows that the daily newly confirmed cases fluctuated greatly after January 31, 2020, which could be caused by the department store incident in Baodi District. Ignoring the 6 confirmed cases from outside Tianjin, we found that Baodi District had the most cumulative confirmed cases, accounting for 46% of the cases in Tianjin, followed by Hedong District and Hebei District, accounting for 12% and 9%, respectively. Fig. 1(b) likewise shows that Baodi District had the largest cumulative number of confirmed cases in the early stage of the epidemic in Tianjin, followed by Hedong District and Hebei District, which is consistent with our calculation.

Fig. 1

Chart of daily newly confirmed cases and the map of confirmed cases in Tianjin. (a) New daily cases. (b) The map of confirmed cases. The darker the color, the more cases.

The network

From the Municipal Health Commission of Tianjin [12], we obtained detailed information on the confirmed cases, such as gender, age, place of residence, date of onset, date of diagnosis, travel track, and history of residence or travel in Wuhan. Since the early stage of COVID-19 is characterized by contact transmission, and the existing data contain the contact information of confirmed cases, the contact data can be used to construct a case contact network on which to study the spread of COVID-19. However, the transmission direction of the disease cannot be determined simply from the date of diagnosis, because the incubation period varies from person to person. For example, the average incubation periods reported in [13] and [14] are 5.2 days and 4 days, respectively. Therefore, the network studied in this paper is an undirected network.

We then built the case contact network. First, case nodes were numbered s1, s2, ..., s136 according to the order in which the cases were confirmed. Second, we used the contact information to build links between nodes: if there is contact between two cases, there exists a link between them. Sorting the existing data in this way gives a network with 148 links, shown in Fig. 2 (there are 15 isolated nodes in the constructed network). Table 1 shows some properties of the constructed network, computed using the method of [15]. The values show that this network is sparse. Because of the lack of information, there are many isolated nodes in the network, so the properties reflected by the average degree, average path length and clustering coefficient are not discussed here. However, the network diameter (10) is more than twice the average path length (about 4.2), which means that when the disease spreads in the network, the longest transmission chain can involve more than twice as many people as a chain of average length.

Fig. 2

Contact network of COVID-19 cases in Tianjin. The larger a circular node, the greater its degree. Red square nodes are isolated nodes (imported cases without contact information). Network diameter: 10.

Table 1

Properties of the case network.

Network properties	Formula	Meaning	Value
Density	$\rho = \frac{M}{\frac{1}{2}N(N-1)}$	the sparsity of the network	0.016122
Average degree	$\langle k \rangle = \frac{2M}{N}$	the average contact number of nodes	2
Average path length	$L = \frac{1}{GE}$, $GE = \frac{1}{\frac{1}{2}N(N-1)} \sum_{i>j} d_{ij}$	the average distance between any two nodes	4.203591
Clustering coefficient	$C = \frac{1}{N} \sum_{i=1}^{N} C_i$, $C_i = \frac{2E_i}{k_i(k_i-1)}$	mean clustering coefficient of all nodes in the network	0.163142
Diameter	$D = \max_{i,j} d_{ij}$	the maximum distance between any two nodes	10
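The density and average degree in Table 1 can be checked directly from the node and link counts given in the text (136 nodes, 148 links). The following is a minimal sketch rather than the authors' code: the helper names and the toy graph are hypothetical, and the diameter is taken over reachable node pairs only, which is an assumption about how isolated nodes are handled.

```python
from collections import deque


def density(n_nodes: int, n_links: int) -> float:
    """Density rho = M / (N(N-1)/2) of an undirected simple graph."""
    return n_links / (0.5 * n_nodes * (n_nodes - 1))


def average_degree(n_nodes: int, n_links: int) -> float:
    """Average degree <k> = 2M / N."""
    return 2 * n_links / n_nodes


def bfs_distances(adj: dict, source) -> dict:
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def diameter(adj: dict) -> int:
    """Largest finite shortest-path distance, D = max d_ij."""
    return max(max(bfs_distances(adj, s).values()) for s in adj)


# Counts reported in the text: N = 136 cases, M = 148 links.
rho = density(136, 148)
k_mean = average_degree(136, 148)

# Hypothetical toy graph (not the Tianjin data): a 5-node path plus one isolated node.
toy = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4], 6: []}
toy_diameter = diameter(toy)
```

With these inputs the density matches the tabulated 0.016122, and the average degree is about 2.18, which Table 1 lists as 2.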

The degree and cumulative degree distribution of the network

To understand the characteristics of the case contact network, we then studied its degree distribution. The degree distribution is the proportion of nodes with degree $k$ among all nodes in the network. The degree distribution of the network is shown in Fig. 3(a).

Fig. 3

The degree distribution and cumulative degree distribution of the case contact network. Here $k$ is the degree of a node; the degree distribution is the proportion of nodes with degree $k$ among all nodes in the network, and the cumulative degree distribution is the proportion of nodes whose degree is greater than or equal to $k$. (a) Degree distribution. (b) The cumulative degree distribution.

Fig. 3(a) shows that nodes with degree 1 are the most numerous in the case contact network, followed by nodes with degree 2, degree 3 and degree 0, respectively. The network is characterized by a majority of small-degree nodes and a few large-degree nodes. Ignoring nodes with degree 0, the degree distribution roughly follows a long-tail distribution. Because the degree distribution of the network is irregular, the cumulative degree distribution is introduced to analyze the network; it gives the probability that the degree of a node is greater than or equal to $k$. The cumulative degree distribution of the case contact network is shown in Fig. 3(b); it roughly follows a power-law distribution whose power index [16] is equal to 1.403538. The cumulative degree distribution first decreases rapidly, which is caused by the many small-degree nodes in the network. The results indicate that the node with degree 28 (s37) is dominant in the case contact network: its 28 links account for 19% of the 148 links in the network.
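The two distributions described above can be computed from a list of node degrees. This is an illustrative sketch only: the function names and the toy degree sequence are hypothetical and are not the Tianjin data; the last line reproduces the 28/148 share quoted in the text.

```python
from collections import Counter


def degree_distribution(degrees):
    """Return {k: fraction of nodes whose degree is exactly k}."""
    n = len(degrees)
    counts = Counter(degrees)
    return {k: counts[k] / n for k in sorted(counts)}


def cumulative_degree_distribution(degrees):
    """Return {k: fraction of nodes whose degree is >= k}."""
    n = len(degrees)
    return {
        k: sum(1 for d in degrees if d >= k) / n
        for k in sorted(set(degrees))
    }


# Hypothetical degree sequence: many small-degree nodes and a single hub.
toy_degrees = [0, 1, 1, 1, 1, 2, 2, 3, 3, 6]
p = degree_distribution(toy_degrees)
p_cum = cumulative_degree_distribution(toy_degrees)

# Share of all links attached to the degree-28 node in a 148-link network.
hub_share = 28 / 148
```

Plotting `p_cum` on log-log axes is the usual way to judge by eye whether a power law is plausible; fitting the exponent is a separate step not shown here.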

Network centrality index and centrality analysis

To find the nodes that have the greatest influence on the established case contact network, centrality indices are introduced. Centrality measures quantify the importance of nodes in a network. In this study, degree centrality, betweenness centrality, closeness centrality and eigenvector centrality were used to measure the importance of nodes; their formulas are shown in Table 2.

Table 2

Centrality measures.

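Centrality measures	Formula
Degree centrality [17], [18]	$C_D(i) = \sum_{j=1}^{N} a_{ij}$
Betweenness centrality [18]	$C_B(i) = \sum_{s \neq i \neq t} \frac{n_{st}^{i}}{g_{st}}$
Closeness centrality [19]	$C_C(i) = \frac{N-1}{\sum_{j=1, i \neq j}^{N} d_{ij}}$
Eigenvector centrality [20]	$C_E(i) = c \sum_{j=1}^{N} a_{ij} x_j$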
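As an illustration of two of the formulas in Table 2, the sketch below computes degree centrality and closeness centrality on a small connected graph. The graph and function names are hypothetical, and the closeness formula is applied as written in the table, which presumes every node can reach every other node.

```python
from collections import deque


def degree_centrality(adj):
    """C_D(i) = sum_j a_ij, i.e. the number of neighbours of node i."""
    return {i: len(nbrs) for i, nbrs in adj.items()}


def closeness_centrality(adj):
    """C_C(i) = (N - 1) / sum_{j != i} d_ij, for a connected graph."""
    n = len(adj)
    result = {}
    for i in adj:
        # Breadth-first search gives hop distances d_ij from node i.
        dist = {i: 0}
        queue = deque([i])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        result[i] = (n - 1) / sum(dist.values())
    return result


# Hypothetical toy network: hub "a" with three contacts, one of which has a further contact.
toy = {
    "a": ["b", "c", "d"],
    "b": ["a"],
    "c": ["a"],
    "d": ["a", "e"],
    "e": ["d"],
}
c_d = degree_centrality(toy)
c_c = closeness_centrality(toy)
top_by_degree = max(c_d, key=c_d.get)
top_by_closeness = max(c_c, key=c_c.get)
```

In this toy graph the same node ranks first under both measures, mirroring the situation in which one case tops several centrality rankings.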

Table 3 lists the top 10 nodes for the four types of centrality. The top-ranked node is the same (s37) for all four kinds of centrality, which indicates that the role of this case in the case contact network cannot be ignored. In addition, nodes s34 and s43 frequently appear among the top ten nodes of the four centrality rankings, which is consistent with the existing data: both are related to the department store in Baodi District. That is to say, s34, s37 and s43 played an important role in the epidemic in Baodi District. They were all sales staff of the department store in Baodi District and greatly affected the spread of COVID-19 in Tianjin. The top three nodes of each centrality are marked in the network diagrams in Fig. 4.

Table 3

Top ten nodes by degree, betweenness, closeness, eigenvector.

Rank	CD	CB	CC	CE	Rank	CD	CB	CC	CE
1	s37	s37	s37	s37	6	s5	s127	s119	s124
2	s6	s34	s34	s43	7	s103	s43	s85	s85
3	s43	s20	s43	s34	8	s119	s5	s122	s99
4	s34	s6	s20	s48	9	s2	s93	s48	s53
5	s93	s119	s87	s128	10	s8	s85	s51	s51

Fig. 4

The top three nodes of each centrality in the network. The larger the red node, the higher its centrality rank. (a) Degree centrality. (b) Betweenness centrality. (c) Closeness centrality. (d) Eigenvector centrality.

The regions of the top 10 nodes of the four kinds of centrality are analyzed below. Fig. 5 shows the geographical distribution of the top 10 nodes of each centrality in Tianjin. Most of the top ten nodes belong to Baodi District, which is consistent with the fact that Baodi District had the largest cluster of the COVID-19 epidemic in Tianjin. Both Hedong District and Hebei District appear in Fig. 5(a) and Fig. 5(b). This suggests that cases in Hedong District and Hebei District were influential in the development of the COVID-19 outbreak in Tianjin. The reason is the workplace of the first confirmed case in Hedong District, who worked in the Tianjin bullet train section, traveled to Wuhan on business, and then infected colleagues living in Hedong District and Hebei District. This is consistent with the statistical data of Tianjin, namely that Baodi District has the most cases, followed by Hedong District and Hebei District (Fig. 1(b)).

Fig. 5

The geographical distribution of the top ten nodes of centrality in each district of Tianjin. The darker the color of the region, the greater the proportion of the top ten nodes belonging to that region. (a) Degree centrality. (b) Betweenness centrality. (c) Closeness centrality. (d) Eigenvector centrality.


Link prediction

Introduction to link prediction

The established case contact network may contain false links and missing links, so link prediction algorithms are needed to improve the network. The basic idea of link prediction [21] is to assign a score to each pair of unlinked nodes in an undirected network with $N$ nodes and $M$ links, and then sort all unconnected node pairs by this score in descending order; the node pairs at the top are the most likely to be linked. To test the accuracy of a link prediction algorithm, the set of known links in the network is usually divided into a training set and a testing set. Only the information in the training set is used in the calculation, and any possible link between a pair of nodes that does not belong to the existing link set is called a non-existent link. Two commonly used metrics for the accuracy of link prediction algorithms are AUC and precision; AUC is used here. AUC measures the accuracy of the algorithm as a whole and can be understood as the probability that the score of a link in the testing set is higher than the score of a randomly selected non-existent link [21]. In other words, an edge is chosen at random from the testing set and compared with a randomly chosen non-existent edge: if the score of the testing edge is greater, 1 point is added; if the scores are equal, 0.5 points are added. If, among $n$ independent comparisons, the testing edge has the higher score $n'$ times and the scores are equal $n''$ times, then AUC is defined as

$$\mathrm{AUC} = \frac{n' + 0.5\,n''}{n}.$$

The larger the AUC value, the more accurate the prediction. Since few case attributes are available in the case contact network, node similarity indices based on local information are adopted here to make link predictions for the existing network. The CN, AA and RA indices are used in this paper; their formulas are shown in Table 4.
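The AUC scoring rule described above (1 point when the testing link scores higher, 0.5 points for a tie) can be sketched as follows. This is an illustration under stated assumptions: the function name and the scores are hypothetical, and for determinism it compares every testing score with every non-existent score instead of drawing random pairs.

```python
def auc(test_scores, nonexistent_scores):
    """AUC = (n' + 0.5 * n'') / n over all test / non-existent score pairs."""
    higher = 0  # n': comparisons where the testing link scores strictly higher
    equal = 0   # n'': comparisons where the two scores tie
    total = 0   # n: number of comparisons made
    for s_test in test_scores:
        for s_non in nonexistent_scores:
            total += 1
            if s_test > s_non:
                higher += 1
            elif s_test == s_non:
                equal += 1
    return (higher + 0.5 * equal) / total


# Hypothetical similarity scores, for illustration only.
example_auc = auc([3, 2, 1], [1, 0, 0, 2])
tie_auc = auc([1], [1])
```

A value of 0.5 corresponds to scores that carry no information, which is why values such as those around 0.85 indicate predictions well above chance.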

Table 4

Index of link prediction.

Index	Formula
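CN [22]	$s_{xy} = |\Gamma(x) \cap \Gamma(y)|$
AA [23]	$s_{xy} = \sum_{z \in \Gamma(x) \cap \Gamma(y)} \frac{1}{\log k_z}$
RA [24]	$s_{xy} = \sum_{z \in \Gamma(x) \cap \Gamma(y)} \frac{1}{k_z}$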
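The three local similarity indices in Table 4 differ only in how each common neighbour is weighted. The sketch below evaluates them for one unlinked node pair; the toy adjacency list and function names are hypothetical, and the natural logarithm is assumed for the AA index.

```python
import math


def common_neighbors(adj, x, y):
    """Nodes adjacent to both x and y, i.e. Gamma(x) intersected with Gamma(y)."""
    return set(adj[x]) & set(adj[y])


def cn_index(adj, x, y):
    """Common Neighbor: the number of shared neighbours."""
    return len(common_neighbors(adj, x, y))


def aa_index(adj, x, y):
    """Adamic-Adar: each shared neighbour z weighted by 1 / log(k_z)."""
    return sum(1 / math.log(len(adj[z])) for z in common_neighbors(adj, x, y))


def ra_index(adj, x, y):
    """Resource Allocation: each shared neighbour z weighted by 1 / k_z."""
    return sum(1 / len(adj[z]) for z in common_neighbors(adj, x, y))


# Hypothetical toy network; nodes 2 and 4 are not linked but share neighbours 1 and 3.
toy = {
    1: [2, 3, 4],
    2: [1, 3],
    3: [1, 2, 4],
    4: [1, 3],
    5: [],
}
cn_24 = cn_index(toy, 2, 4)
aa_24 = aa_index(toy, 2, 4)
ra_24 = ra_index(toy, 2, 4)
```

A shared neighbour always has degree at least 2, so the logarithm in the AA index is never zero. RA penalizes high-degree shared neighbours more strongly than AA does.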


Data processing of case contact network

Because the constructed case contact network contains many isolated nodes, the following assumptions are made: (1) isolated nodes are linked according to the region where they are located, that is, each is assumed to have had contact with all confirmed cases in the same region; (2) when the isolated node is the only case in its region, it is assumed to have had contact with the confirmed cases in the nearest region. Under these assumptions, a case contact network with 136 nodes and 197 edges was constructed, as shown in Fig. 6.

Fig. 6

Optimized case contact network, which has 136 nodes and 197 links. Red nodes are the nodes that were isolated in the network shown in Fig. 2.

Reconstructed network after link prediction

The optimized case contact network is used for link prediction, with the known links split into a training set and a testing set at a fixed ratio. The networks reconstructed using the Common Neighbor (CN), Adamic-Adar (AA) and Resource Allocation (RA) indices are shown in Fig. 7.

Fig. 7

Reconstructed network. The thickness of the network edges differs because the edge weights differ between link prediction indices. The larger a circular node, the greater its degree. Red square nodes are isolated nodes. (a) Common Neighbor (CN). (b) Adamic-Adar (AA). (c) Resource Allocation (RA).

The reconstructed networks predicted by the three indices all contain two isolated nodes. According to the existing data, these two isolated nodes are imported cases, so this paper does not analyze whether they had contact with other cases. The AUC values are used to analyze the prediction accuracy of the three indices. After simulations, the mean AUC values of the three indices are obtained, as shown in Table 5.

Table 5

AUC values corresponding to CN, AA and RA index.

Index	CN	AA	RA
AUC	0.8519	0.8608	0.8609

It can be concluded that the prediction accuracy of the RA index is the highest. Therefore, the network predicted by the RA index was compared with the original case contact network. Through link prediction, the number of links in the network changes from 197 to 584. It is speculated that the extra links are mostly contacts between family members and colleagues that were not counted. The properties of the reconstructed network are shown in Table 6. The reconstructed network has a small average path length and a large clustering coefficient, so the network after link prediction has the small-world effect [25], [26], which indicates that the spread of COVID-19 is highly clustered. After link prediction, the network diameter did not decrease significantly (from 10 to 9). This means that when the number of network links is increased by link prediction, the transmission chain may not become shorter, which may be related to the Baodi District department store incident, i.e. infection of staff, infection of customers, infection of customers' families, etc.

Table 6

Properties of the reconstructed network based on the link prediction results of RA index.

Network properties(link prediction results of RA index)	Value
Density	0.06361656
Average degree	8
Average path length	3.262306
Clustering coefficient	0.8294807
Diameter	9
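As a quick consistency check on Table 6 (a worked calculation, assuming the reconstructed network keeps $N = 136$ nodes and has the $M = 584$ links stated in the text), the density and average degree formulas of Table 1 give:

```latex
\rho = \frac{M}{\tfrac{1}{2}N(N-1)}
     = \frac{584}{\tfrac{1}{2}\cdot 136 \cdot 135}
     = \frac{584}{9180} \approx 0.063617,
\qquad
\langle k \rangle = \frac{2M}{N} = \frac{2 \cdot 584}{136} \approx 8.59 .
```

The density agrees with the tabulated 0.06361656; the average degree is listed in Table 6 as 8.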


Network dynamics analysis based on individual model

To study the influence of the central nodes on COVID-19 transmission in the network, a susceptible-latent-exposed-infected-recover (SLEIR) transmission model based on individual transmission was established using probabilistic discrete-time Markov chains [27], [28].

Transmission model

In our model, $S_i(t)$, $L_i(t)$, $E_i(t)$, $I_i(t)$ and $R_i(t)$ represent the probabilities of node $i$ being in the susceptible state, the non-infectious state during the incubation period, the infectious state during the incubation period, the onset state and the recovered state at time $t$, respectively. A susceptible individual can be infected by exposed or infected individuals, after which it becomes a latent individual; separate transmission rate coefficients apply to the exposed period and to the infectious period after onset. The transmission from exposed and infected individuals acts along the links of the network, which are encoded in its adjacency matrix. From these terms we calculate the probability that a susceptible individual is not infected by any of its exposed or infected neighbors at time $t$; its complement is the probability that the individual is infected by its exposed or infected neighbors at time $t$, which gives the transmission rate of the model. Latent individuals move to the exposed period, exposed individuals move to the symptomatic infectious period, and infectious individuals recover, each at its own transfer rate. Together these transitions form the full Markov chain model. According to the model, the possible scale of infected persons at time $t$ is obtained from the probabilities $I_i(t)$ of the individual nodes.
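The model equations are not reproduced in this text, so the following is only a sketch of a standard individual-based discrete-time Markov chain with the five states described above; it should not be read as the authors' exact formulation. All symbol names are assumptions: `beta_e` and `beta_i` for the two transmission rates, `sigma`, `delta` and `gamma` for the three transfer rates, and the sum of the onset-state probabilities as the infected scale. The rate values and the toy path network are hypothetical.

```python
def sleir_step(adj, state, beta_e, beta_i, sigma, delta, gamma):
    """Advance every node's (S, L, E, I, R) probabilities by one time step."""
    new_state = {}
    for i, nbrs in adj.items():
        s, l, e, inf, r = state[i]
        # Probability that node i escapes infection from every neighbour j.
        q = 1.0
        for j in nbrs:
            _, _, e_j, i_j, _ = state[j]
            q *= 1.0 - (beta_e * e_j + beta_i * i_j)
        new_state[i] = (
            s * q,                               # stays susceptible
            s * (1.0 - q) + (1.0 - sigma) * l,   # newly latent or still latent
            sigma * l + (1.0 - delta) * e,       # becomes or stays exposed
            delta * e + (1.0 - gamma) * inf,     # onset or still infected
            r + gamma * inf,                     # recovered
        )
    return new_state


def simulate(adj, seeds, steps, **rates):
    """Return the expected number of infected nodes per step and the final state."""
    state = {
        i: (0.0, 0.0, 0.0, 1.0, 0.0) if i in seeds else (1.0, 0.0, 0.0, 0.0, 0.0)
        for i in adj
    }
    infected_curve = []
    for _ in range(steps):
        infected_curve.append(sum(p[3] for p in state.values()))
        state = sleir_step(adj, state, **rates)
    return infected_curve, state


# Hypothetical 4-node path network and hypothetical rates (not the paper's values).
toy = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
infected_curve, final_state = simulate(
    toy, seeds={0}, steps=30,
    beta_e=0.3, beta_i=0.5, sigma=0.5, delta=0.5, gamma=0.2,
)
peak_size = max(infected_curve)
peak_time = infected_curve.index(peak_size)
```

Because each update only redistributes probability among the five states, the state probabilities of every node continue to sum to one, which is a convenient sanity check when changing the rates or the network.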

Impact of central nodes on disease transmission

To study the influence of central nodes on the spread of disease in the network, the initial infected nodes are assumed to be either the initial cases in the existing data (s1, s3, s4) or the top three nodes of each centrality (degree: s37, s43, s6; betweenness: s37, s34, s20; closeness: s37, s34, s43; eigenvector: s37, s43, s34). Disease transmission is simulated on both the original network and the predicted network. As the contact transmission rate of the disease is unknown, it is assumed to be 0.4998, and the transfer rates are taken from [9]. Through simulations, we obtained the mean peak scale of the disease, as shown in Table 7.

Table 7

Peak number of infections of the disease caused by the central nodes of the original network and the reconstructed network. O represents the original network and P represents the predicted network. I(O) and I(P) represent the peak size in the original network and the predicted network, respectively.

Centrality	Initial nodes	I(O)	I(P)	Peak time(O)	Peak time(P)
Original	s1,s3,s4	2.4670	62.6770	11	19
Degree	s37,s43,s6	45.4489	60.6559	17	16
Betweenness	s37,s34,s20	46.6879	60.8289	18	16
Closeness	s37,s34,s43	44.7174	60.7235	19	16
Eigenvector	s37,s43,s34	44.7174	60.7235	19	16

As can be seen from Table 7, in both the original network and the predicted network, the betweenness nodes cause the largest disease scale among the central nodes, indicating that intermediary nodes have a great influence on transmission in the network. We also found that the peak scale caused by the original initial infected nodes differs greatly between the original network and the predicted network (2.4670 versus 62.6770). To clarify the cause of this disparity, we calculated the shortest paths to the central node in the predicted network and found that, after link prediction, one of the initial nodes could reach the central node through a short path, which makes the peak scale even larger than that caused by the central nodes. In terms of peak time, the four sets of central nodes show little difference. However, comparing the original initial nodes across the two networks shows that the peak is reached faster when there are fewer connected paths. This means that when a disease is spreading through the network, cutting the necessary links makes the disease peak early, which in turn reduces its spread. The simulated curves of infected persons are shown in Fig. 8.

Fig. 8

The change in the number of symptomatic infected people over time in the different networks. a. Original contact network. b. The predicted network.

Fig. 8(a) shows that the top three nodes of betweenness centrality (s37, s34, s20) are more likely to cause large-scale spread of the disease. When the initial infected nodes are the actual initial cases (s1, s3, s4), the peak size of the disease is much smaller than that caused by the central nodes. This means that finding the key nodes in the network in time has a great influence on the spread of disease; finding the real key nodes usually depends on the nature of a person's work and social network. Fig. 8(b) shows that the propagation scale caused by the original initial nodes is basically the same as that caused by the central nodes. This is because link prediction changes the links of one of the initial nodes, which can then reach the central node through a few intermediate nodes and so drive propagation from the central node. This further reflects the influence of central nodes on the spread of disease in the network. That is, to control a disease spreading in the network within a short time, it is necessary to find the highly clustered central nodes and cut off the paths to them in time.

Impact of transmission rate on disease transmission

Various prevention and control strategies were adopted during the spread of COVID-19 to reduce the contact transmission rate, such as 14-day isolation of infected persons, wearing masks and maintaining social distancing. Therefore, the influence of the infection rate on disease transmission was further considered. To facilitate comparison, we simulated the impact of the transmission rate on disease transmission in the predicted network. Because the original initial nodes and the betweenness-central nodes cause the largest disease scales, we considered the influence of the transmission rate on transmission from these two sets of nodes, with the transmission rates divided into four cases in the simulation. The values were chosen based on the early prevention and control strategy of 14-day isolation. The simulated curves of infected persons are shown in Fig. 9.

Fig. 9

The change in the number of symptomatic infected people in the predicted network over time under different transmission rates. a. Betweenness nodes (s37, s34, s20). b. Original nodes (s1, s3, s4).

Fig. 9 shows that reducing the transmission rate during the exposed period has a greater impact on the spread of the disease than reducing the transmission rate after onset. Decreasing one of the two rates has little effect on the peak time of the disease, while decreasing the other delays the peak time, which is more evident in Fig. 9(b), because the original nodes need some time to infect the central node. For the betweenness centrality nodes (Fig. 9(a)), decreasing the exposed-period rate reduces the number of infected persons by 9.26%, while decreasing the onset-period rate reduces it by only 4.32%. For the original infected nodes (Fig. 9(b)), the corresponding reductions are 11.45% and 3.85%. This shows the importance of identifying asymptomatic patients in controlling the spread of the disease, which is closely related to the current policy of timely nucleic acid testing, namely the timely detection of asymptomatic infected persons and the reduction of the transmission they cause. In addition, both panels of Fig. 9 show that reducing the two transmission rates at the same time reduces the number of infected persons by 48.8% and 54.61%, respectively. This means that it makes sense to take all kinds of prevention and control measures to reduce the transmission rate of COVID-19. We also find that reducing the transmission rates has a greater effect on the spread from the original nodes than on the spread from the central nodes. This may be because the path from the original nodes to the central node acts as a buffer, which lowers the probability that the central node is infected and so prevents widespread transmission around it.

Conclusion and discussion

Up to now, there are still many clusters of confirmed cases of COVID-19 in China, and the situation abroad is grim. Therefore, under the normal situation of prevention and control strategies, how to quickly and effectively control the cluster of epidemic has become our focus. By collecting the data of early confirmed COVID-19 cases in Tianjin, we obtained 136 confirmed COVID-19 cases in Tianjin from January 21, 2020 to March 14, 2020. Baodi District of Tianjin had the most confirmed cases (from the Baodi department store incident), followed by Hedong District and Hebei District. Since the incident in Baodi District department store was an early cluster epidemic, we wanted to provide evidence for the current cluster epidemic control strategy by studying the early Tianjin epidemic. In order to study the early COVID-19 epidemic in Tianjin, we first constructed a case contact network based on the close contact information of early confirmed cases in Tianjin, in which confirmed cases were nodes and contact between cases were links. Then using the igraph package of R software, the network is visualized and a series of analysis is carried out, and the analysis shows that the network’s cumulative degree distribution roughly obeys the power-law distribution. The results of centrality analysis show that most of the most influential cases in the network are concentrated in Baodi District, followed by Hedong District and Hebei District. Since the case contact network constructed in this paper may contain false links and missing links, link prediction is needed to reconstruct the network. However, there are many isolated nodes in the network initially constructed. Therefore, assumptions are made to eliminate the isolated nodes in the network, and further link prediction is carried out by using the indexes of CN, AA and RA to reconstruct the network. In terms of the accuracy of the index, RA is the best, followed by AA and CN. 
According to the prediction result of the RA index, the average path length of the network is small and the clustering coefficient is large, which indicates that the predicted case contact network has small-world properties. The large diameter of the network reflects the wide spread of COVID-19. Furthermore, we wanted to know the influence of the central nodes on disease transmission, so an individual-based susceptible-latent-exposed-infected-recover (SLEIR) model was established. By simulating disease spread in the original network and the predicted network, we found that the key nodes in the network can accelerate the spread of COVID-19, so identifying the key nodes is important for controlling the disease in time. This also reflects that, when a disease spreads through contacts, people with larger social circles may accelerate its spread, which corresponds to the spread of COVID-19. If a confirmed case is identified, the timely detection and isolation of their multilevel contacts may reduce the spread of the disease. In addition, reducing the transmission rate can also control the spread of the disease. This requires that infected people be isolated and reduce their contact with others, and it also shows that wearing masks and other prevention and control measures are reasonable. Moreover, reducing the transmission rate of asymptomatic infected people is more effective in controlling the disease than reducing the transmission rate of symptomatic infected people, which is consistent with the current policy of detecting positive cases in time through nucleic acid tests. Because cluster outbreaks continue to occur, we still cannot give a strategy for controlling them quickly. Compared with previous studies, we have identified targets for controlling the spread of COVID-19 from the perspective of the source of infection: detect positive cases in time and reduce transmission by latent infections.
This is consistent with the current goal of multiple rounds of nucleic acid testing. In addition, finding key nodes quickly and reducing contact transmission rates are powerful ways to control disease transmission. In real life, however, it is difficult to find the key cases quickly at the beginning of an outbreak. How to establish a one-to-one correspondence between the key cases in real life and the key nodes in the network is a problem we need to address in the future.

CRediT authorship contribution statement

Mingtao Li: Conceptualization, Methodology, Modeling, Writing – original draft, Review & editing. Jin Cui: Methodology, Software, Writing – original draft. Juan Zhang: Methodology, Writing – original draft, Review & editing. Xin Pei: Conceptualization, Methodology, Software, Writing – original draft. Guiquan Sun: Supervision, Writing – review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
