Łukasz G Gajewski, Robert Paluch, Krzysztof Suchecki, Adam Sulik, Boleslaw K Szymanski, Janusz A Hołyst.
Abstract
In recent years, research on methods for locating the source of spreading phenomena in complex networks has seen numerous advances. Such methods can be applied not only to searching for the "patient zero" in epidemics, but also to finding the true sources of false or malicious messages circulating in online social networks. Many methods for solving this problem have been established and tested in various circumstances. Yet, we still lack reviews that include a direct comparison of the efficiency of these methods. In this paper, we provide a thorough comparison of several observer-based methods for source localisation on complex networks. All methods use information about the exact time of spread arrival at a pre-selected group of vertices called observers. We investigate how the precision of the studied methods depends on the network topology, density of observers, infection rate, and observers' placement strategy. The direct comparison between methods allows for an informed choice of method for applications or further research. We find that the Pearson correlation based method and the method based on the analysis of multiple paths are the most effective in networks with synthetic or real topologies. The former method dominates when the infection rate is low; otherwise, the latter method takes over.
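The observer-based setting described in the abstract can be illustrated with a small sketch. The abstract names a Pearson correlation based method but does not spell out its procedure, so the code below is only one plausible reading, under stated assumptions: the graph is unweighted, each candidate source is scored by the Pearson correlation between its hop distances to the observers and the observed arrival times, and the highest-scoring candidate is returned. The function names (`bfs_distances`, `pearson`, `locate_source`) and the toy path graph are hypothetical and not taken from the paper.

```python
from collections import deque
from math import sqrt


def bfs_distances(adj, start):
    """Hop distances from `start` to every reachable node (unweighted graph)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def pearson(xs, ys):
    """Pearson correlation coefficient; -inf if either sample is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("-inf")
    return cov / sqrt(vx * vy)


def locate_source(adj, arrival_times):
    """Score every node as a candidate source (assumed scoring rule, see lead-in)."""
    observers = list(arrival_times)
    times = [arrival_times[o] for o in observers]
    scores = {}
    for candidate in adj:
        dist = bfs_distances(adj, candidate)
        if any(o not in dist for o in observers):
            continue  # candidate cannot reach every observer
        hops = [dist[o] for o in observers]
        scores[candidate] = pearson(hops, times)
    best = max(scores, key=scores.get)
    return best, scores


# Toy example (hypothetical data): a path graph 0-1-2-3-4-5-6 where the
# spread starts at node 2 and advances exactly one hop per time unit.
adj = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}
arrival_times = {0: 2.0, 4: 2.0, 6: 4.0}  # observer -> arrival time
best, scores = locate_source(adj, arrival_times)
print(best, round(scores[best], 3))
```

In this idealised deterministic example, the hop distances from node 2 to the observers match the arrival times exactly, so node 2 receives a correlation of 1 and is selected. Real spreading is stochastic, which is why precision depends on factors such as infection rate and observer density.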
Year: 2022 PMID: 35332184 PMCID: PMC8948209 DOI: 10.1038/s41598-022-09031-0
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Summary diagrams of precision metric results for all tested networks (major columns) and infection rates (major rows). The colours indicate the localisation methods, whereas observer placement strategies are marked per (minor) column in each block (labels are placed at the very bottom of the plot), while minor rows represent observer densities d. Bars within minor blocks show all methods, ordered from the best to the worst (a high precision indicates a high performance), with the background colour of the minor block indicating the best localisation method. The asterisk indicates the best localisation and placement strategy combination per row within a major block, i.e., for a given density, topology and infection rate. Bars are normalised to the highest score per graph, infection rate, and density.
Figure 2. Summary diagrams of CSS metric for all tested networks (major columns) and infection rates (major rows). The colours indicate the localisation methods, whereas observer placement strategies are marked per (minor) column in each block (labels are placed at the very bottom of the plot), while minor rows represent observer densities d. Bars within minor blocks show all methods, ordered from the best to the worst (a low CSS value indicates a high performance), with the background colour of the minor block indicating the best localisation method. The asterisk indicates the best localisation and placement strategy combination per row within a major block, i.e., for a given density and topology. Bars are normalised to the highest score per graph, infection rate and density.
Computational complexity of the tested methods. We estimated the listed experimental complexities for this review. However, the values of the coefficients may change for networks of different sizes or when different hardware or software is used to run the tests. For details on the derivation of the theoretical complexities as functions of observer density and system size, see the Supplementary Information, Sec. Computation time.
| Method | Theoretical complexity | Experimental fit | Comments |
|---|---|---|---|
| GMLA | | | Much faster than the EPL and scales better than the LPTV, yet slower than the PC and the TRBS at low densities or on small graphs. At high densities or on very large graphs, it is the fastest of all the methods |
| LPTV | | | In our tests, the complexity turned out to be higher than declared due to the matrix inversion operation. One of the slowest methods |
| EPL | | | In our tests, it was the slowest of all the methods due to the very high constant factor (initial cost). However, for high densities or very large graphs, it can actually be faster than the LPTV. Appropriate pre-computing is also possible to mitigate the costs |
| TRBS | | | One of the faster methods, alongside the PC, except for high densities or large-scale graphs |
| PC | | | One of the faster methods, alongside the TRBS, except for high densities or large-scale graphs |
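As an illustration of how an experimental complexity fit of the kind mentioned in the table caption might be obtained, the sketch below fits a power law t ≈ a·N^b to runtime measurements by least squares in log-log space. The power-law form is an assumption, since this record does not state which functional form was fitted. The timing values are synthetic, generated from a known exponent purely to check the procedure, and are not measurements from the paper. The name `fit_power_law` is hypothetical.

```python
from math import exp, log


def fit_power_law(sizes, runtimes):
    """Least-squares fit of log(t) = log(a) + b * log(N); returns (a, b)."""
    xs = [log(n) for n in sizes]
    ys = [log(t) for t in runtimes]
    k = len(xs)
    mx, my = sum(xs) / k, sum(ys) / k
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    return exp(intercept), slope


# Synthetic runtimes (NOT measured): generated from t = 3e-6 * N**2
sizes = [100, 200, 400, 800]
runtimes = [3e-6 * n**2 for n in sizes]
coefficient, exponent = fit_power_law(sizes, runtimes)
print(coefficient, exponent)
```

As the caption cautions, the fitted coefficient depends on hardware, software and network size, so the exponent is usually the more transferable quantity.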