Valentina Y Guleva, Polina O Andreeva, Danila A Vaganov.
Abstract
Finding the building blocks of real-world networks helps explain how the networks form and how dynamical processes unfold on them, which in turn supports prediction and control tasks. We explore different types of social networks that show high structural variability, and aim to extract their minimal building blocks: small subgraphs that reproduce the structural and dynamical properties of the supergraph, so that diffusion over the whole graph can be predicted from a small subgraph. For this purpose, we define formal topological and functional criteria and explore sampling techniques. Using the method that best satisfies both criteria, we explore the building blocks of interest networks. The best sampling method extracts subgraphs with an optimal size of 30 nodes, which reproduce the path lengths, clustering, and degree characteristics of the initial graph. The extracted subgraphs differ across the considered interest networks and provide material for exploring global dynamics from the mesoscale.
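The abstract does not name the sampling techniques that were compared. As a minimal sketch of what extracting a 30-node subgraph can look like, the block below uses random-walk sampling, which is an assumption chosen for illustration and not necessarily the method the authors selected. The graph representation (a dict mapping each node to a set of neighbours) and the names `random_walk_sample`, `induced_subgraph`, and `restart_after` are hypothetical.

```python
import random


def random_walk_sample(adj, size, seed=0, restart_after=100):
    """Collect up to `size` distinct nodes by walking along edges of `adj`.

    `adj` maps each node to the set of its neighbours (undirected graph).
    The walk jumps to a random node when it is stuck, so it also
    terminates on disconnected graphs.
    """
    rng = random.Random(seed)
    nodes = sorted(adj)
    size = min(size, len(nodes))
    current = rng.choice(nodes)
    visited = {current}
    stalled = 0  # consecutive steps that found no new node
    while len(visited) < size:
        neighbours = sorted(adj[current])
        if not neighbours or stalled >= restart_after:
            current = rng.choice(nodes)  # restart somewhere else
            stalled = 0
        else:
            current = rng.choice(neighbours)
        if current in visited:
            stalled += 1
        else:
            visited.add(current)
            stalled = 0
    return visited


def induced_subgraph(adj, keep):
    """Keep only the edges whose two endpoints are both in `keep`."""
    return {u: {v for v in adj[u] if v in keep} for u in keep}


# Toy supergraph (hypothetical): a ring of 100 nodes.
ring = {i: {(i - 1) % 100, (i + 1) % 100} for i in range(100)}
sample_nodes = random_walk_sample(ring, size=30, seed=1)
subgraph = induced_subgraph(ring, sample_nodes)
print(len(subgraph))
```

The induced subgraph keeps every supergraph edge between sampled nodes, which is one common convention; other sampling schemes keep only the edges the walk actually traversed.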
Keywords: dynamics prediction; interest network; motif; sampling; social network; subgraph extraction
Year: 2021; PMID: 33924216; PMCID: PMC8074582; DOI: 10.3390/e23040492
Source DB: PubMed; Journal: Entropy (Basel); ISSN: 1099-4300; Impact factor: 2.524
Figure 1. The framework of the search for appropriate subgraphs. Sampling methods are used for subgraph extraction; motif extraction techniques are then used for topological verification: motif distributions of sub- and supergraphs are compared. A regression model based on the sample motif distribution is used to predict the transient time of the SI diffusion model. The best subgraphs and the corresponding methods are evaluated against the topological and functional criteria.
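The functional criterion in Figure 1 rests on the transient time of an SI (susceptible-infected) diffusion process. The sketch below simulates a discrete-time SI process and reports the number of steps until every node reachable from the source is infected. Treating that step count as the transient time, the infection probability `beta`, and the function name `si_transient_time` are all assumptions made for illustration; this record does not give the authors' exact definition.

```python
import random


def si_transient_time(adj, source, beta=0.5, seed=0, max_steps=10_000):
    """Steps until all nodes reachable from `source` are infected.

    `adj` maps each node to the set of its neighbours. At every step each
    infected node infects each susceptible neighbour with probability
    `beta`; infected nodes never recover (SI model).
    """
    rng = random.Random(seed)

    # Only nodes in the source's connected component can be infected.
    reachable = {source}
    frontier = [source]
    while frontier:
        u = frontier.pop()
        for v in adj[u]:
            if v not in reachable:
                reachable.add(v)
                frontier.append(v)

    infected = {source}
    steps = 0
    while len(infected) < len(reachable) and steps < max_steps:
        newly_infected = set()
        for u in sorted(infected):
            for v in sorted(adj[u]):
                if v not in infected and rng.random() < beta:
                    newly_infected.add(v)
        infected |= newly_infected  # synchronous update
        steps += 1
    return steps


# Toy graph (hypothetical): a path 0-1-2-3-4.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(si_transient_time(path, source=0, beta=1.0))  # → 4
```

With `beta=1.0` the process is deterministic and the result equals the largest shortest-path distance from the source, which gives a quick sanity check before running stochastic settings.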
The discussion topics finally chosen and the corresponding topological features of their graphs (features are averaged over several networks within each topic of interest).
| Topic | Nodes | Clustering | Density | Assortativity |
|---|---|---|---|---|
| badtattoos | 703.(36) | 0.0071 | 0.0018 | −0.0506 |
| gonewildcurvy | 1191.(54) | 0.0064 | 0.0015 | −0.1482 |
| southpark | 2031.(81) | 0.0109 | 0.0008 | −0.0236 |
| HogwartsRP | 54.(09) | 0.3462 | 0.1432 | −0.2259 |
| redditblack | 201.(72) | 0.3055 | 0.0453 | −0.1748 |
| geology | 554.(54) | 0.0214 | 0.0025 | −0.0541 |
| hardwareswap | 1712.(27) | 0.0613 | 0.0025 | −0.0818 |
| counterstrike | 446.(81) | 0.0086 | 0.0025 | −0.0668 |
| stopsmoking | 830.(63) | 0.0503 | 0.0024 | −0.1102 |
| memes | 443.(27) | 0.0142 | 0.0023 | −0.0313 |
| feminism | 675.(36) | 0.0259 | 0.0022 | −0.0310 |
| introvert | 581.(90) | 0.0145 | 0.0022 | −0.0863 |
| pizza | 615.(54) | 0.0198 | 0.0021 | −0.0882 |
| vegetarian | 804.(27) | 0.0266 | 0.0021 | −0.0594 |
| depression | 2973.(09) | 0.0139 | 0.0005 | −0.0726 |
| CrazyIdeas | 2600.(63) | 0.0127 | 0.0005 | −0.0488 |
| lifehacks | 3126 | 0.0119 | 0.0004 | −0.0635 |
| conservatives | 126.(18) | 0.2698 | 0.0191 | −0.4898 |
| 90daysgoal | 169.(27) | 0.4219 | 0.0545 | −0.4184 |
| csshelp | 253.(54) | 0.0526 | 0.0067 | −0.4451 |
| freedonuts | 605.(27) | 0.0228 | 0.0026 | −0.3823 |
| altgonewild | 112.(72) | 0.0075 | 0.0111 | −0.3164 |
| bonsai | 318.(45) | 0.3263 | 0.0128 | −0.3188 |
| colorado | 464.(90) | 0.0289 | 0.0036 | 0.04103 |
| GreenBayPackers | 1576.(18) | 0.0317 | 0.0019 | 0.0159 |
| beertrade | 806.(72) | 0.0413 | 0.0034 | 0.0100 |
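
The three structural columns of the table can be computed directly from an adjacency structure. The sketch below assumes that "Clustering" is the average local clustering coefficient and that "Assort." is degree assortativity (the Pearson correlation of the degrees at the two ends of each edge); the record does not confirm either definition, and the function names are hypothetical.

```python
def density(adj):
    """Fraction of possible undirected edges that are present."""
    n = len(adj)
    if n < 2:
        return 0.0
    m = sum(len(nb) for nb in adj.values()) / 2  # each edge is stored twice
    return 2 * m / (n * (n - 1))


def average_clustering(adj):
    """Mean local clustering; nodes of degree < 2 contribute 0."""
    if not adj:
        return 0.0
    total = 0.0
    for nb in adj.values():
        k = len(nb)
        if k < 2:
            continue
        # Count edges among the neighbours of this node.
        links = sum(1 for v in nb for w in nb if v < w and w in adj[v])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def degree_assortativity(adj):
    """Pearson correlation of endpoint degrees over all edges."""
    xs, ys = [], []
    for u, nb in adj.items():
        for v in nb:  # every edge is visited in both directions
            xs.append(len(adj[u]))
            ys.append(len(adj[v]))
    if not xs:
        return float("nan")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined when all degrees are equal
    return cov / (vx * vy) ** 0.5


# Toy graph (hypothetical): a star with one hub and four leaves.
star = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}}
print(density(star), average_clustering(star), degree_assortativity(star))
```

A star is the extreme disassortative case: the hub connects only to low-degree leaves, so the coefficient is negative, the same sign as most topics in the table.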
Figure 2. Mean squared error between the motif distributions of the extracted subgraphs and the corresponding supergraphs for different sampling techniques.
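The comparison in Figure 2 can be sketched as follows. The motif sizes used by the authors are not stated in this record, so the example assumes the simplest case: the two connected 3-node patterns of an undirected graph (open path and triangle). The names `triad_distribution` and `mse` are hypothetical, and the brute-force enumeration is only practical for small graphs.

```python
from itertools import combinations


def triad_distribution(adj):
    """Return [open-path fraction, triangle fraction] over connected triples.

    `adj` maps each node to the set of its neighbours. All 3-node
    combinations are checked, which costs O(n^3) time.
    """
    paths = triangles = 0
    for a, b, c in combinations(sorted(adj), 3):
        edges = (b in adj[a]) + (c in adj[a]) + (c in adj[b])
        if edges == 2:
            paths += 1
        elif edges == 3:
            triangles += 1
    total = paths + triangles
    if total == 0:
        return [0.0, 0.0]
    return [paths / total, triangles / total]


def mse(p, q):
    """Mean squared error between two equal-length distributions."""
    return sum((x - y) ** 2 for x, y in zip(p, q)) / len(p)


# Toy graphs (hypothetical): a triangle versus a 3-node path.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
path3 = {0: {1}, 1: {0, 2}, 2: {1}}
print(mse(triad_distribution(triangle), triad_distribution(path3)))  # → 1.0
```

An error of 0 means the sample has the same motif proportions as the supergraph, so lower values in Figure 2 indicate a sampling technique that better preserves local structure.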
Figure 3. Coefficient of determination for the regression model over the subgraph M motif distribution, predicting the SI transient time, for different sampling techniques.
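The coefficient of determination reported in Figure 3 is the standard R² = 1 − SS_res / SS_tot. The sketch below fits a one-feature least-squares line and scores it; the authors' regression model uses the full motif distribution as input and its form is not given in this record, so the single feature, the toy numbers, and the names `fit_line` and `r_squared` are purely illustrative assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data: one motif-derived feature per sampled subgraph
# and a made-up transient time for each; not values from the paper.
feature = [0.0, 1.0, 2.0, 3.0]
transient_time = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_line(feature, transient_time)
predicted = [slope * x + intercept for x in feature]
print(r_squared(transient_time, predicted))  # → 1.0
```

A value of 1 means the predictions match the observed transient times exactly, while 0 means the model does no better than always predicting the mean.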
Figure 4. Resulting subgraphs shown on the corresponding supergraphs, and in isolation, for the selected topics: (a) Counter strike, (b) Free donut—i like it!, (c) feminism, (d) introvert, (e) pizza, and (f) stop smoking. Node size corresponds to node degree; blue nodes highlight the extracted subgraphs.