Thanh Trinh, Dingming Wu, Joshua Zhexue Huang, Muhammad Azhar.
Abstract
Event-based social networks (EBSNs) are widely used to create online social groups and organize offline events for users. Activeness and loyalty are crucial characteristics of these online social groups: they determine whether a group grows or becomes inactive within a specific time frame. However, little research has used these concepts to explain the persistence of groups in event-based social networks. In this paper, we study the problem of group activeness and user loyalty to provide novel insight into online social networks. First, we analyze the structure of EBSNs and generate features from the crawled datasets. Second, we define the concepts of group activeness and user loyalty based on a series of time windows, and propose a method to measure group activeness. In the proposed method, we first compute the ratio of the numbers of events in two consecutive time windows. We then develop an association matrix to assign an activeness label to each group after several consecutive time windows. Similarly, we measure user loyalty in terms of the events attended within time windows and treat loyalty as a contributing feature of group activeness. Finally, three well-known machine learning techniques are used to verify the activeness labels and to generate features for each group. As a result, we also find a small set of highly correlated features that yields higher accuracy than the full feature set.
Keywords: EBSNs; activeness; loyalty; social networks
Year: 2020 PMID: 33285894 PMCID: PMC7516425 DOI: 10.3390/e22010119
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
Figure 1. Example of an event-based social network (EBSN).
Dataset statistics.
| City | #Groups | #Events | #Users | #YES | #NO |
|---|---|---|---|---|---|
| New York | 1269 | 28,355 | 591,580 | 331,436 | 105,433 |
| San Francisco | 867 | 14,205 | 342,662 | 245,767 | 66,611 |
| London | 985 | 17,309 | 610,189 | 246,413 | 108,070 |
| Sydney | 297 | 5980 | 179,081 | 82,399 | 30,075 |
Group categories.
| Alternative Lifestyle | Book Clubs | Career/Business |
|---|---|---|
| Cars/Motorcycles | Community/Environment | Dancing |
| Education/Learning | Fashion/Beauty | Fine Arts/Culture |
| Fitness | Food/Drink | Games |
| Health/Wellbeing | Hobbies | Language/Ethnic Identity |
| LGBT | Movement/Politics | Movies/Films |
| Music | New Age/Spirituality | Outdoors/Adventure |
| Paranormal | Parents/Family | Pets/Animals |
| Photography | Religion/Beliefs | Sci-Fi/Fantasy |
| Singles | Socializing | Sports/Recreation |
| Support | Tech | Writing |
The features derived from datasets.
| Category | Feature | Description | Type |
|---|---|---|---|
| Group-based | CATEGORY | Corresponding category value | Integer |
| | N_TOPICS | Number of topics in a group | Integer |
| | N_USERS | Number of users in a group | Integer |
| | RATING | Average score of group reviews | Double |
| | YEAR | The year a group is created in | Integer |
| | MONTH | The month a group is created in | Integer |
| | DAY_OF_MONTH | The day a group is created on | Integer |
| | WEEKDAY | The weekday a group is created on | Integer |
| Event-based | N_EVENTS | Number of events | Integer |
| | RSVPs | Number of all RSVPs | Integer |
| | Y_RSVPs | Number of RSVPs with YES | Integer |
| | N_RSVPs | Number of RSVPs with NO | Integer |
| | AVERAGE_RSVPs | Average of all RSVPs | Double |
| | SD_RSVPs | Standard deviation of all RSVPs | Integer |
| | AVERAGE_Y_RSVPs | Average of RSVPs with YES | Integer |
| | SD_Y_RSVPs | Standard deviation of RSVPs with YES | Integer |
| | AVERAGE_N_RSVPs | Average of RSVPs with NO | Integer |
| | SD_N_RSVPs | Standard deviation of RSVPs with NO | Integer |
| | AVERAGE_DAY | Average number of days between two consecutive events | Double |
| | SD_DAY | Standard deviation of the number of days between two consecutive events | Double |
| | N_EVENT_ORGANIZER | Number of events that have organizers | Double |
| User-based | N_ORGANIZER | Number of organizers in the group | Integer |
| | N_ATTENDEES | Number of users who confirm at least one YES | Integer |
| | BIO | Number of users who have a biography | Integer |
| | ADDRESS | Number of users who have address information | Integer |
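Several of the event-based features in the table are simple aggregates over a group's events. The sketch below shows one way AVERAGE_DAY, SD_DAY, and the RSVP totals could be derived. The function names and the input layout are hypothetical, the population standard deviation is an assumption (the table does not say which variant is used), and RSVPs are assumed to consist of YES and NO answers only.

```python
from datetime import date
from statistics import mean, pstdev

def gap_features(event_dates):
    """AVERAGE_DAY and SD_DAY: statistics of the day gaps between consecutive events."""
    ordered = sorted(event_dates)
    gaps = [(b - a).days for a, b in zip(ordered, ordered[1:])]
    if not gaps:  # fewer than two events: no gap is defined
        return {"AVERAGE_DAY": None, "SD_DAY": None}
    # Assumption: population standard deviation (pstdev), not sample (stdev).
    return {"AVERAGE_DAY": mean(gaps), "SD_DAY": pstdev(gaps)}

def rsvp_features(events):
    """RSVP aggregates; each event is a dict with hypothetical keys 'yes' and 'no'."""
    yes = [e["yes"] for e in events]
    no = [e["no"] for e in events]
    totals = [y + n for y, n in zip(yes, no)]
    return {
        "N_EVENTS": len(events),
        "RSVPs": sum(totals),
        "Y_RSVPs": sum(yes),
        "N_RSVPs": sum(no),
        "AVERAGE_RSVPs": mean(totals) if totals else None,
        "SD_RSVPs": pstdev(totals) if totals else None,
    }

# Illustrative inputs, not taken from the datasets.
gaps = gap_features([date(2014, 1, 1), date(2014, 1, 8), date(2014, 1, 22)])
rsvps = rsvp_features([{"yes": 10, "no": 2}, {"yes": 6, "no": 2}])
```

Sorting the dates first keeps the gap computation independent of the order in which events were crawled.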
Notations.
| Description | Type |
|---|---|
| Group G | |
| The | |
| Event | |
| The number of events created by group | Integer |
| Ratio of | Double |
| The measure of the loyalty of a user | Double |
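The notation pairs a per-window event count with a ratio, which the abstract describes as the ratio of the numbers of events in two consecutive time windows. A minimal sketch of that step follows, assuming fixed-length windows measured in days. The function names, the 90-day window, and the 0.8/1.2 thresholds are hypothetical and not taken from the paper, and the association matrix that combines several windows is not reproduced here.

```python
from datetime import date

def events_per_window(event_dates, start, window_days, n_windows):
    """Count a group's events in consecutive fixed-length time windows."""
    counts = [0] * n_windows
    for d in event_dates:
        idx = (d - start).days // window_days
        if 0 <= idx < n_windows:  # ignore events outside the time frame
            counts[idx] += 1
    return counts

def window_ratios(counts):
    """Ratio of each window's event count to the previous window's count."""
    return [cur / prev if prev > 0 else None  # undefined when the previous window is empty
            for prev, cur in zip(counts, counts[1:])]

def label_ratio(ratio, low=0.8, high=1.2):
    """Map one ratio to a label; the thresholds are illustrative assumptions."""
    if ratio is None:
        return "undefined"
    if ratio < low:
        return "inactive"
    if ratio > high:
        return "active"
    return "stable"

# Illustrative event dates for one group, not taken from the datasets.
events = [date(2014, 1, 5), date(2014, 2, 1), date(2014, 4, 15),
          date(2014, 7, 1), date(2014, 7, 20), date(2014, 8, 1)]
counts = events_per_window(events, start=date(2014, 1, 1), window_days=90, n_windows=3)
ratios = window_ratios(counts)
labels = [label_ratio(r) for r in ratios]
```

Returning `None` for an empty previous window avoids a division by zero and leaves the handling of that case to the caller.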
Figure 2. Example of the time frame of group G.
Figure 3. Category type distribution in groups created in four cities in the year 2014.
Figure 4. Distribution of average days between two consecutive events in each group.
Figure 5. Example of the time window T in and .
Figure 6. New York: the numbers of groups and events in 8 time windows.
Figure 7. San Francisco: the numbers of groups and events in 8 time windows.
Figure 8. London: the numbers of groups and events in 8 time windows.
Figure 9. Sydney: the numbers of groups and events in 8 time windows.
The distribution of numbers of groups in three different activeness labels after several time windows in the four cities.
| Total | Inactive | Stable | Active | Total | Inactive | Stable | Active |
|---|---|---|---|---|---|---|---|
| 715 | | | | 549 | | | |
| 715 | 286 | 134 | 295 | 549 | 217 | 104 | 228 |
| 715 | 227 | 201 | 287 | 549 | 177 | 157 | 215 |
| 715 | 249 | 136 | 330 | 549 | 198 | 107 | 244 |
| 715 | 233 | 145 | 337 | 549 | 194 | 124 | 231 |
| 715 | 256 | 124 | 335 | 549 | 211 | 102 | 236 |
| 715 | 273 | 128 | 314 | 549 | 230 | 95 | 224 |
| 715 | 283 | 131 | 301 | 549 | 251 | 88 | 210 |

| Total | Inactive | Stable | Active | Total | Inactive | Stable | Active |
|---|---|---|---|---|---|---|---|
| 481 | | | | 152 | | | |
| 481 | 157 | 79 | 245 | 152 | 50 | 26 | 76 |
| 481 | 100 | 155 | 226 | 152 | 37 | 40 | 75 |
| 481 | 128 | 88 | 265 | 152 | 42 | 30 | 80 |
| 481 | 124 | 92 | 265 | 152 | 39 | 28 | 85 |
| 481 | 146 | 80 | 255 | 152 | 46 | 21 | 85 |
| 481 | 155 | 89 | 237 | 152 | 43 | 27 | 82 |
| 481 | 179 | 82 | 220 | 152 | 51 | 26 | 75 |
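The counts in the table can be checked for internal consistency and turned into label shares. The sketch below does this for the labeled rows whose total is 715, copied from the table. The function name and the rounding to one decimal place are illustrative choices, and the row labels are omitted because they did not survive in the table.

```python
LABELS = ("inactive", "stable", "active")

# (total, inactive, stable, active) for the rows with total 715, in table order.
rows = [
    (715, 286, 134, 295),
    (715, 227, 201, 287),
    (715, 249, 136, 330),
    (715, 233, 145, 337),
    (715, 256, 124, 335),
    (715, 273, 128, 314),
    (715, 283, 131, 301),
]

def label_shares(total, counts):
    """Percentage of groups per activeness label, rounded to one decimal place."""
    if sum(counts) != total:
        raise ValueError("label counts do not add up to the total")
    return {label: round(100 * c / total, 1) for label, c in zip(LABELS, counts)}

shares = [label_shares(total, (inactive, stable, active))
          for total, inactive, stable, active in rows]
```

Every row passes the sum check, so each group in this block carries exactly one of the three labels per row.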
Figure 10. Distributions in terms of percentage and the number of attended events that users participate in among the total events of their groups in the two-year period. (a) New York: percentage of attended events in total events for users. (b) New York: number of attended events for users. (c) San Francisco: percentage of attended events in total events for users. (d) San Francisco: number of attended events for users. (e) London: percentage of attended events in total events for users. (f) London: number of attended events for users. (g) Sydney: percentage of attended events in total events for users. (h) Sydney: number of attended events for users.
Figure 11. Numbers of loyal users and disloyal users varying in different for the four cities. (a) New York: number of loyal users and disloyal users. (b) San Francisco: number of loyal users and disloyal users. (c) London: number of loyal users and disloyal users. (d) Sydney: number of loyal users and disloyal users.
Figure 12. Number of loyal users in groups varying in different . Groups are obtained from Table 5.
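The figures above relate loyalty to the share of a group's events that a user attends. A minimal sketch of one such measure follows, assuming loyalty is the fraction of the group's events a user attended and that a user counts as loyal when this fraction reaches a threshold. Both the function names and the 0.5 threshold are hypothetical; the paper's exact definition over time windows is not reproduced here.

```python
def attendance_ratio(attended, total_events):
    """Fraction of a group's events that one user attended."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return attended / total_events

def split_by_loyalty(attended_by_user, total_events, threshold):
    """Partition users into loyal and disloyal by an assumed threshold."""
    loyal, disloyal = [], []
    for user, attended in sorted(attended_by_user.items()):
        if attendance_ratio(attended, total_events) >= threshold:
            loyal.append(user)
        else:
            disloyal.append(user)
    return loyal, disloyal

# Illustrative attendance counts for one group with 10 events.
attended_by_user = {"u1": 8, "u2": 1, "u3": 5}
loyal, disloyal = split_by_loyalty(attended_by_user, total_events=10, threshold=0.5)
```

Sweeping `threshold` over a range of values gives the kind of loyal-versus-disloyal counts that Figure 11 plots.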
Average accuracy of three methods for both ALL and Selected features generated from the three stages.
| ALL | ALL | ALL | Selected | Selected | Selected |
|---|---|---|---|---|---|
| 69.92 | 65.64 | 69.91 | 71.99 | 68.25 | 71.32 |
| 74.68 | 71.37 | 74.73 | 77.74 | 75.04 | 76.02 |
| 70.64 | 66.19 | 70.21 | 73.07 | 71.37 | 71.58 |
| 69.21 | 66.97 | 69.28 | 72.46 | 69.16 | 70.82 |
| 71.13 | 68.37 | 72.21 | 75.63 | 72.57 | 74.21 |
| 71.47 | 67.69 | 69.72 | 74.59 | 72.94 | 72.15 |
| 69.22 | 61.72 | 71.58 | 70.58 | 65.29 | 73.02 |
| 71.5 | 67.61 | 72.15 | 74.66 | 72.51 | 73.82 |
| 69.15 | 62.07 | 68.73 | 70.99 | 67.88 | 69.85 |
| 66.1 | 60.34 | 64.71 | 66.33 | 65 | 68.21 |
| 73.04 | 68.8 | 71.41 | 74.84 | 71.52 | 75.15 |
| 69.71 | 60.86 | 66.31 | 68.32 | 64.73 | 71.54 |
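The abstract's claim that the selected features give higher accuracy can be checked directly against the table. The sketch below assumes each row lists the three methods on ALL features followed by the same three methods, in the same order, on Selected features. That column pairing is an assumption, since the method names did not survive in the table.

```python
# Accuracy values copied from the table: three ALL columns, then three Selected columns.
rows = [
    (69.92, 65.64, 69.91, 71.99, 68.25, 71.32),
    (74.68, 71.37, 74.73, 77.74, 75.04, 76.02),
    (70.64, 66.19, 70.21, 73.07, 71.37, 71.58),
    (69.21, 66.97, 69.28, 72.46, 69.16, 70.82),
    (71.13, 68.37, 72.21, 75.63, 72.57, 74.21),
    (71.47, 67.69, 69.72, 74.59, 72.94, 72.15),
    (69.22, 61.72, 71.58, 70.58, 65.29, 73.02),
    (71.50, 67.61, 72.15, 74.66, 72.51, 73.82),
    (69.15, 62.07, 68.73, 70.99, 67.88, 69.85),
    (66.10, 60.34, 64.71, 66.33, 65.00, 68.21),
    (73.04, 68.80, 71.41, 74.84, 71.52, 75.15),
    (69.71, 60.86, 66.31, 68.32, 64.73, 71.54),
]

# Gain of Selected over ALL for each (row, method) cell, in percentage points.
gains = [[round(sel - full, 2) for full, sel in zip(r[:3], r[3:])] for r in rows]

improved = sum(g > 0 for row in gains for g in row)
total_cells = sum(len(row) for row in gains)
# Zero-based (row, method) positions where Selected does not beat ALL.
exceptions = [(i, j) for i, row in enumerate(gains)
              for j, g in enumerate(row) if g <= 0]
```

Under this pairing, Selected beats ALL in all but one of the 36 cells; the exception is the first method in the last row (68.32 versus 69.71).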