Anna Malinovskaya, Philipp Otto.
Abstract
An important problem in network analysis is the online detection of anomalous behaviour. In this paper, we introduce a network surveillance method bringing together network modelling and statistical process control. Our approach is to apply multivariate control charts based on exponential smoothing and cumulative sums in order to monitor networks generated by temporal exponential random graph models (TERGM). The latter allows us to account for temporal dependence while simultaneously reducing the number of parameters to be monitored. The performance of the considered charts is evaluated by calculating the average run length and the conditional expected delay for both simulated and real data. To justify the decision of using the TERGM to describe network data, some measures of goodness of fit are inspected. We demonstrate the effectiveness of the proposed approach by an empirical application, monitoring daily flights in the United States to detect anomalous patterns.
Keywords: MCUSUM; MEWMA; Multivariate Control Charts; Network Modelling; Network Monitoring; Statistical Process Control; TERGM
Year: 2021 PMID: 34539309 PMCID: PMC8440157 DOI: 10.1007/s10260-021-00589-z
Source DB: PubMed Journal: Stat Methods Appl ISSN: 1613-981X
Upper control limits for the MEWMA chart based on the estimates, for two different window sizes (7 and 14). Column headers give the smoothing parameter

| Window size | | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | 1.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 7 | 50 | 39.32 | 35.76 | 31.04 | 26.52 | 22.58 | 19.22 | 16.46 | 14.10 | 12.14 | 10.46 |
| | 75 | 45.76 | 41.03 | 35.20 | 29.81 | 25.27 | 21.61 | 18.53 | 15.86 | 13.66 | 11.72 |
| | 100 | 50.52 | 45.30 | 38.58 | 32.56 | 27.62 | 23.48 | 20.00 | 17.13 | 14.71 | 12.66 |
| 14 | 50 | 55.14 | 43.44 | 33.80 | 26.96 | 21.98 | 18.16 | 15.16 | 12.76 | 10.76 | 9.15 |
| | 75 | 65.63 | 50.94 | 39.29 | 31.23 | 25.26 | 20.68 | 17.24 | 14.50 | 12.20 | 10.32 |
| | 100 | 73.85 | 56.20 | 42.97 | 33.97 | 27.44 | 22.52 | 18.72 | 15.69 | 13.21 | 11.15 |

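
To illustrate how an upper control limit such as those tabulated above is used, the following is a minimal sketch of a MEWMA chart in its textbook form, not necessarily the exact variant of the paper. The function names `solve` and `mewma_run_length`, the use of the exact time-dependent covariance of the smoothed vector, and the assumption that the in-control mean and covariance are known are all illustrative choices.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes `a` is a non-singular square matrix (e.g. a covariance matrix).
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def mewma_run_length(observations, mean0, cov0, lam, ucl):
    """Return the 1-based time of the first signal, or None if none occurs.

    observations: sequence of vectors (here: estimated network characteristics)
    mean0, cov0:  assumed in-control mean vector and covariance matrix
    lam:          smoothing parameter in (0, 1]
    ucl:          upper control limit
    """
    z = [0.0] * len(mean0)
    for t, obs in enumerate(observations, start=1):
        # Exponentially smooth the deviations from the in-control mean.
        z = [lam * (o - m) + (1 - lam) * zi for o, m, zi in zip(obs, mean0, z)]
        # Exact covariance of z at time t is scale * cov0.
        scale = lam / (2 - lam) * (1 - (1 - lam) ** (2 * t))
        cov_z = [[scale * v for v in row] for row in cov0]
        stat = sum(zi * wi for zi, wi in zip(z, solve(cov_z, z)))
        if stat > ucl:
            return t
    return None
```

With a smoothing parameter of 1.0 the chart reduces to a Hotelling-type chart on the current observation, which is why the tabulated limits are smallest in the last column.
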
Upper control limits for the MCUSUM chart based on the estimates, for two different window sizes (7 and 14). Column headers give the reference parameter k

| Window size | | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | 1.0 | 1.1 | 1.2 | 1.3 | 1.4 | 1.5 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 7 | 50 | 21.01 | 19.36 | 17.75 | 16.21 | 14.83 | 13.52 | 12.25 | 11.10 | 9.99 | 9.00 | 8.03 |
| | 75 | 25.19 | 22.90 | 20.82 | 18.95 | 17.33 | 15.91 | 14.47 | 13.19 | 11.94 | 10.74 | 9.61 |
| | 100 | 28.25 | 25.59 | 23.31 | 21.18 | 19.38 | 17.67 | 16.12 | 14.67 | 13.27 | 11.96 | 10.73 |
| 14 | 50 | 30.10 | 27.84 | 25.64 | 23.67 | 21.69 | 19.83 | 17.92 | 16.17 | 14.44 | 12.75 | 11.06 |
| | 75 | 37.35 | 34.60 | 31.91 | 29.25 | 26.86 | 24.53 | 22.37 | 20.20 | 18.14 | 16.19 | 14.32 |
| | 100 | 43.06 | 39.52 | 36.20 | 33.15 | 30.45 | 27.84 | 25.43 | 23.16 | 20.97 | 18.81 | 16.77 |

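
As a rough illustration of how the reference parameter k and the control limit interact, here is a sketch of one common multivariate CUSUM formulation (Crosier's). Whether the paper uses exactly this variant is an assumption, as are the names `quad_form` and `mcusum_run_length` and the choice to pass the inverse in-control covariance (`prec0`) directly.

```python
import math


def quad_form(v, prec):
    """Return v' * prec * v for a vector v and a square matrix prec."""
    n = len(v)
    return sum(v[i] * prec[i][j] * v[j] for i in range(n) for j in range(n))


def mcusum_run_length(observations, mean0, prec0, k, ucl):
    """Return the 1-based time of the first signal, or None if none occurs.

    prec0 is the (assumed known) inverse of the in-control covariance matrix,
    k the reference parameter and ucl the upper control limit.
    """
    s = [0.0] * len(mean0)
    for t, obs in enumerate(observations, start=1):
        # Accumulate the new deviation from the in-control mean.
        d = [si + o - m for si, o, m in zip(s, obs, mean0)]
        c = math.sqrt(quad_form(d, prec0))
        # Shrink the cumulative sum towards zero by k, or reset it entirely.
        s = [0.0] * len(d) if c <= k else [di * (1 - k / c) for di in d]
        if math.sqrt(quad_form(s, prec0)) > ucl:
            return t
    return None
```

A larger k discards more of each deviation before it accumulates, so the statistic grows more slowly and a smaller limit achieves the same in-control behaviour, in line with the decreasing rows of the table.
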
Upper control limits for the MEWMA chart based on the estimates, for two different window sizes (7 and 14). Column headers give the smoothing parameter

| Window size | | 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | 1.0 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 7 | 50 | 65.03 | 43.79 | 32.23 | 25.29 | 20.36 | 16.72 | 13.91 | 11.66 | 9.85 | 8.31 |
| | 75 | 82.40 | 52.41 | 38.13 | 29.52 | 23.59 | 19.23 | 15.88 | 13.23 | 11.13 | 9.42 |
| | 100 | 96.23 | 59.39 | 42.92 | 32.96 | 26.19 | 21.24 | 17.58 | 14.69 | 12.32 | 10.35 |
| 14 | 50 | 71.09 | 45.47 | 32.66 | 24.81 | 19.51 | 15.80 | 12.95 | 10.74 | 8.96 | 7.53 |
| | 75 | 89.03 | 55.26 | 38.71 | 29.06 | 22.85 | 18.40 | 15.07 | 12.46 | 10.35 | 8.66 |
| | 100 | 103.00 | 62.73 | 43.73 | 32.56 | 25.40 | 20.40 | 16.65 | 13.73 | 11.43 | 9.57 |

Upper control limits for the MCUSUM chart based on the estimates, for two different window sizes (7 and 14). Column headers give the reference parameter k

| Window size | | 0.5 | 0.6 | 0.7 | 0.8 | 0.9 | 1.0 | 1.1 | 1.2 | 1.3 | 1.4 | 1.5 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 7 | 50 | 51.85 | 47.41 | 42.96 | 38.27 | 33.87 | 29.71 | 25.94 | 22.51 | 19.01 | 15.79 | 13.06 |
| | 75 | 75.93 | 69.26 | 62.97 | 56.83 | 50.79 | 44.82 | 39.02 | 33.73 | 29.05 | 24.95 | 20.95 |
| | 100 | 97.58 | 89.46 | 81.68 | 73.51 | 65.78 | 59.01 | 52.43 | 45.96 | 39.96 | 34.40 | 29.23 |
| 14 | 50 | 55.72 | 51.27 | 46.63 | 41.90 | 37.54 | 33.29 | 29.18 | 25.46 | 21.69 | 18.21 | 15.36 |
| | 75 | 80.28 | 73.70 | 67.32 | 61.13 | 54.75 | 48.86 | 43.15 | 37.76 | 32.81 | 28.26 | 24.20 |
| | 100 | 102.34 | 94.25 | 85.88 | 78.05 | 70.90 | 63.96 | 57.03 | 50.65 | 44.31 | 38.71 | 33.39 |

Fig. 1 Comparison of the autocorrelation function (ACF) values when the network characteristics are estimated with a sliding window of a given size containing (left) and not containing (right) overlapping network states
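
The difference between the two panels of Fig. 1 can be made concrete with a small sketch of how the estimation windows might be formed. The helper name `sliding_windows` and the step sizes (one state for overlapping windows, a full window length otherwise) are assumptions for illustration only.

```python
def sliding_windows(states, size, overlapping=True):
    """Split a sequence of network states into estimation windows.

    With overlapping=True consecutive windows share size - 1 states;
    with overlapping=False the windows are disjoint.
    """
    step = 1 if overlapping else size
    return [states[i:i + size] for i in range(0, len(states) - size + 1, step)]


days = list(range(6))  # stand-ins for six daily network snapshots
print(sliding_windows(days, 3, overlapping=True))
print(sliding_windows(days, 3, overlapping=False))
```

Because overlapping windows reuse most of the same snapshots, estimates from neighbouring windows are expected to be strongly autocorrelated, which is the effect the ACF comparison examines.
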
Fig. 2 Anomaly types for the generation of observations from Phase II and calculation of the associated run length
Anomaly cases

| Anomaly Type | Description | Case | |
|---|---|---|---|
| Type A | Change in the transition matrix | A.1 | |
| | | A.2 | |
| | | A.3 | |
| Type B | Change of the fraction | B.1 | |
| | | B.2 | |
| | | B.3 | |
| Type C | Increase of the proportion of mutual edges by | C.1 | |
| | | C.2 | |
| | | C.3 | |

Fig. 3 Conditional expected delays (CED) for anomalies of Type A for MCUSUM (left) and MEWMA (right), for different choices of the reference parameter k and the smoothing parameter, the two window sizes, and the two network estimates (solid and dashed lines). Black points indicate the minimum CED for each setting
Fig. 4 Conditional expected delays (CED) for anomalies of Type B for MCUSUM (left) and MEWMA (right), for different choices of the reference parameter k and the smoothing parameter, the two window sizes, and the two network estimates (solid and dashed lines). Black points indicate the minimum CED for each setting
Fig. 5 Conditional expected delays (CED) for anomalies of Type C for MCUSUM (left) and MEWMA (right), for different choices of the reference parameter k and the smoothing parameter, the two window sizes, and the two network estimates (solid and dashed lines). Black points indicate the minimum CED for each setting
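
For readers unfamiliar with the conditional expected delay shown in Figs. 3 to 5, the sketch below estimates it from simulated signal times under one common definition: the mean of (signal time - change point + 1) over runs that did not signal before the change point. The function name and the convention of dropping runs without any signal are assumptions, and the paper's exact definition may differ.

```python
def conditional_expected_delay(signal_times, change_point):
    """Estimate the CED from simulated signal times.

    signal_times: first signal time of each simulated run (None = no signal)
    change_point: time index at which the anomaly starts
    Runs that signal before the change point (false alarms) are excluded.
    """
    delays = [
        tau - change_point + 1
        for tau in signal_times
        if tau is not None and tau >= change_point
    ]
    if not delays:
        return None
    return sum(delays) / len(delays)


# Five simulated runs with an anomaly starting at time 11: the run that
# signals at time 3 is a false alarm and one run never signals.
print(conditional_expected_delay([3, 12, 15, None, 11], change_point=11))
```
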
Summary of the CED results to detect anomalies of Type C with the additional test case

| CED | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| Case | | C.1 | C.2 | C.3 | | C.1 | C.2 | C.3 | |
| Parameter | | 0.005 | 0.01 | 0.02 | 0.05 | 0.005 | 0.01 | 0.02 | 0.05 |
| MEWMA with | Min. | 36.26 | 17.83 | 1.85 | | | | | |
| | | 0.1 | 1.0 | 1.0 | 0.7 | 0.2 | 0.5 | 0.9 | 0.4 |
| | Max. | 42.55 | 25.62 | 7.16 | 2.26 | 36.36 | 8.92 | 2.56 | 1.60 |
| | | 0.9 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 | 0.1 |
| MEWMA with | Min. | 35.92 | 24.26 | 3.57 | 35.60 | 8.60 | | | |
| | | 0.1 | 0.3 | 0.9 | 1.0 | 0.2 | 0.3 | 1.0 | 0.5 |
| | Max. | 43.78 | 28.75 | 9.90 | 3.25 | 42.21 | 12.94 | 3.49 | 1.89 |
| | | 1.0 | 0.1 | 0.1 | 0.1 | 1.0 | 0.1 | 0.1 | 0.1 |
| MCUSUM with | Min. | 29.59 | 23.20 | 6.66 | 1.94 | 29.17 | | | |
| | | 0.5 | 1.1 | 1.5 | 1.5 | 1.0 | 1.5 | 1.5 | 1.5 |
| | Max. | 40.38 | 26.19 | 18.70 | 5.07 | 33.68 | 12.51 | 3.91 | 2.32 |
| | | 1.5 | 1.5 | 0.5 | 0.5 | 1.4 | 0.5 | 0.5 | 0.5 |
| MCUSUM with | Min. | 23.56 | 9.45 | 2.98 | 30.57 | 10.88 | 3.14 | 1.72 | |
| | | 0.7 | 1.0 | 1.5 | 1.5 | 0.6 | 1.4 | 1.5 | 1.5 |
| | Max. | 36.62 | 27.58 | 18.86 | 7.04 | 35.22 | 14.56 | 6.19 | 3.00 |
| | | 1.4 | 1.4 | 0.5 | 0.5 | 1.2 | 0.5 | 0.5 | 0.5 |

The corresponding smoothing parameter or reference parameter k is given below the respective CED. The minimum CED for each case and control chart group is underlined. The maximum CED represents the "worst-case" scenario. If several values of the parameter yield the same CED, only the smallest value is reported
Fig. 6 Illustration of the flight network on April 1 of each year, excluding isolated vertices. It can be seen that the topology of the network has changed. The red nodes represent the 30 busiest airports
Descriptive statistics of the US flight network data

| | | 2018 | 2019 | 2020 |
|---|---|---|---|---|
| Phase | | I | II | II |
| Number of nodes | | 358 | 360 | 354 |
| Density | Min. | 0.031 | 0.033 | 0.022 |
| | Median | 0.037 | 0.038 | 0.038 |
| | Max. | 0.039 | 0.040 | 0.041 |
| Reciprocity | Min. | 0.97 | 0.96 | 0.89 |
| | Median | 0.99 | 0.99 | 0.99 |
| | Max. | 1.00 | 1.00 | 1.00 |
| Transitivity | Min. | 0.315 | 0.322 | 0.263 |
| | Median | 0.339 | 0.339 | 0.326 |
| | Max. | 0.357 | 0.354 | 0.345 |

Density is calculated on networks without multiple edges
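
The three statistics in the table have several competing definitions, and the source does not state which ones were used. The sketch below therefore shows only one plausible set of conventions for a directed network: density as the share of possible directed edges present (after dropping multiple edges and self-loops), reciprocity as the share of edges whose reverse edge also exists, and transitivity as the global clustering coefficient of the undirected version. The function name `network_summary` is hypothetical.

```python
from itertools import combinations


def network_summary(nodes, edges):
    """Return (density, reciprocity, transitivity) of a directed network."""
    # Collapse multiple edges and ignore self-loops.
    edge_set = {(u, v) for u, v in edges if u != v}
    n, m = len(nodes), len(edge_set)
    density = m / (n * (n - 1)) if n > 1 else 0.0
    reciprocity = sum((v, u) in edge_set for u, v in edge_set) / m if m else 0.0
    # Undirected neighbourhoods for the clustering coefficient.
    nbrs = {u: set() for u in nodes}
    for u, v in edge_set:
        nbrs[u].add(v)
        nbrs[v].add(u)
    # Connected triples centred at each node, and how many of them are closed.
    triples = sum(len(s) * (len(s) - 1) // 2 for s in nbrs.values())
    closed = sum(
        1 for u in nodes for a, b in combinations(nbrs[u], 2) if b in nbrs[a]
    )
    transitivity = closed / triples if triples else 0.0
    return density, reciprocity, transitivity


# Toy network with four airports; the duplicate A->B flight is collapsed.
flights = [("A", "B"), ("A", "B"), ("B", "A"), ("B", "C"), ("C", "A"), ("A", "D")]
print(network_summary(["A", "B", "C", "D"], flights))
```
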
Fig. 7 Distribution of the estimated coefficients in 2018, 2019 and 2020
Fig. 8 Illustration of the goodness of fit assessment for the TERGM. The considered networks belong to the period April 3–9, 2019
Fig. 9 The MCUSUM control chart (above) and the logarithmic MEWMA control chart (below). The horizontal red line corresponds to the upper control limit, and the red points mark the signals that occurred