Jalshayin Bhachech, Arnab Chakrabarti, Taisei Kaizoji, Anindya S Chakrabarti.
Abstract
What determines the stability of networks inferred from the dynamical behavior of a system? Internal and external shocks can destabilize the topological properties of comovement networks. In real-world data, this creates a trade-off between identifying turbulent periods and the problem of high dimensionality. Longer time series reduce the problem of high dimensionality but mix turbulent and non-turbulent periods. Shorter time series identify periods of turbulence more accurately but introduce the problem of high dimensionality, so that the underlying linkages cannot be estimated precisely. In this paper, we exploit high-frequency multivariate financial data to analyze the origin of instability in the inferred networks during periods free from external disturbances. We show that the topological properties captured via centrality ordering are highly unstable even during such non-turbulent periods. Simulation results with multivariate Gaussian and fat-tailed stochastic processes calibrated to financial data show that both sampling frequency and the presence of outliers cause instability in the inferred network. We conclude that instability of network properties does not necessarily indicate systemic instability.
Year: 2022 PMID: 35496353 PMCID: PMC9035503 DOI: 10.1140/epjb/s10051-022-00332-x
Source DB: PubMed Journal: Eur Phys J B ISSN: 1434-6028 Impact factor: 1.398
Fig. 1 Properties of financial networks. We use intraday data for all of the NIFTY-50 stocks listed in Table 4. Panel (a): Depiction of the correlation network created from return data on 22.12.2020 sampled at 10 s intervals. The network has been thresholded for visual clarity. Panel (b): Heatmap of the correlation matrix of the PageRank vectors across the inferred networks constructed with different sampling frequencies, viz. 10 s, 30 s, 1 min, 2 min, 5 min, 10 min, 30 min and 1 h. Panel (c): Distribution of normalized PageRank across the stocks shown in panel (a)
Correlations between the PageRank vectors obtained at different pairs of sampling frequencies, based on the average subsample correlation matrix. Mean correlations are reported, with the corresponding standard deviations in brackets. Larger differences between frequencies are associated with lower correlations
| Frequency | 10sec | 30sec | 1min | 2min | 5min | 10min | 30min | 1hr |
|---|---|---|---|---|---|---|---|---|
| 10sec | 1.00 | 0.99 | 0.99 | 0.97 | 0.89 | 0.78 | 0.51 | 0.32 |
| 30sec | 0.99 | 1.00 | 0.99 | 0.97 | 0.89 | 0.78 | 0.52 | 0.32 |
| 1min | 0.99 | 0.99 | 1.00 | 0.97 | 0.89 | 0.79 | 0.52 | 0.32 |
| 2min | 0.97 | 0.97 | 0.97 | 1.00 | 0.89 | 0.80 | 0.52 | 0.31 |
| 5min | 0.89 | 0.89 | 0.89 | 0.89 | 1.00 | 0.81 | 0.51 | 0.31 |
| 10min | 0.78 | 0.78 | 0.79 | 0.80 | 0.81 | 1.00 | 0.62 | 0.31 |
| 30min | 0.51 | 0.52 | 0.52 | 0.52 | 0.51 | 0.62 | 1.00 | 0.60 |
| 1hr | 0.32 | 0.32 | 0.32 | 0.31 | 0.31 | 0.37 | 0.60 | 1.00 |
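
A table of this kind can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes edge weights are absolute correlations, uses a plain power-iteration PageRank with damping 0.85, and replaces the NIFTY-50 data with synthetic one-factor returns. The helper names `pagerank`, `aggregate_returns` and `pagerank_correlation_table` are hypothetical.

```python
import numpy as np

def pagerank(weights, damping=0.85, tol=1e-12, max_iter=1000):
    """PageRank of a weighted undirected network, by power iteration."""
    w = np.abs(np.asarray(weights, dtype=float))  # edge weight = |correlation| (assumption)
    np.fill_diagonal(w, 0.0)                      # drop self-loops
    col_sums = w.sum(axis=0)
    col_sums[col_sums == 0] = 1.0                 # avoid division by zero for isolated nodes
    transition = w / col_sums                     # each column now sums to 1
    n = w.shape[0]
    rank = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new_rank = damping * (transition @ rank) + (1.0 - damping) / n
        converged = np.abs(new_rank - rank).sum() < tol
        rank = new_rank
        if converged:
            break
    return rank

def aggregate_returns(returns, k):
    """Sum k consecutive log returns to get returns at a k-times lower frequency."""
    n = (returns.shape[0] // k) * k
    return returns[:n].reshape(-1, k, returns.shape[1]).sum(axis=1)

def pagerank_correlation_table(returns_by_freq):
    """Pearson correlation between the PageRank vectors of every pair of frequencies."""
    labels = list(returns_by_freq)
    ranks = np.array([pagerank(np.corrcoef(returns_by_freq[f], rowvar=False))
                      for f in labels])
    return labels, np.corrcoef(ranks)

# Synthetic one-factor returns standing in for 10 s data (not the NIFTY-50 data).
rng = np.random.default_rng(0)
p = 10
loadings = np.linspace(0.2, 1.0, p)
base = rng.standard_normal((3000, 1)) * loadings + rng.standard_normal((3000, p))
returns_by_freq = {
    "10sec": base,
    "1min": aggregate_returns(base, 6),
    "5min": aggregate_returns(base, 30),
}
labels, table = pagerank_correlation_table(returns_by_freq)
print(labels)
print(np.round(table, 2))
```

The diagonal is 1 by construction, and the off-diagonal entries compare the centrality ordering obtained at two different sampling frequencies.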
List of equities in NIFTY-50 dataset
| Ticker | Name |
|---|---|
| RIL | RELIANCE INDUSTRIES LTD |
| TCS | TATA CONSULTANCY SVCS LTD |
| HDFCB | HDFC BANK LIMITED |
| INFO | INFOSYS LTD |
| HUVR | HINDUSTAN UNILEVER LTD |
| HDFC | HOUSING DEVELOPMENT FINANCE |
| ICICIBC | ICICI BANK LTD |
| KMB | KOTAK MAHINDRA BANK LTD |
| BHARTI | BHARTI AIRTEL LTD |
| BAF | BAJAJ FINANCE LTD |
| SBIN | STATE BANK OF INDIA |
| ITC | ITC LTD |
| HCLT | HCL TECHNOLOGIES LTD |
| WPRO | WIPRO LTD |
| APNT | ASIAN PAINTS LTD |
| MSIL | MARUTI SUZUKI INDIA LTD |
| AXSB | AXIS BANK LTD |
| LT | LARSEN & TOUBRO LTD |
| UTCEM | ULTRATECH CEMENT LTD |
| NEST | NESTLE INDIA LTD |
| BJFIN | BAJAJ FINSERV LTD |
| SUNP | SUN PHARMACEUTICAL INDUS |
| HDFCLIFE | HDFC LIFE INSURANCE CO LTD |
| TTAN | TITAN CO LTD |
| BJAUT | BAJAJ AUTO LTD |
| ONGC | OIL & NATURAL GAS CORP LTD |
| ADSEZ | ADANI PORTS AND SPECIAL ECON |
| TTMT | TATA MOTORS LTD |
| PWGR | POWER GRID CORP OF INDIA LTD |
| MM | MAHINDRA & MAHINDRA LTD |
| SRCM | SHREE CEMENT LTD |
| DIVI | DIVI’S LABORATORIES LTD |
| JSTL | JSW STEEL LTD |
| NTPC | NTPC LTD |
| IOCL | INDIAN OIL CORP LTD |
| TECHM | TECH MAHINDRA LTD |
| BPCL | BHARAT PETROLEUM CORP LTD |
| SBILIFE | SBI LIFE INSURANCE CO LTD |
| BRIT | BRITANNIA INDUSTRIES LTD |
| COAL | COAL INDIA LTD |
| EIM | EICHER MOTORS LTD |
| GRASIM | GRASIM INDUSTRIES LTD |
| DRRD | DR. REDDY’S LABORATORIES |
| TATA | TATA STEEL LTD |
| IIB | INDUSIND BANK LTD |
| CIPLA | CIPLA LTD |
| HMCL | HERO MOTOCORP LTD |
| GAIL | GAIL INDIA LTD |
| HNDL | HINDALCO INDUSTRIES LTD |
| UPLL | UPL LTD |
Correlation between the PageRanks across days, averaged over all possible pairs of days in the sample, for different frequencies (f). The average correlation decreases for lower frequencies. The correlation values for RCV are higher than those for the noise-corrected covariance matrix
| Frequency | Average correlation (RCV) | Average correlation (noise-corrected covariance matrix) |
|---|---|---|
| f=10sec | 0.64 (± 0.13) | 0.38 (± 0.16) |
| f=30sec | 0.56 (± 0.14) | 0.37 (± 0.16) |
| f=1min | 0.53 (± 0.14) | 0.37 (± 0.16) |
| f=2min | 0.45 (± 0.15) | 0.37 (± 0.16) |
| f=5min | 0.30 (± 0.16) | 0.34 (± 0.15) |
| f=10min | 0.20 (± 0.19) | 0.20 (± 0.17) |
| f=30min | 0.07 (± 0.16) | 0.09 (± 0.16) |
| f=1hr | 0.02 (± 0.15) | 0.02 (± 0.16) |
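
The averaging over all pairs of days can be sketched as below. This is an illustrative assumption about the computation rather than the authors' implementation: it takes one PageRank vector per day, uses Pearson correlation, and reports the population standard deviation. The function name `average_pairwise_correlation` and the toy numbers are hypothetical.

```python
import numpy as np

def average_pairwise_correlation(daily_vectors):
    """Mean and standard deviation of the Pearson correlation over all pairs of days."""
    vectors = np.asarray(daily_vectors, dtype=float)   # shape: (days, stocks)
    corr = np.corrcoef(vectors)                        # days x days correlation matrix
    upper = np.triu_indices(vectors.shape[0], k=1)     # each unordered pair of days once
    pair_corrs = corr[upper]
    return pair_corrs.mean(), pair_corrs.std()

# Toy PageRank vectors for three days and three stocks (illustrative numbers only).
daily_pagerank = [
    [0.2, 0.3, 0.5],
    [0.1, 0.3, 0.6],
    [0.5, 0.3, 0.2],
]
mean_corr, std_corr = average_pairwise_correlation(daily_pagerank)
print(f"{mean_corr:.2f} (± {std_corr:.2f})")
```

A value near 1 would mean that the centrality ordering is nearly the same on every day; values near 0 indicate that the ordering on one day says little about the ordering on another.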
Robustness check: Correlations between the PageRank vectors obtained at different pairs of sampling frequencies, based on the average subsample correlation matrix computed from an additional 138 days of data. Mean correlations are reported, with the corresponding standard deviations in brackets. Larger differences between frequencies are associated with lower correlations. The values are quite similar to those reported in Table 6, which indicates that the method discussed in the paper is robust
| Frequency | 10sec | 30sec | 1min | 2min | 5min | 10min | 30min | 1hr |
|---|---|---|---|---|---|---|---|---|
| 10sec | 1 | 0.99 | 0.99 | 0.96 | 0.88 | 0.78 | 0.51 | 0.36 |
| 30sec | 0.99 | 1 | 0.99 | 0.97 | 0.88 | 0.78 | 0.51 | 0.36 |
| 1min | 0.99 | 0.99 | 1 | 0.97 | 0.88 | 0.78 | 0.51 | 0.37 |
| 2min | 0.96 | 0.97 | 0.97 | 1 | 0.88 | 0.79 | 0.51 | 0.35 |
| 5min | 0.88 | 0.88 | 0.88 | 0.88 | 1 | 0.81 | 0.49 | 0.33 |
| 10min | 0.78 | 0.78 | 0.78 | 0.79 | 0.80 | 1 | 0.61 | 0.41 |
| 30min | 0.50 | 0.51 | 0.51 | 0.51 | 0.49 | 0.62 | 1 | 0.65 |
| 1hr | 0.36 | 0.36 | 0.37 | 0.35 | 0.33 | 0.41 | 0.65 | 1 |
Robustness check: Correlation between the PageRanks corresponding to different frequencies (f), averaged across all possible pairs of days in the additional dataset of 138 days. The average correlation decreases for lower frequencies. The correlation values for RCV are higher than those for the noise-corrected covariance matrix. The values are close to those in Table 1, which indicates that the results are not sensitive to sampling fluctuations
| Frequency | Average correlation (RCV) | Average correlation (noise-corrected covariance matrix) |
|---|---|---|
| f=10sec | 0.65 | 0.35 |
| f=30sec | 0.54 | 0.35 |
| f=1min | 0.47 | 0.35 |
| f=2min | 0.38 | 0.35 |
| f=5min | 0.26 | 0.31 |
| f=10min | 0.18 | 0.23 |
| f=30min | 0.07 | 0.08 |
| f=1hr | 0.03 | 0.03 |
Robustness check through an alternative stability measure: Heatmap of the Spearman correlation matrix of the matrix row-sums. Instead of the PageRank, here we take the p-dimensional row-sum of the matrix as a lower-dimensional representation of the underlying network structure. To check the stability of the rank ordering of the stocks, we calculate the Spearman correlation between the row-sums of the matrices corresponding to different sampling frequencies. We observe a pattern similar to the one we find for PageRank, suggesting that the stability of the network can be significantly affected by sampling frequency and sample size
| Frequency | 10sec | 30sec | 1min | 2min | 5min | 10min | 30min | 1hr |
|---|---|---|---|---|---|---|---|---|
| 10sec | 1 | 0.99 | 0.99 | 0.97 | 0.89 | 0.78 | 0.52 | 0.33 |
| (± 0.00) | (± 0.00) | (± 0.01) | (± 0.05) | (± 0.11) | (± 0.18) | (± 0.19) | ||
| 30sec | 0.99 | 1 | 0.99 | 0.97 | 0.89 | 0.78 | 0.52 | 0.33 |
| (± 0.00) | (± 0.00) | (± 0.01) | (± 0.05) | (± 0.11) | (± 0.18) | (± 0.19) | ||
| 1min | 0.99 | 0.99 | 1 | 0.97 | 0.89 | 0.78 | 0.52 | 0.33 |
| (± 0.00) | (± 0.00) | (± 0.01) | (± 0.05) | (± 0.11) | (± 0.18) | (± 0.19) | ||
| 2min | 0.97 | 0.97 | 0.97 | 1 | 0.88 | 0.80 | 0.52 | 0.32 |
| (± 0.01) | (± 0.01) | (± 0.01) | (± 0.05) | (± 0.11) | (± 0.19) | |||
| 5min | 0.89 | 0.89 | 0.89 | 0.88 | 1 | 0.81 | 0.51 | 0.32 |
| (± 0.05) | (± 0.05) | (± 0.05) | (± 0.05) | (± 0.09) | (± 0.19) | (± 0.21) | ||
| 10min | 0.78 | 0.78 | 0.78 | 0.80 | 0.81 | 1 | 0.62 | 0.38 |
| (± 0.10) | (± 0.11) | (± 0.09) | (± 0.16) | (± 0.18) | ||||
| 30min | 0.51 | 0.52 | 0.52 | 0.52 | 0.51 | 0.62 | 1 | 0.61 |
| (± 0.17) | (± 0.17) | (± 0.17) | (± 0.17) | (± 0.19) | (± 0.16) | (± 0.14) | ||
| 1hr | 0.33 | 0.33 | 0.33 | 0.32 | 0.32 | 0.38 | 0.61 | 1 |
| (± 0.19) | (± 0.19) | (± 0.21) | (± 0.14) |
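
The row-sum measure can be sketched as follows. This is a minimal sketch under stated assumptions: Spearman correlation is computed as the Pearson correlation of ranks, ties are broken by position (adequate for continuous data), and the small matrices are illustrative only. The helper names `ranks` and `spearman_of_row_sums` are hypothetical.

```python
import numpy as np

def ranks(values):
    """Ranks 0..n-1 of the entries (ties broken by position; fine for continuous data)."""
    order = np.argsort(values)
    result = np.empty(len(values), dtype=float)
    result[order] = np.arange(len(values))
    return result

def spearman_of_row_sums(matrix_a, matrix_b):
    """Spearman correlation between the row-sums of two matrices of equal size."""
    sums_a = np.asarray(matrix_a, dtype=float).sum(axis=1)
    sums_b = np.asarray(matrix_b, dtype=float).sum(axis=1)
    return np.corrcoef(ranks(sums_a), ranks(sums_b))[0, 1]

# Illustrative 3 x 3 correlation matrices for two sampling frequencies.
corr_high_freq = np.array([[1.0, 0.2, 0.1],
                           [0.2, 1.0, 0.5],
                           [0.1, 0.5, 1.0]])
corr_low_freq = corr_high_freq[::-1, ::-1]   # same entries with the stock order reversed
print(spearman_of_row_sums(corr_high_freq, corr_high_freq))
print(spearman_of_row_sums(corr_high_freq, corr_low_freq))
```

Because only the rank ordering of the row-sums enters, the measure is unchanged by any monotone rescaling of the matrix entries that preserves that ordering.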
Fig. 2 Instability of simulated networks induced by sampling frequencies. Main panel: The average correlations of PageRanks between the network with known covariance matrix and realized sample networks. We simulate sample networks by Eq. 1, calibrating the covariance matrix, and generate simulated data of the same size as the baseline dataset at each frequency. Along with the point estimates of the average correlations between PageRanks, we show the corresponding standard deviations as red bars. Inset: The sample size corresponding to each frequency. Sample size decreases for lower frequencies, leading to an increase in the standard deviation in the main panel
Simulation exercises with a Gaussian process to capture the similarity between the true network with known covariance matrix and realized sample networks. This table provides average correlations of PageRanks between the true network and the sample networks for different frequencies. Lower frequencies are associated with lower similarity. This model captures the effects of sampling frequency alone. It can be compared with Table 5, which combines the effects of extreme fluctuations and sampling frequency. The effects of extreme fluctuations are evident in the low values of the correlations in Table 5, especially for smaller degrees of freedom, which indicate fatter tails
| | 10sec | 30sec | 1min | 2min | 5min | 10min | 30min | 1hr |
|---|---|---|---|---|---|---|---|---|
| TRUE network | 0.95 | 0.88 | 0.81 | 0.70 | 0.53 | 0.40 | 0.24 | 0.09 |
| Sample size | 2550 | 849 | 424 | 212 | 84 | 42 | 14 | 7 |
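
The effect of sample size on the similarity between a known network and its sample estimate can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' calibration: the true covariance is a hypothetical one-factor matrix rather than one fitted to the data, edge weights are absolute correlations, and `pagerank` and `simulate_similarity` are illustrative helpers.

```python
import numpy as np

def pagerank(weights, damping=0.85, tol=1e-12, max_iter=1000):
    """PageRank of a weighted undirected network, by power iteration."""
    w = np.abs(np.asarray(weights, dtype=float))  # edge weight = |correlation| (assumption)
    np.fill_diagonal(w, 0.0)
    col_sums = w.sum(axis=0)
    col_sums[col_sums == 0] = 1.0                 # avoid division by zero
    transition = w / col_sums
    n = w.shape[0]
    rank = np.full(n, 1.0 / n)
    for _ in range(max_iter):
        new_rank = damping * (transition @ rank) + (1.0 - damping) / n
        converged = np.abs(new_rank - rank).sum() < tol
        rank = new_rank
        if converged:
            break
    return rank

def simulate_similarity(true_cov, sample_sizes, n_reps=20, seed=0):
    """Mean and std of corr(PageRank(true), PageRank(sample)) for each sample size."""
    rng = np.random.default_rng(seed)
    p = true_cov.shape[0]
    scale = np.sqrt(np.diag(true_cov))
    true_rank = pagerank(true_cov / np.outer(scale, scale))
    results = {}
    for n in sample_sizes:
        sims = []
        for _ in range(n_reps):
            draws = rng.multivariate_normal(np.zeros(p), true_cov, size=n)
            sample_rank = pagerank(np.corrcoef(draws, rowvar=False))
            sims.append(np.corrcoef(true_rank, sample_rank)[0, 1])
        results[n] = (float(np.mean(sims)), float(np.std(sims)))
    return results

# Hypothetical one-factor covariance with unit variances (not calibrated to data).
p = 15
loadings = np.linspace(0.2, 0.9, p)
true_cov = np.outer(loadings, loadings) + np.diag(1.0 - loadings**2)

sample_sizes = [2550, 212, 7]
results = simulate_similarity(true_cov, sample_sizes)
for n in sample_sizes:
    mean_corr, std_corr = results[n]
    print(f"n={n}: {mean_corr:.2f} (± {std_corr:.2f})")
```

The data-generating process is identical for every sample size, so any loss of similarity in this sketch comes from estimation error alone.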
Stability of networks generated from simulated multivariate Gaussian processes. The table shows correlations between the PageRank vectors obtained at different pairs of sampling frequencies, based on the average covariance matrix RCV obtained from simulation. Standard deviations corresponding to the point estimates of the correlations across PageRanks are given in brackets. Larger differences in frequencies are associated with lower correlations between the resulting networks. Since the underlying stochastic process is Gaussian, the decay in correlations can be attributed to the differences in sample sizes across frequencies
| Frequency | 10sec | 30sec | 1min | 2min | 5min | 10min | 30min | 1hr |
|---|---|---|---|---|---|---|---|---|
| 10sec | 1.00 | 0.96 | 0.89 | 0.81 | 0.62 | 0.47 | 0.27 | 0.19 |
| 30sec | 0.96 | 1.00 | 0.94 | 0.84 | 0.65 | 0.50 | 0.29 | 0.20 |
| 1min | 0.89 | 0.94 | 1.00 | 0.90 | 0.70 | 0.54 | 0.31 | 0.21 |
| 2min | 0.81 | 0.84 | 0.90 | 1.00 | 0.76 | 0.61 | 0.36 | 0.24 |
| 5min | 0.62 | 0.65 | 0.70 | 0.76 | 1.00 | 0.79 | 0.47 | 0.30 |
| 10min | 0.47 | 0.50 | 0.54 | 0.61 | 0.79 | 1.00 | 0.60 | 0.41 |
| 30min | 0.27 | 0.29 | 0.31 | 0.36 | 0.47 | 0.60 | 1.00 | 0.66 |
| 1hr | 0.19 | 0.20 | 0.21 | 0.24 | 0.30 | 0.41 | 0.66 | 1.00 |
Stability of networks generated from simulated multivariate Gaussian processes (identical to the simulation in Table 7). The table reports the similarity of the networks constructed at different frequencies over the span of the data. Networks constructed from higher-frequency data are relatively more stable across days. Networks constructed from lower-frequency data, on the other hand, show much less stability. This simulation exercise assumes an identical stochastic data-generating process, and therefore the decay in network stability can be attributed solely to sampling frequency
| Frequency | Average correlation | Sample size |
|---|---|---|
| f=10 s | 0.96 (± 0.01) | 2550 |
| f=30 s | 0.88 (± 0.02) | 849 |
| f=1 min | 0.78 (± 0.04) | 424 |
| f=2 min | 0.63 (± 0.08) | 212 |
| f=5 min | 0.37 (± 0.13) | 84 |
| f=10 min | 0.22 (± 0.14) | 42 |
| f=30 min | 0.07 (± 0.14) | 14 |
| f=1 h | 0.03 (± 0.13) | 7 |
Fig. 3 Box-plot of the error distribution across edges of simulated networks with known covariance matrices, for four different frequencies (10 s, 30 s, 1 min, and 2 min, respectively). The simulated data are the same as those used in Table 7. The error for a given edge is defined as the difference between the simulated weight and the underlying true weight. The variance of the error distribution increases for lower frequencies. Additionally, the median shifts further from zero as the frequency decreases. Since the underlying process is Gaussian, the monotonic variation in the dispersion of the error distribution can be attributed to changes in the sampling frequency
Stability of networks generated from simulated multivariate Gaussian processes with time-varying volatility (using stochastic processes described by Eqs. 10 and 11). Correlation of PageRanks between the true network (from ) and the realized network constructed from RCV matrix. Following the process described in the text, we estimate the values of and as 3.33, 1 and 1.2 respectively
| Frequency | Average correlation | Sample size |
|---|---|---|
| f=10sec | 0.94 | 2160 |
| f=30sec | 0.86 | 720 |
| f=1min | 0.77 | 360 |
| f=2min | 0.66 | 182 |
| f=5min | 0.47 | 72 |
| f=10min | 0.39 | 36 |
| f=30min | 0.19 | 12 |
| f=1hr | 0.15 | 6 |
Fig. 5 ACF and PACF obtained from 10 s and 1 min returns data of TCS on a single day. The 10 s returns show significant time-dependence at lag 1 in both the ACF and the partial ACF, unlike the 1 min returns
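
A sample autocorrelation function of the kind plotted in such a figure can be sketched as follows. This is an illustration only: the series is a synthetic MA(1) process with a negative lag-1 dependence, chosen as a stand-in rather than taken from the TCS data, and the approximate 95% band of ±1.96/√n assumes white noise under the null. The function name `acf` is hypothetical, and the partial ACF is not computed here.

```python
import numpy as np

def acf(series, max_lag):
    """Sample autocorrelation at lags 0..max_lag."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:len(x) - k], x[k:]) / denom
                     for k in range(max_lag + 1)])

# Synthetic MA(1) returns: x_t = e_t - 0.5 * e_{t-1} (stand-in, not market data).
rng = np.random.default_rng(2)
noise = rng.standard_normal(5001)
returns = noise[1:] - 0.5 * noise[:-1]

acf_values = acf(returns, max_lag=10)
band = 1.96 / np.sqrt(len(returns))          # approximate 95% band under white noise
significant_lags = [k for k in range(1, 11) if abs(acf_values[k]) > band]
print(np.round(acf_values[:4], 2), round(band, 3), significant_lags)
```

For this process the theoretical lag-1 autocorrelation is −0.5/(1 + 0.25) = −0.4, and all higher lags are zero.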
Fig. 4 Instability of networks induced by extreme fluctuations. Left panel: Histogram of degrees of freedom (df) of t-distributions calibrated to 1 min return data for all 50 stocks in our dataset. The bulk of the distribution is concentrated between 2.5 and 4 degrees of freedom. Right panel: Correlation of the PageRanks between networks across days, averaged for each frequency considered. The black line with stars corresponds to the empirical data. The other lines, with different markers, are obtained from data simulated from a multivariate t-distribution with degrees of freedom ranging from 2.8 to 4.2. Smaller degrees of freedom lead to lower correlations. The overall pattern of decay in correlation is visible in all the cases considered. In particular, the empirical pattern appears to match a t-distributed multivariate stochastic process with around 3.5 degrees of freedom
Simulation exercises to capture the effects of extreme fluctuations on the similarity between the true network with known covariance matrix and realized sample networks. This table provides average correlations of PageRanks between the true network and the sample networks for different frequencies and different degrees of freedom. Lower frequencies are associated with lower similarity. This model captures the effects of extreme fluctuations as well as sampling frequency
| | 10 s | 30 s | 1 min | 2 min | 5 min | 10 min | 30 min | 1 h |
|---|---|---|---|---|---|---|---|---|
| TRUE network (df=4) | 0.83 | 0.79 | 0.71 | 0.61 | 0.45 | 0.34 | 0.19 | 0.09 |
| TRUE network (df=6) | 0.92 | 0.85 | 0.77 | 0.66 | 0.47 | 0.36 | 0.20 | 0.10 |
| TRUE network (df=3) | 0.45 | 0.44 | 0.42 | 0.38 | 0.33 | 0.27 | 0.16 | 0.10 |
| Sample size | 2550 | 849 | 424 | 212 | 84 | 42 | 14 | 7 |
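
Fat-tailed draws of the kind used in such an exercise can be generated with the standard normal/chi-square mixture construction of the multivariate t-distribution. The sketch below is an assumption about one reasonable way to do this, not the authors' code: the 2 × 2 scale matrix is illustrative, and `sample_multivariate_t` and `excess_kurtosis` are hypothetical helper names.

```python
import numpy as np

def sample_multivariate_t(scale, df, size, rng):
    """Draw from a multivariate t: Gaussian draws divided by sqrt(chi-square / df)."""
    p = scale.shape[0]
    gaussian = rng.multivariate_normal(np.zeros(p), scale, size=size)
    mixing = rng.chisquare(df, size=size) / df      # one mixing variable per observation
    return gaussian / np.sqrt(mixing)[:, None]

def excess_kurtosis(x):
    """Sample excess kurtosis (close to 0 for Gaussian data)."""
    centered = x - x.mean()
    return (centered**4).mean() / (centered**2).mean()**2 - 3.0

rng = np.random.default_rng(1)
scale = np.array([[1.0, 0.5],
                  [0.5, 1.0]])                      # illustrative scale matrix
t_draws = sample_multivariate_t(scale, df=3, size=20000, rng=rng)
normal_draws = rng.multivariate_normal(np.zeros(2), scale, size=20000)

print("t (df=3):", excess_kurtosis(t_draws[:, 0]))
print("Gaussian:", excess_kurtosis(normal_draws[:, 0]))
```

Smaller degrees of freedom produce more extreme observations, which in turn make sample correlations, and hence the inferred network, noisier at any given sample size.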