Abstract
Information geometry provides a correspondence between differential geometry and statistics through the Fisher information matrix. In particular, given two models from the same parametric family of distributions, one can define the distance between them as the length of the geodesic connecting them in a Riemannian manifold whose metric is given by the model's Fisher information matrix. One limitation that has hindered the adoption of this similarity measure in practical applications is that the Fisher distance is typically difficult to compute robustly. We review these complications and provide a general form of the distance function for one-parameter models. We then focus on higher-dimensional extreme value models, including the generalized Pareto and generalized extreme value (GEV) distributions, that will be used in financial risk applications. First, we develop a technique to identify the nearest neighbors of a target security, in the sense that their best-fit model distributions have minimal Fisher distance to the target. Second, we develop a hierarchical clustering technique that uses the Fisher distance: we compare GEV distributions fit to block maxima of a set of equity loss distributions and group together securities whose distributions of the worst single-day loss in each year are similar.
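The abstract does not reproduce the one-parameter formula. As an illustration only, the sketch below assumes the standard one-dimensional reduction, in which the geodesic length between parameters θ₁ and θ₂ is |∫ √I(θ) dθ| taken between them, with I(θ) the scalar Fisher information. The function names and the choice of the exponential distribution (rate λ, I(λ) = 1/λ², closed-form distance |ln(λ₂/λ₁)|) are assumptions made for this example, not taken from the paper.

```python
import math

def fisher_distance_1d(fisher_info, theta_a, theta_b, n_steps=1000):
    """Approximate |integral of sqrt(I(theta))| between two parameter values.

    Uses composite Simpson's rule; `fisher_info` is a hypothetical callable
    returning the scalar Fisher information at a parameter value.
    """
    if n_steps % 2:
        n_steps += 1  # Simpson's rule needs an even number of subintervals
    lo, hi = sorted((theta_a, theta_b))
    h = (hi - lo) / n_steps
    total = math.sqrt(fisher_info(lo)) + math.sqrt(fisher_info(hi))
    for i in range(1, n_steps):
        weight = 4 if i % 2 else 2
        total += weight * math.sqrt(fisher_info(lo + i * h))
    return total * h / 3

def exponential_fisher_info(rate):
    """Fisher information of an exponential distribution with the given rate."""
    return 1.0 / rate ** 2

# Compare the numerical integral with the closed form |ln(rate_b / rate_a)|.
numeric = fisher_distance_1d(exponential_fisher_info, 0.5, 2.0)
closed_form = abs(math.log(2.0 / 0.5))
```

In this example the two values should agree to within the quadrature error. For multi-parameter families such as the generalized Pareto, the distance requires solving for a geodesic rather than evaluating a single integral, which this sketch does not attempt.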
Keywords: clustering; information geometry; quantitative financial risk
Year: 2019 PMID: 33266826 PMCID: PMC7514593 DOI: 10.3390/e21020110
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
Figure 1. Comparison between pairwise KL divergence and Fisher information metric values for NASDAQ 100 parameters and distance functions to a distribution. Note that the KL divergence concentrates a number of distance values near zero and also has several large outliers, whereas the Fisher distance distribution decreases roughly linearly for increasing distance.
Figure 2. Generalized Pareto distributions fit using the hybrid method of [45] and empirical loss distribution histograms.
Figure 3. Geodesic paths between generalized Pareto distribution parameters of AAPL and 50 randomly selected NASDAQ 100 securities.
Ranked distances from AAPL to the 20 nearest and 20 farthest stocks.
| Rank | Stock | Distance | Rank | Stock | Distance | Rank | Stock | Distance | Rank | Stock | Distance |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | CHKP | 0.012 | 11 | KLAC | 0.055 | 86 | WYNN | 0.397 | 96 | MELI | 0.533 |
| 2 | XLNX | 0.012 | 12 | FOX | 0.055 | 87 | PAYX | 0.425 | 97 | KHC | 0.540 |
| 3 | MCHP | 0.020 | 13 | FOXA | 0.056 | 88 | AVGO | 0.437 | 98 | LILAK | 0.544 |
| 4 | QVCA | 0.027 | 14 | VIAB | 0.060 | 89 | FB | 0.440 | 99 | VRTX | 0.579 |
| 5 | ISRG | 0.033 | 15 | SBUX | 0.061 | 90 | NFLX | 0.467 | 100 | TSLA | 0.612 |
| 6 | LBTYA | 0.037 | 16 | TXN | 0.063 | 91 | EA | 0.472 | 101 | SWKS | 0.612 |
| 7 | CTSH | 0.043 | 17 | WBA | 0.064 | 92 | BMRN | 0.487 | 102 | AAL | 0.672 |
| 8 | ADI | 0.048 | 18 | JBHT | 0.073 | 93 | REGN | 0.508 | 103 | CTRP | 0.678 |
| 9 | MAT | 0.048 | 19 | ULTA | 0.073 | 94 | ALXN | 0.514 | 104 | LILA | 0.727 |
| 10 | CHTR | 0.054 | 20 | PCLN | 0.077 | 95 | BIDU | 0.533 | 105 | INCY | 0.752 |
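Once the distances to a target security are available, producing a ranking like the one in the table is a simple selection step. The sketch below is a minimal illustration, assuming the distances are held in a dictionary keyed by ticker; the helper name `nearest_neighbors` is hypothetical, and the values are copied from a handful of rows of the table above.

```python
import heapq

# Distances from AAPL, copied from a few rows of the table above.
distances_from_aapl = {
    "TSLA": 0.612,
    "CHKP": 0.012,
    "INCY": 0.752,
    "XLNX": 0.012,
    "MCHP": 0.020,
    "WYNN": 0.397,
}

def nearest_neighbors(distances, k):
    """Return the k (ticker, distance) pairs with the smallest distance."""
    return heapq.nsmallest(k, distances.items(), key=lambda item: item[1])

nearest = nearest_neighbors(distances_from_aapl, 3)
for ticker, distance in nearest:
    print(f"{ticker}: {distance:.3f}")
```

The costly part in practice is computing each distance, not the ranking itself.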
Figure 4. Generalized Pareto distribution fits for AAPL (black), its 10 nearest neighbors (red), and the remaining NASDAQ 100 stocks (blue).
Figure 5. Portions of the dendrogram of a Fisher-distance-based hierarchical clustering of the best-fit GEV distributions for the block maxima of S&P 500 stocks.
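To make the clustering step concrete, the sketch below runs agglomerative clustering on a precomputed pairwise distance matrix. It is an illustration under stated assumptions, not the paper's procedure: the linkage rule (complete linkage by default) is a choice made here, the function name is hypothetical, the four labels and their distances are made-up toy values, and the function returns flat clusters rather than a full dendrogram.

```python
def agglomerative_clusters(labels, dist, n_clusters, linkage=max):
    """Merge the two closest clusters until `n_clusters` remain.

    `dist[i][j]` is a precomputed symmetric distance between items i and j
    (for example a Fisher distance). `linkage` reduces the pairwise distances
    between two clusters: max gives complete linkage, min gives single linkage.
    """
    clusters = [[i] for i in range(len(labels))]

    def cluster_distance(a, b):
        return linkage(dist[i][j] for i in a for j in b)

    while len(clusters) > n_clusters:
        best = None  # (distance, index of first cluster, index of second)
        for p in range(len(clusters)):
            for q in range(p + 1, len(clusters)):
                d = cluster_distance(clusters[p], clusters[q])
                if best is None or d < best[0]:
                    best = (d, p, q)
        _, p, q = best
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]  # q > p, so index p is unaffected

    return [[labels[i] for i in cluster] for cluster in clusters]

# Toy symmetric distance matrix for four hypothetical securities.
toy_labels = ["A", "B", "C", "D"]
toy_dist = [
    [0.00, 0.10, 0.90, 1.00],
    [0.10, 0.00, 0.80, 0.95],
    [0.90, 0.80, 0.00, 0.20],
    [1.00, 0.95, 0.20, 0.00],
]
groups = agglomerative_clusters(toy_labels, toy_dist, n_clusters=2)
```

With these toy values, A and B merge first and C and D merge second, leaving two groups. This brute-force loop is only suitable for small inputs; a library implementation would be the practical choice for a universe the size of the S&P 500.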