Luca Allodi, Fabio Massacci, Julian Williams.
Abstract
The assumption that a cyberattacker will potentially exploit all present vulnerabilities drives most modern cyber risk management practices and the corresponding security investments. We propose a new attacker model, based on dynamic optimization, where we demonstrate that large, initial, fixed costs of exploit development induce attackers to delay implementation and deployment of exploits of vulnerabilities. The theoretical model predicts that mass attackers will preferably (i) exploit only one vulnerability per software version, (ii) largely include only vulnerabilities requiring low attack complexity, and (iii) be slow at trying to weaponize new vulnerabilities. These predictions are empirically validated on a large data set of observed massed attacks launched against a large collection of information systems. Findings in this article allow cyber risk managers to better concentrate their efforts for vulnerability management, and set a new theoretical and empirical basis for further research defining attacker (offensive) processes.
Keywords: Cyber security; hackers model; risk management; update costs
Year: 2021 PMID: 33960506 PMCID: PMC9543271 DOI: 10.1111/risa.13732
Source DB: PubMed Journal: Risk Anal ISSN: 0272-4332 Impact factor: 4.302
Summary of Predictions Derived from the Model
| Model Variable | Regressor | Expectation | Hypothesis | Rationale |
|---|---|---|---|---|
| | | | | Shorter exploitation times are associated with more vulnerable systems, hence |
| | | | | The introduction of a new reliable, low‐complexity exploit minimizes implementation costs, thus |
| | | | | High‐impact vulnerabilities give the attacker complete control of the attacked systems, hence |
| | | | | Selecting a higher‐impact exploit for a new vulnerability increases the expected revenue and increases the fraction of newly controlled systems with respect to the old vulnerability. |
Parameters and Variables from the Model
| Parameter | Description |
|---|---|
| Variable | |
| | Continuous time index. |
| | An update time when an attacker updates the vulnerabilities, indexed by |
| | The universe of known vulnerabilities affecting all systems. |
| | The total number of target machines affected by vulnerabilities in |
| | The subset of vulnerabilities in |
| | The fraction of |
| | Revenue function from successful attacks. |
| | Variable cost function for deploying attacks. |
| | Fixed cost of adding at |
| | Arrival rate of vulnerability patches to the universe of systems. |
| | Discount rate of the attacker. |
| | Profit function for a given set of vulnerabilities |
Fig 1. Computing the delay between attacks against different vulnerabilities.
Note: Change in the number of attacked systems for two attacks against different systems days apart. The first attack happens at and the number of attacked systems is derived from Equation (1) as . The number of systems attacked by the new exploit introduced at is derived as .
Fig 2. Distribution of time between subsequent attacks with similar signatures.
Note: Fraction of systems receiving the same attack repeatedly in time (red, solid) compared to those receiving a second attack against a different vulnerability (black, dashed). The vertical line indicates the number of days after the first attack beyond which a system is more likely to receive an attack against a new vulnerability than against an old one.
Variables Included in Our Data Set
| Variable | Description |
|---|---|
| | The identifier of the previous and the current vulnerability. |
| | The delay expressed in fraction of year between the first and the second attacks. |
| | The number of detected attacks for the pair |
| | The number of systems attacked by the pair. |
| | The Complexity of the vulnerability as indicated by its CVSS assessment. Can be either |
| | The impact of the vulnerability measured over the loss in confidentiality, integrity, and availability of the affected information. It is computed on a scale from 0 to 10 where 10 represents maximum loss in all metrics, and 0 represents no loss. Mell et al. ( |
| | The date of the vulnerability publication on the National Vulnerability Database. |
| | The name of the software affected by the vulnerability. |
| | The last version of the affected software where the vulnerability is present. |
| | The country where the user system is at the time of the second attack. |
| | The profile of the user or “host.” |
| | The average number of attacks received by a user per day. |
| | The maximum number of attacks received by a user per day. |
Summary Excerpt from Our Data Set
| Previous vulnerability (CVE) | Current vulnerability (CVE) | Delay (days) | Systems attacked | Attacks detected | Country | |
|---|---|---|---|---|---|---|
| 2003‐0533 | 2008‐4250 | 83 | 186 | 830 | IT | |
| 2003‐0818 | 2003‐0818 | 146 | 1 | 1 | US | |
| 2003‐0818 | 2009‐4324 | 616 | 1 | 1 | CH | |
| 2003‐0818 | 2009‐4324 | 70 | 52 | 55 | US | |
Note: The following example helps interpret these data. In the third row, one WINE system (1) located in Switzerland (CH) suffered only once (1) from an attack targeting the vulnerability CVE‐2009‐4324 that was preceded by an attack targeting CVE‐2003‐0818 almost two years earlier (616 days). In the fourth row, 52 systems in the United States (US) received, 55 times, the first attack on CVE‐2003‐0818 followed by the second attack on CVE‐2009‐4324 just two months apart (70 days). In both cases, the systems considered are of type EVOLVE, indicating that the affected systems have been upgraded and moved from some other country to the country listed in the table during our observation period.
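The reading of the excerpt given in the note can be sketched in a few lines of Python. This is only an illustration, not the authors' tooling: the class and field names are hypothetical, the order of the systems and attacks columns follows the interpretation in the note, and the 365‐day year used to convert the delay into a fraction of a year is an assumption.

```python
from dataclasses import dataclass

# Assumption: the source does not state which year length it uses.
DAYS_PER_YEAR = 365.0


@dataclass(frozen=True)
class AttackPair:
    """One excerpt row. Field names are hypothetical, not the paper's symbols."""
    prev_cve: str    # vulnerability targeted by the first attack
    curr_cve: str    # vulnerability targeted by the second attack
    delay_days: int  # days between the first and the second attack
    systems: int     # number of systems attacked by the pair
    attacks: int     # number of detected attacks for the pair
    country: str     # country of the system at the time of the second attack

    @property
    def delay_years(self) -> float:
        """Delay expressed as a fraction of a year."""
        return self.delay_days / DAYS_PER_YEAR

    @property
    def same_vulnerability(self) -> bool:
        """True when the second attack reuses the first vulnerability."""
        return self.prev_cve == self.curr_cve


# The four rows of the excerpt, in the order shown in the table.
EXCERPT = [
    AttackPair("2003-0533", "2008-4250", 83, 186, 830, "IT"),
    AttackPair("2003-0818", "2003-0818", 146, 1, 1, "US"),
    AttackPair("2003-0818", "2009-4324", 616, 1, 1, "CH"),
    AttackPair("2003-0818", "2009-4324", 70, 52, 55, "US"),
]

for row in EXCERPT:
    kind = "same vulnerability" if row.same_vulnerability else "new vulnerability"
    print(f"{row.prev_cve} -> {row.curr_cve} ({row.country}): "
          f"{row.delay_years:.2f} years, {row.systems} systems, "
          f"{row.attacks} attacks, {kind}")
```

Under these assumptions, the third row corresponds to a delay of roughly 1.7 years and the fourth to roughly 0.2 years, in line with the "almost two years" and "two months" mentioned in the note.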
Sample Attack Scenarios and Compatibility with Work‐Aversion Hypothesis
| Type | Condition | Hypothesis |
|---|---|---|
| | | Often for Hypothesis |
| | | Less frequent for Hypothesis |
| | | Almost never for Hypothesis |
Note: We expect the majority of attacks generated by the work‐averse attacker to be of the first type. The second type should be less frequent than the first, as it requires engineering a new exploit. The third type contradicts the work‐aversion hypothesis and should be the least common.
Fig 3. Loess regression of volume of attacks in time.
Note: Volume of received attacks as a function of time for the three types of attack, represented by a solid black line, a long‐dashed red line, and a dashed green line, respectively. The gray areas represent 95% confidence intervals. For Internet Explorer vulnerabilities, the maximum delay between two attacks is 1,288 days; for SERVER it is 1,374 days; for PROD, 1,411 days; for PLUGIN, 1,428 days. This can be determined by the timing of first appearance of the attack in the WINE database.
Ordinary Least Squares and Robust Regression Results
| Dependent Variable: Natural Logarithm of the Number of Attacked Machines | | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Model 1 | | | | Model 2 | | | | Model 3 | | | |
| | OLS | | Robust | | OLS | | Robust | | OLS | | Robust | |
| c | 0.927 | 0.006 | 0.731 | 0.096 | 1.065 | 0.122 | 0.845 | 0.171 | 0.933 | −0.106 | 0.783 | 0.039 |
| | (0.001) | (0.003) | (0.001) | (0.003) | (0.001) | (0.003) | (0.001) | (0.003) | (0.004) | (0.005) | (0.003) | (0.004) |
| | 0.018 | −0.051 | 0.012 | −0.044 | −0.006 | −0.092 | −0.003 | −0.071 | −0.005 | −0.091 | −0.004 | −0.071 |
| | (0.001) | (0.001) | (0.001) | (0.001) | (0.001) | (0.001) | (0.001) | (0.001) | (0.001) | (0.001) | (0.001) | (0.001) |
| | | | | | −0.326 | −0.479 | −0.228 | −0.324 | −0.313 | −0.464 | −0.22 | −0.314 |
| | | | | | (0.002) | (0.002) | (0.001) | (0.001) | (0.002) | (0.002) | (0.001) | (0.001) |
| | | | | | | | | | 0.144 | 0.236 | 0.063 | 0.131 |
| | | | | | | | | | (0.003) | (0.003) | (0.003) | (0.003) |
| | | | | | | | | | −0.088 | −0.209 | 0.012 | −0.087 |
| | | | | | | | | | (0.003) | (0.003) | (0.002) | (0.002) |
| Z1 | | 0.604 | | 0.37 | | 0.679 | | 0.422 | | 0.671 | | 0.419 |
| | | (0.002) | | (0.001) | | (0.002) | | (0.001) | | (0.002) | | (0.001) |
| Z2 | | 0.155 | | 0.105 | | 0.17 | | 0.116 | | 0.163 | | 0.114 |
| | | (0.002) | | (0.002) | | (0.002) | | (0.002) | | (0.002) | | (0.002) |
| Z3 | | 0.191 | | 0.129 | | 0.208 | | 0.141 | | 0.223 | | 0.149 |
| | | (0.002) | | (0.002) | | (0.002) | | (0.002) | | (0.002) | | (0.002) |
| Z4 | | 0.112 | | 0.072 | | 0.116 | | 0.076 | | 0.113 | | 0.075 |
| | | (0.002) | | (0.002) | | (0.002) | | (0.002) | | (0.002) | | (0.002) |
| Z5 | | 0.24 | | 0.147 | | 0.212 | | 0.127 | | 0.279 | | 0.157 |
| | | (0.003) | | (0.003) | | (0.003) | | (0.003) | | (0.003) | | (0.003) |
| Z6 | | 0.328 | | 0.227 | | 0.358 | | 0.246 | | 0.41 | | 0.271 |
| | | (0.002) | | (0.002) | | (0.002) | | (0.002) | | (0.002) | | (0.002) |
| Z7 | | 0.513 | | 0.442 | | 0.567 | | 0.49 | | 0.531 | | 0.477 |
| | | (0.004) | | (0.003) | | (0.004) | | (0.003) | | (0.004) | | (0.003) |
| Z8 | | 0.379 | | 0.274 | | 0.412 | | 0.299 | | 0.411 | | 0.301 |
| | | (0.003) | | (0.002) | | (0.003) | | (0.002) | | (0.003) | | (0.002) |
| Pseudo‐R² | – | – | 0.326 | 0.341 | – | – | 0.331 | 0.347 | – | – | 0.331 | 0.347 |
| R² | 0.00 | 0.093 | – | – | 0.016 | 0.126 | – | – | 0.017 | 0.13 | – | – |
| F‐statistic | 348.66 | 26,551.47 | – | – | 18,548.25 | 33,422.78 | – | – | 9,989.88 | 28,915.60 | – | – |
| Obs. | 2,324,500 | 2,324,500 | 2,324,500 | 2,324,500 | 2,324,500 | 2,324,500 | 2,324,500 | 2,324,500 | 2,324,500 | 2,324,500 | 2,324,500 | 2,324,500 |
Note: The three model equations reflect the definition of the expected (log) number of affected machines after a given interval. The regression model formulation is derived from first principles, starting from Equation (9). The expected coefficient signs are given in Table V. For each model, we run four sets of regressions. OLS and robust regressions are provided to address heteroscedasticity in the data. R² and F‐statistics are reported for the OLS estimations. The pseudo‐R² values are computed for the robust regressions using the McFadden‐adjusted approach, which compares the log likelihood of the full model minus the number of slope parameters with the log likelihood of the intercept alone; they should not be compared directly to the OLS R². Coefficient estimations of the two sets of regressions are consistent. All coefficient signs for the three models reflect the work‐averse attacker model predictions, with the only exception of the Model 1 estimation with no controls, for which the predicted sign of the first regressor is inverted. This may indicate that user characteristics are relevant factors for the arrival time of exploits when other factors related to the system are not accounted for. The regressor introduced in Model 2 significantly changes the estimate for the first regressor, whereas the additions in Model 3 leave the earlier estimates unchanged. High‐impact vulnerabilities tend to increase the volume of attacks. We report only standard errors, without starring p‐values, as all coefficients are significant due to the number of observations in the data set. All standard errors are estimated using the Huber–White approach.
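The two statistical devices named in the note can be illustrated with a short, self‐contained sketch. It is not the authors' estimation code: the function names and the toy numbers are hypothetical, the pseudo‐R² follows the standard adjusted McFadden definition, and the Huber–White (HC0) standard error is shown only for the slope of a one‐regressor OLS fit, whereas the regressions in the table use several regressors.

```python
import math


def mcfadden_adjusted_r2(loglik_full: float, loglik_null: float,
                         n_slopes: int) -> float:
    """Adjusted McFadden pseudo-R^2: 1 - (lnL_full - K) / lnL_null.

    loglik_full: log likelihood of the full model
    loglik_null: log likelihood of the intercept-only model
    n_slopes:    number of slope parameters K in the full model
    """
    return 1.0 - (loglik_full - n_slopes) / loglik_null


def simple_ols_hc0(x, y):
    """Fit y = a + b*x by OLS; return (a, b, HC0 standard error of b)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    a = y_bar - b * x_bar
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Huber-White (HC0) sandwich variance for the slope: each squared
    # residual is weighted by the squared deviation of its regressor.
    var_b = sum((xi - x_bar) ** 2 * ei ** 2
                for xi, ei in zip(x, resid)) / sxx ** 2
    return a, b, math.sqrt(var_b)


# Toy inputs, chosen only to exercise the functions (not from the paper).
print(mcfadden_adjusted_r2(loglik_full=-60.0, loglik_null=-100.0, n_slopes=2))
print(simple_ols_hc0([0.0, 1.0, 2.0, 3.0], [1.0, 2.9, 5.2, 6.9]))
```

Because the pseudo‐R² is built from log likelihoods rather than explained variance, its values sit on a different scale from the OLS R², which is why the two rows of the table should not be compared directly.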