Sascha Caron, Jong Soo Kim, Krzysztof Rolbiecki, Roberto Ruiz de Austri, Bob Stienen.
Abstract
A key research question at the Large Hadron Collider is the test of models of new physics. Testing whether a particular parameter set of such a model is excluded by LHC data is a challenge: it requires time-consuming generation of scattering events, simulation of the detector response, event reconstruction, cross section calculations and analysis code to test against several hundred signal regions defined by the ATLAS and CMS experiments. In the BSM-AI project we approach this challenge with a new idea. A machine learning tool is devised to predict, within a fraction of a millisecond and directly from the model parameters, whether a model is excluded or not. A first example is SUSY-AI, trained on the phenomenological supersymmetric standard model (pMSSM). About 300,000 pMSSM model sets, each tested against 200 signal regions by ATLAS, have been used to train and validate SUSY-AI. The code is currently able to reproduce the ATLAS exclusion regions in 19 dimensions with an accuracy of at least [Formula: see text]. It has been validated further within the constrained MSSM and the minimal natural supersymmetric model, again showing high accuracy. SUSY-AI and its future BSM derivatives will help to solve the problem of recasting LHC results for any model of new physics. SUSY-AI can be downloaded from http://susyai.hepforge.org/. An on-line interface to the program for quick testing purposes can be found at http://www.susy-ai.org/.
Year: 2017 PMID: 28943782 PMCID: PMC5586744 DOI: 10.1140/epjc/s10052-017-4814-9
Source DB: PubMed Journal: Eur Phys J C Part Fields ISSN: 1434-6044 Impact factor: 4.590
The experimental analyses used in the ATLAS study [6]. The middle column denotes the final state for which the analysis is optimized, and the third column shows the target scenario of this analysis
| Reference | Final state | Category |
|---|---|---|
| [ | 0 lepton + 2–6 jets + | Inclusive |
| [ | 0 lepton + 7–10 jets + | |
| [ | 1 lepton + jets + | |
| [ | | |
| [ | SS/3 lepton + jets + | |
| [ | | |
| [ | Monojet | |
| [ | 0 lepton stop search | Third generation squarks |
| [ | 1 lepton stop search | |
| [ | 2 lepton stop search | |
| [ | Monojet search | |
| [ | Stop search with | |
| [ | 2 | |
| [ | Asymmetric stop search | |
| [ | 1 lepton plus Higgs final state | Electroweak |
| [ | Dilepton final state | |
| [ | | |
| [ | Trilepton final state | |
| [ | Four-lepton final state | |
| [ | Disappearing track | |
| [ | Long-lived particle search | Other |
| [ | | |
Variable input parameters of the ATLAS pMSSM scan and the range over which these parameters are scanned
| Parameter | Description | Scanned range |
|---|---|---|
| | 1 | |
| | 1 | |
| | 3 | |
| | 3 | |
| | 1 | |
| | 1 | |
| | 1 | |
| | 3 | |
| | 3 | |
| | 3 | |
| | Stop trilinear coupling | |
| | Sbottom trilinear coupling | |
| | Stau trilinear coupling | |
| | Higgsino mass parameter | |
| | Bino mass parameter | |
| | Wino mass parameter | |
| | Gluino mass parameter | |
| | Pseudoscalar Higgs mass | |
| | Ratio of vacuum expectation values | |
Preselection cuts for the pMSSM benchmark points [6]
| Parameter | Minimum value | Maximum value |
|---|---|---|
| | | 0.0017 |
| | | |
| BR( | | |
| BR( | | |
| BR( | | |
| | − | 0.1208 |
| | − | 2 MeV |
| Masses of charged sparticles | 100 GeV | − |
| | 103 GeV | − |
| | 124 GeV | 128 GeV |
Fig. 1 Graphical representation of a sample decision tree
Variables used for training in the pMSSM. The variables are identified according to the SLHA-1 standard [74], given in order used by SUSY-AI
| Parameter | Block | No. | Parameter | Block | No. |
|---|---|---|---|---|---|
| | MSOFT | 1 | | MSOFT | 46 |
| | MSOFT | 2 | | MSOFT | 47 |
| | MSOFT | 3 | | MSOFT | 49 |
| | MSOFT | 31 | | AU | 3.3 |
| | MSOFT | 33 | | AD | 3.3 |
| | MSOFT | 34 | | AE | 3.3 |
| | MSOFT | 36 | | HMIX | 1 |
| | MSOFT | 41 | | HMIX | 4 |
| | MSOFT | 43 | | HMIX | 3 |
| | MSOFT | 44 | | | |
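The block/number pairs in the table above can be read from an SLHA-style text file with a few lines of code. The following is only a minimal sketch and not the SUSY-AI reader: the function names `read_slha_blocks` and `feature_vector` are hypothetical, the ordering (left column top to bottom, then right column) is an assumption about how "the order used by SUSY-AI" maps onto the flattened table, and the entry `3.3` is assumed to denote the index pair `3 3`. The `HMIX` entry `3` is kept exactly as printed in the table.

```python
# Assumed feature order: left column of the table top to bottom, then right column.
# Each entry is (block name, index tuple); "3.3" in the table is read as indices (3, 3).
FEATURES = [
    ("MSOFT", (1,)), ("MSOFT", (2,)), ("MSOFT", (3,)), ("MSOFT", (31,)),
    ("MSOFT", (33,)), ("MSOFT", (34,)), ("MSOFT", (36,)), ("MSOFT", (41,)),
    ("MSOFT", (43,)), ("MSOFT", (44,)), ("MSOFT", (46,)), ("MSOFT", (47,)),
    ("MSOFT", (49,)), ("AU", (3, 3)), ("AD", (3, 3)), ("AE", (3, 3)),
    ("HMIX", (1,)), ("HMIX", (4,)), ("HMIX", (3,)),
]


def read_slha_blocks(text):
    """Parse 'BLOCK name' sections into {name: {index_tuple: value}}."""
    blocks = {}
    current = None
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line:
            continue
        tokens = line.split()
        keyword = tokens[0].upper()
        if keyword == "BLOCK" and len(tokens) > 1:
            current = tokens[1].upper()
            blocks[current] = {}
        elif keyword == "DECAY":
            current = None  # decay tables are not needed here
        elif current is not None and len(tokens) >= 2:
            *index, value = tokens
            try:
                blocks[current][tuple(int(i) for i in index)] = float(value)
            except ValueError:
                continue  # skip non-numeric entries such as program names
    return blocks


def feature_vector(blocks):
    """Return the 19 values in the assumed order; raises KeyError if one is missing."""
    return [blocks[name][index] for name, index in FEATURES]
```

A vector built this way has 19 entries, matching the 19 dimensions quoted in the abstract. Whether a given spectrum file stores the quantities under exactly these indices should be checked against the SLHA-1 standard [74] before use.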
Fig. 2 Distribution of the number of true allowed or excluded points by ATLAS [6] as a function of the classifier output
Fig. 3 Confidence level that the classification is correct, as defined in the text. The horizontal lines indicate specific confidence levels
Fig. 4 (a) ROC curves for different minimum CLs following from an analysis of Fig. 3. Note that only the upper left corner of the full ROC curve is shown. (b) ROCs for SUSY-AI and 20 decision trees (ROCs overlap here). The decision trees model educated manual cuts on the dataset. The square marker indicates the location of SUSY-AI performance without a cut on CL, while discs denote different decision trees
Fig. 5 Example of a decision tree modeling educated manual cuts, trained with a 75:25 ratio of training and testing data. The figure has been created with GraphViz [75]. The information shown in each node is, respectively: the cut that is made in the parameter space (line 1), an impurity measure which is minimized in the training (line 2), the number of model points in the node (line 3), the distribution of model points over the classes [excluded, allowed] (line 4) and the label of the majority class (line 5)
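For readers unfamiliar with ROC curves, the sketch below shows how the points of such a curve follow from classifier outputs and true labels: every threshold on the output gives one (false positive rate, true positive rate) pair, whereas a single hard classifier such as one decision tree contributes a single point. This is an illustration in plain Python under stated assumptions, not the code used for Fig. 4; the function names and the six example scores are invented for the purpose.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs, one per distinct threshold, from high to low.

    scores: classifier outputs, larger meaning "more likely positive".
    labels: true classes, 1 for positive and 0 for negative.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under a list of (FPR, TPR) points sorted by FPR."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical example: six points with made-up classifier outputs.
example_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
example_labels = [1, 1, 0, 1, 0, 0]
example_curve = roc_points(example_scores, example_labels)
example_auc = area_under_curve(example_curve)
```

A curve that hugs the upper left corner (low false positive rate, high true positive rate) corresponds to an area close to 1.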
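The node contents listed in the Fig. 5 caption can be made concrete with a small calculation. The sketch below assumes the impurity measure is the Gini impurity, which is a common default for decision trees but is not named in the caption; the function names and the class counts are invented for illustration.

```python
def gini_impurity(class_counts):
    """Gini impurity 1 - sum(p_k^2) for a node with the given class counts."""
    total = sum(class_counts)
    if total == 0:
        raise ValueError("node contains no model points")
    return 1.0 - sum((n / total) ** 2 for n in class_counts)


def split_impurity(left_counts, right_counts):
    """Size-weighted impurity of the two child nodes produced by a cut."""
    n_left, n_right = sum(left_counts), sum(right_counts)
    return (n_left * gini_impurity(left_counts)
            + n_right * gini_impurity(right_counts)) / (n_left + n_right)


def majority_label(class_counts, labels=("excluded", "allowed")):
    """Label of the class with the most model points in the node."""
    best = max(range(len(class_counts)), key=lambda k: class_counts[k])
    return labels[best]


# Hypothetical node with counts in the order [excluded, allowed].
parent = [40, 60]
left, right = [35, 5], [5, 55]  # children after some cut on one parameter
parent_gini = gini_impurity(parent)
children_gini = split_impurity(left, right)
```

Training picks, at each node, the cut for which the weighted impurity of the children is smallest; in this made-up example the cut lowers the impurity from 0.48 to about 0.18.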
Fig. 6 Color histograms for a projection of the 19-dimensional pMSSM parameter space on the – plane. The color in the second and third column indicates the fraction of allowed data points for the true classification and the out-of-bag prediction, respectively. The last column shows the fraction of misclassified model points by the prediction, with white areas denoting no misclassifications
Fig. 7 Color histograms for a projection of the 19-dimensional pMSSM parameter space on the – plane. The color in the second and third column indicates the fraction of allowed data points for the true classification and the out-of-bag prediction, respectively. The last column shows the fraction of misclassified model points, with white areas denoting no misclassifications. The dashed bins contain no data points
Fig. 13 Color histograms for a projection of the 19-dimensional pMSSM parameter space on the – plane. The color in the second and third column indicates the fraction of allowed data points for the true classification and the out-of-bag prediction, respectively. The last column shows the fraction of misclassified model points. The dashed bins contain no data points. Cf. Fig. 7a of Ref. [6]
Fig. 14 Color histograms for a projection of the 19-dimensional pMSSM parameter space on the – plane. The color in the second and third column indicates the fraction of allowed data points for the true classification and the out-of-bag prediction, respectively. The last column shows the fraction of misclassified model points. The dashed bins contain no data points. Cf. Fig. 9a of Ref. [6]; however, note that the plots here take into account all searches
Fig. 15 Color histograms for a projection of the 19-dimensional pMSSM parameter space on the – plane. The color in the second and third column indicates the fraction of allowed data points for the true classification and the out-of-bag prediction, respectively. The last column shows the fraction of misclassified model points. The dashed bins contain no data points. Cf. Fig. 11a of Ref. [6]; however, note that the plots here take into account all searches
Fig. 16 Color histograms for a projection of the 19-dimensional pMSSM parameter space on the – plane. The color in the second and third column indicates the fraction of allowed data points for the true classification and the out-of-bag prediction, respectively. The last column shows the fraction of misclassified model points. The dashed bins contain no data points. Cf. Fig. 11b of Ref. [6]; however, note that the plots here take into account all searches
Fig. 17 Color histograms for a projection of the 19-dimensional pMSSM parameter space on the – plane. The color in the second and third column indicates the fraction of allowed data points for the true classification and the out-of-bag prediction, respectively. The last column shows the fraction of misclassified model points. The dashed bins contain no data points. Cf. Fig. 12 of Ref. [6]
Fig. 18 Color histograms for a projection of the 19-dimensional pMSSM parameter space on the – plane. The color in the second and third column indicates the fraction of allowed data points for the true classification and the out-of-bag prediction, respectively. The last column shows the fraction of misclassified model points. The dashed bins contain no data points
Fig. 19 Color histograms for a projection of the 19-dimensional pMSSM parameter space on the – plane. The color in the second and third column indicates the fraction of allowed data points for the true classification and the out-of-bag prediction, respectively. The last column shows the fraction of misclassified model points
Fig. 20 Color histograms for a projection of the 19-dimensional pMSSM parameter space on the – plane. The color in the second and third column indicates the fraction of allowed data points for the true classification and the out-of-bag prediction, respectively. The last column shows the fraction of misclassified model points
Feature importances for the trained RF classifier
| Parameter | Importance | Parameter | Importance |
|---|---|---|---|
| mL1 | 0.021 | M1 | 0.058 |
| me1 | 0.019 | M2 | 0.164 |
| mL3 | 0.014 | mu | 0.130 |
| me3 | 0.014 | M3 | 0.242 |
| mQ1 | 0.079 | At | 0.013 |
| mu1 | 0.066 | Ab | 0.012 |
| md1 | 0.037 | Atau | 0.012 |
| mQ3 | 0.026 | mA2 | 0.031 |
| mu3 | 0.018 | tanbeta | 0.019 |
| md3 | 0.026 | | |
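The importances in the table can be ranked and summed in a few lines. The snippet below is only a reading aid for the numbers above; the variable names are arbitrary, and the remark that importances are normalised to sum to one reflects the usual random-forest convention rather than a statement from this record.

```python
# Importance values copied from the table above.
importances = {
    "mL1": 0.021, "me1": 0.019, "mL3": 0.014, "me3": 0.014, "mQ1": 0.079,
    "mu1": 0.066, "md1": 0.037, "mQ3": 0.026, "mu3": 0.018, "md3": 0.026,
    "M1": 0.058, "M2": 0.164, "mu": 0.130, "M3": 0.242, "At": 0.013,
    "Ab": 0.012, "Atau": 0.012, "mA2": 0.031, "tanbeta": 0.019,
}

# Sort parameters from most to least important.
ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)

total = sum(importances.values())                 # close to 1 up to rounding
top4_names = [name for name, _ in ranked[:4]]
top4_sum = sum(value for _, value in ranked[:4])  # combined weight of the top four

for name, value in ranked[:4]:
    print(f"{name:8s} {value:.3f}")
```

The four leading parameters (M3, M2, mu and mQ1) together account for roughly 0.6 of the total, and the tabulated values sum to 1.001, consistent with normalised importances rounded to three decimals.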
Input parameters of the natural SUSY scenario of Ref. [76], and the range over which these parameters were scanned
| Parameter | Description | Scanned range |
|---|---|---|
| | 3 | |
| | 3 | |
| | Gluino mass parameter | |
| | Stop trilinear coupling | |
| | Higgsino mass parameter | |
| | Ratio of vacuum expectation values | |
The experimental analyses used in Ref. [76]. The ATLAS-CONF and CMS-SUS papers are only available as conference proceedings, the others are given by their arXiv number. The middle column corresponds to the final state of the respective search, and the third column shows the total integrated luminosity employed in this analysis. The fourth column gives the total number of signal regions
| Reference | Final state | Integrated luminosity | #SR |
|---|---|---|---|
| 1308.2631 (ATLAS) [ | 0 | 20.1 | 6 |
| 1403.4853 (ATLAS) [ | 2 | 20.3 | 12 |
| 1404.2500 (ATLAS) [ | SS 2 | 20.3 | 5 |
| 1407.0583 (ATLAS) [ | 1 | 20.0 | 27 |
| 1407.0608 (ATLAS) [ | monojet + | 20.3 | 3 |
| 1303.2985 (CMS) [ | | 11.7 | 59 |
| ATLAS-CONF-2012-104 [ | 1 | 5.8 | 2 |
| ATLAS-CONF-2013-024 [ | 0 | 20.5 | 3 |
| ATLAS-CONF-2013-047 [ | 0 | 20.3 | 10 |
| ATLAS-CONF-2013-061 [ | 0–1 | 20.1 | 9 |
| ATLAS-CONF-2013-062 [ | 1–2 | 20.0 | 19 |
| CMS-SUS-13-016 [ | OS 2 | 19.7 | 1 |
Fig. 8 Results of testing a natural SUSY scenario with the trained classifier in the higgsino LSP and gluino mass plane assuming stop masses larger than 600 GeV. The colors indicate the probability that a particular point is not excluded. For reference, c shows the training data in the same plane as a, b after applying a constraint on the stop mass to filter out data points mimicking natural SUSY. The dashed bins contain no data points. The dashed stripe in a, b corresponds to the points that were outside the 95% CL boundaries of CheckMATE; see text
Fig. 9 Several ROCs for the pMSSM-trained classifier with varying CL cuts when tested on the natural SUSY sample
Fig. 10 Results of testing cMSSM with the trained classifier. The colors in b indicate the probability of the single point not being excluded. The white band in b corresponds to the points that were outside the 95% CL boundaries
Fig. 11 Color histograms for a projection of the 19-dimensional pMSSM parameter space on the – plane after imposing the constraints on the soft breaking parameters summarized in Table 8 (left). The color in the second and third column indicates the fraction of allowed data points. The last column shows the fraction of misclassified points. The dashed bins contain no data points
Input parameters of the pMSSM subspace in the light stop (left) and the electroweakino (right) scenarios
| Parameter | Range | Parameter | Range |
|---|---|---|---|
Fig. 12 Color histograms for a projection of the 19-dimensional pMSSM parameter space on the – plane after imposing the constraints on the soft breaking parameters summarized in Table 8 (right). The contours denote the mass of . The color in the second and third column indicates the fraction of allowed data points. The last column shows the fraction of misclassified points. The dashed bins contain no data points
Results of the validation of the SUSY-AI classifier with out-of-bag estimation
| CL | # | #/total | Accuracy | Precision | Sensitivity | NPV | Specificity |
|---|---|---|---|---|---|---|---|
| Out-of-bag | | | | | | | |
| 0.0 | | 1.0000 | 0.93226 | 0.93951 | 0.94665 | 0.92152 | 0.91133 |
| 0.68 | | 0.93248 | 0.95735 | 0.96072 | 0.96835 | 0.95222 | 0.94094 |
| 0.95 | | 0.70646 | 0.99094 | 0.99092 | 0.99426 | 0.99096 | 0.98573 |
| 0.98 | | 0.59367 | 0.99543 | 0.99573 | 0.99672 | 0.99496 | 0.99346 |
| 0.99 | | 0.51570 | 0.99708 | 0.99747 | 0.99764 | 0.99649 | 0.99624 |
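Out-of-bag estimation relies on the fact that each tree in a bagged ensemble sees only a bootstrap sample of the data, so the points left out of that sample can serve as test points for that tree. The sketch below illustrates the bookkeeping with the standard library only; it assumes standard bagging (sampling with replacement, one bag per tree), and the function names and the sizes of 1000 points and 20 trees are arbitrary choices for illustration, not settings of SUSY-AI.

```python
import random


def bootstrap_sample(n_points, rng):
    """Draw n_points indices with replacement (one 'bag' for one tree)."""
    return [rng.randrange(n_points) for _ in range(n_points)]


def out_of_bag_sets(n_points, n_trees, seed=0):
    """For each tree, return the set of indices that were NOT drawn into its bag."""
    rng = random.Random(seed)
    all_points = set(range(n_points))
    return [all_points - set(bootstrap_sample(n_points, rng))
            for _ in range(n_trees)]


def oob_vote(point, oob_sets, tree_predictions):
    """Average the predictions of only those trees for which `point` was out of bag.

    tree_predictions[t][point] is the 0/1 prediction of tree t for that point.
    Returns None if the point was in the bag of every tree.
    """
    votes = [tree_predictions[t][point]
             for t, oob in enumerate(oob_sets) if point in oob]
    return sum(votes) / len(votes) if votes else None


# Illustrative sizes: 1000 model points, 20 trees.
oob_sets = out_of_bag_sets(n_points=1000, n_trees=20)
mean_oob_fraction = sum(len(s) for s in oob_sets) / (20 * 1000)
```

On average a fraction of about 1/e (roughly 37%) of the points is out of bag for any given tree, which is why this procedure yields a validation estimate without setting aside a separate test set.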
Results of the validation of the RF classifier with a split dataset (0.75 training, 0.25 testing)
| CL | # | #/total | Accuracy | Precision | Sensitivity | NPV | Specificity |
|---|---|---|---|---|---|---|---|
| Dataset splitting train:test = 75:25 | | | | | | | |
| 0.0 | | 1.0000 | 0.92271 | 0.91653 | 0.93049 | 0.92912 | 0.91491 |
| 0.68 | | 0.90712 | 0.9545 | 0.95516 | 0.95302 | 0.95386 | 0.95595 |
| 0.95 | | 0.63031 | 0.99022 | 0.99047 | 0.9893 | 0.99 | 0.99109 |
| 0.98 | | 0.51321 | 0.99485 | 0.99559 | 0.99353 | 0.99419 | 0.99604 |
| 0.99 | | 0.43830 | 0.99644 | 0.99685 | 0.99554 | 0.99608 | 0.99724 |
Classification of events following from a comparison of true classification and prediction [69]
| Prediction | True classification: Positive | True classification: Negative |
|---|---|---|
| Positive | True positive (TP) | False positive (FP) |
| Negative | False negative (FN) | True negative (TN) |
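The performance columns in the validation tables (accuracy, precision, sensitivity, NPV, specificity) are ratios of these four counts. The sketch below uses the standard textbook definitions, which are assumed here to match those of the paper; the class and function names and the example counts are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    tp: int  # predicted positive, truly positive
    fp: int  # predicted positive, truly negative
    fn: int  # predicted negative, truly positive
    tn: int  # predicted negative, truly negative


def classification_metrics(c):
    """Standard binary-classification metrics from the four confusion counts."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "accuracy": (c.tp + c.tn) / total,      # fraction of correct predictions
        "precision": c.tp / (c.tp + c.fp),      # correct among predicted positives
        "sensitivity": c.tp / (c.tp + c.fn),    # found among true positives
        "npv": c.tn / (c.tn + c.fn),            # correct among predicted negatives
        "specificity": c.tn / (c.tn + c.fp),    # found among true negatives
    }


# Hypothetical counts, not taken from the paper.
example = ConfusionCounts(tp=90, fp=10, fn=5, tn=95)
metrics = classification_metrics(example)
```

For these made-up counts the accuracy is 185/200 = 0.925, while precision (0.90) and sensitivity (about 0.947) differ because false positives and false negatives are not equally frequent.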