Sara Cuéllar1, Paulo Granados1, Ernesto Fabregas2, Michel Curé3, Héctor Vargas1, Sebastián Dormido-Canto2, Gonzalo Farias1.
Abstract
Scientists and astronomers have attached great importance to the task of discovering new exoplanets, even more so if they are in the habitable zone. To date, more than 4300 exoplanets have been confirmed by NASA using various discovery techniques, including planetary transits, together with databases provided by space and ground-based telescopes. This article proposes a deep learning system for detecting planetary transits in Kepler Telescope light curves. The approach builds on related work from the literature and extends it to validation with real light curves. A CNN classification model is trained on a mixture of real and synthetic data and is then validated only with unseen real data. The best ratio of synthetic data is determined by means of an optimisation technique and a sensitivity analysis. The precision, accuracy and true positive rate of the best model are determined and compared with those of similar works. The results demonstrate that using synthetic data in the training stage can improve transit detection performance on real light curves.
Year: 2022 PMID: 35613093 PMCID: PMC9132280 DOI: 10.1371/journal.pone.0268199
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752
Machine learning approaches for exoplanet detection.
| Ref | Catalog | Feature Extraction | ML Method | Performance |
|---|---|---|---|---|
| [ | REAL Kepler Q1-Q17 DR24 | 114 calculated attributes | RF (3 classes) | Accuracy: 0.973 |
| [ | REAL Kepler Q1-Q17 DR24 | 1D folding curve: global & local view | LLR, fully connected NN, CNN | Accuracy: 0.917, 0.94, 0.958; AUC: 0.963, 0.977, 0.988 |
| [ | REAL Kepler Q1-Q17 DR24 | 1D folding curve: global & local view; centroid curves; stellar parameters | DCNN | Accuracy: 0.975; Precision: 0.955 |
| [ | REAL TESS 1-5 sector | 1D folding curve: global & local view; secondary eclipse view | CNN for triage | Accuracy: 0.974; AUC: 0.992; Precision: 0.97 |
| [ | REAL Kepler Cumulative | Features from interactive table | SVM, KNN, RF | Training metrics: Accuracy: 0.9896; Precision: 0.9955; Recall: 0.9721; F1: 0.9837 |
| [ | SIMULATED with transit; REAL without transit | 50000 lightcurves: 25000 with transit, 25000 without transit | MLP, CNN | Accuracy: 0.99; Recall: 0.99 |
| [ | SIMULATED; REAL Kepler Q1-Q17 DR24; TESS 1-5 sector | TSFresh 789 features | Gradient boosted trees | Simulated: AUC 0.92, Recall 0.92, Precision 0.94; Kepler: AUC 0.948, Recall 0.96, Precision 0.82; TESS: AUC 0.80, Recall 0.82, Precision 0.81 |
Transit parameters.
| Parameter | Value |
|---|---|
| | 0.85 to 8.5 [days] |
| | 2 to 35 |
| | 0.005 to 0.4 |
| | 85 to 90 [deg] |
| | 0.210 to 0.731 |
| | 0.035 to 0.442 |
Fig 1. 2D phase folding for three cases: synthetic lightcurve (left), real lightcurve with transit from the host star KIC7051180 (center), and real lightcurve without transit from the host star KIC757076 (right).
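As a rough illustration of what a 2D phase folding can look like, the sketch below cuts a light curve into consecutive period-length segments and stacks them as rows of an image, so that a periodic dip lines up in the same columns. This is one common reading of the technique and is an assumption here: the record does not describe the exact procedure, and the function name `phase_fold_2d`, the cadence, the period and the toy dip are all hypothetical.

```python
import numpy as np

def phase_fold_2d(flux, cadence_days, period_days):
    """Stack consecutive period-length segments of a light curve as image rows."""
    samples_per_period = int(round(period_days / cadence_days))
    if samples_per_period < 1:
        raise ValueError("period must span at least one sample")
    n_cycles = len(flux) // samples_per_period
    if n_cycles < 1:
        raise ValueError("light curve is shorter than one period")
    # Drop the incomplete last cycle so the array reshapes cleanly.
    trimmed = np.asarray(flux[: n_cycles * samples_per_period], dtype=float)
    return trimmed.reshape(n_cycles, samples_per_period)

# Toy light curve (hypothetical values): flat flux with a box-shaped dip
# covering the first 5 samples of every 100-sample period.
cadence = 0.02   # days between samples (assumed)
period = 2.0     # orbital period in days (assumed)
idx = np.arange(1000)
flux = np.ones(idx.size)
flux[(idx % 100) < 5] -= 0.01

image = phase_fold_2d(flux, cadence, period)
print(image.shape)  # one row per cycle, one column per phase bin
```

When the folding period matches the true period, the dip forms a vertical stripe in the image; with a wrong period the stripe becomes slanted or smeared.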
Fig 2. Training and validation stages of the proposed method.
Detection performance under proposed scenarios.
| With transit | Without transit | Metrics | ||||||
|---|---|---|---|---|---|---|---|---|
| Scenario | | | | | | | TPR | Precision |
| 1 | 0 | 483 | 0 | 483 | 0.5 | 0.743 | 0.740 | 0.747 |
| 2 | 483 | 0 | 0 | 483 | 0.5 | 0.206 | 0.120 | 0.750 |
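The metric column whose header did not survive extraction (0.743 and 0.206) is numerically consistent with the F1 score, the harmonic mean of precision and TPR. The sketch below is only a check of that reading against the two rows above; the function name `f1_score` is hypothetical and the identification of the column is an assumption.

```python
def f1_score(precision, tpr):
    """Harmonic mean of precision and true positive rate (recall)."""
    if precision + tpr == 0:
        return 0.0
    return 2 * precision * tpr / (precision + tpr)

# (scenario, TPR, precision) as listed in the table above.
rows = [(1, 0.740, 0.747), (2, 0.120, 0.750)]
for scenario, tpr, precision in rows:
    print(scenario, round(f1_score(precision, tpr), 3))
```

Both results agree with the tabulated values to within rounding of the three-decimal inputs.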
GA implementation details.
| Parameter | Description | Value |
|---|---|---|
| Encode | Integer | 11 bits |
| Encode | Two decimals | 7 bits |
| Chromosome size | Encoded | 18 bits |
| Population size | Number of chromosomes in one generation | 10 |
| Number of generations | Iterations | 50 |
| Selection | Tournament between parents | 3 |
| Crossover type and rate | Single point at the middle | 0.9 |
| Mutation type and rate | Random bit flip | 0.1 |
| Fitness function | Determines members that survive | |
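The settings in the table above can be sketched as a small genetic algorithm. This is a minimal sketch under stated assumptions, not the authors' implementation: the 11-bit integer field is read as the number of synthetic curves S and the 7-bit field as a two-decimal threshold T, values above 99 in the 7-bit field are clamped, the mutation rate is applied per bit, and `toy_fitness` is a hypothetical stand-in because the real fitness would require training and evaluating the CNN.

```python
import random

INT_BITS, DEC_BITS = 11, 7                 # encoding from the table
CHROM_BITS = INT_BITS + DEC_BITS           # 18-bit chromosome
POP_SIZE, GENERATIONS = 10, 50
TOURNAMENT, CROSSOVER_RATE, MUTATION_RATE = 3, 0.9, 0.1

def decode(chrom):
    """Split a chromosome into an integer (0..2047) and a two-decimal fraction."""
    s = int("".join(map(str, chrom[:INT_BITS])), 2)
    # Assumption: 7 bits reach 127, so clamp to 99 to stay within two decimals.
    t = min(int("".join(map(str, chrom[INT_BITS:])), 2), 99) / 100
    return s, t

def toy_fitness(chrom):
    """Hypothetical stand-in for the real, CNN-based fitness."""
    s, t = decode(chrom)
    return 1.0 - abs(s - 1000) / 2047 - abs(t - 0.5)

def tournament(pop, fitness):
    """Pick the fittest of TOURNAMENT randomly drawn members."""
    return max(random.sample(pop, TOURNAMENT), key=fitness)

def crossover(a, b):
    """Single-point crossover at the middle of the chromosome."""
    if random.random() < CROSSOVER_RATE:
        mid = CHROM_BITS // 2
        return a[:mid] + b[mid:], b[:mid] + a[mid:]
    return a[:], b[:]

def mutate(chrom):
    """Random bit flip; the rate is applied per bit (assumption)."""
    return [bit ^ 1 if random.random() < MUTATION_RATE else bit for bit in chrom]

def run_ga(fitness, seed=0):
    random.seed(seed)
    pop = [[random.randint(0, 1) for _ in range(CHROM_BITS)] for _ in range(POP_SIZE)]
    best = max(pop, key=fitness)
    for _ in range(GENERATIONS):
        children = []
        while len(children) < POP_SIZE:
            c1, c2 = crossover(tournament(pop, fitness), tournament(pop, fitness))
            children += [mutate(c1), mutate(c2)]
        pop = children[:POP_SIZE]
        best = max(pop + [best], key=fitness)  # remember the best member seen
    return decode(best), fitness(best)

(best_s, best_t), best_score = run_ga(toy_fitness)
print(best_s, best_t, round(best_score, 4))
```

Note that with an 18-bit chromosome the middle crossover point falls inside the 11-bit integer field, so a crossover mixes the high bits of one parent's integer with the low bits and threshold of the other.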
GA results for different parameter settings.
| Population | #Generations | S | T | F1 | #F1 calc. |
|---|---|---|---|---|---|
| | 5 | 1633 | 0.09 | 0.9607 | 80 |
| | 10 | 1173 | 0.07 | 0.9371 | 160 |
| | 15 | 1334 | 0.41 | 0.9591 | 240 |
| | 20 | 759 | 0.28 | 0.9607 | 320 |
| | | | | | |
| | 5 | 1794 | 0.10 | 0.9560 | 180 |
| | 10 | 1403 | 0.28 | 0.9751 | 380 |
| | 15 | 1403 | 0.19 | 0.9753 | 540 |
| | | | | | |
| | 50 | 1794 | 0.1 | 0.9560 | 1800 |
| | 5 | 1334 | 0.23 | 0.9651 | 440 |
| | | | | | |
| | 15 | 1403 | 0.19 | 0.9753 | 1320 |
| | 20 | 1403 | 0.18 | 0.9705 | 1760 |
| | 50 | 1403 | 0.2 | 0.9753 | 4400 |
Fig 3. 3D plot of F1 against ratio λ and threshold T.
Fig 4. ROC curve for every ratio.
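Both figures rest on sweeping the decision threshold T over the classifier scores: each threshold yields one (FPR, TPR) point of a ROC curve and one F1 value. The sketch below shows that sweep for a single model; the scores and labels are made-up illustration data, the 0.01 threshold step is an assumption, and `sweep_thresholds` is a hypothetical helper rather than anything taken from the paper.

```python
import numpy as np

def sweep_thresholds(scores, labels, thresholds):
    """Return (FPR, TPR, F1) arrays with one entry per threshold."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    fpr, tpr, f1 = [], [], []
    for t in thresholds:
        pred = scores >= t                      # predict "transit" at or above t
        tp = np.sum(pred & labels)
        fp = np.sum(pred & ~labels)
        fn = np.sum(~pred & labels)
        tn = np.sum(~pred & ~labels)
        tpr.append(tp / (tp + fn) if tp + fn else 0.0)
        fpr.append(fp / (fp + tn) if fp + tn else 0.0)
        f1.append(2 * tp / (2 * tp + fp + fn) if tp else 0.0)
    return np.array(fpr), np.array(tpr), np.array(f1)

# Hypothetical classifier outputs: 1 = transit, 0 = no transit.
scores = [0.95, 0.80, 0.70, 0.40, 0.35, 0.10]
labels = [1, 1, 0, 1, 0, 0]
thresholds = np.arange(100) / 100               # 0.00, 0.01, ..., 0.99

fpr, tpr, f1 = sweep_thresholds(scores, labels, thresholds)
best_t = thresholds[int(np.argmax(f1))]
print(best_t, round(float(f1.max()), 3))
```

Repeating the sweep for models trained with different synthetic ratios would give one ROC curve per ratio and an F1 surface over ratio and threshold.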
Sensitivity analysis results.
| λ(%) | S | T | Accuracy | Precision | TPR | FPR | F1 | FNR |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0.608 | 0.860 | 0.923 | 0.720 | 0.060 | 0.808 | 0.280 |
| 5 | 25 | 0.639 | 0.825 | 0.825 | 0.850 | 0.210 | 0.837 | 0.150 |
| 10 | 60 | 0.953 | 0.835 | 0.818 | 0.900 | 0.130 | 0.857 | 0.100 |
| 15 | 85 | 0.394 | 0.840 | 0.876 | 0.920 | 0.210 | 0.897 | 0.080 |
| 20 | 121 | 0.631 | 0.870 | 0.922 | 0.830 | 0.080 | 0.873 | 0.170 |
| 25 | 161 | 0.999 | 0.885 | 0.824 | 0.940 | | 0.878 | 0.060 |
| 30 | 207 | 0.608 | 0.710 | 0.810 | 0.940 | 0.370 | 0.870 | 0.060 |
| 35 | 260 | 0.456 | 0.850 | 0.873 | | 0.070 | 0.919 | |
| 40 | 322 | 0.963 | 0.945 | 0.873 | | 0.080 | 0.919 | |
| 45 | 395 | 0.685 | 0.895 | 0.882 | 0.900 | 0.130 | 0.891 | 0.100 |
| 50 | 483 | 0.578 | 0.840 | 0.897 | 0.880 | 0.040 | 0.888 | 0.120 |
| 55 | 590 | 0.727 | 0.910 | 0.782 | 0.720 | 0.130 | 0.750 | 0.280 |
| 60 | 730 | 0.765 | 0.855 | 0.932 | | 0.090 | 0.950 | |
| 65 | 897 | 0.444 | 0.905 | 0.938 | 0.910 | 0.140 | 0.923 | 0.090 |
| 70 | 1127 | 0.234 | | | | 0.050 | | |
| 75 | 1450 | 0.561 | 0.940 | 0.940 | 0.950 | 0.060 | 0.945 | 0.050 |
| 80 | 1932 | 0.141 | 0.945 | 0.872 | 0.960 | 0.090 | 0.914 | 0.040 |
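The S column appears to follow from the ratio λ once the number of real curves is fixed. The sketch below assumes that λ is the share of synthetic curves in the mixed set, λ = S / (S + R), and that R = 483 real curves, which is inferred from the λ = 50% row (S = 483) and is not stated explicitly in this record. The helper names are hypothetical.

```python
def synthetic_ratio(n_synthetic, n_real):
    """Share of synthetic curves in the mixed set, as a fraction."""
    return n_synthetic / (n_synthetic + n_real)

def synthetic_count(ratio, n_real):
    """Synthetic curves needed so that they make up `ratio` of the mixed set."""
    if not 0 <= ratio < 1:
        raise ValueError("ratio must be in [0, 1)")
    return round(ratio * n_real / (1 - ratio))

N_REAL = 483  # assumed number of real curves, inferred from the 50% row

# (lambda in %, S) pairs taken from the table above.
for lam, s in [(5, 25), (25, 161), (50, 483), (70, 1127), (80, 1932)]:
    print(lam, s, synthetic_count(lam / 100, N_REAL),
          round(100 * synthetic_ratio(s, N_REAL), 1))
```

Most rows of the table match this relation exactly after rounding; the 10% and 60% rows (S = 60 and S = 730) are a few curves above what the formula gives, so the relation should be read as approximate.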
Comparison of the proposed approach with related work.
| Training | Test | |||||||
|---|---|---|---|---|---|---|---|---|
| Ref | Accuracy | Precision | TPR | F1 | Accuracy | Precision | Recall | F1 |
| [ | 0.989 | 0.995 | 0.972 | 0.983 | ||||
| [ | 1.000 | 1.000 | 1.000 | 1.000 | 0.500 | 0.500 | 0.010 | 0.019 |
| Ours | 0.986 | 0.986 | 0.985 | 0.985 | 0.980 | 0.970 | 0.990 | 0.980 |