Rafal A Angryk1, Petrus C Martens2, Berkay Aydin3, Dustin Kempton3, Sushant S Mahajan2, Sunitha Basodi3, Azim Ahmadzadeh3, Xumin Cai3, Soukaina Filali Boubrahimi3, Shah Muhammad Hamdi3, Michael A Schuh3, Manolis K Georgoulis2,4.
Abstract
We introduce and make openly accessible a comprehensive, multivariate time series (MVTS) dataset extracted from solar photospheric vector magnetograms in the Spaceweather HMI Active Region Patch (SHARP) series. Our dataset also includes a cross-checked NOAA solar flare catalog that immediately facilitates solar flare prediction efforts. We discuss methods used for data collection, cleaning and pre-processing of the solar active region and flare data, and we further describe a novel data integration and sampling methodology. Our dataset covers 4,098 MVTS data collections from active regions occurring between May 2010 and December 2018, includes 51 flare-predictive parameters, and integrates over 10,000 flare reports. Potential directions toward expansion of the time series, either "horizontally" (by adding more prediction-specific parameters) or "vertically" (by generalizing flare prediction into integrated solar eruption prediction), are also explained. The immediate tasks enabled by the disseminated dataset include optimization of solar flare prediction and detailed investigation of elusive flare predictors or precursors, with both operational (research-to-operations) and basic research (operations-to-research) benefits potentially following in the future.
Year: 2020 PMID: 32651380 PMCID: PMC7351763 DOI: 10.1038/s41597-020-0548-x
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Fig. 1 The block diagram of our dataset generation process, with principal procedures of flare cleaning (in red), MVTS generation and flare integration (in blue), and the eventual machine-learning-ready dataset creation (in orange).
Fig. 2 GOES15 1-8 Å solar X-ray flux from 2011-02-14 to 2011-02-15. The GOES flare classification is provided on the minor y-axis. The plot also includes annotations of flares exceeding GOES class C5.0, with red vertical lines indicating the flares' peak times. The example interval also shows that, during these two days of intense activity, the background X-ray flux was high, making it difficult to identify small flares. Notice also that the first two C-class flares peak essentially simultaneously (i.e., within 1 minute of each other).
Fig. 3 Overview of our 4-step flare data enhancement and cross-checking procedures, together with the accompanying enhancements after each step (brief explanations are also provided). The cross-checking with secondary flare data sources (SSW Latest Events and Hinode-XRT) results in three sets of flare reports: (1) primary-verified, where the locations of the primary flare reports (from GOES) are verified by at least one secondary source; (2) secondary-verified, where GOES-reported locations could not be verified but SSW- and XRT-reported locations are in agreement; and (3) non-verified, where the flare location from any of the three data sources cannot be verified.
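The three verification outcomes described in the Fig. 3 caption can be sketched as a small decision function. This is only an illustration: the agreement test (`locations_agree`) and its tolerance `tol_deg` are hypothetical, since the caption does not state how two reported locations are compared.

```python
from typing import Optional, Tuple

# (latitude, longitude) in degrees; None means the source reported no location.
Location = Optional[Tuple[float, float]]


def locations_agree(a: Location, b: Location, tol_deg: float = 5.0) -> bool:
    """Hypothetical agreement test: both locations exist and differ by at
    most tol_deg in each coordinate. The tolerance is an assumption."""
    if a is None or b is None:
        return False
    return abs(a[0] - b[0]) <= tol_deg and abs(a[1] - b[1]) <= tol_deg


def verification_flag(goes: Location, ssw: Location, xrt: Location,
                      tol_deg: float = 5.0) -> str:
    """Return 'Primary', 'Secondary', or 'Non-verified' for one flare report."""
    # Primary: the GOES location is confirmed by at least one secondary source.
    if locations_agree(goes, ssw, tol_deg) or locations_agree(goes, xrt, tol_deg):
        return "Primary"
    # Secondary: GOES is unconfirmed, but the two secondary sources agree.
    if locations_agree(ssw, xrt, tol_deg):
        return "Secondary"
    # Otherwise no location could be verified.
    return "Non-verified"
```

The returned strings mirror the verification flag values listed in the flare label annotation (Primary, Secondary, Non-verified).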
Fig. 4 Scatter plot of the primary- and secondary-verified heliographic latitudes of flares (in degrees), as a function of peak times, ranging between May 1, 2010 and December 31, 2018.
Fig. 5 The number of flares for each GOES flare class after flare cross-checking procedures were applied. Blue bars show the primary-verified flares, with cross-checked GOES locations; orange bars show the secondary-verified flares, whose GOES location could not be verified; and green bars show the non-verified flares.
Computed magnetic field parameters.
| Magnetic Field Parameter | Description | Formula |
|---|---|---|
| ABSNJZH | Absolute value of the net current helicity in G²/m | |
| EPSX* | Sum of X-component of normalized Lorentz force | |
| EPSY* | Sum of Y-component of normalized Lorentz force | |
| EPSZ* | Sum of Z-component of normalized Lorentz force | |
| MEANALP | Mean twist parameter | |
| MEANGAM | Mean inclination angle | |
| MEANGBH | Mean value of the horizontal field gradient | |
| MEANGBT | Mean value of the total field gradient | |
| MEANGBZ | Mean value of the vertical field gradient | |
| MEANJZD | Mean vertical current density | |
| MEANJZH | Mean current helicity | |
| MEANPOT | Mean photospheric excess magnetic energy density | |
| MEANSHR | Mean shear angle | |
| R_VALUE* | Total unsigned flux around high gradient polarity inversion lines using the | Φ = Σ |
| SAVNCPP | Sum of the absolute value of the net current per polarity | |
| SHRGT45 | Area with shear angle greater than 45 degrees | |
| TOTBSQ* | Total magnitude of Lorentz force | |
| TOTFX* | Sum of X-component of Lorentz force | |
| TOTFY* | Sum of Y-component of Lorentz force | |
| TOTFZ* | Sum of Z-component of Lorentz force | |
| TOTPOT | Total photospheric magnetic energy density | |
| TOTUSJH | Total unsigned current helicity | |
| TOTUSJZ | Total unsigned vertical current | |
| USFLUX | Total unsigned flux in Maxwells | Φ = |
Parameters marked with an asterisk (*) are discussed in [21], but are not available in SHARP headers.
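As a rough illustration of how one of these parameters is derived from a magnetogram, the sketch below computes a total unsigned flux. The formulas did not survive in the table above, so it assumes the commonly used definition Φ = Σ |B_z| dA. The function name, the plain nested-list input, and the uniform pixel area are illustrative choices, not the authors' pipeline, and any pixel masking or quality filtering is omitted.

```python
from typing import Sequence


def total_unsigned_flux(bz_gauss: Sequence[Sequence[float]],
                        pixel_area_cm2: float) -> float:
    """Sum |B_z| over all pixels and multiply by the (uniform) pixel area.

    With B_z in gauss and area in cm^2 the result is in maxwells (Mx).
    Assumes the definition Phi = sum(|B_z|) * dA; masking is not handled.
    """
    return sum(abs(b) for row in bz_gauss for b in row) * pixel_area_cm2


# Tiny made-up 2x2 patch, purely to show the call:
patch = [[100.0, -50.0],
         [0.0, 25.0]]
flux = total_unsigned_flux(patch, pixel_area_cm2=2.0)
```

Taking the absolute value before summing is what makes the quantity "unsigned": opposite polarities add rather than cancel.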
Fig. 6 Example slicing and labeling of time series, characterized by an elementary time unit of fixed length. Time steps (t) can then be defined at instances corresponding to integer multiples of that unit.
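The slicing and labeling idea in Fig. 6 can be sketched as a sliding window over time steps counted in elementary time units. The names `obs_len`, `pred_len`, and `step`, and the rule of labeling each slice with the strongest flare class in the window that follows it, are assumptions made for illustration. The caption itself does not specify them.

```python
from typing import List, Optional, Sequence, Tuple

# Flare classes ordered from weakest to strongest (class letters only).
CLASS_ORDER = ["B", "C", "M", "X"]


def slice_and_label(
    values: Sequence[float],
    flares: Sequence[Tuple[int, str]],  # (time step of flare, class letter)
    obs_len: int,    # hypothetical: observation window length, in time steps
    pred_len: int,   # hypothetical: prediction window length, in time steps
    step: int,       # hypothetical: shift between consecutive slices
) -> List[Tuple[List[float], Optional[str]]]:
    """Cut `values` into observation slices and label each slice with the
    strongest flare class in the prediction window right after it
    (None when that window contains no flare)."""
    samples = []
    start = 0
    while start + obs_len + pred_len <= len(values):
        obs_end = start + obs_len
        pred_end = obs_end + pred_len
        window = list(values[start:obs_end])
        in_pred = [cls for t, cls in flares if obs_end <= t < pred_end]
        label = max(in_pred, key=CLASS_ORDER.index) if in_pred else None
        samples.append((window, label))
        start += step
    return samples


# Ten time steps of a single made-up parameter and two made-up flares.
series = [float(i) for i in range(10)]
samples = slice_and_label(series, [(5, "C"), (8, "X")],
                          obs_len=4, pred_len=3, step=3)
```

Because every index is an integer multiple of the elementary time unit, the same function works regardless of the cadence chosen for that unit.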
Summary and categorization of the time series parameters in our dataset.
| Parameter Category | Time and Location | Magnetic Field Parameters (see the table of computed magnetic field parameters) | | Flare History Parameters | | Quality |
|---|---|---|---|---|---|---|
| Individual Parameters | | ABSNJZH | EPSX | | | |
| | | EPSY | EPSZ | | | |
| | TIMESTAMP | MEANALP | MEANGAM | BFLARE | BFLARE_LABELᵃ | QUALITY |
| | LAT_MIN | MEANGBH | MEANGBT | BFLARE_LOC | BFLARE_LABEL_LOCᵃ | XRQUALITYᵇ |
| | LON_MIN | MEANGBZ | MEANJZD | CFLARE | CFLARE_LABELᵃ | CRVAL1 |
| | LAT_MAX | MEANJZH | MEANPOT | CFLARE_LOC | CFLARE_LABEL_LOCᵃ | CRVAL2 |
| | LON_MAX | MEANSHR | SAVNCPP | MFLARE | MFLARE_LABELᵃ | CRLN_OBS |
| | HC_ANGLE | SHRGT45 | TOTBSQ | MFLARE_LOC | MFLARE_LABEL_LOCᵃ | CRLT_OBS |
| | NOAA_AR | TOTFX | TOTFY | XFLARE | XFLARE_LABELᵃ | SPEI |
| | | TOTFZ | TOTPOT | XFLARE_LOC | XFLARE_LABEL_LOCᵃ | IS_TMFI |
| | | TOTUSJH | TOTUSJZ | XR_MAXᵇ | | |
| | | USFLUX | R_VALUE | | | |
ᵃ The flare label series (e.g., CFLARE_LABEL or XFLARE_LABEL_LOC) are stored as annotations in the form of JSON objects, structured as follows:
{
"magnitude" : [GOES class of the flare],
"id" : [flare identifier],
"NOAA_AR" : [associated NOAA active region number if available],
"narn_source" : [data source where NOAA_AR is obtained: GOES, SSW, or XRT],
"verification" : [verification flag: Primary, Secondary, or Non-verified]
}
ᵇ The XR_MAX series gives the maximum X-ray flux (1–8 Å), while XRQUALITY is the flag indicating the quality of that measurement.
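A flare label annotation of the form described in footnote a could be read with Python's standard `json` module, as in the minimal sketch below. The sample values (class, identifier, active region number) are made up for illustration, and the helper `parse_flare_label` and the fields it returns are hypothetical rather than part of the dataset's tooling.

```python
import json

# Keys the footnote lists as always present; NOAA_AR is "if available".
REQUIRED_KEYS = {"magnitude", "id", "verification"}


def parse_flare_label(raw: str) -> dict:
    """Parse one flare label annotation and extract a few convenient fields."""
    obj = json.loads(raw)
    missing = REQUIRED_KEYS - obj.keys()
    if missing:
        raise ValueError(f"annotation is missing keys: {sorted(missing)}")
    magnitude = str(obj["magnitude"])
    return {
        "goes_class": magnitude[:1].upper(),   # class letter, e.g. "M"
        "magnitude": magnitude,                # full class string, e.g. "M1.2"
        "noaa_ar": obj.get("NOAA_AR"),         # may be absent
        "location_verified": obj["verification"] in ("Primary", "Secondary"),
    }


# Made-up annotation, only to demonstrate the call.
example = """{
  "magnitude": "M1.2",
  "id": 12345,
  "NOAA_AR": 11158,
  "narn_source": "GOES",
  "verification": "Primary"
}"""
label = parse_flare_label(example)
```

Treating both Primary and Secondary as location-verified follows the three-way split in the Fig. 3 caption. Stricter analyses might keep only Primary.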
| Measurement(s) | solar flare • stellar radiation • solar magnetic data |
| Technology Type(s) | digital curation |
| Factor Type(s) | temporal interval • flare class • location |
| Sample Characteristic - Environment | star • climate system |