Author Contributions
Methodology, J.Y.; data curation, J.Y.; writing—original draft preparation, J.Y.; writing—review and editing, J.Y., H.W., B.Q., A.-L.L. and F.R.; supervision, H.W., B.Q., A.-L.L. and F.R.; project administration, B.Q.; funding acquisition, B.Q. All authors have read and agreed to the published version of the manuscript.
Figure 1.
(a) Raw light curve with temporal gaps. (b) The light curve after removing the temporal gaps.
Figure 2.
Examples of light curves from the Activity, β Cephei, Classical Cepheids, δ Scuti, Eclipsing variables, Ellipsoidal, γ Doradus, Rotating, RR Lyrae type AB, RR Lyrae type C, RV Tauri, Slowly pulsating B star, and Semiregular classes, selected from the training set.
Figure 3.
(a) Similar light curves of different types. (b) Fourier transform spectral images of the similar light curves. (c) Wavelet transform low-frequency spectral images of the similar light curves.
Figure 4.
(a) The light curve after removing the temporal gaps. (b) Low-frequency image generated by the wavelet transform. (c) High-frequency images generated by the wavelet transform. (d) Spectral image generated by the Fourier transform.
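As an illustration of how a wavelet transform separates a light curve into the low-frequency and high-frequency components shown in panels (b) and (c), the following is a minimal sketch of one level of the Haar discrete wavelet transform. The choice of the Haar wavelet, the function name `haar_dwt`, and the sample flux values are assumptions made for illustration; the wavelet and the image-generation step used for the figure are not specified in this caption.

```python
import math


def haar_dwt(signal):
    """One level of the Haar discrete wavelet transform.

    Returns (approx, detail): the low-frequency and high-frequency
    coefficients, each half the length of the (padded) input.
    """
    signal = list(signal)
    if len(signal) % 2:
        # Pad odd-length input by repeating the last sample.
        signal.append(signal[-1])
    s = math.sqrt(2.0)
    # Scaled pairwise sums capture the slow trend (low frequency).
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    # Scaled pairwise differences capture fast changes (high frequency).
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


# Hypothetical flux samples: a slow rise plus a fast alternating term.
flux = [1.0, 1.2, 1.1, 1.3, 1.2, 1.4, 1.3, 1.5]
low, high = haar_dwt(flux)
```

Because the Haar transform is orthonormal, the energy of `flux` is split exactly between `low` and `high`, and a constant signal yields all-zero detail coefficients.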
Figure 5.
Importance of the different features. The height of each bar corresponds to the average importance evaluated during multi-class classification.
Figure 6.
Detailed structure of the RLNet classification network. The network is used to classify variable stars into 11 categories. Note: blue area 1 is the improved residual structure.
Figure 7.
RLNet validation-set (a) accuracy and (b) loss as a function of training epoch.
Figure 8.
The normalized confusion matrix using RLNet. The closer the diagonal value is to 1, the better the classification performance. The values in the plot are rounded to two decimal places.
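A minimal sketch of how a normalized confusion matrix of this kind can be computed from raw counts is given below. It assumes row normalization (each row, taken as one true class, is divided by its total, so the diagonal entry is the fraction of that class classified correctly); the normalization axis, the function name `normalize_rows`, and the counts are illustrative assumptions rather than values from this work.

```python
def normalize_rows(cm):
    """Divide each row of a confusion matrix by its row total.

    Values are rounded to two decimal places, as in the plotted matrix.
    """
    normalized = []
    for row in cm:
        total = sum(row)
        normalized.append([round(v / total, 2) if total else 0.0 for v in row])
    return normalized


# Hypothetical counts for three classes (rows: true class, columns: predicted).
counts = [
    [95, 5, 0],
    [10, 80, 10],
    [0, 2, 48],
]
norm = normalize_rows(counts)
```

With these counts the diagonal of `norm` is 0.95, 0.8, and 0.96, so the second class is the one most often confused with its neighbors.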
Figure 9.
ROC curve of each type. Each curve represents a category, and the values in brackets give the area under the ROC curve (AUC). Macro-average and micro-average curves over all categories are also drawn. The dotted line represents the “random chance” curve.
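The per-category AUC and its macro-average can be sketched as follows. The sketch uses the standard rank interpretation of AUC (the probability that a randomly chosen positive is scored above a randomly chosen negative, with ties counted as one half) in a one-vs-rest setting. The function names and the scores are hypothetical and are not taken from RLNet.

```python
def roc_auc(labels, scores):
    """AUC of a binary problem via pairwise comparison of scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def one_vs_rest_auc(true_classes, class_scores, n_classes):
    """AUC of each class against all others, plus the macro-average."""
    aucs = []
    for k in range(n_classes):
        labels = [1 if c == k else 0 for c in true_classes]
        scores = [row[k] for row in class_scores]
        aucs.append(roc_auc(labels, scores))
    return aucs, sum(aucs) / n_classes


# Hypothetical binary example: three of four pairs are ranked correctly.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1])

# Hypothetical three-class scores (one row of class scores per light curve).
true = [0, 1, 2, 1]
probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6], [0.3, 0.5, 0.2]]
per_class, macro = one_vs_rest_auc(true, probs, 3)
```

A classifier that assigns the same score to every sample gets an AUC of 0.5 under this definition, which corresponds to the dotted diagonal in the figure.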
Figure 10.
PR curve of each type. Each curve represents a category, and the values in brackets represent the area under the P-R curve (AP).
Figure 11.
2D t-SNE projection of 10% of the Kepler training set, where each point is a light curve colored by its category label.
Figure 12.
Clustering results for the merged categories. Different colors represent different categories, and the blue crosses (×) mark the cluster centers. (a) ECL and BCEP. (b) DSCUT and GDOR. (c) ACT and MISC.
Figure 13.
Normalized confusion matrix using RLNet for the merged categories. (a) ECL and BCEP. (b) DSCUT and GDOR. (c) ACT and MISC. The closer the diagonal value is to 1, the better the classification performance. The values in the plot are rounded to two decimal places.
Figure 14.
Normalized confusion matrix using RLNet with the same categories as the ensemble classifier. The closer the diagonal value is to 1, the better the classification performance. The values in the plot are rounded to two decimal places.
Figure 15.
ROC curves when the light curves are classified into 14 categories. Each curve represents a category, and the values in brackets give the area under the ROC curve (AUC). Macro-average and micro-average curves over all categories are also drawn. The dotted line represents the “random chance” curve.
Table 1.
Number of light curves in each class, the number in each class of the training set after balancing (with light curves simulated via SMOTE), and the final number in each category used for classification.
Abbreviation | Class | Original Number | Balanced Number | Final Number |
---|---|---|---|---|
ACT | Activity | 18,250 | 19,874 | (merged into MISC) |
BCEP | β Cephei | 18,254 | 19,956 | 38,386 |
CLCEP | Classical Cepheids | 18 | 19,992 | 19,992 |
DSCUT | δ Scuti | 884 | 1137 | 1137 |
ECL | Eclipsing variables | 2322 | 18,430 | (merged into BCEP) |
ELL | Ellipsoidal | 191 | 19,881 | 19,881 |
GDOR | γ Doradus | 424 | 19,712 | (merged into DSCUT) |
MISC | Miscellaneous/No-variable | 90,388 | 20,000 | 39,874 |
ROT | Rotating | 7718 | 17,694 | 17,694 |
RRAB | RR Lyrae type AB | 4 | 8001 | 8001 |
RRC | RR Lyrae type C | 16 | 19,998 | 19,998 |
RVTAU | RV Tauri | 3 | 3928 | 3928 |
SPB | Slowly pulsating B star | 371 | 19,840 | 19,840 |
SR | Semiregular | 4 | 3991 | 3991 |
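To make the balancing step concrete, the core interpolation of SMOTE can be sketched as follows: a synthetic sample is placed at a random position on the line segment between a minority-class sample and one of its nearest minority-class neighbors. This is a simplified illustration of the general technique, not the exact procedure used to build the table; the function name `smote_sample`, the neighbor count `k`, and the two-dimensional feature vectors are assumptions.

```python
import random


def smote_sample(minority, k=2, rng=None):
    """Generate one synthetic sample from a list of minority-class vectors."""
    if rng is None:
        rng = random.Random(0)
    base = rng.choice(minority)
    # Rank the other minority samples by squared Euclidean distance to base.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbour = rng.choice(others[:k])
    # Interpolate at a random position between base and the chosen neighbour.
    lam = rng.random()
    return [a + lam * (b - a) for a, b in zip(base, neighbour)]


# Hypothetical feature vectors for a rare class with four members.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
rng = random.Random(42)
synthetic = [smote_sample(minority, k=2, rng=rng) for _ in range(20)]
```

Every synthetic vector lies between two existing samples, so the new points stay inside the region already occupied by the rare class. For classes with only three or four original light curves, this means the thousands of simulated samples are all interpolations between those few originals.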
Table 2.
Statistical and Kepler photometry features of the light curves, except for the frequency features.
Attribute Name | Description |
---|---|
IMAG | Sloan I magnitude |
JMAG | 2MASS J magnitude |
KMAG | 2MASS K magnitude |
TEFF | Effective surface temperature |
G.R.color | G-R color |
Radius | Stellar radius |
Log G | $\log_{10}$ of the surface gravity |
Std | Standard deviation of the flux |
Median | Median of the flux |
Skew | Statistical skewness of the flux distribution |
Kurtosis | Statistical kurtosis (peakedness) of the flux |
Beyond1st | Fraction of all data points above the first standard deviation of flux |
SSDev | Sum of the square of the difference of the flux from the median flux |
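The statistical features in the lower half of the table can be computed directly from a flux series, as in the following minimal sketch. Several details are assumptions, since the table gives only one-line descriptions: the population (not sample) standard deviation is used, kurtosis is reported as excess kurtosis, and Beyond1st is taken as the fraction of points more than one standard deviation from the mean on either side. The function name `flux_features` and the flux values are hypothetical.

```python
import statistics


def flux_features(flux):
    """Compute the flux-based statistical features listed in Table 2."""
    n = len(flux)
    mean = statistics.fmean(flux)
    median = statistics.median(flux)
    std = statistics.pstdev(flux)  # population standard deviation (assumption)
    if std > 0:
        skew = sum((x - mean) ** 3 for x in flux) / (n * std ** 3)
        # Excess kurtosis: 0 for a normal distribution (assumption).
        kurtosis = sum((x - mean) ** 4 for x in flux) / (n * std ** 4) - 3.0
    else:
        skew = kurtosis = 0.0  # constant flux has no shape to measure
    return {
        "Std": std,
        "Median": median,
        "Skew": skew,
        "Kurtosis": kurtosis,
        # Fraction of points farther than one std from the mean (assumption).
        "Beyond1st": sum(abs(x - mean) > std for x in flux) / n,
        # Sum of squared differences from the median flux.
        "SSDev": sum((x - median) ** 2 for x in flux),
    }


# Hypothetical flux series with one bright outlier.
features = flux_features([1.0, 2.0, 3.0, 4.0, 10.0])
```

For this series the outlier pulls the mean above the median, which gives a positive skew, and it is the only point counted by Beyond1st.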
Table 3.
Classification accuracy at different input dimensionalities using the Fourier transform and the wavelet transform.
Method / Dimensionality | 65 | 115 | 217 | 419 |
---|---|---|---|---|
Fourier transform | 27% | 62% | 71.6% | 89% |
Wavelet transform | 72% | 98.7% | 97.6% | 96% |
Table 4.
Accuracy of seven typical light curve classification methods and of our network, measured on the same test set.
Method | Accuracy |
---|---|
Random forest | 0.767 |
AlexNet (2012) [36] | 0.741 |
Mahabal et al. (2017) [2] | 0.401 |
Muthukrishna et al. (2019) [37] | 0.780 |
Linares et al. (2020) [38] | 0.768 |
Szklenár et al. (2020) [10] | 0.254 |
Morales et al. (2021) [9] | 0.934 |
Ours | 0.987 |
Table 5.
Precision of seven typical light curve classification methods and of our network for each category. The bold entries in the table highlight the best results in each column.
Method (precision) | MISC | BCEP | CLCEP | DSCUT | ELL | ROT | RRAB | RRC | RVTAU | SPB | SR |
---|---|---|---|---|---|---|---|---|---|---|---|
Random forest | 0.999 | 0.685 | 0.848 | 0.612 | 0.589 | 0.654 | 0.999 | 0.831 | 0.995 | 0.570 | 0.001 |
AlexNet (2012) [36] | 0.997 | 0.472 | 0.995 | 0.723 | 0.903 | 0.398 | 0.998 | 0.998 | 0.999 | 0.850 | 0.001 |
Mahabal et al. (2017) [2] | 0.394 | 0.110 | 0.481 | 0.607 | 0.304 | 0.182 | 0.969 | 0.559 | 0.001 | 0.376 | 0.001 |
Muthukrishna et al. (2019) [37] | 0.680 | 0.492 | 0.999 | 0.872 | 0.953 | 0.824 | 0.999 | 0.999 | 0.999 | 0.972 | 0.985 |
Linares et al. (2020) [38] | 0.499 | 0.593 | 0.997 | 0.821 | 0.911 | 0.779 | 0.999 | 0.994 | 0.998 | 0.905 | 0.838 |
Szklenár et al. (2020) [10] | 0.001 | 0.230 | 0.431 | 0.001 | 0.001 | 0.001 | 0.244 | 0.001 | 0.001 | 0.001 | 0.001 |
Morales et al. (2021) [9] | 0.999 | 0.892 | 0.998 | 0.791 | 0.894 | 0.913 | 0.999 | 0.998 | 0.996 | 0.898 | 0.995 |
Ours | 0.999 | 0.978 | 0.999 | 0.973 | 0.995 | 0.970 | 0.999 | 0.999 | 0.999 | 0.999 | 0.994 |
Table 6.
Recall of seven typical light curve classification methods and of our network for each category. The bold entries in the table highlight the best results in each column.
Method (recall) | MISC | BCEP | CLCEP | DSCUT | ELL | ROT | RRAB | RRC | RVTAU | SPB | SR |
---|---|---|---|---|---|---|---|---|---|---|---|
Random forest | 0.999 | 0.820 | 0.884 | 0.420 | 0.404 | 0.745 | 0.999 | 0.999 | 0.822 | 0.632 | 0.001 |
AlexNet (2012) [36] | 0.958 | 0.748 | 0.939 | 0.283 | 0.629 | 0.639 | 0.999 | 0.875 | 0.979 | 0.648 | 0.001 |
Mahabal et al. (2017) [2] | 0.990 | 0.035 | 0.594 | 0.018 | 0.256 | 0.001 | 0.629 | 0.664 | 0.001 | 0.770 | 0.001 |
Muthukrishna et al. (2019) [37] | 0.513 | 0.784 | 0.950 | 0.705 | 0.849 | 0.777 | 0.999 | 0.992 | 0.994 | 0.846 | 0.584 |
Linares et al. (2020) [38] | 0.806 | 0.213 | 0.999 | 0.792 | 0.925 | 0.728 | 0.999 | 0.999 | 0.999 | 0.940 | 0.999 |
Szklenár et al. (2020) [10] | 0.001 | 0.940 | 0.512 | 0.001 | 0.001 | 0.001 | 0.999 | 0.001 | 0.001 | 0.001 | 0.001 |
Morales et al. (2021) [9] | 0.999 | 0.859 | 0.995 | 0.895 | 0.882 | 0.828 | 0.999 | 0.999 | 0.999 | 0.923 | 0.997 |
Ours | 0.999 | 0.971 | 0.999 | 0.989 | 0.995 | 0.968 | 0.999 | 0.999 | 0.999 | 0.999 | 0.999 |
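For reference, the per-category precision and recall reported in Tables 5 and 6 follow from a confusion matrix of raw counts, as in the sketch below: precision divides the diagonal entry by its column total (everything predicted as that category), and recall divides it by its row total (everything that truly belongs to it). The function name `precision_recall` and the counts are hypothetical, and the rows-as-true-class orientation is an assumption.

```python
def precision_recall(cm):
    """Return a (precision, recall) pair for each class of a confusion matrix.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        predicted = sum(cm[i][k] for i in range(n))  # column total
        actual = sum(cm[k])                          # row total
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        results.append((precision, recall))
    return results


# Hypothetical two-class counts.
counts = [
    [90, 10],
    [30, 70],
]
scores = precision_recall(counts)
```

Here the first class has higher recall (0.9) than precision (0.75), because 30 samples of the second class are wrongly assigned to it. The same asymmetry explains why a method can score well in one table and poorly in the other for a given category.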