Author Contributions
Conceptualization, S.K., S.H.K., and Y.L.; methodology, S.K., S.H.K. and Y.L.; formal analysis, S.K.; data curation, S.K.; writing—original draft preparation, S.K.; writing—review and editing, S.K., Y.Y., M.K., J.K., W.C., S.H.K. and Y.L.; project administration, Y.L. All authors have read and agreed to the published version of the manuscript.
Figure 1.
Location of South Korea and distribution of inland AERONET sites (n = 26, yellow dots).
Figure 2.
Overview of the Random Forest-based high-resolution AOD retrieval system.
Figure 3.
Scatter density plots of predicted versus observed AERONET AOD for 2024: (a) Model 1 (trained on 2015–2023), (b) Model 2 (2019–2023), and (c) Model 3 (2021–2023). The black dashed lines represent the 1:1 reference, while red solid lines indicate the linear regression fits. Gray shaded areas denote the MODIS expected error (EE) envelope of ±(0.05 + 0.15 × AOD). The color scale reflects the density of data points.
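The expected error envelope in these panels can be checked numerically. The following is a minimal sketch, assuming the AOD term in ±(0.05 + 0.15 × AOD) is the AERONET-observed value; the function names and the sample pairs are hypothetical and are not taken from the study.

```python
def ee_half_width(observed_aod):
    """Half-width of the expected error envelope: 0.05 + 0.15 * AOD."""
    return 0.05 + 0.15 * observed_aod


def fraction_within_ee(predicted, observed):
    """Fraction of matched pairs whose prediction lies inside the envelope."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    inside = sum(
        abs(p - o) <= ee_half_width(o) for p, o in zip(predicted, observed)
    )
    return inside / len(observed)


# Hypothetical matched pairs, for illustration only (not data from the study).
example_observed = [0.10, 0.40, 1.00]
example_predicted = [0.12, 0.55, 1.15]
example_fraction = fraction_within_ee(example_predicted, example_observed)
```

Because the envelope widens with AOD, the same absolute error of 0.15 falls outside the envelope at an observed AOD of 0.40 but inside it at 1.00.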
Figure 4.
(a–e) Bar charts showing sample size (n), bias, MAE, RMSE, and correlation coefficient (R) by AERONET site, color-coded by land cover: Urban (orange), Forest (green), Cropland (yellow), and Coastal (blue). (f) Mean AOD comparison between AERONET (black) and RF model (red) across all sites.
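The per-site statistics in panels (a–e) can be computed from matched prediction and observation pairs as sketched below. This is an illustration under stated assumptions: bias is taken as the mean of (predicted − observed), R is the Pearson correlation coefficient, and the function name `validation_metrics` is hypothetical.

```python
import math


def validation_metrics(predicted, observed):
    """Return sample size, bias, MAE, RMSE, and Pearson R for matched pairs."""
    n = len(observed)
    if n < 2 or len(predicted) != n:
        raise ValueError("need at least two matched pairs of equal length")

    errors = [p - o for p, o in zip(predicted, observed)]
    bias = sum(errors) / n                      # assumed sign: model minus AERONET
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    # Pearson R is undefined when either series is constant.
    r = cov / math.sqrt(var_p * var_o)

    return {"n": n, "bias": bias, "mae": mae, "rmse": rmse, "r": r}
```

Grouping the matched pairs by site (or by land cover class) and calling the function once per group would give one bar per site for each metric.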
Figure 5.
Scatter density plots of RF-predicted versus AERONET-observed AOD for 2024, categorized by land cover type: (a) Urban, (b) Forest, (c) Cropland, and (d) Coastal. The black dashed lines represent the 1:1 reference, the red solid lines indicate the linear regression fits, and the gray shaded areas denote the MODIS expected error (EE) envelope of ±(0.05 + 0.15 × AOD).
Figure 6.
Seasonal scatter density plots of RF-predicted versus AERONET-observed AOD in 2024: (a) Spring, (b) Summer, (c) Autumn, and (d) Winter. The black dashed lines represent the 1:1 reference, the red solid lines indicate the linear regression fits, and the gray shaded areas denote the expected error (EE) envelope of ±(0.05 + 0.15 × AOD).
Figure 7.
Time series comparison of RF model AOD (blue dots) and AERONET AOD observations (black crosses with gray line) for representative cases across different AOD ranges: (a) Clean (0.0–0.2, AAQ1_SK_Osan), (b) Thin (0.2–0.4, Hankuk_UFS), (c) Thick (0.4–0.8, Gangneung_WNU), and (d) Very Thick (≥0.8, Gosan_SNU). Shaded gray areas mark nighttime, when AERONET data are unavailable.
Figure 8.
Seasonal comparison of AOD products over South Korea in 2024. Each panel displays two consecutive days from one season: (a) Spring (April 20–21), (b) Summer (August 19–20), (c) Autumn (October 1–2), and (d) Winter (December 9–10). The rows represent GOCI-II retrievals (top; 3 km resolution), CAMS forecasts (middle; ~40 km resolution), and the proposed RF model (bottom; 1.5 km resolution). All data are presented at 3-hourly intervals from 00:00 to 09:00 UTC, with GOCI-II matched to the nearest available observation times.
Figure 9.
Scatter density plots comparing AOD retrievals with AERONET observations for the full year of 2024: (a) original CAMS forecast and (b) the proposed RF model. The black dashed lines represent the 1:1 reference, the red solid lines indicate the linear regression fits, and the gray shaded areas denote the expected error (EE) envelope. The color bars represent the density of data points.
Figure 10.
Comparison of CAMS and our model AOD spatial distributions for (a) a high-AOD case (January 12) and (b) a moderate background case (February 27). Triangles indicate AERONET station locations with observed AOD values. The right panels show magnified views of the metropolitan region with detailed AERONET station values.
Figure 11.
Empirical orthogonal function (EOF) decomposition of the Korea-domain AOD field for 2024. The left panels show spatial loadings (unitless), and the right panels show the corresponding standardized principal component (PC) time series. The fraction of explained variance by each mode is indicated in parentheses: (a) Mode 1 (78.3%), (b) Mode 2 (8.1%), and (c) Mode 3 (4.4%). The sign of the patterns is arbitrary; positive PC values indicate an increase in AOD over red areas and a decrease over blue regions.
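An EOF decomposition of this kind is commonly computed with a singular value decomposition of the time-by-space anomaly matrix. The sketch below illustrates that general technique and is not necessarily the procedure used for this figure: it assumes a NumPy array of shape (time, grid points) with no missing values and applies no area weighting, and the function name `eof_decomposition` is hypothetical.

```python
import numpy as np


def eof_decomposition(field, n_modes=3):
    """Return spatial loadings, standardized PCs, and explained variance fractions.

    field: array of shape (n_times, n_gridpoints), assumed free of NaNs.
    """
    # Remove the time mean at every grid point to obtain anomalies.
    anomalies = field - field.mean(axis=0, keepdims=True)

    # Rows of vt are the spatial patterns; columns of u scaled by s are the PCs.
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)

    # Each mode's share of the total variance.
    explained = s**2 / np.sum(s**2)

    pcs = u[:, :n_modes] * s[:n_modes]
    pcs_standardized = pcs / pcs.std(axis=0)  # unit variance per mode

    return vt[:n_modes], pcs_standardized, explained[:n_modes]
```

Multiplying a loading pattern and its PC by −1 at the same time leaves their product unchanged, which is why the sign of each pattern is arbitrary.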
Table 1.
Datasets and parameters used in this study.
| Datasets | Parameter | Description | Resolution |
|---|---|---|---|
| AERONET AOD | AOD_500 nm | AOD at 500 nm | |
| | 440–675_Angstrom_Exponent | Ångström Exponent (440–675 nm) | |
| CAMS AOD | Total aerosol optical depth at 550 nm | AOD at 550 nm | 40 km/1 h |
| MAIAC AOD | AOD_550 nm | AOD at 550 nm | 1 km/1 day |
| LDAPS | TDSWS | Total Downward Surface Shortwave Flux | 1.5 km/3 h |
| | NCPCP | Large-Scale Precipitation | |
| | UGRD | U-Component of Wind | |
| | VGRD | V-Component of Wind | |
| | LHTFL | Latent Heat Net Flux | |
| | TMP | Temperature | |
| | RH | Relative Humidity | |
| | DPT | Dew Point Temperature | |
| | LCDC | Low Cloud Cover | |
| | HPBL | Planetary Boundary Layer Height | |
| | PRES | Surface Pressure | |
| AirKorea | SO2 | Sulfur Dioxide (SO2) | 1 h |
| | CO | Carbon Monoxide | |
| | NO2 | Nitrogen Dioxide (NO2) | |
| | O3 | Ozone (O3) | |
| | PM10 | Particulate Matter ≤ 10 μm | |
| | PM2.5 | Particulate Matter ≤ 2.5 μm | |
| KMA ASOS | VIS | Visibility | 1 h |
Table 2.
Leave-One-Year-Out (LOYO) cross-validation results for Model 1 with a long-term training period (2015–2023).
| Test Year | Bias | MAE | RMSE | R | Samples (Total) |
|---|---|---|---|---|---|
| 2015 | −0.027 | 0.108 | 0.175 | 0.821 | 2403 |
| 2016 | −0.028 | 0.119 | 0.194 | 0.798 | 5346 |
| 2017 | −0.004 | 0.098 | 0.163 | 0.858 | 3560 |
| 2018 | 0.035 | 0.095 | 0.146 | 0.855 | 3384 |
| 2019 | 0.016 | 0.099 | 0.158 | 0.819 | 2389 |
| 2020 | 0.007 | 0.075 | 0.134 | 0.812 | 2735 |
| 2021 | 0.009 | 0.076 | 0.121 | 0.827 | 4016 |
| 2022 | −0.023 | 0.078 | 0.138 | 0.809 | 4535 |
| 2023 | −0.004 | 0.077 | 0.125 | 0.825 | 5491 |
| Average | −0.002 | 0.092 | 0.150 | 0.825 | 33,857 |
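The Leave-One-Year-Out scheme holds out each calendar year in turn and trains on the remaining years. A minimal sketch of the split logic follows; the function name `leave_one_year_out` and the list-of-years input are assumptions for illustration, and model training itself is omitted.

```python
def leave_one_year_out(sample_years):
    """Yield (held_out_year, train_indices, test_indices) for each year present.

    sample_years: one calendar year per sample, in sample order.
    """
    for held_out in sorted(set(sample_years)):
        train = [i for i, year in enumerate(sample_years) if year != held_out]
        test = [i for i, year in enumerate(sample_years) if year == held_out]
        yield held_out, train, test


# Hypothetical sample years, for illustration only.
example_years = [2021, 2021, 2022, 2023, 2023, 2023]
example_folds = list(leave_one_year_out(example_years))
```

Each fold would produce one row of the table: the model is fitted on the `train` indices and the metrics are evaluated on the `test` indices of the held-out year.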
Table 3.
Leave-One-Year-Out (LOYO) cross-validation results for Model 2 with a mid-term training period (2019–2023).
| Test Year | Bias | MAE | RMSE | R | Samples (Total) |
|---|---|---|---|---|---|
| 2019 | 0.018 | 0.103 | 0.166 | 0.799 | 2389 |
| 2020 | 0.006 | 0.075 | 0.133 | 0.814 | 2735 |
| 2021 | 0.013 | 0.077 | 0.122 | 0.823 | 4016 |
| 2022 | −0.026 | 0.078 | 0.139 | 0.812 | 4535 |
| 2023 | −0.003 | 0.078 | 0.126 | 0.821 | 5491 |
| Average | 0.002 | 0.082 | 0.137 | 0.814 | 19,166 |
Table 4.
Leave-One-Year-Out (LOYO) cross-validation results for Model 3 with a short-term training period (2021–2023).
| Test Year | Bias | MAE | RMSE | R | Samples (Total) |
|---|---|---|---|---|---|
| 2021 | 0.025 | 0.082 | 0.126 | 0.820 | 4016 |
| 2022 | −0.024 | 0.079 | 0.140 | 0.807 | 4535 |
| 2023 | 0.003 | 0.079 | 0.127 | 0.818 | 5491 |
| Average | 0.001 | 0.080 | 0.131 | 0.815 | 14,042 |
Table 5.
Summary of model performance comparison. Best scores are in bold.
| Model | Training Period | LOYO RMSE | LOYO R | 2024 RMSE | 2024 R |
|---|---|---|---|---|---|
| (1) Long-term | 2015–2023 | 0.150 | **0.825** | **0.125** | **0.841** |
| (2) Mid-term | 2019–2023 | 0.137 | 0.814 | 0.128 | 0.835 |
| (3) Short-term | 2021–2023 | **0.131** | 0.815 | 0.127 | 0.833 |
Table 6.
Feature importance of the optimal model.
| Feature | Importance | Feature | Importance |
|---|---|---|---|
| CAMS | 0.63665 | VGRD | 0.01598 |
| PM25 | 0.04931 | TDSWS | 0.01588 |
| VIS | 0.04479 | PRES | 0.01584 |
| DPT | 0.02530 | HPBL | 0.01562 |
| TMP | 0.02491 | NO2 | 0.01479 |
| LCDC | 0.02463 | RH | 0.01450 |
| PM10 | 0.01995 | LHTFL | 0.01344 |
| O3 | 0.01692 | SO2 | 0.00782 |
| UGRD | 0.01675 | CO | 0.00726 |
| Monthly Climatology | 0.01655 | NCPCP | 0.00310 |
Table 7.
Comparison of CAMS and our model AOD with AERONET observations at multiple sites.
| Date and Time (YYYYMMDDHH) | Station | Latitude (°) | Longitude (°) | CAMS | Ours | AERONET |
|---|---|---|---|---|---|---|
| 2024011200 | Gosan_SNU | 33.29222 | 126.1617 | 0.524 | 0.725 | 1.044 |
| 2024011200 | Anmyon | 36.53854 | 126.3302 | 0.199 | 0.321 | 0.301 |
| 2024011200 | AAQ3_SK_CBNU | 36.62630 | 127.4566 | 0.290 | 0.387 | 0.472 |
| 2024011200 | Hankuk_UFS | 37.33883 | 127.2658 | 0.183 | 0.259 | 0.334 |
| 2024011200 | Seoul_SNU | 37.45806 | 126.9511 | 0.174 | 0.300 | 0.348 |
| 2024011200 | Yonsei_University | 37.56443 | 126.9348 | 0.174 | 0.364 | 0.298 |
| 2024011200 | Gangneung_WNU | 37.77100 | 128.8670 | 0.104 | 0.124 | 0.097 |
| 2024022706 | Gwangju_GIST | 35.22828 | 126.8431 | 0.280 | 0.197 | 0.200 |
| 2024022706 | Hankuk_UFS | 37.33883 | 127.2658 | 0.320 | 0.227 | 0.226 |
| 2024022706 | Seoul_SNU | 37.45806 | 126.9511 | 0.380 | 0.258 | 0.262 |
| 2024022706 | Yonsei_University | 37.56443 | 126.9348 | 0.380 | 0.263 | 0.191 |
| 2024022706 | Incheon | 37.56882 | 126.6372 | 0.529 | 0.328 | 0.330 |
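The station values in the table can be summarized as a mean absolute error per case. The sketch below transcribes the CAMS, model, and AERONET columns from the table; the variable and function names are hypothetical, and the summary is a simple illustration rather than an analysis reported by the authors.

```python
# (CAMS, Ours, AERONET) triples transcribed from the table above.
CASE_20240112_00 = [
    (0.524, 0.725, 1.044),  # Gosan_SNU
    (0.199, 0.321, 0.301),  # Anmyon
    (0.290, 0.387, 0.472),  # AAQ3_SK_CBNU
    (0.183, 0.259, 0.334),  # Hankuk_UFS
    (0.174, 0.300, 0.348),  # Seoul_SNU
    (0.174, 0.364, 0.298),  # Yonsei_University
    (0.104, 0.124, 0.097),  # Gangneung_WNU
]
CASE_20240227_06 = [
    (0.280, 0.197, 0.200),  # Gwangju_GIST
    (0.320, 0.227, 0.226),  # Hankuk_UFS
    (0.380, 0.258, 0.262),  # Seoul_SNU
    (0.380, 0.263, 0.191),  # Yonsei_University
    (0.529, 0.328, 0.330),  # Incheon
]


def case_mae(rows):
    """Return (CAMS MAE, model MAE) against AERONET for one case."""
    n = len(rows)
    cams_mae = sum(abs(cams - obs) for cams, _, obs in rows) / n
    ours_mae = sum(abs(ours - obs) for _, ours, obs in rows) / n
    return cams_mae, ours_mae


high_aod_mae = case_mae(CASE_20240112_00)
background_mae = case_mae(CASE_20240227_06)
```

For these rows the mean absolute error is about 0.180 for CAMS and 0.091 for the model in the January 12 case, and about 0.136 and 0.016 in the February 27 case.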