Impact of the Assimilation of Multi-Platform Observations on Heavy Rainfall Forecasts in Kong-Chi Basin, Thailand

Abstract: Data assimilation with a Numerical Weather Prediction (NWP) model using a regional observation system is becoming more prevalent in local weather forecasting as a means of reducing disaster risk. In this study, we evaluated the predictive capabilities of multi-platform observation assimilation based on the WRFDA (Weather Research and Forecasting model data assimilation) system with 9 km grid spacing over the Kong-Chi basin (KCB), where tropical storms and heavy rainfall occur frequently. Data assimilation experiments were carried out with two assimilation schemes: (1) assimilating the combined multi-platform observations of PREPBUFR data from the National Centers for Environmental Prediction (NCEP) and Automatic Weather Station (AWS) data from the National Hydroinformatics Data Center in Thailand, and (2) assimilating the AWS data only; these are referred to as DAALL and DAAWS, respectively. The skill scores of the assimilation experiments with lead times of 48 h and 72 h were evaluated by comparing their 3 h accumulated rainfall and mean temperatures with the AWS observations for the heavy rainfall events that occurred on 28 July 2017 and 30 August 2019. The results show that DAALL improved the statistical skill scores by improving the pattern and intensity of the heavy rainfall events, and DAAWS also improved the near-surface forecasts at station locations. The accuracy of the two assimilations for 3 h accumulated rainfall with a 5 mm threshold was only above 70%, but the threat score was acceptable. The observed and forecast temperatures were significantly correlated, with a coefficient greater than 0.85, while the mean absolute error of the mean temperature remained below 1.75 °C, even at the 48 h lead time. The real-time AWS observation variables, combined with the weather forecasting model, were evaluated for unprecedented rain events in the KCB.
The scores suggested that the assimilation of the multi-platform observations at the 48 h lead time has an impact on heavy rainfall prediction in terms of the threat score, compared to the assimilation of AWS data only. The reason for this could be that fewer observations of the AWS data affected the WRFDA model.


Introduction
Heavy rains are tending to increase in frequency and intensity with the potential to cause heavy damage in specific locations [1]. In weather forecasting, accurate, spatial and timely forecasts play an important role in supporting early warning and evacuations, as well as in reducing disasters due to rain [2,3]. To provide better accuracy on weather forecasts at local scales, data source information in the area should be applied to the forecast model, which could be achieved by the data assimilation technique [4][5][6][7]. This technique updates the initial conditions and boundary conditions of the model, which could help to provide the best possible atmospheric conditions [8,9]. Therefore, meteorological observations are important components in data assimilation systems, and success is primarily associated with observations in a period of time that is sufficient to provide an initial state of the atmosphere that is balanced and reasonably accurate.
Observations from the global observing system and from ground stations of local networks are available, and merging these observations into a single numerical model is necessary to improve model performance. The Weather Research and Forecasting (WRF) model data assimilation (WRFDA) system has therefore been developed and made available to researchers and the mesoscale community [10], with the main goals of minimizing a cost function and finding the optimal combination of observations with previous weather forecasts. The WRFDA supports a variety of techniques. Optimal interpolation [11,12] is a statistical data assimilation method based on multi-dimensional analysis equations; it was used operationally from 1979 until 1996, when it was replaced by the three-dimensional variational (3DVAR) data assimilation method. In recent years, 3DVAR and 4DVAR (four-dimensional variational) methods have become popular in weather forecasting [13,14] because of their advantage in physical assimilation; they satisfy the dynamic and thermodynamic constraints of the mesoscale model and can combine conventional and non-conventional data from different sources and time intervals. 4DVAR [15] and Kalman filters [16] are generally found to be more effective than 3DVAR, but they remain computationally expensive. The major difference between variational and interpolation methods of data assimilation is that, while interpolation methods use only observational data in the vicinity of the analysis point, variational methods use all available data observed over the domain and minimize the errors globally. As a consequence, the initial data provided by variational methods are less noisy and better balanced than those produced by interpolation algorithms. In recent years, high-resolution weather models with 3D/4DVAR have been increasingly applied to short-term weather prediction [17][18][19][20][21][22][23].
While these studies make a notable contribution to the short-term forecasting of weather events, they are not representative enough to indicate the performance of 3DVAR for the intensity of rainfall weather forecasting.
Ha et al. [19] designed an assimilation of Doppler radar data and surface data, which improved the accuracy of rainfall forecasts compared to the assimilation of either radar data or surface data only. The assimilation of surface observations in WRF-3DVAR improved the simulation of two heavy rainfall events in the Indian monsoon region in July 2005 and July 2006; assimilation can reduce uncertainty [24], and surface observations yield precise data that can be combined with the Numerical Weather Prediction (NWP) model [25]. Kalra et al. [26] evaluated the influence of the 3DVAR data assimilation technique on the simulation of heavy rainfall events along the east coast of India using observations from the Global Telecommunication System (GTS). These studies showed that the 3DVAR technique has a positive effect on the simulation of heavy rainfall. Short-term forecasting in specific areas is expected to improve as the density of observation systems increases, and a higher frequency and quality of observations, in combination with an assimilation technique that is appropriate to a particular area, should improve the accuracy of local area forecasting. In Thailand, there are many data sources, but the data assimilation technique is still in the experimental stage.
The region of interest for this research is the Kong-Chi basin (KCB) in the northeast of Thailand, a region that, because of its socio-economic relevance, has been the subject of many weather forecast studies in recent years [27][28][29]. There are two important challenges for weather forecasts over the KCB. The first is the ability to represent the intense monsoon season and the heavy rainfall events that develop in this region; there is a general consensus that, because these rainfall events are difficult to forecast, models tend to overestimate them. The second is that an accurate representation of the KCB, along with the associated spatial distribution, timing, and intensity of its rainfall, is required to support disaster information.
This study merges observation datasets from the PREPBUFR datasets of the National Centers for Environmental Prediction (NCEP) and Automatic Weather Stations (AWS) of Thailand to present the impact of the assimilation of multi-platform observations of the forecast system based on numerical predictions with the WRFDA model on short term forecasts of heavy rainfall events over the KCB. 3DVAR is adopted for assimilation in this study because, compared to the advanced 4DVAR, it is considered to be the better option to be applied to a short-term weather forecasting model that improves the initial conditions and provides reliable forecasts using reasonable computational resources. Rainfall and temperature forecast skill levels are evaluated during a period of two events. The emphasis on rainfall and temperature follows the approach taken at the World Meteorological Organization (WMO), whose priority is to develop datasets on these two variables. Section 2 presents the model configuration, the assimilation methodology, and the data assimilation experiment. Section 3 presents the assimilation data and data used for the evaluation of results. Section 4 describes the verification methods and discusses the assimilation of rainfall and temperature forecasts. A summary and conclusion are given in Section 5.

Rain Events Identification
The two heavy rainfall events studied in this work occurred on 28 July 2017 and 30 August 2019. These events were selected because they were the two heaviest rainfall events over the KCB. On 28 July 2017, the SONCA storm was active in several parts of the KCB, with rainfall of up to 100-110 mm (see Figure 1a); the event was described by the Department of Disaster Prevention and Mitigation of Thailand as the worst in 30 years. A total of 460,000 farmers and more than 609,425 households were reported affected. The PODUL event brought heavy rainfall of up to 80-90 mm on 30 August 2019 (see Figure 1b); the accumulated rainfall was lower, but more than 33 people were reported dead.

Model Setup
Forecasting was conducted with the 3DVAR component of the Advanced Research WRF model [30], version 3.9, a state-of-the-art weather prediction system designed to serve both atmospheric research and operational forecasting applications, which includes the WRFDA system. Static geographical data, i.e., terrain elevation, soil types, land use, and land cover, from the global dataset of the United States Geological Survey (USGS) were used to create surface boundary conditions for the three nested domains. The model's parent domain (Domain01, d01) covered South East Asia (SEA) with a grid spacing of 27 km, the second domain (Domain02, d02) covered Thailand with a 9 km grid spacing, and the innermost domain (Domain03, d03) covered the KCB with a 3 km grid spacing (see Figure 2a). The WRF physics parameterization schemes, including the Rapid Radiative Transfer Model (RRTM) longwave radiation scheme [31], Dudhia shortwave radiation, the Yonsei University (YSU) planetary boundary layer scheme [32], the Noah land surface model [33], Eta microphysics, and the Betts-Miller-Janjic (BMJ) cumulus parameterization [34,35], were used in all three domains. These physics schemes were chosen because they were found to produce the best forecasts of extreme weather events over Thailand and the surrounding areas [35][36][37]. A summary of the model configuration and parameterization is presented in Table 1.
The observations obtained from PREPBUFR and AWS are assimilated with 3DVAR [10,30]. The 3DVAR analysis represents an optimal estimate of the atmospheric state and allows us to perform an analysis at the initial time. The WRFDA system was first developed by the Mesoscale and Microscale Meteorology Laboratory (MMM) of NCAR (National Center for Atmospheric Research) as an incremental analysis system that utilizes observations of pressure, wind, temperature, and relative humidity.
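As an illustration of the nested-domain and physics configuration described above, the following minimal sketch renders the stated grid spacings and schemes as a WRF-style namelist fragment. The option numbers are assumptions based on common WRF conventions rather than values taken from this study, the helper names are hypothetical, and domain sizes and other required namelist entries are deliberately omitted; the fragment should be checked against the documentation of the WRF version in use.

```python
# Illustrative sketch only: grid spacings and scheme names come from the text;
# the numeric option codes are assumed from common WRF conventions.

GRID_SPACING_M = [27000, 9000, 3000]  # d01 (SEA), d02 (Thailand), d03 (KCB)

PHYSICS = {
    "mp_physics": 5,          # Eta (Ferrier) microphysics (assumed code)
    "ra_lw_physics": 1,       # RRTM longwave radiation (assumed code)
    "ra_sw_physics": 1,       # Dudhia shortwave radiation (assumed code)
    "bl_pbl_physics": 1,      # YSU planetary boundary layer (assumed code)
    "sf_surface_physics": 2,  # Noah land surface model (assumed code)
    "cu_physics": 2,          # Betts-Miller-Janjic cumulus (assumed code)
}


def parent_grid_ratios(spacings):
    """Return the nest ratio of each domain relative to its parent (1 for d01)."""
    ratios = [1]
    for parent, child in zip(spacings, spacings[1:]):
        if parent % child != 0:
            raise ValueError("grid spacings must nest with an integer ratio")
        ratios.append(parent // child)
    return ratios


def render_namelist(spacings, physics):
    """Build a namelist-style text fragment for the nested domains."""
    n_dom = len(spacings)

    def fmt(values):
        return ", ".join(str(v) for v in values)

    lines = [
        "&domains",
        f" max_dom = {n_dom},",
        f" dx = {fmt(spacings)},",
        f" dy = {fmt(spacings)},",
        f" parent_grid_ratio = {fmt(parent_grid_ratios(spacings))},",
        "/",
        "&physics",
    ]
    for key, value in physics.items():
        # The same scheme is used in all three domains, as stated in the text.
        lines.append(f" {key} = {fmt([value] * n_dom)},")
    lines.append("/")
    return "\n".join(lines)


print(render_namelist(GRID_SPACING_M, PHYSICS))
```

The 27/9/3 km spacings give a nest ratio of 3 at each level, which is the ratio commonly used for one-way or two-way WRF nests.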
At present, this system can assimilate conventional and non-conventional data. The WRF-3DVAR scheme minimizes a cost function that measures the distance between the analysis and the available observations, as well as the distance between the analysis and the background. The cost function is defined as:
$$J(x) = J_b(x) + J_o(x) = \frac{1}{2}(x - x_b)^{T} B^{-1} (x - x_b) + \frac{1}{2}(Hx - y_o)^{T} (E + F)^{-1} (Hx - y_o)$$
where $J(x)$ is the cost function, which includes the background $J_b(x)$ and observational $J_o(x)$ terms; $x$ is the analysis state from the WRF-Var system, and $x_b$ is the first guess from the WRF model; $y_o$ is the observation vector, and $Hx$ is the model-derived observation transformed from the analysis $x$ by the observation operator $H$ for comparison against $y_o$. $B$ is the background error covariance matrix, $E$ is the observational (instrumental) error covariance matrix, and $F$ is the "representivity" error covariance matrix (i.e., the error associated with the observation operator). In this study, B is estimated based on the standard National Meteorological Centre (NMC) method [11] for our forecasting domain. The details of the WRF-3DVAR components and their applications can be found in [12,13].
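To make the role of each term in the cost function concrete, the sketch below solves a toy 3DVAR problem with three state values and a single observation. It assumes a linear observation operator, for which the minimizer has a closed form; this is not how WRFDA performs its minimization (WRFDA works iteratively in control-variable space), and all numbers and function names are hypothetical.

```python
import numpy as np


def three_dvar_analysis(x_b, y_o, H, B, R):
    """Closed-form minimizer of the 3DVAR cost function for a linear H.

    x_a = x_b + B H^T (H B H^T + R)^-1 (y_o - H x_b), where R stands for E + F.
    """
    innovation = y_o - H @ x_b
    innovation_cov = H @ B @ H.T + R
    return x_b + B @ H.T @ np.linalg.solve(innovation_cov, innovation)


def cost(x, x_b, y_o, H, B, R):
    """Evaluate J(x) = J_b(x) + J_o(x)."""
    d_b = x - x_b
    d_o = H @ x - y_o
    j_b = 0.5 * d_b @ np.linalg.solve(B, d_b)
    j_o = 0.5 * d_o @ np.linalg.solve(R, d_o)
    return float(j_b + j_o)


# Hypothetical toy problem: three neighboring temperatures (K), one observed.
x_b = np.array([300.0, 301.0, 302.0])   # background (first guess)
H = np.array([[0.0, 1.0, 0.0]])         # observe the middle value only
y_o = np.array([303.0])                 # observation
B = np.array([[1.0, 0.5, 0.25],         # background errors, spatially correlated
              [0.5, 1.0, 0.5],
              [0.25, 0.5, 1.0]])
R = np.array([[1.0]])                   # observation error, E + F

x_a = three_dvar_analysis(x_b, y_o, H, B, R)
print("analysis:", x_a)
print("J(background) =", cost(x_b, x_b, y_o, H, B, R))
print("J(analysis)   =", cost(x_a, x_b, y_o, H, B, R))
```

Because the off-diagonal entries of B are nonzero, the single observation also adjusts the two unobserved neighbors, which is how a sparse network can influence the analysis away from the station locations.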

Assimilation Experiments
The statistical skill scores, namely accuracy (ACC), bias score (BIAS), and threat score (TS), were first computed at different rainfall thresholds for experiments without data assimilation (CNTL) and with data assimilation (DA) using only AWS data from Thailand; the experiments for each event were validated by comparing them with the AWS observations in the study area. The validation results are given in the Supplementary Materials. As shown in Figure S1, the skill scores of the DA experiments are higher than those of the CNTL experiments for the heavy rainfall events over the KCB. Hence, in this work, we focus only on the analysis of forecasts with assimilation.
To examine the effect of assimilating the combination of multi-platform observations from NCEP PREPBUFR and AWS data, compared with assimilating only the AWS observations from Thailand, on rainfall forecasts in the KCB in northeastern Thailand with 48 h and 72 h lead times, two sets of assimilation experiments were designed: (1) DAALL, which assimilates observations from the NCEP PREPBUFR datasets (comprising land surface, marine surface, radiosonde, pilot-balloon, and aircraft reports from the GTS) together with the AWS data, and (2) DAAWS, which assimilates observations from AWS only. The experiments were run with lead times of 48 h and 72 h, initialized at 00 UTC, for the heavy rainfall events of 28 July 2017 and 30 August 2019, which are referred to as the SONCA event and the PODUL event, respectively. Global Forecast System (GFS) forecast data with a 0.5° × 0.5° horizontal resolution were employed to generate the initial fields and boundary conditions. For all assimilation experiments, the assimilation window is 3 h. The details of these assimilation experiments are listed in Table 2.
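The experiment design above amounts to a small matrix of two events, two assimilation schemes, and two lead times. The following sketch enumerates that matrix; the dictionary layout and function name are hypothetical, and the initial time is assumed to be 00 UTC on the event date minus the lead time.

```python
from datetime import datetime, timedelta
from itertools import product

# Event dates, schemes, and lead times as described in the text.
EVENTS = {
    "SONCA": datetime(2017, 7, 28, 0),
    "PODUL": datetime(2019, 8, 30, 0),
}
SCHEMES = {
    "DAALL": ("PREPBUFR", "AWS"),  # multi-platform observations plus AWS
    "DAAWS": ("AWS",),             # AWS only
}
LEAD_TIMES_H = (72, 48)
ASSIMILATION_WINDOW_H = 3


def build_experiments():
    """Enumerate every event / scheme / lead-time combination."""
    experiments = []
    for (event, event_time), (scheme, sources), lead in product(
        EVENTS.items(), SCHEMES.items(), LEAD_TIMES_H
    ):
        experiments.append({
            "name": f"{scheme}-{lead}h",
            "event": event,
            "observations": sources,
            "lead_time_h": lead,
            "window_h": ASSIMILATION_WINDOW_H,
            # Assumption: the run starts `lead` hours before 00 UTC of the event date.
            "init_time": event_time - timedelta(hours=lead),
        })
    return experiments


for exp in build_experiments():
    print(exp["event"], exp["name"], exp["init_time"].strftime("%Y-%m-%d %H UTC"))
```

Under this assumption, the 72 h SONCA runs start at 00 UTC on 25 July 2017.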

Observations Network
The assimilated data include the NCEP PREPBUFR data and the AWS data network of Thailand. The PREPBUFR datasets were obtained from "ds337.0" in the NCAR archives, which is available at http://rda.ucar.edu/datasets/ds337.0 (accessed on 26-28 May 2021). The spatial distribution of all assimilated observations at 00 UTC 25 July 2017 is shown in Figure 3. The observation datasets used in the assimilation experiments mainly comprise land surface observations (SYNOP) and upper-air observations (SOUND), which contain measurements of pressure, geopotential height, temperature, dewpoint temperature, wind speed, and wind direction. Other information provided by NCEP PREPBUFR and presented in Figure 3 includes METAR (dew point temperature reported by aircraft), AIREP (other conventional aircraft reports), QSCAT (wind speed scatterometer data), and SATOB (satellite moisture bogus reports). These data are downloaded in PREPBUFR format and do not include the AWS data; they have to be converted into LITTLE_R format before they are used in the assimilation experiments. The AWS data include parameters such as sea level pressure, temperature, dewpoint temperature, wind speed, and wind direction, which are retrieved from more than 500 stations spread over Thailand. The AWS data network in Thailand has been used mainly for monitoring weather conditions and has not yet been applied to the NWP model. The AWS observations were passed through a quality control check performed by the OBSPROC package available in WRF-3DVAR. The locations of the AWS used in the assimilation are shown in Figure 3 (blue dots).
The evaluations discussed in this study are based on AWS observations of rainfall and temperature in the KCB, with rainfall accumulated every three hours over the 48 h and 72 h forecast periods. In addition, data from 35 stations on terrain height, as shown in Figure 2b

Evaluation of Rainfall Forecast
To illustrate the extent of the impact of multi-platform data, the spatial distribution of the differences in accumulated rainfall (mm) between the assimilation experiments and the observation data (forecast minus observations) from the Global Data Assimilation System was analyzed in four experiments (namely, DAALL-72h, DAAWS-72h, DAALL-48h, and DAAWS-48h) for the two rainfall events. The results are shown in Figures 4 and 5. For short-range forecasting, differences in the initial conditions are expected to lead to differences in the forecasts. Figure 4 presents the differences between the assimilation forecasts and the observations in accumulated rainfall for the SONCA event from the four experiments. DAALL-72h (Figure 4a) underestimates rainfall in the south of the KCB and is closer to the observations in the west. In DAAWS-72h (Figure 4b), by comparison, overforecasts occur in the north of the KCB, where the maximum difference is 112 mm, and the rainfall difference becomes much smaller in the west, where AWS observations are poor. At the 48 h lead time (Figure 4c,d), the accumulated rainfall in the KCB is more accurate than at the 72 h lead time. The spatial distribution of the differences for the PODUL event is shown in Figure 5. On the whole, forecasts with the AWS data assimilation successfully captured the rainfall region in the north of the KCB; they respond reliably to the observations but do not reproduce the intensity of the rainfall. The additional assimilation of multi-platform observation data helps to capture the accumulated rainfall pattern around the rainfall maximum in the west. Taken together, the assimilation experiments produce a more detailed spatial distribution of rainfall. The rainfall forecast was further evaluated in terms of multiple category thresholds of 3 h accumulated rainfall.
To evaluate the skill of the rainfall forecasts from the above data assimilation experiments, categorical statistics (usually called skill scores) are defined and summarized in Table 3. The precipitation accuracy (ACC), the bias score (BIAS), and the threat score (TS) for 3 h accumulated rainfall, with thresholds of 5, 10, 15, and 20 mm over the KCB, were calculated according to the 3 h intensity of each dataset, as shown in Figure 6. If both the forecast and the observed accumulated rainfall were below 1 mm/3 h, they were not analyzed [38,39]; when forecasts were above 5 mm/3 h, they were classified as hits, misses, or false alarms. A hit (h) is a forecast that is in the same threshold category as the corresponding observation. A miss (m) is an observation that is in the range of the rainfall threshold while the corresponding forecast is not. A false alarm (fa) is a forecast that is in the range of the rainfall threshold while the corresponding observation is not. A correct negative (cn) means that the threshold is forecast not to occur and does not occur.
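The classification into hits, misses, false alarms, and correct negatives can be sketched as follows for a single rainfall threshold. The sketch assumes the standard contingency-table convention, in which an "event" is a 3 h accumulation at or above the threshold, a miss is an observed event that was not forecast, and a false alarm is a forecast event that was not observed. The function name, the optional filter on pairs below 1 mm/3 h, and the sample values are hypothetical.

```python
import numpy as np


def contingency_counts(forecast_mm, observed_mm, threshold_mm, min_rain_mm=None):
    """Count hits, misses, false alarms, and correct negatives for one threshold.

    If min_rain_mm is given, pairs in which both the forecast and the
    observation are below it are dropped before counting.
    """
    f = np.asarray(forecast_mm, dtype=float)
    o = np.asarray(observed_mm, dtype=float)
    if f.shape != o.shape:
        raise ValueError("forecast and observation arrays must have the same shape")
    if min_rain_mm is not None:
        keep = ~((f < min_rain_mm) & (o < min_rain_mm))
        f, o = f[keep], o[keep]
    f_event = f >= threshold_mm  # assumed convention: at or above the threshold
    o_event = o >= threshold_mm
    return {
        "h": int(np.sum(f_event & o_event)),
        "m": int(np.sum(~f_event & o_event)),
        "fa": int(np.sum(f_event & ~o_event)),
        "cn": int(np.sum(~f_event & ~o_event)),
    }


# Hypothetical 3 h accumulations (mm) at six station/time pairs.
forecast = [6.0, 0.0, 12.0, 3.0, 7.5, 0.5]
observed = [8.0, 6.0, 2.0, 1.0, 5.0, 0.0]
print(contingency_counts(forecast, observed, threshold_mm=5.0))
print(contingency_counts(forecast, observed, threshold_mm=5.0, min_rain_mm=1.0))
```

Counting is done per threshold, so the same forecast and observation pairs are reclassified for each of the 5, 10, 15, and 20 mm categories.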
$$\mathrm{ACC} = \frac{h + cn}{h + m + fa + cn}, \qquad \mathrm{BIAS} = \frac{h + fa}{h + m}, \qquad \mathrm{TS} = \frac{h}{h + m + fa}$$
where ACC is the precipitation accuracy, BIAS is the bias score, and TS is the threat score.
Table 3. Skill scores used to evaluate the 3 h accumulated rainfall.

Statistics | Definition | Range
Precipitation accuracy (ACC) | The fraction of correct forecasts | 0-100%, where 100% is a perfect score
Bias score (BIAS) | The ratio of the number of forecast events to the number of observed events at each rainfall threshold | 0 to ∞, where 1 is a perfect score; BIAS < 1 underforecast, BIAS > 1 overforecast
Threat score (TS) | The fraction of observed and/or forecast events that were correctly forecast | 0-1, where 1 is a perfect score

The accuracy measures the forecast's skill in identifying 3 h accumulated rainfall at each threshold. The bias score is the ratio between the number of forecast events and the number of observed events at each rainfall threshold. A bias score of less than 1 indicates that the forecast system underforecasts the events, and a score larger than 1 indicates that it overforecasts them. The threat score measures the fraction of observed and/or forecast events in each category threshold that were correctly predicted. The closer the TS is to 0, the poorer the forecast in that category threshold. These scores are convenient for weather station data because of their ease of interpretation [40]. Note that when a forecast storm is shifted in space or time with respect to the observed event, the point-to-point evaluation may create both misses and false alarms, which leads to a double penalty [41,42]. A contingency table for 3 h accumulated rainfall of both events (not shown) was computed for all stations in the KCB. The skill scores were computed for each station and each data assimilation experiment. The rainfall threshold used to define "rain" was set to above 1 mm per 3 h of accumulated rainfall to agree with the minimum measurable rainfall of the tipping bucket rain gauge. The results for all validation stations over the KCB are summarized in graphical form using maps and skill score graphics.
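The three skill scores follow directly from the contingency counts. A minimal sketch is given below, assuming the standard definitions ACC = (h + cn) / (h + m + fa + cn), BIAS = (h + fa) / (h + m), and TS = h / (h + m + fa); the function name and the example counts are hypothetical and are not results from this study.

```python
def skill_scores(h, m, fa, cn):
    """Return ACC, BIAS, and TS from hits, misses, false alarms, correct negatives.

    Scores whose denominator is zero are returned as NaN.
    """
    nan = float("nan")
    total = h + m + fa + cn
    observed_events = h + m
    forecast_or_observed = h + m + fa
    return {
        "ACC": (h + cn) / total if total else nan,
        "BIAS": (h + fa) / observed_events if observed_events else nan,
        "TS": h / forecast_or_observed if forecast_or_observed else nan,
    }


# Hypothetical counts for one station and one threshold.
scores = skill_scores(h=30, m=10, fa=20, cn=40)
print(scores)  # ACC 0.7, BIAS 1.25 (overforecast), TS 0.5
```

Note that correct negatives enter only ACC, which is why ACC can remain high at large thresholds, where events are rare, even when TS is low.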
Figures 7 and 8 show the accuracy, bias score, and TS for 3 h accumulated rainfall with thresholds of 5, 10, 15, and 20 mm from all experiments for the SONCA and PODUL events. For the verification of the assimilation experiments with 72 h and 48 h lead times for both events, BIAS and accuracy values equal to unity represent a good forecast, whereas a score higher than unity, as occurs at high rainfall thresholds with a short lead time, or lower than unity indicates that the assimilated rainfall forecast for that category is overestimated or underestimated, respectively.
A comparison of the bias score and TS for 3 h accumulated rainfall at different thresholds, together with the gain in accuracy, from both assimilation experiments of the SONCA event with lead times of 72 h and 48 h is shown in Figure 7. The bias values of DAALL and DAAWS with a 72 h lead time (Figure 7a) are very similar to those with a 48 h lead time (Figure 7b). The bias score of DAALL ranges from 0.9 to 1.4 with a 72 h lead time, an indication that the addition of AWS assimilation tends to overforecast (BIAS > 1) at the 15 mm threshold for this event. The bias values of DAALL and DAAWS with a 48 h lead time (Figure 7b) are very similar to each other and show a lower error at all thresholds, as do the accuracy and the gain. In contrast, the TS of DAAWS with a 72 h lead time (Figure 7c, blue bar) is too small, well below the perfect value of 1. The further addition of PREPBUFR datasets to the AWS data in the DAALL experiment with a 48 h lead time (Figure 7d) gives good results in terms of TS, which is above 0.5 for the thresholds of 5, 10, and 25 mm; this should be considered for spatial forecasts. At the 5 mm/3 h threshold, the DAALL experiment with a 48 h lead time performs better than the DAAWS experiment for the SONCA event.
For the PODUL event, as shown in Figure 8, the DAALL and DAAWS bias values with 72 h and 48 h lead times (see Figure 8a,b) are close to 1 at all rainfall thresholds, which indicates that the assimilation experiments reduce overforecasts and achieve high performance for this event. However, Figure 8c,d show that the TS values of both experiments are lower than 0.4 at all rainfall thresholds, which indicates lower forecast performance. In general, ACC and TS tend to decrease as the forecast time and the rainfall threshold increase, as expected.
The difference between DAALL and DAAWS at the lower rainfall thresholds (5 mm/3 h and 10 mm/3 h) is due to the inefficiency of the multi-platform observations used in this study. To make better use of the multiple platforms, radiosondes from multi-level pressure surfaces should be assimilated together at an appropriate horizontal resolution. Adding more observation data may help to capture the lower rainfall thresholds in short-term forecasting.
The results presented so far are for all validation stations. We selected the assimilation experiments with a 48 h lead time and the 5 mm/3 h accumulated rainfall threshold to validate the skill scores over the KCB; the results for both events are shown in Figures 9 and 10. Figure 9 shows the averaged skill scores of DAALL and DAAWS with a 48 h lead time for the SONCA event. The accuracy of DAALL (Figure 9a)
The results for the PODUL event are shown in Figure 10. The accuracy values of DAALL (Figure 10a) and DAAWS (Figure 10b) are similar, ranging from 0.75 to 0.85, but the accuracy in the southwest of the KCB using DAAWS is higher than that of DAALL, whose values are in the range of 0.65 to 0.75. At the same time, the bias scores of DAALL and DAAWS (Figure 10c,d) are approximately 1.2 in the south of the KCB. One issue that contributes to these large values is that the model tends to overforecast accumulated rain, as shown in

Evaluation of Temperature Forecast
The assimilation forecast of the mean temperature was evaluated for both lead times by comparing the observation at each station with the model's nearest grid point. The 3 h mean temperature was evaluated using scatterplots and two score measures: the correlation coefficient r and the mean absolute error (MAE) ∆T:
$$\Delta T = \frac{1}{N} \sum_{i=1}^{N} \left| T^{f}_{i} - T^{O}_{i} \right|$$
where $N$ is the total number of 3 h periods and $T^{f}_{i}$ and $T^{O}_{i}$ are the assimilation forecast and observed temperatures, respectively, for the $i$-th 3 h period. A perfect assimilation forecast would have a mean absolute error of zero; in real cases, the larger the value, the less accurate the forecast. The scatterplots for the available stations in the KCB for the two assimilation experiments with 72 h and 48 h lead times for both events are presented in Figures 11 and 12.
For the SONCA event, as shown in Figure 11, the mean temperature has correlation coefficients of 0.93 (48 h) and 0.91 (72 h) in the DAALL experiment and of 0.91 (48 h) and 0.87 (72 h) in the DAAWS experiment. The slopes m of the regression lines of both assimilation experiments for the mean temperature with 48 h and 72 h lead times are notably close to 1 (0.97 and 0.92, respectively); with the longer lead time, these values decline.
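The correlation coefficient r, the MAE ∆T, and the regression slope m used in this evaluation can be computed as in the sketch below. It assumes that forecast and observed 3 h mean temperatures are already paired by station and time, that pairs containing missing values are discarded, and that the regression is of forecast on observation (the text does not state which variable is on which axis); the function name and sample values are hypothetical.

```python
import numpy as np


def temperature_scores(forecast_c, observed_c):
    """Return the Pearson r, the MAE, and the regression slope of forecast on observation."""
    f = np.asarray(forecast_c, dtype=float)
    o = np.asarray(observed_c, dtype=float)
    if f.shape != o.shape:
        raise ValueError("forecast and observation arrays must have the same shape")
    valid = ~(np.isnan(f) | np.isnan(o))  # drop pairs with missing data
    f, o = f[valid], o[valid]
    mae = float(np.mean(np.abs(f - o)))
    r = float(np.corrcoef(f, o)[0, 1])
    slope, _intercept = np.polyfit(o, f, 1)  # assumed: forecast regressed on observation
    return {"r": r, "MAE": mae, "slope": float(slope)}


# Hypothetical 3 h mean temperatures (degrees C): forecast is uniformly 1 degree too warm.
forecast = [26.0, 28.0, 31.0, 29.0, np.nan]
observed = [25.0, 27.0, 30.0, 28.0, 27.5]
print(temperature_scores(forecast, observed))
```

In this hypothetical case the correlation and slope are both 1 while the MAE is 1 °C, which illustrates why a systematic temperature bias is visible in the MAE but not in r.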
The scatterplots for the PODUL event are shown in Figure 12. Both experiments of mean temperature with a 72 h lead time show slightly lower correlations, yet always exceed 0.86. The slopes m of the regression lines of mean temperature for both experiments vary between 0.99 and 1.2. In general, as the forecast lead time increases from 48 h to 72 h, the dispersion becomes larger, the correlation coefficient becomes smaller, and the slope m of the regression lines moves away from 1. Since the 3 h mean temperature of some stations is computed from the average of three values per three hours, some errors are canceled out, resulting in higher correlation values. Temperature errors are generally systematic for short-range forecasts, and bias correction or missing-data filling methods can be applied for more precise forecasts.
For the SONCA event, as shown in Figure 13, the patterns of the correlation maps of both assimilations are similar, with the highest values toward the east and the south (values close to 0.9) and with decreasing values toward the west of the KCB. The comparison between the mean absolute errors of DAALL and DAAWS reveals a more heterogeneous distribution. The 3 h mean temperature ∆T of the DAALL experiment varies through the domain, with an isolated area larger than 3 °C. The MAE of the 3 h mean temperature of the DAAWS experiment can reach values exceeding 3 °C.
For the PODUL event, as shown in Figure 14, the patterns of the correlation maps of DAALL and DAAWS are similar, with higher values toward the west of the KCB and decreasing values toward the east of the KCB (values below 0.75). The mean absolute errors of both experiments are similarly distributed, ranging from 1.5 °C to 2.5 °C throughout the area for this event.
High error values are found over mountainous regions, both near the Phetchabun range (located in the west of the KCB, with no sensor data available) and toward the southwest corner of the domain, likely related to the topographical features of the Thailand Highlands [43,44]. The spatial gradients suggest the superior performance of the assimilation forecast in that area, where the observations have a poor signal. The temperature forecast shows the largest correlations and smallest mean absolute errors. In the south of the KCB, temperatures in both assimilation experiments are nearly homogeneously distributed and correlate with coefficients greater than 0.85.

Conclusions
This study is part of a general effort to improve forecasting ability based on the WRFDA system to support weather forecasting in the KCB of Thailand. Its main focus is to evaluate the impact of 3DVAR data assimilation with 72 h and 48 h lead times when multi-platform observations from the National Centers for Environmental Prediction (NCEP) PREPBUFR are combined with Automatic Weather Station (AWS) data, compared with the use of AWS observations only (referred to as DAALL and DAAWS, respectively). The two assimilation experiments show that DAALL has an overall positive impact, improving the spatial accuracy of accumulated rainfall and reducing the degree of overforecasts in the KCB. DAAWS is able to resolve underforecasts in terms of the intensity and pattern of accumulated rainfall during periods of heavy rainfall in the KCB. (This refers to the spatial distribution of differences in the KCB with a horizontal spacing of 3 km.)
The results follow the conventional understanding that updating the initial conditions with observations at a lead time of 48 h provides significant improvements. For 3 h accumulated rainfall, there are significant improvements in the ACC and bias of both assimilation experiments; in terms of TS, however, DAALL has an obvious positive impact compared to DAAWS at the 5 mm threshold. The evaluation of the 3 h mean forecast temperatures showed that DAALL has a higher correlation than DAAWS. The best performance is for the 3 h mean temperature in the east of the KCB. In general, forecast temperatures tend to correspond better with observations in subtropical temperature regions such as the KCB. The highest temperature biases and lowest skill scores are found toward the west, which is characterized by a large diurnal amplitude. The scores suggest that the assimilation of multi-platform observations with a 48 h lead time, compared to the assimilation of AWS data only, has an impact on heavy rainfall prediction in terms of TS. The overall impact of additional surface and multi-platform observations is reflected in the reduction of rainfall overforecasts. The reason for this could be that the fewer observations of AWS data affected the WRFDA system.
It should be mentioned that other assimilation techniques, such as 4DVAR and Kalman filters, need further evaluation with these events. These techniques have great potential, although they currently suffer from high computational costs. Furthermore, assimilation using radiance observations could help to add detailed information to the initial fields but it is difficult to get a positive impact from incorporating radiance data at this time. Future studies are also required to develop the predictability of monsoon directions and their impact on rainfall events using the WRFDA system in this study region.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/atmos12111497/s1, Figure S1: The statistical skill scores of ACC, BIAS and TS from the CNTL (non-data assimilation) and DA experiments at different thresholds of rainfall (mm).
Acknowledgments: The first author would like to thank the Chinese Scholarship Council (CSC) and the Asia-Pacific Space Cooperation Organization (APSCO) for making this research a reality. We acknowledge NCEP and NCAR for access to the GFS and PREPBUFR datasets, respectively. All authors sincerely thank the anonymous reviewers for improving this paper through their constructive and insightful comments.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The