Assessing the Impact of Dropsonde Data on Rain Forecasts in Taiwan with Observing System Simulation Experiments

This paper presents an observing system simulation experiment (OSSE) study to examine the impact of dropsonde data assimilation (DA) on rainfall forecasts for a heavy rain event in Taiwan. The rain event was associated with strong southwesterly flows over the northern South China Sea (SCS) after a weakening tropical cyclone (TC) made landfall over southeastern China. With DA of synthetic dropsonde data over the northern SCS, the model reproduces more realistic initial fields and a better simulated TC track that can help in producing improved low-level southwesterly flows and rainfall forecasts in Taiwan. Dropsonde DA can also aid the model in reducing the ensemble spread, thereby producing more converged ensemble forecasts. The sensitivity studies suggest that dropsonde DA with a 12-h cycling interval is the best strategy for deriving skillful rainfall forecasts in Taiwan. Shortening the DA interval to 6 h is not beneficial. However, if the flight time is limited, a 24-h interval of DA cycling is acceptable, because rainfall forecasts in Taiwan appear to be satisfactory. It is also suggested that 12 dropsondes with a 225-km separation distance over the northern SCS set a minimum requirement for enhancing the model regarding rainfall forecasts. Although more dropsonde data can help the model to obtain better initial fields over the northern SCS, they do not provide more assistance to the forecasts of the TC track and rainfall in Taiwan. These findings can be applied to future field campaigns and model simulations in the nearby regions.


Introduction
Seasonal rainfall in Taiwan is strongly regulated by East Asian monsoons, including northeasterly in cold seasons and southwesterly in warm seasons [1][2][3]. From June to October, tropical cyclones (TC) join the southwesterly monsoon and play an important role in affecting water resources in Taiwan by bringing large rains to the area [4][5][6][7]. When a TC passes over northern Taiwan, its outer circulation can interact with the summer monsoonal flow, resulting in strong southwesterly flows over the northern South China Sea (SCS) and the ocean southwest of Taiwan [8][9][10]. Chien and Chiu [11] and Chien et al. [12] documented that these southwesterly flows are closely related to the rainfall in Taiwan, because they can transport moisture-laden air from upstream regions to the Taiwan area and establish a confluent environment that is favorable for continuous heavy rainfall [13,14]. A good representation of airflows in these upstream regions is therefore very important for numerical models to produce skillful rain forecasts in Taiwan.
Targeted observations have long been used to supplement the conventional observational network in order to reduce the forecast uncertainty that results from inadequacies in the initial conditions of a numerical model. In the early 1980s, the National Oceanic and Atmospheric Administration Hurricane Research Division started to collect profiles of wind, temperature, and humidity from dropsondes deployed by surveillance aircraft in an attempt to improve TC track forecasts in the Atlantic [15][16][17]. In the 2000s, a succeeding project named Dropwindsonde Observations for Typhoon Surveillance near the Taiwan Region (DOTSTAR) gathered dropsonde data for the same purpose, albeit over the western

• Can extra dropsonde observations over the northern SCS help the model in rain forecasts for Taiwan under a southwesterly flow environment?
• What weather elements changed by the dropsonde DA are important in determining the rainfall in Taiwan?
• What is the best strategy for dropsonde deployment in terms of the frequency and density?

Case Description and Model Design
Using methods developed by Chien and Chiu [11], a heavy rain event in Taiwan was found under a southwesterly flow environment that lasted more than 24 h in late July to early August 2017. The southwesterly flow occurred mainly west and southwest of Taiwan after Typhoon Haitang (2017) passed Taiwan and moved toward southeastern China. At 0000 UTC 31 July 2017, Haitang (2017) had just made landfall and weakened to a tropical storm with a center pressure of 995 hPa over southeastern China (Figure 1a). The northeast-southwest-oriented isobars to the west and southwest of Taiwan favored the formation of southwesterly flows that could potentially cause heavy rainfall in Taiwan. During the four 12-h periods from 0000 UTC 31 July to 0000 UTC 2 August, the rain maxima in southern Taiwan all exceeded 150 mm in 12 h (Figure 1b). Figure 1c shows the time evolution of the longitudinally averaged rain intensity from southern to northern Taiwan. To the south of 23.5°N, continuous rainfall occurred for about 48 h from 0000 UTC 31 July to 0000 UTC 2 August. Three episodes of stronger rainfall happened during this 2-day period. The first occurred from 0000 to 0600 UTC 31 July, and the second lasted longer, from 1800 UTC 31 July to 0900 UTC 1 August. After a short break of 3 to 4 h, another episode of heavy rain occurred from 1600 to 2300 UTC 1 August. The rainfall weakened after 0000 UTC 2 August. This case study therefore focused on the rainfall of the two days between 0000 UTC 31 July and 0000 UTC 2 August 2017.
To examine whether a better representation of southwesterly flows in the upstream can play a role in rain forecasts of Taiwan, an OSSE study was conducted for this heavy rain event, with the aim of determining the impact of dropsonde observations over the northern SCS. The first stage of the OSSE was to obtain a nature run (NR; see Figure 2a) that was as similar to the observations as possible. In order to produce candidates for selection, an 80-member ensemble was created at 1200 UTC 28 July based on the ensemble transform Kalman filter (ETKF) [47][48][49][50][51], and then, the Weather Research and Forecasting (WRF) model version 3.8.1 [52] was run for 5.5 days (cyan line in Figure 2b). The initial and boundary conditions were obtained from the ERA5 reanalysis data [53] of ECMWF (European Centre for Medium-Range Weather Forecasts) with a 0.25° × 0.25° resolution. The model included two nested domains with 15- and 3-km horizontal resolutions (green lines in Figure 3a) and 45 vertical levels. Domain 2 was started 2 days later than domain 1. The WRF single-moment 6-class microphysics scheme [54], Betts-Miller-Janjić cumulus parameterization scheme (CPS) [55], and Yonsei University planetary boundary layer scheme [56] were utilized during the simulations, except that the CPS was excluded in domain 2. The NR was chosen from the best member run among the 80-member simulations, as judged objectively by the ETS (equitable threat score) and bias [57] of rain simulations in Taiwan in domain 2 and, subjectively, by the synoptic weather pattern in domain 1 during the 3-day period from 0000 UTC 31 July to 0000 UTC 3 August 2017.
Synthetic synoptic-scale surface and sounding observations over land (red dots in Figure 3a) were then created from domain 1 of the NR. For simplicity, we assumed a uniform distribution of stations separated by 300 km in distance. In addition, synthetic observations from 12 dropsondes over the northern SCS (green crosses in Figure 3a), with a horizontal separation of 225 km, were generated by the same method. The sounding and dropsonde data included the geopotential height, temperature, dew point temperature, wind speed, and wind direction at 45 pressure levels. D12 was used to identify the experiments of this kind in terms of the dropsonde number. To examine the impact of different numbers of dropsondes on the model simulation, two other data densities were set up: 24 dropsondes (magenta crosses in Figure 3b) with a 135-km separation and 4 dropsondes (cyan crosses in Figure 3b) with an approximately 450-km separation. These experiments were identified as D24 and D4, respectively.
Seven experimental runs were carried out to study the impact of assimilating synthetic traditional observations over land and dropsonde observations over the northern SCS on rain forecasts in Taiwan (Table 1). WRF version 3.9.1.1 with 3 nested domains of 45-, 15-, and 3-km horizontal resolutions (black line in Figure 3a) and 45 eta levels in the vertical was used for the experimental runs. The initial and boundary conditions were obtained from the National Centers for Environmental Prediction Global Forecasting System (NCEP GFS) [58] with a horizontal resolution of 0.25° × 0.25°. Applying the method developed by Bishop et al. [48] and using the inflation factor strategy described by Wang et al. [59], the ETKF DA with a 32-member ensemble was performed 9 times (or 8 cycles) at a 6-h interval from 0000 UTC 29 July to 0000 UTC 31 July (see the green line in Figure 2b), during which only domain 1 was run.
All 3 domains were run in the 32-member ensemble forecasts during the 3-day forecasting period from 0000 UTC 31 July to 0000 UTC 3 August. The physics schemes were mostly the same as in NR, except for the Goddard microphysics scheme [60] and Tiedtke CPS [61]. The CPS was not applied in the innermost domain. During the ETKF DA process, no data were assimilated in NODA, while sounding and surface observation data were assimilated in CTRL (Table 1). Comparisons of these two experiments showed the impact of traditional observations. In T5D12, the dropsonde data over the aforementioned 12 locations, in addition to the sounding and surface observations, were assimilated 5 times at a 12-h interval. T9D12 and T3D12 were the same as T5D12, except that the dropsonde data were assimilated 9 and 3 times, respectively, at 6-h and 24-h intervals. Comparisons of these two experiments with T5D12 helped to understand the impact of dropsonde DA frequency. T5D24 and T5D4 were the same as T5D12, except that the dropsonde data over the aforementioned 24 and 4 locations, respectively, were assimilated.
Table 1. The design of the experimental runs, including their names and the data (sounding, surface, and dropsonde observations) that were assimilated during the ETKF DA process. T9, T5, and T3 represent 9, 5, and 3 DA times of synthetic dropsonde data, respectively, with 6-h, 12-h, and 24-h DA intervals. D12, D24, and D4 denote 12, 24, and 4 dropsondes over the northern SCS, respectively.
Figure 4a shows that simulated rain in NR was similar to the observation (Figure 1b) for the first three 12-h periods from 0000 UTC 31 July to 1200 UTC 1 August, during which rain mostly occurred in the plain and mountain areas of southern Taiwan. In the last 12-h period from 1200 UTC 1 August to 0000 UTC 2 August, however, rain was slightly under-forecasted in southern Taiwan.
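The DA schedule described above (ETKF analyses every 6 h over a 48-h window, with dropsondes entering 9, 5, or 3 times in the T9, T5, and T3 runs) can be checked with a few lines of Python. This is only an illustrative sketch: the function name `da_times` and the dictionary `DROPSONDE_INTERVAL_H` are hypothetical, and the assumption that every dropsonde schedule is aligned with the start and end of the DA window is inferred from the stated counts, not taken from the authors' scripts.

```python
from datetime import datetime, timedelta

def da_times(start, end, interval_h):
    """Return all analysis times from start to end (inclusive) at a fixed interval."""
    times = []
    t = start
    while t <= end:
        times.append(t)
        t += timedelta(hours=interval_h)
    return times

# DA window stated in the text: 0000 UTC 29 July to 0000 UTC 31 July 2017.
START = datetime(2017, 7, 29, 0)
END = datetime(2017, 7, 31, 0)

# ETKF analyses are performed every 6 h in all runs.
cycle_times = da_times(START, END, 6)

# Hypothetical mapping: experiment prefix -> dropsonde DA interval (hours).
# Assumption: dropsonde times coincide with the window start and end.
DROPSONDE_INTERVAL_H = {"T9": 6, "T5": 12, "T3": 24}
dropsonde_times = {
    name: da_times(START, END, hours)
    for name, hours in DROPSONDE_INTERVAL_H.items()
}

for name, times in dropsonde_times.items():
    print(name, len(times), [t.strftime("%d %b %H%M UTC") for t in times])
```

Under these assumptions the counts match the experiment names: 9 analysis times (8 cycles) in total, and 9, 5, and 3 dropsonde DA times for T9, T5, and T3.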
The time evolution of the box-averaged rain intensity over southwestern Taiwan in NR (Figure 5a) showed three periods of larger rainfall. The first occurred from 0000 to 0700 UTC 31 July, the second from 2000 UTC 31 July to 0900 UTC 1 August, and the last one from 1600 to 2300 UTC 1 August. In the second period, the maximum rainfall happened at approximately 0100-0400 UTC 1 August. The timing of these three periods agreed well with the aforementioned three rainfall episodes in the observation (see the discussion of Figure 1c). In summary, NR produced well-simulated rainfall in Taiwan that was comparable to the observations.
In NODA (Figure 4b), the mean rain of the 32 ensemble members was smaller than in NR (Figure 4a) for all four 12-h periods. This is also seen in Figure 5b, which shows smaller ensemble mean rain intensities than those of NR (Figure 5a) around 0000 UTC 1 August. With the traditional sounding and surface observation data assimilated, CTRL (Figure 4c) exhibited much better rain forecasts in the ensemble mean. However, rain was, in general, over-forecasted in the first and second 12-h periods and slightly under-forecasted in the third compared with those in NR. In the last 12-h period, large amounts of rain occurred over the mountain area in southern Taiwan. The box-averaged rainfall in CTRL (Figure 5c) showed that the ensemble mean rain intensity had a maximum at 2300 UTC 31 July, which was earlier than in NR (Figure 5a). In addition, the rain intensities of most members (see the shaded area and the dots) were larger than those of NR.
The ensemble mean rain (Figure 4d) from T5D12, in which dropsonde data over the SCS were assimilated in addition to the traditional observations, was in good agreement with that of NR. The box-averaged rainfall (Figure 5d) further showed that the maximum ensemble mean rain intensity occurred at 0100 UTC 1 August, which was close to that of NR (Figure 5a). Moreover, judging by the smaller spread, the members of T5D12 produced more converged rainfall forecasts than those of CTRL.

Rain Forecasts and the Low-Level Environment
In order to objectively evaluate the aforementioned rain forecasts, the ETS and bias were computed for each experimental run, assuming the rain in NR as the truth. The calculations evaluated the rain forecasts at all grid points over Taiwan using all 32 ensemble members of each experimental run. Nine normalized rain thresholds from 0.1 to 0.9, based on the rain maximum over Taiwan in NR during each 12-h period, were used for the rain verification. For example, if the rain maximum in Taiwan was 100 mm in a 12-h period, a normalized threshold of 0.5 would represent a rain threshold of 50 mm, but in another 12-h period, if the maximum was 200 mm, 0.5 would denote a rain threshold of 100 mm. This was better than using fixed thresholds for all the periods, because the verification could cover the whole range of rain amounts in each period. Figure 6 shows the ETS and bias of NODA, CTRL, and T5D12 verified over four 12-h periods from 0000 UTC 31 July to 0000 UTC 2 August 2017. Figure 6a-d shows that CTRL had higher ETSs than NODA in all four periods, except at small rain thresholds of the 0-12-h period. The bias scores (Figure 6e-h) showed that NODA tended to under-forecast rainfall, especially for small thresholds at 0-36 h. With the assimilation of synthetic traditional observations, CTRL exhibited some improvements, but it turned into an over-forecast problem at larger rainfall thresholds. With the synthetic dropsonde observation data assimilated, T5D12 obtained slightly higher ETSs and better bias scores than CTRL by fixing the over-forecast problem of CTRL.
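The verification with normalized thresholds can be sketched as follows. The sketch uses the standard contingency-table definitions of the equitable threat score and frequency bias; the function names and the sample rain values are hypothetical, and how the scores from the 32 members were combined is not specified in the text, so the example scores one forecast field against the NR "truth" only.

```python
def contingency(forecast, truth, threshold):
    """Count hits, misses, and false alarms for rain >= threshold."""
    hits = misses = false_alarms = 0
    for f, o in zip(forecast, truth):
        f_yes, o_yes = f >= threshold, o >= threshold
        if f_yes and o_yes:
            hits += 1
        elif o_yes:
            misses += 1
        elif f_yes:
            false_alarms += 1
    return hits, misses, false_alarms, len(truth)

def ets_and_bias(forecast, truth, normalized_threshold):
    """ETS and bias at a threshold given as a fraction of the truth maximum."""
    threshold = normalized_threshold * max(truth)  # e.g. 0.5 * 100 mm = 50 mm
    hits, misses, false_alarms, n = contingency(forecast, truth, threshold)
    # Hits expected from a random forecast with the same event frequencies.
    hits_random = (hits + misses) * (hits + false_alarms) / n
    denom = hits + misses + false_alarms - hits_random
    ets = (hits - hits_random) / denom if denom else float("nan")
    observed = hits + misses
    bias = (hits + false_alarms) / observed if observed else float("nan")
    return ets, bias

# Hypothetical 12-h rain totals (mm) at six grid points.
nr_rain = [100.0, 60.0, 40.0, 10.0, 0.0, 0.0]   # nature run (truth)
fc_rain = [90.0, 30.0, 55.0, 20.0, 0.0, 5.0]    # one ensemble member

for nt in (0.1, 0.5, 0.9):
    ets, bias = ets_and_bias(fc_rain, nr_rain, nt)
    print(f"normalized threshold {nt}: ETS={ets:.3f}, bias={bias:.2f}")
```

A bias above 1 indicates over-forecasting of the area exceeding the threshold and a bias below 1 indicates under-forecasting, which is how the NODA and CTRL results are read in Figure 6e-h.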
In order to examine the causes of rain forecasts among different experimental runs, Figure 7 presents low-level averaged fields from the ensemble mean of NODA, CTRL, and T5D12 to compare with those of NR. At 0000 UTC 31 July, Typhoon Haitang in NR had moved to the northwest of Taiwan (Figure 7a), with its center making landfall over southeastern China. The pressure gradient was large over the northern SCS, resulting in the development of strong southwesterly flows. The flows had a maximum wind speed exceeding 18 m s⁻¹ and a west-southwesterly wind direction to the southwest of Taiwan. A large mixing ratio was found over the eastern portion of Haitang and the southwest of Taiwan. The moisture was transported overland by the southwesterly flow at a later time, resulting in heavy rainfall in southern Taiwan. In NODA (Figure 7b), the TC center of the ensemble mean was located about 200 km north of that of NR at the model's initial time. Both the pressure gradient and the southwesterly flow over the northern SCS were thus weaker than those of NR. The moisture was also not well-reproduced over southeastern China and the northern SCS.
With the traditional observation data assimilated, CTRL (Figure 7c) reproduced a better TC than NODA (Figure 7b), including its location and circulation pattern. The pressure gradient and the southwesterly flow over the northern SCS were more consistently represented. The DA also helped CTRL to obtain a better moisture pattern over southeastern China. However, there appeared to be too much moisture over the northern SCS in CTRL, similar to that in NODA. With extra dropsonde data over the northern SCS assimilated, T5D12 (Figure 7d) reproduced a better moisture pattern over the northern SCS than CTRL. It is thus concluded that, when the data of 12 additional dropsondes over the northern SCS were assimilated five times at a 12-h interval, the ETKF DA process could help to reproduce a better low-level environment at the model's initial time.
Eighteen hours later at 1800 UTC 31 July, Haitang had made landfall and moved northward in NR (Figure 7e). The southwesterly flow over the northern SCS weakened slightly, with a low-level averaged wind speed of about 12 m s⁻¹. The airflow kept transporting moisture overland, causing continuous rainfall in southwestern Taiwan. At this time, however, Haitang in NODA had moved rapidly northward after 18 h into the simulation (Figure 7f). The southwesterly flow became very weak, and the mixing ratio was very small over the northern SCS, resulting in insufficient rainfall over southern Taiwan. The simulated Haitang in CTRL was moving slower than in NODA. Its center location was still about 100 km south of that of NR after 18 h into the simulation (Figure 7g). As a result, the pressure gradient was larger, and the southwesterly flow over the northern SCS was stronger than those of NR. This explained why the rainfall was over-forecasted in CTRL. With the dropsonde data assimilated, T5D12 had a better forecast of the TC location at 1800 UTC 31 July (Figure 7h). The southwesterly flow and moisture were also reproduced better over the northern SCS compared with those of CTRL. Although the mixing ratio to the immediate south of the TC was under-forecasted in T5D12, the moisture over the northern SCS was well-reproduced. In general, T5D12 had the best forecasts of the TC location and the low-level environment among the three experimental runs, which explained why it produced the best rain forecasts in Taiwan.

Interpretation of the Forecast Results
As aforementioned, low-level southwesterly flows and moisture were the two important factors that influenced rainfall in Taiwan. In order to examine their time evolutions, the box-averaged low-level wind speed, wind direction, and mixing ratio were calculated in a 2° × 2° box over the ocean southwest of Taiwan (see the red box in Figure 3b). These variables were also vertically averaged from the surface to 700 hPa. Figure 8a,e shows that the low-level winds in NR weakened over time from 0000 to 1500 UTC 31 July, with a steadily west-southwesterly wind direction (approximately 250 degrees). The mixing ratios also decreased during this time period (Figure 8i). As a result of the lessening moisture transport in the upstream area, the rain intensities over southwestern Taiwan were weakening during the same time period (Figure 5a). After 1500 UTC 31 July, the wind speeds started to increase until 0000 UTC 1 August. The wind directions switched to a more westerly direction (approximately 260 degrees), and the mixing ratios increased sharply during the same period. These changes created favorable conditions for moisture transport in the upstream region, such that the rain intensity in southwestern Taiwan increased and peaked at about 0300 UTC 1 August (Figure 5a). After that, both the wind speeds and mixing ratios decreased for a short period and started to increase again from 1200 UTC 1 August to 0000 UTC 2 August. These variations were all closely related to the fluctuations of the rain intensity in Figure 5a.
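For readers less familiar with the meteorological wind-direction convention used here (degrees clockwise from north, naming the direction the wind blows from, so that 250 degrees is west-southwesterly), a minimal sketch of the conversion from wind components is given below. The function names are hypothetical, and averaging the u and v components over the box before converting is an assumption; the text does not state whether speed and direction were averaged directly.

```python
import math

def wind_speed_dir(u, v):
    """Convert eastward (u) and northward (v) wind components in m/s to
    speed and meteorological direction (degrees the wind blows FROM)."""
    speed = math.hypot(u, v)
    direction = math.degrees(math.atan2(-u, -v)) % 360.0
    return speed, direction

def wind_components(speed, direction):
    """Inverse conversion: speed and meteorological direction to (u, v)."""
    rad = math.radians(direction)
    return -speed * math.sin(rad), -speed * math.cos(rad)

def box_mean_wind(us, vs):
    """Assumed procedure: average the components over all points in the
    box, then convert the mean vector to speed and direction."""
    mean_u = sum(us) / len(us)
    mean_v = sum(vs) / len(vs)
    return wind_speed_dir(mean_u, mean_v)

# Hypothetical example: a 12 m/s wind from 250 degrees (west-southwesterly)
# has positive u (toward the east) and positive v (toward the north).
u, v = wind_components(12.0, 250.0)
print(f"u={u:.2f} m/s, v={v:.2f} m/s")
print(wind_speed_dir(u, v))
```

A shift from about 250 to about 260 degrees, as described for NR after 1500 UTC 31 July, therefore means the northward component shrinks relative to the eastward one, i.e. the flow becomes more westerly.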
In NODA, the low-level wind speed of the ensemble mean (Figure 8b) was already weaker than that of NR (Figure 8a) at the initial time and was further decreasing over time. The ensemble mean wind directions (Figure 8f) had a more southerly direction (around 240 degrees) than those of NR (Figure 8e) throughout the 2-day period. In addition, the mixing ratios of the ensemble mean (Figure 8j), except at the early times, were all smaller (around 15 g kg⁻¹) than those of NR (Figure 8i). These factors explained why the rain intensity was weak in NODA (Figure 5b).
In terms of the ensemble mean, CTRL produced nearly the same wind speed (Figure 8c) as that of NR (Figure 8a) at the model's initial time. However, this wind speed remained almost unchanged (i.e., too strong) until 24 h later at 0000 UTC 1 August. The wind directions of the ensemble mean changed from approximately 260 to 247 degrees during this 24-h period (Figure 8g), which was quite different from the evolution in NR (Figure 8e). The mixing ratios of the ensemble mean were high at the early times (Figure 8k). They decreased over time until 0800 UTC 31 July and then started to increase afterwards. This increase started about 6 h earlier than that of NR (Figure 8i). Owing to this effect and the aforementioned stronger winds, CTRL produced rainfall that started earlier and lasted for a longer time (Figure 5c).
The ensemble mean wind speeds of T5D12 (Figure 8d) agreed very well with those of NR in terms of the magnitude and tendency. T5D12 also produced better ensemble mean wind directions (Figure 8h) than both CTRL and NODA. Although the ensemble mean mixing ratios of T5D12 (Figure 8l) were too small between 0600 and 1500 UTC 31 July, they were better-simulated from 1500 UTC 31 July to 0600 UTC 1 August. During this time period, the moisture evolution well resembled that of NR (Figure 8i), such that the rain of T5D12 occurred at times mostly close to that of NR (see Figure 5a,d).
From the above discussions, it is concluded that assimilating traditional observation and dropsonde data over the northern SCS through ETKF DA can help to produce better low-level winds over the ocean southwest of Taiwan. The forecasts of the moisture can also be improved but not as significantly as those of the wind. Owing to these improvements, the model produced better rainfall forecasts in Taiwan. Figure 8 also shows that both the range of one standard deviation and the range between the maxima and minima among the ensemble members became smaller as more observational data were assimilated. This result indicates that adding dropsonde data into the DA process helped the model to reduce the ensemble spread and produced more converged ensemble forecasts.
The southwesterly flow of this case was associated with a TC in the north, as aforementioned. The simulated locations of this TC can greatly influence the southwesterly flow. It is therefore important to compare the locations of the TC center among different experimental runs. At the model initial time (0000 UTC 31 July), the TC of NR was located at the coastline of southeastern China to the west of northern Taiwan (Figure 9a). It moved north-northwestward afterwards until 0000 UTC 1 August and then curved southwestward before dissipating. In NODA, the ensemble mean location of the TC (gray dot) was wrongly placed about 150 km to the north at the initial time (also see Figure 9g), with the TC locations of ensemble members widely spread. As the model started, the TCs of the ensemble members moved rapidly northward. Consequently, the ensemble mean TC was located about 350 km north of that of NR at 0000 UTC 1 August (Figure 9b,h), and about 700 km north-northeast of that of NR at 0000 UTC 2 August (Figure 9c,i). CTRL had a very good ensemble mean TC location (red dot) at 0000 UTC 31 July (Figure 9a), with a smaller spread among its members' TC locations. However, the TCs did not move much in the first 24 h of simulation, resulting in a large track error (180 km) in the ensemble mean at 0000 UTC 1 August (Figure 9b,h). After that, the TCs started to move faster, with a larger spread. The ensemble mean TC location was about 300 km northeast of that of NR at 0000 UTC 2 August (Figure 9c,i). Overall, although the TC locations were not very close to those of NR, CTRL still produced much better TC tracks than NODA. As for T5D12, the initial TC locations were slightly better than those of CTRL (Figure 9a,g). By 0000 UTC 1 August, the TCs had moved northward, with an ensemble mean TC location about 150 km northeast of that of NR (Figure 9b,h). Although they had a larger spread, the mean track errors were still smaller than those of CTRL. However, the TCs kept moving northward after 0000 UTC 1 August, resulting in larger track errors than those of CTRL at 0000 UTC 2 August (Figure 9c,i). These findings suggest that assimilating additional dropsonde data helped the model in forecasting a better TC track at 0-24 h but not at 24-48 h, when the TCs were moving far away from the dropsonde locations.
Figure 9. TC locations (crosses) of all 32 ensemble members in NODA (gray), CTRL (red), and T5D12 (green) at (a) 0000 UTC 31 July, (b) 0000 UTC 1 August, and (c) 0000 UTC 2 August. Dots are the ensemble mean locations. The TC track of NR from 0000 UTC 31 July to 0000 UTC 2 August is shown in black curves, with the dates denoted by numbers. (d-f) Same as (a-c) but for T9D12 (blue), T3D12 (yellow), T5D24 (magenta), and T5D4 (cyan). Box plots of the track errors at (g) 0000 UTC 31 July, (h) 0000 UTC 1 August, and (i) 0000 UTC 2 August were computed using the TC locations of NR as the truth. The box extends from the first quartile to the third quartile (interquartile range), with a line denoting the median value. The upper/lower error bars show the data value that is 1.5× the interquartile range above/below the third/first quartile. The dots are outliers.
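The track errors quoted above are distances between a simulated TC center and the NR center at the same time. The text does not say how the distance was computed; the sketch below assumes a great-circle (haversine) distance on a spherical Earth, and all coordinates are hypothetical values chosen only to illustrate the calculation.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)

def track_error_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def mean_location(centers):
    """Arithmetic mean of (lat, lon) pairs; adequate for a compact cluster."""
    lats = [lat for lat, _ in centers]
    lons = [lon for _, lon in centers]
    return sum(lats) / len(lats), sum(lons) / len(lons)

# Hypothetical NR center and three hypothetical member centers (deg N, deg E).
nr_center = (25.5, 119.5)
member_centers = [(26.8, 119.9), (27.1, 120.3), (26.5, 119.6)]

member_errors = [track_error_km(*nr_center, *c) for c in member_centers]
mean_error = track_error_km(*nr_center, *mean_location(member_centers))
print([round(e) for e in member_errors], round(mean_error))
```

The per-member errors are the quantities that would feed box plots such as Figure 9g-i, while the error of the mean location corresponds to the ensemble mean distances quoted in the text.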

DA Impact and Sensitivity Runs
In order to examine the impact of ETKF DA on the initial conditions of the model, root mean square errors (RMSE) of the model variables were calculated at the initial time (0000 UTC 31 July) using NR as the truth. The errors were averaged using data at all the grid points in two different evaluation regions (land area of domain 1 and the northern SCS area shown as a yellow box in Figure 3b) for each of the 32 ensemble members of an experimental run. The results of the horizontal wind components and mixing ratio (u, v, and q) at 950, 850, 700, 500, and 200 hPa are presented in the box plots (Figure 10). To evaluate the DA impact of the traditional observations, Figure 10a-c shows RMSEs averaged over the land area of domain 1 for NODA and CTRL. It is clear that, with traditional observation data assimilated during the ETKF DA process, CTRL overall had smaller RMSEs than NODA for u, v, and q at all the pressure levels. It was also found that CTRL had a smaller ensemble spread than NODA. To study the DA impact of the dropsonde observations, Figure 10d-f shows RMSEs averaged over the northern SCS area for CTRL and T5D12. The results indicated that, with additional dropsonde data assimilated, T5D12 had smaller RMSEs than CTRL over the northern SCS at the model's initial time. Although the interquartile ranges overlapped for some variables/levels, the median value of T5D12 was still smaller than that of CTRL. The ensemble spread of T5D12 was also smaller than that of CTRL.
Figure 10. Box plots of the RMSEs of (a) u, (b) v, and (c) the mixing ratio (q, g kg⁻¹) averaged over the land area of domain 1 at 950, 850, 700, 500, and 200 hPa for the 32 ensemble members of NODA (gray) and CTRL (red) at 0000 UTC 31 July 2017 using NR as the truth. The box extends from the first quartile to the third quartile (interquartile range), with a line denoting the median value. The upper/lower error bars show the data value that is 1.5× the interquartile range above/below the third/first quartile. The dots are outliers. (d-f) Same as (a-c) but for CTRL (red) and T5D12 (green). The RMSEs were averaged over the northern SCS area shown as a yellow box in Figure 3b.
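The RMSE evaluation and the box-plot statistics described in the caption can be sketched as follows. The function names and the sample numbers are hypothetical; the quartile method (`inclusive` in Python's `statistics.quantiles`) is an assumption, since the plotting software used by the authors is not stated.

```python
import math
import statistics

def rmse(model, truth):
    """Root mean square error over all grid points of an evaluation region."""
    return math.sqrt(sum((m - t) ** 2 for m, t in zip(model, truth)) / len(truth))

def box_stats(values):
    """Quartiles, 1.5 x IQR fences, and outliers, as in a standard box plot."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence, upper_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [x for x in values if x < lower_fence or x > upper_fence]
    return {"q1": q1, "median": median, "q3": q3,
            "lower_fence": lower_fence, "upper_fence": upper_fence,
            "outliers": outliers}

# Hypothetical 850-hPa u wind (m/s) at four grid points: NR and three members.
nr_u = [10.0, 12.0, 14.0, 16.0]
members_u = [
    [11.0, 12.5, 13.0, 15.0],
    [9.0, 11.0, 15.5, 17.0],
    [10.5, 12.0, 14.5, 16.5],
]

# One RMSE per ensemble member; a real run would have 32 such values,
# and their distribution is what each box in Figure 10 summarizes.
member_rmse = [rmse(m, nr_u) for m in members_u]
print([round(r, 3) for r in member_rmse])
print(box_stats(member_rmse))
```

A smaller median indicates initial fields closer to NR, and a narrower box indicates a smaller ensemble spread, which are the two properties compared between CTRL and T5D12 in the text.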
In addition to the aforementioned experimental runs, four other runs were designed for the sensitivity studies. T9D12 and T3D12, when compared to T5D12, were meant to reveal the impact of dropsonde DA frequency, and T5D24 and T5D4 the impact of dropsonde density. Figure 11 presents the ETS and bias of T5D12 and these four sensitivity runs. The results (Figure 11a-d) show that T5D12 had generally higher ETSs than both T9D12 and T3D12. The ETSs of T3D12 were mostly slightly higher than those of T9D12 at the periods of 0-12 h and 24-36 h, and they were almost the same at 12-24 h and 36-48 h. T5D12 also had a better bias than T9D12 and T3D12 (Figure 11e-h). These findings suggest that the DA of dropsondes with a 12-h cycling interval can help the model provide the best rain forecasts in Taiwan. Shortening the interval to 6 h is not beneficial, and if the flight time is limited, a 24-h cycling interval is still acceptable. Figure 12 shows the RMSEs of u, v, and q averaged over the northern SCS area for the four sensitivity runs at the model's initial time. They can be compared with those of T5D12 (green) in Figure 10d-f. It was found that the RMSEs of T9D12 were generally smaller than those of T3D12 (Figure 12a-c) and were very close to those of T5D12 (Figure 10d-f). This result suggests that a 12-h interval is a basic requirement for the DA frequency of the dropsonde data. If the dropsonde data were assimilated at a 24-h interval, the winds and moisture might contain larger errors over the northern SCS at the model's initial time. Figure 9d-i shows that T3D12 had about the same TC track errors as T5D12, except at 0000 UTC 1 August when T3D12 simulated slightly better TC locations from its ensemble members. The TC centers of T9D12 were mostly placed north of those of T5D12, resulting in larger track errors with larger spreads than those of T5D12 at 0000 UTC on both 1 and 2 August.
To sum up, a combination of better initial fields over the northern SCS and a better TC track helped T5D12 to produce the best rain forecasts in Taiwan among the experimental runs. Although T3D12 had worse initial fields over the northern SCS than T9D12, during the forecast time, it produced better TC locations that helped to yield a better southwesterly flow environment in the upstream and better rain forecasts in Taiwan. Figure 9a-c already showed that the TCs of T5D12 were mostly placed north of those of CTRL, indicating that dropsonde DA tended to cause the model to place the TCs at a more northward location. Because dropsonde DA was performed only every 24 h in T3D12, this run was by design closer to CTRL than T9D12 was. The TCs in T3D12 were therefore expected to be more southward and, thus, better simulated than those of T9D12.
T5D24 showed lower ETSs than T5D12 at both 0-12 h and 24-36 h and almost the same ETSs as T5D12 at 36-48 h (Figure 11a,c,d). It had slightly higher ETSs than T5D12 at 12-24 h but only at small rain thresholds (Figure 11b). The ETSs of T5D4 were the lowest among these experimental runs and were close to those of CTRL (red curve in Figure 6a-d). These results suggest that 12 dropsondes with a 225-km separation distance over the northern SCS set a minimum requirement for helping the model in the rain forecasts of Taiwan, and a higher density of dropsondes did not provide more assistance to the model. Figure 12d-f shows that T5D24 generally had smaller RMSEs than T5D4, and it even exhibited slightly smaller RMSEs than T5D12 at the model's initial time. This result suggests that the DA of more dropsonde data can help the model to obtain better initial fields over the northern SCS. However, it did not guarantee a better TC track, because the TC track was primarily controlled by factors outside this region. The TC track errors of T5D24 were, in fact, larger than those of T5D12, and those of T5D4 were even larger, with larger spreads (Figure 9d-i). These results explained why T5D24 and T5D4 produced worse rain forecasts than T5D12.

Summary and Conclusions
This paper presented an observing system simulation experiment (OSSE) study to examine the impact of dropsonde data assimilation (DA) on the rain forecasts for a heavy rain event in Taiwan. The rain event was associated with strong southwesterly flows after a weakening TC passed over Taiwan and made landfall over southeastern China. Previous studies have indicated that the southwesterly flows can transport moisture-laden air from upstream regions, such as the northern SCS, to the Taiwan area and result in continuous heavy rainfall in Taiwan. It is thus reasonable to assume that a good representation of airflows in this upstream region may be critical for a numerical model to produce skillful rain forecasts in Taiwan. Owing to the lack of in situ measurements there, dropsondes that collect 3-D profiles of the winds, temperature, and moisture become potentially good candidates for filling the gap of the conventional observational network.
The OSSE study shows that when synthetic traditional sounding and surface observation data are assimilated in the ETKF DA process, the WRF model can reproduce better initial fields and a better track of the weakening TC over southeastern China. This is, of course, not surprising, but the study also suggests that assimilating additional dropsonde data over the northern SCS can help the model derive more realistic moisture and flow patterns over the northern SCS at the model's initial time. These benefits of DA are also clearly demonstrated by the smaller RMSEs over the northern SCS when dropsonde data are assimilated. During the forecast hours, in the experiment without dropsonde DA, the simulated pressure gradient was too large and the southwesterly flows were too strong over the northern SCS, resulting in over-forecasted rainfall in Taiwan. These problems were primarily caused by the simulated TC centers being placed too far south. In the experiment with dropsonde DA, the TC track was much improved at early forecast times, when the TCs were still relatively close to the dropsonde locations. The better TC locations helped the model produce better low-level southwesterly flows over the northern SCS and superior rain forecasts in Taiwan. The results also showed that, during the model simulation, low-level winds benefited more from dropsonde DA than moisture did, and that dropsonde DA can help the model reduce the ensemble spread and produce more converged ensemble forecasts.
The sensitivity studies suggested that dropsonde DA with a 12-h cycling interval helped the model produce the best rain forecasts in Taiwan. Shortening the DA interval to 6 h was not beneficial, because it did little to improve the model's initial fields or the TC track forecast, let alone the rain forecasts in Taiwan. When dropsondes were assimilated with a 24-h cycling interval, the model's initial fields over the upstream region contained slightly larger errors. However, the rain forecasts in Taiwan appeared to be satisfactory in terms of equitable threat scores, mainly because of the better-simulated TC track during the forecasting period. Therefore, if the flight time is limited, this lower DA frequency may still be acceptable in terms of providing adequate rain forecasts for Taiwan. The study also suggested that 12 dropsondes with a 225-km separation distance over the northern SCS constitute a minimum requirement for assisting the model in rain forecasts for Taiwan. Although more dropsonde data can help to obtain better initial fields over the northern SCS, they do not provide more assistance to the forecasts of the TC track and rainfall in Taiwan. A dropsonde spacing of 225 km may therefore be satisfactory if resources are limited.
In contrast to many previous studies in which dropsonde observations were mostly collected near the TC circulation [18,28,62,63], this paper dealt with a different situation in which the synthetic dropsondes were located relatively far away from the TC. Even so, dropsonde DA still had a positive impact on the simulated TC track. Although the ETKF technique was applied in this paper to account for uncertainty, broaden the range of possible outcomes, and provide probabilistic forecasts, the results presented here were still based on only one case. More case studies are needed in the future to obtain more robust conclusions.

Data Availability Statement:
The data used in this study are available on request from the corresponding author. Owing to the large volume of the data, they are not publicly available.