Performance of Forecasts of Hurricanes with and without Upper-Level Troughs over the Mid-Latitudes

We investigated the accuracy of operational medium-range ensemble forecasts for 29 Atlantic hurricanes between 2007 and 2019. Upper-level troughs with strong wind promoted northward movement of hurricanes over the mid-latitudes. For hurricanes with upper-level troughs, relatively large errors in the prediction of troughs result in large ensemble spreads, which lead to failures in hurricane track forecasts. In contrast, for hurricanes without upper-level troughs, mean central position errors are relatively small in all operational forecasts because of the absence of upper-level strong wind around troughs over the mid-latitudes. Hurricane Irma in September 2017 was accompanied by upper-level strong wind around a trough; errors and ensemble spreads for the predicted upper-level trough are small, contributing to smaller errors and ensemble spreads in the predicted tracks of Irma. Our observing system experiment reveals that the inclusion of additional Arctic radiosonde observation data obtained from the research vessel Mirai in 2017 reduces the error and ensemble spread of the upper-level trough with strong wind at the initial time of the forecast, increasing the accuracy of the track forecast for Irma.


Introduction
Accurate prediction of tropical cyclones (TCs) with heavy rain and strong winds is crucial to reducing human casualties and socioeconomic damage. In the early 2000s, various operational forecast centers extended the skillful forecast period of a TC track to 120 h [1,2] as a result of ongoing development of atmospheric numerical models and application of the latest models to TC forecasts [3–5]. Linear regression analysis shows that the central position error of 120-h forecasts decreased by 17.5 km yr⁻¹ between 1991 and 2013 [6]. Meanwhile, the position error of TCs in 120-h forecasts in recent years is equivalent to that of 72-h forecasts in the early 1990s [2,6]. However, in some cases, the prediction error of TC positions remains large, implying that improvements in other factors are necessary to further reduce errors in TC track forecasts [6].
Reducing uncertainty in the analysis data that are used as initial fields in operational weather forecasts is one of the effective ways of improving the accuracy of TC forecasts. This can be achieved by increasing the amount of observation data that is included in the analysis data, for example, through the incorporation of satellite observations with higher resolution and frequency [7] and dropsonde observations conducted around and near TCs over the Pacific (e.g., Dropwindsonde Observations for Typhoon Surveillance near Taiwan [3,8], the Observing System Research and Predictability Experiment-Pacific Asian Regional Campaign [9,10]) and Atlantic Oceans (e.g., Synoptic surveillance

Atlantic Hurricanes That Moved Northward and Approached the US between 2007 and 2019
In this study, we focused on 29 Atlantic hurricanes that moved northward over the North Atlantic Ocean and approached or made landfall over the US between 2007 and 2019 (Figure 1). Details of these hurricanes, including their duration and name, are shown in Table 1. Most of them formed near Africa and moved westward over the Atlantic Ocean. Some turned northward near the east coast of the US and reached the North Atlantic Ocean; others turned northward over the Gulf of Mexico and made landfall over the US. In contrast, several hurricanes were generated over the Caribbean Sea and made landfall over the US.
Between 2007 and 2019, the number of Atlantic hurricanes that moved northward and approached the eastern US was highest in 2017 (Table 1). To investigate the relationship between upper-level atmospheric circulation and Atlantic hurricane positions, we studied six Atlantic hurricanes (Gert, Harvey, Irma, Jose, Maria, and Nate) that moved northward over the North Atlantic Ocean and approached or made landfall over the US in 2017 (colored tracks in Figure 1). Hurricane Nate was generated over the Caribbean Sea and made landfall over the US. The other five hurricanes formed near Africa and moved westward over the Atlantic Ocean. Two of them (Irma and Harvey) turned northward over the Gulf of Mexico, making landfall over the US. The remaining three (Gert, Jose, and Maria) turned northward near the east coast of the US and traversed the North Atlantic Ocean without making landfall. In the cases of Gert, Irma, and Nate, upper-level troughs with strong winds exceeding 25 m s⁻¹ (averaged between the 500- and 300-hPa levels) extended above the western parts of the hurricanes over the mid-latitudes, influencing hurricane location and movement (Figure 2a,k and Figure S2k). In this study, Atlantic hurricanes that are influenced by upper-level strong winds are referred to as trough cases. Mid-latitude troughs over the western part of these hurricanes potentially favor extratropical transition [26]. Unlike Gert, Irma, and Nate, no upper-level troughs appeared in Harvey, Maria, and Jose over the mid-latitudes (Figure 2f and Figure S2a,f), suggesting that the movement and location of these hurricanes were unaffected by strong wind around troughs. Atlantic hurricanes without upper-level strong wind around troughs are referred to as no trough cases in this study.
Figure S1). Over the Chukchi, Beaufort, and Bering Seas, observations were recorded every 6 h (0000, 0600, 1200, and 1800 UTC), yielding a total of 119 radiosonde observations.
In addition, the 53rd Weather Reconnaissance Squadron of the U.S. Air Force Reserve Command and the NOAA Aircraft Operations Center conducted 721 dropsonde observations over the Atlantic Ocean. Data were transmitted via the Global Telecommunication System and can be used to reduce uncertainty in the initial fields of numerical weather predictions (analysis data), improving atmospheric circulation forecasts.
To investigate the impact of assimilated observation data on atmospheric circulation forecasts, observing system experiments (OSEs) were conducted using our data assimilation system. We used an ensemble data assimilation (DA) system called ALEDAS2 [27], which comprises the atmospheric general circulation model for the Earth Simulator (AFES; [28,29]) and the local ensemble transform Kalman filter (LETKF; [30,31]). This DA system generates the AFES-LETKF experimental ensemble reanalysis version 2 (ALERA2) dataset; ALERA2 comprises 63 ensemble members and has a horizontal resolution of T119 (triangular truncation at wave number 119, yielding a resolution of 1° × 1°) and L48 vertical levels (σ-level, up to ≈3 hPa). Like most reanalysis products, ALERA2 reproduces the structures of synoptic and large-scale circulations in the troposphere and lower stratosphere [15,17–19,32,33]. Assimilated observations were adapted from the PrepBUFR Global Observation datasets of the NCEP, which are archived by the University Corporation for Atmospheric Research (UCAR). The NOAA daily Optimal Interpolation Sea Surface Temperature (OISST) version 2 dataset was used for ocean and sea ice boundary conditions [34]. In this study, we constructed three 63-member ensemble reanalysis datasets for forecasting experiments.
The control dataset (CTL) is ALERA2 including the PrepBUFR global observation datasets; the OSEM dataset is the CTL dataset with the additional radiosonde observations from RV Mirai removed; the OSEA dataset is the CTL dataset with the dropsonde observations from aircraft removed. The DA cycles, the so-called DA stream, are composed of repeated DA forecast-analysis cycles with different observations every 6 h between August and September 2017. Each analysis dataset has its own DA stream. Differences in assimilated observations therefore accumulate within each stream, resulting in different analyzed fields in each DA stream. For the forecast experiments, AFES with horizontal resolution T239 (0.5° × 0.5°) and L48 vertical levels provides 63-member ensemble forecasts. Data from ALERA2 were regridded from T119 to T239 and used as the initial fields.
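The ensemble mean and ensemble spread that are compared between CTL and the data denial experiments can be illustrated with a minimal sketch. It assumes that spread is the sample standard deviation across members (the text does not state the exact definition) and that each field is stored as an array of shape (member, latitude, longitude); the function names are hypothetical and are not part of ALEDAS2.

```python
import numpy as np


def ensemble_mean_and_spread(members):
    """Return (mean, spread) over the leading member axis.

    members: array-like of shape (n_members, nlat, nlon), e.g. Z300 in metres.
    Spread is taken here as the sample standard deviation (an assumption).
    """
    members = np.asarray(members, dtype=float)
    mean = members.mean(axis=0)
    spread = members.std(axis=0, ddof=1)
    return mean, spread


def spread_difference(ctl_members, ose_members):
    """CTL minus OSE spread; negative values mean the OSE spread is larger."""
    _, ctl_spread = ensemble_mean_and_spread(ctl_members)
    _, ose_spread = ensemble_mean_and_spread(ose_members)
    return ctl_spread - ose_spread
```

With this sign convention, a negative difference at a grid point means that removing the observations increased the spread there.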
Atmosphere 2020, 11, 702
Figure 2 shows the predicted ensemble mean upper-level wind speed (averaged between the 300- and 500-hPa levels) and geopotential height at 300 hPa (Z300) over a period of 4.5 days, the tracks of Gert (a trough case), Harvey (a no trough case), and Irma (a trough case), and the Z300 ensemble spreads in all four operational medium-range ensemble forecasts (ECMWF, JMA, NCEP, and UKMO).

Influence of Upper-Level Trough Prediction on Hurricane Track Forecast
In the case of Gert (trough case), Z300 ensemble spreads are large in all models near the upper-level trough (orange lines in Figure 2b-e). In three models (ECMWF, JMA, and UKMO), and especially in JMA and UKMO, the predicted upper-level trough is located to the east of the center of the trough in ERA5 (black lines in Figure 2a

Atlantic Hurricane Track Forecasting Skill between 2007 and 2019
Analyses of hurricanes in 2017 in Section 3.1 show that errors in track forecasts are related to the existence of upper-level troughs. To compare the skill of the models from the four operational centers to forecast Atlantic hurricanes with and without troughs, we plotted the average central position errors of hurricanes between 2007 and 2019 in Figure 3. The difference between forecast and best track central position (central position error) increases with lead time in all operational models. However, the difference between error in trough cases and that in no trough cases is only obvious in the latter half of forecast periods. In the ECMWF and JMA models, this difference remains small (<100 km) between forecast days 0 and 3.5 (Figure 3a,b), increases rapidly after forecast day 4.0, and reaches about 300 km at forecast day 4.5. In contrast, in the NCEP and UKMO models, the difference between central position error in trough cases and that in no trough cases exceeds 100 km after forecast day 2.5 (Figure 3c,d), increases rapidly after forecast day 4.0, and exceeds 300 km at forecast day 4.5 in ECMWF and JMA. Central position errors in all cases are relatively small in all models up to forecast day 2, which indicates the limit for atmospheric stochastic variability (Figure 2). These results indicate that all the operational forecast models have relatively large error and ensemble spread in upper-level troughs, resulting in large error and ensemble spread of hurricane track forecasts after 4.0 forecast days.
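As an illustration of how a central position error and its case-averaged value per lead time could be computed, the following sketch uses the great-circle (haversine) distance on a spherical Earth of radius 6371 km. Both the distance formula and the function names are assumptions made for this example; the text does not specify how the distance was calculated.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption for this sketch


def central_position_error_km(lat_f, lon_f, lat_o, lon_o):
    """Great-circle distance (km) between forecast and best-track centers."""
    phi_f, phi_o = math.radians(lat_f), math.radians(lat_o)
    dphi = phi_o - phi_f
    dlam = math.radians(lon_o - lon_f)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi_f) * math.cos(phi_o) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def mean_error_by_lead(cases):
    """Average errors over hurricanes for each lead time.

    cases: list of dicts mapping forecast day -> position error (km),
    one dict per hurricane; lead times missing for a case are skipped.
    """
    sums, counts = {}, {}
    for errors in cases:
        for lead, err in errors.items():
            sums[lead] = sums.get(lead, 0.0) + err
            counts[lead] = counts.get(lead, 0) + 1
    return {lead: sums[lead] / counts[lead] for lead in sorted(sums)}
```

Applying `mean_error_by_lead` separately to the trough and no trough groups and subtracting the results at each lead time gives the kind of difference discussed above.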

Observing System Experiments Using Observation Data Collected over the Arctic and Atlantic Oceans
In the case of Irma in 2017 (trough case), models have relatively large ensemble spreads in Z300, and their predicted upper-level troughs extend to the east of the trough found in ERA5 (Figure 2k-o). Therefore, forecast track errors are larger in trough than in no trough cases. However, of the three trough cases in 2017, errors and ensemble spreads for the upper-level trough are the smallest in the case of Irma (Figure 2a-e,k-o and Figure S2k-o). Radiosonde observations conducted over the Chukchi and Beaufort Seas between August and September 2017 could be used to reduce the error and ensemble spread in the forecasts of atmospheric circulation over the Arctic Ocean between the end of August and end of September 2017. Following Sato et al. [18,19], large errors and ensemble spreads in the prediction of upper-level troughs over the mid-latitudes are transported from the Arctic Ocean because of jet stream meandering. To further investigate the influence of Arctic observations on forecasts of hurricane tracks, we conducted observing system (data denial) experiments with these Arctic observation data. In addition, we investigated the impact of the inclusion of dropsonde observation data near Hurricane Irma on the skill of operational models to forecast atmospheric circulations.

Impact of Inclusion of Additional Arctic Radiosonde Observation Data on Track Forecast of Hurricane Irma (A Trough Case)
To investigate the impact of the inclusion of Arctic radiosonde observation data collected by RV Mirai on hurricane track forecasts, we conducted AFES forecast experiments initialized with CTL and OSEM for Hurricane Irma (Figure 4a). The CTL captured the observed central position of Irma (orange line in Figure 4a) even though, compared with operational analyses, the horizontal resolution is lower and fewer observations were used in CTL (ALERA2). The predicted ensemble mean track for Hurricane Irma, mean upper-level wind speed (averaged between the 300- and 500-hPa levels), mean Z300, and Z300 ensemble spread for a 4.5-day forecast initialized using ensemble analyses for 1200 UTC 7 September are shown in Figure 4b,c. In the forecast using CTL (CTLf), most ensemble members move westward over northern Cuba and make landfall in Florida (Figure 4b). Similar to the observed track, the forecasted track moves northward from 10 September, but the locations of Irma forecasted in CTLf lie to the east of those from observations.
The central position error in CTLf increases with lead time, growing to about 450 km at forecast day 4.5 (Figure 5a,b). The northwestward movement of Irma is reduced because the predicted wind speed around the upper-level trough is lower than the observed wind speed in CTL (Figure 4a,b).
Forecast error of the track of Irma in CTLf is larger than that in operational numerical weather predictions because of differences in model performance (e.g., resolution and physical parameterizations) and assimilation methods (e.g., assimilation techniques and quantity of assimilated data) in CTLf (Figures 3k and 4a). The forecast using OSEM (OSEMf) predicts northward movement of Irma on 10 September (Figure 4c). However, all members of OSEMf move northeastward on 11 September, and move further eastward than those in CTLf. Between forecast days 0 and 3.5, there is no difference between the central position error in CTLf and that in OSEMf; after forecast day 4.0, the error and ensemble spread of the predicted central position in OSEMf are larger than those in CTLf (Figure 5a-c).
Large errors in predicted upper-level wind speed over the east coast of the US cause the predicted tracks of Irma to be displaced further eastward (Figure 4b,c). The difference between CTLf and OSEMf in upper-level wind speed is negative and that in Z300 is positive over North America and the east coast of the US (Figure 4e). The difference between CTLf and OSEMf is negative in Z300 ensemble spread (orange contour in Figure 4e) because ensemble spread of the predicted central position is larger in OSEMf than in CTLf.
As air moves from the Arctic Ocean to the mid-latitudes, large errors in upper tropospheric circulation predictions have been shown to influence surface circulation forecasts over the mid-latitudes [18,19]. The difference between CTL and OSEM analysis data is positive in Z300 in September 2017 over the Chukchi, Beaufort, and Bering Seas (Figure 6a), and corresponds to the effect of the assimilation of Arctic radiosonde data. The difference between CTLf and OSEMf in Z300 is large and positive over the western part of Irma (black contour in Figure 4e). During the forecast period of Hurricane Irma, large meandering of the jet stream occurred over the North Pacific and North Atlantic Oceans (Figure S3), transporting large positive errors in Z300 with a relatively large ensemble spread from the Arctic Ocean to the mid-latitudes, which would result in large errors in hurricane track forecasts.
To trace the origin of the large Z300 errors over the mid-latitudes, we examined the temporal evolution of the difference between CTLf and OSEMf in Z300 (∆Z300; Figure S3). The parameter ∆Z300 proved useful for assessing the error resulting from the incorporation of additional radiosonde data [18,19].
In addition, this error can be tracked along a route of error propagation, which was computed as follows: (1) ∆Z300 fields were calculated at each forecast time step (Figure S3); (2) a parcel was placed at the location of the maximum value point of ∆Z300 (MVP∆Z300) over the western part of Hurricane Irma at 0000 UTC 12 September 2017 (forecast day 4.5; Figure 4e; square in Figure S3f); (3) going back in time with a time step of 6 h, the MVP∆Z300 closest to the MVP∆Z300 found at the previously processed time step (6 h later in the forecast) was identified (squares in Figure S3a-f), and a backward trajectory of MVP∆Z300 was compiled (Figure S3). The trajectory shows a large ∆Z300 over the Beaufort Sea at the beginning of the forecast period, which moves along the trough to northern Canada and amplifies with lead time, reaching the western parts of the hurricane at 0000 UTC on 12 September 2017 (black dots; Figure 6a and Figure S3).
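The three steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure's actual implementation: candidate MVP∆Z300 points are taken to be strict local maxima relative to their eight neighbours, "closest" is measured in grid-index space rather than as a great-circle distance, and the starting region is given as a hypothetical index box `start_box`.

```python
import numpy as np


def local_maxima(field):
    """Indices (i, j) of interior points strictly greater than all 8 neighbours."""
    f = np.asarray(field, dtype=float)
    core = f[1:-1, 1:-1]
    is_max = np.ones(core.shape, dtype=bool)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue
            neighbour = f[1 + di:f.shape[0] - 1 + di, 1 + dj:f.shape[1] - 1 + dj]
            is_max &= core > neighbour
    ii, jj = np.nonzero(is_max)
    return [(int(i) + 1, int(j) + 1) for i, j in zip(ii, jj)]


def backward_trajectory(dz_fields, start_box):
    """Trace the maximum of a difference field backward in time.

    dz_fields: array (ntime, nlat, nlon) in chronological order (e.g. 6-h steps).
    start_box: (i0, i1, j0, j1) index box searched at the final time (step 2).
    Returns the list of (i, j) positions in chronological order.
    """
    dz = np.asarray(dz_fields, dtype=float)
    i0, i1, j0, j1 = start_box
    sub = dz[-1, i0:i1, j0:j1]
    k = np.unravel_index(np.argmax(sub), sub.shape)
    current = (i0 + int(k[0]), j0 + int(k[1]))
    path = [current]
    # Step 3: walk back one time step at a time, keeping the nearest maximum.
    for t in range(dz.shape[0] - 2, -1, -1):
        candidates = local_maxima(dz[t])
        if not candidates:
            break
        ci, cj = current
        current = min(candidates, key=lambda p: (p[0] - ci) ** 2 + (p[1] - cj) ** 2)
        path.append(current)
    return path[::-1]
```

Choosing the nearest maximum rather than the largest one keeps the trajectory continuous when an unrelated, stronger difference appears elsewhere in the domain.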
We also examined error propagation by assessing the group velocity fields of quasi-stationary Rossby waves because they can transfer errors in the upper troposphere [17]. Using the Rossby-wave activity flux at 300 hPa [35,36], Figure 6b shows a Rossby-wave train accompanying a strong wave packet from the Chukchi and Beaufort Seas to North America via northern Canada. This quasi-stationary Rossby-wave packet also propagates errors from the Arctic to the mid-latitudes. The forecast error initiated by the removal of Arctic radiosonde observation data is located over the Beaufort Sea at the beginning of the forecast period. It travels to northern Canada via the Bering Sea and Pacific Ocean and amplifies with lead time, indicating that additional radiosonde observations over the Arctic Ocean improve the reproduction of atmospheric circulation in the analysis data and enhance the skill to forecast the track of Irma.

Impact of Inclusion of Additional Aircraft Dropsonde Observation Data on Track Forecast of Hurricane Irma (A Trough Case)
Over the Atlantic Ocean, the sparse observational network results in large uncertainties in the initial field of weather forecasts, causing failures in atmospheric circulation predictions over the Northern Hemisphere. To investigate the impact of the inclusion of observational data collected near hurricanes on the skill of hurricane track forecasts, forecast experiments initialized with OSEA (OSEAf) were conducted to examine the impact of dropsonde observations near the center of Hurricane Irma (Figure 4d). There are no large differences between predicted Z300 at forecast day 4.5 in CTLf and that in OSEAf (Figure 4f), suggesting that additional dropsonde observations near the hurricane have little impact on upper-level trough forecasts in the case of Irma. However, the central position error in OSEAf at forecast day 4.5 is larger than that in CTLf, with the central position in OSEAf lying further to the east of the observed central position. The difference between CTLf and OSEAf in central position error begins after forecast day 1 (Figure 5a), as reported in previous studies [9,10]. The absence of dropsonde observation data in OSEAf results in errors and/or relatively large ensemble spread of other factors, and influences hurricane track forecast skill.
To investigate the cause of error in the track of Irma in OSEAf, we assessed ensemble spreads of the sea level pressure (SLP) field at initial time (Figure S4a-e). Over the Atlantic sector, the sparse observational network over the ocean covering the center of Irma results in large SLP ensemble spreads in all forecasts (Figure S4a-c). The central position of Irma at initial time in OSEAf is the same as that in CTLf. However, the difference between CTLf and OSEAf in SLP ensemble spread is negative near the center of the hurricane (Figure S4e), indicating that ensemble spread in hurricane intensity at initial time is larger in OSEAf. In all forecasts, SLP ensemble spread increases with time, in particular near the hurricane center (Figure S4f-h). At forecast day 1.25 (1800 UTC 8 September in Figure 5a), the mean predicted hurricane position error of Irma in OSEAf is larger than the mean predicted hurricane position errors in CTLf and OSEMf, and SLP ensemble spread near the hurricane center is larger in OSEAf than in CTLf (Figure S4j). In OSEAf, this relatively large SLP ensemble spread amplifies the development and translation speed of the hurricane, resulting in errors and large ensemble spread in the hurricane track forecast (Figure 5a,b,d). In contrast, in OSEMf, SLP ensemble spread near the hurricane center at initial time is small (Figure S4d). Therefore, there is no clear difference between SLP ensemble spread in CTLf and that in OSEMf near the hurricane center at forecast day 1.25 (Figure S4i), contributing to a small difference between the hurricane track in CTLf and that in OSEMf (at 1800 UTC 8 September in Figure 5a-c). The difference in OSEMf is related to accumulated impacts of Arctic observations, which are discussed in Section 3.3.3. Dropsonde observations near the hurricane reduce SLP ensemble spread near the hurricane center at initial time, increasing the accuracy of the hurricane track forecast.

Impact of Inclusion of Additional Arctic Radiosonde Observation Data on Track Forecast of Hurricane Jose (No Trough Case)
We conducted similar forecast experiments initialized with CTL and OSEM for Hurricane Jose (no trough case, Figure 7b-d). Although Jose reached the mid-latitudes during the observation campaign of RV Mirai, its movement was unaffected by upper-level troughs (Figure 7a). In contrast to Irma, there are no clear differences between CTLf and OSEMf in mean and ensemble spread of Z300 over the Atlantic sector (Figure 7d). Evolution of ∆Z300 between 1200 UTC 15 September and 0000 UTC 20 September 2017 (Figure S5) shows a relatively large ∆Z300 over the Chukchi Sea and Alaska at the initial time of the forecast (Figure S5a), which amplifies with lead time and moves westward, reaching the Canadian Archipelago at forecast day 4.5 (Figure S5b-f). There is no wave activity flux from the Arctic Ocean to North America (Figure S6), indicating the absence of a relatively large error and wave packet from the Arctic. However, central positions of Jose at forecast day 4.5 differ between the forecasts (Figure 7a-c). A difference between CTLf and OSEMf in central position error appears after forecast day 1.25 (Figure 8a), suggesting that, similarly to the case of dropsonde observations, large SLP ensemble spread near the hurricane center influences hurricane track forecast skill.
As in the case of dropsonde observations, SLP ensemble spreads over the Atlantic Ocean in the initial fields for both CTLf and OSEMf are relatively large (Figure S7a,b). The difference between CTLf and OSEMf in SLP spread is negative near the hurricane center, indicating that SLP ensemble spread in OSEMf is larger than that in CTLf (Figure S7c). In contrast to the case of Irma, there are differences in central positions of Jose in OSEMf, even at initial time for the forecast (Figure S7b). Although SLP ensemble spread increases with forecast time (Figure S7d,e), it is larger in OSEMf than in CTLf at forecast day 1.25, when OSEMf has a relatively large central position error compared with CTLf (Figure 8a and Figure S7f).
The OSEM analysis data differ from the CTL analysis data in that they lack the Arctic radiosonde observation data collected from RV Mirai. Differences arising from differences in assimilated observations (e.g., error, relatively large ensemble spread) accumulate and become visible in the OSEM analysis from the end of August, possibly affecting the initial fields of forecasts. As a result, Arctic observation data can have indirect and remote impacts on mid-latitude forecasts, rather than direct impacts such as advection or wave propagation of errors. Previous studies found that accumulated differences in analysis data originating from the Arctic Ocean can reach the mid-latitudes [15,19,32,33]. During the first 5 days of radiosonde observations from RV Mirai in the Arctic, there is no difference between CTL and OSEM analysis data in SLP ensemble spread, even over the Arctic Ocean (Figure S8a). During the 5 days prior to the start of the forecast for Hurricane Irma, the difference in SLP ensemble spread is large over the Arctic and Pacific Oceans (Figure S8b). In contrast, during the 5 days prior to the start of the forecast for Hurricane Jose, the difference in SLP ensemble spread is large over the Pacific and Atlantic Oceans (Figure S8c). Fewer Arctic radiosonde observations are included in the initial fields of the forecast for Jose than for Irma, possibly resulting in the relatively large difference in SLP ensemble spread over the Atlantic Ocean. Although neither CTLf nor OSEMf captures the track of Jose, inclusion of additional Arctic radiosonde observation data reduces SLP ensemble spread over the mid-latitudes at initial time and the error of the hurricane track forecast in CTLf.
Figure 7. Predicted tracks of Jose between 1200 UTC 15 September 2017 and 0000 UTC 20 September 2017 with ensemble mean (thick line) and ensemble members (thin lines) from (b) CTLf and (c) OSEMf; black line is track of Jose from NHC best track data. (d) Difference between CTLf and OSEMf in ensemble mean upper-level wind speed (averaged between the 300- and 500-hPa levels; shaded; unit: m s⁻¹), Z300 (black contour; unit: m), and Z300 ensemble spread (orange contour; unit: m); dots indicate statistical significance at the 99% confidence level.

Discussion
To examine the difference in forecast skill for severe tropical storms with and without upper-level troughs, we focused on Pacific typhoons Noru and Lan in 2017. Both typhoons moved northward over the western Pacific Ocean, making landfall over the mainland of Japan (Figure S9a,f).
Typhoon Noru formed over the western Pacific Ocean, south of Japan, at 1200 UTC 3 August 2017. It moved westward to the south of Japan, then turned northward on 5 August and made landfall on the mainland of Japan on 7 August 2017. An upper-level trough and strong winds were absent from the western part of Noru on 8 August 2017 (Figure S9a), indicating that the impact of upper-level atmospheric circulation on the movement of Noru was small. Errors and ensemble spreads in Z300 are relatively small in all operational models, resulting in small errors and ensemble spreads in forecasts of the track of Noru (Figure S9b-e).
Typhoon Lan formed east of the Philippines at 0000 UTC 19 October 2017, then moved northward and made landfall over the mainland of Japan on 23 October 2017. At 1200 UTC 23 October 2017, an upper-level trough with strong wind was seen over the western part of Typhoon Lan, indicating trough influence on the position of Lan. Predicted upper-level troughs in the four models are located to the west of the center of the trough in ERA5 (black lines in Figure S9f-j). In addition, Z300 ensemble spreads are relatively large around the trough in all models (orange lines in Figure S9f-j), resulting in larger central position errors and ensemble spreads for Lan than for Noru (the no-trough case). Compared with ERA5, the eastward intrusion of the upper-level trough with strong wind is weak in the models, reducing the predicted northward movement of Lan (Figure S9f-j). Therefore, analyses of typhoons Noru and Lan support our results obtained from analyses of Atlantic hurricanes.

Conclusions
Using various operational medium-range ensemble forecast models, we assessed the skill of operational models in forecasting Atlantic hurricanes that moved northward over the North Atlantic between 2007 and 2019. When upper-level troughs with strong wind are present over the western part of the hurricanes, there are large errors and ensemble spreads in the predicted upper-level troughs in the models, causing large errors and ensemble spreads in hurricane track forecasts. In contrast, when upper-level troughs are absent over the North Atlantic, there are small errors and ensemble spreads in the predicted upper-level atmospheric circulations in the models and in the hurricane track forecasts. Although operational models differ in their skill in forecasting hurricane tracks because of differences in model performance (e.g., resolutions and physical parameterizations) and assimilation methods (e.g., assimilation techniques and quantity of assimilated data), average central position errors are lower for Atlantic hurricanes without troughs than for those with troughs in all models. Observing system and forecast experiments in which specific Arctic and aircraft observations were removed from the initial field show that hurricane track forecast skill is improved by the inclusion of dropsonde observations near hurricanes and radiosonde observations over the Arctic Ocean in the case where an upper-level trough appears over the western part of the hurricane after 4.0 forecast days. Assessments of dynamical propagation show that the relatively large error and ensemble spread of the initial upper-level field over the Arctic Ocean reach the mid-latitudes after 4.0 forecast days. Arctic radiosonde observations increase the accuracy of forecasts of upper-level wind speed near the trough, enhancing the accuracy of North Atlantic hurricane track forecasts.
In contrast, dropsonde observations near the hurricane reduce SLP ensemble spread near the hurricane center at initial time, improving the hurricane track forecast after 1.0 forecast day. However, the errors and large ensemble spread arising from the absence of Arctic radiosonde observation data accumulate over the mid-latitudes in the analysis data. In the case of Hurricane Jose, Arctic radiosonde observations also reduce SLP ensemble spread over the mid-latitude Atlantic Ocean at initial time, enhancing the accuracy of the hurricane track forecast. During the first 5 days of radiosonde observations in the Arctic, hurricane position forecasting skill is enhanced by the radiosonde observations after 4.0 forecast days. Because relatively large errors and ensemble spreads in atmospheric parameters are absent over the Atlantic sector at initial time in the OSEM analysis data, it takes about 4.0 days for the errors and relatively large ensemble spread of the upper-level fields over the Arctic to reach the Atlantic sector. In contrast, during the second half of the Arctic observation campaign, a relatively large SLP ensemble spread has accumulated because of the absence of Arctic radiosonde observations, and is present over the Atlantic Ocean at initial time in the OSEM analysis data. This relatively large ensemble spread influences hurricane track forecast skill after 1.0 forecast day. Arctic radiosonde observations thus reduce the error and ensemble spread of the predicted hurricane track, even in no-trough cases.
These experiments suggest that a more efficient observing system over the higher latitudes is required to reduce human casualties and socioeconomic damages over the mid-latitudes. However, ship-based Arctic observation campaigns have mainly been conducted in summer and early autumn. As a result, improvements in the performance of weather forecasts over the Northern Hemisphere from including ship-based Arctic radiosonde observations are limited to these seasons. Previous studies reveal that increases in the number of radiosonde observations at existing Arctic stations enhance the skill in forecasting mid-latitude events [18,19]. The number of radiosonde observations at several existing stations in the Arctic and from ships in the Arctic Ocean increased during the Year of Polar Prediction, which took place between mid-2017 and mid-2019. These observations provide a great opportunity to study the effect of the inclusion of additional summer radiosonde observation data over the Northern Hemisphere on the predictability of extreme events (e.g., tropical storms, heatwaves) at the mid-latitudes.
Supplementary Materials: The following are available online at http://www.mdpi.com/2073-4433/11/7/702/s1, Figure S1: Locations of radiosonde and dropsonde observations during summer 2017; Figure S2: Predicted upper-level atmospheric circulations and hurricane track for Jose, Maria, and Nate in 2017 by The Interactive Grand Global Ensemble; Figure S3: Difference in upper-level atmospheric circulation between CTLf and OSEMf for Irma forecast period in 2017; Figure S4: Difference in ensemble spread of SLP between CTLf and OSEMf for Irma case; Figure S5: Difference in upper-level atmospheric circulation between CTLf and OSEMf for Jose forecast period in 2017; Figure S6: Z300 anomaly with wave activity flux anomaly for Jose case; Figure S7: Difference in ensemble spread of SLP between CTLf and OSEMf for Jose case; Figure S8: Difference in SLP ensemble spread between CTL and OSEM analysis data; Figure S9: Predicted upper-level atmospheric circulations and hurricane track for Pacific typhoons in 2017; Table S1: Details of models in The Interactive Grand Global Ensemble.