Strategies for Assimilating High-Density Atmospheric Motion Vectors into a Regional Tropical Cyclone Forecast Model (HWRF)

Abstract: In recent years, atmospheric numerical modeling frameworks and satellite observing systems have both undergone significant advances. While these developments offer considerable potential for improving forecasts of high-impact weather events such as tropical cyclones (TCs), much work remains to be done regarding the targeted processing and optimal use of observations now becoming available with high spatiotemporal resolution. Using the 2019 version of NCEP's HWRF model, we explore several different strategies for the assimilation of TC-scale, high-density atmospheric motion vectors (AMVs) derived from the new-generation GOES-R series of geostationary satellites. Using 2017's Atlantic Hurricane Irma as a case study, we examine the HWRF forecast impacts of observation pre-processing, including thinning and adjustments to observation errors. It is demonstrated that enhanced vortex-scale GOES-16 AMVs contribute to notable improvements in HWRF track forecast error compared to a baseline control experiment that does not incorporate the high-density AMVs. Impacts on TC intensity and structure (i.e., wind radii) forecast errors are less robust, but results from the optimization experiments suggest that further work (both with regard to data assimilation strategies and advancements in the methods themselves) should lead to improvements in these forecast variables as well.


Introduction
Tropical cyclones (TCs) are among the most spectacular of natural phenomena, and, in terms of their potential to impact society on vast socioeconomic scales, they remain one of mankind's greatest forecast challenges. In the average year, about 80 to 90 TCs will form over the Atlantic, Pacific, and Indian oceans [1], and, of these, a smaller number will develop into intense cyclones that menace shipping lanes and threaten coastal areas with high winds, storm-surge inundation, and freshwater flooding from rainfall accumulations that can exceed 1 m in extreme cases, such as 2017's Hurricane Harvey [2]. As coastal populations continue to increase in the 21st century, accurate projections of TC track, intensity, and structure will become more critical than ever.
To meet this challenge, continued development of numerical weather prediction (NWP) and data assimilation (DA) techniques is necessary. NWP models such as the National Centers for Environmental Prediction (NCEP) Hurricane Weather Research and Forecasting Model (HWRF) [3] and its accompanying DA system have undergone tremendous development during the past decade. Increases in model resolution, more realistic parameterizations of sub-grid-scale phenomena (cumulus convection, cloud microphysics, boundary layer fluxes of enthalpy and momentum), and increasingly [...]
To simulate air-sea interaction and the impact of sea-surface temperature (SST) cooling on the evolution of the TC, HWRF is coupled to the Princeton Ocean Model for use with TCs (POM-TC) [20].
TC analysis and forecasting with HWRF is a multi-step process that begins with the interpolation of GFS analysis fields to d01 at the beginning of each 6-h analysis cycle (i.e., no DA is performed on d01). The storm-following domains, d02 and d03, are initialized from the GDAS 6-h forecast, and observations (including AMVs) are assimilated over a +/−3-h window centered on each analysis time. The first-guess fields on d02 and d03 are further modified in a number of ways depending on whether the analysis cycle is for a cold start (i.e., the first cycle of a cycling DA experiment) or a warm start (i.e., cycles for which a previous forecast serves as the first guess). For the initial cycle, the GDAS 6-h forecast is modified by the addition of a bogus vortex; subsequent cycles employ vortex relocation and structure correction (i.e., modification of the wind and mass fields) to bring them into better agreement with the observed values of maximum wind and critical wind radii as specified in the NCEP TC-Vitals message valid at the analysis time [3]. For this study, analysis cycling occurs every 6 h at the standard synoptic times (i.e., 00, 06, 12, and 18 UTC).
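The cycling schedule described above can be illustrated with a short sketch. The function names are hypothetical and are not part of HWRF or GSI; times are assumed to be naive UTC datetimes, and treating the window endpoints as inclusive is an assumption.

```python
from datetime import datetime, timedelta

CYCLE_HOURS = 6                    # 6-h analysis cycling, as stated in the text
HALF_WINDOW = timedelta(hours=3)   # +/-3-h assimilation window


def analysis_times(start, end):
    """Yield 00/06/12/18 UTC analysis times between start and end (inclusive)."""
    # Round the start time up to the next whole hour, then to a synoptic hour.
    t = start.replace(minute=0, second=0, microsecond=0)
    if t < start:
        t += timedelta(hours=1)
    while t.hour % CYCLE_HOURS != 0:
        t += timedelta(hours=1)
    while t <= end:
        yield t
        t += timedelta(hours=CYCLE_HOURS)


def in_window(obs_time, analysis_time):
    """True if an observation falls within +/-3 h of the analysis time.

    Inclusive endpoints are an assumption of this sketch.
    """
    return abs(obs_time - analysis_time) <= HALF_WINDOW
```

For example, a period starting at 0130 UTC 3 September and ending at 0000 UTC 4 September contains four analysis times (06, 12, 18, and 00 UTC), and an AMV valid at 0845 UTC would be assigned to the 06 UTC window.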
The DA component of the HWRF system uses the Environmental Modeling Center (EMC) version of the Gridpoint Statistical Interpolation (GSI) code checked out from the EMC repository [21] on 11 April 2019. Specifically, when tail-Doppler radar (TDR) (see, e.g., [22]) observations are available, the two-way hybrid EnKF-GSI algorithm is employed, meaning that the forecast error covariance is given by a blend of the static NCEP covariance and the sample covariance provided by a 40-member ensemble of 6-h high-resolution HWRF forecasts. When TDR observations are not available, the forecast error covariance is computed using the one-way hybrid method (i.e., a blend of the static NCEP covariance and the sample covariance provided by the NCEP GFS 80-member ensemble). All data assimilated in this study come from the suite of operationally available BUFR files and include observations from conventional (radiosonde, ship and surface airways reports, etc.), airborne (TDR, flight-level reconnaissance, etc.), and satellite (SSMIS and AMSU-A radiances, etc.) platforms. The one exception to this rule involves the reprocessed GOES-16 AMVs, which are the focus of this study.
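As a schematic illustration of the covariance blending described above, a hybrid background error covariance is commonly written as a weighted sum of a static term and an ensemble sample term. The symbols and the weights $\beta_s$ and $\beta_e$ below are assumptions introduced for illustration; their values are not given in the text, and covariance localization is omitted for brevity.

```latex
\mathbf{B}_{\mathrm{hybrid}} = \beta_s \, \mathbf{B}_{\mathrm{static}} + \beta_e \, \mathbf{P}^{e},
\qquad
\mathbf{P}^{e} = \frac{1}{N-1} \sum_{k=1}^{N}
  \left(\mathbf{x}_k - \bar{\mathbf{x}}\right)\left(\mathbf{x}_k - \bar{\mathbf{x}}\right)^{\mathrm{T}}
```

Here $N = 40$ would correspond to the HWRF ensemble used in the two-way hybrid configuration and $N = 80$ to the GFS ensemble used in the one-way configuration.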
Further details on HWRF, GSI and all components of the modeling system can be found in the most recent version of the HWRF scientific documentation [3]. All simulations and analysis were carried out on the NOAA Jet supercomputer located in Boulder, CO.

Hurricane Irma (2017)
Hurricane Irma was one of several destructive hurricanes to impact the Atlantic basin during the 2017 season, spreading a swath of destruction from the Leeward and Virgin Islands to the Bahamas, Cuba and the United States. In the United States alone, Irma resulted in 92 fatalities, and damage from wind, storm surge and freshwater flooding is estimated at 50 billion US dollars [23]. The disturbance that would become Irma moved off the west coast of Africa on 27 August and developed into a tropical depression on 30 August while situated west of the Cape Verde Islands. From there, it tracked between west-southwestward and west-northwestward before finally turning north over the Florida Straits toward its landfall in the United States (Figure 1).
Irma rapidly intensified into a major hurricane on 1 September while over the eastern tropical Atlantic (Figure 2). An eyewall-replacement cycle temporarily prevented any further increase in strength, but, beginning on 4 September, another period of rapid intensification began and lasted until midday on 5 September, by which time Irma had strengthened into a formidable category-5 hurricane with maximum sustained winds of 155 kt. Incredibly, Irma maintained category-5 intensity for 60 h until a second eyewall replacement cycle and associated dry-air intrusion [23] resulted in a weakening phase that lasted until 8 September. Irma then briefly regained category-5 intensity before making landfall on the north coast of Cuba. A day and a half later, Irma made landfall near Marco Island, Florida as a category-3 hurricane and later dissipated over southern Georgia, bringing its long and destructive history to an end.
Aside from its historic destructiveness, Irma's sequence of multiple rapid intensification events and eyewall replacement cycles makes it an ideal case for study. TCs that undergo rapid intensity change and/or eyewall replacement cycles often present exquisite challenges for numerical guidance. Irma was no exception [23]. In addition, the newly launched GOES-16 was placed in pre-operational check-out mode by NOAA/NESDIS during the 2017 Atlantic TC season. The flexible meso-scan imaging mode was activated and began targeting Irma on 3 September and continued following it for the duration of the storm.

GOES-16 AMVs
Atmosphere 2020, 11, 673
For this study, AMV datasets were reprocessed by the University of Wisconsin Cooperative Institute for Meteorological Satellite Studies (UW/CIMSS) from GOES-16 ABI rapid scan imagery beginning on 3 September for Hurricane Irma (2017). Special processing strategies were imposed to enhance AMV coverage and maximize the information content to resolve the scales of the flow fields associated with the storm vortex and its near environment. Crucial to obtaining these observations is the continuous 1-min image sampling over targeted storms as mentioned above. In addition to using 1-min image triplets in the cloud tracking process, modifications for enhanced AMV coverage (vs. routine full-disk processing) included increasing the target density, reducing the minimum gradient required for target identification, disabling stringent coherency requirements, and relaxing QC constraints [12].

The enhanced AMV datasets using the modified reprocessing strategies were created at 15-min intervals from four GOES-16 ABI bands: visible (VIS, Channel 2, local daytime only), IR shortwave (SWIR, Channel 7, local nighttime only), IR water vapor cloud top (WVCT, Channel 8) and IR longwave (IR, Channel 14). The AMV derivation algorithm employs a nested tracking technique and is essentially the same as that used for operational processing at NOAA/NESDIS [24]. As mentioned above, the derived AMVs benefit from the improved temporal sampling of the ABI meso sector, and, importantly, the dataset processing strategy is optimized for enhanced coverage over and around a targeted TC.
Given the increased spatiotemporal resolution (including good image navigation/co-registration), the reprocessed AMVs produced using the GOES-16 meso-scans are considered to have accuracies superior to those of their conventional counterparts. Figure 3 shows an example of the typical daylight AMV coverage over an area roughly the size of the HWRF d03 assimilation domain for one cycle (+/−3 h from assimilation time) provided by the tailored processing that includes AMVs derived from the limited-area meso-scan sectors targeting Irma. Also shown for comparison are the routine operational (NESDIS) AMVs derived from GOES-13 that were available for assimilation. The enhanced observation density is especially evident in the upper levels over and around the storm core area. Enhanced coverage was less attainable in the core region at low levels due to the extensive storm cirrus canopy but is still an improvement over the operational AMV product. It is expected that GOES-16/17 1-min sector AMV datasets will become operational at NOAA/NESDIS and be made publicly available in real time in the near future. Therefore, it is of particular importance that techniques and strategies for their optimal use in operational hurricane models be developed and refined.

Experiment Design and Evaluation Metrics
To evaluate the impact of the reprocessed GOES-16 AMVs on HWRF forecasts of Hurricane Irma, we first conducted a baseline cycling experiment (hereafter referred to as BASE) that assimilated the full complement of conventional, airborne (including TDR observations collected by NOAA research aircraft when available), and satellite observations, which included the operationally produced AMVs from GOES-13 but excluded any GOES-16 AMVs. Next, a cycling experiment (hereafter referred to as G16REP) was performed, adding the reprocessed GOES-16 AMV dataset. The differences between G16REP and BASE thus reflect the full impact of the new information contained in the reprocessed GOES-16 dataset and provide an indication of the value added by the addition of increased wind vector information owing to the targeted processing strategy.
Since the reprocessed GOES-16 AMV dataset represents a huge increase in observation density, the potential exists that the observation errors themselves are spatially correlated. Failing to account for potential observation error correlation in the HWRF DA system might tend to degrade the analyses and subsequently limit the positive impact on forecasts as well. Given that the nature of the enhanced AMV observation error correlations is at present poorly understood, we conducted a third experiment (G16REP THIN) which thinned the reprocessed GOES-16 AMVs to a 25 km × 50 hPa mesh. We did so with the expectation that observation error correlations would decrease with increased spatial separation of the observations themselves. The differences between G16REP THIN and G16REP therefore represent the impact of reducing, at least partially, the adverse impact of observation error correlation.
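A minimal sketch of box thinning to a 25 km × 50 hPa mesh is given below. It is not the GSI thinning algorithm, whose details are not described here. The function name, the dictionary keys, the equirectangular distance approximation, and the rule of keeping the observation with the highest quality indicator in each box are all assumptions made for illustration.

```python
import math

KM_PER_DEG = 111.2  # approximate km per degree of latitude


def thin_amvs(obs, dx_km=25.0, dp_hpa=50.0, ref_lat=None):
    """Keep at most one AMV per (dx_km x dx_km x dp_hpa) box.

    obs: list of dicts with 'lat', 'lon' (degrees), 'p' (hPa), and optional 'qi'.
    Within each box the observation with the highest 'qi' is retained
    (a hypothetical selection rule; ties keep the first one seen).
    """
    if not obs:
        return []
    if ref_lat is None:
        ref_lat = sum(o["lat"] for o in obs) / len(obs)
    # Equirectangular approximation: scale longitude by cos(reference latitude).
    km_per_deg_lon = KM_PER_DEG * math.cos(math.radians(ref_lat))
    best = {}
    for o in obs:
        key = (
            math.floor(o["lon"] * km_per_deg_lon / dx_km),
            math.floor(o["lat"] * KM_PER_DEG / dx_km),
            math.floor(o["p"] / dp_hpa),
        )
        if key not in best or o.get("qi", 0.0) > best[key].get("qi", 0.0):
            best[key] = o
    return list(best.values())
```

Because only one observation survives per box, the minimum separation between retained AMVs grows toward the mesh size, which is the property the thinning experiment relies on to reduce the influence of correlated observation errors.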
An additional factor to be considered is the quality control process used by GSI to eliminate outlying observations. Currently, at the beginning of each outer loop, GSI applies a gross-error check that excludes observations whose observation-minus-background (OMB) values exceed a specified threshold:
$$|y - Hx| > e_{\mathrm{gross}} \times \max\left[e_{\min}, \min\left(e_{\max}, e_{\mathrm{obs}}\right)\right],$$
where $|y - Hx|$ is the OMB value, $e_{\mathrm{gross}}$ is the gross-error parameter, $e_{\min}$ and $e_{\max}$ are minimum and maximum bound parameters, and $e_{\mathrm{obs}}$ is the value of the observation error itself. The HWRF default value of $e_{\mathrm{gross}} = 2.5$ was used for the above-mentioned experiments. To test the hypothesis that the default gross-error parameter is rejecting too many reprocessed GOES-16 observations, we conducted a fourth assimilation experiment (GROSS2X) in which we set $e_{\mathrm{gross}} = 5.0$. To complete the examination of both the thinning and gross-error impacts, we conducted a fifth experiment (GROSS2X THIN) which used $e_{\mathrm{gross}} = 5.0$ and thinned the observations to the same mesh used in G16REP THIN. A summary of the experiments and their respective details and differences is given in Table 1.
The impacts of the enhanced AMVs from each of the experiments mentioned above were assessed in terms of selected forecast metrics: (1) mean absolute error (MAE) of the TC forecast positions (expressed in nautical miles, nm), (2) MAE and bias of the TC forecast maximum sustained wind (expressed in knots), and (3) MAE and bias of the TC's forecast area of associated 34-kt winds (expressed in terms of $10^4$ nm$^2$). We defined the area of 34-kt winds as the ellipse whose area is given as
$$A_{34} = \pi r_1 r_2,$$
where $r_1$ is defined as half the sum of the 34-kt wind radii in the northeastern and southwestern quadrants, $r_1 = (R34_{ne} + R34_{sw})/2$, and $r_2$ is defined as half the sum of the 34-kt wind radii in the northwestern and southeastern quadrants, $r_2 = (R34_{nw} + R34_{se})/2$.
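Returning to the gross-error check defined earlier in this section, the following sketch shows how doubling the gross-error parameter admits observations that the default setting rejects. It is a simplified scalar version, not the GSI implementation. The function name is hypothetical, and the default values of `e_min` and `e_max` are placeholders; only the values 2.5 and 5.0 for the gross-error parameter come from the text.

```python
def passes_gross_check(y, hx, e_obs, e_gross=2.5, e_min=1.0, e_max=10.0):
    """Return True if the observation survives the gross-error check.

    y: observed value; hx: background equivalent; e_obs: observation error.
    e_min and e_max are hypothetical placeholder bounds.
    """
    # The observation error is clipped to [e_min, e_max] before scaling.
    bounded_error = max(e_min, min(e_max, e_obs))
    # An observation is rejected when |y - Hx| exceeds the scaled bound.
    return abs(y - hx) <= e_gross * bounded_error
```

With illustrative numbers, an observation with an OMB of 12 and an observation error of 3 is rejected at the default setting (threshold 7.5) but retained when the parameter is raised to 5.0 (threshold 15).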
Observed values of tropical cyclone position, intensity, and 34-knot wind radii were obtained from the HURDAT2 Best Track database maintained by the National Hurricane Center [25] and were used in the computation of the metrics defined above.
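The intensity and size metrics defined above can be sketched as follows. The function names are hypothetical, the bias sign convention (forecast minus observed) is an assumption, and the 34-kt wind area uses the standard ellipse-area formula (pi times the product of the two semi-axes) with the semi-axes defined in the text.

```python
import math


def mean_absolute_error(forecast, observed):
    """Mean absolute error over matched forecast/observation pairs."""
    errors = [abs(f - o) for f, o in zip(forecast, observed)]
    return sum(errors) / len(errors)


def bias(forecast, observed):
    """Mean of forecast minus observed (sign convention assumed here)."""
    diffs = [f - o for f, o in zip(forecast, observed)]
    return sum(diffs) / len(diffs)


def area_34kt(r34_ne, r34_se, r34_sw, r34_nw):
    """Area of the 34-kt wind ellipse in units of 10^4 nm^2 (radii in nm)."""
    r1 = (r34_ne + r34_sw) / 2.0   # NE-SW semi-axis
    r2 = (r34_nw + r34_se) / 2.0   # NW-SE semi-axis
    return math.pi * r1 * r2 / 1.0e4
```

As a check on units, a symmetric storm with 100-nm wind radii in all four quadrants gives an area of pi × 10^4 nm^2, i.e., a returned value of about 3.14.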

Results
The impact of the reprocessed GOES-16 dataset (G16REP) was first examined relative to the cycling DA experiment that assimilated no GOES-16 AMVs at all (BASE). Relative to BASE, G16REP reduced the MAE for HWRF track forecasts of Irma at all lead times (Figure 4a), improving upon the BASE track forecasts by 10-15% for lead times out to 48 h (Figure 4b).
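For reference, the track error and percentage improvement quoted above can be computed along the following lines. This is a sketch under assumptions: the text does not state how position error is measured, so a great-circle (haversine) distance is assumed, and the function names are hypothetical.

```python
import math

EARTH_RADIUS_NM = 3440.07  # mean Earth radius (6371.0 km) in nautical miles


def track_error_nm(lat_f, lon_f, lat_o, lon_o):
    """Great-circle distance (nm) between forecast and observed positions."""
    phi1, phi2 = math.radians(lat_f), math.radians(lat_o)
    dphi = phi2 - phi1
    dlam = math.radians(lon_o - lon_f)
    # Haversine formula.
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_NM * math.asin(math.sqrt(a))


def percent_improvement(mae_experiment, mae_baseline):
    """Positive values mean the experiment has lower MAE than the baseline."""
    return 100.0 * (mae_baseline - mae_experiment) / mae_baseline
```

With made-up numbers, an experiment MAE of 85 nm against a baseline MAE of 100 nm corresponds to a 15% improvement.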
The impact on Irma intensity forecasts was somewhat more mixed (Figure 5a). The G16REP experiment yielded small improvements to the MAE for HWRF forecast maximum sustained wind speed out to 24 h. However, for most lead times after 24 h, the forecast quality was degraded, by up to 10% at 72 h (Figure 5b).
Though track and intensity metrics have traditionally been the gold standard of hurricane forecast evaluation, it is also important to consider storm size, since the areal extent of the TC's strong wind field defines the envelope of potential impacts (including storm surge). For this metric, we chose the forecast areal extent of 34-kt winds (A34). Figure 6 shows the MAE and % improvement of HWRF forecasts of A34 for the BASE and G16REP experiments. While G16REP slightly improved HWRF forecasts of Irma's size out to 24 h, as was the case with forecasts of intensity, the impacts beyond 24 h were mixed but generally less positive.
The DA experiments described above capture the impact of the reprocessed GOES-16 AMV dataset using "out-of-the-box" settings for observation density and quality control in the GSI system. In the next set of experiments, we attempted to better understand how thinning the available data (to alleviate the impact of potential spatial correlation of observation errors) and, conversely, increasing the density of available data (in case the pre-tuned quality control in GSI is too strict for the new GOES-16 AMV observations) impact the HWRF forecast results relative to G16REP. Two experiments were conducted, G16REP THIN and GROSS2X, which imposed observation thinning and relaxed quality control on the dataset, respectively, as described in Section 2.4. Additionally, we conducted another experiment, GROSS2X THIN, which examined the impact of both relaxed quality control and observation thinning.
Figure 7a presents the HWRF track forecast results (MAE) for this expanded set of experiments. Each showed a robust improvement over BASE with the exception of GROSS2X, which was slightly worse at 6 h (Figure 7b). With respect to G16REP, observation thinning (G16REP THIN) showed a small positive impact over the first 18 h, but the impact was negative to neutral beyond that lead time. Relaxing the QC (GROSS2X) resulted in an overall degradation early in the forecast, but notably improved on G16REP beyond 60 h. The combination of the two (GROSS2X THIN) yielded generally mixed results. In summary, these experiments mainly degraded the G16REP performance out to about 60 h, after which time there was a notable improvement from the relaxation of QC constraints during the assimilation.
Turning next to HWRF intensity forecast results, both G16REP THIN and GROSS2X showed markedly lower MAE and bias with respect to G16REP for lead times out to 36-48 h (Figure 8a). As with the track forecast impacts (Figure 7b), observation thinning notably helps during the first 12 h of the intensity forecasts relative to BASE (Figure 8b) and continues to have a positive impact on intensity forecasts out to 36 h. The impact beyond about 48 h was mixed with the thinning experiment, but GROSS2X showed small improvements compared to G16REP.
We next considered the effects of observation thinning and relaxed quality control on forecasts of TC size. Figure 9a,b show that for the first 18 h the respective impacts of the two experiments were reversed, with only GROSS2X producing any forecast benefit beyond that offered by the original G16REP configuration, while thinning degraded the forecasts. Beyond that time, the impacts were widely varying and mixed, but the improvements outweighed the degradations with respect to G16REP. Curiously, the GROSS2X THIN experiment MAEs were systematically worse than BASE at all lead times except at 72 h.
Figure caption (fragment): the number of verifying forecasts for each lead time is shown in black across the top of each plot.
Two additional DA experiments were conducted. The first examined the impact of turning off the HWRF vortex initialization (VI) size/intensity adjustment, and the second tested the impact of turning off analysis increment blending in the TC core region (the idea being that this may allow for better cycling of the flow-dependent covariances, which the enhanced AMVs could help control). The forecast impact results of these experiments (not shown) were mixed for Irma's track, intensity and structure, with no systematic indication of forecast improvement. In this regard, we speculate that 4DEnVAR and hourly cycling (not available yet in the HWRF DA options employed in this study, but coming soon) may be necessary to take full advantage of the vortex-scale information contained in the enhanced AMV observations.

Discussion
The intent of this study was to examine the reprocessed GOES-16-enhanced, TC-focused AMV demonstration dataset for Hurricane Irma in terms of HWRF assimilation strategies, in anticipation of these datasets becoming routinely available in the near future. The results presented in Section 3 indicate that the impact of assimilating the reprocessed GOES-16 AMV observations is generally positive with respect to HWRF track forecasts of Hurricane Irma while somewhat mixed for intensity and structure. While the overall sample sizes are too small for results to pass significance tests, the findings can be used to inform and steer future studies that consider much larger samples.
Regarding impacts of the enhanced AMVs on HWRF track forecasts, G16REP produces a robust improvement at all lead times relative to BASE, which assimilates no GOES-16 AMVs. For this case study, DA optimization efforts directed at adjusting the density and quality allowance of the assimilated observations indicate that both thinning of the observations (in this case to a 25 km × 50 hPa mesh in the experiment G16REP THIN) and relaxing of the quality control in GSI via upward adjustment of the gross-error parameter (from 2.5 to 5.0 in experiment GROSS2X) have some promise depending on the forecast lead times.
Impacts on the HWRF intensity forecasts of Irma are mixed. There is evidence that the increased observation density offered by enhanced GOES-16 AMVs improves the initial analyses, which then results in small but positive results early in the forecast period. This was seen in G16REP as well as in the optimization experiments G16REP THIN and GROSS2X. Moreover, G16REP THIN and GROSS2X demonstrated an increased duration of positive impact (from 24 h in the case of G16REP to 48 h in the cases of G16REP THIN and GROSS2X). The experiment which combined thinning and relaxed quality control (GROSS2X THIN) did not distinguish itself from G16REP in terms of intensity impact.
In terms of HWRF forecasts of Irma's size, the impacts are again mixed; however, the relaxation of GSI quality-control constraints (GROSS2X) does show promise when coupled with adding the enhanced AMVs. In fact, more generally speaking, this case study shows that reducing this constraint produces varying but overall positive impacts resulting from the assimilation of the enhanced AMVs. This suggests the observations deviating from the model background are resolving important dynamical information over the TC core and near environment, as also concluded by [11].
Tuning the settings that control observation density and correct for spatial correlation of observation errors with these enhanced GOES-16/17 AMV datasets will be important in achieving optimal forecast performance. As currently configured, the HWRF DA system specifies AMV observation errors as a simple function of pressure altitude. However, the AMV observations themselves come with quality indicators that can be directly used in the pre-assimilation or DA steps as another means towards more effective data selection. Treating the AMV observation errors non-homogeneously in future DA implementations will do much to ensure that the information content of the AMVs is properly projected onto the analysis increments.
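One way to picture a non-homogeneous error treatment is sketched below: a pressure-dependent base error is inflated for observations with lower quality indicators. This is purely illustrative and is not the HWRF or GSI scheme. The function names, the linear inflation rule, the assumption that the quality indicator is normalized to the range 0 to 1, and any error table passed in are all hypothetical.

```python
def pressure_error(p_hpa, table):
    """Linearly interpolate a base observation error (m/s) in pressure.

    table: list of (pressure_hPa, error) pairs with distinct pressures.
    Values outside the table range are clamped to the end points.
    """
    pts = sorted(table)  # ascending pressure
    if p_hpa <= pts[0][0]:
        return pts[0][1]
    if p_hpa >= pts[-1][0]:
        return pts[-1][1]
    for (p0, e0), (p1, e1) in zip(pts, pts[1:]):
        if p0 <= p_hpa <= p1:
            w = (p_hpa - p0) / (p1 - p0)
            return e0 + w * (e1 - e0)


def qi_inflated_error(base_error, qi, max_inflation=2.0):
    """Inflate the base error as the quality indicator (0 to 1) decreases.

    qi = 1 leaves the error unchanged; qi = 0 multiplies it by max_inflation.
    """
    qi = min(1.0, max(0.0, qi))
    return base_error * (1.0 + (max_inflation - 1.0) * (1.0 - qi))
```

Under such a scheme, two AMVs at the same pressure level would receive different weights in the analysis depending on their quality indicators, rather than the single pressure-dependent error described in the text.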
It is also important to note that the current HWRF cycling period (i.e., 6 h) is insufficient to fully leverage the increased temporal resolution of the enhanced AMV datasets, which for this study were reprocessed at 15-min intervals. It is conceivable that even more frequent AMV datasets will become available in the near future from the GOES-16/17 meso scans (e.g., at 5-min intervals) when the product is operationalized at NOAA/NESDIS. This highlights the importance of model development in the optimal utilization of novel observation sets. Increasing the cycling frequency to hourly (or to even shorter intervals) will be an important part of such developmental activities, but continued improvements in model physics, dynamics, and numerics will also be necessary. Each of these aspects of the FV3 and HAFS, currently under development at NWS, NCEP, and EMC, will be crucial in realizing the full potential of the high-density observations of the present and future.

Conclusions
In this study, we sought to develop a framework for working towards the optimal use of high-density GOES-16 AMVs in sophisticated, high-resolution NWP and DA systems (here represented by the 2019 version of NCEP's HWRF model). We demonstrated that the new GOES-16 AMVs have a broadly positive impact on track forecasts for a single high-profile TC case (Irma) from the 2017 Atlantic hurricane season. Although positive forecast impact on Irma's size and intensity was limited to the first 24 h, we established that, via the imposition of observation thinning and/or relaxation of quality control (thereby allowing more observations to impact the analysis), the period of beneficial intensity forecast impact can be increased. We anticipate that further optimization, with a much larger and more representative set of cases, will lead to more robust results.
However, it must be stressed that optimal utilization of high-density AMVs (as well as other current and future observation types with high density in both space and time) will require upgrades to the model and DA systems as well. More rapid cycling, improved specification of observation errors to include fully non-homogeneous treatment, and even improved model physics will be necessary to extract the maximum benefit from the observations and provide the most accurate and reliable forecasts possible.