2018 Atmospheric Motion Vector (AMV) Intercomparison Study

Atmospheric Motion Vectors (AMVs) calculated by six different institutions (the Brazil Center for Weather Prediction and Climate Studies, CPTEC/INPE; the European Organization for the Exploitation of Meteorological Satellites, EUMETSAT; the Japan Meteorological Agency, JMA; the Korea Meteorological Administration, KMA; the United States National Oceanic and Atmospheric Administration, NOAA; and the Satellite Application Facility on Support to Nowcasting and Very Short Range Forecasting, NWCSAF) from JMA's Himawari-8 satellite data and other common input data are compared here. The comparison is based on two different AMV input datasets, calculated with two different image triplets for 21 July 2016, using both a prescribed and a center-specific configuration. The main results of the study are summarized as follows: (1) the differences between the AMV datasets depend very much on the 'AMV height assignment' used and much less on the use of a prescribed or specific configuration; (2) the 'Common Quality Indicator (CQI)' has a quantified skill in filtering collocated AMVs for an improved statistical agreement between centers; (3) among the six operational AMV algorithms verified in this AMV Intercomparison, the JMA algorithm has the best overall performance considering all validation metrics, mainly due to its new height assignment method: an optimal estimation considering the observed infrared radiances, the vertical profile of the Numerical Weather Prediction (NWP) wind, and the brightness temperature estimated with a radiative transfer model.

Remote Sens. 2019, 11, 2240; doi:10.3390/rs11192240


Introduction
The use of satellite-derived cloud displacements to infer atmospheric motion (AMVs or Atmospheric Motion Vectors) has been investigated since the first weather satellites were launched. In the early 1960s, Tetsuya Fujita developed analysis techniques to use cloud pictures from the first Television Infrared Observation Satellite (TIROS), a polar orbiter, for estimating the velocity of tropospheric winds [1]. Throughout the 1970s and early 1980s, AMVs were produced from geostationary satellite data using a combination of automated and manual techniques.
The derivation of Atmospheric Motion Vectors (AMVs) has traditionally been based on the following steps:

•
The reading and preprocessing of the satellite data, including, for example, the normalization of satellite visible images.

•
The location of suitable 'tracers' in an initial image: cloud features, and humidity features in cloudless areas in water vapor images.

•
The location of those tracers in a later image: the tracking process. The features can change their shape or even disappear, but enough of them survive to produce a significant number of AMVs; with short time intervals (up to 15 min), the features evolve less and more vectors can be calculated.

•
The 'height assignment' of the tracers: the pressure level of the feature must be determined to locate the AMVs in a three-dimensional position in the atmosphere.

•
The calculation of the AMV vectors, considering the geographical displacement of the tracers between the initial image and the later image.

•
A quality control process, so that only the AMVs with better quality are accepted.
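The derivation steps above can be illustrated with a toy, self-contained sketch on synthetic images. This is not any producer's operational code: the function names, box sizes, and thresholds are illustrative only, and the tracking here uses a simple sum-of-squared-differences search.

```python
import numpy as np

def find_tracers(img, box=8, grid=16, min_var=0.5):
    """Select candidate target boxes on a regular grid, keeping only
    boxes with enough brightness-temperature contrast to act as tracers."""
    h = box // 2
    tracers = []
    for i in range(h, img.shape[0] - h, grid):
        for j in range(h, img.shape[1] - h, grid):
            if img[i - h:i + h, j - h:j + h].var() >= min_var:
                tracers.append((i, j))
    return tracers

def track_ssd(img0, img1, pos, box=8, search=6):
    """Locate the img0 target box in img1 by minimising the sum of
    squared differences (SSD) over a search area; returns (di, dj)."""
    i, j = pos
    h = box // 2
    tmpl = img0[i - h:i + h, j - h:j + h]
    best, best_d = (0, 0), np.inf
    for di in range(-search, search + 1):
        for dj in range(-search, search + 1):
            cand = img1[i + di - h:i + di + h, j + dj - h:j + dj + h]
            if cand.shape != tmpl.shape:
                continue
            d = float(((cand - tmpl) ** 2).sum())
            if d < best_d:
                best, best_d = (di, dj), d
    return best

def wind_from_displacement(disp, pixel_km=2.0, dt_s=600.0):
    """Convert a pixel displacement into (u, v) in m/s, assuming 2-km
    pixels and a 10-min image interval, as for Himawari-8/AHI full disk."""
    di, dj = disp
    u = dj * pixel_km * 1000.0 / dt_s    # eastward (along columns)
    v = -di * pixel_km * 1000.0 / dt_s   # northward (line index grows southward)
    return u, v
```

On a synthetic pair where the second image is the first shifted by (2, 3) pixels, `track_ssd` recovers that displacement exactly, and `wind_from_displacement((2, 3))` converts it to roughly (10.0, -6.7) m/s. A quality-control step would then reject vectors whose match score or spatial consistency is poor.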
The following is a brief chronology of the generation of geostationary AMVs by the primary producers:

•
In 1969, the NOAA (United States National Oceanic and Atmospheric Administration) National Environmental Satellite Service (NESS, now NESDIS) began routine production of wind vectors using a completely manual technique based on viewing time loops of visible satellite images, resulting in approximately 300 AMVs per day [2]. The technique was partially computerized in the 1970s, automated in the 1990s, and was recently updated for the GOES-R series satellites [3].

•
The University of Wisconsin-Madison (UW) Space Science and Engineering Center (SSEC) developed a computerized algorithm to manually track clouds from the imagery of the first generation of geostationary weather satellites [4]. This technique was fully automated in the 1990s [5], and variations of this technique are still in use today.

•
The extraction of AMVs from Meteosat infrared imagery has been operational at the European Space Operations Center (ESOC) and the European Organization for the Exploitation of Meteorological Satellites (EUMETSAT) since the late 1970s. After several major improvements, the extraction technique reached a stage in 1991 where the majority of the vectors represented the local wind field [6]. The algorithm was updated in 2012, including the 'Cross-correlation contribution (CCC) method' to set the AMV altitude [7].
•
The GMS satellite was launched in July 1977 as the first Japanese geostationary meteorological satellite and operated from 1978 to 1984. The Japan Meteorological Agency/Meteorological Satellite Center (JMA/MSC) began routine AMV derivation from GMS imagery in 1978, providing AMVs twice a day using the Cloud Wind Estimation System (CWES) [8]. The derivation system was fully automated in 2003, and AMVs are today calculated every hour by the latest algorithm adapted to Himawari-8.
•
Atmospheric Motion Vectors have been produced operationally at the Brazil Center for Weather Prediction and Climate Studies (CPTEC/INPE) since 1998. The original algorithm was based on the routines developed by ESOC and adapted to use GOES-8 10.8 µm infrared images [9]. AMVs from the water vapor channel were added to the operational suite, followed in 2006 by the 3.9 µm and visible channels to estimate winds from the lower troposphere only [10]. More recently, the algorithm has been adapted to data from GOES-16.

•
The Korea Meteorological Administration (KMA) developed an AMV algorithm for the COMS satellite in 2003. It is now in charge of the AMV algorithm for the GEO-KOMPSAT-2 satellites [11].

•
The NWCSAF (Satellite Application Facility on Support to Nowcasting and Very Short Range Forecasting) delivered the first version of its AMV product 'HRW/High Resolution Winds' for MSG satellites inside the 2004 release of its software package 'NWC/MSG'. A major change occurred in 2011, with the inclusion of the 'CCC method' for the AMV height assignment.
Progressive extensions have made it possible to calculate AMVs with additional geostationary satellite series throughout the world (GOES-N in 2016, Himawari-8/9 in 2018, and GOES-R in 2019) [12].
However, it was not until 2008 that a coordinated effort was organized to compare the AMVs from the primary operational AMV producing centers. Overall, this resulted in a total of three 'Atmospheric Motion Vector (AMV) Intercomparison studies' over the last decade: Genkova et al. 2008 and 2010 [13,14], and Santek et al. 2014 [15].
The studies assessed how satellite-derived cloudy AMVs compared in terms of coverage, speed, direction, and cloud height.
The algorithms used for the AMV calculation, and the geostationary satellites with which this process is done, have evolved considerably since then, with a new generation of satellites offering additional spectral channels that bear new information on cloud microphysics, and a higher temporal resolution that allows a better understanding of the characteristics of the tracked features.
All this has motivated a new 'AMV Intercomparison study' in 2018, and the decision to formally publish the results of these Intercomparisons for the first time. The study is intended as an input for the development of all AMV algorithms, for their optimal use with this new generation of geostationary satellites.
Three main goals have been considered for this study: (1) To verify the advantages of the calculation of AMVs with the new generation of geostationary satellites that began with Himawari-8, with better spatial and temporal resolution and new spectral channels, with respect to those calculated with previous satellite series such as MSG. (2) To compute a 'Common Quality Indicator (CQI)' for all AMV algorithms, using the self-contained Fortran module defined by EUMETSAT and NOAA/NESDIS and distributed to all AMV producers by the International Winds Working Group (IWWG) in May 2017, to verify whether there is a better agreement between the different AMV datasets using this CQI. (3) To extract conclusions about the best options for the calculation of AMVs with this new generation of geostationary satellites, taking into account the options defined by the different centers for their AMV calculation.

Materials and Methods
The geostationary AMV algorithms defined by the following six AMV producers have been analyzed here: CPTEC/INPE (BRZ), EUMETSAT (EUM), JMA, KMA, NOAA/NESDIS (NOA), and NWCSAF (NWC). These three-letter abbreviations are used as identifiers of the AMV datasets throughout the remainder of this article. Full information on the AMV algorithms used here can be obtained from the "Operational AMV Product Survey", available at the "International Winds Working Group (IWWG)" webpage [16]. Summarizing these characteristics:

•
All algorithms are operational at the corresponding centers, although some of them are not run operationally with Himawari satellites (BRZ, EUM, and NOA).

•
Most products are distributed through the meteorological Global Telecommunications System (GTS), while some are produced locally (NWC).

•
Many output formats are available, although all AMV algorithms include the heritage Binary Universal Form for the Representation of meteorological data (BUFR) sequence, and most of them plan to implement the new 2018 BUFR sequence (3.10.077) for AMVs.

•
There is an important variability in the way Numerical Weather Prediction (NWP) data and other input products are used by the different AMV algorithms, and also in the way the AMV features are defined.

•
Tracers based on square features of different sizes ('target sizes') between 7 × 7 pixels and 31 × 31 pixels are used for the processing by the different AMV algorithms. The square features are located in a regular grid with nominal spacings between 12 km and 72 km at the subsatellite point for the different AMV algorithms, although some methods infer corrections to optimize the location of each feature.

•
The 'Cross-correlation method' is used for the AMV tracking by all centers except NOA, which uses the 'Sum of squared differences'. The 'Cross-correlation threshold' and the way the tracking is implemented are slightly different for each algorithm.

•
All algorithms except BRZ calculate, for this study, the AMV displacement as the average of two intermediate components; EUM and NOA additionally use the central image of each triplet as a reference, tracking backward for the first intermediate component and then forward for the second.

•
There is an important variability in the height assignment method used by the different centers.
For example, the 'Cross-correlation contribution (CCC)' method is used by EUM and NWC; the 'Effective black body temperature (EBBT)' with optional 'Water vapor/infrared ratioing' and 'CO2 slicing' corrections is used by BRZ and KMA; an optimal estimation method considering the observed infrared radiances, the vertical profile of the NWP wind, and the brightness temperature estimated with a radiative transfer model is used by JMA. Some thresholds and corrections are additionally implemented by each center in its specific implementation.

•
The 'Quality Indicator without Forecast (QINF)' (based on the 'EUMETSAT Quality Indicator' [17], with a specific implementation at each center) and the 'Common Quality Indicator without Forecast (CQI)' (using the common implementation and code described in the Introduction) are implemented in all AMV algorithms. The 'Quality Indicator with Forecast (QIF)' (also based on [17], with a specific implementation at each center) is implemented by all AMV algorithms except BRZ.

•
The 'quality indicator threshold' and several other quality checks are different for each algorithm. Additionally, other quality methods are included by some centers: the 'Expected error' (KMA and NOA) and the 'Orographic flag' (NWC).
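The tracking variants described in the bullets above (cross-correlation versus sum of squared differences, and the two-component scheme that uses the central image of the triplet as reference) can be illustrated with a minimal sketch. This is not any center's operational code; the box and search sizes and the function names are illustrative assumptions.

```python
import numpy as np

def ncc(a, b):
    """Normalised cross-correlation coefficient of two equal-size boxes."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom else 0.0

def track_ncc(src, dst, pos, box=8, search=6):
    """Displacement (di, dj) maximising the cross-correlation of the
    target box of `src` within the search area of `dst`."""
    i, j = pos
    h = box // 2
    tmpl = src[i - h:i + h, j - h:j + h]
    best, best_c = (0, 0), -2.0
    for di in range(-search, search + 1):
        for dj in range(-search, search + 1):
            cand = dst[i + di - h:i + di + h, j + dj - h:j + dj + h]
            if cand.shape != tmpl.shape:
                continue
            c = ncc(tmpl, cand)
            if c > best_c:
                best, best_c = (di, dj), c
    return best, best_c

def two_component_displacement(img0, img1, img2, pos, **kw):
    """Average of two intermediate components, using the central image of
    the triplet as reference: track backward to img0, then forward to img2."""
    (bi, bj), _ = track_ncc(img1, img0, pos, **kw)   # backward component
    (fi, fj), _ = track_ncc(img1, img2, pos, **kw)   # forward component
    # the backward component is sign-reversed so both point forward in time
    return ((fi - bi) / 2.0, (fj - bj) / 2.0)
```

On synthetic images where the feature moves a constant (2, 3) pixels per interval, the averaged two-component displacement recovers (2.0, 3.0); an operational 'Cross-correlation threshold' would additionally reject matches with a low `best_c`.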
The AMV outputs were originated at the different participating centers using the following input data:

•
Two triplets of Himawari-8/AHI infrared 10.4 µm full-disk images for 21 July 2016 at 0530-0550 UTC and 1200-1220 UTC, one of which is shown in Figure 1. Information was provided at each Himawari-8 pixel for the following variables: latitude, longitude, scanning time, satellite and solar zenith angles, radiance, and brightness temperature.

•
Auxiliary files, containing land/sea mask, land cover, and elevation for each Himawari-8 pixel.

•
European Centre for Medium-Range Weather Forecasts (ECMWF) ERA-Interim NWP analyses [18] for the given day, on 37 vertical levels every 6 h, on a regular latitude/longitude grid with a resolution of 0.5 degrees. Information was provided for the following variables: geopotential, surface pressure, temperature, dew point temperature, 2-m temperature, skin temperature, relative humidity, wind components, 10-m wind components, ozone mixing ratio, total column ozone, land/sea mask, sea/ice cover, snow depth, and mean sea level pressure. Differences can exist between NWP analysis sources, depending especially on the number of observations used in the assimilation process of each NWP model. For this study, it was assumed that the differences in the winds of the different AMV datasets are more significant than the differences caused by the use of one or another NWP model analysis. This assumption is confirmed by the results, in which the differences in the mean winds of the different AMV datasets are larger than the differences in the mean winds of different NWP analyses.
The Himawari-8/AHI satellite images used in the Intercomparison were equivalent to those used by the "International Cloud Working Group (ICWG)" for its "Cloud Intercomparison study", to improve synergies between both studies. The day was selected due to the variability of meteorological features found in the different regions.
The AMV outputs provided by each AMV algorithm were analyzed in three independent experiments, designed to measure differences related to specific aspects of the algorithms. Software tools used in the three previous Intercomparison studies (Genkova et al. 2008 and 2010 [13,14], and Santek et al. 2014 [15]) were used again, allowing for the comparison of the results of the different studies.
The AMV datasets produced by the different producers for the AMV Intercomparison, all software tools used for the verification of the AMV data, and the radiosonde data used for the AMV data validation, together with a text file describing how to process all these elements, have been made available in a zipped file (File S1), as described in the "Supplementary Materials" chapter.
The AMV output from each center for the experiments considered in this AMV Intercomparison study included data for the variables shown in Table 1.

Experiment 1
In this case, AMV producers extracted cloudy AMVs with the Himawari-8/AHI infrared 10.4 µm image triplet 1200-1220 UTC, using their best options for the AMV calculation, but with prescribed target box sizes (16 × 16 pixels), search scene sizes (54 × 54 pixels), and target locations (with a line and column separation of 16 pixels).
This way, differences in the AMV densities and in all AMV extraction processes (tracer selection, tracer tracking, height assignment, and quality control) could be compared, verifying exactly equivalent AMV datasets.
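The prescribed target locations of Experiment 1 can be enumerated with a short sketch. The handling of targets near the disk edge is an assumption (here, positions whose 54 × 54 pixel search scene would not fit inside the image are skipped); the study text only specifies the box size, search scene size, and 16-pixel separation.

```python
def target_locations(n_lines, n_cols, search=54, step=16):
    """Centres of the prescribed target boxes for Experiment 1:
    one target every `step` lines/columns, skipping positions whose
    `search` x `search` pixel search scene would fall outside the image."""
    half = search // 2
    return [(i, j)
            for i in range(half, n_lines - half, step)
            for j in range(half, n_cols - half, step)]
```

For a 100 × 100 pixel sub-image this yields a 3 × 3 grid of centres starting at (27, 27); on a full Himawari-8 disk the same rule produces the dense regular grid on which all centers computed exactly collocatable AMVs.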
Table 2 shows the parameter distribution for the AMV datasets, with a 'CQI threshold' of 50% (CQI >= 50%). This threshold is used in all AMV datasets to avoid the processing of 'low-quality AMVs'; in some elements of the Intercomparison, an additional 'CQI threshold' of 80% (CQI >= 80%) is used to verify its impact on the improvement of the validation parameters in the different AMV datasets.

•
The total number of AMVs ranges from 15,000 to 76,000, with the lowest values for NWC and the largest values for JMA, a factor of five between the numbers of AMVs of the two centers.
The number of AMVs is much larger with Himawari satellites than in the previous 'AMV Intercomparison' with the MSG satellite [15], which found up to approximately 10,000 AMVs. This is due, in part, to the higher resolution of Himawari/AHI (2 km) versus MSG/SEVIRI (3 km), and to the Himawari disk containing more oceanic regions and fewer deserts.

•
The minimum pressure ranges between 57 hPa for EUM and 125 hPa for JMA. The maximum pressure ranges between 965 hPa for NWC and 1004 hPa for EUM.

•
The percentages of AMVs by layer (with the low layer between 700 and 1000 hPa, the medium layer between 400 and 700 hPa, and the high layer between 100 and 400 hPa) are, respectively, 23% to 37%, 9% to 20%, and 43% to 67%. The main outlier is NOA at medium levels, with only 3%.
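The layer statistics above can be expressed compactly. The assignment of AMVs lying exactly on a layer boundary is an assumption (here each boundary pressure belongs to the lower-altitude side), as the text does not specify it.

```python
import numpy as np

# Pressure bounds (hPa) for the layers used in the Intercomparison
LAYERS = {"high": (100, 400), "medium": (400, 700), "low": (700, 1000)}

def layer_percentages(pressure_hpa, cqi, cqi_min=50):
    """Percentage of AMVs per layer after applying the CQI threshold.
    Boundary convention (assumed): p in [top, bottom) belongs to the layer."""
    p = np.asarray(pressure_hpa, float)
    keep = np.asarray(cqi, float) >= cqi_min
    p = p[keep]
    out = {}
    for name, (top, bottom) in LAYERS.items():
        in_layer = (p >= top) & (p < bottom)
        out[name] = 100.0 * in_layer.sum() / p.size if p.size else 0.0
    return out
```

For example, five AMVs at 200, 300, 500, 800, and 900 hPa with one of them below the CQI threshold give 25% high, 25% medium, and 50% low.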
Figure 2 shows the spatial localization of AMVs inside the Himawari-8 Full-Disk, and the distribution of parameters (speed, direction, pressure, and CQI), for the different AMV datasets with CQI >= 50%. The distributions are generally similar for all centers, with much better general agreement than in the previous 'AMV Intercomparison' [15], but with the following items to be taken into account: (1) The distribution of direction values for BRZ shows some directions much more frequent than others, with two anomalous peaks at 90° and 270°; this issue also partially seems to occur with NWC, with a possible anomalous peak at 90°. (2) The distribution of the CQI values is rather similar for all centers. This is not the case when the other quality indicators are considered: the QIF or QINF, for which the calculation results are quite different for the different centers (not shown). (3) The distribution of AMV pressure values is the most divergent for the different centers, due to the different AMV height assignment methods used; only EUM and NWC are similar, because both use the 'CCC method'.
Figure 3 shows the 'parameter plot' for different variables (speed, direction, pressure, CQI), for the collocated AMVs from the different centers, with CQI >= 50% and a collocation distance up to 55 km.
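A brute-force collocation of two AMV datasets within the 55-km distance might look as follows. The pairing strategy (nearest neighbour per AMV of the first dataset) is an assumption, since the text only specifies the maximum collocation distance.

```python
import math

def gc_distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance via the haversine formula (Earth radius 6371 km)."""
    rlat1, rlat2 = math.radians(lat1), math.radians(lat2)
    dlat = math.radians(lat2 - lat1)
    dlon = math.radians(lon2 - lon1)
    a = (math.sin(dlat / 2) ** 2
         + math.cos(rlat1) * math.cos(rlat2) * math.sin(dlon / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))

def collocate(amvs_a, amvs_b, max_km=55.0):
    """Pair each (lat, lon) AMV of dataset A with the nearest AMV of
    dataset B within the collocation distance (brute force, illustrative)."""
    pairs = []
    for ia, (lata, lona) in enumerate(amvs_a):
        best, best_d = None, max_km
        for ib, (latb, lonb) in enumerate(amvs_b):
            d = gc_distance_km(lata, lona, latb, lonb)
            if d <= best_d:
                best, best_d = ib, d
        if best is not None:
            pairs.append((ia, best, best_d))
    return pairs
```

An operational implementation would use a spatial index rather than the O(N²) loop, but the acceptance criterion (great-circle distance up to 55 km) is the same.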
Collocated AMVs in Figure 3 are sorted by increasing speed for NWC. Some specific elements are significant:

•
EUM has many high-speed outliers (blue dots in Figure 3a). BRZ also has a cluster of low-speed outliers (green dots at the far right of the plot). Differences in speed are not significant for the rest of the AMV producers.

•
Compared with the previous 'AMV Intercomparison' [15], the direction (Figure 3b) shows more easterlies even at high speeds, which is likely due to higher-speed jets in the southern hemisphere during mid-winter (July), compared to September for that study.

•
The pressure (Figure 3c) shows, in general, very few low-level AMVs with speeds greater than 20 m s−1. Compared with the previous 'AMV Intercomparison' [15], there is a large number of high-level AMVs with slow speeds, for which we cannot offer an explanation.

•
Comparing the CQI (Figure 3d) with that for the QINF (not shown), there is more homogeneity between centers. This provides some evidence that the CQI gives better agreement in the AMVs, with a tendency to retain higher CQI values. The generally lower CQI values for BRZ (green dots) might be related to the larger differences in the AMV pressure with the other centers. Some lower CQI values also occur for EUM (blue dots), which might be related to the differences in AMV speeds.
Figure 4 is a scatterplot of AMV pressures for collocated AMVs from the different centers versus EUM AMV pressures, showing substantial variability depending on the height assignment method used.
Three of the wind producers (NOA, EUM, NWC) use methods based on the Cloud products, with both EUM and NWC using the 'CCC method' and, therefore, exhibiting the most similarity. This similarity between EUM and NWC is evident in Figure 4, in which the corresponding magenta clusters concentrate on the diagonal. In spite of using the Cloud products, the black dots related to NOA AMVs behave very differently, with no AMVs between 450 and 700 hPa and many AMVs located in a very different layer than the related EUM AMVs.
The remaining centers use methods not based on the provided cloud product. Here, the red dots related to KMA AMVs are in relative agreement with EUM AMVs, but BRZ and JMA AMVs show many points plotted away from the diagonal: BRZ AMVs show large clusters of green dots above the diagonal (indicating AMVs generally lower in altitude than the corresponding EUM AMVs), and JMA AMVs show many yellow dots located at medium levels (compared to equivalent EUM low-level AMVs) and some located at very low levels (compared to equivalent higher EUM AMVs).
The AMV height assignment methods, updated with respect to those in the previous 'AMV Intercomparison' [15], exhibit more variability in the AMV pressure values than was shown there.
When the AMVs are compared to radiosonde winds (in Table 3 using the CQI threshold of 50%, and in Table 4 using the CQI threshold of 80%), the best results are for JMA (with a vector RMS of 6 m s−1), followed by NWC and NOA (with a vector RMS of 7-8 m s−1). BRZ and EUM show poor results for the low-quality threshold, but much better results for the high-quality threshold, for which there is more homogeneity between centers.
In addition, there are important differences in the number of AMVs for the different centers, despite all of them using the prescribed configuration. Validating the collocated AMVs against the ECMWF ERA-Interim NWP analysis winds (in Table 5 using the QINF threshold of 80%, and in Table 6 using the CQI threshold of 80%), the differences between centers are smaller, and smaller still using the CQI for the filtering. JMA again has the best results, and only the BRZ statistics are visibly worse than the rest.
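The metrics quoted in these validation tables (speed bias and vector RMS difference against a reference wind) are not spelled out in the text; the sketch below uses their standard definitions, which is an assumption about the exact formulas employed.

```python
import numpy as np

def validation_stats(u_amv, v_amv, u_ref, v_ref):
    """Common AMV validation metrics against a reference wind
    (radiosonde or NWP analysis): speed bias and vector RMS difference."""
    u_amv, v_amv = np.asarray(u_amv, float), np.asarray(v_amv, float)
    u_ref, v_ref = np.asarray(u_ref, float), np.asarray(v_ref, float)
    # speed bias: mean difference of wind speeds (AMV minus reference)
    speed_bias = (np.hypot(u_amv, v_amv) - np.hypot(u_ref, v_ref)).mean()
    # vector RMS difference: RMS of the magnitude of the vector difference
    vd2 = (u_amv - u_ref) ** 2 + (v_amv - v_ref) ** 2
    rmsvd = float(np.sqrt(vd2.mean()))
    return {"speed_bias": float(speed_bias), "rmsvd": rmsvd}
```

For a single AMV of (3, 4) m/s against a calm reference, both statistics equal 5 m/s; over a full dataset they summarize the agreement reported in Tables 3 through 6.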
Although not shown, the situation is similar for all AMVs altogether and for the AMVs in the high and medium layers. In the low layer, the BRZ statistics are better, leaving BRZ in an intermediate position and EUM in the last position.
Table 7 shows an additional comparison of collocated AMVs against the NWP analysis winds, in which a different CQI threshold is defined for each center so that a similar number of AMVs is kept for all of them (at least the 10,000 best AMVs). The CQI threshold ranges from 98% to 100% for all centers except NWC, for which it is 90%, as the total number of AMVs for NWC is much smaller than for the other centers in Experiment 1. Despite the different criteria considered for Table 7, the distribution of errors is not very different from that in Tables 3 and 4: JMA again has the best statistics, with NOA and NWC in the next positions, and the largest deviations are now split between BRZ and EUM.
The difference of the AMV pressure level from the AMV best-fit pressure level is considered in Figure 5. The best-fit pressure level is computed for each AMV according to the method described by Salonen et al. (2012) [19]. This method defines the pressure for which the background NWP model wind has a minimum vector difference with the AMV wind. It does this by first determining the NWP model pressure level with a minimum vector difference between the AMV wind and the NWP wind.
Then, the actual minimum is calculated by using a parabolic fit with the vector difference for this model level and the levels directly above and below, which must both be less than or equal to 4 m s−1, and at least 2 m s−1 smaller than the vector differences more than 100 hPa away from the best-fit pressure level. Therefore, this method is dependent on the model vertical resolution.
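The best-fit pressure search described above can be sketched as follows. This is a reading of the constraints as stated in the text, not the reference implementation of Salonen et al. (2012); in particular the rejection logic at the profile edges is an assumption.

```python
import numpy as np

def best_fit_pressure(p_levels, u_nwp, v_nwp, u_amv, v_amv,
                      max_vd=4.0, margin=2.0, far_hpa=100.0):
    """Best-fit pressure following the procedure described in the text:
    find the model level minimising the vector difference to the AMV wind,
    refine the minimum with a parabolic fit over the neighbouring levels,
    and reject the result when the vector-difference constraints fail."""
    p = np.asarray(p_levels, float)
    vd = np.hypot(np.asarray(u_nwp, float) - u_amv,
                  np.asarray(v_nwp, float) - v_amv)
    k = int(np.argmin(vd))
    if k == 0 or k == len(p) - 1:
        return None                      # no bracketing levels for the fit
    if max(vd[k - 1], vd[k], vd[k + 1]) > max_vd:
        return None                      # minimum vector differences too large
    far = np.abs(p - p[k]) > far_hpa     # levels more than 100 hPa away
    if far.any() and vd[far].min() < vd[k] + margin:
        return None                      # minimum not sharp enough
    # parabolic refinement of the minimum over the three bracketing levels
    x, y = p[k - 1:k + 2], vd[k - 1:k + 2]
    a, b, _ = np.polyfit(x, y, 2)
    return float(-b / (2 * a)) if a > 0 else float(p[k])
```

With a vector-difference profile shaped like a parabola with its minimum between two model levels, the refinement recovers the sub-level minimum; the dependence on the model vertical resolution noted in the text enters through the spacing of `p_levels`.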
Similar to previous studies, best-fit pressure level matches are found for about 30% of the AMVs: depending on the AMV dataset, 21% to 36% of the AMVs are adjusted to a best-fit pressure.
Results show an approximately Gaussian distribution of the variable 'AMV best-fit pressure level minus AMV pressure level' for all centers, which is more consistent than the one found in the previous 'AMV Intercomparison' [15].
The pressure difference is centered near zero (upper panels in Figure 5) and extends up to ±200 hPa. The exception is JMA, for which the deviation is up to ±100 hPa only. It is thus clear that JMA AMVs are much nearer to the AMV best-fit pressure level than all other datasets, although this might also indicate that the JMA AMV height assignment method has a stronger dependency on the NWP model background.
The maps in the lower panels of Figure 5 depict the best-fit displacements above (red) and below (blue) the AMV level for each AMV, which tend to be in similar locations for all centers for collocated AMVs.
In the northern hemisphere, generally, only high-level AMVs are adjusted, while, in the southern hemisphere, both high- and low-level clouds are adjusted to the best-fit level. An additional calculation is defined to evaluate the difference between the AMV wind (before and after the best-fit pressure level correction) and the NWP model wind. Results are shown in Tables 8 and 9 for all AMVs for which the best-fit pressure level correction could be applied, using, respectively, a QINF threshold of 80% and a CQI threshold of 80%.
The speed bias is reduced in the case of BRZ, due to the large value of this parameter before the best-fit pressure level correction.For the rest of the centers, the speed bias is already small before the correction, and so the impact of this correction on this parameter is not significant.
Finally, the speed standard deviation is reduced by about 70%-75% after the correction for all centers except JMA, for which the reduction is approximately 38%, as the JMA standard deviation before the best-fit level correction is nearly as good as that of the other centers after the correction.
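The before/after evaluation described above reduces to a speed bias and a speed standard deviation of the AMV-minus-NWP wind speeds. A minimal sketch of that calculation (the synthetic winds and all variable names are illustrative, not the study's data):

```python
import numpy as np

def speed_stats(amv_u, amv_v, nwp_u, nwp_v):
    """Speed bias and speed standard deviation of AMV winds against
    collocated NWP model winds (all components in m/s)."""
    diff = np.hypot(amv_u, amv_v) - np.hypot(nwp_u, nwp_v)
    return diff.mean(), diff.std()

# Synthetic example: smaller wind errors after the best-fit level correction.
rng = np.random.default_rng(0)
nwp_u, nwp_v = rng.normal(10.0, 5.0, 1000), rng.normal(0.0, 5.0, 1000)
bias0, sd0 = speed_stats(nwp_u + rng.normal(1.0, 4.0, 1000),
                         nwp_v + rng.normal(0.0, 4.0, 1000), nwp_u, nwp_v)
bias1, sd1 = speed_stats(nwp_u + rng.normal(0.2, 1.0, 1000),
                         nwp_v + rng.normal(0.0, 1.0, 1000), nwp_u, nwp_v)
print(f"speed SD reduced by {100 * (1 - sd1 / sd0):.0f}%")
```

The same function applied before and after reassigning each AMV to its best-fit pressure level yields the bias and standard-deviation reductions reported in Tables 8 and 9.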

Experiment 2
In this case, AMV producers extracted cloudy AMVs with the Himawari-8/AHI IR10.4 µm triplet 1200-1220 UTC, using their best options for the AMV calculation, and their specific configuration for target box sizes, search scene sizes, and target locations. This way, the differences of each AMV extraction process with respect to the previous prescribed configuration can be compared.
Table 10 shows the parameter distribution for the AMV datasets, with a CQI threshold of 50%. As in Experiment 1, this threshold is used in all AMV datasets to avoid the processing of 'low-quality AMVs'; in some elements of the Intercomparison, an additional CQI threshold of 80% is used to verify its impact on the improvement of the validation parameters in the different AMV datasets. The differences with Experiment 1 are caused by the specific configuration of each center for the target box sizes, search scene sizes, and target locations:

• The total number of AMVs ranges from 31,000 to 147,000, with the lowest value for EUM and the largest value for NWC, a factor of five between the numbers of AMVs of the two centers.

• Comparing with Experiment 1, the difference in the total number of AMVs is small for all centers (only up to 25%), except for NWC, for which the total number of AMVs is one order of magnitude larger. The main reason for this large increase in the number of NWC AMVs is its high density of tracers in its operational specific conditions (with the smallest nominal separation between tracers of all methods).

• The number of AMVs is much larger with Himawari satellites than in the previous 'AMV Intercomparison' with the MSG satellite [15], which found up to approximately 10,000 AMVs (although KMA was then able to calculate 50,000 AMVs and NWC 90,000 AMVs). As already said, this is due, in part, to the higher resolution of Himawari/AHI (2 km) versus MSG/SEVIRI (3 km), more oceanic regions, and fewer deserts.

• The minimum pressure ranges between 57 hPa for EUM and 125 hPa for JMA. The maximum pressure ranges between 972 hPa for NWC and 1010 hPa for EUM.

• The percentages of AMVs by layer (with the low layer between 700 and 1000 hPa, the medium layer between 400 and 700 hPa, and the high layer between 100 and 400 hPa) are, respectively, 25% to 34%, 12% to 22%, and 49% to 64%. The main outlier is again NOA at medium levels, with only 3%.
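The per-layer percentages quoted above follow from a simple binning of AMV pressures into the three layers. A minimal sketch, assuming pressures in hPa (the exact treatment of boundary values is an assumption):

```python
import numpy as np

def layer_percentages(pressure_hpa):
    """Percentage of AMVs in the high (100-400 hPa), medium (400-700 hPa),
    and low (700-1000 hPa) layers; boundary handling is illustrative."""
    p = np.asarray(pressure_hpa, dtype=float)
    counts = {
        "high": np.count_nonzero((p >= 100) & (p < 400)),
        "medium": np.count_nonzero((p >= 400) & (p < 700)),
        "low": np.count_nonzero((p >= 700) & (p <= 1000)),
    }
    return {layer: 100.0 * n / p.size for layer, n in counts.items()}

# Hypothetical pressure sample (hPa).
print(layer_percentages([150, 250, 350, 500, 850, 950]))
```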
Figure 6 shows the spatial localization of AMVs inside the Himawari-8 Full-Disk, and the distribution of parameters (speed, direction, pressure, and CQI), for the different AMV datasets with CQI >= 50%.
The parameter distributions are very similar to those for Experiment 1 in Figure 2, and so the same conclusions apply. Specifically, the distribution of the CQI values appears similar for all centers again. The differences in the height assignment process again drive the majority of the differences observed in the AMV datasets.
Figure 7 shows the 'parameter plot' for different variables (speed, direction, pressure, CQI), for the collocated AMVs from the different centers, with CQI >= 50% and a collocation distance up to 55 km.
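The collocation criterion used throughout these comparisons (AMVs within 55 km of each other) can be evaluated with a great-circle distance. A minimal sketch, assuming a spherical Earth of radius 6371 km (the study's exact geodesic formula is not specified):

```python
import math

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km between two points
    given in decimal degrees."""
    r = 6371.0  # mean Earth radius, km (assumed)
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = p2 - p1
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Two hypothetical AMV locations 0.4 degrees apart at the equator (~44 km).
print(great_circle_km(0.0, 0.0, 0.0, 0.4) <= 55.0)
```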

Collocated AMVs in Figure 7 are sorted by increasing speed for NWC. In general, Figure 7 is in agreement with Figure 3 for Experiment 1 of this study:

• The speed plot in Figure 7a shows that the EUM high-speed outliers (blue dots) and the BRZ cluster of low-speed outliers (green dots at the far right of the plot), detected in Experiment 1, are seen again.

• The CQI plot in Figure 7d shows that there is again a tendency to retain high CQI values for all centers except BRZ, and in some cases also EUM. Their lower CQI values (green and blue dots in the plot) might again be related to the differences in AMV pressure and AMV speed, respectively, with the other centers.

The other graphs in Figure 7 are, however, dominated by the nearly 150,000 AMVs from NWC, because of which additional information cannot be extracted.
Figure 8 shows the scatterplot of AMV pressures for collocated AMVs from the different centers versus EUM AMV pressures. There is again an excellent agreement between EUM and NWC (magenta clusters), and many points plotted away from the diagonal (from the other centers). NOA again has no AMVs between 450 and 700 hPa, and many of its AMVs are away from the diagonal (at both higher and lower altitudes).
There are again two clusters of green dots related to BRZ above the diagonal (with AMVs lower in altitude as compared to EUM), although now there are also some green dots related to BRZ below the diagonal (at higher altitudes). KMA AMVs also now show a cluster of red dots above the diagonal at high levels (lower in altitude).
Finally, JMA AMVs show the same behavior as in Experiment 1, with many yellow dots located at medium levels (compared to equivalent EUM low-level AMVs) and some located at very low levels (compared to equivalent EUM higher AMVs).
When the AMVs are compared to radiosonde winds (in Table 11 using the threshold of 50%, and in Table 12 using the threshold of 80% for the CQI), the best results are again for JMA (with a vector RMS of 6 m s−1), followed by NOA and NWC (with a vector RMS of 7-8 m s−1). EUM results are much better in Experiment 2, using their specific configuration. The higher quality threshold in Table 12 again contributes to more homogeneity in the statistics of the different centers.
Validating the collocated AMVs against the ECMWF ERA-Interim NWP analysis winds, in Table 13 with QINF >= 80%, and in Table 14 with CQI >= 80%, the differences between centers are smaller. In Tables 13 and 14, JMA again has the best results, and only the BRZ statistics are visibly worse. Although not shown, the situation is similar for all AMVs altogether and for the AMVs in the high and the medium layers. In the low layer, BRZ statistics are better, leaving it in an intermediate position and leaving KMA in the last position.
Table 15 shows an additional comparison of collocated AMVs against the NWP analysis winds, in which a different CQI threshold is defined for each center so that a similar number of AMVs is kept for all of them (at least the 10,000 best AMVs). The CQI threshold ranges from 98% to 100% for all the centers. Despite the different criteria considered for Table 15, the distribution of errors is similar to those in Tables 11-14: JMA again has the best statistics, with NOA and NWC in second position. The largest deviations are again related to BRZ.
The difference of the AMV pressure level with the AMV best-fit pressure level, computed as described in Figure 5, is now considered in Figure 9. Similarly, a best-fit pressure level match is found for 22% to 38% of the AMVs. The results again show the approximately Gaussian distribution of the variable 'AMV best-fit pressure level − AMV pressure level' for all centers.
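The per-center threshold selection used for Table 15 (keeping at least the 10,000 best AMVs of each center) can be sketched as follows; the function name and the synthetic CQI sample are illustrative:

```python
import numpy as np

def cqi_threshold_for_top_n(cqi, n_keep):
    """Highest CQI threshold (in percent) that still keeps at least
    the n_keep best-quality AMVs."""
    cqi = np.sort(np.asarray(cqi, dtype=float))[::-1]  # descending
    return cqi.min() if cqi.size <= n_keep else cqi[n_keep - 1]

# Hypothetical CQI sample for one center.
rng = np.random.default_rng(1)
cqi = rng.uniform(0.0, 100.0, 200_000)
thr = cqi_threshold_for_top_n(cqi, 10_000)
print(f"CQI threshold keeping the 10,000 best AMVs: {thr:.1f}%")
```

Applying this selection per center equalizes the sample sizes before the statistics are compared, as done for Table 15.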
The pressure difference is again centered near zero (upper panels in Figure 9), extending up to ±200 hPa, except for JMA, for which the deviation is up to ±100 hPa only. This again makes clear that JMA AMVs are much closer to the AMV best-fit level than all other datasets, although possibly with a stronger dependency on the NWP model background. The maps in the lower panels of Figure 9 again depict the best-fit displacements above (red) and below (blue) the AMV level for each AMV; again, they tend to be in similar locations for all centers for collocated AMVs.
An additional calculation to evaluate the difference between the AMV wind (before and after the best-fit pressure level correction) and the NWP model wind is again defined. Results are shown in Tables 16 and 17 for all AMVs for which the 'best-fit pressure level correction' could be applied, using, respectively, a threshold of 80% for the QINF and for the CQI. The speed bias is reduced in the cases of BRZ, KMA, and NWC. For the rest of the centers, the speed bias is already small before the correction, so the impact of this correction on this parameter is not significant.
Considering the speed standard deviation, it is again reduced by around 70%-75% after the correction for all centers except JMA, for which the reduction is around 38%, as the JMA standard deviation before the best-fit level correction is nearly as good as that of the other centers after the correction.

Experiment 3
In this case, producers extracted AMVs with the Himawari-8/AHI infrared 10.4 µm triplet for 21 July 2016 0530-0550 UTC, using their best options for the AMV calculation and their specific configurations for target box size, search scene size, and target location (as in Experiment 2).
This dataset is used for validation against NASA's CALIPSO (Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation), which provides an independent measurement of cloud top height.
CALIPSO is a line-of-sight measurement, so there are few collocations with AMVs (tens of matches only). Therefore, this evaluation is qualitative, as illustrated in Figures 10 and 11. The thresholds for determining collocations are a distance of 0.1 degrees in latitude and longitude and a time of one hour.
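The collocation test just described (within 0.1 degrees in latitude and longitude, and within one hour in time) can be sketched as follows; the tuple layout is an assumption for illustration:

```python
from datetime import datetime, timedelta

def is_collocated(amv, calipso, max_deg=0.1, max_dt=timedelta(hours=1)):
    """Collocation test used in Experiment 3: within 0.1 degrees in both
    latitude and longitude, and within one hour in time.
    `amv` and `calipso` are hypothetical (lat, lon, time) tuples."""
    d_lat = abs(amv[0] - calipso[0])
    d_lon = abs(amv[1] - calipso[1])
    d_t = abs(amv[2] - calipso[2])
    return d_lat <= max_deg and d_lon <= max_deg and d_t <= max_dt

# Hypothetical AMV and CALIPSO footprint, 30 minutes apart.
amv = (-20.05, 130.02, datetime(2016, 7, 21, 5, 40))
calipso = (-20.00, 130.00, datetime(2016, 7, 21, 6, 10))
print(is_collocated(amv, calipso))  # True
```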
All previous analyses of variables, the validation against radiosonde and NWP analysis wind data, and the best-fit analysis done for Experiment 2 are not repeated here, as both experiments are, as said, equivalent.
Figure 10 depicts the portion of the orbit from Western Australia to Mongolia (red line) for which the CALIPSO data for 21 July 2016, 0530-0550 UTC, are used in Experiment 3. As seen in Figure 10b, clouds are primarily in the tropics and the northern part of the pass. On the other hand, over Australia and the adjacent ocean to the north, there are few or no clouds, resulting in a limited number of AMVs for comparison with CALIPSO.
Figure 11 depicts, for the different centers, the clouds as detected by CALIPSO (color-coded from gray to red by emissivity).
The tropopause height is designated by the black line across the top of the graph (near the top of the cirrus clouds). The topography of the Earth's surface is the light blue line in the lower part of the graph. In each of the panels, the collocated AMVs for each center are shown as black asterisks, primarily located near the base of the cirrus cloud feature in the middle of the graph, and at the top of the low-level and medium-level clouds located in the central and right side of the graph.
Overall, the results are in general agreement with the best-fit analysis in Experiment 2; in general, the high-level AMVs need to be adjusted higher in the atmosphere.
AMV heights for the different centers are in good agreement in this specific example, and in apparent disagreement with the previous AMV-pressure scatter plots.However, this is a very small sample, and no attempt has been made to ensure that the AMVs are collocated with each other: data from each center are plotted independently from the other ones.

Similarities in the AMV Datasets
To quantify the differences observed in the AMV datasets in Experiments 1 and 2, a 'paired t-test' is used with all combinations of AMV datasets and four considered parameters (speed, direction, pressure, and quality indicator), to determine whether the differences in the parameters are statistically significant for collocated pairs of AMVs. This is done because there is no 'ground truth' for AMVs, and one of the goals of this study is to determine the similarity of the AMVs from the different centers. The statistics are computed using the 'MATLAB paired t-test' function, which performs a 't-test' of the hypothesis that the data come from a distribution with mean zero. The data used here are the differences in each parameter from each pair of AMV datasets; therefore, a mean of zero is expected.
The AMVs used with this t-test are quality controlled, retaining only those with QINF >= 50% and QINF >= 80%, and also those with CQI >= 50% and CQI >= 80%. For collocation, the distance used here between AMVs is up to 35 km.
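The test can be reproduced outside MATLAB; the sketch below uses SciPy's one-sample t-test on the per-pair parameter differences, which is equivalent to a paired t-test (the function name and synthetic speeds are illustrative):

```python
import numpy as np
from scipy import stats

def paired_parameter_test(param_a, param_b, alpha=0.05):
    """Paired t-test on one parameter (e.g. speed) from two collocated
    AMV datasets: is the mean of the pairwise differences zero?"""
    diff = np.asarray(param_a, dtype=float) - np.asarray(param_b, dtype=float)
    t_stat, p_value = stats.ttest_1samp(diff, 0.0)
    # True -> difference NOT statistically significant at level alpha
    return bool(p_value > alpha), float(p_value)

# Hypothetical collocated speeds (m/s) from two centers.
rng = np.random.default_rng(2)
speed_a = rng.normal(15.0, 5.0, 5000)
speed_b = speed_a + rng.normal(0.0, 0.5, 5000)  # no systematic offset
not_significant, p = paired_parameter_test(speed_a, speed_b)
```

Running the test for every pair of centers and every parameter yields the counts of non-significant combinations reported in Tables 18 and 19.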
Tables 18 and 19 show, respectively, for Experiment 1 (with the prescribed configuration) and Experiment 2 (with the specific configuration), the pairs of combinations with parameter differences that are not statistically significant. A total of 15 combinations is defined per parameter (considering all combinations of AMV datasets), and so a total of 60 combinations is defined in each whole test. The most consistent agreement is for the AMV direction with the prescribed configuration and the high quality-indicator thresholds. The similarities then reduce progressively for the AMV speed, the AMV quality indicator, and the AMV pressure. The lack of agreement in the AMV pressure is likely due to the lack of commonality in the AMV height assignment algorithms; although both the EUM and NWC centers use the 'CCC method' for the height assignment, there are still significant differences in this variable even for these centers.
In general, more homogeneity between variables in different datasets is seen when using the same prescribed configuration, a higher threshold for the 'quality indicator' in the AMV filtering, and the specific QINF calculated by each center.
This last result seems to be in contradiction with elements shown in the previous sections, in which the CQI provided more homogeneity in the AMV statistics of different centers (comparing, for example, Tables 5 and 6).
The result here could, however, derive from the fact that, with the CQI, the AMV data samples for all centers are larger, and with this, the 'paired t-test' has more difficulty showing a mean of zero.

Discussion
The following are general observations and recommendations from this 'AMV Intercomparison Study':
(1) Experiments 1 and 2 only differ in terms of a prescribed versus specific configuration. Overall, the results of the two experiments are nearly the same in terms of bulk statistics, collocated statistics, comparisons to radiosonde winds and NWP analysis winds, and the best-fit analysis, with the main variations related to the number of calculated AMVs. This implies that the impact of using a prescribed versus a specific configuration is small.
(2) For the experiments, the centers used the 'AMV height assignment' method of their choice. This height assignment method has the biggest impact on the lack of agreement in the AMVs.
(3) The 'Common Quality Indicator (CQI)' has been introduced in this Intercomparison, and it shows some skill in filtering collocated AMVs, resulting in an improved statistical agreement.
(4) Another first for this study has been the addition of CALIPSO data to provide an independent measurement of cloud height, used for comparison and validation of the AMV heights. Due to the small sample of collocated observations, only a qualitative assessment could be made: AMV heights are generally near the cloud base for the high-level thin cirrus clouds, and near the cloud top for the low-level and medium-level clouds detected by CALIPSO. For the next round of AMV Intercomparison studies, a comparison of AMVs with MISR winds is planned for additional checks of the AMVs and the AMV heights.
(5) Even though there is a large variation in the AMV heights for collocated winds, the best-fit analysis and the CALIPSO comparison both show that equivalent, high-quality winds are produced by all centers. This might indicate that the quality-control post-processing by each center is an important step in filtering out poor-quality AMVs, resulting in a higher quality product. For future Intercomparison studies, an experiment to substantiate this conjecture should be considered.

Conclusions
The following sections detail the findings from the experiments for each AMV producer independently, including their strengths and weaknesses as determined by the results of the experiments.

(a) BRZ-Brazil Center for Weather Prediction and Climate Studies (CPTEC/INPE)
The performance of the BRZ algorithm has improved with respect to the results of the previous AMV Intercomparison. The results are more in agreement with the rest of the AMV centers, and the differences in many of the variables have been reduced. In any case, it is clear that there is still room for improvement. On the one hand, there are still large differences in the AMV height assignment compared to all other centers; this results in BRZ often having the largest deviations from radiosonde winds and NWP analysis winds. On the other hand, BRZ still has to verify the differences found in the direction histograms, in which some directions are found to be much more frequent than others.

(b) KMA-Korea Meteorological Administration
The KMA algorithm performs similarly to the results of the previous AMV Intercomparison. Overall, the comparisons to radiosonde winds and NWP analysis winds are in the middle of the distributions.

• Cloud products derived by NOAA/NESDIS for the given time slots. Information was provided for the following variables: surface type, surface elevation, land class, cloud probability, cloud mask, cloud type, cloud phase, and cloud top pressure. However, due to the different cloud assignment methods and the way cloud data are used in them by the different AMV algorithms, only EUM, NOA, and NWC made actual use of these NOAA/NESDIS cloud products.

Figure 3 .
Figure 3. 'Parameter plot' for collocated AMVs from the different AMV datasets in Experiment 1, with CQI >= 50% and a collocation distance up to 55 km. Considered variables are speed (a), direction (b), pressure (c), and CQI (d). Color codes correspond to BRZ in green, EUM in blue, JMA in yellow, KMA in red, NOA in black, NWC in magenta.


Figure 4
Figure 4 is a scatterplot of AMV pressures for collocated AMVs from the different centers versus EUM AMV pressures, showing substantial variability depending on the height assignment method used.

Figure 4 .
Figure 4. Scatterplot of collocated AMV pressures for Experiment 1, using CQI >= 50% and a collocation distance up to 55 km, for each center versus EUM AMV pressures (BRZ in green; JMA in yellow; KMA in red; NOA in black; NWC in magenta).


Figure 5 .
Figure 5. Histogram (upper) and maps (lower) of the difference 'AMV best-fit pressure level − AMV pressure level' for (a) BRZ, (b) EUM, (c) JMA, (d) KMA, (e) NOA, (f) NWC. In the maps, red shows the best-fit pressure at a higher level; blue shows the best-fit pressure at a lower level.


Figure 7 .
Figure 7. 'Parameter plot' for collocated AMVs from the different AMV datasets in Experiment 2, with CQI >= 50% and a collocation distance up to 55 km. Considered variables are speed (a), direction (b), pressure (c), and CQI (d). Color codes correspond to BRZ in green, EUM in blue, JMA in yellow, KMA in red, NOA in black, NWC in magenta.


Figure 8
Figure 8 shows the scatterplot of AMV pressures for collocated AMVs from the different centers versus EUM AMV pressures.

Figure 8 .
Figure 8. Scatterplot of collocated AMV pressures for Experiment 2, with CQI >= 50% and a collocation distance up to 55 km, for each center versus EUM AMV pressures (BRZ in green; JMA in yellow; KMA in red; NOA in black; NWC in magenta).

Figure 8
Figure 8 is in agreement with Figure 4 for Experiment 1 of this study, but with a much larger number of AMV collocations between centers.

Figure 9 .
Figure 9. Histogram (upper) and maps (lower) of the difference 'AMV best-fit pressure level − AMV pressure level' for (a) BRZ, (b) EUM, (c) JMA, (d) KMA, (e) NOA, (f) NWC. In the maps, red shows the best-fit pressure at a higher level; blue shows the best-fit pressure at a lower level.



Figure 10 .
Figure 10. CALIPSO (Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation) data for 21 July 2016 at 0530-0550 UTC, used in Experiment 3: (a) Ground track over a map of the West Pacific Ocean, (b) Ground track over the corresponding Himawari-8 RGB image, (c) Ground track over the corresponding Himawari-8 derived cloud top height, (d) Statistics for the CALIPSO data.

Figure 11 depicts, for the different centers, the clouds as detected by CALIPSO (color-coded from gray to red by emissivity). The tropopause height is designated by the black line across the top of the graph (near the top of the cirrus clouds). The topography of the Earth's surface is the light blue line in the lower part of the graph. In each of the panels, the collocated AMVs for each center are shown as black asterisks, primarily located near the base of the cirrus cloud feature in the middle of the graph, and at the top of the low-level and medium-level clouds located in the central and right side of the graph. Overall, the results are in general agreement with the best-fit analysis in Experiment 2; in general, the high-level AMVs need to be adjusted higher in the atmosphere. AMV heights for the different centers are in good agreement in this specific example, and in apparent disagreement with the previous AMV-pressure scatter plots. However, this is a very small sample, and no attempt has been made to ensure that the AMVs are collocated with each other: data from each center are plotted independently of the others.

Table 1. Reported variables for the AMVs (Atmospheric Motion Vectors) in all datasets.

Table 2. Parameter distribution for the different AMV datasets in Experiment 1 with CQI >= 50%.

Table 3. Experiment 1: Comparison of AMVs, with CQI >= 50%, to radiosonde winds within 150 km. (N = Number of Matches; Pre Bias = Pressure Bias; Pre RMS = Pressure Root Mean Square Error; Spd Bias = Wind Speed Bias; Spd RMS = Wind Speed Root Mean Square Error; Dir Bias = Wind Direction Bias; Vec RMS = Wind Vector Root Mean Square Error). The extremes for each category are highlighted: in yellow, the worst value, and in green, the best value.
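The speed, direction, and vector statistics used throughout these comparison tables can be sketched as follows; the helper name `wind_stats` and the meteorological direction convention shown here are illustrative assumptions, not the study's validation code:

```python
import numpy as np

def wind_stats(amv_u, amv_v, ref_u, ref_v):
    """Speed bias, speed RMS, direction bias, and vector RMS of
    AMV winds against reference winds (radiosonde or NWP)."""
    amv_spd = np.hypot(amv_u, amv_v)
    ref_spd = np.hypot(ref_u, ref_v)
    spd_bias = float(np.mean(amv_spd - ref_spd))
    spd_rms = float(np.sqrt(np.mean((amv_spd - ref_spd) ** 2)))
    # Meteorological direction: degrees the wind blows FROM
    amv_dir = np.degrees(np.arctan2(-amv_u, -amv_v)) % 360.0
    ref_dir = np.degrees(np.arctan2(-ref_u, -ref_v)) % 360.0
    ddir = (amv_dir - ref_dir + 180.0) % 360.0 - 180.0  # wrap to [-180, 180)
    dir_bias = float(np.mean(ddir))
    vec_rms = float(np.sqrt(np.mean((amv_u - ref_u) ** 2
                                    + (amv_v - ref_v) ** 2)))
    return spd_bias, spd_rms, dir_bias, vec_rms
```

The direction difference is wrapped to [-180, 180) so that, for example, 350 degrees versus 10 degrees counts as a 20-degree difference rather than 340.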

Table 5. Experiment 1: Comparison of collocated AMVs, with QINF >= 80%, to NWP analysis winds. (N = Number of AMVs; NBF = Number of AMVs with Best-Fit Pressure Value; VD = Vector Difference for all AMVs; VDBF = Vector Difference for AMVs with Best-Fit Pressure Value; RMS = Root Mean Square Error for all AMVs; RMSBF = Root Mean Square Error for AMVs with Best-Fit Pressure Value). The extremes for each category are highlighted: in yellow, the worst value, and in green, the best value.

Table 7. Experiment 1: Comparison of all AMVs to NWP analysis winds, with a CQI threshold chosen so that at least 10,000 AMVs are kept. (CQI = Common Quality Indicator Threshold used for each center; N = Number of AMVs; NBF = Number of AMVs with Best-Fit Pressure Value; VD = Vector Difference for all AMVs; VDBF = Vector Difference for AMVs with Best-Fit Pressure Value; RMS = Root Mean Square Error for all AMVs; RMSBF = Root Mean Square Error for AMVs with Best-Fit Pressure Value). The extremes for each category are highlighted: in yellow, the worst value, and in green, the best value.
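Selecting, per center, the highest CQI threshold that still retains at least 10,000 AMVs can be sketched as follows (a hypothetical helper, scanning whole-percent thresholds from strict to loose):

```python
import numpy as np

def cqi_threshold(cqi, min_count=10000):
    """Highest whole-percent CQI threshold that still retains at
    least min_count AMVs (0 if even the unfiltered set is smaller)."""
    for thr in range(100, -1, -1):
        if np.count_nonzero(cqi >= thr) >= min_count:
            return thr
    return 0
```

Because CQI distributions differ between centers, each center ends up with its own threshold, which is why the tables report the CQI value used alongside the statistics.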

Table 8. Experiment 1: Speed Bias (BIAS) and Speed Standard Deviation (STD) with respect to the NWP analysis winds, before and after the best-fit pressure level correction, for all AMVs for which the best-fit pressure level correction could be applied, with QINF >= 80%. The extremes for each category are highlighted: in yellow, the worst value, and in green, the best value.

Table 10. Parameter distribution for the different AMV datasets in Experiment 2 with CQI >= 50%.

Table 11. Experiment 2: Comparison of AMVs, with CQI >= 50%, to radiosonde winds within 150 km. (N = Number of Matches; Pre Bias = Pressure Bias; Pre RMS = Pressure Root Mean Square Error; Spd Bias = Wind Speed Bias; Spd RMS = Wind Speed Root Mean Square Error; Dir Bias = Wind Direction Bias; Vec RMS = Wind Vector Root Mean Square Error). The extremes for each category are highlighted: in yellow, the worst value, and in green, the best value.

Table 13. Experiment 2: Comparison of collocated AMVs, with QINF >= 80%, to NWP analysis winds. (N = Number of AMVs; NBF = Number of AMVs with Best-Fit Pressure Value; VD = Vector Difference for all AMVs; VDBF = Vector Difference for AMVs with Best-Fit Pressure Value; RMS = Root Mean Square Error for all AMVs; RMSBF = Root Mean Square Error for AMVs with Best-Fit Pressure Value). The extremes for each category are highlighted: in yellow, the worst value, and in green, the best value.

Table 15. Experiment 2: Comparison of all AMVs to NWP analysis winds, with a CQI threshold chosen so that at least 10,000 AMVs are kept. (CQI = Common Quality Indicator Threshold used for each center; N = Number of AMVs; NBF = Number of AMVs with Best-Fit Pressure Value; VD = Vector Difference for all AMVs; VDBF = Vector Difference for AMVs with Best-Fit Pressure Value; RMS = Root Mean Square Error for all AMVs; RMSBF = Root Mean Square Error for AMVs with Best-Fit Pressure Value). The extremes for each category are highlighted: in yellow, the worst value, and in green, the best value.

Table 16. Experiment 2: Speed Bias (BIAS) and Speed Standard Deviation (STD) with respect to the NWP analysis winds, before and after the best-fit pressure level correction, for all AMVs for which the best-fit pressure level correction could be applied, with QINF >= 80%. The extremes for each category are highlighted: in yellow, the worst value, and in green, the best value.

Table 18. Similarities found in the AMV datasets by the 'paired t-test' for the different combinations of variables and AMV datasets, in Experiment 1 (with prescribed configuration).

Table 19. Similarities found in the AMV datasets by the 'paired t-test' for the different combinations of variables and AMV datasets, in Experiment 2 (with specific configuration).