NASA Global Satellite and Model Data Products and Services for Tropical Meteorology and Climatology

Abstract: Satellite remote sensing and model data play an important role in research and applications of tropical meteorology and climatology over vast, data-sparse oceans and remote continents. Since the first weather satellite was launched by NASA in 1960, a large collection of NASA’s Earth science data is freely available to the research and application communities around the world, significantly improving our overall understanding of the Earth system and environment. Established in the mid-1980s, the NASA Goddard Earth Sciences Data and Information Services Center (GES DISC), located in Maryland, USA, is a data archive center for multidisciplinary, satellite and model assimilation data products. As one of the 12 NASA data centers in Earth sciences, GES DISC hosts several important NASA satellite missions for tropical meteorology and climatology such as the Tropical Rainfall Measuring Mission (TRMM), the Global Precipitation Measurement (GPM) Mission and the Modern-Era Retrospective analysis for Research and Applications (MERRA). Over the years, GES DISC has developed data services to facilitate data discovery, access, distribution, analysis and visualization, including Giovanni, an online analysis and visualization tool without the need to download data and software. Despite many efforts for improving data access, a significant number of challenges remain, such as finding datasets and services for a specific research topic or project, especially for inexperienced users or users outside the remote sensing community. In this article, we list and describe major NASA satellite remote sensing and model datasets and services for tropical meteorology and climatology along with examples of using the data and services, so this may help users better utilize the information in their research and applications.


Introduction
Satellite remote sensing and model data play an important role in research and applications of tropical meteorology and climatology, ranging from case studies to model development, over vast, data-sparse oceans and remote continents. Since the first weather satellite was launched by NASA in 1960 [1], a large amount of Earth science data have been collected by NASA and distributed to the research, application and operation communities around the world, free of charge, significantly improving our overall understanding of the Earth system and environment, not only in the tropics but also in the rest of the Earth [2,3].
The atmosphere in the tropical regions, commonly defined as the latitude belt between 23.5° N and 23.5° S, is known to be "barotropic" because temperature is nearly uniform in the horizontal.

Satellite Missions, Projects and Data
As one of the 12 discipline-specific Distributed Active Archive Centers (DAACs) [22] managed by NASA's Earth Observing System Data and Information System (EOSDIS) [23], GES DISC [9] archives and supports datasets applicable to several NASA Earth Science focus areas, including atmospheric composition, water and energy cycles, and climate variability. Datasets are also applicable to the carbon cycle and ecosystems. As of this writing, GES DISC has archived ~140 million data files with a total data volume of 3157 TB. Since 2010, over 3 billion data files and over 31,159 TB of data have been distributed. Due to space limitations, only brief summaries of satellite missions and projects related to tropical meteorology and climatology are presented. More detailed information can be found on their mission or project websites.

TRMM
Launched in November 1997, TRMM is a NASA-JAXA (the Japan Aerospace Exploration Agency) research satellite mission [10][11][12][13][14] to better understand processes in tropical meteorology and climatology. TRMM provides observational data with footprints between 40° N-S. The first space-borne Ku-band precipitation radar on TRMM provides 4-dimensional datasets of rainfall and latent heating over vast tropical oceans and remote continents, which are extremely valuable (over 17 years of observations) not only for individual weather event case studies but also for tropical climatology, especially in conjunction with other ancillary datasets or observations [24]. TRMM has helped researchers better understand tropical rainfall properties and variation [24] including (1) rainfall intensity and spatial distribution; (2) partition of rainfall categories (from stratiform and convective clouds); (3) vertical distribution or profiles of hydrometeors; (4) diurnal and seasonal variation of rainfall amount and intensity; (5) lightning distribution and variation; and (6) 4-dimensional latent heating profiles. Table 1 lists precipitation and lightning instruments onboard TRMM, including the Visible and Infrared Scanner (VIRS), the TRMM Microwave Imager (TMI), the Precipitation Radar (PR) and the Lightning Imaging Sensor (LIS). Datasets from VIRS, TMI and PR are archived at GES DISC. The lightning datasets from LIS are hosted by the Global Hydrology Resource Center (GHRC) [25]. To increase both spatial and temporal sampling, datasets from additional infrared and passive microwave sensors on other domestic and international satellites have been included, and algorithms have been developed and integrated for multi-sensor, multi-algorithm, and multi-satellite precipitation products such as the TRMM Multi-Satellite Precipitation Analysis (TMPA) [26,27].
Over the years, the TMPA datasets, available at 3-hourly, daily and monthly resolutions in both near-real-time and research-grade versions, have been widely used in tropical meteorology and climatology as well as in other disciplines such as diseases and food security [24].
The instruments on TRMM were turned off on 8 April 2015 and the spacecraft deorbited on 15 June 2015. Over 17 years of valuable TRMM data have been collected and are available for weather and climate research.

GPM
As the TRMM follow-on mission, the GPM mission [15][16][17][18] was launched in February 2014, extending the spatial coverage of TRMM measurements from 40° N-S to 70° N-S. In addition to the aforementioned scientific goals of TRMM, the GPM mission is designed to improve measurements of both light rain and falling snow, which have been long-standing challenges in satellite-based precipitation estimates. A new frequency band (Ka-band) is added to the Dual-frequency Precipitation Radar (DPR), in addition to the Ku-band radar that existed on TRMM (Table 2), to improve light rain measurement. The frequency channels in the GPM Microwave Imager (GMI) have been expanded to include several high frequencies for improving falling snow estimates in high latitudes. GPM also serves as a core observatory satellite in the GPM satellite constellation for inter-satellite sensor calibration (e.g., the inter-calibrated passive microwave (PMW) brightness temperatures), which is crucial for developing integrated multi-satellite retrieval products such as the Integrated Multi-satellitE Retrievals for GPM (IMERG) [28,29]. The GPM launch schedule was made to ensure there would be an overlapping period, especially with TRMM, for inter-satellite calibration and consistency in datasets, which is imperative for validating and deriving climatological datasets. The IMERG precipitation product suite consists of datasets in three categories: the Early Run (latency: ~4 h), the Late Run (latency: ~12-14 h), and the Final Run (latency: ~3.5 months). Compared to TMPA, both spatial and temporal resolutions of IMERG have been significantly improved, from 0.25 deg. × 0.25 deg. to 0.1 deg. × 0.1 deg. and from 3-hourly to half-hourly, respectively. The Early Run can be used in supporting near-real-time operations or research such as flood forecasting. The Late Run, with more observations available for improving precipitation estimates, can be used in applications such as monitoring crop conditions.
The Final Run, which is research-grade with improvements in quality control such as incorporating rain gauge data from the Global Precipitation Climatology Centre (GPCC) [30] for bias correction, is suitable for research. IMERG also provides over two decades of global precipitation data for various studies including the development of baseline precipitation products, seasonal to inter-annual studies, and more. The production of TMPA ended in December 2019. The data are still available at GES DISC for various purposes, but users are encouraged to use IMERG, the much-improved successor product suite.
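The resolution change from TMPA to IMERG quoted above can be turned into a rough sampling comparison. The following is a minimal sketch, not part of either algorithm: it counts grid values per day on a common 40° N-S belt (the TRMM footprint, chosen here only for illustration), and the function names are hypothetical.

```python
def grid_shape(res_deg, lat_south=-40.0, lat_north=40.0):
    """Return (n_lon, n_lat) for a regular lat/lon grid spanning all longitudes.

    The 40 N-S belt is an illustrative assumption, not a product definition.
    """
    n_lon = round(360.0 / res_deg)
    n_lat = round((lat_north - lat_south) / res_deg)
    return n_lon, n_lat


def samples_per_day(res_deg, step_hours):
    """Number of gridded values produced per day at the given resolutions."""
    n_lon, n_lat = grid_shape(res_deg)
    return n_lon * n_lat * round(24.0 / step_hours)


tmpa_like = samples_per_day(0.25, 3.0)   # 0.25 deg, 3-hourly
imerg_like = samples_per_day(0.1, 0.5)   # 0.1 deg, half-hourly
ratio = imerg_like / tmpa_like

print(grid_shape(0.25), grid_shape(0.1), ratio)
```

On the same area, the finer grid gives 6.25 times as many cells and the shorter time step gives 6 times as many maps per day, so the number of values to store or download grows by a factor of 37.5.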

NOAA National Centers for Environmental Prediction (NCEP)/Climate Prediction Center (CPC) Merged IR Project
The 4 km, pixel-resolution, half-hourly global (60° N-S) IR brightness temperature dataset is generated by NOAA NCEP CPC [31]. Data from several domestic and international geostationary satellites (NOAA GOES-8/10, EUMETSAT Meteosat-8/5 and JMA-JAXA GMS) are merged into this dataset. The dataset is used to derive global IR-based precipitation estimates which are used in NOAA operations [32]. Since the TRMM era, the merged IR dataset has been a part of the multi-satellite precipitation estimates. In the GPM era, the IR-based precipitation estimates continue to be merged into the IMERG product suite [28,29].
The IR dataset can be used in weather studies and model development, including constraint and verification. The dataset is available from 1 January 1998 to the present. Single images or animations can be made with this dataset to track weather events in the data domain [33].

MERRA-2 and Global and Regional Land Data Assimilation Projects
The Modern-Era Retrospective analysis for Research and Applications, version 2 (MERRA-2) is a reanalysis product suite developed by NASA's Global Modeling and Assimilation Office (GMAO) [19]. MERRA-2 includes new observation types in the assimilation that were not available in the previous version (MERRA), along with updates to the Goddard Earth Observing System (GEOS) model and analysis scheme [19]. Meteorological variables (e.g., winds, temperature, pressure, humidity), which are crucial for weather and climate analysis in the tropics, are available in MERRA-2. The spatial coverage of MERRA-2 is global (90° N-S). The temporal resolution ranges from hourly and 3-hourly to monthly. The spatial resolution is 0.625 deg. × 0.5 deg. in longitude and latitude. The MERRA-2 product suite extends from 1 January 1980 to the present. Over the years, various MERRA-2 assessment activities have been conducted (e.g., precipitation [34]).
The NASA land data assimilation projects produce global and regional land datasets that consist of quality-controlled, spatially and temporally consistent, land-surface model (LSM) datasets from the best available observations and model output to support modeling activities [35], by ingesting satellite- and ground-based observations and using advanced land surface modeling and data assimilation techniques in the regional and global land data assimilation systems (LDAS) [35]. The temporal and spatial resolutions vary and so do the spatial coverage and the data availability (Table 3).

TROPICS
Datasets from several future satellite missions that are related to tropical meteorology and climatology will be archived and distributed at GES DISC, including the Time-Resolved Observations of Precipitation Structure and Storm Intensity with a Constellation of Smallsats (TROPICS) Mission [36] and the Aerosol and Cloud, Convection and Precipitation (ACCP) Mission [37].
As mentioned earlier, satellite-based global precipitation estimates depend on a constellation of both domestic and international satellites to increase spatial and temporal samplings. Multi-satellite precipitation datasets like IMERG mainly rely on high-quality PMW datasets collected from the satellite constellation. IR datasets are used to fill in data gaps when and where PMW data are not available. The TROPICS mission [36], consisting of six cube-shaped satellites and led by the MIT Lincoln Laboratory, will add more PMW observations in the tropics; however, their new high-frequency channels need to be further studied before they can be integrated into multi-satellite precipitation algorithms such as IMERG. The TROPICS mission will be the first NASA science mission to be implemented with CubeSats and the first proliferated CubeSat constellation mission funded by the U.S. government.

ACCP
Aerosols, clouds, convection and precipitation are closely related and their linkages are still not well understood. The ACCP mission [37] is designed to respond to the Earth System Science themes, science and application questions, and several high priority objectives, highlighted in the National Academies of Sciences, Engineering and Medicine (NASEM) 2017 Decadal Survey [38], "Thriving on Our Changing Planet: A Decadal Strategy for Earth Observations from Space". The major themes in the decadal survey include: (a) climate variability and change and (b) weather and air quality. The observables in the ACCP mission may potentially address additional themes such as marine and terrestrial ecosystems, global hydrological cycle, Earth surface and interior. ACCP datasets that are under development will be archived and distributed by GES DISC. The mission is still in the planning stage.

Datasets for Tropical Meteorology and Climatology
NASA Earth science datasets are categorized into four levels [39]: (1) Level-0 datasets are reconstructed, unprocessed instrument and payload data at full resolution, with the removal of any and all communications artifacts, e.g., synchronization frames, communications headers, and duplicate data; Level-1 datasets are reconstructed, unprocessed instrument data at full resolution, time-referenced, and annotated with ancillary information; (2) Level-2 datasets are derived geophysical variables at the same resolution and location as Level-1 source data; (3) Level-3 datasets consist of variables mapped on uniform space-time grid scales, usually with some completeness and consistency; and (4) Level-4 datasets are model output or results from the analyses of lower-level data (e.g., variables derived from multiple measurements). Datasets addressed in this article mainly consist of Levels 2-4. Table 3 lists the datasets from major satellite missions and projects described above for tropical meteorology. First, Level-2 orbital precipitation datasets from TRMM and GPM missions are listed, including instantaneous surface and profile precipitation from the radars (TRMM PR and GPM DPR) and passive microwave imagers (TMI and GMI) as well as their combined datasets (PR and TMI, DPR and GMI). These datasets are provided at the sensors' native spatial resolutions and are suitable for case studies, ground validation activity, and more. The merged precipitation suite, IMERG, which is the successor of TMPA, is popular in many areas of research and applications. The NCEP CPC merged IR dataset (Table 3) is normally used for tracking system movement and development due to its high spatial and temporal resolutions. Other important datasets include 2D and 3D winds, sea surface temperature, sea-level pressure, air temperature and humidity from MERRA-2 and AIRS. The North American Land Data Assimilation System (NLDAS) and MERRA-2 provide soil moisture and surface runoff datasets over land.
Aerosols, ozone and other trace gases are from OMI, SeaWiFS, and MERRA-2 (Table 3).
Climatological datasets can be derived from these datasets as well and examples are given in the next section, but caution is needed when processing data for climatological use, for several reasons. First, the life spans of some satellite missions are short relative to the World Meteorological Organization (WMO) 30-year definition of climatology [40]. A dataset collection such as TMPA or IMERG may contain data from several missions. Evaluation and corrections are typically needed to ensure satellite sensor consistency within the same mission and/or across follow-up missions. For multi-satellite datasets, cross-satellite and cross-sensor calibrations are needed to minimize the systematic differences between satellites and sensors. Furthermore, bias correction with ground truth or observations is a common practice in precipitation algorithms which produce products such as TMPA and IMERG.
Beginning with TRMM Version 8, the GPM algorithms are used in reprocessing. As a result, TRMM and its constellation data have become part of the GPM dataset family. Major changes include the TRMM data format and the file naming conventions, both of which are now consistent with those of GPM. The dataset format is HDF5.
The datasets described above are primarily archived and distributed at GES DISC with emphasis on water and energy cycles. Satellite, airborne and in situ datasets are also available at other NASA Earth Science data centers or universities for tropical meteorology and climatology. The NASA A-Train program consists of a set of eight coordinated satellites (Aqua, Aura, PARASOL, CALIPSO, CloudSat, GCOM-W1, OCO-2 and Glory) [41]. The NASA Physical Oceanography Distributed Active Archive Center (PODAAC) archives ocean winds, salinity, ocean temperature and more [42]. The Jet Propulsion Laboratory (JPL) has developed a tropical cyclone information system that provides near-real-time data and visualization for multi-parameter satellite and airborne observations [43]. In addition to the TRMM lightning data mentioned above, a large collection of NASA field campaign datasets is available at NASA GHRC [25]. More information about these projects and datasets can be found on their websites. A single website, to be described next, has been developed to allow searching and accessing data from NASA DAACs.

Dataset and Information Search
Given so many datasets available for users (see Table 3), a question then arises: how can users find these datasets easily and quickly? A state-of-the-art Web-based interface has been developed by GES DISC to facilitate dataset and information search and discovery [9]. Figure 1 shows the Web search interface. In addition to a dataset search, users can search other information including data collections (default), data documents, alerts, data in action, data releases, FAQs, glossary, how-to's, image gallery, news, publications, service release, and tools [44]. Data collection search allows searching datasets using the exact name or keyword (e.g., precipitation, GPM, TRMM). Search suggestions are provided in the search box to facilitate the search process. Users can select temporal and spatial ranges or browse datasets by categories such as subject, measurement, source, processing level, etc. Search results consist of a list of datasets with links to their landing pages and sample browse images. A filtering capability is provided so that search results can be refined by subject, measurement, source, processing level, etc. Users can also sort the results by source, version, temporal and spatial resolutions, processing level, begin and end dates. A dataset landing page contains a brief summary, data citation, documentation, and data access where users can find available data services for this dataset such as HTTPS (Hypertext Transfer Protocol Secure), OPeNDAP (Open-source Project for a Network Data Access Protocol), and Giovanni. Data downloading services are available for all datasets, while subsetting services are also provided for most datasets.
For a research project (e.g., investigating an MJO event), users often need multiple datasets (e.g., precipitation, winds, temperature). However, datasets are not grouped together by research area for distribution, and one has to find each individual dataset in its respective archival collection, which can be a time-consuming and sometimes difficult task if the user is not familiar with the dataset. To help users effectively find the datasets they want or need, a prototype called Datalist, which contains datasets or variables recommended by the supporting staff at GES DISC, was originally developed for hurricanes [45,46] and has been expanded for tropical meteorology and climatology. Similar prototypes for events such as floods, droughts, and wind energy are also being considered. With the hurricane Datalist, users can discover related datasets and information in one place, and can also subset and download the data. The Datalist concept thus provides convenience to users who want to search and access multiple datasets.
The newly released "My Dashboard" (seen at the top right corner in Figure 1) allows users to keep a search history and a list of favorites in their registered accounts [9]. Users can bookmark their favorite datasets for convenience. With the saved search history, users do not need to remember how the datasets they want were searched and filtered. Such a dataset download history service is a feature similar to the purchase history service found on shopping websites. In addition, these personal links can also be tagged and shared with other users when this service is implemented in the future.
Data documents are searchable, as are the datasets and service-related information such as alerts, data in action, news, FAQs, how-to's and publications. Short articles (e.g., data in action, news) use events to describe the use of a single or multiple datasets from one or more NASA data centers. FAQs contain answers for frequently asked questions, while how-to's provide step-by-step data recipes. Some how-to's can be downloaded in the Jupyter Notebook format. Key dataset-related publications are available. More dataset-related referral research papers are being developed.
For datasets archived and distributed at other NASA DAACs, users can search them through the NASA Earthdata homepage [47]. Search results can also be filtered and the retrieved information will guide users on where and how to download data and the associated documents.
Finally, all users are required to register with Earthdata [47], which is a simple, easy and one-time process. Once registered, users can access data and services at all NASA DAACs. Details about how to download data with different software packages (e.g., wget, curl) are available [48].
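Command-line tools such as wget and curl commonly read Earthdata credentials from a netrc file. The following is a minimal sketch of that file's one-line layout, using only the Python standard library and a temporary file. The host name urs.earthdata.nasa.gov is assumed here to be the Earthdata Login host and should be checked against the current documentation [48]; the helper names and the example credentials are hypothetical.

```python
import netrc
import os
import tempfile

# Assumption: Earthdata Login host name; verify against current NASA documentation.
EARTHDATA_HOST = "urs.earthdata.nasa.gov"


def write_netrc(path, user, password, host=EARTHDATA_HOST):
    """Write a single-entry netrc file and restrict its permissions."""
    with open(path, "w") as f:
        f.write(f"machine {host} login {user} password {password}\n")
    os.chmod(path, 0o600)  # credentials file should not be world-readable


def read_credentials(path, host=EARTHDATA_HOST):
    """Return (login, password) for the host, or raise KeyError if absent."""
    auth = netrc.netrc(path).authenticators(host)
    if auth is None:
        raise KeyError(f"no netrc entry for {host}")
    login, _account, password = auth
    return login, password


# Demonstration with placeholder credentials in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "netrc_demo")
    write_netrc(demo_path, "example_user", "example_password")
    demo_login, demo_password = read_credentials(demo_path)

print(demo_login)
```

In practice the entry would live in the user's own netrc file rather than a temporary one, and the download tool would pick it up automatically.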

Data Subsetting, Format Conversion and Other Data Services
Spatial data subsetting services (e.g., Table 4) are necessary for many satellite- or model-based datasets to reduce the data download volume, speed up the data download process and increase the server performance, especially because most datasets are global and the spatial resolution of some datasets can be high. For example, each IMERG half-hourly dataset contains over 340,000 files, as of this writing, with the 0.1 deg. × 0.1 deg. spatial resolution available from June 2000 onwards. It can be time consuming to download all the original archived files due to the large volume (over 3 TB). Spatial subsetting services are especially helpful because they considerably reduce the data download volume and thus the download time. Likewise, many GPM Level-2 datasets are quite large in file size, such as the GPM dual-frequency radar dataset (over 8 TB for a single dataset collection).
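The figures quoted above can be checked with simple arithmetic. The sketch below counts half-hourly files over a date range and estimates the fraction of a global equal-angle grid that a regional box covers. The end date (1 December 2019) is an assumption made only to illustrate the "over 340,000 files" figure, and the box is the MJO study area used later in this article; the function names are hypothetical.

```python
from datetime import date


def half_hourly_file_count(start, end):
    """Number of files in [start, end) at one file per half hour."""
    return (end - start).days * 48


def subset_fraction(lon_w, lon_e, lat_s, lat_n):
    """Fraction of the cells of a global equal-angle grid inside a lat/lon box."""
    return ((lon_e - lon_w) * (lat_n - lat_s)) / (360.0 * 180.0)


# Assumed end date; the start date follows the June 2000 start stated above.
n_files = half_hourly_file_count(date(2000, 6, 1), date(2019, 12, 1))

# Box 60-100 E, 15 S-15 N.
frac = subset_fraction(60.0, 100.0, -15.0, 15.0)

print(n_files, round(100.0 * frac, 2))
```

For a box of this size, a spatial subset carries under 2% of the grid cells of each global file, which is why subsetting before download matters for long time series.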
The data subsetting services (Table 4) at GES DISC offer different subsetting options, depending on dataset level. Three download methods are currently available: original files, OPeNDAP and the GES DISC subsetter. The first option downloads files straight from the dataset archive, with no subsetting or any other change to the original files. The second option uses OPeNDAP for parameter and spatial subsetting, and the output format is NetCDF (Network Common Data Form) or ASCII. The third option, the GES DISC subsetter, has three choices for spatial subsetting: a single point, a circular area and a rectangular latitude-longitude box. The output format is HDF5 for Level-2 and NetCDF for Level-3. For Level-3 datasets, the third option also provides conversion to over 30 different grid structures (e.g., GPCP, TMPA) with four interpolation methods (bilinear, bicubic, distance-weighted averaging, and nearest neighbor).

Table 4. Data services at GES DISC (note: data services may not be available for some datasets).

Service | Description
Online Archive | Original archived data files accessible only by the Hypertext Transfer Protocol Secure (HTTPS).
Giovanni | Analyze and visualize gridded data interactively online without having to download any data and software.
OPeNDAP | The Open-source Project for a Network Data Access Protocol (OPeNDAP) provides remote access to individual variables within datasets in a form usable by many tools, such as the Integrated Data Viewer (IDV), the Man computer Interactive Data Access System-5th generation (McIDAS-V), Panoply, Ferret, and the Grid Analysis and Display System (GrADS).
GrADS Data Server (GDS) | Stable, secure data server that provides subsetting and analysis services across the Internet. The core of GDS is OPeNDAP (also known as the Distributed Oceanographic Data Systems (DODS)), a software framework used for data networking that makes local data accessible to remote locations.
OGC Web Map Service | The Open Geospatial Consortium (OGC) Web Map Service (WMS) provides map depictions over the network via a standard protocol, enabling clients to build customized maps based on data coming from a variety of distributed sources.
Data Rods for Hydrology | Provide time series data in American Standard Code for Information Interchange (ASCII) for hydrology datasets.
Other data services listed in Table 4 include online archive, Giovanni, OPeNDAP, GrADS Data Server (GDS), OGC Web map services, and data rods. Original archived data can be directly downloaded via HTTPS. OPeNDAP is quite popular in research and application communities, and many tools, such as Panoply, can access data through this protocol. GDS is popular in the meteorological community and the data can be imported in Panoply as well. OGC Web services (e.g., WMS, WCS) are convenient for users in the GIS community. For users who want long time series data, data rod services [49] can be very helpful to avoid downloading a large amount of data. For example, there are over 340,000 half-hourly files for IMERG Final Run and downloading them to create a time series can be a difficult job. Data rods for IMERG are currently under development.
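OPeNDAP subsetting is expressed as a constraint appended to the data URL, in which each array dimension is given an inclusive index range in square brackets. The following is a minimal sketch of how a latitude-longitude box might be translated into such a constraint for a 0.1 deg. global grid. The variable names (precipitation, lon, lat), the dimension order (time, lon, lat), the grid origin and the URL are all assumptions for illustration; the actual names and order must be taken from the dataset's own metadata.

```python
import math


def index_range(lo, hi, origin, res, n_cells):
    """Inclusive index range of the grid cells overlapping [lo, hi].

    Cell i is assumed to span [origin + i*res, origin + (i+1)*res).
    Rounding guards against floating-point noise in the division.
    """
    first = math.floor(round((lo - origin) / res, 6))
    last = math.ceil(round((hi - origin) / res, 6)) - 1
    first = max(first, 0)
    last = min(last, n_cells - 1)
    if first > last:
        raise ValueError("box does not overlap the grid")
    return first, last


def constraint(var, lon_rng, lat_rng, time_index=0):
    """Build a DAP2-style constraint for one time step of a (time, lon, lat) array."""
    lon_part = f"[{lon_rng[0]}:{lon_rng[1]}]"
    lat_part = f"[{lat_rng[0]}:{lat_rng[1]}]"
    return (f"{var}[{time_index}:{time_index}]{lon_part}{lat_part},"
            f"lon{lon_part},lat{lat_part}")


# Box 60-100 E, 15 S-15 N on an assumed 3600 x 1800 grid starting at 180 W, 90 S.
lon_rng = index_range(60.0, 100.0, -180.0, 0.1, 3600)
lat_rng = index_range(-15.0, 15.0, -90.0, 0.1, 1800)
ce = constraint("precipitation", lon_rng, lat_rng)

# Hypothetical URL; ".ascii" is a conventional DAP2 response suffix.
url = "https://example.invalid/opendap/path/to/granule.HDF5" + ".ascii?" + ce
print(url)
```

Tools such as Panoply build equivalent requests behind the scenes, so a hand-written constraint is mainly useful in scripts that fetch many granules.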

Giovanni
Being able to evaluate and visualize data, preferably without downloading data and software, has become quite popular. For some, handling and processing satellite remote sensing data can be a daunting task due to a number of issues such as data format or data structure. Giovanni [20,21] is a Web-based tool that helps users analyze and visualize satellite data without downloading data and software, and it includes frequently used Level-3 datasets at GES DISC [50]. As of this writing, over 1500 variables are available in Giovanni [50]. Figure 2 shows the Web interface or landing page of Giovanni [50]. A dataset normally contains structured variables with metadata. Users usually use one or more variables in their studies. Unlike the data collection search mentioned earlier, only selected variables from their datasets are presented and analyzed in Giovanni, which helps the data analysis and visualization because users in general are more familiar with variable names. However, their associated dataset names and landing pages are linked in case extra information is needed. Searching, filtering and sorting make finding a variable easier. For example, a search for "precipitation" in Giovanni returns 104 precipitation-related variables. Users can filter the results based on different criteria (e.g., spatial and temporal resolutions). Twenty-two functions are available for analysis and visualization and they are categorized into five groups: maps, comparisons, vertical, time series, and miscellaneous. Users can select a time span and spatial domain for analysis. Shapefiles including countries, U.S. states, land/sea masks, and major watersheds are included. Unit conversion is available for some popular variables such as precipitation.
Figure 2. The landing page of Giovanni, developed by GES DISC to facilitate data analysis and visualization without downloading data and software. A keyword search box is provided for searching over 1500 variables, with more variables to be added. As the cloud version of Giovanni is being developed, better performance and more data from other Distributed Active Archive Centers (DAACs) are expected.
Although few climatology datasets are generated from their archived datasets, a function called monthly and seasonal averages in Giovanni [50] provides a fast and easy way to generate monthly and seasonal averages from over 1500 parameters for a quick look or evaluation. Users can compute monthly or seasonal averages using different time ranges to see their differences.
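The idea of a multi-year monthly or seasonal average can be illustrated with a short calculation. The sketch below averages monthly mean precipitation rates over a chosen set of months and years, weighting each month by its number of days. The weighting scheme is an assumption made here for illustration, since how Giovanni weights months is not stated in this article, and the input values are synthetic.

```python
import calendar


def period_average(monthly, months, first_year, last_year):
    """Day-weighted average of monthly means over the given months and years.

    `monthly` maps (year, month) to a monthly mean rate (e.g., mm/day).
    """
    total = 0.0
    days = 0
    for (year, month), value in monthly.items():
        if month in months and first_year <= year <= last_year:
            n_days = calendar.monthrange(year, month)[1]
            total += value * n_days
            days += n_days
    if days == 0:
        raise ValueError("no data in the requested period")
    return total / days


# Synthetic monthly mean rates (mm/day) for one grid cell, two years.
monthly = {}
for year in (2001, 2002):
    monthly[(year, 6)] = 6.0
    monthly[(year, 7)] = 9.0
    monthly[(year, 8)] = 7.5

jja = period_average(monthly, {6, 7, 8}, 2001, 2002)
july = period_average(monthly, {7}, 2001, 2002)
difference = july - jja   # "July minus JJA" for this cell

print(round(jja, 3), july, round(difference, 3))
```

Changing first_year and last_year corresponds to recomputing the averages over different time ranges to see how they differ.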
In the output window, options can be used to fine tune the graphic such as adjusting the min/max values of the color bar. There is a list of different color bars for selection. Maps can be zoomed in or out. Data can be downloaded in different popular formats (e.g., NetCDF, GeoTIFF, and CSV (Comma-Separated Values)). Data lineage information is available and the datasets used in the processing steps are listed for download.

Panoply
Only Level-3 datasets are included in Giovanni. Another powerful and free tool, Panoply [51], developed by NASA GISS, is available for viewing both Level-2 and 3 datasets. Data files can be imported in Panoply or via remote URL access such as OPeNDAP or GDS. There are many features built in Panoply for data analysis and visualization. For analysis, simple map combinations (e.g., subtraction) are included. Vector plots (e.g., wind) are supported. For visualization, features include scale, maps, overlays, shading, contours, vectors and labels. Output can be saved in different popular formats (e.g., PNG, KMZ (Keyhole Markup language Zipped)).

Examples
Two examples of using datasets and services at GES DISC are presented here. They do not cover all available datasets; rather, they show how datasets and services can facilitate research activities.

An MJO Event during October and November 2011
The evolution, structure, and spatial variability of an MJO event have been studied by Xu and Rutledge [52] with data from the DYNAMO (Dynamics of the Madden-Julian Oscillation) ship-borne radar and TRMM. In this example, several datasets during Phase 2 (20-29 October) in the area (60°-100° E and 15° N-S) were extracted from the data archive and plotted in Figures 3 and 4. Most variables have been averaged during 20-29 October to be consistent with the analysis in the study [52], and the others consist of accumulated rainfall and a snapshot of the merged IR brightness temperature.
Accumulated rainfall can be obtained with Giovanni. Figure 3a shows that over 1000 mm of accumulated rainfall was found in the study area during the 10 day period. Compared to the 3B42 maximum rainfall (~300 mm) in Figure 2 of [52], the rainfall from the IMERG Final Run far exceeds that of 3B42 in this particular case. Over land, biases in the IMERG Final Run are largely corrected by gauge data from GPCC. By contrast, over oceans, due to the lack of gauge observations, biases are not corrected. Figure 3b shows the MERRA-2 surface skin temperature map averaged during the same period. The surface skin temperatures in the heavy rainfall area and to its north are high compared to the area in the south. A snapshot (generated with Panoply) of the brightness temperatures from the merged IR dataset is shown in Figure 3c, consisting of clusters of convective cells of different sizes. Animation files can be made from this high temporal resolution (30 min), 4 km dataset to show the evolution of these cloud clusters (not shown).
More dataset plots can be generated with Giovanni. Figure 4 shows the conditions at 850 hPa for the same Phase 2 MJO event, including wind, air temperature, geopotential height, and specific humidity. Strong westerly winds between 60° and 70° E near the Equator are surrounded by easterly winds (Figure 4a), and the heavy accumulated rainfall area is found in the convergent area where both winds meet. Air temperatures are cooler in the rainy area than in the surrounding areas (Figure 4b). A low geopotential height area is centered to the north of the strong westerly wind zone in Figure 4c, and the anticlockwise flow pattern is visible in the wind vector map (Figure 4a) as well. Specific humidity is low in the strong westerly wind zone (Figure 4d) compared to the rest of the area, and is generally high along the convergent area. All these features are similar to what has been described in a typical MJO schematic diagram (e.g., [53]).
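As an illustration of how an accumulated rainfall value such as that in Figure 3a can be derived from a half-hourly product, a minimal sketch is given below. It assumes that rain rates are stored in mm/hr at 30 min intervals and that missing samples are flagged with a fill value; the function name, the fill value and the sample rates are hypothetical and are not taken from Giovanni or from the IMERG files themselves.

```python
def accumulate_rainfall(rates_mm_per_hr, step_hours=0.5, fill_value=-9999.9):
    """Sum rain rates (mm/hr) sampled every `step_hours` into a total in mm.

    Samples equal to `fill_value` (an assumed missing-data flag) are skipped.
    Returns the total in mm and the number of valid samples used.
    """
    total_mm = 0.0
    valid = 0
    for rate in rates_mm_per_hr:
        if rate == fill_value:
            continue  # missing sample: contributes nothing to the total
        total_mm += rate * step_hours  # mm/hr * hr = mm
        valid += 1
    return total_mm, valid


# Hypothetical rates for one grid cell over three hours (six half-hour steps).
rates = [0.0, 4.0, 10.0, -9999.9, 6.0, 0.0]
total, n_valid = accumulate_rainfall(rates)
print(f"{total} mm from {n_valid} valid samples")
```

Returning the count of valid samples alongside the total makes it possible to judge how much of the period is actually covered before the accumulation is compared with another product.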

Monthly and Seasonal Averages
Most datasets do not come with their monthly averages or climatology. To avoid confusion with the WMO standard definition of climatology, which is an average of a variable over a 30 year period [40], only the terms monthly or seasonal averages are used here. As mentioned, for satellite-based datasets, developing a consistent dataset or climatology for climate studies is quite a complicated process. For example, corrections are often needed to remove systematic differences across multiple satellite missions or satellites in a constellation. With Giovanni, users can generate monthly or seasonal averages from a list of variables for a quick evaluation. Users can also define the length of the averaging period. Figure 5 contains two maps of the monthly and seasonal averages of the IMERG Final Run. Figure 5a is the seasonal average map for June, July and August (JJA) during the period 2000-2019 and Figure 5b shows the corresponding average map for the month of July only. Users can visit the NASA GPM mission website for more seasonal information including animation [54]. Figure 5c shows their difference (July minus JJA) map, generated with the data downloaded from Giovanni and imported into Panoply for analysis and visualization. Large differences are found in some regions such as the Indian subcontinent, especially along the mountain range of the Western Ghats (Figure 5c).
Likewise, similar average maps can be readily produced for other variables in Giovanni (e.g., in Figure 6). Figure 6a shows the surface skin temperatures in JJA (2000-2019) over the region between 30° N-S, with smaller temperature gradients found over oceans than over land. Cloud top pressure in Figure 6b consistently shows that most convective activity occurs north of the Equator during the boreal summer season (JJA). Surface wind speeds in JJA are shown in Figure 6c, revealing the well-known Somali Jet [4]. Figure 6d is the dust column mass density map for JJA, showing several regions (e.g., the Sahara) with high dust density.
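The July-minus-JJA comparison in Figure 5c can be sketched for a single grid cell as follows. This is a simplified illustration under stated assumptions: the input is a set of hypothetical monthly mean values keyed by year and month, each month is given equal weight, and the function name is invented for this example. The averaging actually performed inside Giovanni may differ (e.g., in how month lengths or missing data are handled).

```python
def mean_over_months(monthly, months):
    """Average the values whose calendar month is in `months`.

    `monthly` maps (year, month) -> value for one grid cell.
    Every selected month is weighted equally (an assumption of this sketch).
    """
    selected = [value for (year, month), value in monthly.items() if month in months]
    if not selected:
        raise ValueError("no data for the requested months")
    return sum(selected) / len(selected)


# Hypothetical monthly mean rain rates (mm/day) for one grid cell.
monthly = {
    (2000, 6): 6.0, (2000, 7): 12.0, (2000, 8): 9.0,
    (2001, 1): 1.0,
    (2001, 6): 8.0, (2001, 7): 14.0, (2001, 8): 11.0,
}

jja_mean = mean_over_months(monthly, {6, 7, 8})   # seasonal average
july_mean = mean_over_months(monthly, {7})        # single-month average
difference = july_mean - jja_mean                 # July minus JJA
print(jja_mean, july_mean, difference)
```

Applying the same calculation at every grid cell yields a difference map; changing the set of years in `monthly` corresponds to choosing a different averaging period.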

Summary
In this article, major NASA satellite missions and projects and their datasets for tropical meteorology and climatology were introduced. TRMM provides measurements from the first space-borne Ku-band precipitation radar, the PMW instrument, VIRS and LIS. The follow-up GPM Mission adds a new Ka-band to the Ku-band in DPR to improve light rain estimates. Several high-frequency channels have been added in GMI to improve falling snow measurements. MERRA-2 provides a suite of meteorological variables that are important for studying weather events in the tropics, while climatological conditions can be derived from the ~40 year collection of data. Future satellite missions have also been introduced. The to-be-launched TROPICS mission will improve sampling in the tropics with several high-frequency channels and will likely add more observations for research and monitoring activities. However, further research is needed to incorporate these channels, which are currently not available in merged precipitation products such as IMERG. The ACCP mission, which is in the planning stage, aims to improve our understanding of interactions among aerosols, clouds, convective activity and precipitation.
Several important datasets have been introduced. Level-2 TRMM and GPM datasets, including precipitation, radar reflectivity, hydrometeor profiles and inter-calibrated PMW brightness temperatures, are suitable for case studies, algorithm development, and merged products from the constellation of domestic and international satellites. Level-3 TRMM and GPM merged precipitation datasets are more popular, such as TMPA and its successor, IMERG (Early, Late and Final Run) with improved spatial and temporal resolutions beginning from the TRMM era. Climatology and diurnal variation datasets can be derived from this ~20 year data collection. The much improved IMERG provides new insights on the spatial and temporal distributions as well as the diurnal variation of precipitation not only in the tropics but also in the rest of the globe. The NOAA NCEP CPC merged IR brightness temperature is another important dataset for tropical meteorology. The merged IR can be used to study meteorological processes due to its high spatial and temporal resolutions. NASA MERRA-2 and LDAS datasets provide a model reanalysis of many meteorological variables (e.g., wind, temperature, pressure, humidity) that are essential for tropical meteorology. Other variables such as aerosols are also available for better understanding the interactions among aerosols, clouds and precipitation.
Data and services at GES DISC have been presented. A newly designed Web landing page at GES DISC provides a one-stop shop for dataset search, subsetting, documents, news, alerts, data-in-action, data recipes, FAQs and more. All Web contents are searchable and filtering is provided to narrow down search results. Search history is available along with favorite dataset links.
A user-friendly and powerful online tool, Giovanni, was introduced. Over 1500 Level-3 variables are currently available in Giovanni and more are being added. Giovanni is a convenient tool for evaluating Level-3 remote sensing and model variables because no software or data download is required. With filtering and sorting capabilities, users can quickly locate variables of interest. There are 22 functions that can be used to analyze and visualize data. Graphic results can be fine-tuned and downloaded. Information and data in the data lineage are provided. Giovanni can also be used to verify data downloaded from the archive to make sure that the user's reading software performs as expected.
Another user-friendly, freely available tool, NASA GISS Panoply, was introduced. Panoply can be used to visualize many Level-2 and Level-3 datasets. Built-in features allow the graphic result to be modified, including the projection, color bar, etc. Panoply provides remote access to data through HTTPS, OPeNDAP, GDS, etc. Animation is also available. Panoply is available for the most commonly used computing platforms (e.g., Windows, Mac).
Several examples in the tropics are presented, using the data and tools introduced in this article. Most maps were obtained from Giovanni including time-averaged and time-accumulated maps. Users can obtain monthly or seasonal averages of variables for a quick look at their climatology.

Future Plans
Future plans include two parts: new datasets and data services. New datasets, as mentioned earlier, comprise those from the to-be-launched TROPICS and the in-planning ACCP. Details about the datasets from these missions are not available as of the time of this writing. A new development idea is to integrate datasets from missions or projects archived and distributed at other DAACs. For example, SST and ocean surface wind datasets at PODAAC [42] could be integrated with datasets at GES DISC such as precipitation, air temperature, humidity, etc. On-the-fly datasets, such as a 10 day precipitation dataset derived from IMERG that could support monitoring activities in agriculture, are also useful for users.
There are a number of areas to be improved in data services. First, dataset search at GES DISC requires additional improvements. For example, a search for "precipitation" over a specific area and time range returns datasets that are outside the search criteria. For Level-2 dataset search, fewer steps should be needed to find out how many granules are available for the search criteria (e.g., spatial and temporal ranges). These features are very important for case studies. For instance, to investigate the weather conditions associated with the Air France 447 accident [55], one needs to first identify the area and time span of the accident. Given this spatial and temporal information in the dataset search interface, the search results should contain all the measurements or model datasets available for the event investigation. The alternative, searching each dataset one at a time, can be time consuming and difficult because one needs to know what to search for at the start rather than seeing what is available in the search results. Spatial subsetting needs to be expanded to include popular shapefiles such as river basins and land-sea masks and to allow users to upload their own shapefiles.
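The event-oriented search described above can be sketched as a single space-time query over a catalog that spans several datasets. The catalog entries, dataset names, bounding boxes and times below are hypothetical and serve only to illustrate the idea; this is not the search interface of GES DISC. Bounding boxes are given as (west, south, east, north) in degrees, and boxes crossing the antimeridian are not handled in this simplified version.

```python
from datetime import datetime


def overlaps(granule, box, start, end):
    """True if the granule's bounding box and time span intersect the query."""
    west, south, east, north = box
    g_west, g_south, g_east, g_north = granule["bbox"]
    in_space = (g_west <= east and g_east >= west
                and g_south <= north and g_north >= south)
    in_time = granule["start"] <= end and granule["end"] >= start
    return in_space and in_time


def search(catalog, box, start, end):
    """Group the names of matching granules by dataset, across the whole catalog."""
    hits = {}
    for granule in catalog:
        if overlaps(granule, box, start, end):
            hits.setdefault(granule["dataset"], []).append(granule["name"])
    return hits


# Hypothetical catalog mixing a Level-2 orbital product and a gridded IR product.
catalog = [
    {"dataset": "precip_L2", "name": "orbit_001", "bbox": (-40.0, -5.0, -25.0, 10.0),
     "start": datetime(2009, 6, 1, 1, 0), "end": datetime(2009, 6, 1, 2, 30)},
    {"dataset": "precip_L2", "name": "orbit_002", "bbox": (60.0, -5.0, 80.0, 10.0),
     "start": datetime(2009, 6, 1, 1, 0), "end": datetime(2009, 6, 1, 2, 30)},
    {"dataset": "merged_ir", "name": "ir_0200", "bbox": (-180.0, -60.0, 180.0, 60.0),
     "start": datetime(2009, 6, 1, 2, 0), "end": datetime(2009, 6, 1, 2, 30)},
    {"dataset": "merged_ir", "name": "ir_0800", "bbox": (-180.0, -60.0, 180.0, 60.0),
     "start": datetime(2009, 6, 1, 8, 0), "end": datetime(2009, 6, 1, 8, 30)},
]

# Hypothetical event box and time window.
box = (-35.0, 0.0, -28.0, 6.0)
start, end = datetime(2009, 6, 1, 0, 0), datetime(2009, 6, 1, 4, 0)
hits = search(catalog, box, start, end)
for dataset, names in hits.items():
    print(dataset, len(names), names)
```

Because the result is grouped by dataset, the number of granules available per dataset for the event is visible in one step, which is the information a case study typically needs first.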
As mentioned earlier, research in tropical meteorology and climatology often involves multiple disciplines. However, at present, satellite- and model-based datasets are archived and distributed by discipline at discipline-oriented DAACs. Efforts are underway to better understand use cases and requirements for interdisciplinary data and services and to design systems that meet such requirements, including a dedicated session in Earth Informatics in the AGU Fall Meeting [56]. In addition, the prototype of Datalist can be further expanded to other research areas such as floods, droughts and MJOs, and need not be limited to datasets from GES DISC.
Nowadays, the data volume can grow rapidly. The whole collection of the half-hourly IMERG dataset (e.g., Final Run) comprises over 350,000 files for a 20 year time span as of this writing and continues to grow. Event databases such as the NOAA National Hurricane Center's hurricane data (HURDAT2) [57] need to be added to data search and ordering services. Track-based (e.g., hurricane tracks) data search is a desired feature that can reduce data download volume and facilitate research on moving systems such as cyclones. Such data services should also allow users to input their own system or event tracks. Adding a collection of existing training data can facilitate data access for artificial intelligence (AI) or machine learning (ML). However, creating AI/ML training datasets for events can be costly and time consuming (e.g., identifying and labeling dust storms from MODIS (Moderate Resolution Imaging Spectroradiometer) true color (RGB) imagery). Efforts are being considered, including using citizen scientists to create labeled event image databases.
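A track-based subset can be sketched as follows: for each time along a track, only the grid cells within a chosen radius of the system center are kept. The track points, grid cells and radius are hypothetical, and the function names are invented for this example; an operational service would also need to match track times to granule times and to handle full grids efficiently.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def cells_near_track(track, cells, radius_km):
    """For each track time, keep the grid cells within `radius_km` of the center."""
    subset = {}
    for time, lat, lon in track:
        subset[time] = [
            (cell_lat, cell_lon) for cell_lat, cell_lon in cells
            if great_circle_km(lat, lon, cell_lat, cell_lon) <= radius_km
        ]
    return subset


# Hypothetical track (time label, latitude, longitude) and grid cell centers.
track = [("t0", 15.0, -50.0), ("t1", 16.0, -52.0)]
cells = [(15.0, -50.0), (16.0, -52.0), (30.0, -80.0)]
subset = cells_near_track(track, cells, radius_km=100.0)
print(subset)
```

Only the cells that follow the moving system are returned, so the downloaded volume scales with the track length and radius rather than with the size of the full domain.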
Conditional dataset search can be a useful feature to have. Users will be able to obtain only the data they want (e.g., data meeting the conditions they set in the search box). For example, a user who plans to study rainfall events with daily accumulation over 100 mm in an area of interest would be able to obtain the data for only such events from the archive. In addition, conditions can be set on another variable or variables. There is much more that data services can deliver along these lines. Such so-called analysis-ready data can greatly facilitate research and applications by reducing the time and effort spent on data download and processing.
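A conditional search of this kind can be sketched as a set of per-variable conditions that must all hold for a day to be returned. The daily records, variable names and thresholds below are hypothetical; the sketch only illustrates the selection logic, not an existing GES DISC service.

```python
from datetime import date


def find_events(records, conditions):
    """Return the dates whose record satisfies every condition.

    `records` maps a date to a dict of variable values for the area of interest.
    `conditions` maps a variable name to a function returning True or False.
    """
    return sorted(
        day for day, values in records.items()
        if all(test(values[name]) for name, test in conditions.items())
    )


# Hypothetical daily values for one area of interest.
records = {
    date(2011, 10, 20): {"rain_mm": 35.0, "skin_temp_k": 301.0},
    date(2011, 10, 21): {"rain_mm": 120.0, "skin_temp_k": 302.5},
    date(2011, 10, 22): {"rain_mm": 150.0, "skin_temp_k": 299.0},
}

# Single condition: daily accumulation over 100 mm.
heavy = find_events(records, {"rain_mm": lambda mm: mm > 100.0})

# Two conditions: heavy rain and a second variable above a threshold.
heavy_and_warm = find_events(records, {
    "rain_mm": lambda mm: mm > 100.0,
    "skin_temp_k": lambda kelvin: kelvin > 300.0,
})
print(heavy, heavy_and_warm)
```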
Data interoperability needs to be enhanced or developed. For example, GIS users prefer data in GeoTIFF and model users prefer data in a projection and grid that match those of their models. NetCDF is the format used in most datasets at GES DISC, and many tools or software packages can read this format without major issues. However, minor issues still exist, such as missing metadata, the order of the latitude and longitude dimensions, and the time dimension. These can become major issues when developing engineering solutions for Earth science data, because additional preprocessing components need to be added, which may drive up the development and operational costs.
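The kind of preprocessing component mentioned above can be sketched as two small checks: one that brings a two-dimensional grid into latitude-by-longitude order regardless of how it was stored, and one that reports missing metadata attributes. The attribute names treated as required here and the function names are assumptions made for this illustration, not requirements of any particular GES DISC product.

```python
REQUIRED_ATTRS = ("units", "_FillValue")  # assumed minimum set for this sketch


def to_lat_lon(grid, dims):
    """Return `grid` as rows of latitude and columns of longitude.

    `dims` names the two axes of `grid` in storage order.
    """
    if dims == ("lat", "lon"):
        return [list(row) for row in grid]
    if dims == ("lon", "lat"):
        return [list(column) for column in zip(*grid)]  # transpose
    raise ValueError(f"unsupported dimension order: {dims}")


def missing_metadata(attrs):
    """List the required attribute names that are absent from `attrs`."""
    return [name for name in REQUIRED_ATTRS if name not in attrs]


# Hypothetical grid stored with longitude first: 3 longitudes x 2 latitudes.
stored = [[1, 2], [3, 4], [5, 6]]
normalized = to_lat_lon(stored, ("lon", "lat"))
problems = missing_metadata({"units": "mm/hr"})
print(normalized, problems)
```

Running such checks once, when data are ingested, avoids repeating the same special cases in every downstream tool.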
These days, data centers increasingly consider moving datasets and services to the cloud environment, particularly those that have service scale-up issues on premises. Downloads of the voluminous IMERG suite can easily overload on-premises systems, which can degrade data service performance. Moving the data services to the cloud environment appears to be a solution to this scale-up challenge. However, the egress issue (the cost to data providers when data are downloaded out of the cloud) can be challenging because it adds cost to operations and can be difficult to control. GES DISC, similar to other DAACs, is experimenting with new cloud-based services for IMERG and MERRA-2 datasets.
Data analysis and visualization tools need to be enhanced as well. Over the years, over 1000 refereed papers have been published with help from Giovanni [20]. Giovanni not only provides quick evaluation for many Level-3 datasets but also helps answer what-if questions [21]. However, additional improvements are still needed. First, Level-2 data need to be added to Giovanni for case studies and more. Then, data analysis and visualization need to be linked with event databases as a user-friendly feature. For example, the NOAA hurricane track database (HURDAT2) [57] can be made available so users can analyze and visualize data for hurricane studies in a convenient way. There are some challenges for Giovanni in adding such a feature. New components need to be developed to handle hurricane tracks and data subsetting. An event picker in the graphical user interface (GUI) needs to be added as well. Initial data analysis functions can be a time-averaged map, an area-averaged time series, and animation. All these function names already exist in Giovanni, but for a moving system, new functions need to be developed.
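An area-averaged time series, one of the functions named above, can be sketched for a regular latitude-longitude grid as follows. The sketch weights each row by the cosine of its latitude so that grid cells are weighted in proportion to their approximate area; this weighting, the fill value handling and the function names are assumptions of the example and are not a description of Giovanni's internal implementation.

```python
import math


def area_average(grid, lats, fill_value=None):
    """Cosine-latitude weighted mean of a lat x lon grid, skipping fill values."""
    weighted_sum = 0.0
    weight_total = 0.0
    for row, lat in zip(grid, lats):
        weight = math.cos(math.radians(lat))  # proportional to cell area
        for value in row:
            if fill_value is not None and value == fill_value:
                continue
            weighted_sum += weight * value
            weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no valid data in the region")
    return weighted_sum / weight_total


def area_averaged_series(grids, lats, fill_value=None):
    """Apply `area_average` to each time step in `grids`."""
    return [area_average(grid, lats, fill_value) for grid in grids]


# Hypothetical two-step series on a 2 x 2 grid with rows at 0 and 60 degrees N.
lats = [0.0, 60.0]
grids = [
    [[2.0, 2.0], [4.0, 4.0]],
    [[2.0, -9999.0], [4.0, 4.0]],  # one missing cell in the second step
]
series = area_averaged_series(grids, lats, fill_value=-9999.0)
print(series)
```

For a moving system, the same function could be applied at each time step to the cells selected around the system center instead of to a fixed region.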
Giovanni is also facing a service scale-up issue. Over the years, more and more users have been using Giovanni for their research work, pushing the on-premises servers to their limit. Moving Giovanni into the cloud environment is being implemented at GES DISC. Initial test results have shown significant improvement in performance for data analysis. As data are migrating from all DAACs to the cloud environment, more data will be available for Giovanni. Users will see more datasets from different DAACs available in Giovanni, which will enhance the capabilities for interdisciplinary or cross-DAAC data analysis.