Assessing the Impact of Land Use and Land Cover Data Representation on Weather Forecast Quality: A Case Study in Central Mexico
Atmosphere 2020, 11, 1242

Abstract: In atmospheric modeling, an accurate representation of land cover is required because such information impacts water and energy budgets and, consequently, the performance of models in simulating regional climate. This study analyzes the impact of the land cover data on an operational weather forecasting system using the Weather Research and Forecasting (WRF) model for central Mexico, with the aim of improving the quality of the operational forecast. Two experiments were conducted using different land cover datasets: a United States Geological Survey (USGS) map and an updated North American Land Change Monitoring System (NALCMS) map. The experiments were conducted as a daily 120 h forecast for each day of January, April, July, and September of 2012, and the near-surface temperature, wind speed, and hourly precipitation were analyzed. Both experiments were compared with observations from meteorological stations. The statistical analysis of this study showed that wind speed and near-surface temperature prediction may be further improved with the updated and more accurate NALCMS dataset, particularly in the forecast covering 48 to 72 h. The Root Mean Square Error (RMSE) of the average wind speed reached a maximum reduction of up to 1.2 m s−1, whereas for the near-surface temperature there was a reduction of up to 0.6 °C. The RMSE of the average hourly precipitation was very similar between both experiments; however, the location of precipitation was modified.


Introduction
The rapid expansion of urban areas, or urbanization, represents one of the most notable human-caused transformations of our planet [1,2]. Urbanization leads to land use and land cover changes (LULCC) and modifies the biogeophysical properties of the land surface, including the albedo, emissivity, soil moisture, and surface roughness length. Modifications in biophysical properties lead to changes in the surface flux, atmospheric circulation, and surface energy budget [3][4][5][6][7][8][9][10]. For example, when the surface energy budget is altered, fluxes of heat, moisture, and momentum within the planetary boundary layer (PBL) are directly affected [8]. Local and regional wind and other climate variables are subsequently affected due to horizontal variations in the turbulent sensible heat flux and PBL depth [8,11,12,13]. The changes in the physical properties can also impact the thermal inertia/heat capacity of the land surface [14]. The nighttime temperatures are more sensitive to heat capacity.

The average annual rainfall is between 560 and 1270 mm, most of it falling between May and October (https://worldweather.wmo.int/179/m179.htm). The average annual temperature is between 14 and 22 °C, and between 10 and 12 °C in the mountain ranges and volcanoes. At the beginning of the 20th century, the predominant types of land cover were grasslands, shrubland, cropland, and uninhabited arid land. Urban area growth and, in general, the LULC transformations in the seven entities have modified the landscape, which presents not only a social and political challenge but also an environmental one.

Model Configuration
The WRF model was used to study LULC data quality and its impacts on simulated meteorological variables in an operational weather forecast for central Mexico. Two numerical experiments were completed, one using USGS data [33] (hereafter named the USGS experiment) and another using an updated dataset named the 2005 North American Land Change Monitoring System (NALCMS) [34] (hereafter named the NALCMS experiment).
For the USGS experiment, historical meteorological outputs were used from a weather forecast at the Atmospheric Sciences Center of the National Autonomous University of Mexico (http://grupoioa.atmosfera.unam.mx/historicos/). This study applied version 3.6 of the WRF model and used the ARW (Advanced Research WRF) core to solve atmospheric dynamics. The experiment adopted a nesting that covers the megalopolis of central Mexico and its surroundings (Figure 2), with a grid of 205 points from west to east and 124 points from south to north. The nested and parent domains have a horizontal resolution of 6.67 and 20 km, respectively. The domains have 30 vertical levels; the nested domain interacts with the parent domain (one-way nested) and the latter covers the entirety of Mexico. The forecast produced outputs for 5 days, saved every hour (a 120 h forecast). The initial and boundary conditions were taken from the Global Forecast System (GFS) model with a one-degree spatial resolution. The 0000 UTC data were used for the initial conditions, and the boundary conditions were updated every six hours. We used a Mercator projection, a time step of 120 s, and LULC data generated by the USGS in 1992-1993 with 24 classes. The horizontal resolution of the static geographic dataset used was 10 arc minutes for the 20 km domain and 2 arc minutes for the 6.67 km domain.
The operational forecast adopted the following physical parameterizations: the Kain-Fritsch scheme for cumulus [35], the single-moment 3-class scheme for microphysics, the Rapid Radiative Transfer Model (RRTM) scheme for longwave radiation, the Dudhia scheme for shortwave radiation, and the Yonsei University (YSU) scheme for the boundary layer [36]. In addition, a 5-layer thermal diffusion scheme [37] was used for the Land Surface Model (LSM). This scheme, although simple, is adequate for mesoscale studies. It is important to consider the fact that this LSM does not predict soil moisture and is currently used mostly in studies that analyze the performance of LSMs for different regions [38][39][40]. These studies have indicated that the selection of an LSM can notably affect the simulation results [39,40]; however, their performances (LSMs such as Noah, Pleim-Xiu, Rapid Update Cycle (RUC), Thermal Diffusion) have been reasonably good for the prediction of the near-surface temperature at 2 m and the wind speed at 10 m [38][39][40], but not for the relative humidity, where the errors have been found to be significant [40]. In particular, the Thermal Diffusion scheme has been found to overestimate (underestimate) the sensible heat (latent heat) flux during the daytime [40]. Taking this into consideration, the scope of this study was limited to analyzing the impact of LULC data quality for predicting meteorological variables for Mexico, rather than evaluating aspects of LSM performance.
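As a reading aid, the settings described in this section can be collected in a small Python structure and rendered as a Fortran-namelist-style fragment. This is only a sketch: the operational namelist is not reproduced in the paper, so the grouping, the helper names `render_group` and `nest_ratio`, and the choice of which options to list are assumptions. The integer codes are the standard WRF-ARW option numbers for the schemes named in the text.

```python
# Sketch of the described configuration; the real operational namelist is not
# given in the paper, so the layout below is an assumption for illustration.
PHYSICS = {
    "mp_physics": 3,          # WRF single-moment 3-class microphysics
    "cu_physics": 1,          # Kain-Fritsch cumulus
    "ra_lw_physics": 1,       # RRTM longwave radiation
    "ra_sw_physics": 1,       # Dudhia shortwave radiation
    "bl_pbl_physics": 1,      # YSU boundary layer
    "sf_surface_physics": 1,  # 5-layer thermal diffusion land surface scheme
    "num_land_cat": 24,       # USGS 24-class land use
}

DOMAINS = {
    "time_step": 120,  # seconds
    "e_vert": 30,      # vertical levels
    "feedback": 0,     # one-way nesting
}


def render_group(name, options):
    """Render a dict as a namelist-style group (hypothetical helper)."""
    lines = [f"&{name}"]
    for key, value in options.items():
        lines.append(f" {key:<20} = {value},")
    lines.append("/")
    return "\n".join(lines)


def nest_ratio(parent_dx_km, nest_dx_km):
    """Integer grid ratio between the parent and nested domains."""
    return round(parent_dx_km / nest_dx_km)


if __name__ == "__main__":
    print(render_group("physics", PHYSICS))
    print(render_group("domains", DOMAINS))
    print("parent_grid_ratio:", nest_ratio(20.0, 6.67))
```

The 20 km and 6.67 km resolutions correspond to a 3:1 nesting ratio, which is what `nest_ratio` returns for these inputs.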

Description and Assessment of the Experiments
The NALCMS experiment used the same configuration and physical parameters described above but with a different LULC dataset. Each experiment, USGS and NALCMS, considered a period of two dry months (January and April of 2012) and two rainy months (July and September of 2012). These months are representative of climatic conditions in the study area, which is characterized by two well-defined seasons: the dry season, which spans from November to April, and the rainy season, which is from May to October. A 120 h forecast was run for each day of the 4 months and, in total, 122 numerical simulations were carried out in each experiment (244 in total). The analysis was carried out in 24 h windows: 1-24, 25-48, 49-72, 73-96, and 97-120 h. In other words, the first 24 simulated hours of each day of the month were analyzed to obtain the behavior for 1-24 h, then the following 24 simulated hours of each day were analyzed to obtain the behavior for 25-48 h, and so on up to the last window of the 120 h forecast (97-120 h).
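The bookkeeping behind this design (one run per day of the four months, and five consecutive 24 h analysis windows per run) can be checked with a short sketch. The function names are hypothetical and are not taken from the authors' processing scripts.

```python
import calendar


def runs_per_experiment(year=2012, months=(1, 4, 7, 9)):
    """Number of daily forecasts: one per day of January, April, July, September."""
    return sum(calendar.monthrange(year, month)[1] for month in months)


def window_label(lead_hour, width=24, horizon=120):
    """Map a forecast lead hour (1..horizon) to its analysis window label."""
    if not 1 <= lead_hour <= horizon:
        raise ValueError("lead hour outside the forecast horizon")
    start = ((lead_hour - 1) // width) * width + 1
    return f"{start}-{start + width - 1} h"


if __name__ == "__main__":
    print(runs_per_experiment())             # runs in one experiment
    print(window_label(30), window_label(120))
```

For 2012 this gives 31 + 30 + 31 + 30 = 122 runs per experiment, matching the count stated in the text.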
The accuracy of the weather forecast was analyzed for the near-surface temperature at 2 m, the wind speed at 10 m, and the hourly precipitation. An assessment of the reliability of both experiments was completed by comparing them with observations from meteorological stations located in the study area (Figure 1). The assessment was performed using the Root Mean Square Error (RMSE) and the Mean Absolute Error (MAE) [41]. The RMSE is a quadratic scoring rule which measures the average magnitude of the error (Equation (1)). The MAE measures the average magnitude of the errors in a set of predictions, without considering their sign (Equation (2)). The difference between them is that MAE is a linear score, which means that all the individual differences are weighted equally in the average, while RMSE gives a relatively high weight to large errors. For both the MAE and RMSE, lower values represent a better agreement.

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(S_i - O_i\right)^2} \qquad (1)$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|S_i - O_i\right| \qquad (2)$$

where $S_i$ is the predicted variable, $O_i$ is the observation, and $n$ is the total number of data.
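A minimal sketch of the two scores, using the standard definitions of RMSE and MAE with the symbols defined above (predicted values S, observations O, sample size n). The sample numbers are invented for illustration only.

```python
import math


def rmse(simulated, observed):
    """Root Mean Square Error between paired predictions and observations."""
    if len(simulated) != len(observed) or not simulated:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(s - o) ** 2 for s, o in zip(simulated, observed)]
    return math.sqrt(sum(squared) / len(squared))


def mae(simulated, observed):
    """Mean Absolute Error between paired predictions and observations."""
    if len(simulated) != len(observed) or not simulated:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(s - o) for s, o in zip(simulated, observed)) / len(simulated)


if __name__ == "__main__":
    sim = [1.0, 2.0, 3.0]  # hypothetical predictions
    obs = [1.0, 2.0, 6.0]  # hypothetical observations
    print(mae(sim, obs), rmse(sim, obs))
```

In this toy case a single error of 3 gives MAE = 1 but RMSE = √3, which illustrates the heavier weight that RMSE places on large errors.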

Meteorological Data
Automatic meteorological stations (Estaciones Meteorológicas Automáticas, EMA for its Spanish acronym) from Mexico's National Weather Service (Servicio Meteorológico Nacional, SMN for its Spanish acronym) were used to assess the experiments. Although SMN is the Mexican agency responsible for providing meteorological information, a visual analysis was performed to verify that there were no inaccurate values. Furthermore, the annual percentage of existing data for each EMA station was estimated, and only stations with at least 96% of the data available were retained, which yielded ten stations (Figure 1). EMA data are available at a temporal frequency of 10 min. A comparison between the predicted and observed values was made, taking into account the nearest grid cell to the meteorological station. The values of each hour were compared for the near-surface temperature (°C) and wind speed (m s−1). The hourly accumulated observed precipitation (mm h−1) was calculated to enable comparison with the simulation results.
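The station processing described here (a completeness threshold, hourly accumulation of 10 min precipitation, and matching each station to its nearest grid cell) might look as follows. This is a sketch under assumptions: missing records are marked with `None`, the nearest cell is found with a great-circle (haversine) distance, and all function names and sample coordinates are hypothetical; the paper does not state how the authors implemented these steps.

```python
import math


def completeness(values):
    """Fraction of non-missing records (None marks a missing value)."""
    return sum(v is not None for v in values) / len(values)


def hourly_accumulation(ten_min_precip_mm):
    """Sum consecutive groups of six 10 min totals into hourly totals (mm per hour)."""
    if len(ten_min_precip_mm) % 6:
        raise ValueError("expected a whole number of hours of 10 min records")
    return [sum(ten_min_precip_mm[i:i + 6])
            for i in range(0, len(ten_min_precip_mm), 6)]


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def nearest_cell(station_lat, station_lon, grid_lats, grid_lons):
    """Return (row, col) of the grid cell closest to the station.

    grid_lats and grid_lons are 2-D lists of cell-centre coordinates.
    """
    best, best_dist = None, float("inf")
    for j, (lat_row, lon_row) in enumerate(zip(grid_lats, grid_lons)):
        for i, (lat, lon) in enumerate(zip(lat_row, lon_row)):
            dist = haversine_km(station_lat, station_lon, lat, lon)
            if dist < best_dist:
                best, best_dist = (j, i), dist
    return best


if __name__ == "__main__":
    lats = [[19.0, 19.0], [19.1, 19.1]]      # hypothetical grid
    lons = [[-99.2, -99.1], [-99.2, -99.1]]
    print(nearest_cell(19.09, -99.12, lats, lons))
    print(completeness([1.2, None, 0.0, 0.4]) >= 0.96)
```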

LULC Dataset
The NALCMS dataset was selected to carry out the second experiment (Table 1). The selection was based on the analysis conducted by López-Espinoza et al. [42]. This dataset is a product of the collaborative effort of governmental agencies from Canada, Mexico, and the United States, coordinated by the North American Environmental Atlas of the Commission for Environmental Cooperation. For this dataset, 2005 Moderate Resolution Imaging Spectroradiometer (MODIS) data at a spatial resolution of 250 m were used. Nineteen land cover classes were defined using the Land Cover Classification System (LCCS) [43]. To generate the final dataset, each country used its own training data and classification methods, and the final dataset was a combination of the datasets generated by the three countries. For Mexico, the inputs were the monthly composite of the MODIS radiance data, the Normalized Difference Vegetation Index (NDVI) data, and information from a Digital Elevation Model. NALCMS has a successor developed in 2010; however, previous studies have demonstrated that this dataset does not present significant land cover changes in the simulated nested domain (Figure 2, Domain 2), and even less in our study area [44,45]. On the other hand, studies have shown that the USGS dataset has a low classification accuracy [3,6,14,29], and in atmospheric modeling an accurate representation of LULC is desirable because such information impacts the water and energy budgets and, consequently, the performance of models in simulating regional climate.
Pre-processing of the data was required in order to convert the NALCMS dataset into the same coordinate system, spatial resolution, and classification scheme as the USGS dataset [42]. In the pre-processing step, the dataset was converted into a latitude-longitude coordinate system using a 1 km spatial resolution. To standardize the NALCMS classes, each legend was related to the most suitable USGS class. The USGS classification scheme is defined with 24 classes, and it is based on Anderson's classification scheme [46] (first column in Table 2). However, only the 11 classes shown in Table 2 are present in the study area. On the other hand, NALCMS is defined with 19 land cover classes, of which 13 are present in the study area. The standardization was achieved based on the best available information reported in regional maps and the literature about the continental and global scale datasets [47][48][49]. Table 2 shows the reclassification of the NALCMS dataset to the USGS class scheme. The dryland cropland/pasture and irrigated cropland/pasture classes of USGS are grouped into a class called cropland in the NALCMS dataset. The moisture and energy budgets of irrigated cropland are different from those of dry cropland. However, due to uncertainty in the irrigation/pasture cropland data and the potentially smaller total area of irrigated croplands, the cropland class is used as a compromise (Figure 3).
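The standardization step amounts to a lookup from NALCMS class codes to USGS class codes applied cell by cell. The sketch below shows the mechanics only: the dictionary is a hypothetical four-entry excerpt chosen for illustration and is not the mapping of Table 2, and the function name `reclassify` is an assumption.

```python
# Hypothetical excerpt of a NALCMS -> USGS lookup, for illustration only.
# The mapping actually used in the study is the one reported in its Table 2.
NALCMS_TO_USGS = {
    1: 14,   # temperate or sub-polar needleleaf forest -> evergreen needleleaf forest
    15: 2,   # cropland -> dryland cropland and pasture
    17: 1,   # urban and built-up -> urban and built-up land
    18: 16,  # water -> water bodies
}


def reclassify(grid, lookup, missing=None):
    """Translate every class code in a 2-D grid through the lookup table.

    Codes absent from the lookup raise KeyError unless a `missing`
    replacement value is supplied.
    """
    result = []
    for row in grid:
        new_row = []
        for code in row:
            if code in lookup:
                new_row.append(lookup[code])
            elif missing is not None:
                new_row.append(missing)
            else:
                raise KeyError(f"no target class for source code {code}")
        result.append(new_row)
    return result


if __name__ == "__main__":
    nalcms_tile = [[15, 15, 17], [1, 18, 17]]  # invented sample tile
    print(reclassify(nalcms_tile, NALCMS_TO_USGS))
```

Raising on unmapped codes, rather than silently passing them through, makes an incomplete legend translation visible before the data reach the model.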
Changes of LULC
Figure 3c shows in white the areas that changed (the elements off the diagonal of the matrix), while yellow represents the areas that did not show LULC changes (diagonal of the matrix). Table 3 shows the LULC changes at each meteorological station with 1 km spatial resolution (Domain 2), which are consistent with the observed changes in the original resolution of the NALCMS dataset (250 m). The LULCCs around the selected meteorological stations were characterized mainly by the loss of forest, shrubland, and grassland, and these were replaced by cropland, grassland, and urban and built-up land. The loss of mixed forest is related to the loss of the deciduous forest, leaving only the evergreen forest (stations 5 and 9). The evergreen forest (stations 3 and 4) and shrubland (stations 1, 8, and 10) were converted to cropland for food production. The vegetation cover at station 6 was modified into urban and built-up land. Given these changes in LULC around the different meteorological stations, and in general over the entire study area (Figure 3c in white color and Table S1), changes in the physical properties of the terrain (Table 4) have also occurred and led to changes in the surface flux, atmospheric circulation, and surface energy budget [5].

Table 3. Land cover change at each station. ID represents the location in Figure 1.
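The change matrix referred to above is a cross-tabulation of the two class maps: diagonal entries count cells whose class is the same in both datasets, and off-diagonal entries count cells that changed. A minimal sketch follows, with invented 2 × 2 grids and hypothetical function names.

```python
from collections import Counter


def change_matrix(old_grid, new_grid):
    """Count cells for every (old class, new class) pair of two aligned grids."""
    counts = Counter()
    for old_row, new_row in zip(old_grid, new_grid):
        for old, new in zip(old_row, new_row):
            counts[(old, new)] += 1
    return counts


def changed_fraction(counts):
    """Share of cells lying off the diagonal of the change matrix."""
    total = sum(counts.values())
    changed = sum(n for (old, new), n in counts.items() if old != new)
    return changed / total


if __name__ == "__main__":
    usgs = [[8, 8], [14, 1]]     # invented class codes
    updated = [[2, 8], [2, 1]]
    matrix = change_matrix(usgs, updated)
    print(dict(matrix), changed_fraction(matrix))
```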

Near Surface Temperature
Maps of the monthly average temperature are shown in Figure S1 (Supplementary Material). The absolute difference was calculated between the NALCMS and USGS experiments. The highest absolute difference was about 0.5-1 °C for the regions with conversion from vegetation to urban land. It was also found that the absolute difference in the average daily maximum temperature (Supplementary Material, Figure S2) can reach up to 2.5 °C, mainly in the regions around the newly urbanized centers and in the regions where there was a conversion from shrubland to cropland and urban land. Maps of the monthly average daily minimum temperature (Supplementary Material, Figure S3) showed absolute differences between 0.5 and 2 °C for the dry months, and between 0.5 and 1.5 °C for the rainy months. These results are consistent with those of previous studies of the thermal behavior of urban land compared to vegetated land [17,18], where the largest near-surface temperature differences are observed around noon. Figure 4 shows the time series for the monthly average surface temperature for station 6 in January. It can be seen that there was an increase in the daily maximum temperature [7,50] and a decrease in the daily minimum temperature using updated data [17]. This was due to the new urban areas represented in the NALCMS data, where values close to observations were predicted. Replacing shrubland with urban and built-up land reduced the albedo from 25% to 15%, and this improved the forecast of the near-surface temperature. Additionally, the shrubland has a higher albedo and emissivity than urban and built-up land, and hence its temperature during the day was lower than that of urban land. In addition, during the night the highest minimum temperature was estimated. These results are consistent with the measured surface temperature profiles by Shamsipour et al. [17].
The RMSE was calculated by comparing the observed and predicted near-surface temperature values. Figure 5 shows the RMSE for the monthly average of the surface temperature for the 10 stations during the 120 h of forecast. For January (dry month) and in the 72-96-120 h forecast (Figure 5a), it is observed that the minimum value of RMSE (lower whisker), together with 25% of the data, or the first quartile Q1, was reduced when the NALCMS data were used. For the same month, an RMSE reduction was not observed if more than 50% of the data, or the second quartile Q2, is considered. For April (dry month), an increase in the RMSE values was observed when using the NALCMS data. In the rainy months (Figure 5c,d), the minimum values of RMSE (lower whisker) were reduced for the NALCMS data. For July, after 24 h of forecast, 75% of the data (Q3) resulted in a smaller RMSE when using the NALCMS. Like April, the month of September did not produce lower errors when using the NALCMS. Finally, in both rainy months, outliers with lower RMSE values were observed when using the updated LULC.
Using the updated NALCMS data, the RMSE of the predicted near-surface temperature was reduced by up to 0.6 °C for July (rainy month) compared to the USGS data. This is observed for a 48 h forecast (Figure 5). The maximum reduction in RMSE was reached at station 3, where the LULC changed from evergreen needleleaf forest to cropland. The changes in land cover caused modifications in the vegetation and in the physical properties of the terrain such as albedo (from 12% to 18% for station 3), which modified the near-surface temperature as well as the daily maximum and minimum temperature extremes. Albedo together with emissivity (from 95% to 88% for station 3) impacted the upwelling of longwave radiation (not shown) [5,8].
Similar results were found at station 10, where the conversion from shrubland to cropland had a reduction in RMSE of up to 0.5 °C; station 7, where the conversion from shrubland to evergreen needleleaf forest had a reduction of up to 0.3 °C; and at station 8, with conversion from shrubland to cropland, where a reduction of 0.2 °C was obtained. At the other stations, when NALCMS was used, the reduction in RMSE was smaller. For September, a similar behavior was observed from the 96 h forecast, and for January (dry month) the RMSE reduction in temperature was observed only in three meteorological stations (1, 6, and 7). However, these smallest RMSE values were observed for the entire five-day forecast.
Figure 6 shows the RMSE and MAE of the daily maximum and minimum temperature from the 10-station average for the dry and rainy seasons. The RMSE values were reduced for the rainy season (July and September) (Figure 6b), principally after the 72 h forecast. For the dry season, a decrease in the MAE was observed for January during the entire five-day forecast (Figure 6a); however, for April there was no improvement in the predicted maximum temperature when the NALCMS data were used. Finally, only in January (dry month) was there a reduction in the MAE values for the predicted minimum temperature, particularly for the 96 h forecast (Figure 6c).

Figure 6. RMSE and MAE for the stations during the 120 h of forecast. The horizontal axis corresponds to the forecast hours and the vertical axis corresponds to the RMSE or MAE in degrees Celsius. Solid black and gray lines represent the NALCMS experiment, and black and gray dotted lines represent the USGS results. Average daily maximum temperature: (a) dry season (January and April), (b) rainy season (July and September). Average daily minimum temperature: (c) dry season (January and April), (d) rainy season (July and September).
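The box-plot quantities used in the discussion of Figure 5 (lower whisker, Q1, Q2, Q3, and outliers) can be computed from a set of per-station RMSE values as sketched below. The sample values are invented, and the 1.5 × IQR whisker rule is an assumption (it is a common plotting default); the paper does not state which convention its figures use.

```python
import statistics


def box_summary(values):
    """Quartiles, whiskers, and outliers using the 1.5 x IQR convention."""
    data = sorted(values)
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1,
        "q2": q2,
        "q3": q3,
        "lower_whisker": min(inside),
        "upper_whisker": max(inside),
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }


if __name__ == "__main__":
    station_rmse = [0.2, 1.0, 1.1, 1.2, 1.3]  # hypothetical RMSE values in degrees Celsius
    print(box_summary(station_rmse))
```

Comparing such summaries between two experiments shows whether an improvement affects only the best-performing stations (lower whisker, Q1) or most of them (Q2, Q3), which is the distinction drawn in the text.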

Wind Speed
Difference maps for the monthly average maximum and minimum wind speed between the NALCMS and USGS experiments are shown in Figures S4 and S5 (Supplementary Material). It was found that the daily maximum and minimum wind speed using the USGS data were higher than using the NALCMS data, and that higher wind speeds occurred in surfaces with lower friction (USGS) and were lower where there was urban and forest land (NALCMS). Changes in the surface roughness (Table 4) due to land cover changes impacted the wind speed during the entire five-day forecast.
In Figure 7, the time series of the predicted and observed wind speed are shown for station 6 in April. At this location, the land cover changed from shrub to urban land, which led to changes in the surface roughness length from 0.10 to 0.80 (Table 4); as a consequence, the overestimation of the wind speed was reduced when NALCMS was used. The friction and drag of the updated land cover with greater roughness around stations 1, 8, 10, 4, 6, and 7 decreased the wind speed, with an average RMSE of up to 0.6 m s−1 in January (Figure 8a) [...] 72 and 96 h of forecast when the NALCMS was used. Additionally, the MAE values of the average maximum wind speed were reduced more than the MAE of the average minimum wind speed. For the dry season, the MAE values for the maximum (minimum) wind speed were reduced by up to 0.53 m s−1 (0.36 m s−1), whereas for the rainy season they were reduced by up to 0.46 m s−1 (0.25 m s−1). The errors for the maximum wind speed occurred between 48 and 96 h of the forecast (Figure 9a,b), while for the minimum wind speed they were observed from 24 to 96 h of the forecast (Figure 9c,d).
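Why a larger roughness length lowers the 10 m wind can be illustrated with the neutral logarithmic wind profile, u(z) ∝ ln(z / z0). The sketch below is a textbook simplification and not the surface-layer scheme used in WRF, which also includes stability corrections. It assumes that the roughness lengths quoted for station 6 (0.10 and 0.80) are in metres and that the wind at a hypothetical reference height of 100 m is the same for both land covers.

```python
import math


def wind_ratio(z0_m, height_m=10.0, ref_height_m=100.0):
    """Ratio u(height) / u(ref_height) for a neutral logarithmic profile."""
    if not 0.0 < z0_m < height_m < ref_height_m:
        raise ValueError("require 0 < z0 < height < reference height")
    return math.log(height_m / z0_m) / math.log(ref_height_m / z0_m)


if __name__ == "__main__":
    shrub = wind_ratio(0.10)  # roughness length assumed for shrubland
    urban = wind_ratio(0.80)  # roughness length assumed for urban land
    print(f"shrubland: {shrub:.2f}, urban: {urban:.2f}")
```

Under these assumptions the 10 m wind is about two thirds of the reference-height wind over shrubland and roughly half of it over urban land, in line with the reduced overestimation of wind speed reported for the NALCMS experiment.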

Precipitation
Maps of the average daily accumulated precipitation are shown in Figure S6 (Supplementary Material). Analysis suggests that, for the dry months, both experiments produced a similar spatial distribution of the precipitation. On the other hand, for the rainy months during the 48 h forecast, the maximum increase was up to 19 mm d−1 when the NALCMS was used. This was observed mainly in the center (Mexico City) and west (State of Mexico) of the study area (Figure 1). The average hourly precipitation rate (Supplementary Material, Figure S6) did not change significantly for the dry months, and the WRF model correctly predicted the absence of precipitation with either of the two LULCs. For this reason, the analysis was focused on the rainy months, for which it was observed that the location of precipitation was modified [20][21][22][23][24].
In Figure 10, the predicted and observed average hourly precipitation is shown for station 10 for September. At this location, the conversion from shrubland (USGS) to dryland cropland (NALCMS) decreased the amount of latent heat flux (not shown), which led to a decrease in precipitation. With these results, predicted values closer to the observations were obtained, however this did not happen for all stations. precipitation; however, the smallest RMSE values were obtained for the USGS data. This is observed in the 120 h forecast in July for both datasets (USGS and NALCMS) and in September in the 48 h forecast for USGS, and during the 96 h forecast for NALCMS ( Table 5).
The RMSE was calculated for the hourly precipitation during the 120 h of forecast. Table 5 shows the maximum RMSE values for the 10 stations. Using the NALCMS (USGS) data, the maximum RMSE values are between 0.73 (0.38) and 1.24 (1.15) mm h −1 for July, and between 0.77 (0.41) and 1.56 (0.84) mm h −1 for September. It was found that both experiments overestimated the average hourly precipitation; however, the smallest RMSE values were obtained for the USGS data. This is observed in the 120 h forecast in July for both datasets (USGS and NALCMS) and in September in the 48 h forecast for USGS, and during the 96 h forecast for NALCMS (Table 5). The precipitation results obtained with the NALCMS experiment were very similar to those with USGS. In addition, these results are consistent with those of previous studies, specifically with those of [3,9].
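As an illustration of the error metrics reported in this section, the following minimal Python sketch computes the RMSE and MAE of an hourly series for each 24 h window of a 120 h forecast. It is not the authors' processing code: the function names, the handling of missing values, and the synthetic data are assumptions made only for this example.

```python
import numpy as np


def _valid_pairs(forecast, observed):
    """Return forecast/observation pairs where neither value is NaN."""
    f = np.asarray(forecast, dtype=float)
    o = np.asarray(observed, dtype=float)
    if f.shape != o.shape:
        raise ValueError("forecast and observed must have the same shape")
    valid = ~(np.isnan(f) | np.isnan(o))
    return f[valid], o[valid]


def rmse(forecast, observed):
    """Root mean square error; NaN if no valid pairs exist."""
    f, o = _valid_pairs(forecast, observed)
    if f.size == 0:
        return float("nan")
    return float(np.sqrt(np.mean((f - o) ** 2)))


def mae(forecast, observed):
    """Mean absolute error; NaN if no valid pairs exist."""
    f, o = _valid_pairs(forecast, observed)
    if f.size == 0:
        return float("nan")
    return float(np.mean(np.abs(f - o)))


def scores_by_forecast_day(forecast, observed, hours_per_day=24):
    """Score each complete 24 h window of an hourly forecast.

    Returns a list of (lead_time_in_hours, rmse, mae) tuples, where
    lead_time_in_hours marks the end of the window (24, 48, ...).
    """
    f = np.asarray(forecast, dtype=float)
    o = np.asarray(observed, dtype=float)
    n_days = f.size // hours_per_day
    scores = []
    for day in range(n_days):
        window = slice(day * hours_per_day, (day + 1) * hours_per_day)
        lead = (day + 1) * hours_per_day
        scores.append((lead, rmse(f[window], o[window]), mae(f[window], o[window])))
    return scores


# Synthetic 120 h example (values are made up, not station data).
observed = np.linspace(0.0, 2.0, 120)
forecast = observed + 0.5  # constant overestimate
for lead, r, m in scores_by_forecast_day(forecast, observed):
    print(f"{lead:3d} h  RMSE={r:.2f}  MAE={m:.2f}")
```

Scoring by lead-time window, rather than over the whole 120 h at once, is what allows statements such as "the smallest errors occur between 48 and 72 h" to be checked per station and per experiment.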

Physical Processes
Land cover changes have been considered as one of the most significant modifications to natural ecosystems [10]. The improvement in the forecast performance that is observed when using NALCMS is mainly due to the following physical processes.
For temperature, the changes in albedo and in the sensible and latent heat fluxes are the main contributors. Built-up areas alter thermal characteristics and moisture pathways, adding an extra energy supply to the region [1]. The reduction in vegetation and soil in urban areas decreases evapotranspiration and its associated latent heat. With one of the mechanisms by which the ground loses heat weakened, more heat remains in the ground and the daily minimum temperature increases. The albedo modification changes the amount of energy absorbed (or reflected), affecting the heat exchange between the surface and the atmosphere.
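The temperature argument above can be read against the textbook surface energy balance. The formulation below is generic and uses standard notation; it is not an equation taken from this study or from the land surface scheme used in the WRF model, and all symbols are introduced here only for illustration.

```latex
\begin{aligned}
R_n &= (1-\alpha)\,S_{\downarrow} + \varepsilon L_{\downarrow} - \varepsilon \sigma T_s^{4}, \\
R_n &= H + LE + G .
\end{aligned}
```

Here $R_n$ is the net radiation, $\alpha$ the surface albedo, $S_{\downarrow}$ and $L_{\downarrow}$ the incoming shortwave and longwave radiation, $\varepsilon$ the surface emissivity, $\sigma$ the Stefan-Boltzmann constant, $T_s$ the surface temperature, and $H$, $LE$, and $G$ the sensible, latent, and ground heat fluxes. A change in $\alpha$ changes $R_n$ directly, while for a given $R_n$ a smaller $LE$ must be balanced by a larger $H$ or $G$.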
The wind speed predicted by the WRF model is at a height of 10 m and is sensitive to the surface roughness. The wind speed in built-up areas decreases because urban buildings are usually taller than natural vegetation, which increases the roughness of the surface, hinders the air circulation, and lowers the wind speed [11]. In areas where the previous LULC was cropland, urbanization increases friction. On the other hand, in regions where the previous land cover was forest, the friction may decrease, depending on the type of forest and the average building characteristics (e.g., the height of the buildings).
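The sensitivity of the 10 m wind to surface roughness can be sketched with the textbook logarithmic wind profile for neutral stability. This is a simplification and not the surface layer scheme of the WRF model; the roughness lengths and the reference wind below are illustrative assumptions, not values from the USGS or NALCMS lookup tables.

```python
import math


def wind_at_height(u_ref, z_ref, z, z0):
    """Neutral log-law wind speed at height z.

    u(z) = u_ref * ln(z / z0) / ln(z_ref / z0), where u_ref is the wind
    speed at the reference height z_ref and z0 is the roughness length.
    All heights are in metres and must exceed z0.
    """
    if not (z0 > 0 and z > z0 and z_ref > z0):
        raise ValueError("require 0 < z0 < z and z0 < z_ref")
    return u_ref * math.log(z / z0) / math.log(z_ref / z0)


# Assumed, order-of-magnitude roughness lengths (m) for illustration only.
ROUGHNESS_LENGTH = {"cropland": 0.1, "urban": 1.0}

# Same assumed wind of 8 m/s at 100 m above both surfaces.
for cover, z0 in ROUGHNESS_LENGTH.items():
    u10 = wind_at_height(u_ref=8.0, z_ref=100.0, z=10.0, z0=z0)
    print(f"{cover:9s} z0={z0:4.1f} m  ->  10 m wind = {u10:.2f} m/s")
```

Under these assumptions the rougher (urban) surface yields the lower 10 m wind for the same wind aloft, which is the direction of the effect described above when cropland is reclassified as built-up land.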
The impacts of LULC changes on precipitation are complex [2]. Modifications in precipitation are more difficult to attribute to a cause, since several spatial scales are involved in the processes that lead to precipitation changes. Urbanization induces a series of complex changes in land surface roughness, soil moisture, and the exchange of water and energy between the land and the atmosphere [15]. These changes can either offset or enhance each other and complicate the spatial patterns of precipitation [19]. In our study, the microclimates may be modified because under urbanization local thermal convection increases and modifies the small-scale circulation patterns. Since the air temperature increases, this may be a contributor to the delay in the peak hour of precipitation. In addition, changes in local circulation also contributed to modifying the spatial distribution of precipitation.

Conclusions
This study evaluated the impacts of two LULC datasets on the operational weather forecast using the WRF model in central Mexico. Two experiments were completed as an operational forecast for 120 h, one using the USGS dataset and another using the NALCMS dataset. The updated NALCMS data agree better with the current land use conditions than the USGS data. Land cover classes such as grassland, shrubland, and mixed and evergreen forest in USGS were replaced by cropland and urban and built-up land in NALCMS.
The RMSE and MAE of the average wind speed at 10 m decreased in the NALCMS experiment in comparison to the USGS experiment. The NALCMS data improved the wind speed forecast quality because of the updated surface roughness length. The friction and drag of the updated surfaces reduced the RMSE by up to 0.6 m s −1. The maximum reduction in RMSE reached up to 1.2 m s −1. The RMSE of the maximum wind speed was lower than that of the minimum wind speed, and larger in the rainy than in the dry season. The lowest RMSE values were observed from the 48 h forecast.
The application of updated and accurate LULC data (NALCMS) improved the forecast of the near-surface maximum and minimum temperatures. The RMSE was reduced by up to 0.6 °C during the rainy month (July) from the 48 h forecast. However, the differences between the USGS and NALCMS experiments reached up to 2.5 °C (average daily maximum temperature). During the dry and rainy seasons, the daily maximum temperature was better predicted mainly during the 72 h forecast and when NALCMS was used, and the daily minimum temperature was better predicted for the dry month (January) during the entire forecast.
The predicted precipitation showed that the WRF model was able to capture the main features of the observed precipitation, and the precipitation results obtained with the new NALCMS data were very similar to those obtained with USGS.
The reduction in the error magnitudes of the variables analyzed is consistent with previous studies, specifically with those of [12,13] for the wind speed at 10 m, [12,13] for the near-surface temperature at 2 m, and [3,9] for the precipitation.
Our results indicate that the most reliable forecast is the one from 48 to 72 h. This window has the lowest error and is therefore the one on which decision-makers could base their conclusions. This study follows those of [29,31] in the search for better weather forecasts in central Mexico, a region influenced by tropical and mid-latitude meteorological systems. The results may apply to other large urban areas.
Finally, even though the weather forecast was improved, research on physical parameterization schemes for the region should be conducted, with a focus on LSM schemes to obtain a better representation of land surface processes. LSMs with a higher level of complexity have been shown to improve predictions [51]. For example, studies such as Teklay et al. [9] have shown that Noah/USGS can decrease the RMSE of precipitation by up to 1.36 mm d −1 compared to Thermal Diffusion/USGS. However, a combination of Noah and new LULC data can further lower the bias. Particularly for Mexico, this is an issue that needs to be investigated in the future.