Machine Learning in Weather Prediction and Climate Analyses—Applications and Perspectives

Abstract: In this paper, we analyzed the 500 most relevant scientific articles published since 2018 on machine learning methods in climate research and numerical weather prediction, retrieved with the Google Scholar search engine. The most common topics of interest in the abstracts were identified, and some of them were examined in detail: in numerical weather prediction research, photovoltaic and wind energy, and atmospheric physics and processes; in climate research, parametrizations, extreme events, and climate change. With the created database, it was also possible to extract the most commonly examined meteorological fields (wind, precipitation, temperature, pressure, and radiation), methods (Deep Learning, Random Forest, Artificial Neural Networks, Support Vector Machine, and XGBoost), and countries (China, USA, Australia, India, and Germany) in these topics. Based on a critical review of the literature, we attempt to predict the future research directions of these fields, with the main conclusion being that machine learning methods will be a key feature in future weather forecasting.


Introduction
The beginning of the 21st century, with the advent of big data, efficient supercomputers with Graphics Processing Units (GPU), and scientific interest in emerging new methods, turned out to be crucial in the history of machine learning [1]. Although many methods have been known since the 1960s and have been examined in detail in many studies since then, recent years, with unprecedented increases in data volume and computing power, are seen as the golden era for artificial intelligence and machine learning (https://www.forbes.com/sites/joemckendrick/2019/10/23/artificial-intelligenceenters-its-golden-age/?sh=75d495f0734e, accessed on 17 December 2021).
Detailed reviews of machine learning algorithms, the most important subgroup of artificial intelligence methods (Figure 1), in atmospheric science can be found in many thematic articles [2][3][4]. In these publications, one can find details about many methods and their classifications. For atmospheric scientists, the most interesting group of techniques was found to be supervised learning (Figure 1), the dominant group in recent publications in the field. When labelled data are available, they can be used as a training dataset from which to build a function that maps given inputs to outputs. That function can then be applied to a different dataset, the testing dataset, to evaluate the model, and if the results are satisfactory, it can be used for classification or regression in any application needed. This group includes decision-tree-based methods, e.g., Random Forest (RF) [5] or XGBoost (XGB) [6], as well as Artificial Neural Networks (ANN) [7], Deep Learning (DL) [8], and Support Vector Machine (SVM) [9]. The second group in machine learning is unsupervised learning (Figure 1), in which algorithms do not have labelled data to train from, and must decide upon other ways to divide a given dataset, or reduce the dimensions

Materials and Methods
This study was performed using a database with information on 500 scientific articles (published since 2018), obtained from the Google Scholar search engine (https://scholar.google.com/, accessed on 10 November 2021), which were related to the phrases "numerical weather prediction" and "machine learning" (250 papers), and "climate" and "machine learning" (250 papers). All search results were ordered by relevance; every item in the search results was checked in order to choose only research papers and to exclude unrelated articles. Thus, a database with 500 papers was created. Subsequently, every manuscript was saved in the Zotero software (https://www.zotero.org, accessed on 10 November 2021), which helps to organize data and extract text databases containing important information, such as title, abstract, keywords, authors, journals, etc. All the prepared data are available in the supplementary comma-separated (csv) files (Tables S1 and S2). Text mining, using the 'tidytext' [12] R package [13], was performed on our database to search for the most common phrases included in the abstracts, and the most frequently used meteorological fields, methods, and countries of analysis. In addition to text mining, we analyzed well-known papers on the relevant weather forecasting and climate change issues.
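The phrase search described above can be illustrated with a minimal sketch. It is written in Python using only the standard library, not in R with the 'tidytext' package used in this study, so it should be read as an illustration of the idea rather than the code behind the results. The function name `count_phrases`, the toy abstracts, and the phrase list are hypothetical.

```python
import re
from collections import Counter


def count_phrases(abstracts, phrases):
    """Return case-insensitive occurrence counts of each phrase over all abstracts."""
    counts = Counter()
    for phrase in phrases:
        # Word boundaries prevent matches inside longer words.
        pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
        for text in abstracts:
            counts[phrase] += len(pattern.findall(text))
    return counts


# Toy abstracts invented for illustration; they are not taken from the database.
abstracts = [
    "We post-process NWP output for wind forecasting with a random forest.",
    "Ensemble forecasting and wind forecasting benefit from deep learning.",
    "Data assimilation experiments with a neural network emulator.",
]
phrases = ["wind forecasting", "ensemble forecasting", "data assimilation"]

counts = count_phrases(abstracts, phrases)
for phrase, n in counts.most_common():
    print(f"{phrase}: {n}")
```

Sorting the resulting counts in descending order gives the kind of ranking that is shown as a histogram of the most common phrases.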

Atmosphere 2022, 13, x FOR PEER REVIEW

Results
For the first group of research papers, related to machine learning methods and NWP models, we first built a list of search items based on American Geophysical Union (AGU) index terms (https://www.agu.org/Publish-with-AGU/Publish/Author-Resources/Indexterms, accessed on 22 December 2021), then measured how frequently these search items occurred in the abstracts. The 10 most common phrases are presented in Figure 2. Since the post-processing of NWP results to improve forecasts for renewable energy is a very common topic among scientists, the phrase "Wind Forecasting" had the highest count (Figure 2). The phrase with the second highest count was "Ensemble Forecasting", owing to the growing interest in improving probabilistic forecasts and in the methods required to interpret them correctly. Slightly fewer counts were recorded for phrases such as "Data Assimilation", "Extreme Events", "Remote Sensing", and "Land Cover". Fewer than 10 counts were found for the phrases "Tropical Cyclones", "Coupled Models", "Cloud Physics", and "Boundary Layer".
For the second group of research papers, related to machine learning methods in climate research, a similar histogram is presented in Figure 3. Unsurprisingly, the most common phrase in this group was "Climate Change", with more than 140 counts. Almost three times more counts were recorded for the phrase "Global Climate Models" than for "Regional Climate Models". Slightly less common phrases were "Climate Impact", "Remote Sensing", "Land Cover", and "Extreme Events", while phrases such as "Coupled Models", "Convection", and "Calibration" had fewer than 10 counts.
In addition to text mining of research topics and phrases, word counts in the abstracts of selected publications can also give an interesting insight into the most common topics of interest. Some of the most interesting results derived using this method are presented below. Figure 4 presents the most commonly used meteorological fields in NWP studies. Scientists mentioned the term 'wind' more than 200 times, and this term is related to an important group of renewable energy and wind forecasting studies, as presented in Figure 2. The term 'precipitation' was used almost 150 times, usually with regard to applications for short-range prediction, downscaling, or post-processing. Several papers on bias correction of temperature and air pressure were present, as well as studies on radiation, both for photovoltaic applications and for emulating the radiation scheme in NWP models.
To better understand the practices used by scientists exploring machine learning techniques in NWP, the most commonly applied methods are presented in Figure 5. The most dominant algorithms were ANN and DL. Decision-tree-based methods, such as RF and XGB, as well as SVM, are also often used. Based on our experience, it seems that all of these methods can be successfully applied to NWP and climate analyses.
In the case of research related to climate studies with machine learning methods, the countries most commonly taken into consideration are presented in Figure 6. It must be noted that only 25% (62 articles out of 250) of the papers under consideration in Figure 6 had a specific geographical region included in the abstract. Very often, the abstracts were more focused on the methods and data used in the study. An example of how this affected our results is that 25 papers on climate-related aspects in China represented about 40% of all the papers with specified regions included in the abstract (Figure 6). The most dominant group of papers was thus related to studies about climate in China. Slightly fewer occurrences of the following countries were found in the selected abstracts: USA, Australia, India, and Germany.
Figures 4-6 show the results from our analysis designed to capture all possible occurrences of a given phrase (e.g., the phrase 'USA' was a sum of the occurrences of the words 'U.S.', 'USA', 'United States', etc.). A more detailed insight into the selected fields of interest to scientists, generated using the text mining method in the form of co-occurrence networks from Figures 2 and 3, is presented below. Subsections 3.1 and 3.2 consider NWP and climate research, respectively.
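The summing of spelling variants under one label, as described above for 'USA', can be sketched as follows. This is a minimal Python illustration and not the code used for Figures 4-6; the alias table `ALIASES`, the function `count_with_aliases`, the 'Germany' variants, and the toy abstracts are assumptions made for the example. Only the 'USA' variants are taken from the text.

```python
import re
from collections import Counter

# Hypothetical alias table: every variant is counted under its canonical label.
ALIASES = {
    "USA": ["U.S.", "USA", "United States"],  # variants named in the text
    "Germany": ["Germany", "German"],          # assumed variants, for illustration
}


def count_with_aliases(abstracts, aliases):
    """Sum the occurrences of all spelling variants under each canonical label."""
    counts = Counter()
    for label, variants in aliases.items():
        for variant in variants:
            # Lookarounds act as word boundaries that also work for variants
            # ending in punctuation, such as "U.S.".
            pattern = re.compile(r"(?<!\w)" + re.escape(variant) + r"(?!\w)")
            counts[label] += sum(len(pattern.findall(a)) for a in abstracts)
    return counts


# Toy abstracts invented for illustration.
abstracts = [
    "Heat waves in the United States are analysed with deep learning.",
    "A random forest is trained on U.S. station data.",
    "Crop yields in Germany and the USA are compared.",
]

counts = count_with_aliases(abstracts, ALIASES)
print(dict(counts))
```

Matching is case-sensitive here on purpose, so that the abbreviation 'US' style variants are not confused with the pronoun 'us'; a real analysis would need a longer, manually checked alias list.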

Photovoltaic and Wind Energy
Many countries all over the world are moving away from fossil-fuel power plants and towards cleaner technologies, such as harnessing energy from wind or solar radiation; however, this transition leads to new challenges, one of which is the stability of power grids. Energy from conventional power plants is more stable, and it is relatively easy to adjust production to changing customer demand, whereas renewable energy production is highly dependent on weather conditions. Therefore, accurate predictions are required, not only of meteorological fields, but also of energy production.
The standard procedure for predicting energy production from wind farms is to use NWP models and power curves of installed wind turbines, although many applications also use machine learning techniques. The research in this field has, in recent years, focused on using new machine learning methods [14][15][16][17], different NWP models and configurations, such as ensemble forecasting [18,19], or different approaches, from forecasting wind power for every wind turbine with high-resolution NWP models [20] to wind power production over whole countries [21]. Given the obvious limitations in the accuracy of NWP models, authors are building models using methods such as RF, XGB, ANN, and DL to increase the accuracy of very short-range forecasts (up to a few hours), of the most commonly examined day-ahead forecasts, and of predictions up to several days in advance.
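The standard procedure mentioned above, converting forecast wind speed into power with a turbine power curve, can be sketched as follows. This is a simplified, idealized curve written for illustration: the cubic ramp and all parameter values (cut-in, rated, and cut-out speeds, rated power) are assumptions, whereas real power curves are usually provided by the turbine manufacturer. The machine learning post-processing step discussed in the cited studies is not shown.

```python
def turbine_power_kw(wind_speed, cut_in=3.0, rated_speed=12.0,
                     cut_out=25.0, rated_power_kw=2000.0):
    """Idealized power curve: zero, cubic ramp, rated plateau, zero above cut-out.

    All default parameter values are hypothetical.
    """
    if wind_speed < cut_in or wind_speed >= cut_out:
        return 0.0
    if wind_speed >= rated_speed:
        return rated_power_kw
    # Cubic ramp between cut-in and rated speed (power scales with speed cubed).
    fraction = (wind_speed**3 - cut_in**3) / (rated_speed**3 - cut_in**3)
    return rated_power_kw * fraction


def farm_energy_kwh(hourly_wind_speeds, n_turbines):
    """Energy of a farm of identical turbines for hourly forecast wind speeds (m/s)."""
    # Each forecast value is assumed to hold for one hour, so kW sums to kWh.
    return n_turbines * sum(turbine_power_kw(v) for v in hourly_wind_speeds)


# Hypothetical hourly NWP wind speeds at hub height, in m/s.
forecast = [2.0, 12.0, 15.0, 30.0]
print(farm_energy_kwh(forecast, n_turbines=10))
```

Because the curve is steep between the cut-in and rated speeds, small errors in forecast wind speed translate into large errors in power, which is one reason post-processing of NWP output is attractive in this application.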
In terms of photovoltaic (PV) energy, as with wind power, research has focused on examining different architectures of machine learning models for improved post-processing of NWP forecasts, with the use of similar methods [22,23]. One example is the PVNet model, designed to predict spatially aggregated PV production in Germany [24]. This model, based on the LRCN (Long-Term Recurrent Convolutional Network) architecture, was not only shown to predict PV energy with high accuracy, but also provided valuable insight into the dependence of energy production on different meteorological fields with respect to geographical location. Machine learning can also be an important tool in planning future installations of power plants [25]. With the use of existing PV systems, NWP forecasts, and observational data, it is possible to build an accurate model that can be used to determine favorable locations for new PV installations, even if weather measurements are not available. Machine learning techniques can also be combined with more basic statistical methods to provide location-independent, day-ahead PV production forecasts [26].

Atmospheric Physics and Processes
In recent years there has been a growing interest in machine learning methods in many aspects of NWP, from the post-processing and bias correction of model forecasts [27,28] to emulating full model physics [29]. There are three important aims to be taken into account when planning work using machine learning methods in NWP models. The first is to speed up very computationally expensive parts of the model, the second is to improve the performance of current algorithms, and the third is to emulate the existing code with machine learning models so that a model can easily run on a computer cluster with GPU accelerators. It is worth mentioning here some of the events held by leading NWP centers that focused on sharing knowledge regarding the use of machine learning in their applications, with recordings of meetings and presentations.
One of the most challenging problems with very high-resolution NWP models is related to land-cover classifications. Currently used databases are usually available only at very coarse resolution and contain numerous errors. Convolutional Neural Networks (CNN) can be used to improve them with the use of Sentinel-2 satellite data, the CORINE land-cover, and the BigEarthNet database [30]. This method not only produced a model land-cover database that outperformed the currently used one, but also allowed maps to be updated for any time of the year, which is important for regions with large seasonal variations.
Several papers were also recently published on the topic of emulating different parts of NWP models by machine learning [31][32][33][34]. Authors use either benchmark solutions, to provide reliable estimates of the examined algorithms, or observational data from special campaigns, to train models on real and accurate data. Very promising results were also obtained with NWP modules prepared especially for GPU accelerators, with speedups of up to 120 times relative to their versions on standard Central Processing Unit-based (CPU) computers, in the cases of both the Radiation Transfer Model and Aerosol Microphysics.
Another interesting aspect relates to the tuning of NWP model parameters. Currently, in every existing NWP model, there are several parameters that have to be tuned manually. Scientists usually do this by running long- or short-range experiments and comparing verification results across different configurations. A machine learning-based approach for this process has been proposed in the literature [35,36]. Various microphysics schemes, cumulus parameterizations, and shortwave and longwave radiation schemes were examined, and based on the relationship between the choice of physical processes and the resulting forecast errors, a machine learning model was built to assess WRF model uncertainty.
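The manual procedure described above, comparing verification results over several configurations, can be sketched in a few lines. This illustrates only the manual baseline, not the machine learning approach of [35,36]. The configuration names, forecast values, and the choice of root-mean-square error (RMSE) as the verification score are assumptions made for the example.

```python
import math


def rmse(forecast, observed):
    """Root-mean-square error between two equally long sequences."""
    squared = [(f - o) ** 2 for f, o in zip(forecast, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical observations and forecasts from three model configurations.
observed = [2.0, 4.0, 6.0]
forecasts = {
    "config_A": [2.5, 4.5, 6.5],
    "config_B": [2.0, 4.0, 7.0],
    "config_C": [1.0, 5.0, 6.0],
}

# Score every configuration and pick the one with the lowest error.
scores = {name: rmse(values, observed) for name, values in forecasts.items()}
best = min(scores, key=scores.get)
print(best, round(scores[best], 3))
```

With many tunable parameters the number of configurations grows quickly, which makes this exhaustive comparison expensive and motivates learning the relationship between configuration and forecast error instead.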

Parametrizations
One of the challenges in improving General Circulation Models (GCM) is related to the proper parametrization of several atmospheric processes, e.g., moist convection. One example of how to tackle this problem comes with the use of machine learning methods [37]. It was proposed that RF models be trained on the output of high-resolution atmospheric NWP models and incorporated into the GCM. It was shown that, using this technique, GCMs can run stably and accurately capture even extremes in precipitation. The RF method was used to ensure, for example, energy conservation, but the authors commented that this can also be achieved with other machine learning techniques by adjusting the field tendencies in the training process.
Interesting insight into parametrization performance was presented by Yuval and O'Gorman [38]. Consistent with O'Gorman and Dwyer [37], the RF method was used to learn from high-resolution, idealized atmospheric models, and it also led to stable forecasts in the coarse-grid model. Different approaches to the problem of using machine learning with parametrizations can be divided into three groups [39]. The first relates to the use of machine learning with observed data to develop improved individual parameterizations of features not explicitly resolved by the dynamics of the models [40][41][42][43]. The second is similar to the first group, although here the parametrization scheme is not improved but replaced completely by machine learning [31,37,38,44-59]. The third group relates to cases in which observed data are used to produce forecasts of key weather features at specific locations [60][61][62][63].

Extreme Events
Extreme meteorological events are often related to the occurrence of weather fronts. Several studies examined the climatology of fronts with the use of machine learning methods [64][65][66][67], which can help provide more objective tools, in contrast with manually drawn maps of fronts. Authors use several databases with labelled weather fronts, meteorological reanalysis, and several other methods to provide accurate models that can be used for the climatological analysis of the positions of weather fronts.
Precipitation is also often considered in studies using machine learning methods. Since there is a big difference between the accuracy with which synoptic-scale climate features and the precipitation field are predicted, a 2D Convolutional Neural Network has been proposed to develop approximators of regional precipitation and discharge extremes based on synoptic-scale predictions from general circulation models [68]. With such a method, it is possible not only to find the most reliable fields for the estimation of precipitation extremes, but also to identify important regional and seasonal differences. Machine learning methods can also be used to better predict future intensity-duration-frequency curves, which are important in terms of extreme precipitation and flooding events [69], or to estimate the trends and seasonal components of rainfall and streamflow [70][71][72] with the use of wavelet analysis.

Climate Change
The previous sections show that machine learning plays a significant role in many areas of meteorology, especially in operational weather forecasting, and that this role has grown in recent years. The question that now arises is how machine learning will contribute to the field of climate change, which is probably the most significant issue in Earth sciences in recent years. The answer is not unequivocal: among the hundreds of articles on this topic, covering global as well as regional and local aspects, works without reference to machine learning dominate. This situation is slowly changing, and the number of works using machine learning in climate change analyses has recently grown. We present here the works that are, in our opinion, the most important from the methodological and cognitive points of view. From the outset, it is worth citing a fundamental publication by 22 authors entitled 'Tackling Climate Change with Machine Learning' [73], which covers a very wide spectrum of machine learning applications in various climate change issues. It was written by researchers from renowned research centers who specialize in particular climatic issues. In three main parts, titled 'mitigation', 'adaptation' and 'meta tools', the authors provide a detailed review of the literature on specific issues of climate change and its interactions with the environment and human activities. Moreover, the work offers many recommendations for various recipients and decision makers. The more than 800 works cited in total provide an excellent source of numerous analyses and introduce the possibilities of machine learning applications in research and activities related to climate change.
An example of a somewhat similar work, but concerning the modeling of climatic conditions and climate projections, is the study by Schneider et al. (2017) [74]. Although this work is not as recent, it gives a good overview of climate modeling and the application of new data and tools, with particular attention to machine learning. A broad view of the problems of climate modeling and the application of machine learning is presented in the works of Reichstein et al. [75] and Huntingford et al. [76]. A narrower scope, concerning only selected elements of the climate system, is represented by the works of O'Gorman and Dwyer [37] and Dijkstra et al. [77]. The latter presents the advantages of using machine learning in the prediction of the El Niño phenomenon.
Several scientific publications considered the use of machine learning techniques in tackling climate change through specific applications with the potential to be examined successfully, such as teleconnection identification and the connection between climate and extreme weather. Climate change research touches many aspects of everyday life, e.g., predicting building energy use [78] or pavement condition [79] in the future climate. Studies such as these can be very important and beneficial to long-term policymaking.
Another important aspect in this field is related to agriculture. Crop [80] and wheat yields [81] were modelled with machine learning methods that outperformed classical statistical methods. In the study conducted in Australia, machine learning techniques identified increasing heat events as a major factor causing yield losses in the future.
Based on the findings in this section, we conclude that machine learning helps to improve analyses and to find links between different predictors and climate conditions across a range of problems. It can also be used to generate high-resolution data and to explore the drivers of climate change [82][83][84][85][86][87].

Discussion and Conclusions
In this article, we present a review of studies that aimed to use machine learning and artificial intelligence methods in meteorology and climatology. First, we extracted relevant information about the current studies in the field using text mining methods. Using the Google Scholar search engine, we collected 500 articles, published since 2018, related to the use of machine learning techniques in numerical weather prediction and climate analysis. Based on the created dataset, we identified the most relevant topics of currently published studies, as well as other characteristics, such as the analyzed meteorological fields, the methods used, and the countries most commonly mentioned in the abstracts of the papers. This method has several limitations that should be mentioned here. The search engine will favor publications with the phrase "machine learning" in the title or abstract, and can omit important papers that use this phrase only in the main text. In addition, since the articles were collected manually, their overall number was only 500, far fewer than the several thousand suggested in other text mining publications that search for patterns in scientific articles (in those publications, databases were already prepared for specific topics, such as COVID-19). On the other hand, every publication was checked by a specialist in the field, so unrelated papers were immediately excluded from further analysis.
In terms of the presented results, it is clear that there are wide possibilities for using the methods mentioned previously, which have recently become a very important part of atmospheric science due to their research and application potential. Their applicability to prognostic models is indisputable; machine learning methods can therefore be successfully used to analyze and determine important problems in meteorology and synoptic climatology, such as current circulation types (patterns), types of weather, weather fronts, and air masses.
In our opinion, machine learning may have a particularly significant application in synoptic meteorology and climatology. This is because in many circulation-related issues there are no unambiguous, quantitative definitions or criteria, which makes it difficult and sometimes impossible to conduct objective analyses. Such criteria can be found only for weather types; for the other elements there are usually no strict and precise definitions and no quantitative criteria or indices, and even where they exist, they are available only for selected regions on a local scale [88][89][90][91]. Therefore, machine learning can be used to objectively determine those elements, both in a supervised way when labelled data are available, and in an unsupervised way when we need to divide different features based on common characteristics. For example, the k-means clustering method can be found in many publications in which the authors intended to determine specific types of circulation, weather types, or types of dependence between different characteristics of meteorological and environmental variables [92][93][94][95][96][97][98].
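The k-means clustering mentioned above can be illustrated with a minimal sketch of the standard iterative (Lloyd) algorithm in pure Python. It is not taken from any of the cited studies. The function `kmeans`, the toy data (pairs of pressure and temperature anomalies), and the choice of initial centroids are assumptions made for the example; in practice a library implementation with several random initializations would normally be used.

```python
def kmeans(points, centroids, n_iter=20):
    """Minimal k-means: returns (centroids, labels) for tuples of equal length."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(n_iter):
        # Assignment step: give each point the index of its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(pt, centroids[k])))
            for pt in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[k])  # keep centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, labels


# Hypothetical daily (pressure anomaly in hPa, temperature anomaly in K) pairs.
days = [(-8.0, -3.0), (-7.0, -2.0), (-9.0, -4.0),
        (6.0, 2.0), (7.0, 3.0), (8.0, 4.0)]

centroids, labels = kmeans(days, centroids=[days[0], days[3]])
print(centroids, labels)
```

Each resulting cluster can then be interpreted as one "type", e.g., a circulation or weather type, with the centroid serving as its representative pattern. The number of clusters has to be chosen by the analyst, which is one of the subjective elements that remain in this otherwise objective procedure.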
It is worth mentioning here that in previous review papers from the 20th century and the beginning of the 21st century, machine learning was not often mentioned in the perspectives for future developments [99][100][101][102]. However, looking at the progress in the field of numerical meteorological analysis in recent years, this is not surprising. At the beginning of the 21st century, access to computer clusters, specialized software, and professional databases was very limited. This is also clearly visible in terms of meteorological reanalysis, which is now freely available to the research community in very high spatial and temporal resolution [103]. Although interest in using machine learning in atmospheric science has been visible since the beginning of the 1990s and earlier [104,105], these early applications were much more limited than more recent ones [106][107][108].
Even in accounts of the development of meteorology and synoptic climatology in the 21st century, it is hard to find a perspective for machine learning and artificial intelligence [90,100-105]; greater importance is placed there on downscaling and GIS methods. With that in mind, we try to answer the question about the future of machine learning in atmospheric science, and it seems that, at least in the coming years, interest will grow. The increase in available computing power and emerging new technologies, the development of and access to specialized software, and improved reanalysis will be key factors determining the use of machine learning in many studies. There are several limitations and problems that scientists can face when using machine learning techniques. One of the most obvious is related to knowledge of tools and methods. Fortunately, many institutions are now organizing workshops and seminars that are freely available online to help tackle this problem. Proper use of machine learning methods also requires some level of interdisciplinary cooperation between scientists [109].
With fast-growing interest in the use of machine learning methods in NWP and climate research, it is difficult to judge what the near future looks like. Some scientists predict that these methods will not play a significant role, while others see machine learning as a solution to almost every problem and believe that in a few years it will supersede the standard way of working with models. We decided to look at the written plans of world-leading NWP and climate consortia, such as the European Centre for Medium-Range Weather Forecasts (ECMWF) (https://www.ecmwf.int/node/19877, accessed on 10 November 2021), and agencies such as the National Oceanic and Atmospheric Administration (NOAA) (https://sciencecouncil.noaa.gov/Portals/0/Artificial%20Intelligence%20Strategic%20Plan_Final%20Signed.pdf?ver=2021-01-19-114254-380, accessed on 10 November 2021). Both institutions are highly involved in research related to the use of modern machine learning techniques, and have extensive plans for the near future that can be used as a proxy for what we can expect in the field.
Based on the previously mentioned plans and the current progress in atmospheric science, there is a clear tendency to tackle many aspects of research and operational work with machine learning techniques. Both NOAA and ECMWF have assembled groups of scientists responsible for accelerating the adoption of artificial intelligence across the whole institution and for working together with other researchers and computer companies. Several goals and milestones have been established: workshops and conferences on progress in the use of machine learning are planned, research applications that use machine learning are supposed to be transferred to operational mode faster, and artificial intelligence should be widely promoted.
Another sign that machine learning methods will be present in many components of modern NWP and climate models is the trend in new computer clusters. Recent years have seen rapid growth in the number of the world's fastest clusters equipped with GPU accelerators (https://blogs.nvidia.com/blog/2020/06/22/top500-isc-supercomputing/, accessed on 10 November 2021, and https://www.lumi-supercomputer.eu/lumi-provides-new-opportunities-for-artificial-intelligence-research/, accessed on 10 November 2021). The computer codes and overall design of NWP models, written in programming languages such as Fortran, were prepared for standard CPU machines; research into the use of machine learning to emulate some parts of the model, or even the whole model, can therefore be very beneficial for agencies and consortia in the future [110]. Other initiatives also promote the use of machine learning in NWP and climate models. The Destination Earth project, part of the European Commission's Green Deal and Digital Strategy (https://digital-strategy.ec.europa.eu/en/policies/destination-earth, accessed on 10 November 2021), with its very challenging aim of providing a very high-resolution digital twin of the Earth, will require a speedup of current NWP models and fast post-processing of hundreds of terabytes of data every day. Most probably, state-of-the-art machine learning methods will have to be implemented in future operational suites in order to achieve this goal.
Based on our knowledge and experience in this field, it is important first to properly understand the processes and relationships between meteorological and environmental variables in the analyzed problem, and then to implement a machine learning method correctly rather than use it as a black box. With proper investigation, the aforementioned use of new technologies, and cooperation between different fields, we believe that machine learning methods will be a key feature in future weather forecasting. Bias correction, ensemble forecast interpretation, better data assimilation, and the emulation of computationally costly parametrizations can help us achieve accurate, high-resolution NWP model forecasts. It is worth mentioning that, although all the methods referenced in this paper can be used successfully in many applications, some of them, for example RF, require less knowledge of machine learning and are more suitable for beginners, while DL or CNN need more experience to be used properly. We agree with [111] that artificial intelligence will be a very important technique that will help in the monitoring and forecasting of weather conditions. Independent of operational use, these methods can also be highly valuable in climate change research across spatial and temporal scales [112], although this will depend strongly on data availability, which has been improving constantly over recent years.
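As an illustration of the bias correction mentioned above, the following minimal sketch shows the supervised workflow in its simplest form: a correction is fitted on a training set of forecast and observation pairs and then evaluated on a separate testing set. Everything in it is an assumption made for illustration. The data are synthetic, the imposed model error (a 2 °C warm bias with a damped amplitude) is hypothetical, and a plain least-squares line stands in for the learning method so that the example needs only the Python standard library. In practice, a method such as RF would take the place of the linear fit, and more predictors than the raw forecast would be used.

```python
import random


def fit_linear(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(predicted, observed):
    """Root mean square error between two equally long sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return (sum(squared) / len(squared)) ** 0.5


def make_synthetic_pairs(n, seed=0):
    """Synthetic 2 m temperature pairs (hypothetical, not real data)."""
    rng = random.Random(seed)
    observed = [rng.uniform(-10.0, 30.0) for _ in range(n)]
    # Assumed model error: damped amplitude, 2 degC warm bias, random noise.
    forecast = [0.9 * o + 2.0 + rng.gauss(0.0, 1.0) for o in observed]
    return forecast, observed


forecast, observed = make_synthetic_pairs(400)

# Training set: used to learn the mapping from raw forecast to observation.
train_f, train_o = forecast[:300], observed[:300]
# Testing set: held back to evaluate the correction on unseen cases.
test_f, test_o = forecast[300:], observed[300:]

slope, intercept = fit_linear(train_f, train_o)
corrected = [slope * f + intercept for f in test_f]

rmse_raw = rmse(test_f, test_o)
rmse_corrected = rmse(corrected, test_o)
print(f"RMSE of raw forecast:       {rmse_raw:.2f} degC")
print(f"RMSE of corrected forecast: {rmse_corrected:.2f} degC")
```

Evaluating on data that were not used for fitting is the important design choice here: it guards against the black-box use cautioned against above, because an apparent improvement on the training set alone says little about how well the correction transfers to new forecasts.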