Calibration of Ensemble Forecasts for Extreme Rainfall Using Bayesian Model Averaging: A Comparative Review of Gaussian and Gamma Distributions

Faidah, Defi Yusti; Darmawan, Gumgum; Tantular, Bertho; Immanuel, Febrianggi Caesar; Mohamed, Norizan

doi:10.3390/su18126121

Open Access Review


by Defi Yusti Faidah ¹,*, Gumgum Darmawan ¹, Bertho Tantular ¹, Febrianggi Caesar Immanuel ² and Norizan Mohamed ³

¹ Department of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Padjadjaran, Bandung 45363, Indonesia

² Bachelor Programme of Statistics, Faculty of Mathematics and Natural Sciences, Universitas Padjadjaran, Sumedang 45363, Indonesia

³ Faculty of Computer Science and Mathematics, Universiti Malaysia Terengganu, Kuala Nerus 21030, Terengganu, Malaysia

* Author to whom correspondence should be addressed.

Sustainability 2026, 18(12), 6121; https://doi.org/10.3390/su18126121 (registering DOI)

Submission received: 27 April 2026 / Revised: 4 June 2026 / Accepted: 10 June 2026 / Published: 15 June 2026

(This article belongs to the Section Hazards and Sustainability)


Abstract

Global climate change is increasing the frequency of extreme rainfall events, which raises the risk of hydrometeorological disasters. Accurate and reliable rainfall predictions are required to support disaster mitigation and early warning systems. Although ensemble forecasting is widely used to model atmospheric uncertainty, raw ensemble output often exhibits bias and dispersion errors. Post-processing techniques are therefore needed to improve the quality of probabilistic predictions, and the most commonly used calibration method is Bayesian Model Averaging (BMA). This study conducted a scoping review of peer-reviewed papers on ensemble forecast calibration using BMA, based on the PRISMA-ScR framework. It also presents a bibliometric analysis involving co-authorship networks of productive authors and bibliometric maps with clustered terms. A total of 35 relevant articles were identified from 49 screened publications. The bibliometric analysis revealed that “ensemble forecasting” and “Gaussian distribution” are the most dominant terms in the research network, indicating that Gaussian-based approaches remain more widely used in ensemble forecast calibration studies. In contrast, studies explicitly applying Gamma-based approaches are still relatively limited despite their relevance for modeling asymmetric rainfall data. These findings suggest that the choice of probability distribution in BMA-based calibration frameworks plays an important role in improving forecast reliability and the representation of uncertainty in rainfall prediction. Developing and integrating more suitable distributions into BMA models, including those from the Extreme Value Theory (EVT) framework, has strong potential to enhance probabilistic calibration for asymmetric rainfall data and is expected to improve the accuracy and reliability of extreme rainfall predictions. The findings of this study contribute to the development of early warning systems for hydrometeorological disasters and support the achievement of the Sustainable Development Goals (SDGs).

Keywords:

Bayesian Model Averaging; BMA–Gamma; BMA–Gaussian; ensemble forecasting; extreme rainfall

1. Introduction

Increasingly intense global climate change has led to a marked rise in the frequency and severity of extreme rainfall events in many regions worldwide. Over the past two decades, the occurrence of extreme precipitation has escalated significantly, resulting in heightened risks of hydrometeorological disasters such as flooding, landslides, and flash floods that threaten human life, infrastructure, economic stability, and environmental sustainability [1,2,3]. Climate change poses a critical challenge to sustainable development, as highlighted in the United Nations Sustainable Development Goals (SDGs) framework.

Among the 17 SDGs, Goal 13 (Climate Action) explicitly emphasizes the urgency of enhancing adaptive capacity and implementing effective mitigation strategies to reduce climate-related hazards [4,5]. However, the impacts of climate change extend beyond SDG 13, influencing several interconnected goals. For instance, extreme rainfall directly affects the achievement of SDG 11 (Sustainable Cities and Communities) through the development of resilient infrastructure and improved disaster early warning systems. Similarly, it has important implications for SDG 6 (Clean Water and Sanitation), particularly in terms of sustainable water resource management and the mitigation of hydrological risks [5,6,7].

Accurate rainfall forecasting is therefore crucial because it directly affects various sectors, particularly the economy and transportation safety. It also benefits hydrological disaster mitigation and efficient long-term planning for risk management [8,9]. One way to adapt to climate change is to develop modeling and forecasting methods that improve the accuracy and reliability of rainfall predictions, particularly in support of disaster mitigation, early warning systems, and climate resilience planning [10]. Rainfall forecasting remains challenging because atmospheric systems are highly dynamic, nonlinear, and uncertain. Predicting extreme rainfall is considerably more difficult still, because such events generally exhibit asymmetric, skewed, and heavy-tailed characteristics with relatively low occurrence probabilities [11,12,13].

Uncertainty is an important component of climate modeling, including the modeling used for rainfall forecasting [14]. In dynamic weather and climate forecasting models, uncertainty stems from incomplete information about the initial conditions of the climate system and from the limitations of the forecasting model in simulating that system [15]. A frequently encountered problem is how to quantify this uncertainty within a model while accounting for its sources. One solution is to combine forecasting models, an approach known as ensemble forecasting [16]. The fundamental rationale for combining forecasting models is that each model has a different ability to capture data patterns [17].

In dynamic weather and climate modeling, the concept of combining forecasting models is known as the Ensemble Prediction System (EPS). An EPS is generated by combining several single models with random perturbations. Ensemble forecasting can quantify the uncertainty of future atmospheric conditions [18,19,20] and provide probabilistic information on forecast reliability, particularly for extreme rainfall prediction. The forecast from each model is weighted according to the model’s contribution, with better-performing models receiving higher weights. Ensemble forecasts are probabilistic and thus capture elements of uncertainty [21,22,23]. In other words, the forecast is expressed as a Probability Density Function (PDF) rather than as a single deterministic value. Single-model forecasting, by contrast, produces a single forecast value from only one model, ignoring other models that may be significant and provide accurate forecasts [24].

Ensemble forecasting techniques have been widely applied by meteorological and geophysical research institutions worldwide to predict weather and climate conditions, particularly rainfall, at both global and regional scales. One EPS currently being developed is the North American Multi-Model Ensemble (NMME) in the United States. NMME output tends to be under-dispersive or over-dispersive, and both under-dispersion and over-dispersion lead to less reliable forecasts [16,25,26,27,28,29]. Post-processing of NMME output is therefore required. Post-processing aims to improve the quality of dynamic weather and climate forecasts, particularly of rainfall, by calibrating the ensemble forecast output from multiple models into a single forecast that represents all models [30,31].

Bayesian Model Averaging (BMA) is a popular and frequently used method for ensemble forecast calibration [32]. BMA combines ensemble forecasting models to produce a predictive PDF, thereby improving forecasting accuracy compared with a single model [32,33]. The predictive PDF is obtained by averaging the conditional predictive densities of the individual models, with each model’s posterior probability as its weight. Ensemble forecast calibration is performed to obtain accurate forecast values [34,35,36].

BMA has a conditional PDF component of the ensemble members that follows a specific distribution pattern. Raftery et al. [32] developed a BMA method with a conditional PDF component following a Gaussian distribution, known as BMA–Gaussian. The results of ensemble forecast calibration using BMA–Gaussian were superior to the initial forecasts before calibration. BMA–Gaussian produced point forecasts with lower RMSE values than ensemble member forecasts. Furthermore, early studies on BMA–Gaussian demonstrated the effectiveness of probabilistic ensemble forecast calibration in improving forecast reliability and reducing prediction errors in weather and climate applications [37,38,39,40,41,42,43,44]. These studies played a crucial role in establishing BMA as a widely used probabilistic post-processing approach in hydrometeorology. However, most of the early BMA–Gaussian approaches were primarily developed for general weather forecasting and often showed limitations in representing the asymmetric and thick-tailed characteristics of extreme rainfall data.

Although the BMA–Gaussian approach is widely used, the Gaussian distribution is often inadequate to represent the statistical characteristics of rainfall data, especially extreme rainfall events. Rainfall observations are generally non-negative, asymmetric, highly skewed, and often characterized by thick tails. Consequently, Gaussian-based approaches are less able to represent extreme rainfall events, thus reducing the reliability of forecasts in hydrometeorological applications [45].

Subsequently, Sloughter et al. [45] extended the BMA model by replacing the Gaussian conditional PDF component with a Gamma distribution, an approach called BMA–Gamma. This method was applied to calibrate rainfall forecasts in North America, where it provided better calibration results. Sloughter et al. [45] also applied the BMA–Gamma model to calibrate wind speed data in the North American Pacific Northwest, and the calibrated BMA–Gamma predictive PDFs outperformed the raw ensemble forecast. In addition, Liu and Xie [46] used BMA–Gamma to calibrate rainfall forecasts at 43 stations in China’s Huaihe Basin. They stated that further analysis of extreme rainfall is still needed because such events are characterized by asymmetric, highly variable, and heavy-tailed distributions with relatively low occurrence probabilities, making them difficult to model with standard probabilistic approaches. Extreme rainfall is also strongly influenced by climate variability and complex atmospheric dynamics, which further increase uncertainty in rainfall prediction. This review therefore aims to provide a scientific basis and research directions for the development of more reliable probabilistic approaches in extreme rainfall forecasting, which may support hydrometeorological disaster early warning systems and climate change adaptation strategies.

Recent studies have shown that Bayesian Model Averaging (BMA) is increasingly being used in hydrometeorological forecasting to improve prediction reliability and uncertainty quantification. Yadav and Yadav [47] applied BMA to TIGGE ensemble rainfall forecasts and demonstrated improved forecast reliability and uncertainty representation. Yang et al. [48] used BMA to combine multiple rainfall products and reported improved rainfall estimation performance under various climate conditions. Amjad et al. [49] emphasized the importance of probabilistic post-processing approaches to improve reliability and uncertainty quantification in extreme rainfall forecasting. Meanwhile, Getu et al. [50] demonstrated that BMA can improve the reliability of rainfall predictions compared to individual climate models under future climate change scenarios.

Despite the growing number of studies on ensemble forecasting and Bayesian Model Averaging (BMA), several important research gaps remain. Most previous studies focused primarily on methodological applications and improvements in forecast accuracy. Furthermore, the integration of Extreme Value Theory (EVT)-based distributions into ensemble forecast calibration remains relatively underexplored, despite their theoretical relevance for modeling extreme hydrometeorological events. This study presents a literature review based on the PRISMA-ScR framework of the application of Bayesian Model Averaging in ensemble calibration for extreme rainfall forecasting, focusing on the use of Gaussian and Gamma distribution components. This study aims to identify research developments, methodological characteristics, and research opportunities related to probabilistic ensemble forecast calibration for extreme rainfall.

In recent years, researchers have applied ensemble post-processing, particularly BMA, to the forecasting of hydrometeorological variables. A thorough understanding of the methodology of the previous literature is crucial for identifying how probabilistic forecasts for extreme climate events are calibrated. Table 1 summarizes selected studies related to Bayesian Model Averaging (BMA) for ensemble forecast calibration, particularly studies relevant to rainfall forecasting and the development of Gaussian- and Gamma-based probabilistic calibration approaches. Several studies involving other hydrometeorological variables were also included because of their methodological contributions to the development of BMA-based calibration frameworks.

2. Materials and Methods

This study combines a systematic literature review (SLR) and bibliometric analysis to examine the development of research on the calibration of ensemble forecasts for extreme rainfall using Bayesian Model Averaging (BMA). Specifically, the focus is on Gaussian and Gamma distributions.

2.1. Systematic Literature Review Process

This study employed a systematic literature review approach based on the PRISMA-ScR framework to provide a transparent, systematic, and reproducible synthesis of the literature. The review comprised four sequential stages. The first stage was the identification of the research topic, which determined the research focus and the relevant keywords. The second stage was the search phase, which consisted of a preliminary search and a final search. The preliminary search was conducted in the Scopus database using keywords A, B, and C. The final search then combined all keywords and returned 64 documents. The search strings used three devices: quotation marks (“…”) for exact phrases, the logical operator OR for keyword synonyms, and the logical operator AND to combine multiple concepts. Parentheses were used to group keywords. The keyword groups used in this study are shown in Table 2.
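The search-string construction described above can be sketched as follows. This is a minimal illustration only: the keyword groups are hypothetical placeholders (the actual groups are those in Table 2), and the helper names `or_group` and `build_query` are invented for this sketch.

```python
def or_group(terms):
    """Quote each phrase and join the synonyms with OR inside parentheses."""
    quoted = [f'"{term}"' for term in terms]
    return "(" + " OR ".join(quoted) + ")"


def build_query(groups):
    """Combine the parenthesized concept groups with AND."""
    return " AND ".join(or_group(group) for group in groups)


# Hypothetical keyword groups A, B, and C (placeholders, not the Table 2 groups).
group_a = ["ensemble forecast", "ensemble prediction"]
group_b = ["Bayesian model averaging", "BMA"]
group_c = ["rainfall", "precipitation"]

query = build_query([group_a, group_b, group_c])
print(query)
```

Keeping each concept in its own parenthesized group makes the operator precedence explicit, so the string does not depend on how a given database orders AND and OR.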

The search results were saved in CSV format for further analysis. The third stage was the screening phase, in which documents that were not scientific articles were removed. Additional screening criteria were applied to ensure the relevance and quality of the selected literature. Only peer-reviewed journal articles published in English and directly related to ensemble forecast calibration, Bayesian Model Averaging (BMA), probabilistic forecasting, and extreme rainfall were included in this review. Conference abstracts, book chapters, duplicated records, inaccessible full-text articles, and studies not directly related to hydrometeorological forecasting or ensemble calibration were excluded.

Following the screening process, the number of documents was reduced to 49 articles. The final stage was the selection phase, which was carried out in two steps: title and abstract selection, and full-text review. During the full-text review, articles were further evaluated based on methodological relevance, calibration approaches, probability distributions used, and their contribution to probabilistic ensemble forecasting for extreme rainfall. This process was conducted manually to ensure that only publications relevant to the research topic were retained. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) flow chart in Figure 1 provides a structured overview of the review process and reports the number of documents at each stage.
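The screening criteria above can be expressed as a simple filter. The sketch below is illustrative and does not reproduce the authors' workflow: the `Record` fields, the `screen` function, and the toy records are all assumptions, and the relevance flag stands in for the manual judgment described in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    title: str
    doc_type: str              # e.g. "Article", "Conference Paper", "Book Chapter"
    language: str
    full_text_available: bool
    on_topic: bool             # stands in for the manual relevance judgment


def screen(records):
    """Drop duplicates, then keep accessible, on-topic English journal articles."""
    seen_titles = set()
    kept = []
    for rec in records:
        key = rec.title.strip().lower()
        if key in seen_titles:
            continue  # duplicated record
        seen_titles.add(key)
        if (rec.doc_type == "Article" and rec.language == "English"
                and rec.full_text_available and rec.on_topic):
            kept.append(rec)
    return kept


# Toy records for illustration only; these are not the reviewed papers.
records = [
    Record("Paper A", "Article", "English", True, True),
    Record("paper a", "Article", "English", True, True),          # duplicate
    Record("Paper B", "Conference Paper", "English", True, True),
    Record("Paper C", "Article", "English", False, True),
    Record("Paper D", "Article", "English", True, True),
]
kept = screen(records)
print(len(records), "records screened,", len(kept), "retained")
```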

2.2. Bibliometric Analysis

A bibliometric analysis was conducted on the filtered database of 49 potential papers. The data were analyzed using the Biblioshiny interface in RStudio (version 2026.01.2, Posit Software, PBC, Boston, MA, USA), a web-based front end for the Bibliometrix package that is built with the Shiny package. Biblioshiny facilitates bibliometric analysis of data from various sources, such as Scopus, Google Scholar, PubMed, and Web of Science. In this study, the Scopus database was selected as the primary data source because it provides comprehensive bibliographic information, including article titles, authors, publication years, journal names, publishers, volumes, and citation data. The search results were exported in CSV format for further analysis. The bibliometric analysis includes the following: publication count analysis, citation analysis, identification of top journals and publishers, authorship analysis (including co-authorship), and keyword analysis based on titles and abstracts. Additionally, bibliometric maps were visualized using VOSviewer to identify relationships between topics, authors, and research trends. The results of the analysis are presented in tables, graphs, and visual maps and are explained in detail in the Results and Discussion sections.

2.3. Methods of Calibration of Ensemble Forecasts

Ensemble forecasts tend to be under-dispersive or over-dispersive, which makes them less reliable. This problem can be addressed by ensemble forecast calibration [53,54]. The principle of ensemble forecast calibration is to obtain a predictive PDF of the observed weather variable. One method used for ensemble forecast calibration is BMA, which was introduced for this purpose by Raftery et al. [32]. BMA is a statistical method for combining several forecasting models, including numerical weather prediction models, climate models, ARIMA models, and other probabilistic forecasting approaches. In addition, BMA calibrates the combined model so that reliable forecasts are obtained. The BMA forecast is a weighted sum of the forecasts of the individual models. Suppose that $f = (f_{1}, f_{2}, \dots, f_{K})$ is the set of forecasts obtained from $K$ different models and $y$ is the quantity to be forecast; the BMA predictive model is as follows:

$$p(y \mid f_{1}, f_{2}, \dots, f_{K}) = \sum_{k=1}^{K} w_{k}\, g_{k}(y \mid f_{k}) \tag{1}$$

Here, $p(y \mid f_{1}, f_{2}, \dots, f_{K})$ is the conditional predictive probability density function of the observed variable $y$ given the ensemble forecasts $f = (f_{1}, f_{2}, \dots, f_{K})$, and $w_{k}$ is the weight of forecast $k$, interpreted as the probability that forecast $k$ is the best one. The weights $w_{k}$ are non-negative and sum to one; they reflect the relative contribution of the individual models to the predictions during a given period. The term $g_{k}(y \mid f_{k})$ is the conditional PDF of $y$ given the forecast $f_{k}$ of model $k$, on the condition that model $k$ gives the best forecast. This PDF follows a particular distribution [32]; one choice is the Gaussian distribution with mean $\beta_{0,k} + \beta_{1,k} f_{k}$ and standard deviation $\sigma$, so that

$$y \mid f_{k} \sim N\left(\beta_{0,k} + \beta_{1,k} f_{k},\ \sigma^{2}\right) \tag{2}$$
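As an illustration of Equations (1) and (2), the following sketch evaluates a BMA–Gaussian predictive density and its mean for a three-member ensemble. All numerical values (forecasts, weights, bias coefficients, and σ) are made up for the example; in practice they would be estimated from training data, which is not shown here. The function names are hypothetical.

```python
import math


def normal_pdf(y, mean, sd):
    """Density of N(mean, sd^2) evaluated at y."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def bma_gaussian_pdf(y, forecasts, weights, b0, b1, sigma):
    """Equation (1) with the Gaussian components of Equation (2)."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to one")
    return sum(
        w * normal_pdf(y, a + b * f, sigma)
        for w, a, b, f in zip(weights, b0, b1, forecasts)
    )


def bma_mean(forecasts, weights, b0, b1):
    """Mean of the mixture: weighted sum of the bias-corrected forecasts."""
    return sum(w * (a + b * f) for w, a, b, f in zip(weights, b0, b1, forecasts))


# Illustrative values only (assumed, not taken from any reviewed study).
forecasts = [10.0, 12.0, 15.0]   # member forecasts f_k
weights = [0.5, 0.3, 0.2]        # BMA weights w_k
b0 = [0.0, 0.0, 0.0]             # intercepts beta_{0,k}
b1 = [1.0, 1.0, 1.0]             # slopes beta_{1,k}
sigma = 2.0                      # common standard deviation

print("predictive density at y = 11:",
      bma_gaussian_pdf(11.0, forecasts, weights, b0, b1, sigma))
print("BMA predictive mean:", bma_mean(forecasts, weights, b0, b1))
```

Note that the Gaussian mixture assigns positive density to negative values of y, which is one reason the Gaussian form is a poor fit for non-negative rainfall amounts.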

Besides the Gaussian distribution, the Gamma distribution is also often used. The Gamma distribution has two parameters, where $\alpha$ is the shape parameter and $\beta$ is the scale parameter. Its mean is $\mu = \alpha \beta$ and its variance is $\sigma^{2} = \alpha \beta^{2}$ [38]. If the conditional PDF of $y$ given $f_{k}$, on the condition that model $k$ gives the best forecast among the individual models, follows a Gamma distribution with shape $\alpha_{k}$ and scale $\beta_{k}$, then the component PDF used in the BMA predictive PDF is given by Equation (3).

$$g_{k}(y \mid f_{k}) = \frac{1}{\beta_{k}^{\alpha_{k}}\, \Gamma(\alpha_{k})}\, y^{\alpha_{k} - 1} \exp\left(-\frac{y}{\beta_{k}}\right) \tag{3}$$
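The Gamma component in Equation (3) can be sketched in the same way. Inverting the relations μ = αβ and σ² = αβ² gives α = μ²/σ² and β = σ²/μ, which the helper below uses to obtain shape and scale from a mean and a variance. The numbers are illustrative assumptions, the function names are hypothetical, and setting the density to zero for y ≤ 0 is a simplification made for this sketch.

```python
import math


def gamma_params_from_moments(mean, variance):
    """Invert mu = alpha * beta and sigma^2 = alpha * beta^2."""
    if mean <= 0 or variance <= 0:
        raise ValueError("mean and variance must be positive")
    shape = mean * mean / variance   # alpha
    scale = variance / mean          # beta
    return shape, scale


def gamma_pdf(y, shape, scale):
    """Equation (3), computed on the log scale for numerical stability."""
    if y <= 0:
        return 0.0  # simplification: no density at or below zero
    log_pdf = ((shape - 1.0) * math.log(y) - y / scale
               - shape * math.log(scale) - math.lgamma(shape))
    return math.exp(log_pdf)


def bma_gamma_pdf(y, shapes, scales, weights):
    """Equation (1) with Gamma components in place of Gaussian ones."""
    return sum(w * gamma_pdf(y, a, b) for w, a, b in zip(weights, shapes, scales))


# Illustrative component means and variances (assumed values).
moments = [(6.0, 12.0), (8.0, 16.0), (12.0, 36.0)]
weights = [0.5, 0.3, 0.2]
shapes, scales = zip(*(gamma_params_from_moments(m, v) for m, v in moments))

print("shape/scale pairs:", list(zip(shapes, scales)))
print("predictive density at y = 5:", bma_gamma_pdf(5.0, shapes, scales, weights))
```

Unlike the Gaussian mixture, every component here is supported on positive values only and is right-skewed, which matches the non-negative, asymmetric character of rainfall amounts discussed in the Introduction.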

3. Results

3.1. Summary of Publications

This summary outlines research on the calibration of ensemble forecasts for extreme rainfall using Bayesian Model Averaging (BMA). The dataset comprises 49 papers published in 29 journals, with an average of three publications per year and 221 citations per paper. A total of 239 authors contributed to these studies. Of the 49 potential papers, 35 were identified as relevant after the selection process.

Figure 2 illustrates the number of papers published per year from 2008 to 2026 (through April 2026). No papers were published in 2009–2011, after which output increased steadily. A sharp increase occurred in 2018, and publication output peaked in 2021 before declining from 2024 to 2026. The papers in this bibliometric analysis were published in 29 journals, and Table 3 shows the 10 journals with the most papers on this topic.

The database also records the number of citations per paper. Figure 3 illustrates the most cited papers in the database whose titles and abstracts contain keywords related to the calibration of ensemble forecasts for extreme rainfall using Bayesian Model Averaging (BMA).

Figure 4 illustrates the average total citations per year for research in the calibration of ensemble forecasts for extreme rainfall using Bayesian Model Averaging (BMA). The trend exhibits significant fluctuations over the analyzed period, with a prominent peak observed in 2013. Following this peak, the average citation rate experienced a decline, with subsequent minor increases noted in 2016 and 2019.

Several journals have published multiple documents related to the calibration of ensemble forecasts for extreme rainfall using Bayesian Model Averaging (BMA). Consequently, Figure 5 illustrates the journals alongside their impact metrics (H-Index) specifically for publications within these topics.
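For readers unfamiliar with the metric, the H-Index of a source is the largest number h such that h of its papers have at least h citations each. A minimal sketch of that definition follows; the citation counts are invented for illustration and are not taken from the dataset.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


# Invented citation counts for the papers of one hypothetical journal.
example_citations = [25, 8, 5, 3, 3]
print("H-Index:", h_index(example_citations))
```

Because the index here is computed only over papers within the topic, it reflects a journal's or author's impact in this dataset rather than its overall standing.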

3.2. Authorship Analysis

This section describes authorship in the calibration of ensemble forecasts for extreme rainfall using Bayesian Model Averaging (BMA). Here, the bibliometrix package in R was utilized to analyze authorship. There are 239 authors within this dataset, two of whom are single authors. Figure 6 illustrates the overall number of authors per paper.

Figure 6 shows that the authorship pattern in research on the calibration of ensemble forecasts for extreme rainfall using Bayesian Model Averaging (BMA) is dominated by small to medium-sized collaborative groups. Papers with three authors represent the largest proportion, with 14 publications, followed by papers with four authors (nine publications), two authors (seven publications), and six authors (six publications). This indicates that most studies in this field are conducted by compact, collaborative teams with limited numbers of researchers. The figure also shows that single-author publications are relatively limited, with only two papers identified in the dataset. This finding suggests that research on ensemble forecast calibration often requires interdisciplinary collaboration, particularly among meteorology, hydrology, statistics, and climate modeling. In addition, two papers appear as notable outliers, consisting of 26 and 51 authors, respectively. These large-authorship papers may represent large-scale multi-institutional or international collaborations involving operational forecasting systems, large datasets, and diverse scientific expertise.

Several authors have contributed more than one document to this field. Table 4 shows the most productive authors who have published documents related to the calibration of ensemble forecasts for extreme rainfall using Bayesian Model Averaging (BMA), along with their impact metrics (H-Index) for publications within this topic.

These top 15 authors produced over 65% of all papers in this database related to the calibration of ensemble forecasts for extreme rainfall using Bayesian Model Averaging (BMA).

Figure 7 illustrates the publication output of the top 15 authors who have contributed to the calibration of ensemble forecasts for extreme rainfall using Bayesian Model Averaging (BMA). The blue circles within the figure represent an author’s document count for a given year. Specifically, the size of the circle indicates the volume of papers, while the intensity of the color (darkness) signifies the total citations per year.

Figure 8 illustrates the institutional affiliations of authors who have published documents concerning the calibration of ensemble forecasts for extreme rainfall using Bayesian Model Averaging (BMA). It is observed that this topic is predominantly discussed by authors affiliated with departments of statistics, mathematics, and civil and environmental engineering.

Figure 9 illustrates the most cited countries related to the calibration of ensemble forecasts for extreme rainfall using Bayesian Model Averaging (BMA). The United States holds a clearly predominant position in terms of total citations. Germany follows as the second most cited country, while the United Kingdom and the Netherlands have comparable citation counts.

Figure 10 illustrates the co-authorship network for the calibration of ensemble forecasts for extreme rainfall using Bayesian Model Averaging (BMA), showing the main authors and their co-authors.

The figure illustrates ten different clusters, where connected lines represent collaboration between authors, and different colors are used to group those who collaborate more frequently with each other than with those outside their group. The size of each node is proportional to the author’s total publications within this network. Furthermore, the distance between nodes reflects the strength of their relationship; authors positioned closely together or connected by thicker lines demonstrate a more intensive collaborative bond.

Figure 11 illustrates four clusters that represent different research focuses in ensemble forecast calibration. Raftery and Gneiting are the primary links between these clusters, which reflects Raftery’s role in developing the BMA approach to ensemble calibration. This underscores their crucial role in developing and disseminating BMA as a method for ensemble forecast calibration, and it shows that these authors have made substantial contributions to connecting diverse research groups, particularly in developing and applying BMA. This suggests that BMA is the dominant approach, connecting prediction methodologies with the calibration process.

Figure 12 illustrates the distribution of countries of origin of corresponding authors based on the number of publications and collaboration patterns. This distribution is further divided into single-country publications (SCPs) and multi-country publications (MCPs). Research contributions are generally dominated by a few key countries, primarily the United States and China. The United States tops the list with the most publications and a significant proportion of international collaborations (MCPs). This indicates that research in the United States is productive and has a high level of global collaboration. China also shows high productivity but is dominated by national publications (SCPs), indicating that most research is conducted domestically without significant international participation. European countries such as Germany and the Netherlands show a mix of national and international collaborations, albeit on a smaller scale. In contrast, countries such as Australia, Canada, and the United Kingdom are generally dominated by single-country publications, indicating relatively low international participation. It is noteworthy that some countries, such as Austria, only appear in the context of international collaborations.
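The SCP/MCP split can be sketched as follows, under the assumption that a paper counts as an SCP when all of its authors are affiliated with one country and as an MCP otherwise, and that each paper is attributed to the corresponding author's country. The function name and the toy records are hypothetical and do not reproduce the counts in Figure 12.

```python
from collections import Counter


def classify_collaboration(papers):
    """Count SCP and MCP papers per corresponding-author country.

    Each paper is a (corresponding_country, [author countries]) pair.
    """
    counts = {}
    for corresponding_country, author_countries in papers:
        kind = "SCP" if len(set(author_countries)) == 1 else "MCP"
        counts.setdefault(corresponding_country, Counter())[kind] += 1
    return counts


# Toy records for illustration only.
papers = [
    ("USA", ["USA", "USA"]),
    ("USA", ["USA", "Germany"]),
    ("China", ["China", "China", "China"]),
]
for country, tally in classify_collaboration(papers).items():
    print(country, "SCP:", tally["SCP"], "MCP:", tally["MCP"])
```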

3.3. Research Theme Mapping

This section discusses research topics related to ensemble forecast calibration for extreme rainfall using Bayesian Model Averaging (BMA), with a focus on the use of Gaussian and Gamma distributions. A co-occurrence network was created using VOSviewer version 1.6.20 (Centre for Science and Technology Studies (CWTS), Leiden University, Leiden, The Netherlands) based on terms from article titles and abstracts. Then, terms with similar meanings were clustered to improve the consistency of the analysis. The relationship between the terms in this network is represented by the distance between nodes. A smaller distance indicates a stronger association or higher co-occurrence frequency. VOSviewer groups these terms into several clusters, each represented by a different color. Each cluster represents a specific research subfield. This visualization allows one to identify thematic structures and relationships within the literature, providing a comprehensive overview of research developments.
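The idea behind a co-occurrence network can be sketched in a few lines: count how many documents each term appears in, discard terms below a minimum occurrence threshold, and count how often each remaining pair of terms appears in the same document. This is a simplified illustration of the counting step only, not VOSviewer's layout or clustering algorithm; the function name, the threshold value, and the toy documents are assumptions.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(docs, min_occurrences=2):
    """Return per-term document counts, retained terms, and pair link counts."""
    term_counts = Counter(term for terms in docs for term in set(terms))
    kept = {term for term, n in term_counts.items() if n >= min_occurrences}
    links = Counter()
    for terms in docs:
        present = sorted(set(terms) & kept)
        for a, b in combinations(present, 2):
            links[(a, b)] += 1
    return term_counts, kept, links


# Toy documents for illustration only (terms per title/abstract).
docs = [
    ["ensemble forecasting", "gaussian distribution", "calibration"],
    ["ensemble forecasting", "gaussian distribution"],
    ["ensemble forecasting", "gamma distribution", "calibration"],
]
term_counts, kept, links = cooccurrence(docs, min_occurrences=2)
print("retained terms:", sorted(kept))
for pair, weight in sorted(links.items()):
    print(pair, weight)
```

In this toy input, a term that occurs in only one document falls below the threshold and therefore never becomes a node, even though it is present in the underlying data.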

Figure 13 presents a keyword co-occurrence network depicting the thematic structure of research related to the calibration of ensemble forecasts for extreme rainfall using Bayesian Model Averaging (BMA). Each node represents a term extracted from titles and abstracts, and the node size indicates its frequency of occurrence. Relationships between terms are indicated by connecting lines, and closer proximity between nodes reflects a stronger degree of relatedness. Different colors indicate the grouping of terms into research clusters, and the network is divided into five main clusters. The term “ensemble forecasting” is the most dominant node and is located at the center of the network, indicating that it is a core topic in the literature and acts as a bridge between various research subtopics.

Several closely related terms, including singular and plural forms such as “numerical model” and “numerical models”, were retained in the bibliometric network visualization because they appear in different co-occurrence contexts in the analyzed publications. In this context, “numerical model” generally refers to a specific numerical approach or model, whereas “numerical models” is more commonly used in discussions involving multiple models, model comparisons, or ensemble modeling. These contextual differences result in the terms appearing as distinct nodes in the bibliometric network. Furthermore, the larger node size of “numerical model” compared to “numerical models” indicates that the singular form appears more frequently in the analyzed dataset. The presence of connecting lines between these nodes indicates a strong conceptual relationship and frequent co-occurrence within similar research themes. Therefore, the visualization not only highlights the frequency of term occurrence but also illustrates the relational structure among keywords in the bibliometric dataset.

Table 5 lists the keywords in each of the five clusters. The range of keywords in each cluster suggests that opportunities for further research on this topic remain wide open. The association of a selected keyword with other keywords can be obtained by hovering over the selected word in the network. For example, selecting the keyword “ensemble forecasting” from Cluster 1 highlights the terms directly linked to it, as shown in Figure 14.

Furthermore, keywords related to precipitation forecasting can be seen in Figure 15. Figure 14 and Figure 15 illustrate the keyword co-occurrence network for ensemble forecast calibration research, generated using VOSviewer. The visualization shows that “ensemble forecasting” serves as the central and most dominant node, closely connected with terms such as “Gaussian distribution” and “statistical post-processing.” The larger node size of “Gaussian distribution” indicates that Gaussian-based approaches are more frequently discussed in the analyzed publications. The term “gamma distribution” does not appear as an independent node because it did not satisfy the minimum occurrence threshold applied in the visualization process. This finding suggests that studies explicitly referring to the Gamma distribution or BMA–Gamma remain relatively limited in the analyzed dataset, despite their conceptual relevance for probabilistic rainfall calibration. Nevertheless, the Gamma distribution remains important in rainfall forecasting because it is suitable for modeling positively skewed precipitation data and for improving probabilistic calibration performance.
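To make the role of the minimum occurrence threshold concrete, the following minimal Python sketch counts term occurrences and pairwise co-occurrences over a toy set of keyword lists and drops terms that fall below a threshold. The function name `cooccurrence_network`, the toy records, and the threshold value of 2 are illustrative assumptions. The sketch does not reproduce VOSviewer's normalization, layout, or clustering, and the toy records are not drawn from the 35 reviewed articles.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_network(docs, min_occurrences=2):
    """Return (nodes, edges) for terms that meet the occurrence threshold.

    docs: list of keyword lists, one list per publication (hypothetical input).
    nodes: term -> number of publications containing the term (node size).
    edges: (term_a, term_b) -> number of publications containing both terms.
    """
    # Count each term once per publication.
    occurrences = Counter(term for doc in docs for term in set(doc))
    kept = {term for term, n in occurrences.items() if n >= min_occurrences}

    # Count co-occurrences only among the retained terms.
    edges = Counter()
    for doc in docs:
        for a, b in combinations(sorted(set(doc) & kept), 2):
            edges[(a, b)] += 1

    nodes = {term: occurrences[term] for term in kept}
    return nodes, edges


# Toy records, invented for illustration only.
docs = [
    ["ensemble forecasting", "gaussian distribution", "calibration"],
    ["ensemble forecasting", "gaussian distribution"],
    ["ensemble forecasting", "gamma distribution", "calibration"],
]
nodes, edges = cooccurrence_network(docs, min_occurrences=2)
print(nodes)
print(dict(edges))
```

In this toy example, “gamma distribution” occurs in only one record and therefore does not become a node, which mirrors the mechanism described above for the analyzed dataset.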

Figure 15 further indicates that statistical approaches based on the Gaussian distribution continue to dominate precipitation forecasting studies. This dominance highlights opportunities for future research to explore alternative probabilistic distributions, particularly Gamma-based approaches, which may provide better representation of asymmetric rainfall characteristics. These findings reinforce the importance of comparative studies between Gaussian and Gamma distributions in improving precipitation forecast performance.

An equally important keyword is calibration. Figure 16 displays a keyword co-occurrence network with “calibration” as one of the central nodes connecting various themes in ensemble forecasting research. Although the ensemble forecasting node is more dominant in size, calibration’s central position in the network demonstrates its strategic role as a link between predictive models, statistical approaches, and practical applications. Calibration is a key component in ensemble forecasting systems, connecting numerical model output with statistical and Bayesian approaches to produce more accurate and reliable predictions.

The bibliometrix package for the R software (version 2026.01.2, Posit Software, PBC, Boston, MA, USA) can analyze the conceptual structure of themes, for example, through thematic maps and thematic evolution. Related keywords are then grouped using a thematic map, as shown in Figure 17, which is divided into four quadrants. Each colored circle represents a cluster, and the circle size indicates the number of papers. Each circle displays a maximum of three related keywords, although more can be included.

Figure 17 presents a thematic map depicting the development and relevance of research topics in the fields of ensemble forecasting and precipitation forecasting. The horizontal axis indicates the level of relevance (centrality), while the vertical axis indicates the level of development (density). Based on this division, research topics are divided into four main quadrants: motor themes, basic themes, niche themes, and emerging or declining themes. The motor themes quadrant includes topics such as weather forecasting, prediction, and precipitation forecasting, as well as hydrological modeling, machine learning, and streamflow. This position indicates that these topics have a high level of development and relevance, thus serving as key drivers of current research. This indicates that research related to rainfall prediction and its integration with modern approaches such as machine learning is a primary focus in the literature.

The basic themes quadrant is filled with topics such as ensemble forecasting, probability, and numerical models. These topics have high relevance but a relatively lower level of development, thus serving as the conceptual foundation for research. In other words, ensemble forecasting and numerical models are the primary foundation that supports the development of other topics, including rainfall calibration and prediction.

The niche themes quadrant includes topics such as forecasting, extreme events, and continuous ranked probability scores (CRPS). These topics have a high level of development but limited relevance, tending to be specific and in-depth within certain subfields, such as evaluating the performance of probabilistic models.

Meanwhile, the emerging or declining themes quadrant includes topics such as uncertainty analysis, uncertainty, and Gaussian distribution. This position indicates that these topics have a relatively low level of development and relevance and can therefore be interpreted as emerging or declining themes. In the context of this research, the presence of Gaussian distribution in this quadrant indicates a shift in attention toward other, more appropriate distribution approaches, such as the Gamma distribution for asymmetric rainfall data.

Furthermore, topics such as Bayesian analysis, precipitation (climatology), and probability distributions are in the middle area, indicating that these topics play a transitional role in connecting basic themes with more developed ones. This reinforces the importance of the Bayesian approach in the calibration and modeling of probability distributions.
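The quadrant logic of a thematic map can be illustrated with a short Python sketch that assigns each theme to a quadrant from its centrality and density. The function name `classify_themes`, the use of medians as quadrant boundaries, and all numerical values are assumptions made for illustration; they are not the values underlying Figure 17, and the boundary rule used by the software may differ.

```python
from statistics import median


def classify_themes(themes):
    """Map each theme name to a quadrant label.

    themes: dict of name -> (centrality, density), hypothetical values.
    """
    # Assumption: the quadrant boundaries are the medians of both measures.
    c_cut = median(c for c, _ in themes.values())
    d_cut = median(d for _, d in themes.values())

    labels = {}
    for name, (centrality, density) in themes.items():
        if centrality >= c_cut and density >= d_cut:
            labels[name] = "motor"                  # relevant and developed
        elif centrality >= c_cut:
            labels[name] = "basic"                  # relevant, less developed
        elif density >= d_cut:
            labels[name] = "niche"                  # developed, less relevant
        else:
            labels[name] = "emerging or declining"  # low on both measures
    return labels


# Invented coordinates chosen only to reproduce the placement described in the text.
themes = {
    "precipitation forecasting": (0.9, 0.8),
    "ensemble forecasting": (0.8, 0.2),
    "CRPS": (0.2, 0.9),
    "Gaussian distribution": (0.1, 0.1),
}
labels = classify_themes(themes)
print(labels)
```

Because the boundaries are relative to the other themes in the same map, a theme's quadrant reflects its position within the analyzed dataset rather than an absolute level of maturity.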

Overall, this thematic map demonstrates that research in this area is moving beyond the foundations of numerical models and ensemble forecasting toward more advanced approaches such as integrating machine learning and improving precipitation forecast accuracy. Furthermore, there are significant research opportunities in the development of alternative distribution-based calibration methods, particularly to address the limitations of Gaussian distributions in modeling extreme rainfall.

History and continuity are essential for analyzing thematic development in ensemble and precipitation forecasting research. Figure 18 illustrates the thematic evolution over three periods: 2008–2019, 2020–2022, and 2023–2026. It demonstrates the dynamics and interactions between themes based on the co-occurrence of keywords. In the early period, weather forecasting dominated, with early approaches such as Gaussian methods, uncertainty analysis, and hydrological modeling. In the second period, the focus shifted toward probabilistic approaches, characterized by the emergence of probability theory, ensemble forecasting, statistical post-processing, and precipitation forecasting. This indicates an increasing focus on calibration and improving forecast quality. In the most recent period, research has evolved toward more complex integration methods such as machine learning and more specific applications such as flood forecasting. The continuity of the probabilistic theme demonstrates that the probabilistic approach is increasingly becoming a primary framework in modern weather forecasting. Overall, this thematic development illustrates a shift from a deterministic to a calibrated probabilistic approach, as well as the integration of modern methods to improve the accuracy and reliability of precipitation forecasts.

4. Discussion

Bibliometric analysis results indicate that research related to the calibration of ensemble forecasts for extreme rainfall using Bayesian Model Averaging (BMA) is still in its infancy, with a relatively limited number of publications, namely 35 selected articles from an initial collection of 49 documents. However, Figure 2 shows a significant increase in publication trends since 2018, peaking in 2021, indicating growing attention to the importance of calibration in extreme rainfall prediction.

In terms of publication sources, the dominance of journals such as Water Resources Research, Monthly Weather Review, and Journal of Hydrology indicates that research in this field lies at the intersection of meteorology, hydrology, and statistics. This aligns with the characteristics of the problem, which requires integration between numerical weather models and statistical approaches to improve prediction accuracy.

Authorship analysis shows no single author to be dominant, with the highest contribution reaching only three publications. This indicates that research remains dispersed across various research groups and is not yet concentrated in a single scientific community. However, analysis of collaboration networks indicates that there are connections between research groups, although they remain relatively fragmented. Interestingly, in the co-authorship network in Figure 11, Raftery and Gneiting serve as key links between clusters, confirming their crucial role in the development of the BMA method as a leading approach to ensemble forecast calibration.

Conceptual structure analysis through the co-occurrence network identified five main clusters. These findings indicate that research in this field is structured around three main components: numerical models, calibration methods, and applications to extreme rainfall. According to Figure 13, the central position of ensemble forecasting and calibration within the network confirms that calibration is a key component in improving prediction reliability.

Thematic evolution analysis reinforces these findings by showing a shift from early deterministic, model-based approaches to probabilistic, calibration-based approaches. In the early period, research was dominated by weather forecasting and methods such as Gaussian modeling and uncertainty analysis. Subsequently, research shifted toward statistical post-processing and precipitation forecasting, indicating an increased focus on prediction quality. In the more recent period, the integration of methods such as machine learning and applications like flood forecasting has become increasingly prominent.

The development of Bayesian Model Averaging (BMA) methods in ensemble forecast calibration has shown a clear methodological evolution over time. Early studies primarily focused on the BMA–Gaussian approach introduced by Raftery et al. [32], which is widely applied in weather forecasting due to its ability to improve forecast reliability and reduce prediction errors. However, subsequent studies identified limitations of the Gaussian distribution in representing the non-negative, asymmetric, and heavy-tailed characteristics of rainfall. These limitations prompted the development of alternative probabilistic approaches, particularly the BMA–Gamma model proposed by Sloughter et al. [52], which demonstrated superior performance for rainfall forecasting applications. In addition to methodological development, validation techniques also play an important role in assessing the effectiveness of probabilistic calibration approaches. Several studies reviewed in this paper commonly employed the Continuous Ranked Probability Score (CRPS) to evaluate calibration quality and forecast reliability. CRPS is widely used in probabilistic forecasting because it simultaneously measures prediction accuracy and the ability of calibration methods to represent forecast uncertainty. Therefore, the use of CRPS provides important insight into the effectiveness of BMA-based calibration methods, particularly for extreme rainfall prediction.
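As a minimal sketch of how a BMA predictive distribution and its CRPS fit together, the Python example below treats the BMA forecast as a weighted mixture of gamma components, one per ensemble member, and estimates the CRPS from samples using the identity CRPS = E|X − y| − 0.5 E|X − X′|. The function names, weights, component means and variances, and the observed value are all hypothetical. The sketch also omits the parameter estimation step and the separate probability of zero rainfall that precipitation-oriented BMA formulations typically include, so it should not be read as the model of any study reviewed here.

```python
import random


def sample_bma_gamma(weights, means, variances, n, rng):
    """Draw n samples from a weighted mixture of gamma components."""
    samples = []
    for _ in range(n):
        # Pick a component with probability equal to its BMA weight.
        k = rng.choices(range(len(weights)), weights=weights)[0]
        # Convert mean and variance to gamma shape and scale.
        shape = means[k] ** 2 / variances[k]
        scale = variances[k] / means[k]
        samples.append(rng.gammavariate(shape, scale))
    return samples


def crps_from_samples(samples, obs):
    """Sample-based estimate of CRPS = E|X - y| - 0.5 * E|X - X'|."""
    n = len(samples)
    xs = sorted(samples)
    term1 = sum(abs(x - obs) for x in xs) / n
    # Mean absolute pairwise difference computed from the sorted sample.
    term2 = 2.0 * sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1)) / (n * n)
    return term1 - 0.5 * term2


# Hypothetical three-member ensemble (rainfall in mm) and observation.
rng = random.Random(42)
weights = [0.5, 0.3, 0.2]
means = [12.0, 20.0, 45.0]
variances = [30.0, 60.0, 400.0]
samples = sample_bma_gamma(weights, means, variances, n=5000, rng=rng)
print(crps_from_samples(samples, obs=38.0))
```

A lower CRPS indicates a predictive distribution that is both sharper and better centered on the observation. Replacing the gamma draw with a Gaussian draw would give a corresponding BMA–Gaussian sketch, although such a mixture would assign probability to negative rainfall.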

Recent developments indicate that the application of Bayesian Model Averaging (BMA) has expanded beyond rainfall forecasting to include a variety of other climate variables, such as air temperature, wind speed, humidity, and seasonal climate prediction. Furthermore, several recent studies have begun to explore integrating probabilistic calibration approaches with Extreme Value Theory (EVT) and machine learning to improve uncertainty quantification and the reliability of extreme event predictions. Although developments in this direction remain relatively limited, these studies indicate promising opportunities for the future development of Bayesian Model Averaging (BMA) in probabilistic ensemble forecasting. Overall, these findings indicate that research in this field has undergone a transformation from basic model development to more complex and applicable approaches. However, significant gaps remain, particularly in the selection of optimal probability distributions for calibrating extreme rainfall. The declining dominance of the Gaussian distribution opens up opportunities to explore alternative distributions, such as the Gamma distribution, which better suits the characteristics of asymmetric data.

Based on these research findings, several important directions for future research are highlighted. First, a more in-depth study is needed on modeling extreme rainfall using distribution approaches specifically designed for extreme data, such as distributions within the Extreme Value Theory (EVT) framework, including the Generalized Extreme Value (GEV) and Generalized Pareto Distribution (GPD). This is crucial given that most previous research has been dominated by the Gaussian distribution, which is less able to represent the skewed and heavy-tailed characteristics of extreme rainfall data.
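For orientation, the Generalized Extreme Value distribution function and the corresponding return level can be written in a few lines of Python. The formulas follow the standard parameterization with location μ, scale σ, and shape ξ; the function names and the parameter values in the example are hypothetical and are not fitted to any rainfall record.

```python
import math


def gev_cdf(z, mu, sigma, xi):
    """GEV distribution function G(z) = exp(-[1 + xi (z - mu) / sigma]^(-1/xi))."""
    if xi == 0.0:
        # Gumbel limit of the GEV family.
        return math.exp(-math.exp(-(z - mu) / sigma))
    s = 1.0 + xi * (z - mu) / sigma
    if s <= 0.0:
        # z lies outside the support: below the lower bound if xi > 0,
        # above the upper bound if xi < 0.
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-s ** (-1.0 / xi))


def gev_return_level(p, mu, sigma, xi):
    """Level exceeded with probability p, i.e. the (1 - p) quantile."""
    t = -math.log(1.0 - p)
    if xi == 0.0:
        return mu - sigma * math.log(t)
    return mu + (sigma / xi) * (t ** (-xi) - 1.0)


# Hypothetical parameters for annual maximum daily rainfall (mm).
mu, sigma = 50.0, 15.0
for xi in (0.0, 0.1, 0.2):
    print(xi, gev_return_level(0.01, mu, sigma, xi))
```

With the location and scale held fixed, a larger positive shape parameter yields a higher return level for the same exceedance probability, which is the heavy-tailed behavior that a Gaussian component cannot represent.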

Second, the integration of these extreme distribution approaches into the Bayesian Model Averaging (BMA) framework remains relatively limited and represents a significant research opportunity. The development of a BMA model based on extreme distributions is expected to improve prediction accuracy and reliability, particularly in capturing high-intensity, low-probability events. Although this review primarily focuses on Bayesian Model Averaging (BMA), recent studies have also demonstrated the growing potential of Artificial Intelligence (AI) and machine learning approaches in rainfall forecasting. Nevertheless, BMA remains one of the most widely used probabilistic post-processing approaches for ensemble forecast calibration because of its capability to quantify predictive uncertainty and forecast reliability. Third, future research could also combine the BMA approach with machine learning methods to enhance the model’s ability to capture nonlinear patterns and the complexity of hydrometeorological data. This integration has the potential to produce a prediction model that is more adaptive and robust to varying climate conditions.

This scoping review has several limitations. First, the literature search was limited to the Scopus database and peer-reviewed articles published in English, which may have excluded relevant studies indexed in other databases or published in different languages. Second, the number of studies specifically discussing Bayesian Model Averaging (BMA) for extreme rainfall calibration remains relatively limited. Third, this review focused primarily on Gaussian- and Gamma-based BMA approaches; therefore, other probabilistic calibration approaches and distributions may not have been comprehensively covered. In addition, no formal critical appraisal of the included studies was conducted because the main objective of this scoping review was to map and summarize the existing literature rather than evaluate methodological quality.

5. Conclusions

This study presents a scoping review and bibliometric analysis of studies related to the calibration of ensemble forecasts of extreme rainfall using Bayesian Model Averaging (BMA). The analysis shows that this field has experienced significant development since 2018, with contributions from various disciplines, particularly meteorology, hydrology, and statistics. The conceptual structure of the study indicates that the prediction system consists of three main components: ensemble forecasts, statistical and Bayesian-based calibration methods, and applications to extreme rainfall. This review provides a comparative and bibliometric perspective on Gaussian- and Gamma-based BMA approaches for extreme rainfall calibration, a perspective that remains relatively limited in the existing literature.

According to the key findings, calibration is essential for improving the reliability of ensemble forecasts, and probabilistic approaches, especially BMA, are the most common methods. The thematic map and thematic evolution results reveal a shift from a deterministic to a probabilistic approach. However, Gaussian distributions continue to dominate even though they are inadequate for representing the asymmetric and heavy-tailed characteristics of extreme rainfall data. These findings indicate that the selection of appropriate probabilistic distributions plays an important role in improving forecast reliability, uncertainty representation, and probabilistic prediction performance for extreme rainfall events.

Furthermore, this study identified a significant scientific gap in the literature, namely the limited exploration of alternative distributions, particularly the Gamma distribution and distributions within the Extreme Value Theory framework, in the BMA-based calibration process. Therefore, further research should focus on developing and comparing probability distributions that are more appropriate for extreme data within the Bayesian Model Averaging framework. The integration of EVT-based distributions into BMA calibration frameworks has strong potential to improve the ability of ensemble forecasting systems to capture asymmetric and extreme precipitation characteristics. This approach is expected to improve the accuracy and reliability of models in capturing extreme rainfall events, thus making a significant methodological and practical contribution to modern hydrometeorological prediction systems.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/su18126121/s1, Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) checklist [65].

Author Contributions

Conceptualization, D.Y.F.; methodology, D.Y.F. and G.D.; software, F.C.I.; validation, B.T.; formal analysis, D.Y.F.; investigation, N.M.; resources, B.T. and N.M.; data curation, B.T. and G.D.; writing—original draft preparation, D.Y.F.; writing—review and editing, B.T.; visualization, D.Y.F. and F.C.I.; supervision, D.Y.F.; project administration, N.M.; funding acquisition, D.Y.F. All authors have read and agreed to the published version of the manuscript.

Funding

This publication is funded by Universitas Padjadjaran through the Indonesian Endowment Fund for Education (LPDP) on behalf of the Indonesian Ministry of Higher Education, Science and Technology, and managed under the EQUITY Program (Contract No. 4303/B3/DT.03.08/2025 and 3927/UN6.RKT/HK.07.00/2025).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data is contained within the article or Supplementary Materials.

Acknowledgments

This publication was supported by Universitas Padjadjaran through the EQUITY Program Article Review Scheme (Contract No. 5670/UN6.3.1/PT.00/2025). The publication fee was funded by Universitas Padjadjaran through the Indonesian Endowment Fund for Education (LPDP) on behalf of the Indonesian Ministry of Higher Education, Science, and Technology and was administered under the EQUITY Program (Contract No. 4303/B3/DT.03.08/2025 and 3927/UN6.RKT/HK.07.00/2025).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

SDGs	Sustainable Development Goals
BMA	Bayesian Model Averaging
EPS	Ensemble Prediction System
NMME	North-American Multi-Model Ensemble
SLR	Systematic Literature Review
CRPS	Continuous Ranked Probability Score
EVT	Extreme Value Theory
GEV	Generalized Extreme Value

References

1. Maity, R. Hydrological Alterations Under Climate Change: Global-Scale Challenges and Opportunities for Adaptation and Sustainable Development. In Civil Engineering Innovations for Sustainable Communities with Net Zero Targets; CRC Press: Boca Raton, FL, USA, 2024; pp. 102–128.
2. National Academies of Sciences, Engineering, and Medicine; Division on Earth and Life Studies; Board on Atmospheric Sciences and Climate; Committee on Extreme Weather Events and Climate Change Attribution. Attribution of Extreme Weather Events in the Context of Climate Change; National Academies Press: Washington, DC, USA, 2016.
3. Muralidharan, K.; Pathak, P. Navigating the Deluge: Understanding India’s Diverse Flood Landscape. Resonance 2025, 30, 323–340.
4. Filho, W.L.; Wall, T.; Salvia, A.L.; Dinis, M.A.P.; Mifsud, M. The central role of climate action in achieving the United Nations’ Sustainable Development Goals. Sci. Rep. 2023, 13, 20582.
5. Hsieh, Y.L.; Yeh, S.C. The trends of major issues connecting climate change and the sustainable development goals. Discov. Sustain. 2024, 5, 31.
6. Dixit, A.; Madhav, S.; Mishra, R.; Srivastav, A.L.; Garg, P. Impact of climate change on water resources, challenges and mitigation strategies to achieve sustainable development goals. Arab. J. Geosci. 2022, 15, 1296.
7. Batalini de Macedo, M.; Nóbrega Gomes Júnior, M.; Pereira de Oliveira, T.R.; Giacomoni, M.H.; Imani, M.; Zhang, K.; Ferreira do Lagoa, C.A.; Mendiondo, E.M. Low Impact Development practices in the context of United Nations Sustainable Development Goals: A new concept, lessons learned and challenges. Crit. Rev. Environ. Sci. Technol. 2022, 52, 2538–2581.
8. Rahman, M.R.; Rahman, A.; Saha, S.K. Technology in Hydro-Geological. In Advanced GIScience in Hydro-Geological Hazards: Applications, Modelling and Management; Springer: Berlin/Heidelberg, Germany, 2025; p. 3.
9. Coles, S. An Introduction to Statistical Modeling of Extreme Values; Springer Series in Statistics; Springer: Berlin/Heidelberg, Germany, 2001.
10. Ghahramani, A.; Howden, S.M.; del Prado, A.; Thomas, D.T.; Moore, A.D.; Ji, B.; Ates, S. Climate change impact, adaptation, and mitigation in temperate grazing systems: A review. Sustainability 2019, 11, 7224.
11. Makridakis, S.; Hibon, M. The M3-Competition: Results, Conclusions and Implications. Int. J. Forecast. 2000, 16, 451–476.
12. Liu, Z.; Zhu, Z.; Gao, J.; Xu, C. Forecast methods for time series data: A survey. IEEE Access 2021, 9, 91896–91912.
13. Zou, H.; Yang, Y. Combining time series models for forecasting. Int. J. Forecast. 2004, 20, 69–84.
14. Kuswanto, H.; Rahadiyuza, D.; Gunawan, D. Probabilistic precipitation forecast in (Indonesia) using NMME models: Case study on dry climate region. In Advances in Sustainable and Environmental Hydrology, Hydrogeology, Hydrochemistry and Water Resources: Proceedings of the 1st Springer Conference of the Arabian Journal of Geosciences (CAJG-1); Springer International Publishing: Berlin/Heidelberg, Germany, 2019.
15. Wu, Y.; Xue, W. Data-driven weather forecasting and climate modeling from the perspective of development. Atmosphere 2024, 15, 689.
16. Bülte, C.; Horat, N.; Quinting, J.; Lerch, S. Uncertainty quantification for data-driven weather models. Artif. Intell. Earth Syst. 2026, 5, 240049.
17. Sheikh, M.R.; Coulibaly, P. Review of recent developments in hydrologic forecast merging techniques. Water 2024, 16, 301.
18. Zhang, X.; Srinivasan, R.; Bosch, D. Calibration and uncertainty analysis of the SWAT model using Genetic Algorithms and Bayesian Model Averaging. J. Hydrol. 2009, 374, 307–317.
19. Peng, X.; Zheng, W.; Zhang, D.; Liu, Y.; Lu, D.; Lin, L. A Novel Probabilistic Wind Speed Forecasting Based on Combination of the Adaptive Ensemble of On-Line Sequential ORELM (Outlier Robust Extreme Learning Machine) and TVMCF (Time-Varying Mixture Copula Function). Energy Convers. Manag. 2017, 138, 587–602.
20. Wang, S.; Zhang, N.; Wu, L.; Wang, Y. Wind Speed Forecasting Based On The Hybrid Ensemble Empirical Mode Decomposition And GA-BP Neural Network Method. Renew. Energy 2016, 94, 629–636.
21. Blanc, S.M.; Setzer, T. When to Choose the Simple Average in Forecast Combination. J. Bus. Res. 2016, 69, 3951–3962.
22. AlKandari, M.; Ahmad, I. Solar power generation forecasting using ensemble approach based on deep learning and statistical methods. Appl. Comput. Inform. 2024, 20, 231–250.
23. Mahesh, A.; Collins, W.; Bonev, B.; Brenowitz, N.; Cohen, Y.; Harrington, P.; Kashinath, K.; Kurth, T.; North, J.; O’Brien, T.; et al. Huge ensembles part II: Properties of a huge ensemble of hindcasts generated with spherical Fourier neural operators. arXiv 2024, arXiv:2408.01581.
24. Mallick, T.; Macfarlane, J.; Balaprakash, P. Uncertainty quantification for traffic forecasting using deep-ensemble-based spatiotemporal graph neural networks. IEEE Trans. Intell. Transp. Syst. 2024, 25, 9141–9152.
25. Price, I.; Sanchez-Gonzalez, A.; Alet, F.; Andersson, T.R.; El-Kadi, A.; Masters, D.; Ewalds, T.; Stott, J.; Mohamed, S.; Battaglia, P.; et al. Probabilistic weather forecasting with machine learning. Nature 2025, 637, 84–90.
26. Wang, Y.; Xu, H.; Zou, R.; Zhang, F.; Hu, Q. Dynamic non-constraint ensemble model for probabilistic wind power and wind speed forecasting. Renew. Sustain. Energy Rev. 2024, 204, 114781.
27. Vrugt, J.A.; Diks, C.G.H.; Clark, M.P. Ensemble Bayesian Model Averaging Using Markov Chain Monte Carlo Sampling. Environ. Fluid Mech. 2008, 8, 579–595.
28. Sun, L.; Lan, Y.; Sun, X.; Liang, X.; Wang, J.; Su, Y.; He, Y.; Xia, D. Deterministic forecasting and probabilistic post-processing of short-term wind speed using statistical methods. J. Geophys. Res. Atmos. 2024, 129, e2023JD040134.
29. Becker, E.; Kirtman, B.P.; Min, D. Initialized Seasonal Prediction with the NCAR Models in the North American Multimodel Ensemble (NMME). Weather Forecast. 2025, 40, 889–900.
30. Ross, A.C.; Stock, C.A.; Koul, V.; Delworth, T.L.; Lu, F.; Wittenberg, A.; Alexander, M.A. Dynamically downscaled seasonal ocean forecasts for North American East Coast ecosystems. Ocean Sci. 2024, 20, 1631–1656.
31. Candille, G. The Multiensemble Approach: The NAEFS Example. Mon. Weather Rev. 2009, 137, 1655–1665.
32. Raftery, A.E.; Gneiting, T.; Balabdaoui, F.; Polakowski, M. Using Bayesian Model Averaging to Calibrate Forecast Ensembles. Mon. Weather Rev. 2005, 133, 1155–1174.
33. Gneiting, T.; Raftery, A.E. Weather Forecasting with Ensemble Methods. Science 2005, 310, 248–249.
34. Zhang, T.; Liang, Z.; Bi, C.; Wang, J.; Hu, Y.; Li, B. Statistical post-Processing for precipitation forecast through deep learning coupling large-scale and local-scale spatiotemporal information. Water Resour. Manag. 2025, 39, 145–160.
35. Meng, H.; Di, Z.; Zhang, W.; Sun, H.; Tian, X.; Wang, X.; Xie, M.; Li, Y. Spatiotemporal Analyses of High-Resolution Precipitation Ensemble Simulations in the Chinese Mainland Based on Quantile Mapping (QM) Bias Correction and Bayesian Model Averaging (BMA) Methods for CMIP6 Models. Atmosphere 2025, 16, 1133.
36. Zhao, Y.; Luo, S.; Cai, J.; Li, Z.; Zhang, M. Monthly precipitation prediction based on the CEEMDAN-BMA model. Water Resour. Manag. 2024, 38, 5661–5681.
37. Ji, L.; Zhi, X.; Luo, Q.; Ji, Y. Hierarchical multimodel ensemble probabilistic forecasts for precipitation over East Asia. Meteorol. Appl. 2025, 32, e70035.
38. Zhang, X.; Song, S.; Guo, T. Nonlinear segmental runoff ensemble prediction model using BMA. Water Resour. Manag. 2024, 38, 3429–3446.
39. Cui, Z.; Guo, S.; Chen, H.; Liu, D.; Zhou, Y.; Xu, C.Y. Quantifying and reducing flood forecast uncertainty by the CHUP-BMA method. Hydrol. Earth Syst. Sci. 2024, 28, 2809–2829.
40. Bao, L.; Gneiting, T.; Grimit, E.P.; Guttorp, P.; Raftery, A.E. Bias Correction and Bayesian Model Averaging for Ensemble Forecasts of Surface Wind Direction. Mon. Weather Rev. 2010, 138, 1811–1821.
41. Soltanzadeh, I.; Azadi, M.; Vakili, G.A. Using Bayesian Model Averaging (BMA) to Calibrate Probabilistic Surface Temperature Forecasts over Iran. Ann. Geophys. 2011, 29, 1295–1303.
42. Kim, C.; Suh, M.S. Prospects of Using Bayesian Model Averaging for The Calibration of One-Month Forecasts of Surface Air Temperature Over South Korea. Asia-Pac. J. Atmos. Sci. 2013, 49, 301–311.
43. Duan, Q.; Ajami, N.K.; Gao, X.; Sorooshian, S. Multi-model ensemble hydrologic prediction using Bayesian model averaging. Adv. Water Resour. 2007, 30, 1371–1386.
44. Wilson, L.J.; Beauregard, S.; Raftery, A.E.; Verret, R. Calibrated Surface Temperature Forecasts from The Canadian Ensemble Prediction System Using Bayesian Model Averaging. Mon. Weather Rev. 2007, 135, 1364–1385.
45. Sloughter, J.M.; Gneiting, T.; Raftery, A.E. Probabilistic Wind Speed Forecasting Using Ensembles and Bayesian Model Averaging. J. Am. Stat. Assoc. 2010, 105, 25–35.
46. Liu, J.; Xie, Z. BMA Probabilistic Quantitative Precipitation Forecasting over the Huaihe Basin Using TIGGE Multimodel Ensemble Forecasts. Mon. Weather Rev. 2014, 142, 1542–1555.
47. Yadav, R.; Yadav, S.M. Calibration of TIGGE ensemble precipitation forecasts using Bayesian model averaging for a semi-arid river basin. Acta Geophys. 2026, 74, 114.
48. Yang, Y.; Chen, R.; Lu, X.; Mao, W.; Liu, Z.; Wang, X. Bayesian Model Averaging Method for Merging Multiple Precipitation Products over the Arid Region of Northwest China. Atmosphere 2026, 17, 94.
49. Amjad, N.; Ismail, M.; Ali, Z. Advancing future drought characterization: A two-phase Bayesian model averaging approach for GCM ensemble calibration. Appl. Geomat. 2026, 18, 7.
50. Getu, L.A.; Sándor, S.; Zoltán, T.; Addis, H.K. Bayesian Model Averaging approach to Predict Future Rainfall Erosivity in Gumara-Maksegnit Watershed, Ethiopia. Environ. Sustain. Indic. 2025, 28, 101012.
51. Javanshiri, Z.; Fathi, M.; Mohammadi, S.A. Comparison of the BMA and EMOS statistical methods for probabilistic quantitative precipitation forecasting. Meteorol. Appl. 2021, 28, e1974.
52. Sloughter, J.M.; Gneiting, T.; Raftery, A.E.; Fraley, C. Probabilistic Quantitative Precipitation Forecasting Using Bayesian Model Averaging. Mon. Weather Rev. 2007, 135, 3209–3220.
53. Dirkson, A.; Buehner, M. Are we misdiagnosing ensemble forecast reliability? On the insufficiency of spread–error and rank-based reliability metrics. Q. J. R. Meteorol. Soc. 2026, e70186.
54. Wang, J.; Chen, J.; Chen, F.; Deng, G.; Liang, C. Diagnostic Evaluation of Spread–Skill Relationships for Convection-Permitting Ensemble Prediction System. Meteorol. Appl. 2026, 33, e70166.
55. Hrachowitz, M.; Savenije, H.H.G.; Blöschl, G.; McDonnell, J.J.; Sivapalan, M.; Pomeroy, J.W.; Arheimer, B.; Blume, T.; Clark, M.P.; Ehret, U.; et al. A decade of Predictions in Ungauged Basins (PUB)-a review. Hydrol. Sci. J. 2013, 58, 1198–1255.
56. Vereecken, H.; Schnepf, A.; Hopmans, J.W.; Javaux, M.; Or, D.; Roose, T.; Vanderborght, J.; Young, M.H.; Amelung, W.; Aitkenhead, M.; et al. Modeling soil processes: Review, key challenges, and new perspectives. Vadose Zone J. 2016, 15, 1–57.
57. Babaeian, E.; Sadeghi, M.; Jones, S.B.; Montzka, C.; Vereecken, H.; Tuller, M. Ground, Proximal, and Satellite Remote Sensing of Soil Moisture. Rev. Geophys. 2019, 57, 530–616.
58. Li, W.; Duan, Q.; Miao, C.; Ye, A.; Gong, W.; Di, Z. A review on statistical postprocessing methods for hydrometeorological ensemble forecasting. Wiley Interdiscip. Rev. Water 2017, 4, e1246.
59. Madadgar, S.; Moradkhani, H. Improved Bayesian multimodeling: Integration of copulas and Bayesian model averaging. Water Resour. Res. 2014, 50, 9586–9603.
60. Madadgar, S.; Moradkhani, H.; Garen, D. Towards improved post-processing of hydrologic forecast ensembles. Hydrol. Process. 2014, 28, 104–122.
61. Sigrist, F.; Künsch, H.R.; Stahel, W.A. Stochastic partial differential equation based modelling of large space-time data sets. J. R. Stat. Soc. Ser. B Stat. Methodol. 2015, 77, 3–33.
62. Madadgar, S.; AghaKouchak, A.; Shukla, S.; Wood, A.W.; Cheng, L.; Hsu, K.L.; Svoboda, M. A hybrid statistical-dynamical framework for meteorological drought prediction: Application to the southwestern United States. Water Resour. Res. 2016, 52, 5095–5110.
63. Berrocal, V.J.; Raftery, A.E.; Gneiting, T. Probabilistic quantitative precipitation field forecasting using a two-stage spatial model. Ann. Appl. Stat. 2008, 2, 1170–1193.
Scheuerer, M.; Möller, D. Probabilistic wind speed forecasting on a grid based on ensemble model output statistics. Ann. Appl. Stat. 2015, 9, 1328–1349. [Google Scholar] [CrossRef]
Tricco, A.C.; Lillie, E.; Zarin, W.; O’Brien, K.K.; Colquhoun, H.; Levac, D.; Moher, D.; Peters, M.D.J.; Horsley, T.; Weeks, L.; et al. PRISMA Extension for Scoping Reviews (PRISMA-ScR): Checklist and Explanation. Ann. Intern. Med. 2018, 169, 467–473.

Figure 1. Systematic literature process.

Figure 2. The number of papers on the calibration of ensemble forecasts for extreme rainfall using Bayesian Model Averaging (BMA) published between 2008 and 2026. The symbol shown in the figure represents the Bibliometrix logo.

Figure 3. The most cited papers whose titles and abstracts contain keywords related to the calibration of ensemble forecasts for extreme rainfall using Bayesian Model Averaging (BMA) [55,56,57,58,59,60,61,62,63,64].

Figure 4. Average total citations per year for papers whose titles and abstracts contain keywords related to the calibration of ensemble forecasts for extreme rainfall using Bayesian Model Averaging (BMA).

Figure 5. Journals’ impact metrics (H-Index) for publications related to the calibration of ensemble forecasts for extreme rainfall using Bayesian Model Averaging (BMA).

Figure 6. Number of authors per paper on the calibration of ensemble forecasts for extreme rainfall using Bayesian Model Averaging (BMA).

Figure 7. Authors’ production over time in published papers related to the calibration of ensemble forecasts for extreme rainfall using Bayesian Model Averaging (BMA) between 2008 and 2026.

Figure 8. Authors’ institutions for published papers related to the calibration of ensemble forecasts for extreme rainfall using Bayesian Model Averaging (BMA) between 2008 and 2026.

Figure 9. The most cited countries related to the calibration of ensemble forecasts for extreme rainfall using Bayesian Model Averaging (BMA).

Figure 10. Collaboration network between authors working on the calibration of ensemble forecasts for extreme rainfall using Bayesian Model Averaging (BMA) published between 2008 and 2026.

Figure 11. Co-authorship network with clusters by authors on calibration of ensemble forecasts using Bayesian Model Averaging.

Figure 12. The productivity and collaboration of corresponding authors from different countries.

Figure 13. Co-occurrence network of terms on ensemble forecasting.

Figure 14. Ensemble forecasting title and abstract network terms.

Figure 15. Precipitation forecast title and abstract network terms.

Figure 16. Calibration title and abstract network terms.

Figure 17. Themes quadrant of ensemble forecasting and precipitation forecasting.

Figure 18. Calibration ensemble forecast title and abstract network terms.

Table 1. Selected studies related to BMA model research for ensemble forecast calibration.

No	Researcher	Description	Conclusion
1	Raftery et al. [32]	The BMA–Gaussian model is applied to calibrate the ensemble forecast of rainfall.	This study shows that the BMA–Gaussian distribution can improve the reliability of ensemble forecasts and reduce prediction errors compared to the raw ensemble forecasts. However, the Gaussian distribution is less effective in representing the asymmetric, heavy-tailed characteristics of extreme rainfall events.
2	Yadav and Yadav [47]	Bayesian Model Averaging (BMA) was applied to calibrate TIGGE ensemble precipitation forecasts in a semi-arid river basin.	The study demonstrated that BMA improved forecast reliability and uncertainty representation compared with raw ensemble forecasts. However, the study primarily focused on calibration performance and offered limited discussion of BMA’s ability to capture extreme rainfall events, heavy-tailed rainfall distributions or skewed rainfall distributions.
3	Yang et al. [48]	Bayesian Model Averaging (BMA) was used to combine multiple rainfall products under various climate conditions.	The study reported improved rainfall estimation performance and enhanced prediction reliability across different climate conditions. However, the study provided limited discussion regarding extreme rainfall characteristics and the representation of heavy-tailed rainfall distributions.
4	Javanshiri et al. [51]	BMA–Gaussian was applied to calibrate ensemble forecasts on rainfall and surface temperature data.	The study showed that BMA–Gaussian produced better calibration performance compared to raw ensemble forecasts.
5	Getu et al. [50]	Applied Bayesian Model Averaging (BMA) to combine multiple climate models for predicting future rainfall erosivity under climate change scenarios.	The study demonstrated that BMA improved rainfall prediction reliability and reduced uncertainty compared to individual climate models. However, the study primarily focused on rainfall erosivity prediction and provided limited discussion regarding the capability of BMA in capturing extreme rainfall characteristics.
6	Zhang et al. [38]	The BMA–Truncated Gaussian model is applied to calibrate the ensemble forecast on wind speed data.	The study demonstrated that the BMA–Truncated Gaussian approach produced better calibration performance than the standard BMA–Gaussian model for non-negative wind speed data.
7	Sloughter et al. [45]	Use of the BMA–Gamma model to calibrate ensemble forecasts on rainfall.	The study demonstrated that BMA–Gamma produced better calibration performance than the BMA–Gaussian approach for non-negative and skewed rainfall data. However, the parameter estimation and bias correction procedures still relied on Gaussian-based assumptions and linear regression approaches, which may limit the model’s ability to fully represent extreme rainfall characteristics.
8	Sloughter et al. [52]	Applying the BMA–Gamma method to calibrate ensemble forecasts on wind speed data.	The study showed that BMA–Gamma produced better calibration performance than the BMA–Gaussian approach. However, BMA–Gamma’s performance decreased when the data contained highly extreme observations.

Table 2. Number of papers in database searching.

Code	Keywords	Scopus
A	Calibration of Ensembles Forecast	19,195
B	Extreme Rainfall OR Precipitation	261,015
C	Bayesian Model Averaging Gaussian Gamma Distribution	821
D	A AND B	5888
E	D AND C	64

Table 3. Top 10 journals publishing the most papers on the calibration of ensemble forecasts for extreme rainfall using Bayesian Model Averaging (BMA).

Journal	Papers
Water Resources Research	6
Monthly Weather Review	5
Journal of Hydrology	4
Journal of Hydrometeorology	4
Annals of Applied Statistics	3
Atmosphere	2
Meteorological Applications	2
Weather and Forecasting	2
AIP Conference Proceedings	1
Atmospheric Environment	1

Table 4. The most productive authors who write papers on the calibration of ensemble forecasts for extreme rainfall using Bayesian Model Averaging (BMA).

Authors	Number of Papers	H-Index
Madadgar S	3	3
Zhang Y	3	3
Allen S	2	2
Baran S	2	2
Chapman WE	2	2
Duan Q	2	2
Ghazvinian M	2	2
Künsch HR	2	2
Kwasniok F	2	2
Li L	2	2
Li W	2	2
Li Y	2	2
Liu Y	2	2
Miao C	2	2
Monache LD	2	2

Table 5. Cluster membership of title and abstract terms related to calibration of ensemble forecasts using Bayesian Model Averaging in extreme rainfall.

Cluster 1	Cluster 2	Cluster 3	Cluster 4	Cluster 5
Statistical Post-processing & Distribution	Bayesian & Calibration Methods	Numerical Weather Prediction (NWP)	Extreme Events & Climate	Transition Connector
statistical post-processing	Bayesian analysis	numerical weather prediction	extreme event	ensemble
gaussian distribution	Bayesian networks	numerical models	climate change	forecasting
precipitation	calibration	short-range prediction
maximum likelihood estimation	ensemble model output statistics (EMOS)	ensembles
gaussian method	normal distribution	post-processing
ensemble forecast	climate prediction
	ensemble post-processing
	probabilistic forecasting

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

