Weather Prediction for Singapore—Progress, Challenges, and Opportunities

Abstract: Singapore is a tiny city-state located in maritime Southeast Asia. Weather systems such as localized thunderstorms, squalls, and monsoon surges bring extreme rainfall to Singapore, influencing the day-to-day conduct of stakeholders in many sectors. Numerical weather prediction models can provide forecast guidance, but existing global models struggle to capture the development and evolution of the small-scale and transient weather systems impacting the region. To address this, Singapore has collaborated with international partners and developed regional numerical weather prediction systems. Steady progress has been made, bringing added value to stakeholders. In recent years, complex earth system and ultra high-resolution urban models have also been developed to meet increasingly diverse stakeholder needs. However, further advancement of weather prediction for Singapore is often hindered by existing challenges, such as the lack of data, limited understanding of underlying processes, and geographical complexities. These may be viewed as opportunities, but are not trivial to address. There are also other opportunities that have remained relatively unexplored over Singapore and the region, such as the integration of earth system models, uncertainty estimation and machine learning methods. These are perhaps key research directions that Singapore should embark on to continue ensuring value for stakeholders.


Introduction
Singapore lies at the heart of the Maritime Continent, situated in the deep tropics where the multi-scale interactions of the earth system (atmosphere, ocean, and land) govern the weather systems that evolve and impact the region. These weather systems can be simulated by mathematical models of the earth system components based on physical principles, such as the conservation of mass, momentum, and energy. This approach is known as numerical weather prediction. Numerical weather prediction is largely an initial value problem; the weather forecast accuracy is determined mainly by the construct of the mathematical model, via weather modeling, and the initial state prescribed to it. For short forecasts, boundary conditions have a smaller impact, unless they are poorly specified.
Successful weather prediction requires a concerted effort by both research institutes and national meteorological services. The Meteorological Service Singapore (MSS), Singapore's national authority on weather and climate, joined the Global Unified Model Partnership first as an associate partner to develop a strategic partnership with the United Kingdom Met Office (UKMO), and subsequently as a core partner in 2022. This consortium brings together global scientific and technical expertise to forge new frontiers in weather prediction through the development of innovative techniques in broad domains such as coupled modeling, model physics parametrization, and ensemble prediction.
Central to supporting Singapore's weather prediction strategy is the Weather Modelling and Development Branch (WMD) in the Centre for Climate Research Singapore (CCRS), which is the research arm of MSS. By leveraging knowledge-sharing and advancements arising from the consortium, WMD undertakes numerical weather prediction research and development focusing on the 2-h to 2-day forecast timescales, encompassing (i) the lifespan of a short-lived localized thunderstorm; (ii) the passage of a mesoscale Sumatra squall; and (iii) the onset period of a synoptic-scale monsoon surge. These systems bring extreme rainfall to Singapore and have wide-ranging societal impacts (e.g., [1][2][3][4][5]), influencing the day-to-day conduct of stakeholders in many sectors including aviation, maritime, defense, and energy.
In this article, we describe the progress in weather prediction at CCRS. We also discuss the challenges and opportunities in weather prediction for Singapore and the region, as a clarion call to spur interest in this niche area and encourage further collaboration as a community to pave the way ahead.

Progress
The increase in the accuracy of weather forecasts is underpinned by progress in weather modeling. In the global context, a modern 5-day model forecast is as accurate as a 1-day model forecast was in the 1980s [6]. The duration over which a forecast is considered useful has increased by approximately one day for every decade of research and development [7], courtesy of continuous investment by global numerical weather prediction centers. This has attendant benefits for smaller research institutions, including CCRS, that implement regional numerical weather prediction models nested within such global (host) models, an approach known as dynamical downscaling.
Prior to 2013, weather prediction at MSS relied solely on dynamical downscaling techniques, employing in tandem a selection of host and regional models throughout the historical development path. The computational resources at that time permitted simulations over a curated Southeast Asia domain, including Singapore, Peninsular Malaysia, and Sumatra, with a horizontal grid-spacing of 6 km.
In 2013, a collaborative project between MSS and the UKMO was conceptualized to develop a convective-scale numerical weather prediction system for Singapore (SINGV). A summary of the progress during the SINGV project from 2013 to 2018 is documented in [8]. The culmination of the efforts by UKMO and CCRS returned three configurations: SINGV-DS (downscaler [9]), SINGV-DA (data assimilation [10]) and SINGV-EPS (ensemble prediction [11]), all driven by European Centre for Medium-Range Weather Forecasts (ECMWF) host model inputs. Notably, SINGV-DS and SINGV-DA included an upgrade to use a convective-scale horizontal grid-spacing of 1.5 km (constant grid-spacing), which immediately yielded considerable improvements in the representation of localized thunderstorms and small-scale forecast features [9,10]. Concomitantly, the data assimilation of observations from the Global Telecommunications System (GTS) of the World Meteorological Organization (WMO) in SINGV-DA demonstrated improved short-range rainfall forecasts, attributed to the superior initial state estimates compared to simple dynamical downscaling [10].
Since 2018, the remit of WMD has grown considerably, involving further development of other earth system model components (e.g., ocean and wave models) which are inextricably linked to the atmosphere, and their coupled counterparts (e.g., atmosphere-ocean-land model [12]). These efforts are justified by the increasingly diverse requirements of stakeholders, requesting higher forecast accuracies and longer warning lead times from a broad range of products. Additionally, ultra high-resolution (e.g., 300-meter grid-spacing and finer) urban modeling, a branch of numerical weather prediction, was also undertaken [13,14] to understand the impact of anthropogenic heat and urbanization (e.g., urban heat island effect), for both weather and climate applications.

Challenges
We shall focus on challenges faced in three priority research domains. Each domain presents a unique set of practical and scientific challenges for weather prediction, specifically for Singapore and the region.

Atmosphere-Land Modeling
In numerical weather prediction, atmosphere models are often coupled to land surface models (e.g., in SINGV configurations) to represent atmosphere-land interactions and simulate hydrological processes. These are particularly important in the deep tropics, where substantial mass and energy exchanges between the atmosphere and land occur [15]. Within the tropical atmosphere, moist processes also play an enhanced role in the development and evolution of thunderstorms, squall lines and monsoon surges. Atmosphere-land models over Singapore and the region may often struggle to forecast these weather systems, due to the following:

• Singapore is surrounded by numerous islands and uneven terrain which give rise to land-sea interactions and orographic effects that complicate the development and evolution of weather systems. Forecast errors may stem from a poor representation of terrain-induced processes (e.g., negligible orographic lift over Sumatra leading to less-than-observed rainfall [16]) if the grid resolution of the atmosphere-land model is too coarse. Even at a convective-scale resolution, position and timing errors are still common because localized thunderstorms span only ≈15 km (10 grid-spaces in SINGV-DA), and are short-lived (≈1 to 2 h). It is unlikely that a single deterministic weather forecast will capture all the fine-scale features of convection, and this could mean the difference between heavy rainfall in Singapore or in Johor (southern tip of Peninsular Malaysia).
• Many underlying physical processes in the deep tropics are poorly understood, so these may not be well-represented in the atmosphere-land model. In particular, simplification of cloud microphysics processes may cause undesirable storm splitting in the simulated passage of a Sumatra squall, following results from idealized tests [17]. Incorrect partitioning of the soil water retention and surface runoff, especially during heavy rainfall, may affect the surface energy and water fluxes, leading to excess or insufficient near-surface moisture availability for the pre-convective environment. Inappropriate ad hoc modifications to the boundary layer scheme (e.g., using stochastic perturbations) may have repercussions on the diurnal cycle, such as the early triggering of convection [9]. These discrepancies ultimately introduce biases and errors into the forecasts.
• There is a lack of useful wind observations over the Maritime Continent, even though geostrophic adjustment theory suggests that wind information is vital for the region [18]. First, wind measurements are sparsely distributed in space. Existing wind measurements are usually only available over land (radiosondes), in the upper troposphere (aircraft), or just above the sea surface (satellite-based scatterometers). Over adjacent oceans (covering about 50% of the SINGV-DA domain), the lower troposphere is still insufficiently sampled. Second, wind measurements are sparsely distributed in time. Existing in situ wind measurements are too infrequent (e.g., radiosondes only launched twice a day) to capture the diurnal cycle, the dominant mode of variability in the region. Third, the quality of wind measurements is easily compromised. Sampling noise is prevalent when the tropical winds are light and variable (e.g., during the inter-monsoon season), so low quality wind observations such as surface wind measurements are eventually discarded during data assimilation. Remotely sensed winds are also often subject to rain contamination in the tropics [19], which may limit their usefulness. These three reasons cause large information gaps in the prescribed atmospheric state for initializing atmosphere-land models, and thus compromise the forecast quality.

Ocean-Wave Modeling
For weather prediction timescales, ocean circulation models are rarely standalone due to the slow-evolving nature of oceans; they are often paired with a wind-driven wave model. These ocean-wave models are focused on marine weather (phenomena occurring at the atmosphere-ocean interface), providing marine forecasts (e.g., significant wave height of sea waves and swells, and mean wave direction) over the surrounding waters of Singapore for stakeholders. Ocean-wave modeling in the region is notoriously challenging due to the following:

• The Maritime Continent's oceans are characterized by rugged coastlines and highly varying bathymetry. The region hosts one of the largest currents of water on the planet, the Indonesian Throughflow (ITF), which connects the tropical Pacific and Indian Ocean. The ITF includes many narrow straits weaving through numerous islands, resting largely on shallow continental shelves. Coarse resolution simulations cannot effectively resolve the flow through these straits. A model with a poorly chosen set of vertical coordinates will also struggle with steep gradients in the bathymetry (e.g., from Andaman Sea to Malacca Straits). This can result in spurious numerical mixing and ocean eddy artifacts [20].
• There is a severe lack of observations of the Maritime Continent's oceans, which are used to prescribe the ocean state via data assimilation, resulting in initial value errors. Due to the shallow topography around Peninsular Malaysia, Argo floats (freely drifting robotic devices that profile the ocean subsurface) are near-absent. Ship-based observations, buoys and tide gauges are sparsely distributed, while other satellite-based observations (e.g., altimetry, and infrared imagery) can only sample the ocean surface. These are insufficient to account for thermodynamical properties in the ocean subsurface that are important for marine forecasts.
• There is uncertainty in specifying the lateral boundary conditions along the continental break regions (e.g., Peninsular Malaysia) and in the atmospheric wind forcing, resulting in boundary condition errors in a regional ocean-wave model (e.g., through dynamical downscaling, used by WMD). An incorrect representation of inflow and outflow of water along the domain boundaries will lead to errors in momentum and mass exchanges. An incorrect forcing wind field over elongated coastal regions (e.g., offshore Sumatra) will lead to errors in the coastal upwelling. These errors will propagate into the ocean-wave model and contaminate the marine forecasts.

Urban Modeling
Urban modeling introduces urban canopy parametrization (urban canopy models) to numerical weather prediction to represent the dynamic and thermodynamic effects of the city on the atmosphere. These may alter weather patterns (e.g., diurnal cycle intensity [21]) and phenomena (e.g., sea breeze convergence [14]) that affect Singapore and the region. Accounting for these effects is thus an important step towards urban-scale weather prediction for Singapore, but it also faces the following challenges:

• There is a lack of datasets with a detailed and accurate representation of the land surface types, urban morphology, and anthropogenic heat required for ultra high-resolution urban modeling over Singapore and Johor. Substantial effort is required to develop high quality datasets because of existing limitations. First, the quality of satellite-derived land use and land cover data is often compromised by ubiquitous cloud cover in the deep tropics. Second, Singapore's urban morphology is extremely heterogeneous and it is tricky to derive parameters (e.g., building plan area, and mean building height) for the urban morphology suitable at ultra high-resolution. Third, anthropogenic heat emissions cannot be measured and must be indirectly estimated. Prerequisite high-resolution data (e.g., building energy use, human metabolic heat) used for estimation are difficult to acquire, so the emissions dataset is often coarsely derived over Singapore.
• Many assumptions are required when applying urban canopy parametrizations, which may be violated in ultra high-resolution urban modeling over Singapore. The assumption of homogeneity in the representation of buildings within a grid box does not hold when approaching urban-scales in Singapore, known as the "building gray zone" problem [22]. Fundamental assumptions behind conventional turbulence parametrizations are violated with increasing horizontal grid resolution, known as the "turbulence gray zone" problem [23]. Other simplifications (e.g., negligible vegetation effects within the street canyon, exponential wind profile near the surface, and conventional mesoscale microphysics schemes) may also be invalid. These lead to undesirable effects on the model forecast and limit its usefulness.
• The domain for ultra high-resolution urban modeling is usually too small to fully capture large-scale weather systems. This arises because urban modeling is computationally expensive, and the domain size is often compromised to facilitate higher resolution simulations. The drawback of shrinking the domain becomes apparent when forecasting organized mesoscale (or larger) systems. For example, a Sumatra squall line which spans a few hundred kilometers is larger than the Singapore urban modeling domain size typically used by WMD, so portions of the squall will be truncated at the boundaries. This may introduce unintended lateral boundary effects through discontinuities or numerical oscillations that may contaminate the forecasts within the inner domain. There is therefore a strong need to understand the trade-off between the benefits from using a smaller horizontal grid-spacing and a larger domain size for different applications.
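The trade-off between grid-spacing and domain size in the last point can be made concrete with a rough cost estimate. The sketch below is an illustration only and rests on a scaling assumption that is not taken from this article: the cost of a grid-point model is taken to be proportional to the number of horizontal grid columns times the number of time steps, with the time step shrinking in proportion to the grid-spacing and the number of vertical levels held fixed. The function names and the 1000 km reference domain are hypothetical.

```python
def relative_cost(domain_km, dx_km, ref_domain_km, ref_dx_km):
    """Cost relative to a reference run.

    Assumed scaling (not from the article): cost ~ (L / dx)^2 * (1 / dx),
    i.e. horizontal columns times time steps, vertical levels fixed.
    """
    columns_ratio = (domain_km / dx_km) ** 2 / (ref_domain_km / ref_dx_km) ** 2
    steps_ratio = ref_dx_km / dx_km  # finer grid -> proportionally more time steps
    return columns_ratio * steps_ratio


def equal_cost_domain_km(ref_domain_km, ref_dx_km, dx_km):
    """Domain width affordable at dx_km for the same cost as the reference run."""
    return ref_domain_km * (dx_km / ref_dx_km) ** 1.5


# Hypothetical reference: a 1000 km wide domain at 1.5 km grid-spacing.
cost_same_domain = relative_cost(1000.0, 0.3, 1000.0, 1.5)
affordable_width = equal_cost_domain_km(1000.0, 1.5, 0.3)
print(f"300 m on the same domain costs about {cost_same_domain:.0f}x the reference")
print(f"Equal-cost domain width at 300 m: about {affordable_width:.0f} km")
```

Under these assumptions, refining from 1.5 km to 300 m at fixed cost shrinks the affordable domain to well below the few-hundred-kilometer extent of a Sumatra squall line, which is the truncation problem described above.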

Opportunities
The aforementioned challenges are thematic, relating to a lack of data, limited understanding of underlying processes, or geographical complexities. Each challenge can be viewed as an opportunity to improve weather prediction for Singapore and the region, but may not be trivial to address. Intensive observation campaigns around the region (not only in Singapore), process-based studies, and model sensitivity experiments will be extremely helpful for answering fundamental research questions under the respective themes. There are also cross-cutting opportunities spanning these domains which have so far remained relatively unexplored. We have classified them into three categories below.

Integrating Earth System Components
The integration of earth system components can address issues related to the representation of earth system processes, which arise from the geographical complexities of weather modeling over Singapore and the region. Individual earth system components (land, ocean, and wave) can be integrated with the atmosphere model to form a single environmental prediction system. The concept of linking components is apparent from the region's nickname, "Maritime Continent": coupling the atmosphere with ocean-waves (Maritime) and land (Continent) is perhaps necessary to capture their natural interactions and thus simulate the region's weather systems more accurately.
Global numerical weather prediction centers have started adopting a similar approach for consolidating their global modeling systems (e.g., at the National Oceanic and Atmospheric Administration (NOAA) and the Korea Institute of Atmospheric Prediction Systems (KIAPS)). Progress in coupled regional modeling is slightly lagging, but ironically may have potential to unlock greater benefits due to the higher sensitivity of small-scale phenomena (e.g., localized thunderstorms, and wind-driven waves) to transient coupled processes (e.g., downdrafts causing cold pool propagation over oceans).
Apart from improving forecast accuracy, an environmental prediction system also allows forecast products from individual models to be consolidated. A single simulation run from the environmental prediction system can then provide a one-stop shop for all forecast products with better consistency and quality. This simplifies the overall data pipeline and workflow required to meet diverse stakeholder requirements.
In this light, WMD is developing a coupled atmosphere-land-ocean-wave environmental prediction system over the Maritime Continent, with the option to incorporate weakly coupled data assimilation for weather prediction applications over the next few years. This system will also support climate projection applications when the need arises.

Integrating Uncertainty Estimation
The integration of uncertainty estimation can address issues related to random (stochastic) errors in weather prediction, which are due to the uncertainty in prescribing the initial state and in the construct of the mathematical model of an inherently chaotic earth system. One solution is to employ an ensemble of simulations with different initial states, boundary conditions, and/or constructs of the model, accounting for possible sources of uncertainties and quantifying them. The ensemble approach is widely adopted when forecasting for longer timescales (i.e., sub-seasonal through to climate), since more direct information on weather statistics rather than individual weather events is required. However, for weather timescales, ensemble techniques can still provide uncertainty estimates on the position and severity of weather systems.
In regional models over the Maritime Continent, this effort is relatively nascent. Applications are typically focused on tropical cyclones to identify track or intensification uncertainties (see [24] and references therein). In the context of Singapore, this can also be applied to squall lines in estimating propagation and growth uncertainties, and monsoon surges in estimating mainly the onset timing uncertainties (since surges tend to persist longer than the model forecast range). Thus far, in-depth studies have yet to be conducted. Extending this idea to local or site-specific (i.e., town or station level) forecasts, perhaps probabilistic products (e.g., probabilities of heavy rainfall from thunderstorms, or extreme events) rather than a deterministic "yes-no" product may be more useful for stakeholders.
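As an illustration of how a probabilistic product differs from a deterministic "yes-no" product, the sketch below computes an exceedance probability from an ensemble at a single site. It is a minimal example and does not describe how SINGV-EPS products are generated: the 12 rainfall values, the 30 mm threshold, and the function name are all hypothetical.

```python
from statistics import mean, pstdev


def exceedance_probability(members, threshold):
    """Fraction of ensemble members at or above the threshold."""
    return sum(1 for value in members if value >= threshold) / len(members)


# Hypothetical 12-member ensemble of 1-h rainfall (mm) at one station.
rain_mm = [0.0, 2.5, 41.0, 0.0, 12.0, 55.5, 3.0, 0.0, 38.0, 47.5, 1.0, 9.5]
HEAVY_RAIN_MM = 30.0  # illustrative threshold, not an official definition

prob_heavy = exceedance_probability(rain_mm, HEAVY_RAIN_MM)
ens_mean = mean(rain_mm)
ens_spread = pstdev(rain_mm)  # spread as a simple measure of forecast uncertainty

# A "yes-no" product based on the ensemble mean alone hides the uncertainty.
deterministic_yes = ens_mean >= HEAVY_RAIN_MM

print(f"Ensemble mean: {ens_mean:.1f} mm, spread: {ens_spread:.1f} mm")
print(f"Deterministic heavy-rain call from the mean: {deterministic_yes}")
print(f"Probability of heavy rain: {prob_heavy:.0%}")
```

In this made-up case the ensemble mean falls below the threshold, so a deterministic product would say "no", while a third of the members still produce heavy rain at the site. That residual risk is the information a probabilistic product passes on to stakeholders.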
In this light, WMD is developing SINGV-EPS to augment the high-resolution SINGV-DA simulations. The long-term plan involves further consolidation of SINGV-EPS and SINGV-DA into a single configuration, in which the best estimates of the initial states (from data assimilation) are used to initialize the ensemble simulations, and the ensemble-derived forecast uncertainty is in turn used when performing data assimilation. At the same time, effort is underway to develop post-processed probabilistic and/or site-specific forecast products using SINGV-EPS.

Integrating Machine Learning
The integration of machine learning can address issues related to systematic errors in weather modeling, which are due to, e.g., a limited understanding and poor representation of physical processes. Machine learning methods diagnose model biases by generalizing patterns between the input layer and output layer (e.g., following [25]; model state as input layer, analysis increment from data assimilation as output layer in a neural network). Diagnosed forecast biases can then be corrected online by introducing tendencies to the model simulation, or offline as a single correction term after the simulation is completed. This concept is applicable throughout the weather prediction workflow, from data assimilation (to retrieve the initial state) through to forecast post-processing (see [26,27] for an overview).
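The offline variant of this idea can be sketched in a few lines. The example below is a toy illustration, not the method of [25]: an ordinary least-squares line stands in for the neural network, a single scalar (a forecast temperature) stands in for the model state, and the training pairs are invented. The function names are hypothetical.

```python
def fit_linear(x, y):
    """Ordinary least-squares fit y ~ slope * x + intercept (stand-in for a neural network)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented training pairs: model state (input) and analysis increment (output),
# where increment = analysis minus forecast, both in degrees Celsius.
forecast_c = [24.0, 26.0, 28.0, 30.0, 32.0]
increment_c = [0.2, 0.0, -0.2, -0.4, -0.6]

slope, intercept = fit_linear(forecast_c, increment_c)


def correct_offline(raw_forecast_c):
    """Add the diagnosed bias as a single correction term after the simulation."""
    return raw_forecast_c + (slope * raw_forecast_c + intercept)


print(f"Raw forecast 31.0 C -> corrected {correct_offline(31.0):.1f} C")
```

An online correction would instead feed the diagnosed increment back into the model as a tendency at each step, which requires access to the model's time-stepping loop and is not shown here.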
Machine learning can also be used to train surrogate models to replace portions of the weather model entirely (e.g., turbulence and radiation parametrization). In particular, physics-informed machine learning methods have gained traction in recent years (see [28] and references therein). Physical laws (e.g., conservation) are enforced in a machine learning model architecture such that the solutions satisfy the given mathematical constraints. These methods can bridge the gap between fully data-driven approaches (limited by data sample size) and physics-based models (limited by understanding of governing physics) to generate generalized, physically consistent solutions.
Finally, machine learning may also be more computationally efficient than traditional modeling approaches (e.g., radiation emulation). As the grid resolution of earth system models approaches sub-kilometer scales, the availability of high-performance computing resources may become a major limitation. Machine learning methods, requiring only a one-off cost for training, can assist in reducing the overall computational workload, thus balancing the domain size versus grid resolution trade-off in ultra high-resolution modeling. One should note, however, that re-training may be required whenever components of the earth system prediction system are upgraded.
In this light, WMD is keen to explore opportunities with collaborators who are willing to undertake machine learning research, prioritizing physics-informed machine learning for improved efficiency and/or accuracy, or machine learning in nowcasting and forecast post-processing applications to generate improved products for stakeholders. At the moment, different machine learning methods (e.g., generative adversarial networks) are being tested to support radar-based nowcasting system development, with potential to blend numerical weather prediction outputs to improve the nowcasting products.
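One simple way to encourage physical consistency, shown here as a sketch, is to add a soft penalty to the training loss rather than to build the constraint into the architecture itself. The constraint used below (predicted tendencies within a column should sum to zero, so that the quantity is only redistributed) is a hypothetical stand-in for a conservation law, and the function name, weight, and numbers are illustrative.

```python
def physics_informed_loss(predicted, target, constraint_weight):
    """Mean squared error plus a penalty on violating a conservation constraint.

    Hypothetical constraint: the predicted tendencies should sum to zero.
    """
    n = len(predicted)
    data_term = sum((p - t) ** 2 for p, t in zip(predicted, target)) / n
    conservation_residual = sum(predicted)  # zero when the constraint is satisfied
    return data_term + constraint_weight * conservation_residual ** 2


# Invented tendencies for a three-level column (arbitrary units).
target = [0.4, -0.3, -0.1]          # sums to zero: conserving
non_conserving = [0.5, -0.2, -0.1]  # sums to 0.2: violates the constraint

loss_bad = physics_informed_loss(non_conserving, target, constraint_weight=10.0)
loss_good = physics_informed_loss(target, target, constraint_weight=10.0)
print(f"Loss for non-conserving prediction: {loss_bad:.4f}")
print(f"Loss for conserving, exact prediction: {loss_good:.4f}")
```

With a large enough weight the penalty dominates the data term for predictions that break the constraint, steering training towards physically consistent solutions even where training data are sparse.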

Concluding Remarks
Throughout the course of numerical weather prediction research at CCRS, the rationale for embarking on any project has always been based on desired outcomes, which are often stakeholder-driven. It is paramount that we constantly take stock of the progress to serve as a reminder of the relevance of our research for serving stakeholder needs.
The litmus test for regional weather modeling is the value it brings to stakeholders in terms of added accuracy, reliability (or timeliness), and to a smaller extent privacy compared to global models. As weather prediction for Singapore evolves, one must also be cognizant of the remaining challenges to overcome and opportunities that may arise. With finite resources, effort must then be concentrated on research directions that tackle these challenges or exploit these opportunities to ensure the most value for stakeholders.

Funding: This research received no external funding. The authors were supported by the National Environment Agency (Singapore).