Evaluation of ECMWF Lightning Flash Forecast over Indian Subcontinent during MAM 2020

During the pre-monsoon season (March–April–May), the eastern and northeastern parts of India, Himalayan foothills, and southern parts of India experience extensive lightning activity. Mean moisture, surface and upper-level winds, the sheared atmosphere in the lower level, and high positive values of vertically integrated moisture flux convergence (VIMFC) create favorable conditions for deep convective systems to occur, generating lightning. From mid-2018, the European Centre for Medium-Range Weather Forecasts (ECMWF) Integrated Forecasting System (IFS) operationally introduced lightning flash density on a global scale. This study evaluates the ECMWF lightning forecasts over India during the pre-monsoon season of 2020 using the Indian Institute of Tropical Meteorology (IITM) Lightning Location Network (LLN) observation data. Qualitative and quantitative analysis of the ECMWF lightning forecast has shown that the lightning forecast with a 72-h lead time can capture the spatial and temporal variation of lightning with a 90% skill score.


Introduction
Cumulonimbus clouds with vigorous updrafts and downdrafts form thunderstorms. Severe thunderstorms consist of multiple convective cells at different stages, and the storm itself has a longer lifetime than individual cells [1][2][3][4]. As the convective clouds grow larger, different microphysical processes inside the cloud trigger the formation of charged hydrometeors. Negatively charged hydrometeors accumulate in the lower region of the cloud and positively charged particles in the upper levels, leading to the formation of an electric dipole structure. When the charge difference between the two poles of the dipole becomes high enough, lightning occurs through electric discharge [5][6][7].
Despite uncertainties in the data, the present global estimate of lightning-related fatalities is about 4000 to 5000 per year [8], with developing countries in tropical and subtropical regions being the most affected. Being part of the tropics, the Indian subcontinent experiences a high number of lightning-related deaths, with the highest numbers recorded during March, April, and May: an average of 103 deaths per year in Nepal [9], 18 deaths per year in Sri Lanka [10], and 286 deaths per year in Bangladesh [11]. In India, lightning caused more fatalities than any other natural calamity from 2004 to 2013, according to a report by the National Crime Records Bureau (NCRB), with 38.6% of deaths caused by lightning alone [12,13]. A recent study [14] has shown that mortality due to lightning alone has increased by 52.8% in the last 20 years.
In addition to its destructive impacts, lightning is a source of NOx in the atmosphere, influencing ozone production and oxidation [15]. Lightning strikes are the dominant cause of forest fires and a significant contributor to biomass-burning aerosols [16]. As lightning is responsive to temperature on many time scales, one study [15] has predicted more lightning in a warmer world based on a general circulation model with enhanced CO2. Many studies [16][17][18][19] have pointed to enhanced clustered positive ground flash activity in the presence of elevated equivalent potential temperature. Nevertheless, the use of lightning as an indicator of long-term climate issues has also been questioned due to the diminishing lightning-temperature sensitivity on longer time scales [20].
The India Meteorological Department (IMD), Indian Institute of Tropical Meteorology (IITM), and the National Centre for Medium-Range Weather Forecasting (NCMRWF), under the initiative of the Ministry of Earth Sciences (MoES), have developed a thunderstorm prediction system that can deliver a nowcast up to 3 h ahead for 800 stations, a regional model forecast with a 24-h lead time, and thunderstorm-specific products 48 h in advance from a high-resolution (12.5 km) global model. These products use combinations of high-resolution satellite data, IITM lightning sensors, RADAR observations, high-resolution mesoscale models, and outputs from the Global Forecast System (GFS, at 12.5 km spatial resolution). All these efforts have significantly improved accuracy for nowcasting and short-range forecasting [33]. To address the disaster management aspect of lightning, the Lightning Resilient India Campaign was launched in 2019 to reduce lightning deaths by 80% by 2022. The campaign was formed jointly by the Climate Resilient Observing Systems Promotion Council (CROPC), in collaboration with IMD, the Indian Meteorological Society (IMS), World Vision India, and the Red Cross. Within the first two years, casualties associated with lightning have been reduced by 60% [34].
In June 2018, the European Centre for Medium-Range Weather Forecasts (ECMWF) started to predict lightning flash density operationally in their Integrated Forecasting System (IFS) [35]. The lightning flash density [36] is calculated using convective outputs from the convective mass-flux scheme [37] of IFS. Verification of the ECMWF lightning forecasts over Europe has shown that the ensemble lightning forecasts have useful skill with a lead time of at least 3 days, while the deterministic forecast is in good agreement with the observations for 6 h temporally and 50 km spatially [35].
Here, using observations from the IITM Lightning Location Network (LLN), operational over more than 80 locations across India, we evaluate the ECMWF lightning forecast with a lead time of 3 days over India and analyze the quality of the ECMWF lightning forecasts with a longer lead time during the pre-monsoon months of March-April-May. Section 2 of this paper details the data and methodology used in this study. Section 3 includes results and discussion, followed by a summary in Section 4.

ECMWF IFS Lightning Forecast Data
ECMWF implemented a lightning parameterization in IFS and started operational lightning forecasts in June 2018 [35]. The parameterization calculates the total of cloud-to-ground (CG) and intra-cloud (IC) lightning as a function of convective hydrometeor amounts, convective available potential energy (CAPE), and convective cloud base height diagnosed by the convection scheme of ECMWF [36]. The total lightning flash density is given by Equation (1), in which α is a tunable coefficient, currently set to 37.5, to match the annual global mean flash rate obtained from the Lightning Imaging Sensor (LIS)/Optical Transient Detector (OTD) gridded climatology [38]. The variable z_base is the convective cloud base height. The mixed-phase region of a convective cloud consists of graupel and super-cooled water and is typically found between the 0 °C and −25 °C isotherms. Microphysical processes in this region, such as collision and sublimation, lead to the charge separation of cloud particles [39][40][41][42]. The region between the 0 °C and −25 °C isotherms is assumed to correspond to the charge separation region [36] (Figure 1, p. 3058). In Equation (1), Q_R provides a proxy for the charging rate generated by the collisions between different hydrometeors within the charge separation region. It is defined in terms of q_graup and q_snow, which denote the amounts of graupel and snow, respectively, and q_cond, the cloud condensate amount within the convective updraft region; ρ is the ambient air density at altitude z, z_−25 is the altitude of the −25 °C isotherm, and z_0 is the altitude of the 0 °C isotherm. This parameterization provides both instantaneous and time-averaged total lightning flash densities and was made operational from the IFS model cycle 45r1 for both the highest resolution "HRES" configuration and the 51-member ensemble system "ENS". The horizontal resolution was 9 km for HRES and 18 km for ENS, with 137 vertical levels.
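For reference, the two expressions described in this paragraph can be written out as below. This is a sketch of the form given in the cited parameterization [36] as we understand it, not a transcription from this text: the square-root dependence on CAPE and the 1.8 km cap on z_base are taken from [36] and should be checked against that reference. Here f_T is the total flash density (km⁻² day⁻¹), CAPE is in J kg⁻¹, and z_base is in km.

```latex
\begin{align}
f_T &= \alpha \, Q_R \, \sqrt{\mathrm{CAPE}} \,
       \left[\min\!\left(z_{\mathrm{base}},\, 1.8\right)\right]^{2}, \tag{1}\\
Q_R &= \int_{z_0}^{z_{-25}} q_{\mathrm{graup}}
       \left(q_{\mathrm{cond}} + q_{\mathrm{snow}}\right) \rho \,\mathrm{d}z. \tag{2}
\end{align}
```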
For our analysis, we considered the total (CG and IC) hourly lightning flash density (km⁻² day⁻¹) for the pre-monsoon months of March, April, and May (MAM) 2020 from daily ECMWF IFS HRES runs up to the 72 h range. In this paper, the IFS Day 1 lead time refers to the first 24 h of the forecast, Day 2 to the forecast from 25 h to 48 h, and Day 3 to the forecast from 49 h to 72 h.
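The lead-day convention above amounts to a simple integer mapping from forecast step to day. A minimal sketch follows, assuming each hourly field is labelled by the forecast hour at which it ends (1 to 72); the function name `lead_day` is hypothetical and not part of any ECMWF tooling.

```python
def lead_day(step_hours: int) -> int:
    """Map a forecast step in hours (1..72) to the lead day used in this paper.

    Steps 1-24 -> Day 1, 25-48 -> Day 2, 49-72 -> Day 3.
    Assumption: each hourly field is labelled by the hour at which it ends.
    """
    if not 1 <= step_hours <= 72:
        raise ValueError("step outside the 1-72 h range considered here")
    return (step_hours - 1) // 24 + 1
```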

Lightning Location Network
IITM LLN currently consists of 82 lightning-detecting sensors. Each sensor can detect the arrival time of the radio waves (up to 100 MHz) generated by lightning with nanosecond accuracy from as far as 1000 km away (Figure 1). CG and IC lightning strikes can be detected with 90% and 50% detection efficiency, respectively [43][44][45][46].
An algorithm has been built primarily for this study to calculate the total lightning flash density that occurred in the last hour (in flashes km⁻² h⁻¹) on a 0.1° × 0.1° grid for MAM 2020.
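The gridding step can be sketched as follows. This is not the algorithm used in the study, whose details are not given here; it is a minimal illustration that assumes flash locations are available as latitude and longitude arrays for one hour, and that cell areas are computed on a spherical Earth. The function name and its arguments are hypothetical.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)


def hourly_flash_density(lats, lons, lat_min, lat_max, lon_min, lon_max, res=0.1):
    """Bin one hour of flash locations onto a regular lat-lon grid.

    Returns an array of shape (n_lat, n_lon) in flashes km^-2 h^-1.
    """
    n_lat = int(round((lat_max - lat_min) / res))
    n_lon = int(round((lon_max - lon_min) / res))
    lat_edges = np.linspace(lat_min, lat_max, n_lat + 1)
    lon_edges = np.linspace(lon_min, lon_max, n_lon + 1)

    # Count flashes per grid cell (rows: latitude, columns: longitude).
    counts, _, _ = np.histogram2d(lats, lons, bins=[lat_edges, lon_edges])

    # Cell area on a sphere: R^2 * dlon * (sin(lat_north) - sin(lat_south)).
    dlon = np.deg2rad(np.diff(lon_edges))
    band = np.diff(np.sin(np.deg2rad(lat_edges)))
    cell_area_km2 = EARTH_RADIUS_KM**2 * band[:, None] * dlon[None, :]

    return counts / cell_area_km2
```

Dividing by the true cell area rather than a fixed value matters because a 0.1° cell shrinks in the east-west direction with increasing latitude.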

Other Datasets
In addition to IFS lightning forecasts and LLN observation data, we have also used the LIS/OTD Gridded Climatology data set, which consists of the gridded climatology of total lightning flash rates measured by the spaceborne Optical Transient Detector (OTD) and Lightning Imaging Sensor (LIS) from 1995 to 2015 with a horizontal resolution of 0.5° × 0.5° [38]. To analyze the large-scale conditions during the pre-monsoon season in India, we used the fifth-generation ECMWF reanalysis (ERA5) hourly data at a resolution of 0.25° × 0.25° [47].

Mean Lightning Climatology and Large-Scale Condition during MAM 2020
The MAM season lies between the winter and summer monsoon seasons and is classified as the 'Hot weather period' by the IMD [48]. The subcontinent experiences intense convective events during this period, resulting in high lightning activity. The LIS/OTD lightning climatology for MAM (Figure 2) shows the Himalayan foothills, eastern, northeastern, and southern parts of India experiencing extensive lightning. Western India has a very low lightning occurrence during this season. The five thunderstorm-prone regions [26] reasonably overlap with the intense lightning regions, as shown in Figure 2.
Figure 2. LIS/OTD lightning climatology for MAM [38]. Colored boxes highlight five thunderstorm-prone regions as defined by [26].
Investigation of the meteorological surface and dynamical features during MAM 2020 (Figure 3) shows that the regions with a mean relative humidity above 50% coincide with the lightning-prone regions in Figure 2. The 850 hPa wind shows westerlies to northwesterlies in the northern part of India, while a clear line of wind discontinuity can be seen across the southern peninsula in a north-south direction (Figure 3a). At the 200 hPa level, the mean wind is predominantly westerly (Figure 3b). The vertical wind shear between 850 hPa and 700 hPa is positive over thunderstorm-prone regions (Figure 3b). Large-scale lifting in the lower troposphere helps sustain convective activity and therefore favors thunderstorm intensification. Vertically integrated moisture flux convergence (VIMFC) has been used as a thunderstorm predictor [49]. As displayed in Figure 3c, the significant positive values of VIMFC indicate a thunderstorm-supporting environment over eastern, northeastern, and southern India and along the Himalayan foothills.
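For readers unfamiliar with the quantity, VIMFC is commonly written as the negative of the mass-weighted vertical integral of the horizontal moisture flux divergence, so that positive values indicate net moisture convergence into the column. The expression below is the standard textbook form and not necessarily the exact computation behind Figure 3c; the integration limits p_s (surface pressure) and p_t (an upper pressure level) are assumptions, q is specific humidity, V_h the horizontal wind, and g the gravitational acceleration.

```latex
\begin{equation*}
\mathrm{VIMFC} = -\frac{1}{g} \int_{p_t}^{p_s}
  \nabla_h \cdot \left( q \, \mathbf{V}_h \right) \mathrm{d}p
\end{equation*}
```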

Evaluation of IFS Lightning Forecast
Figure 5 shows the mean diurnal variation of lightning density over (a) all India and over (b)-(f) the thunderstorm-prone regions, as indicated in Figure 1. The diurnal variation of lightning usually exhibits a distinct peak during the local afternoon, as can be seen in Figure 5, where LLN lightning peaks after 15:00 local time for all regions except Northeast India (Figure 5b), as thunderstorms in this region are mostly orographically initiated. The IFS lightning diurnal cycle follows the observed diurnal cycle, albeit with a lead of two to three hours for all three forecast days, consistent with the 2-3 h lead in forecast convective precipitation compared to rainfall observations [37]. Furthermore, for the NWI and ECI regions, the observations indicate a double peak in lightning which is not present in the forecasts. Across lead times, the diurnal variation of lightning shows no noticeable differences over all India or the thunderstorm-prone regions, except for the CI region, where the Day 3 forecast underperforms compared to the Day 1 and Day 2 forecasts.
Figure 6 shows the time series of the average daily lightning flash density over (a) all India and over (b)-(f) five thunderstorm-prone regions during MAM 2020. The model reasonably reproduces the day-to-day lightning variations over all India for all forecast lead times, while some of the daily fluctuations are missed in the forecasts when the subregions are considered. In Figure 7, we have plotted the total monthly flash density over (a) all India and over (b)-(f) five thunderstorm-prone regions. All India, along with NEI and CI, shows an increasing trend from month to month, with lightning peaking in May. In April, the mean flash density observed by LLN is maximum over the SP and ECI regions but lowest over the NWI region. IFS performs best in March, with larger differences between observation and forecast in April and May. In May, the positive model bias increases with increasing lead time for all India and for the NEI, SP, and ECI regions, while over NWI the positive bias is less affected by increasing lead time. In contrast, over the CI region the IFS forecast underperforms regardless of lead time.

IFS Lightning Forecast Verification
For further quantitative evaluation, skill score analysis has been performed. Since the IITM LLN sensors and IFS grid points may not coincide, we have binned both the observed and forecast data into 0.3° × 0.3° boxes. For each day and for both observation and forecast data, if the total lightning flash density is greater than zero in one bin, we denote it as 'yes', and as 'no' if the total lightning in a bin is zero. The corresponding contingency table [33] is illustrated in Table 1.
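The binning and yes/no classification can be sketched as below. This is an illustration under stated assumptions, not the code used for the study: it assumes both fields are already on a common 0.1° grid whose dimensions are divisible by 3, and it aggregates by summation (for non-negative fields, any aggregation that is zero only when every cell is zero gives the same yes/no result). The function names are hypothetical.

```python
import numpy as np


def coarsen_sum(field, factor=3):
    """Aggregate a 2-D field into boxes of factor x factor cells by summation."""
    field = np.asarray(field, dtype=float)
    ny, nx = field.shape
    if ny % factor or nx % factor:
        raise ValueError("grid dimensions must be divisible by the factor")
    return field.reshape(ny // factor, factor, nx // factor, factor).sum(axis=(1, 3))


def contingency_counts(obs, fc):
    """Count hits, false alarms, misses, and correct negatives.

    A box is 'yes' when its total lightning is greater than zero.
    """
    obs_yes = np.asarray(obs) > 0
    fc_yes = np.asarray(fc) > 0
    hits = int(np.sum(fc_yes & obs_yes))
    false_alarms = int(np.sum(fc_yes & ~obs_yes))
    misses = int(np.sum(~fc_yes & obs_yes))
    correct_negatives = int(np.sum(~fc_yes & ~obs_yes))
    return hits, false_alarms, misses, correct_negatives
```

Counts from each day can be accumulated over the season before any scores are computed.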

We have then calculated four skill scores using the contingency table to evaluate the forecast for Day 1, Day 2, and Day 3 lead times (Table 1). The skill scores are POD, FAR, Bias score or Frequency Bias (FB), and SEDI. POD, or probability of detection, is defined as the ratio of the number of correct forecasts to the total number of observed events,
$$\mathrm{POD} = \frac{\text{hits}}{\text{hits} + \text{misses}}.$$
FAR, or false alarm ratio, is defined as the ratio of false alarms to the total number of 'yes' forecasts, that is, the fraction of forecast events that did not occur,
$$\mathrm{FAR} = \frac{\text{false alarms}}{\text{hits} + \text{false alarms}}.$$
FB, or frequency bias, also known as Bias score or simply Bias, is defined as
$$\mathrm{FB} = \frac{\text{hits} + \text{false alarms}}{\text{hits} + \text{misses}}.$$
FB is the ratio of the number of yes forecasts to the number of yes observations. FB > 1 indicates over-forecasting, where the event has been forecast more often than it is observed; FB < 1 indicates under-forecasting, where the event is forecast less often than it has been observed; and FB = 1 is an unbiased forecast, where the event has been forecast as often as it has been observed. SEDI, or the Symmetric Extremal Dependence Index, is defined as [50]
$$\mathrm{SEDI} = \frac{\ln F - \ln H - \ln(1 - F) + \ln(1 - H)}{\ln F + \ln H + \ln(1 - F) + \ln(1 - H)},$$
with H being the hit rate or POD and F being the false alarm rate or probability of false detection, defined as
$$F = \frac{\text{false alarms}}{\text{true negatives} + \text{false alarms}}.$$
SEDI was designed for assessing forecasts of rare or extreme events and is resistant to 'hedging' or biases in forecasts. SEDI is independent of the base rate and ranges between −1 and 1. As the forecasts become more random, SEDI approaches 0.
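The four scores follow directly from the contingency counts. The sketch below is a minimal illustration of the definitions above; the function name and the returned dictionary keys are hypothetical. SEDI is undefined when H or F equals 0 or 1, and the sketch makes no attempt to handle those cases.

```python
import math


def skill_scores(hits, false_alarms, misses, correct_negatives):
    """Return POD, FAR, FB, and SEDI from 2x2 contingency counts."""
    pod = hits / (hits + misses)
    far = false_alarms / (hits + false_alarms)
    fb = (hits + false_alarms) / (hits + misses)

    # H is the hit rate (POD); F is the false alarm rate (probability of false detection).
    h = pod
    f = false_alarms / (false_alarms + correct_negatives)

    # SEDI requires 0 < H < 1 and 0 < F < 1 (the logarithms diverge otherwise).
    num = math.log(f) - math.log(h) - math.log(1 - f) + math.log(1 - h)
    den = math.log(f) + math.log(h) + math.log(1 - f) + math.log(1 - h)
    sedi = num / den

    return {"POD": pod, "FAR": far, "FB": fb, "SEDI": sedi}
```

A forecast with H equal to F, which is what a random forecast gives on average, makes the numerator vanish, so SEDI is 0 in that case.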
The skill score analysis summarized in Figure 8 shows that, for all three forecast lead days, the probability of detection is over 0.8 for all regions, including all India, indicating that IFS can accurately forecast 8 out of 10 observed lightning events, which is also supported by the low false alarm ratio, especially over thunderstorm-prone regions.FB > 1 for all regions indicates that IFS forecasts more lightning events than observed ones.SEDI shows values > 0 for all cases, indicating that the forecast is not random and has a positive skill that decreases with increasing lead time over most regions.

Summary
During the pre-monsoon, India experienced extensive thunderstorms and enhanced lightning activity, specifically in the northeastern parts of the country, followed by the northwestern and southern peninsula (Figure 2). Mean moisture, surface and upper-level winds, and the sheared atmosphere in the lower level create favorable conditions for these thunderstorms, which are supported by high positive values of VIMFC, high CAPE, and low CIN (Figure 3). These meteorological conditions support high lightning occurrences throughout the season.
IITM LLN, since its inception, has played a crucial role in the development of real-time lightning warning systems in India. With over 80 sensors (Figure 1) across the nation, IITM LLN is also crucial for forecast evaluation, which will help further development of early warning systems for lightning. For the first time, IITM LLN has been used to verify lightning forecasts over India from a global operational forecasting system, ECMWF IFS, which includes a lightning parameterization [36] capable of predicting total lightning flash density with a lead time of 15 days. This paper has analyzed lightning events over the Indian subcontinent during the 2020 pre-monsoon season, focusing on five thunderstorm-prone regions. Despite positive bias along the Himalayan foothills and negative bias over eastern India, the IFS forecasts can capture the spatial structure of the mean lightning flash density for MAM 2020 (Figure 4).
During the pre-monsoon seasons, thunderstorms over many parts of the country reach their mature stage in the afternoon except in the northeast region, where the thunderstorms occur primarily between midnight and early morning.The diurnal variation of lightning closely follows this.The IFS-forecasted primary lightning peak could be seen 2 to 3 h ahead of observations during the day, corresponding to an early onset of convective precipitation [37].Furthermore, the model has difficulties representing the double daily peak of lightning flash density as observed over the east coast of India (Figure 5).We further investigated the daily variations of observed and forecasted lightning flash densities.The day-to-day variations of forecasted lightning agree well with the observed fluctuations, especially when averaged over larger basins.In the observations, lightning instances gradually increase from March to May, with lightning values peaking in May over all India except for the southern peninsula and northwest India, where the maximum total lightning peaks in April.The IFS forecast captures the monthly trend of total lightning flash density overall well in all regions, except the southern peninsula.
We have also calculated statistical skill scores to quantitatively analyze the forecast using a statistical contingency table.The main intention of this statistical analysis is to assess how well IFS can predict 1 to 3 days in advance whether a location will be hit by lightning or not.The POD values show that IFS can correctly predict 80% of thunderstorm events nationwide.The rate of false alarms is lower in thunderstorm-prone regions than in all India.The IFS is also found to over-predict the frequency of lightning events over all India and the five thunderstorm-prone regions.
Since the relative frequency, that is, the ratio of the number of grid points with observed lightning to the total number of available grid points (or sample size; Table 1), converges to zero, we have plotted the SEDI score [42], which is independent of the relative frequency. The SEDI score (Figure 8d) indicates that the forecast model can predict lightning with sufficiently high skill over all regions.
Overall, the analysis shows that a 9 km global forecasting model, using a convective parameterization scheme and a coupled lightning parameterization, is able to produce a lightning forecast with a 3-day lead time that has a 90% success rate.We consider this to be a substantial step toward an early lightning warning system for India.

Figure 1.The locations of Indian Institute of Tropical Meteorology (IITM) Lightning Location Network (LLN) stations (marked by white stars).Overlaid colored boxes are 5 thunderstorm-prone regions (as defined by [26]).
In Figure 4, the mean lightning flash density is plotted for (a) the LLN observations, (b) IFS Day 1, (c) Day 2, and (d) Day 3 lead times, respectively, along with the bias between LLN and IFS forecast for (e) Day 1, (f) Day 2, and (g) Day 3 lead times. The figure shows that the IFS lightning forecasts capture the pre-monsoon lightning-prone regions rather well. However, there is a positive bias along the Himalayan foothills and a negative bias over eastern India. With increasing lead time, the over-estimation across the Himalayan foothills decreases while the under-estimation over eastern India increases.