Multiscale Correlation Analysis between Wind Direction and Meteorological Parameters in Guadeloupe Archipelago

: In this paper, the wind direction ( WD ) behaviour with respect to the variability of other meteorological parameters (i


Introduction
Nowadays, waste management is a worldwide societal issue [1][2][3][4]. In insular contexts, waste management is a main problem due to the lack of space. Open landfills are often located in the heart of agglomerations [5,6]. Apart from the nauseating odours emitted by these storage centres, the atmospheric pollutants emitted by this municipal solid waste may have a significant health impact on the neighbouring populations [7][8][9]. In a place where there are many micro-climates, it is crucial to better understand the fate of these pollutants.
In the literature, it is well known that meteorological parameters play an important role in pollutant dilution, diffusion, advection and transformation [10][11][12]. The wind is a preponderant climatic variable in these processes. Studies on wind speed behaviour are highly significant, because strong winds allow the dispersion of air pollutants in the atmospheric boundary layer, while calm winds promote its stagnation [13][14][15][16][17][18]. Nevertheless, wind direction is also a key feature as it determines the path of pollutants. In the air pollution field, the behaviour of wind direction is usually studied using statistical methods [19][20][21][22][23][24]. Although these studies provide information on the statistical behaviour of this parameter, its dynamics are not taken into account. Indeed, the results of these works are based only on a single time scale and might not necessarily reflect the features of wind direction time series over several scales [25,26]. To our knowledge, no study has yet investigated the relationship between wind direction and other meteorological parameters using the Hilbert-Huang transform (HHT) framework. For the first time, the coupled multivariate empirical mode decomposition (MEMD) [27] and time-dependent intrinsic correlation (TDIC) framework [28] are used to analyse the teleconnection between wind direction and rainfall, temperature, relative humidity, solar radiation and wind speed. The aim of this study was to analyse wind direction behaviour according to the aforementioned meteorological parameters over several time scales.

Study Area and Data Description
The Guadeloupe archipelago (16.25 • N−61.58 • W) is a French West Indies island located in the middle of the Lesser Antilles in the Caribbean basin. With an area of ∼1800 km 2 and a population of 390,250 inhabitants [29], this island experiences a tropical rainforest climate ("Af") according to the Köppen-Geiger classification [30]. The study area is in the centre of Guadeloupe (Figure 1) where the topography is nearly flat and concrete buildings do not exceed four floors. The main open landfill of the island (LF in Figure 1) is in the same location [31]. Embedded in a mangrove area that surrounds it, the LF is sandwiched between the mangrove and a densely urbanized area [5,6]. This area is the most populated area of Guadeloupe [15]. The meteorological measurements (i.e., rainfall (R), temperature (T), relative humidity (Rh), solar radiation (SR), wind speed (U) and wind direction (WD)) were carried out by Météo France (https://meteofrance.gp/fr, 30 January 2023, MF in Figure 1) in the study area on the international airport of Les Abymes (16.2630 • N −61.61.5147 • W). In order to study the relationship between WD and the other meteorological parameters, a daily database was available from 2016 to 2021, i.e., 1827 points per time series. To guarantee the validity of the data, the latter were previously pre-processed by MF. In Figure 1, we can observe that LF and MF are close. Classically, the prevailing wind in the Caribbean basin comes from the east (∼90 • ), i.e., the trade wind. As WD exhibits huge fluctuations over time due to many factors such as topography [32], the predominant WD of each day was used in our analysis for a better understanding of its multi-scale behaviour.

MEMD
Within 25 years of its introduction, HHT has gained wide popularity for the spectral analysis of non-linear and non-stationary time series data [33]. It is a purely data adaptive method, which can produce physically meaningful representations of time series data. This method does not require a priori selection of functions, but instead it decomposes the signal into intrinsic oscillation modes derived from the succession of extrema. The HHT involves two major steps (i), the use of the empirical mode decomposition (EMD) method or its variants, such as the ensemble EMD (EEMD), or the complete EEMD with adaptive noise (CEEMDAN) [34,35], to decompose a time series into a collection of orthogonal time series, namely, intrinsic mode functions (IMFs), and a final residue; (ii) the use of Hilbert spectral analysis (HSA) to obtain the instantaneous frequency which may be helpful to identify embedded structures of the time series data.
The EMD method decomposes a time series into a set of zero mean components and a final residue, each with specific periodicity. The decomposition is carried out based on the physical time scales that characterize the oscillations of the phenomena [36]. Different non-stationary oscillation processes controlling the variables are governed by the IMFs of the EMD [37]. In a broad sense, comparing the periodicity of the modes with that of the driving factors governs the basis of teleconnection studies [38,39]. For such studies on geophysical time series, the meteorological factors or large-scale climatic oscillations are often considered as predictors. The residue shows the long-term inherent trend of the time series and a comparison between the residue components of the variables and governing factors are likely to be well correlated [40]. The trend is an intrinsically fitted monotonic function or a function in which there can be at most one extremum within a given data span [41]. In this study, the trend represents the changes or alternations in the most likely magnitude of the wind direction throughout time.
The EMD or its variants work well for single time series data at a time and it may result in a different number of data modes adaptively, based on data complexity. This may impose difficulties in developing hybrid decomposition models [42,43] and a decomposition method, which can result in an equal number of modes being a viable option in multiscale correlation studies, especially when multiple time series are involved. Multivariate EMD, proposed by Rehman and Mandic [27], is an extension of the traditional EMD, which decomposes multiple time series simultaneously after identifying the common scales inherent in different time series of concern. In this method, multiple envelopes are produced by taking projections of multiple inputs along different directions in an m-dimensional space. Assuming . v m (t)} being the m vectors as a function of time t, and X φ k = {x k 1 , x k 2 , . . . , x k m } denoting the direction vector along different directions given by angles where K is the total number of directions). It can be noted that the rotational modes appear as the counterparts of the oscillatory modes in EMD or its variants. The IMFs of m temporal datasets can be obtained by the following steps:

1.
A suitable set of direction vectors are generated by sampling on a (m − 1) unit hyper-sphere; 2.
The projection of the dataset P φ k (t) are calculated along the direction vector X φ k for all k; 3.
Temporal instants t φ k i are identified corresponding to the maxima of projection for all k; 4. [t i )] is interpolated to obtain multivariate envelope curves e φ k (t) for all k; 5.
The "detail" D(t) is extracted using D(t) = V(t) − M(t). If D(t) fulfils the stoppage criterion for a multivariate IMF, the above procedure from step (1) onwards is applied upon the residue series (i.e., V(t) − D(t)). Otherwise, steps from (2) onwards are repeated upon the series D(t).
The Hammersley sampling sequence can be used for the generation of direction vectors [44] and the stoppage criteria proposed for the EMD (i.e., the sum squared difference between the deviations of the mode from the mean signal in two consecutive iterations should be less than a specified tolerance) can be used in the implementation of the MEMD.

HSA
In the HSA, firstly the I MF(t) is convoluted with the function g(t) = 1 πt to obtain HT. As the function is a non-integrable one, the Cauchy principal value (PV) is considered instead of finding HT, in the following form [45]: Hence, any signal (X(t)) can be represented by combining I MF(t) and its HT as follows: is the amplitude, and θ(t) is the phase angle, which are defined as: In HSA, the amplitude and phase angle are both functions of time. Therefore in HSA, the plot between instantaneous frequency (IFs) and time depicting the variation of instantaneous amplitude (IAs) provide the Hilbert spectrum. IF can be computed as: The Hilbert spectrum can be developed for individual IMFs in the form: where i = 1, 2, . . . , N is the index of IMFs.

TDIC
In the MEMD-TDIC framework, multiple variables are decomposed into different time scales in a single step operation. It is noteworthy to mention that in TDIC, the data adaptive selection of optimal window size is followed, keeping the stationarity of the data within the window. To ensure this, the size of the sliding window is fixed based on the instantaneous period (IT) (computed by HT of IMFs). The different steps involved in the MEMD-TDIC analysis are:

1.
All time series data are decomposed using MEMD; 2.
The periodicities of the IMFs of the two time series of concern are compared and the IMFs with nearly same mean periodicity are selected; 3.
The ITs of both IMFs (of similar scale) are identified by HT; 4.
The minimum sliding window size (t d ) is identified as the maximum of ITs between the two signals at the current position t k , i.e., t d = max(T 1,i (t k ), T 2,i (t k ));

The sliding window is then fixed as
where n is any positive number (a multiplication factor for minimum sliding window size). In general, n is selected as 1 [46]; 6.
IMF1 and IMF2 are given as two IMFs of nearly the same mean period pertaining to two different time series. The TDIC of the pair of IMFs can be found out as where Corr is the correlation coefficient of two time series; 7.
Student's t-tests are performed to investigate whether the difference between the correlation coefficient R i (t n k ) and zero is statistically significant or not; 8.
Steps 4 to 7 are repeated iteratively till the boundary of the sliding window exceeds the end points of the time series.
The end result of the TDIC analysis will be in a matrix form, based on which the TDIC plot is developed. A triangular shaped plot depicting the correlations at different time instants and under different time scales is obtained from the analysis. The bottom contour of the triangular plots depicts IFs and hence a shift of the plots to larger time scales can be noticed in higher order IMFs (i.e., of low-frequency modes). The TDIC method has gained popularity in performing multi-scale correlation analysis between teleconnected time series from different fields [32,[46][47][48][49]. In this research, the HHT and TDIC methods are employed to understand the teleconnection between the hydroclimatic time series. Figure 2 shows a flowchart summarizing the complete framework of the multi-scale multidimensional correlation analysis.

MEMD Analysis
Before performing the TDIC analysis, first we applied the multi-scale decomposition using the MEMD frameworl. In Figures 3 and 4, all time series data were decomposed into 11 IMFs and one residue. The first IMF mode represents the fast fluctuations, while the last mode represents the slowest fluctuations, i.e., an increase in the time scale with the mode index [50,51]. At first glance, we can notice that some residues, i.e., the overall trend of the time series, seem to exhibit the same behaviour over time: a decrease for R-Rh, an increase for T-SR-WD and a mixture of both for U. The amplitude of these residues vary on small ranges showing that the changes in climatic regimes from one year to the next are not significant [52].

Instantaneous Frequency of Meteorological Parameters (IMFs)
To quantitatively determine the physical meaning behind the MEMD method and compare the IMFs, the mean periodicity of each IMF and their variability are presented in Table 1 for all parameters. The mean period is computed by the zero crossing method [34]. The ratio of variance of a mode (IMF or residue) to the variance of the data series (expressed in %) is computed as variability (V). Overall, the variability seems to decrease with increasing time scales. At IMF9, we notice a sudden increase in the variability. This may due to the impact of the Earth's annual cycle [53] on different meteorological parameters.
Then, HHT is applied for each IMF of all the variables and IFs and IAs are computed. The instantaneous frequency trajectories associated with the IMFs of different variables are presented in different panels of time-frequency-amplitude (TFA) spectra shown in Figures 5-10. In the TFA spectra, the colour scale indicates the distribution of amplitude (in respective units). In general, the high amplitudes are noticed at very localized time instants in high-frequency IMFs (up to weekly scale) to seasonal time scales of ∼4 months (IMF7). The components of intra-seasonal to inter-annual periodicity are affected significantly by high amplitude. The high amplitudes are present for a longer time in the TF spectra of the IMFs of larger time scales of intra-seasonal periodicity. However, such spells pertaining to different variables do not need to be at the same time instants, indicating that the dominant contributor to the WD variability at different time spells may be different. Strong singularities in the form of abrupt changes in the IFs are noted in the TFA spectra of different variables in the monthly to seasonal time scales (IMF5 to IMF7). However, the period of occurrence of such abrupt shifts is also noted at different time instants, even though one of such a shift around the year of 2018 is evident, in line with the peaks in the time series of variables such as rainfall, relative humidity, wind speed and wind direction.       In the study, three types of correlation analysis were performed (i), Pearson's correlation between the raw time series of meteorological variables with WD (ii), Pearson's correlation between the modes of variables and modes of WD series (iii), running correlation between the modes of variables and modes of WD series at different scales. The overall correlation between WD and the parameters are presented in Table 2. One can notice that the correlations are not significant in any of the cases. Table 2. Pearson's correlation coefficients of the raw time series between wind direction (WD) and rainfall (R), temperature (T), relative humidity (Rh), solar radiation (SR) and wind speed (U).

Parameters
Pearson Correlation To gain better insights into the association between meteorological variables and WD at different time scales, Pearson's correlations between the modes of variables and that of WD are computed and presented in Table 3. In Table 3, the significant correlations between WD and the other meteorological parameters (R-T-Rh-SR-U) are from IMF7 to IMF11, i.e., for large time scales. Unlike the other parameters, there is no significant correlation for the residual between U and WD. This seems to show that WD is less dependent on U for the overall trend. The micro-climate of this area will therefore have a major impact on WD behaviour. Indeed, since the study area is in the continental island regime of the island, the trade winds speed (U) will be highly dependent on thermal convection during the day and ground cooling during the night. Consequently, WD behaviour will be also strongly impacted [54]. Table 3. Correlation between the modes of wind direction and the modes of rainfall, temperature, relative humidity, solar radiation and wind speed. Significant correlations at the 5% level are marked in bold. IMF1  IMF2  IMF3  IMF4  IMF5  IMF6  IMF7  IMF8  IMF9 IMF10 Table 3. Cont.

TDIC Analysis
The Pearson's correlations between the modes of meteorological variables and WD showed that high correlations are noted only at low-frequency modes. However, it is worth mentioning that in the computation of the overall correlation between the modes, the complete data length is considered. The reasons behind low correlation between the modes in the high-frequency scale must be investigated in detail. However, in the process of the complex behaviour of WD controlled by many local meteorological variables, one cannot conclude that the same low-magnitude correlation will be preserved over the complete period of observation. On the other hand, the correlation may be very strong (positive or negative) at some of the localized time spells and very weak in some other spells in some of the time scales. The effect of positive correlation in some time spells might be nullified because of the negative correlation with other spells. Moreover, it is found that the overall correlation of the different variables with WD is very small (Table 2). Furthermore, even if the overall correlation in some scales is strongly positive (or negative), its influence will be cancelled due to strong negative (or positive) correlations at other scales, leading to an overall low correlation between the raw time series. Therefore, it is very important to find the localized correlations, for which a running correlation method should be used. Therefore, to analyse the localized correlations between WD and SR-T-Rh-R-U over time, a TDIC analysis was performed. The TDIC analysis creates a graphic triangle, where the vertical axis is the sliding window size and the horizontal axis represents the centre position of the sliding window. The minimum size of the sliding window is the maximum instantaneous period between WD and the other meteorological parameters, whereas the maximum window size is the entire time range [55]. The colour bar indicates the correlation intensity between the studied variables, i.e., red and blue highlight the positive and negative correlations, respectively. White regions in the triangle indicate statistical insignificance when the correlations do not pass the Student's t-test. Hence, the correlation is not statistically significant [28].
Overall, fom Figures 11-15, one can notice there were localized positive and negative correlations between WD and the meteorological parameters from IMF1 (∼3 days) to IMF8 (∼7 months). The transitions in the correlation are more frequent with T and Rh (Figures 12 and 13), implying that the local meteorology may have a strong interaction with these variables to determine WD behaviour. This can be explained by the impact of the urban heat island (UHI) previously identified in the study area by Plocoste et al. [54] which generates an urban breeze in the evening allowing the transportation of volatile organic compounds (VOCs) emitted by LF to the urban areas opposite to the flow of the trade winds, i.e., to the west [31,54]. This urban breeze may explain the dominance of negative associations between WD and U in Figure 15. During this event, the UHI can reach 6 • C and generate a breeze of 1 m/s [54]. The cold and moist air coming from the mangrove will be loaded with air pollutants while passing over the LF and then cool T of the contiguous urban areas. The authors assume that this will also increase the humidity of these areas.
In Figure 11, there is also a dominance of negative correlation between WD and SR. This may be due to the impact of the diurnal cycle on the atmospheric boundary layer behaviour [56]. During the day, the mixed layer is at its maximum height, allowing the trade wind to establish itself in the study area. In the evening, the radiative cooling of the ground pushes the trade wind above the nocturnal boundary layer allowing the establishment of the urban breeze in the surface layer, thus changing WD behaviour.
From Figures 11-15, the relationships are long range, consistent and no transition in the correlation was noticed at IMF9, i.e., annual scale. These results dynamically show the impact of the Earth's annual cycle [53] which tends to homogenize the interactions between the meteorological parameters. IMF8 (∼7 months) to IMF10 (∼2.5 years) in Figure 14, R display a strong positive correlation with the IMFs of WD. R is the only variable that shows a strong positive correlation with WD on three consecutive time scales. As heavy rains and cyclones can cause micro-bursts that modify WD behaviour [57], we assume that these three large scales may correspond to periods without extreme rainy events.
The multi-scale correlation analysis can be considered as one of the potential prerequisites for developing hybrid decomposition machine learning models for the prediction of complex time series. In such a framework, the dominant predictors in different time scales can be identified using TDIC-based multi-scale correlation analysis. Thereafter predictions can be carried out for individual scales separately and final integration will provide the prediction of wind direction. The framework is well-described and applied for the prediction of geophysical series [43,50,58]. Thus, the study has great potential for improving the predictability of wind direction, hence modelling the transportation of air pollutants. Figure 11. TDIC plots between solar radiation (SR) and wind direction (WD). The white spaces imply that correlation is not significant at the 5% level. Figure 12. TDIC plots between temperature (T) and wind direction (WD). The white spaces imply that correlation is not significant at the 5% level. Figure 13. TDIC plots between relative humidity (Rh) and wind direction (WD). The white spaces imply that correlation is not significant at the 5% level. Figure 14. TDIC plots between rainfall (R) and wind direction (WD). The white spaces imply that correlation is not significant at the 5% level. Figure 15. TDIC plots between wind speed (U) and wind direction (WD). The white spaces imply that correlation is not significant at the 5% level.

Conclusions
Wind direction is a key parameter during pollutant transportation. In insular contexts, there is a wide variety of meteorological contexts due to micro-climates. For the first time in the literature, the coupled MEMD-TDIC has been used to study the dynamical relationship between wind direction (WD) and other meteorological parameters (rainfall (R), temperature (T), relative humidity (Rh), solar radiation (SR) and wind speed (U)) in a multi-scale way.
Overall, localized positive and negative correlations between WD and the meteorological parameters were identified from ∼3 days to ∼7 months. The alternation between these correlations were more significant for T and Rh. For SR and U, there is a dominance of a negative correlation with WD. We believe that the local micro-climate specific to the study area may explain all these behaviours. From ∼7 months to ∼2.5 years, there is a strong positive correlation between WD and R. We assume that these time scales correspond to a period without extreme rainy events. It is important to underline that at the annual scale, the relationships are of long range, consistent and no significant transition in correlation was found between WD and all the meteorological parameters. This time scale shows the influence of the Earth's annual cycle on the behaviour of meteorological parameters.
The results obtained in this study clearly show the impact of R-T-Rh-SR-U on WD over different time scales. Due to the small size of the Caribbean islands, these results are crucial because they provide information on the impact of micro-climates on WD behaviour. In order to develop predictive models for WD, the lagged influence of correlations also needs to be studied. This will be the subject of our next study using the time-dependent intrinsic cross correlation (TDICC) framework.

Funding:
The present study has no external funding.

Data Availability Statement:
The data presented are available on request from the corresponding author. The data are not publicly available due to privacy or ethical reasons.