Multivariable Panel Data Cluster Analysis of Meteorological Stations in Thailand for ENSO Phenomenon

The purpose of this research is to study the spatial and temporal groupings of 124 meteorological stations in Thailand under ENSO. The multivariate climate variables are rainfall, relative humidity, temperature, maximum temperature, minimum temperature, solar downwelling, and horizontal wind from the conformal cubic atmospheric model (CCAM) in years of El Niño (1987, 2004, and 2015) and La Niña (1999, 2000, and 2011). Euclidean distance timed and spaced with average linkage was employed for clustering, and silhouette width was employed for cluster validation. Five spatial clusters (SCs), and three temporal clusters (TCs) within each SC, with different average precipitation were compared between El Niño and La Niña. The pattern of SCs and TCs was similar for both events except when a severe El Niño occurred. The method could be applied to forecasted variables to support the planning and management of crop cultivation under climate change in each area.


Introduction
In the past, the climate in Thailand was largely governed by the monsoon winds, namely the southwest monsoon and the northeast monsoon, so that the rainy season and the dry season (summer and winter) occurred at fairly regular times. More recently, however, the El Niño-La Niña phenomenon, known as ENSO, has also affected the climate. ENSO is caused by variations in the Southern Hemisphere's climate system and links ocean conditions with the winds over the ocean. It brings about climatic variations, causing unusually high rainfall and unusual drought [1]. There are three types of weather variability: drought, rain and cold disasters, and tropical cyclones. Because of its proximity to the western Pacific, Thailand was directly affected by the El Niño of 1997-1998, which resulted in drought, lower-than-normal rainfall, and higher-than-normal air temperatures across the country [2]. During the La Niña of 1999-2000, Thailand experienced more rainfall than usual and cold weather that broke records in many provinces [2]. Thailand lies in the humid tropics, which are suitable for agriculture. Most of its population is engaged in agriculture, so agricultural products are the main source of the country's income and are therefore vital to its economy. The 12th Agricultural Development Plan (2017-2021) summarizes the agricultural situation in terms of climate change and seasonal variability, which have decreased agricultural productivity. Existing plant species are unable to adapt to changing climate conditions, especially the prolonged drought from 2012 to 2015, which damaged important crops. This may be because farmers lack the observations or experience needed to cope with unprecedented situations in time, posing a risk of lost productivity and increased production costs [3].
ENSO-related climate variability exerts strong influences on agricultural production in different regions, including in Thailand [4][5][6][7][8][9].
Cluster analysis, an unsupervised learning technique, has been applied in many studies to characterize the spatial and temporal variability of climate variables. Most previous studies used a single variable, usually a rainfall time series, for spatial and temporal clustering [10][11][12]. However, other climate factors also affect agricultural production. Relative humidity and temperature, for example, had a statistically significant effect on sugarcane production, which tended to decrease in El Niño years and increase in La Niña years [13]. Some studies did employ several longitudinal meteorological factors, such as rainfall, air temperature, humidity, pressure, wind, and evaporation, but they first averaged the data over time into ordinary cross-sectional data and only then calculated the distance between samples for clustering [14]. Averaging over time loses a large amount of information because the mean describes the average level of the data but not their distribution [15][16][17][18].
It would be beneficial to study variation across different geographic scales by applying multivariable panel hierarchical clustering to ENSO-affected climate variables in Thailand obtained from the conformal cubic atmospheric model (CCAM). The weather variables include rainfall, average temperature, highest temperature, lowest temperature, temperature difference from highest temperature, temperature difference from lowest temperature, relative humidity, and solar radiation at the locations of the weather stations of the Thai Meteorological Department. These monthly data are panel data, that is, a combination of cross-sectional and time-series data representing behavioral units over periods. Therefore, this research employs a distance measure that does not require averaging the data, the Euclidean distance timed and spaced, to cluster meteorological stations in Thailand and to discover the seasonal pattern of each cluster using climate factors associated with precipitation during ENSO events, since changes in rainfall strongly affect agricultural productivity. The method studied here, cluster analysis of multivariable panel data applied to climate change, could therefore be applied to future data from weather models to group areas and seasons. The clustering framework applied in this study is shown in Figure 1. The results could serve as a guideline for the agricultural sector and the relevant agencies in preparing for the changes brought about by climate change. In addition, spatially and temporally targeted management plans could be carried out appropriately, including drought monitoring, water management in agricultural areas, and crop management.

Study Area
Thailand is located between latitudes 5°37′ N and 20°27′ N and longitudes 97°22′ E and 105°37′ E. A total of 124 stations of the Thai Meteorological Department (Figure 2) were selected for the cluster analysis.

El Niño-Southern Oscillation (ENSO)
El Niño-Southern Oscillation (ENSO) is a periodic change in the ocean-atmosphere system of the tropical Pacific Ocean that affects climate around the world. It occurs every three to seven years (five years on average), typically lasts nine months to two years, and is associated with floods, droughts, and other global disturbances. During normal (non-El Niño) conditions, trade winds blow westward across the Pacific Ocean. The western equatorial Pacific is then characterized by warm, wet, low-pressure weather because moisture accumulates there in the form of typhoons and thunderstorms.
During an El Niño event, air pressure increases over the Indian Ocean, Indonesia, and Australia and decreases over Tahiti and the rest of the central and eastern Pacific Ocean. The trade winds in the South Pacific weaken or head east, and warm water spreads eastward from the western Pacific and Indian Ocean to the eastern Pacific. This leads to widespread drought in the western Pacific and rainfall in the normally dry eastern Pacific. While El Niño is characterized by unusually warm ocean temperatures in the central to eastern Pacific Ocean, La Niña is characterized by unusually cold ocean temperatures in that region but warmer waters in the western Pacific Ocean, as shown in Figure 3. When El Niño conditions last for several months, more extensive ocean warming occurs.
In this study, the Oceanic Niño Index (ONI) from the National Oceanic and Atmospheric Administration (2020) was used to identify El Niño-Southern Oscillation events. The ONI is the 3-month running mean of the sea surface temperature anomaly in the Niño 3.4 region (5° N-5° S, 120°-170° W). An ONI exceeding +0.5 °C or −0.5 °C for at least five consecutive months was considered a full-fledged El Niño (E) or La Niña (L). According to Null's report, the three latest very strong El Niño events (ONI ≥ 2 °C) in 1982, 1997, and 2015 and the three latest strong La Niña events (−1.5 to −1.9 °C) in 1999, 2007, and 2011 were selected to study the climate variations [19].
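The ONI rule described above can be illustrated with a short sketch. It is not the procedure used to produce the event list in [19]; the function name `enso_phase`, the use of inclusive thresholds (`>=` and `<=`), and the sample values are assumptions made only for illustration.

```python
from typing import List


def enso_phase(oni: List[float], threshold: float = 0.5, min_run: int = 5) -> List[str]:
    """Label each ONI value as 'E' (El Niño), 'L' (La Niña), or 'N' (neutral).

    A value is labelled 'E' or 'L' only if it belongs to a run of at least
    `min_run` consecutive values at or beyond +threshold or -threshold.
    The inclusive comparison is an assumption of this sketch.
    """
    labels = ["N"] * len(oni)
    start = 0
    while start < len(oni):
        if oni[start] >= threshold:
            sign = 1
        elif oni[start] <= -threshold:
            sign = -1
        else:
            start += 1
            continue
        # Extend the run while the anomaly stays on the same side of the threshold.
        end = start
        while end < len(oni) and sign * oni[end] >= threshold:
            end += 1
        if end - start >= min_run:
            for i in range(start, end):
                labels[i] = "E" if sign == 1 else "L"
        start = end
    return labels


# Hypothetical ONI values: five warm periods, one neutral, four cold periods.
sample = [0.6, 0.8, 1.1, 0.9, 0.5, 0.1, -0.7, -0.9, -0.8, -0.6]
print(enso_phase(sample))
```

In this hypothetical series the warm run is long enough to count as an El Niño episode, whereas the cold run of four values is too short and stays neutral.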

Conformal Cubic Atmospheric Model (CCAM)
The CCAM is a dynamic global climate model developed by the Commonwealth Scientific and Industrial Research Organization (CSIRO), Division of Atmospheric Research, Australia. It is used to forecast global climate through dynamical downscaling by generating a grid that covers the forecast region [20]. The model has been further developed by adding physical parameterization schemes, including longwave radiation, shortwave radiation, aerosol, cumulus convection, cloud distribution, and soil temperature, to reduce the climate forecast error. The CCAM dataset was downscaled to a 10 km grid resolution, which is sufficient for analyzing both spatial and temporal forecasts at the regional level [21,22]. The data were converted from grid format to station format, covering the 124 meteorological stations across Thailand (Figure 2).
The climate variables used for cluster analysis in this study were chosen mainly for their relevance to agriculture. They comprise seven variables: rainfall (mm/day), relative humidity (percent), average temperature (degrees Celsius), maximum temperature (degrees Celsius), minimum temperature (degrees Celsius), solar radiation (watts/square meter), and wind speed (m/s). Monthly data for these variables were collected for the years 1987, 1999, 2000, 2004, 2011, and 2015, in which ENSO events occurred.

Multivariate Panel Data
Panel data are a combination of cross-sectional data and time-series data representing behavioral units over time, x_ij(t). Cross-sectional data record the values of the variables for each unit at a given point in time. The same units are then observed repeatedly at subsequent times, whether yearly, quarterly, monthly, weekly, daily, or hourly. If every panel unit is observed at every time point, the data set is called balanced panel data. Consequently, if a balanced panel contains n panel units and T periods, the number of observations in the dataset is necessarily N = n × T. However, if at least one panel unit is not observed in every period, the data set is called unbalanced panel data, and the number of observations is N < n × T.
Multivariate panel data have a complex structure and cannot be represented directly by a simple two-dimensional table. Table 1 shows how the multivariate data are combined in a two-dimensional table format, where n represents the number of samples collected, p represents the number of variables (x_1, x_2, ..., x_p), T represents the length of time, and x_ij(t) represents the data value of the ith sample and jth variable at time t, where i ∈ [1, n]; j ∈ [1, p]; t ∈ [1, T]. Descriptive statistics, such as the mean and variance of the jth variable, are calculated as in Equations (1) and (2), respectively [15][16][17][18].
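Equations (1) and (2) are not reproduced in this text. A plausible form, assuming that the mean and variance of the jth variable are pooled over all n samples and T periods and that the variance uses the nT − 1 denominator (both are assumptions, not confirmed by the source), is:

```latex
\bar{x}_j = \frac{1}{nT} \sum_{i=1}^{n} \sum_{t=1}^{T} x_{ij}(t)
\qquad \text{(assumed form of Equation (1))}

S_j^2 = \frac{1}{nT - 1} \sum_{i=1}^{n} \sum_{t=1}^{T} \bigl( x_{ij}(t) - \bar{x}_j \bigr)^2
\qquad \text{(assumed form of Equation (2))}
```

With n = 124 stations and T = 12 months, each variable would be summarized over nT = 1488 observations per year under this reading.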
The monthly climate variables were organized in two matrix configurations. The first, an N × p matrix, had the monthly data (T) for the stations (n) in its rows (N = n × T) and the variables (p) in its columns; it was used to identify clusters of similar stations. The monthly climate variables of the stations within each of these clusters (N_c) were then analyzed to discover seasonality within the spatial cluster. For this second step, the monthly climate variables were arranged in T × N_c rows, with the variables (p) in the columns (Table 2).

[Table 2: rows indexed by month (t) and station index (i); columns are the p variables.]
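The two arrangements described above can be sketched as follows. This is only an illustration of the row layout under the assumption of a balanced panel stored as nested lists `panel[i][t][j]` (station i, month t, variable j); the function names and the toy values are hypothetical.

```python
from typing import List, Sequence, Tuple

# panel[i][t][j]: station i, month t, variable j (balanced panel assumed)
Panel = Sequence[Sequence[Sequence[float]]]


def stack_panel(panel: Panel) -> Tuple[List[Tuple[int, int]], List[List[float]]]:
    """First arrangement: N = n * T rows (station, month) by p variable columns."""
    index: List[Tuple[int, int]] = []
    rows: List[List[float]] = []
    for i, station in enumerate(panel):
        for t, values in enumerate(station):
            index.append((i, t))
            rows.append(list(values))
    return index, rows


def stack_cluster_by_month(
    panel: Panel, members: Sequence[int]
) -> Tuple[List[Tuple[int, int]], List[List[float]]]:
    """Second arrangement: T * N_c rows (month, station) for one spatial cluster."""
    n_months = len(panel[members[0]])
    index: List[Tuple[int, int]] = []
    rows: List[List[float]] = []
    for t in range(n_months):
        for i in members:
            index.append((t, i))
            rows.append(list(panel[i][t]))
    return index, rows


# Toy panel: 2 stations, 3 months, 2 variables (made-up numbers).
toy = [
    [[1.0, 25.0], [2.0, 26.0], [3.0, 27.0]],
    [[4.0, 24.0], [5.0, 25.0], [6.0, 26.0]],
]
idx, mat = stack_panel(toy)
print(len(mat), len(mat[0]))  # rows = n * T, columns = p
```

Keeping the (station, month) index next to the rows makes it possible to map each row back to its station and month after clustering.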

Multivariate Cluster Analysis
Cluster analysis is an unsupervised learning technique that places observations with similar characteristics in the same group [23]. Agglomerative hierarchical clustering was used in this research. This bottom-up algorithm treats each sample as a single cluster and then repeatedly merges the pair of clusters that are most similar until all samples are grouped into one cluster. For ordinary cross-sectional data, block distance, Euclidean distance, Minkowski distance, Chebyshev distance, or Mahalanobis distance is used to measure the distance between two vectors. For cluster analysis of samples from multivariate panel data, the data are often averaged over time into ordinary cross-sectional data, and the usual Euclidean distance is then calculated for grouping. However, this results in information loss because the mean shows the average level of the data but not the characteristics of their distribution, such as the standard deviation. Therefore, in this study, the Euclidean distance timed and spaced (d_rk) is used to calculate the distance between sample r and sample k [15][16][17][18], as in Equation (3).
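Equation (3) is not reproduced in this text. The sketch below assumes the form commonly given for this distance, d_rk = sqrt(sum over t and j of (x_rj(t) − x_kj(t))^2), that is, the squared differences are accumulated over every month and every variable without averaging over time. Whether the variables were standardized beforehand is not stated in the source, so no scaling is applied here; the function names are hypothetical.

```python
import math
from typing import List, Sequence

Series = Sequence[Sequence[float]]  # one sample: T rows (time) by p columns (variables)


def ets_distance(a: Series, b: Series) -> float:
    """Euclidean distance accumulated over all time points and variables (assumed form)."""
    if len(a) != len(b):
        raise ValueError("both samples need the same number of time points")
    total = 0.0
    for row_a, row_b in zip(a, b):
        if len(row_a) != len(row_b):
            raise ValueError("both samples need the same number of variables")
        total += sum((x - y) ** 2 for x, y in zip(row_a, row_b))
    return math.sqrt(total)


def distance_matrix(samples: Sequence[Series]) -> List[List[float]]:
    """Symmetric matrix of pairwise distances with zeros on the diagonal."""
    n = len(samples)
    d = [[0.0] * n for _ in range(n)]
    for r in range(n):
        for k in range(r + 1, n):
            d[r][k] = d[k][r] = ets_distance(samples[r], samples[k])
    return d


# Two toy samples with T = 2 time points and p = 2 variables.
print(ets_distance([[0.0, 0.0], [0.0, 0.0]], [[3.0, 0.0], [0.0, 4.0]]))
```

Because no time averaging takes place, two stations with the same annual means but different monthly profiles obtain a non-zero distance, which is the property the text argues for.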
The distance should satisfy the usual conditions of a distance measure, such as non-negativity, d_rr = 0, and symmetry (d_rk = d_kr). The distance matrix for spatial grouping analysis contains a distance value between every pair of samples, as in Equation (4a); it is a symmetric n × n matrix with all diagonal values equal to zero. Likewise, the distance matrix for temporal grouping analysis within a spatial cluster contains a distance value between every pair of months, as in Equation (4b); it is a symmetric 12 × 12 matrix with all diagonal values equal to zero.
Average linkage, also known as the unweighted pair group method using arithmetic averages (UPGMA), was used to average the distance values between pairs of clusters [24]. It is widely used because it is a compromise between the extremes of single and complete linkage [25].
The multivariate cluster analysis in this paper was implemented using the "philentropy", "cluster", "factoextra", and "FactoMineR" packages in the R programming language with RStudio [26].
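Although the analysis itself was carried out in R, the average-linkage rule can be illustrated independently of any package. The following is a minimal, unoptimized sketch that works on a precomputed symmetric distance matrix; the function name `average_linkage` and the toy distances are assumptions for illustration only and do not reproduce the R implementation.

```python
from typing import List, Optional, Sequence, Tuple


def average_linkage(dist: Sequence[Sequence[float]], n_clusters: int) -> List[List[int]]:
    """Agglomerative clustering with UPGMA on a symmetric distance matrix.

    Starts with one cluster per sample and repeatedly merges the two clusters
    with the smallest mean pairwise distance until `n_clusters` remain.
    """
    if not 1 <= n_clusters <= len(dist):
        raise ValueError("n_clusters must be between 1 and the number of samples")
    clusters: List[List[int]] = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best: Optional[Tuple[float, int, int]] = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average of all distances between members of the two clusters.
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                avg = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or avg < best[0]:
                    best = (avg, a, b)
        assert best is not None
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Toy example: four samples located at 0, 1, 10, and 11 on a line.
points = [0.0, 1.0, 10.0, 11.0]
toy_dist = [[abs(p - q) for q in points] for p in points]
print(average_linkage(toy_dist, 2))
```

Cutting the hierarchy at a fixed number of clusters, as done here, corresponds to choosing the same number of SCs or TCs for every year so that the events can be compared.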

Cluster Validation
This paper employed the silhouette width (S_i) [27] to determine the optimal number of clusters; it can also be used to validate the consistency within clusters of data. The silhouette compares the similarity of the ith observation to its own cluster with its similarity to the other clusters, as in Equation (5):
\[ S_i = \frac{b_i - a_i}{\max(a_i, b_i)} \qquad (5) \]
where a_i is the average distance between i and all other observations in the same cluster, and b_i is the average distance between i and the observations in the "nearest neighboring cluster", as in Equation (6):
\[ a_i = \frac{1}{n(C(i)) - 1} \sum_{j \in C(i),\, j \neq i} d(i, j), \qquad b_i = \min_{C \neq C(i)} \frac{1}{n(C)} \sum_{j \in C} d(i, j) \qquad (6) \]
where C(i) is the cluster containing observation i, d(i, j) is the Euclidean distance timed and spaced between observations i and j, and n(C) is the cardinality of cluster C. S_i ranges from −1 to +1, where a high value indicates that the observation is well matched to its own cluster, while a low or negative value indicates that it is poorly matched. The average silhouette of the observations in a cluster was used to determine whether the clustering configuration is appropriate. An advantage of the silhouette is that it depends only on the partition of the observations and their pairwise distances, not on the clustering algorithm used, so the original data need not be accessed. The silhouette was computed with the silhouette function in the package cluster [28].
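The silhouette definition can be checked on a small case with the sketch below, which is separate from the R function used in the paper. It takes a precomputed distance matrix and cluster labels; assigning a width of 0 to observations in single-member clusters follows the usual convention and is stated here as an assumption, and the function name and toy data are hypothetical.

```python
from typing import Dict, Hashable, List, Sequence


def silhouette_widths(dist: Sequence[Sequence[float]], labels: Sequence[Hashable]) -> List[float]:
    """Silhouette width of every observation from a distance matrix and labels."""
    members: Dict[Hashable, List[int]] = {}
    for i, lab in enumerate(labels):
        members.setdefault(lab, []).append(i)
    if len(members) < 2:
        raise ValueError("at least two clusters are required")
    widths: List[float] = []
    for i, lab in enumerate(labels):
        own = members[lab]
        if len(own) == 1:
            widths.append(0.0)  # convention assumed for single-member clusters
            continue
        # a_i: mean distance to the other members of the own cluster.
        a_i = sum(dist[i][j] for j in own if j != i) / (len(own) - 1)
        # b_i: smallest mean distance to the members of any other cluster.
        b_i = min(
            sum(dist[i][j] for j in other) / len(other)
            for other_lab, other in members.items()
            if other_lab != lab
        )
        denom = max(a_i, b_i)
        widths.append((b_i - a_i) / denom if denom > 0 else 0.0)
    return widths


# Toy example: four samples at 0, 1, 10, and 11 on a line, split into two clusters.
points = [0.0, 1.0, 10.0, 11.0]
toy_dist = [[abs(p - q) for q in points] for p in points]
widths = silhouette_widths(toy_dist, [0, 0, 1, 1])
print(sum(widths) / len(widths))  # average silhouette width of the partition
```

Computing the average width for several candidate numbers of clusters and comparing the values mirrors how the number of spatial clusters is selected in the Results.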

Results
Figure 4 shows boxplots of the seven variables; they vary by month but follow the same pattern each year.

Variable Characteristics
Rainfall was more varied in 1997 and 2007 than in the other El Niño and La Niña years, respectively. The average rainfall in La Niña years was higher than that in El Niño years and than the normal average, except in 1999, which was affected by the very strong El Niño of 1997-1998. Furthermore, all factors followed a seasonal pattern in each year. For example, rainfall was very high and fluctuated more from August to September. It can be concluded that the climate factors differed from month to month and from year to year. Rainfall differed clearly between El Niño and La Niña years, while the other climate factors were similar. This suggests that the analysis of the impact of ENSO on spatial clustering should focus on rainfall.

Spatial Clustering
The average silhouette width was used to determine a suitable number of clusters (k). It suggested a value of 4 or 5 for k, where the width was largest (Figure 5). To allow a fair comparison between the ENSO events, five spatial clusters (SCs), obtained by cutting the dendrogram close to a height of 12.5 (distance between clusters), were chosen for all datasets in this study. The five spatial clusters, SC1, SC2, SC3, SC4, and SC5, were sorted in ascending order of precipitation and displayed on a spatial map in Figure 6. Precipitation was the only meteorological variable that differed noticeably between clusters. In El Niño events, most stations were grouped in SC2 (yellow), with 62-66 members, except in 1982, when most were in SC1 (red), with 59 members, although its average rainfall was nearly the same as that of SC2. In La Niña events, most stations were grouped in SC1 (red), with 61-83 members. SC5 (pink) was the least populated cluster, with a single member, a station in the East, for both events (Table 3). These results show that most areas in Thailand had a low precipitation rate.
In La Niña events, SC1 (red), which had the least rainfall, was found mostly in the Northeast and Central areas, and SC2 (yellow), which had low rainfall, was widely distributed in the North. SC3 (green), with moderate rainfall, was distributed among all regions except the North and Northeast, while SC4 (blue), with fairly high rainfall, was in the South. Lastly, SC5 (pink), with the highest rainfall, had one station in the East (Table 3).
Table 3 abbreviations: n, number of members; C, Central; E, East; N, North; NE, Northeast; S, South; W, West.
In contrast, spatial clustering in El Niño events was distributed differently from year to year. In 1997 and 2015, SC1 (red) was found mostly in the North and SC2 (yellow) was widely distributed in the Northeast, and vice versa in 1982. In 1982 and 1997, SC3 (green), with moderate rainfall, was distributed among all regions except the North and Northeast, and SC4 (blue), with fairly high rainfall, was in the South, and vice versa in 2015. In every year, SC5 (pink), with the highest rainfall, had one station in the East (Table 3).
For the El Niño events, the spatial clustering identified drought areas in the North, classified as SC1 with less rainfall than SC2, and in the South, classified as SC3 with less rainfall than SC4. These areas would be at the greatest risk of drought. This suggests an effect of ENSO on spatial clustering.
The distribution of SCs over the six regions, which shows a clear trend in the redistribution of the SCs observed in this study, is given in Table 4. The climate was more diverse in the East and West than in the other regions. All regions had a heterogeneous meteorological distribution: every region had 2-4 SCs in every year for both El Niño and La Niña events. However, fewer SCs were found for El Niño (2015) in the Central, East, and West regions and for La Niña (1999) in the West, and more SCs were found for La Niña (1999) in the South. These differences may be due to changes in the TCs and in the intensity of the climate factors.
Table 4. The distribution of SCs over six regions for ENSO.
Region      Stations (n)   El Niño              La Niña
                           1982   1997   2015   1999   2007   2011
Central     26             3      3      2      3      3      3
East        15             4      4      3      4      4      4
North       16             2      2      2      2      2      2
Northeast   28             2      2      2      2      2      2
South       27             2      2      2      3      2      2
West        12             4      4      3      3      4      4

Temporal Clustering
After the spatial clusters had been obtained, the Euclidean distance timed and spaced with average linkage was applied to the monthly climate factors of each SC to find temporal clusters (TCs) within that SC. Normally, Thailand has three seasons: summer (February-May), rainy (May-October), and winter (October-February). To compare the temporal clusters across ENSO events, three TCs within each SC were compared in this study. TC1, TC2, and TC3, sorted in ascending order of precipitation, are represented by orange, blue, and green, respectively. The TCs of each SC are shown in dendrograms that depict the clusters and how they merge, with dissimilarity on the vertical axis and the samples (months) in clustering order on the horizontal axis. They help to show how long each season lasts and how the timing of the seasons differs among the spatial clusters (Figure 7).
For example, in 1982, TC1 and TC2 in SC1 depicted a very dry season with an average precipitation intensity of less than 2 mm/day (Table 5). Together they comprised three months: December (TC1) and January-February (TC2). TC3, on the other hand, was a slightly wet season with an average precipitation of 2 mm/day or more for 9 months, March-November. In SC2, TC1 and TC2 also depicted a very dry season with an average precipitation intensity of less than 2 mm/day; TC1 comprised February-March and TC2 comprised December and January. TC3 was a wet season with an average precipitation of 4.51 mm/day for 8 months, April-November. In SC3, TC1 and TC2 formed a dry season of three months, December (TC1) and January-February (TC2), with an average precipitation of less than 2 mm/day, while TC3 was a wet season with an average precipitation of 4 mm/day or more for 9 months, March-November.
TCs in SC4 and SC5, 1982 were the same. TC1 was a dry season, January-February, with an average precipitation of less than 2 mm/day, while TC2 and TC3 were a wet season with an average precipitation of 2 mm/day or more for 10 months, December (TC2) and March-November (TC3).
The TCs in each SC in La Niña events were similar to those in El Niño events. Nevertheless, the average precipitation intensity was higher in La Niña than in El Niño. Furthermore, the rainy season lasted longer in SC4 and SC5 for both ENSO events.
Rainfall was lower than usual, so there was widespread drought in almost all regions of Thailand in 1982 and 1997, especially in the Northeast [30]. There was also a severe El Niño effect in 2015, causing very low precipitation across the country (x̄ = 2.28 mm/day).
Five spatial clusters were formed. SC5, with the highest average precipitation, consisted of only one station, in Khlong Yai District, Trat Province, in every year regardless of whether an El Niño or a La Niña occurred (x̄ = 5.64-14.74 mm/day). Khlong Yai District lies on a coastline fully exposed to the southwest monsoon from the Gulf of Thailand; consequently, it has abundant rainfall for most of the year. This is consistent with the Trat Agricultural Meteorological Document, which reports that Khlong Yai District, Trat Province, is the wettest area in Thailand [31].
Approximately 80 stations fell in SC1 and SC2, mostly in the Central, North, and Northeast regions; these clusters had low average precipitation, which was especially low in 2015. This is consistent with a report that rainfall in these three regions during El Niño was below the 30-year average rainfall of normal years.
There were three TCs in each SC. When the El Niño phenomenon occurred, rainfall in Thailand tended to be lower than normal, especially during the summer and early rainy season (mid-February-June). The dry season in El Niño years was longer and had lower average rainfall than the corresponding TCs in La Niña years.
Most stations in the south were clustered into SC3 and SC4, with moderate and high rainfall, respectively, for both El Niño and La Niña. Usually, rainfall in Thailand, especially on the southeast coast, is high during October-December. In addition, some parts of Thailand were not affected by the ENSO phenomenon (El Niño and La Niña), such as Trat in SC5, with the highest rainfall, and Tak, Chiang Rai, Chiang Mai, Phayao, and Lampang in SC1, with the least rainfall. This may be due to their topography.
There are 35 provinces with more than one TMD meteorological station. In 34 of these provinces, the stations were grouped into different SCs. This may be due to topography producing different local climates within a province.
Spatial clusters were similar for both El Niño and La Niña except in 2015, when a severe El Niño occurred. This might be because the Euclidean distance matrix tends to cluster samples whose climate variables have similar means. This suggests that other similarity measures, such as correlation, could be used to group samples based on trends and variation over time [11].
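The contrast between the two measures can be seen in a small sketch. The three rainfall series below are hypothetical (they are not station data from the study), and `correlation_distance` is an illustrative helper defined here as one minus the Pearson correlation, which is one common convention and an assumption on our part.

```python
from math import dist, sqrt
from statistics import mean


def correlation_distance(x, y):
    """One minus the Pearson correlation between two equal-length series."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return 1 - cov / (sx * sy)


# Hypothetical rainfall series (mm/day) for three stations over six months.
station_a = [1, 2, 4, 6, 4, 2]        # mid-period peak, low mean
station_b = [6, 7, 9, 11, 9, 7]       # same shape as A, shifted up by 5
station_c = [4, 3, 2.5, 2, 3, 3.5]    # opposite shape, mean close to A

euclid_ab = dist(station_a, station_b)
euclid_ac = dist(station_a, station_c)
corr_ab = correlation_distance(station_a, station_b)
corr_ac = correlation_distance(station_a, station_c)

# Euclidean distance pairs A with C (similar mean level),
# whereas correlation distance pairs A with B (same temporal pattern).
print("Euclidean:   A-B", round(euclid_ab, 2), " A-C", round(euclid_ac, 2))
print("Correlation: A-B", round(corr_ab, 2), " A-C", round(corr_ac, 2))
```

Under Euclidean distance, station A is closer to station C because their mean levels are alike, although their seasonal patterns are opposite. Under correlation distance, A is closest to B, whose pattern over time is identical apart from a constant offset.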

Conclusions
This paper employed multivariate cluster analysis with average linkage to analyze spatial and temporal groupings, using seven climate factors (rainfall, relative humidity, average temperature, maximum temperature, minimum temperature, solar radiation, and wind speed) at 124 locations over Thailand from CCAM (10 km) for the years 1982, 1997, and 2015 (El Niño) and 1999, 2007, and 2011 (La Niña).
Five SCs, obtained at a between-cluster distance of 12.5, were compared. The SCs were similar for both El Niño and La Niña except in 2015, when a severe El Niño occurred. This indicates that the more severe the El Niño, the greater the spatial variation. The main difference among SC1-SC5 was the increasing amount of precipitation: SC1 had the least rainfall and SC5 the heaviest.
In addition, three TC patterns in each SC were similar for both El Niño and La Niña. Nevertheless, the average precipitation intensity in La Niña was higher than that in El Niño.
This paper implements cluster analysis on atmospheric panel data. Although multivariable panel data are more complicated, clustering them is practical. The cluster results are also more realistic than those obtained from cross-sectional data and avoid information loss.
Future studies may focus on clustering climate factors projected by weather forecast models to study their spatial and temporal distributions. In addition to the correlation distance suggested above, robust distances that deal with outliers, for example the absolute distance or the Canberra distance, should be studied further. Furthermore, because extreme weather events under the ENSO phenomenon, for example unusually low or abundant precipitation, may affect the clustering, outliers should be detected and handled beforehand.

Conflicts of Interest:
The authors declare no conflict of interest.