Study of Temporal Variations in Species–Environment Association through an Innovative Multivariate Method: MixSTATICO

: The study of biotic and abiotic factors and their interrelationships is essential in the preservation of sustainable marine ecosystems and for understanding the impact that climate change can have on different species. For instance, phytoplankton are extremely vulnerable to environmental changes and thus studying the factors involved is important for the species’ conservation. This work examines the relationship between phytoplankton and environmental parameters of the eastern equatorial Pacific, known as one of the most biologically rich regions in the world. For this purpose, a new multivariate method called MixSTATICO has been developed, allowing mixed-type data structured in two different groups (environment and species) to be related and measured on a space–time scale. The results obtained show how seasons have an impact on species–environment relations, with the most significant association occurring in November and the weakest during the month of May (change of season). The species Lauderia borealis , Chaetoceros didymus and Gyrodinium sp. were not observed in the coastal profiles during the dry season at most stations, while during the rainy season, the species Dactyliosolen antarcticus , Proboscia alata and Skeletonema costatum were not detected. Using MixSTATICO, species vulnerable to specific geographical locations and environmental variations were identified, making it possible to establish biological indicators for this region.


Introduction
Water sustainability has often been treated as sustainable freshwater management for human consumption. However, from a more holistic point of view, water, both freshwater and saltwater, is an important resource and essential for sustaining ecosystems that generate or maintain the environmental conditions necessary for sustaining life in general [1]. The demand for water in rural agriculture, industrial operations, human consumption and other purposes is high, but in recent years, the water contamination caused by chemicals, pharmaceuticals and cleaning processes [2] have polluted rivers, creating serious problems in the marine ecosystems once this contamination reaches the sea.
For the United Nations (UN), water is an essential element in the adaptation to climate change [1], and is a significant problem that society and the environment are currently experiencing. Climate change has led to alterations in the natural variability of the environment, which is reflected worryingly in extreme temperatures and precipitation, affecting pollution in the hydrological component due to the high nutrient values generated [3]. Environmental changes are considered one of the main elements that cause alterations in the environment and ecology, leading to vulnerability [4]. To assess ecological vulnerability, it is necessary to have quantitative and qualitative knowledge about ecosystems and to thus understand the factors that cause their vulnerability [5].
In the study of the hydrological component, mathematical models [6] have increasingly been applied through predictive and descriptive methods that are classified into deterministic and stochastic. Stochastics are characterized by using multivariate statistical methods [7]. These methods are considered helpful for studying water quality [8] using evaluation factors such as correlation, covariation, similarities and distances [7]. In addition, in order to analyse temporal hydrological behaviour from a stochastic perspective, it is necessary to study the temporal-spatial correlation of hydrological parameters [7].
In this way, it is of vital importance to study the sustainability of water associated with climate change, not necessarily in its most common environment, but rather from a holistic perspective.
It is known that climate change affects the sustainability of water and its ecosystem, whether inland waters, seawater or the open ocean, as well as the environment and society at large.
Multivariate statistical methods have been developed to study data matrices with multiple variables in order to reduce the dimensionality of a matrix. These methods have a long history in science. The first was Principal Component Analysis (PCA) [9][10][11], a technique that projects the observation matrix onto a bidimensional space built from several inter-correlated orthogonal quantitative variables called principal components. The components are linear combinations of the original variables. The first component represents the largest variance retained by the method, the second component represents the largest variance possible while orthogonal to the first, and so on for the other components.
Subsequent developments include Factorial Analysis (FA) [12,13], which projects the correlations between variables and among variables and factors; Correspondence Analysis (CA) [14,15], which studies the association between the categories of two qualitative variables; Multiple Correspondence Analysis (MCA) [16], which extends CA to several qualitative variables; and Canonical Correspondence Analysis (CCA) [17]. The latter is an extension of CA widely used in biology which relates community composition to known variation in the environment (two matrices). The ordination axes are linear combinations of the environmental variables.
These methods are useful in analysing data arrays with a two-way dimensional structure, as they have the constraint that the number of variables must be less than the number of individuals (PCA). However, this condition cannot be met in some areas of research that analyse several variables to describe communities and environments, and for ecological data sets that are for the most part multivariate [18]. At present, with the advancement of multivariate data analysis techniques, it is also possible to analyse data that consider multiple conditions, which give rise to arrays that have a k-way structure and multi-way data arrays. For the analysis of this new array set, multi-way multivariate methods are used that allow multiple conditions to be studied in combination, identifying underlying patterns that other methods are unable to identify. There are valuable contributions to these multivariate methods cited below.

Contributions of the French School
Multiple Factor Analysis (FMA) [19][20][21] analyses information from different variable sets (matrices) defined on the same set of individuals (observations). This method performs PCAs in two phases: first it normalizes each matrix (dividing its elements by the first eigenvalue obtained from applying a PCA to each matrix) and then merges the matrices into one that represents the overall structure, then a PCA is applied to that global matrix.
STATIS [22] is a technique used to study a set of K-way matrices and is also known as ACT-STATIS [23]. It analyses data matrices by calculating Euclidean distances between configurations of the same individuals measured by different variables and K conditions (times or situations). This method bases its analysis on operator matrices (covariance matrices); it is a generalization of PCA and begins by analysing the similarities between the K-matrices. From these, it obtains weights (first eigenvector) used to find the stable part of the common structure (compromise or consensus) of the set of matrices, which is calculated as the weighted linear combination of the K-matrices and represents it in a low dimension space. It also allows the original individuals of each k-matrix (k = 1, …, K) to be projected over the consensus dimensions obtained.
However, despite all these contributions, none of these methods analyse matrices with mixed data.
The methodological basis of this study is based on the methods of the French school, but it is also necessary to present the methods developed by other schools.
This work proposes a multivariate alternative called MixSTATICO in order to analyse together the information provided by quantitative and qualitative variables, which allows us to identify, through a space-time analysis, the effect of variations of the physical and chemical variables of the ocean on certain phytoplanktonic species from a perspective similar to that of the STATICO method.

Materials
The study area consists of four sampling stations, located spatially from the north to the south of the coastal profile of Continental Ecuador, located 16 km from the coastline. The samples sites of Emeralds and Puerto Bolívar (Pto. Bolivar) are influenced by an estuarine system, while those of Manta and La Libertad maintain a direct influence of the Pacific Ocean. The area has two climatic epochs: rainy (December-May) and dry (June-November).
Data were collected in 2013. At each station, multiple environmental and biological variables were collected monthly at seven depth levels of about 0, 10, 20, 30, 40, 50 and 75 m in the morning (8:00 a.m. to 12:00 p.m.) from February to December in 2013. For a further description of the data, consult González-Narváez [106].

Hydrography and sampling
A vertical profile of conductivity-temperature-depth (CTD), SeaBird 9/11-plus was measured daily at a fixed position, providing temperature (°C) and salinity (psu) profiles. Water samples were collected at the four sampling sites, using a cast of 3 L nontoxic sampling Van Dorn bottles to obtain water samples for dissolved oxygen concentration (mg L-1) and analysis of nutrients including nitrite, nitrate, phosphate and silicate (µgat L-1), as well as phytoplankton examination. To measure dissolved oxygen, the volumetric method was applied (PEE/LAB-DOQ/01) based on Eaton et al. [107].

Determination of inorganic nutrients and ratios
Samples for nutrient analysis were filtered through washed glass-fibre filters with 0.45 µm (Whatman, GF/C). They were frozen for analysis 1-3 weeks later onshore. Ammonium, nitrite, nitrate, phosphate and silicic acid amounts were determined by the method of Strickland and Parsons [108].

Phytoplankton analysis
Phytoplankton analysis was performed with approximately 250 cm 3 water samples in glass bottles fixed with Lugol's solution. A 25 cm 3 composite chamber was subsequently filled with the sample water and its contents allowed to settle for 24 h. At least two transects of the chamber bottom were observed with an inverted microscope [109] at 400x magnification to count the small, frequently occurring phytoplankton forms. Additionally, the whole chamber bottom was examined at 125x magnification to count larger, less frequent cells [110,111]. Data are expressed as cells per liter. Classification was performed at the genus or species level when possible, but many taxa could not be identified and were pooled into categories such as "small flagellates" or "small dinoflagellates". Abbreviations were added for statistical reporting (see Table 1 for species lists); references used can be found in Jiménez [112], Pesantes [113], Balech [114], Tomas [115]), Taylor et al. [116], Tomas [117,118], Young et al. [119] and Jiménez [120]. for quartiles calculated using all data collected throughout the entire study period). As the last variable (nominal type), the location is used to indicate the sector on which the sampling station is situated: north for Esmeraldas, centre north for Manta, centre south for La Libertad and south for Pto. Bolívar. The seasons were indicated as "rainy from December to May" and "dry from June to November".

Methodological Description of the MixSTATICO Method.
Boldface lowercase letters (e.g., ⃑ [ , , ] ) will be used to represent vectors of dimensionality n corresponding to j-th column and k-th condition. Boldface uppercase letters [ , ] will denote ( × ) matrices corresponding to the k-th condition. A set of K such matrices (a "data cube") will be denoted by a boldface uppercase double-stroke letter such as [ ] . The positions of matrix elements will be represented by boldface lowercase letters { | = 1, … , ; = 1, . . , = 1, … , }. The transpose of a matrix is written [ , ] , its rank is ( [ , ] ), its diagonalization matrix is ( [ , ] ) and its vectorised form is ( [ , ] ) . The merger of two matrices is written as [ , ] , where there are n individuals and ′ = + variables. The identity matrix is [ ] and the unit vector is [ ] . The multiplication of a matrix by a scalar [ , ] is represented by [ , ] [ , ] , while the product of two matrices is symbolised by [ , ] × [ , ] .
Consider two sets of three-way matrices (data cubes) [ ] and [ ] , structured by mixed data that may be quantitative (discrete, continuous) and/or qualitative (binary, ordinal, nominal). The data cubes have the characteristic that their matrices contain the same variables and that the individuals are the same between pairs of matrices for the kth condition, that is: Subsequently, it is suggested to pre-treat [ ] and [ ] as follows: a. Quantitative variables must be applied a mathematical transformation as the suggest principally Kroonenberg [121] and Legendre and Legendre [122], such as: normalization by column, centre by column, logarithm; b. Binary variables must be coded with 0 and 1; c. Ordinal variables must be coded for each scale in ascending numerical order; d. Nominal variables must be transformed to a disjunctive matrix (each category is a new dichotomous variable).
With these modifications, the dimensions of the two data cubes are temporarily modified. The dimensions p and q belonging to the variables will increase by the creation of the disjunctive matrix for each nominal variable, resulting in: Among the methods commonly used to analyse the relationships between two sets of variables measured on the same individuals are CCA, Redundancy Analysis (RDA) [123], Canonical Correlation Analysis (CANCOR) [11] and Co-Inertia Analysis (Co-IA) [124]. Co-Inertia differs from the others in that it focuses on finding axes maximizing the covariance between the variables of the two tables.
To implement the methodology of analysing the co-structure between two sets of variables measured on the same individuals, a suggestion made by Abdi et al. [30] for the distance matrix based DISTATIS method is adopted. For mixed data, the distance metric of Gower [125] is used, between the variables of the paired matrices [ , ] and [ , ] , having previously merged these matrices in each K-condition to obtain the following set of K-distance matrices of dimension ( × × ): The Gower distance dij to each K-matrix is calculated independently from: Here, is the Gower similarity coefficient; is the number of continuous quantitative variables; is the number of binary variables; is the number of qualitative non-binary variables; and are the number of matches in binary variables (1,1) and (0,0), respectively; is the number of matches in qualitative non-binary variables; and is the rank of the h-th quantitative variable.
The analysis is based only on the distances between the variables p and q in [ , ] and [ , ] . Therefore, when at least one of these is nominal, the average Gower distance between the distances of the non-nominal variable and each of the categories of the nominal variable that were previously transformed into dichotomous variables can be obtained from: where is the Gower distance; represents the non-nominal variable; represents the nominal variable; and is the category number of the nominal variable.
If both variables are nominal, then the centroid method is applied, using the generalised average of all distances.
Finally, a set of K-distance matrices of dimension ( × × ) is obtained that expresses the common structure of the two initial data cubes: Continuing to parallel the approach taken by Abdi et al. [30] to DISTATIS, a normalisation of the distance matrices, using a similar method to that applied in Factorial Multiple Analysis (FMA) [21,126], is developed to obtain new weighted co-inertia matrices [ , ] comparable in importance to the original distance matrices when comparing the K-matrices in the factorial plane.
To perform the normalisation, the Singular Value Decomposition (SVD) is applied to each K-distance matrix with rank L. This technique allows us to decompose rectangular matrices [ , ]  [ , ] = [ , ] . The matrices are normalised by multiplying their entries by the scalar [ , ] . Formally, and the normalised K-matrices are: The [ , ] form a set of weighted K-distance matrices: The next step is to calculate the cross-product matrix [ ] , known as the varianceand-covariance matrix. It is first necessary to obtain the vectorisation of the matrices [ , ] : [ , ] , [ , ] , … , [ , ] ; where = × . Then, To compute the inter-structure, Equation (14) is used as well as to calculate a scalar product matrix that contains the Rayleigh coefficients RV [127], the vectorial correlation matrix: This matrix shows the proximity between K-matrices; its interpretation is similar to that of the Pearson linear correlation coefficient.
Abdi et al. [35] apply SVD to the [ ] matrix to analyse the similarities between Kmatrices because it meets the symmetry requirements and is positive defined. Nevertheless, Abdi et al. [30] suggest calculating the eigenvectors and eigenvalues of the [ ] matrix (the method adopted in this investigation), such that: After the decomposition of the [ ] matrix and the computation of its eigenvectors and eigenvalues (the PCA method), the K-matrices can be projected onto the factorial plane as points, allowing the similarities between them to be studied. The elements of the first eigenvector have the same sign for being a positive semi-definite matrix.
The first eigenvector of [ ] supplies the weight vector Abdi et al. [35] mention that K-matrix similarities are represented in the first eigenvector ⃑ [ , ] which provides greater importance at matrices that better represent the co-inertia.
To show the inter-structure graphically, one calculates the coordinates of the Kmatrices using the first two vectors of [ ] (the first columns of which must be positive; see Abdi et al. [35]). From these, one obtains: With the ⃑ [ ] values, the consensus matrix is calculated, which represents the stable part or average image of the relations between the paired matrices (co-inertia).
The consensus matrix [ ] is analysed using Principal Coordinates Analysis (PcoA) [128] with a double centre. To represent its configuration graphically in a low dimension space, it is necessary to apply SVD: To generate the graph of the consensus matrix, the scores [ ] for the row variables and the loadings [ ] for the column variables are calculated: In the same way, the intra-structure is studied, which shows the Euclidean image of the consensus and consists of projecting the co-inertia of the K-matrices onto the consensus axes. A double centring is carried out on each K-matrix of ′ [ , ] , and the FK (scores for the row variables) and QK (loadings for the column variables) are obtained: The procedure of the method is summarized in Figure 1 and the algorithm is shown in Appendix A. The algorithm that summarizes the MixSTATICO method steps is in Appendix A.

Case Study
In Ecuador, there is limited information regarding previous studies carried out on this subject. Therefore, it is necessary to do a space-time analysis to identify the effect that typical environmental variations have on this region. Increases in temperature and nutrient concentrations can occur, owing to the effect of the geographical location and the climatic season of this area on the abundance of phytoplanktonic, which is predominately the diatoms group.
The coast of Continental Ecuador is located in the eastern equatorial Pacific ( Figure  2), with the north influenced by the warm current of Panama (heading south and producing an increase in SST and decrease in nutrients; begins in December and intensifies in February and April) and the south by the cold stream of Humboldt (heading north, its waters are coldest, high in salinity and rich in nutrients; it mainly presents in July and November, weakening in December), and there are two climatic seasons (dry and rainy) [129]. The southern part is considered to be the most significant tropical estuary on the west coast of South America [120]. However, much of its coastline is contained in the Niño Region 1 + 2 (0°-10° S, 90°-80° W), an area sensitive to the variations produced by the surface sea temperature from the Central Pacific due to the presence of ENSO events (El Niño-La Niña), which are events that alter the physical and chemical parameters of the sea. On the west coast of South America, the most evident signs of ENSO are manifested in ocean-atmospheric systems and by the impact it has on natural ecosystems [130]. Due to the importance of this area, constant monitoring of the chemical and biological conditions of the sea is carried out in space and in time.

Results
All data were analysed using the software R Studio version 1.2.1335 and the functions daisy (Gower distance) and svd (Singular Value Decomposition).
Below is a summary of the spatial-temporal data describing the environmental and biological characteristics of the coastal region of Ecuador (see Appendices B and C for all the environmental and species data). Tables 1 and 2 show the average value and standard deviation (SD) of the environmental and biological variables, respectively, according to the location of the sampling stations and the climatic period.
The stations in the north (Esmeraldas and Manta) presented the highest average values with regard to temperature and the lowest average values for nutrients. In addition, the environmental variables (except salinity) showed the highest average values during the rainy season and the Esmeraldas estuary had the warmest waters of the entire coast ( Table 1).
The phytoplankton species had different average abundance values that varied according to the geographical location of each station (which had different environmental characteristics) and the influence of the season ( Table 2). The inter-structure graph, the result of the first step of the MixSTATICO analysis, showed that the environment-species association had two types of behaviour, which varied according to the season ( Figure 3A). The members of the first type presented a similar behaviour during the dry season (red in Figure 3A), while those of the second type behaved similarly during the months of the rainy season (blue), except for October and December. The two first components explained 93.07% of the inertia ( Figure 3B).
Moreover, there was a solid typical structure between the environment and species, as it was observed that all K-matrices (months) reached values close to the unitary cycle. They also had high values for weights. Therefore, all matrices substantially contributed toward building the consensus matrix ( Figures 3A,C). November contributed the most to the consensus (weight × 0.094), followed by August and September (dry season), but the consensus of the environment-species association was more notably represented during November (highest value of cos 2 ) ( Figure 3C).
April, February and May (rainy season) made the lowest contributions to the consensus, while May was considered a month of seasonal change and the end of the rainy season. In relation to the consensus analysis, which showed the stable part of the speciesenvironment association, the two first components explained 99.92% of the inertia ( Figure  4C). The consensus for the environment variables showed a strong positive correlation among temperature, dissolved oxygen and salinity, and a strong negative correlation with phosphate, silicate and nitrate ( Figure 4A).
In 2013, the average temperature value was 23.6 °C, with the highest values recorded at the stations located to the north, while the highest average nutrient values were recorded at the stations south of the coastal profile ( Table 1).
The ordination of species in the consensus ( Figure 4B) established a high level of association between S. membranacea (e23-continuous variable) and S. membranacea (e24ordinal variable), where both variables contain the same information, showing the suitability of the method when different variable types are presented. The abundance of centric diatoms Dactyliosolen antarcticus (e4), Proboscia alata (e5), Skeletonema costatum (e6), Lauderia borealis (e7), Chaetoceros didymus (e15) and Ditylum brightwellii (e21) and dinoflagellate Gyrodinium sp. (e19) were associated (same direction) with the location and season variables. This indicated that the vulnerability of the species varied according to the geographic location of the sampling stations and the season, with different levels of average abundance (Table 2). In addition, these species had the lowest average abundance throughout the whole coastline (Total column of Table 2), being absent during some of the months and stations (Table A2). In addition, the species with stable abundance levels (ordered in the opposite direction to the location and season variables), during both periods and in the entire coastal profile, were mainly P. pungens (e10), Gymnodinium sp. (e18) and Dactyliosolen fragilissimus (e1), S. membranacea (e23) and S. membranacea (e24).
The vulnerable species indicated in Figure 4B (located on the factorial environmental plane Figure 4A) show preferential environmental conditions with below-average values in relation to temperature, dissolved oxygen and salinity (opposite direction to these variables).
The following species where those with the highest abundance (longer vector) over the entire coastline: the pennate diatoms Pseudo-nitzschia pungens (e10) and N. longissima (e8) and the dinoflagellate Gymnodinium sp. (e18). These species preferred above-average values in relation to temperature, dissolved oxygen and salinity. Dactyliosolen antarcticus (e4), on the other hand, was detected during the dry season at the stations located in the north (Esmeraldas and Manta) ( Figures 3A,B).
The trajectories of both variables and species for the rainy and the dry seasons are presented separately for the purpose of comparison. The same interpretation is applied to the intra-structure. During the rainy months ( Figures 5A, C, E, G, I), it was observed that the environmental and species variables showed a non-stable management pattern. For February and March, temperature, dissolved oxygen and salinity showed a direct and inverse association with phosphate and nitrate, while in April and December, that association became weaker, mainly being between temperature and dissolved oxygen. The association pattern in May (month of change of season) differs from the rest of the months since it was observed that temperature does not directly associate with any other variable. At this time, the highest average values for temperature and nutrients were recorded ( Table 1).
The species with the highest abundance during the rainy season ( Figures 5B, D, F, H, J) were mainly the species Leptocylindrus danicus (e11), Chaetoceros affinis (e13) and Gymnodinium sp. (e18). In February to April (they showed a longer vector), their high abundance was associated with above-average temperature and salinity values and low abundance was associated with nitrate values, among other nutrients.
Each sampling month showed the species that had a decrease in their abundance or disappeared due to the influence of the sampling station location and the variability of the corresponding climatic feature in that month. At this time of year (except in December), the main vulnerable species were the centric diatoms D. antarcticus (e4), P. alata (e5) and S. costatum (e6), which were found mainly at the southern station, and D. brightwellii (e21) at the stations in the centre of the coastal profile that were influenced directly by the ocean. These species seemed to thrive more in waters with below-average temperature and salinity values ( Table 2). During the dry season, like during the months of the rainy season, a non-stable environmental ratio pattern ( Figures 6A, C, E, G, I, K) was displayed, changing according to the month the samples were taken. In addition, lower values were obtained for nutrients and temperature.
The species that stood out as being more abundant during these months were centric diatoms D. antarcticus (e4), S. costatum (e6), P. pungens (e10) and D. brightwellii (e21), among others, and their preferred environmental conditions varied according to the month. The species that were vulnerable to environmental variations and geographical  Table 2). However, the species that had a steady abundance were Guinardia striata (e2), Rhizosolenia imbricata (e3), D. antarcticus (e4), P. alata (e5), S. membranacea (e24) and S. membranacea (e23) (Figures 6B, D, F, H, J, L). Moreover, there are more apparent patterns of association between species in the dry season (except in June, the start of the dry season) as compared to the rainy season.

Discussion
Water sustainability is an overriding issue for ecosystem support [1]. Human actions and natural events are factors that cause alterations in the environment, affecting ecosystem biotic and abiotic components [132][133][134]. Therefore, planktonic components are of great importance in assessing the water quality of aquatic systems, as they are subject to environmental variations [135] in the area and the geographical locations where they grow [95]. This was established by the analysis of the consensus and the intra-structure, where phytoplankton species vulnerable to climatic variability and environmental conditions of the estuarine and oceanic stations were identified. These stations are seen to have a positive effect on the ecosystem, which we believe gives support to the importance of studying the hydrological variations and the effects they have on biota and biotic processes. The estuarine and oceanic stations are known as dynamic environments that exhibit spatial and temporal variations in their biotic and abiotic parameters [136]. This dynamic behaviour was classified into two groups based on the months of the year (inter-structure), where the rainy season was established as February-May and the dry season as June-December [129], except for October and December. December was the start of the rainy season of 2014, the year in which a weak El Niño event occurred [137] with aforementioned anomalies >1.5 (Oceanic Niño Index-ONI) that decreased as the month passed.
In addition to the intra-structure, differences in species population patterns were observed between both seasons, with the dry season showing a more pronounced pattern of association among the species. The reason for this is that during the rainy season, the species-environment co-structure was weaker and the environmental variables presented unstable behaviour due to the presence of high outlier values for phosphate, nitrate, nitrite and salinity during February, March and April (in the different sample stations). This could be due to an increased amount of runoff originating from sedimentary basins (agriculture fields wash off). However, regarding the temperature, it only increases in August, a month that tends to experience upwellings along the Ecuador coast at the northern station (Esmeraldas) [106]. The highest average temperature and the lowest nutrient values were recorded in the stations located to the north (Esmeraldas and Manta). In contrast, an opposite effect occurred in the southern stations (La Libertad and Pto. Bolivar) (see Table 2). This behaviour coincides with the presence of warm waters coming from the north caused by the influence of the Panama current. However, in the south, oceanic waters are influenced by the cold Humboldt current, as well as upwellings that favour planktonic productivity [111].
During most of the months of the dry season (intra-structure), the species D. antarcticus (e4) showed a high abundance, which was associated with above-average nitrate values and low temperatures [138], which occurred mainly in June and July (Table  A2). This is due to the presence of a cold Humboldt current, rich in nutrients, which is consistent with optimal environmental conditions to support the high productivity of diatoms. In the intra-structure chart for February, March, May and December, the dinoflagellate Gyrodinium sp. (e19) was found to be the species that associated its aboveaverage abundance with geographical location (north, Esmeraldas and Manta- Table 2) and climate (rainy- Table 2). In 2013, other authors [139] stated that nutrient concentrations are the drivers of dinoflagellate productivity, drawing attention to species such as Gymnodinium cf catenatum, Oxytoxum turbo and Prorocentrum micans, which we found were associated with warm waters (typical of the Panama current) with low salinity and nutrients during the rainy season in the northern station. This may have occurred because dinoflagellates trophic behaviour is not dependent on nutrient levels, and as such is considered an indicator of species of warm ocean waters with low nutrient levels [140,141].
In the intra-structure, it was possible to identify the high abundance species, such as P. alata (e5) in November, which was associated with above-average nitrate and temperature values in the north station, Esmeraldas (Table 2). In August, the species D. fragilissimus (e1) was abundant with above-average values in temperature and oxygen association values in the north centre station, Manta (Table 2), but whose presence was found to be associated with nutrient availability [137]. In February, March, April and May, the abundance of P. pungens (e10) was found to be associated with above-average temperature and oxygen values in the south centre station, La Libertad (Table 2), which is also consistent with the warmer temperatures associated with conditions during the rainy season. Finally, throughout the rainy season (Feb-Dec), the species Hemiaulus sinensis (e16) associated its high abundance to above-average nitrate, nitrite and salinity values in the south station, Pto. Bolívar. These results reaffirm the reports of other studies [106] that describe how P. alata (e5) is the most abundant species in Esmeraldas and D. fragilissimus (e1) in Manta during the dry season, and P. pungens (e10) in La Libertad and H. sinensis (e16) in Pto. Bolívar during the rainy season. Nevertheless, the findings of this study highlight the association of phytoplankton growth with location and seasons, where P. alata (e5) was found to be associated with these variables. However, in the case of D. fragilissimus (e1), P. pungens (e10) and H. sinensis (e16), these species were found to have had a stable level of abundance throughout all the seasons and along the entire coastal profile.
This study has shown the utility of this new multivariate method in analysing associations between biota abundance and environmental variables such as location and season within ENSO neutral conditions. In addition, this work highlights the seasonal patterns and possible association with freshwater run-off, which proves that the method could be useful for the assessment of other environmental contexts.

Conclusions
In this work, using the MixSTATICO multi-way mixed multivariate method, the need for additional analysis has been reduced. In this sense, descriptive statistics techniques such as PCA and CCA were employed and applied independently to the data collected at each sampling station. In addition, this work introduces geographical location and climatic periods as qualitative variables and analyses them in conjunction with the other quantitative variables, namely temperature, dissolved oxygen, salinity, nutrients and phytoplankton abundance.
Our findings support the effectiveness of the proposed method, which reduces the amount of analysis required to obtain the same results. This is conducted through the analysis of data cubes that contain sets of quantitative and qualitative variables. In this sense, the STATICO method calculates the stable part of the common structure between two data cubes.
This innovative method has identified the phytoplankton species showing poor, stable and high levels of abundance according to climatic variability and the environmental conditions typical of these geographical locations and seasons. In addition, this work has identified the more vulnerable species in the coast profile as D. antarcticus, P. alata, S. costatum, L. borealis, C. didymus and D. brightwellii (centric diatoms) and Gyrodinium sp. (dinoflagellate). During the dry season, it was possible to observe more apparent patterns of association between species.
These results provide proof of the validity of this method for analysing mixed data, and even offer the possibility of introducing qualitative characteristics to the study. In this sense, the results obtained are further enriched for the knowledge of the ecosystem. This is performed from a quantitative and qualitative analysis perspective. Furthermore, this method, supported by statistical data, efficiently introduces the insight of environmental assessment in terms of environmental parameters' relationships.

Data Availability Statement:
The data presented in this study are available in Appendices B and C.

Acknowledgments:
We acknowledge the invaluable collaboration of INOCAR (Oceanographic Institute of the Navy, Ecuador), CPFG-EM Edwin Pinto Uscocovich, and Gladys Torres, who have provided us with oceanographic data from a project that studied El Niño under the regional project framework El Niño-ERFEN-ECUADOR. We have no business and/or financial interest in the project; we have disclosed our interests fully and have in place an approved plan for managing any potential conflicts arising from this arrangement.