Visualization Framework for High-Dimensional Spatio-Temporal Hydrological Gridded Datasets using Machine-Learning Techniques

: Numerical modelling increasingly generates massive, high-dimensional spatio-temporal datasets. Exploring such datasets relies on e ﬀ ective visualization. This study presents a generic workﬂow to (i) project high-dimensional spatio-temporal data on a two-dimensional (2D) plane accurately (ii) compare dimensionality reduction techniques (DRTs) in terms of resolution and computational e ﬃ ciency (iii) represent 2D projection spatially using a 2D perceptually uniform background color map. Machine learning (ML) based DRTs for data visualization i.e., principal component analysis (PCA), generative topographic mapping (GTM), t-distributed stochastic neighbor embedding (t-SNE) and uniform manifold approximation and projection (UMAP) are compared in terms of accuracy, resolution and computational e ﬃ ciency to handle massive datasets. The accuracy of visualization is evaluated using a quality metric based on a co-ranking framework. The workﬂow is applied to an output of an Australian Water Resource Assessment (AWRA) model for Tasmania, Australia. The dataset consists of daily time series of nine components of the water balance at a 5 km grid cell resolution for the year 2017. The case study shows that PCA allows rapid visualization of global data structures, while t-SNE and UMAP allows more accurate representation of local trends. Furthermore, UMAP is computationally more e ﬃ cient than t-SNE and least a ﬀ ected by the outliers compared to GTM.


Introduction
One of the biggest challenges of the big data era is to make sense out of all the information available. Unfortunately, not all that huge volume of data is informative. Such datasets may contain spatial or temporal information or both spatial and temporal information. The information is available either in the form of grid or point data. However, gridded data is difficult to capture in low-dimensional space especially in Earth sciences, due to their dynamic and non-linear behavior.
Effective data visualization plays a key role in exploring such big datasets, finding patterns/features and outliers. Such insights are essential to develop hypotheses on the data-generating processes [1]. In addition, data visualization tools can help improve decision making, primary data analysis and information sharing [2]. Several visualization approaches exist in literature to extract the information based on graphs, charts, parallel coordinates and tree maps just to name a few [3,4]. So far, domain experts in Earth sciences rely on traditional visualization methods, such as maps and time series plots, to explore patterns and structures of high-dimensional spatio-temporal datasets. Data visualization is embedded in various inference and feature extraction techniques [5]. Therefore, visualizing T-distributed SNE (t-SNE) [12] is an improved variation of SNE consists of a long-tailed distribution, hence large neighborhoods of the data can be matched by a wide range of scales in the two-dimensional (2D) projection and in this way avoid the overcrowding of data points. In fact, the t-SNE approach tries to match the probability distributions induced by the pairwise data dissimilarities in the original data space and the projected space. t-SNE has successfully been used to visualize high dimensional datasets to reveal local as well as global structures at several scales in various application areas such as, computer security [38], cancer biology [39], music analysis [40] and bioinformatics [41]. It is worth mentioning that t-SNE computational and memory complexity is quadratic in the number of data points (P 4 ) and the local nature of t-SNE makes it sensitive to the curse of inherent dimensionality of high-dimensional data [42]. In addition, t-SNE performance on the general dimensionality reduction tasks is still vague as it is primarily built for visualization purpose only.
Uniform manifold approximation and projection (UMAP) is a recently introduced non-parametric DRT, which shows its effectiveness in coping with diversity of dynamic and non-linear datasets [43]. It builds on strong mathematical foundations largely based on manifold theory and fuzzy topological representation, which allows it to scale to very large datasets in an efficient manner. Like t-SNE, UMAP has widely been used in the fields of bioinformatics [44], material science [45] and machine learning [46], however, so far, no application has been found in hydrology and the Earth sciences. UMAP computational efficiency equals P and has no computational restrictions on projected dimensions. This is because UMAP does not require global normalization. Further, UMAP is built on solid theoretical grounds useful for general purpose dimensionality reduction and preprocessing of machine learning techniques.
This study focuses on one linear and three non-linear ML based unsupervised DRTs for visualization i.e., principal component analysis (PCA) [16], generative topographic mapping (GTM) [19], t-distributed stochastic neighbor embedding (t-SNE) [12] and uniform manifold approximation and projection (UMAP) [43] summarized in Table 1 along with their respective computational efficiencies in terms of data points represented by P. The reason to choose above mentioned DRTs is to test their practicality in terms of accuracy, resolution and computational efficiency for high dimensional spatio-temporal gridded dataset.
To quantify visualizations of selected DRTs, there exists various quality metrics in literature, either independent or dependent on DRTs. The quality is assessed by calculating pairwise proximities such as, distances, similarities/dissimilarities or probabilities between low-dimensional and high-dimensional space or by reproducing a high-dimensional space from a low-dimensional projection [47,48]. Different DRTs preserves different proximities, therefore, difficult to compare. The DRTs dependent quality metrics include neighborhood scales [49] and agreement evaluation criteria [50]. These evaluation criteria preserve distances between neighboring points and are not suitable for comparison of different pairwise proximities calculated from various DRTs. There are few DRT independent quality metrics, which includes scale independent [51], distance [52] and rank based criteria [53]. The scale independent criteria, such as Q NX [51] use ranks instead of pairwise distances between the data points to define the nearest neighbors. The rank comparison-based approach i.e., co-ranking [53] has a benefit of comparing Euclidean distance of projected low-dimensional data points to any pairwise proximity of original high-dimensional data as ordering the neighboring points is possible in all cases. Although, the absolute information of a proximity is lost in the ranking procedure, however, the rank comparison-based technique is suitable for comparing different DRTs. To visualize patterns effectively, the low-dimensional projected plane needs to be placed in a spatial context. In this regard, color maps will help the end users to explain spatio-temporal patterns intuitively. Visualizing higher dimensional data traditionally relies on directly mapping variables to R, G, B channels to create a pseudo-color image e.g., [54,55] or combining different variables into a predefined index such as, [56,57]. The main drawbacks of these methods are that they can only visualize limited number of variables and that they require an in-depth understanding of the data generating process to develop a meaningful index. High-dimensional data can be visualized by color coding data points according to their projection in a lower dimensional space [58]. One of the most challenging aspects of visualizing data through false color images is to develop a perceptually uniform color scheme in order not to inadvertently emphasize or obscure parts of the data range [59][60][61].
In crux, the process of dimensionality reduction incurs a loss of information. Where information loss is strongly linked with the preservation of geometry (distances, topology and reproducibility) and helps in evaluating the trustworthiness of visual maps as shown in [47] i.e., the greater the loss of quality, the less the preservation of geometry. Furthermore, the loss can be quantified to gain an in-depth insight into the visualization's accuracy. The critical analysis of the literature review reveals that computationally efficient and accurate methods are required, which automatically embeds dimensionality reduction in visualization workflow to develop hypothesis on complex non-linear systems.
The objective of this study is to suggest a workflow that, (i) projects the inherent structure of the high-dimensional spatio-temporal data in 2D accurately, (ii) Compare DRTs in terms of resolution and computational efficiency and (iii) spatially visualizes the structure in an intuitive manner through a perceptually uniform color coding of 2D reduced parameter space. Furthermore, to the best of our knowledge, such comparison hasn't been performed on hydrological dataset to extract local and global structures. These structures serve as a backbone to describe hypothetical phenomenon. There is a great need for such frameworks, which will help to reduce the uncertainty in observations along with the improved understanding of physics and dynamics of hydrological systems.
The next section describes the dataset used followed by the adopted methodology, which shows DRTs, accuracy metrics, color scheme adopted for visualization and computational efficiency comparison. Result section consists of different DRTs visualization maps along with their visualization quality quantification and time series analysis. Later, the discussion section will compare the DRTs in terms of accuracy, resolution and computational efficiency followed by the conclusion.

Dataset and Study Area
The Australian Water Resource Assessment (AWRA) model is an operational, near real-time continental landscape model [62]. For each day, the model partitions rainfall in millimeters (mm) per day into potential and actual evaporation, runoff and deep drainage to groundwater as well as the change in water storage in four soil compartments, expressed in mm.
These nine hydrological variables are calculated on a 5 × 5 km grid across Australia, considering variability in vegetation, topography and soil properties [63]. The ongoing model development is focused on improving the representation of local conditions and improving computational efficiency [64].
For this case study, we used the model outputs for the state of Tasmania from 1 January 2017 to 30 September 2017, downloaded from http://www.bom.gov.au/water/landscape on 30 September 2017 as shown in Figure 1. This dataset has 273 daily values for the nine hydrological variables on a spatial grid scale of (−43.71 < latitude < −40.14) and (144.49 < longitude < 149.41). The three-dimensional arrays of hydrological variable are normalized to the [0, 1] range individually and appended to each other to create a single three-dimensional array of size i.e., latitude by longitude by AWRA components × daily time series observations (72 by 79 by 9 × 273) as shown in Figure 2. This array is vectorized

Methods
To visualize spatio-temporal structures locally and globally, four DRTs have been compared to efficiently and accurately visualize the multivariate hydrological AWRA model output components. Later, the perceptually uniform color scheme has been used, which allows user/domain experts to associate each data point on a 2D color plot by preserving its specific location on a spatial map of Tasmania. Figure 3 shows this workflow.
Water 2020, 11, x FOR PEER REVIEW 6 of 15 consists of (9 × 273) observations and (c) three-dimensional arrays are normalized (Nor) and append to each other to form a single three-dimensional array of size (72 × 79) by (9 × 273).

Methods
To visualize spatio-temporal structures locally and globally, four DRTs have been compared to efficiently and accurately visualize the multivariate hydrological AWRA model output components. Later, the perceptually uniform color scheme has been used, which allows user/domain experts to associate each data point on a 2D color plot by preserving its specific location on a spatial map of Tasmania. Figure 3 shows this workflow.

Dimensionality Reduction Techniques (DRTs)
The objective of dimensionality reduction techniques is to find a low-dimensional projection of a high-dimensional dataset with minimal loss of the information [6]. These techniques visualize correlations and patterns in high-dimensional spatio-temporal datasets.
For this case study, two parametric (PCA, GTM) and two non-parametric (t-SNE, UMAP) DRTs are compared.
PCA is a parametric linear technique that projects a dataset onto a 2D space defined by linearly uncorrelated principal components [16].
Another parametric but non-linear DRT used for visualization is GTM [19,65]. GTM aims to extract a low-dimensional representation of dataset, initially unknown hence called latent, lies on high-dimensional data space. Such mapping uses non-linear function to map points in latent space corresponding to points in data space, provides a map with the distribution of data points centered at lattice. A lattice consists of grid nodes window. These lattices have an attached responsibility i.e., either mean or mode of a distribution, and are used to provide visualization of the map for individual data points in 2D latent space. The size of the lattice is dependent on the chosen width factor of radial basis function, which has a direct influence on the visualization. Smaller the lattice size, less compact visualization can be attained. The hyperparameters govern the execution of GTM including (i) number of nodes used for tuning the GTM resolution (k); (ii) number of hidden units in radial basis function (m); (iii) width factor of radial basis function (s) and (iv) regularization coefficient (r) [66].
Unlike PCA and GTM, t-SNE is a non-parametric non-linear technique [12]. t-SNE visualizes high-dimensional data by giving each datapoint a location in a low-dimensional projected map. It calculates the probability of similarity between data points using gaussian distribution in highdimensional space and calculates the same for its corresponding data points in low-dimensional space using T-distribution. The similarity of data points is calculated as conditional probabilities i.e., the points nearest to a defined center are picked by gaussian distribution and T-distribution in high and low dimensional space, respectively. Later, t-SNE tries to minimize the difference between the conditional probabilities in high-dimensional and low-dimensional space for the best possible visualization of data points in 2D using gradient descent method. The objective is to preserve the neighborhood information without any pre-requisite of user defined input.

Dimensionality Reduction Techniques (DRTs)
The objective of dimensionality reduction techniques is to find a low-dimensional projection of a high-dimensional dataset with minimal loss of the information [6]. These techniques visualize correlations and patterns in high-dimensional spatio-temporal datasets.
For this case study, two parametric (PCA, GTM) and two non-parametric (t-SNE, UMAP) DRTs are compared.
PCA is a parametric linear technique that projects a dataset onto a 2D space defined by linearly uncorrelated principal components [16].
Another parametric but non-linear DRT used for visualization is GTM [19,65]. GTM aims to extract a low-dimensional representation of dataset, initially unknown hence called latent, lies on high-dimensional data space. Such mapping uses non-linear function to map points in latent space corresponding to points in data space, provides a map with the distribution of data points centered at lattice. A lattice consists of grid nodes window. These lattices have an attached responsibility i.e., either mean or mode of a distribution, and are used to provide visualization of the map for individual data points in 2D latent space. The size of the lattice is dependent on the chosen width factor of radial basis function, which has a direct influence on the visualization. Smaller the lattice size, less compact visualization can be attained. The hyperparameters govern the execution of GTM including (i) number of nodes used for tuning the GTM resolution (k); (ii) number of hidden units in radial basis function (m); (iii) width factor of radial basis function (s) and (iv) regularization coefficient (r) [66].
Unlike PCA and GTM, t-SNE is a non-parametric non-linear technique [12]. t-SNE visualizes high-dimensional data by giving each datapoint a location in a low-dimensional projected map. It calculates the probability of similarity between data points using gaussian distribution in high-dimensional space and calculates the same for its corresponding data points in low-dimensional space using T-distribution. The similarity of data points is calculated as conditional probabilities i.e., the points nearest to a defined center are picked by gaussian distribution and T-distribution in high and low dimensional space, respectively. Later, t-SNE tries to minimize the difference between the conditional probabilities in high-dimensional and low-dimensional space for the best possible visualization of data points in 2D using gradient descent method. The objective is to preserve the neighborhood information without any pre-requisite of user defined input.
UMAP is also a non-parametric non-linear DRT and it searches for a low-dimensional projection of a dataset that has the closest possible equivalent fuzzy topological structure (made up of local manifold approximations which preserve distances) in high-dimensional space [44]. UMAP then minimizes the cross-entropy between low and high-dimensional space to optimize the visualization in low-dimensional space. UMAP preserves the essential topological structure of the learned manifold in the 2D representation of a dataset. Three main input parameters need to be defined by the user i.e., (i) the number of neighboring points used in local approximation of manifold structure (n), which ranges from 5 to 50; (ii) factor (d) with values between 0.001 to 0.5 controlling how tightly the embedding is allowed to compress points together and (iii) the choice of metric (c) used to measure distance in the input space including minkowski style metrics, spatial metrics, angular and correlation metrics [67].
The DRTs are performed in Python 3.6 using sklearn package for PCA and t-SNE. GTM is applied using the ugtm package imported from https://github.com/hagax8/ugtm and UMAP is applied using the umap package imported from https://github.com/lmcinnes/umap.

Quality Metric
The Q NX measure [51] is used to quantify DRT based visualizations in a topology preserving manner i.e., how well neighborhood relationship can be preserved between data vectors to capture in 2D. Q NX relies on the ranks of sorted distances between the high-dimensional and 2D projections. Q NX measure is independent of any DRT, therefore, successfully used for the quantification of 2D projected visualizations for comparative analysis. This technique averages the quality curve Q NX over varying values of K-ary of neighborhood (K) [51] as shown in Equation (1).
In Equation (1) Q NX is a quality metric ranging between [0, 1], 1 means a perfect projection and vice versa. N is the total number of observations in high-dimensional space i.e., (72 × 79) by (9 × 273), X represents projected low-dimensional vectors and K is the neighborhood size. In Equation (2), ρ ij is the rank of a sorted distance (ξ i , ξ j ) in a high-dimensional space corresponding to a rank r ij allotted to a sorted distance (x i , x j ) in 2D projection. A more comprehensive overview of the mathematical details is provided in [46,51].
The above quantification criteria show a major advancement over the distance preservation measurement as the use of ranks allow distances to grow or shrink, makes it scale-independent, given that their orders do not change. Such criteria only dependent on neighborhood size K, produces curve that may be analyzed on various scales.

Visualization
The next step in the workflow is to super impose the perceptually uniform color scheme on the projected 2D plane to generate a spatial map of the dataset in which points with similar colors indicate similar data vectors [57]. The color scheme is based on HSL UV (www.hsluv.org), a human-friendly alternative to the Hue, Saturation, and Lightness (HSL) color space. It extends the perceptually uniform CIELUV color space with a saturation component that allows chroma to be expressed as a percentage. It is to be anticipated that the combination of three components i.e., colored scheme, colored map and colored time series in a machine learning based visualization workflow allows to capture rich structure of the data and display results in a format domain expert are familiar with.

Results
To accurately present the spatio-temporal structure of high-dimensional AWRA model output in 2D and to visualize it using the perceptually uniform color scheme, the comparative analysis has been performed using PCA, GTM, t-SNE and UMAP. The color plots corresponding to their spatial maps are provided in Figure 4a-c respectively. The background color to the plot allows us to associate each AWRA pixel on a spatial map with a position on the color plot. Each pixel consists of nine time series and reflect the water balance in a specific location.
Water 2020, 11, x FOR PEER REVIEW 8 of 15

Results
To accurately present the spatio-temporal structure of high-dimensional AWRA model output in 2D and to visualize it using the perceptually uniform color scheme, the comparative analysis has been performed using PCA, GTM, t-SNE and UMAP.
The color plots corresponding to their spatial maps are provided in Figures 4(a-c) respectively. The background color to the plot allows us to associate each AWRA pixel on a spatial map with a position on the color plot. Each pixel consists of nine time series and reflect the water balance in a specific location.  In Figure 4a PCA, t-SNE and UMAP are one-to-one mapping, where each data point is represented in a low-dimensional 2D space. GTM however is a many-to-one mapping in which multiple data points can be mapped to a single node, which represents the mean of a probability distribution. The best GTM visualization for the dataset in hand consists of a parameter selection (k = 16, m = 10, s = 0.3, r = 0.1) along with the mean data points representation. Further the choice of parameters (n = 10, d = 0.1 and c = correlation metric) provides the best visualization results for UMAP. Figure 4b visualizes the main trends in the dataset arising from topographic and rainfall gradients. As the color scheme is designed to gradually blend from one color to another, colors close to each other indicate similar behavior. This allows us to quickly visualize areas with similar behavior forming clusters and find outlying values. In this regard, t-SNE and UMAP, appears to provide higher contrast to identify subtle trends and dissimilarities. This is largely due to the ability of t-SNE to capture small pair-wise distances, i.e., local structures in a 2D projected map, which resulted in better and clear local patterns.
Further UMAP performance is visually comparable to t-SNE but arguably capture mores of the global structure than local structure. This is due to the tradeoff between the number of neighborhoods (n) chosen in local approximations of manifold structure and the reasonable tightness of data points embedding (d) in optimizing the visualization quality. As larger n with tight d will average out the local approximations in the process and result in potentially densely packed regions, which captures the global structure better. GTM performs relatively poor in capturing any type of trends largely due to its dependence on the efficient parameter selection and sensitivity to outliers. Furthermore, the reason may be its dependence on the mean nodes of the data points instead of the data points itself. The first two components of PCA summarize the main trends in the data, which in results in displaying more global patterns than other compared techniques.
It is noticeable that without applying any clustering technique, the continuous perceptual uniform color plot scheme provides an in-depth insight into different groupings with an intuitive interpretation and meaningful patterns. Figure 4(c(i)-(ix)) can further authenticate the above given statements. The 2D space is divided in 9 equal regions and within each region, 20 points are randomly selected. The associated time series are plotted with the corresponding color. The time series of colors associated with t-SNE and UMAP are more similar within a 2D space region, showing less variation in changing colors abruptly and results in forming clusters more accurately at local scale; whereas, the time series associated with PCA and GTM show more variation in time series within a 2D space region. These variations are clearly shown by varying time series colors in Figure 4(c(i)-(ix)), resulted in identifying patterns locally with less accuracy. Figure 5 shows the Q NX metric for DRTs. The relatively constant Q NX values across neighborhood values is an indication that t-SNE and UMAP capture the local structures of high-dimensional spatio-temporal dataset in 2D projection as well as global structures. However, UMAP performance to t-SNE is slightly better at capturing the global trends but the local trends are affected by the outliers/noise. Furthermore, GTM performs poor in capturing the local as well as global trends due to its sensitivity to the outliers. The first two principal components, not surprisingly, do not provide much insight into the local structures, but perform better than t-SNE and UMAP to capture global structures as shown by higher values of Q NX for larger neighborhoods (K) shown in Figure 5.
versa. Drastic increase in QNX values against larger K for PCA compared to GTM, t-SNE and UMAP shows that PCA is better at capturing global trends as QNX values quickly approaches to 1. However, QNX values against varying values of K are far from 1 for GTM, therefore, does not capture any kind of trend with higher accuracy.  On the other hand, t-SNE showed better performance in capturing local as well as global trends. This is due to higher QNX values against K compared to GTM and UMAP as shown in Table 2. It is important to note that, UMAP is slightly better at capturing global trends compared to t-SNE for the range of K, i.e., 250 < K < 550.
As far as the computational efficiency is concerned, PCA took 20 seconds to run, whereas GTM takes at least 150 seconds to run. On the other hand, UMAP is much faster in producing results due to its non-parametric nature and took approximately 15 seconds, however, a non-parametric t-SNE took only 80 seconds to run. The experiment is performed on 64-bit operating system with 32 GB Ram.   Figure 5 by providing the Q NX metric values against their varying neighborhood sizes K for four DRTs. As stated before, if Q NX~1 , it shows perfect projection and vice versa. Drastic increase in Q NX values against larger K for PCA compared to GTM, t-SNE and UMAP shows that PCA is better at capturing global trends as Q NX values quickly approaches to 1. However, Q NX values against varying values of K are far from 1 for GTM, therefore, does not capture any kind of trend with higher accuracy. On the other hand, t-SNE showed better performance in capturing local as well as global trends. This is due to higher Q NX values against K compared to GTM and UMAP as shown in Table 2. It is important to note that, UMAP is slightly better at capturing global trends compared to t-SNE for the range of K, i.e., 250 < K < 550.
As far as the computational efficiency is concerned, PCA took 20 seconds to run, whereas GTM takes at least 150 seconds to run. On the other hand, UMAP is much faster in producing results due to its non-parametric nature and took approximately 15 seconds, however, a non-parametric t-SNE took only 80 seconds to run. The experiment is performed on 64-bit operating system with 32 GB Ram.

Discussion
Choosing an appropriate data representation method is not a trivial task as they differ in the goals of exploring either correlations, clusters or patterns in datasets with varying computational efficiencies. Most of the DRTs are designed for high-dimensional point datasets. The gridded spatio-temporal datasets are computationally more challenging.
Generally speaking, linear techniques compared to non-linear DRTs are more flexible in representing high-dimensional structures in lower dimensions and therefore will incur less loss of information. Topographic mappings and spectral embedding are designed to preserve the topography or geometry of the dataset, whereas neighborhood preservation techniques, such as multi-dimensional scaling and neural networks, aim to retain the multi-dimensional distances between data points in the low-dimensional projections. Furthermore, the parametric DRTs optimize the parameters in the process of training, which will provide out-of-sample extensions, whereas the non-parametric DRTs do not.
Specific to the linear parametric DRT discussed in this case study i.e., PCA, a linear projection cannot faithfully reveal the non-linear structures of the datasets and disturbs the local neighborhood structures. Further, for very high-dimensional datasets, PCA becomes costly and sensitive to noise due to its dependency on data covariance matrix. However, PCA is a useful preprocessing step for very high-dimensional datasets to later proceed with the non-linear feature extraction techniques. Alternatively, non-linear features are well captured by the parametric DRT i.e., GTM, however, requires an appropriate selection of parameters.
On the other hand, non-parametric techniques have an advantage of fast processing and do not assume any functional form of mapping to regulate parameters, however, suffers from a disadvantage of not providing out of sample extension. T-SNE and UMAP are the main non-parametric ML based DRTS discussed in this case study. UMAP is preferable to use for general purpose DRT, however, t-SNE is preferred for visualization.
This case study suggested a workflow by comparing four ML based DRTs in terms of accuracy, efficiency and resolution suitable for high-dimensional spatio-temporal gridded datasets followed by its visualization quantification.
PCA retains large pairwise distances in the reduced dimensional space defined by the first two principal components only. Local structures may be captured in other principal components. t-SNE performs well for the high dimensional gridded datasets as its non-linear mapping function helps to capture various spatio-temporal structures efficiently. It is important to mention here that UMAP is competitive to t-SNE for visualization quality, however, preserve more of a global structure compared to t-SNE. Furthermore, the performance of GTM largely depends on its parameter selection i.e., number of nodes (k). With the increase in m, the lower-dimensional space points get more clustered together due to the network overtraining, resulting in a more precise visualization rather than representing optimized structures.
Overall, UMAP is a general-purpose DRT and can be recommended to treat high-dimensional datasets with superior run time performance to capture non-linear global structure more accurately compared to t-SNE. UMAP and t-SNE can both handle large non-linear datasets more efficiently, however, non-linear GTM due to its parametric nature is prone to over-fitting.
All above discussed DRTs for data visualization have some associated advantages and disadvantages, however, will assist domain experts to select suitable technique for their dataset. Identifying the inherent structure of spatio-temporal dataset will, however, always be hampered by the information loss that is unavoidable when representing high-dimensional data in two dimensions. The selection of suitable ML based DRT depends on the dataset in hand. If the nature of dataset is linear than PCA is best at capturing the local and global features, however, non-linear datasets can be handled well by parametric GTM. If the parameters are more difficult to decide than UMAP and t-SNE are preferable to capture non-linear trends. However, it should be kept in mind that t-SNE is computationally more expensive compared to UMAP.
Moreover, different DRTs often result in very different visualizations and it is hard to decide the best suitable DRT for a given dataset at hand. It is often not clear if differences in the visualizations are due to the data structure or the method chosen. Several techniques have been developed to assess the quality of low dimensional projected visualizations. However, the Q NX quality metric is suggested to quantify visualizations as it does not depend on any DRT. Further, to visualize spatial patterns of a single quantity of interest the perceptually uniform color scheme is recommended in order to capture patterns in a diverse range.

Conclusions
The suggested workflow applied DRTs to visualize the multivariate AWRA model output hydrological components in order to determine the prominent spatio-temporal features followed by its quantification to access the visualization accuracy. The comparative analysis of four DRTs i.e., PCA, GTM, t-SNE and UMAP by accounting a perceptual uniform color scheme has been performed on AWRA model output for the Tasmania region.
t-SNE and UMAP are effective in detecting local spatial patterns with high resolution as compared to the GTM, whereas, GTM lacks resolution in explaining the local as well as global trends. On the other hand, PCA is better at detecting the global patterns. The time-series color plot further validates the results and the quality metric Q NX proves it quantitatively. Moreover, t-SNE proves to be computationally quite expensive for high-dimensional spatio-temporal datasets, compared to PCA, UMAP and GTM but provides much better insight almost equivalent to UMAP in explaining the data structures and patterns. Furthermore, GTM is much sensitive to outliers compared to UMAP.
In essence, t-SNE and UMAP are better to use when non-linear trends are expected and local trends are of much more importance, however, the UMAP is computationally more efficient. Furthermore, PCA can capture global trends better for linear datasets, whereas, the parametric nature of the GTM to capture non-linear trends makes it harder to capture any kind of trend.
The suggested workflow is beneficial for the exploratory data analysis of hydrological data.
Funding: The APC was funded by Deep Earth Imaging-Future Science Platform, CSIRO, Australia.