An Interactive Data Visualization Framework for Exploring Geospatial Environmental Datasets and Model Predictions

Abstract: With the rise of large-scale environmental models come new challenges for how we best utilize this information in research, management and decision making. Interactive data visualizations can make large and complex datasets easier to access and explore, which can lead to knowledge discovery, hypothesis formation and improved understanding. Here, we present a web-based interactive data visualization framework, the Interactive Catchment Explorer (ICE), for exploring environmental datasets and model outputs. Using a client-based architecture, the ICE framework provides a highly interactive user experience for discovering spatial patterns, evaluating relationships between variables and identifying specific locations using multivariate criteria. Through a series of case studies, we demonstrate the application of the ICE framework to datasets and models associated with three separate research projects covering different regions in North America. From these case studies, we provide specific examples of the broader impacts that tools like these can have, including fostering discussion and collaboration among stakeholders and playing a central role in the iterative process of data collection, analysis and decision making. Overall, the ICE framework demonstrates the potential benefits and impacts of using web-based interactive data visualization tools to place environmental datasets and model outputs directly into the hands of stakeholders, managers, decision makers and other researchers.


Introduction
Large-domain modeling is an important advancement in the environmental sciences. Models covering broad spatial areas are expected to improve our ability to study, monitor and manage natural resources at regional, continental and even global scales [1][2][3]. Modeling at this scale is increasingly feasible thanks to the growing computational power of desktop and cloud computing platforms, as well as the availability of large-scale and spatially continuous meteorological and geospatial datasets [3]. Large-domain models facilitate research and management not only at broad scales but also at local scales by providing spatially consistent datasets for filling data gaps (e.g., estimating streamflow in ungaged basins) and supporting site-specific assessments and comparisons. However, for viewing interactive animations of hydrologic model simulations [35]. While these examples demonstrate two approaches to making environmental data and models more accessible through user-friendly web-based interfaces, both provide limited interactive data visualization capabilities and were also developed for specific datasets and models covering small spatial scales. New approaches for creating web-based interactive data visualizations that can be generalized across different datasets and over large spatial scales are still needed.
In this paper, we present a web-based interactive data visualization framework for exploring spatial patterns in environmental datasets and model outputs called the Interactive Catchment Explorer (ICE). In developing the ICE framework, our objectives were to (1) create a generalized framework for making environmental and geospatial datasets and models easier to access, explore and understand; (2) demonstrate the application of this framework to different datasets and models through a series of case studies; and (3) determine what benefits and broader impacts, if any, these tools can have on research, management and decision making.
In the next section, we begin by describing the datasets, models and management issues associated with three separate research projects, which served as case studies to demonstrate the application of the ICE framework. We then describe the design and implementation of the core features and functionality that are central to this framework. In the results, we use a simplified user interface created solely for demonstration purposes to describe how an ICE application works and how various tasks can be performed. We then present the web application built for each of the three case studies, highlighting the unique aspects of each application and how it has been used in practice. The discussion focuses on the broader impacts of these applications and highlights the unique aspects of the ICE framework compared to similar efforts found in the literature. We also discuss some of the limitations of both the ICE framework itself as well as our understanding of how ICE applications affect user thinking and decision making, and we provide some suggestions for how these limitations could be addressed in future research. Lastly, the conclusions provide an overall summary of this work and the potential impacts of interactive data visualization tools on environmental research and management.

Datasets and Study Areas
Using the ICE framework, a series of web-based interactive data visualization tools were created to explore the input and output datasets of models and analyses associated with three separate research projects. Because these projects focused on different research topics and management issues, they varied in terms of their spatial units, geographic region and the types of analyses and models that were performed (Table 1; Figure 1).

Stream Temperature and Brook Trout Occupancy in the Northeast Region
Water temperature is an important attribute of stream ecosystems that affects both water quality and habitat availability for many aquatic biota [37][38][39]. Native fish species such as the eastern brook trout (Salvelinus fontinalis), for example, require cool water temperatures for recruitment and survival [40,41]. However, human alterations of the landscape and hydrologic cycle, coupled with increasing air temperatures, have reduced the amount of coldwater habitat available for these species across their native ranges [42]. To preserve and protect these valuable species, resource managers need to identify existing coldwater habitat as well as evaluate which of those habitats are most threatened by and which are most resilient to future land use and climate change.
To support these management needs, researchers at the U.S. Geological Survey (USGS) Leetown Science Center developed a pair of empirical models to evaluate stream temperature and coldwater fish habitat across the Northeast region of the United States (Figure 1). A hierarchical Bayesian model was developed to predict daily mean stream temperatures based on land use, elevation and other geospatial characteristics as well as air temperature and precipitation [43,44]. The model domain was based on a high-resolution catchment delineation (n = 386,591) that had an average catchment area of about 2 km². The model only included catchments containing small to medium-sized streams with cumulative drainage areas less than 200 km²; catchments with larger drainage areas were excluded due to the greater complexity of temperature dynamics in larger rivers. The model was calibrated and validated using a crowd-sourced database of stream temperature observations that allows users to upload and manage their own monitoring data. This database currently contains over 145 million observations collected by 80 government agencies, universities and non-profit organizations across the region [45]. The calibrated model was used to generate a series of temperature metrics for each catchment, such as the mean summer temperature and the average annual frequency of temperatures exceeding various thresholds over the historical period of record.
The stream temperature predictions were used along with the geospatial basin characteristics and climate variables to develop a logistic mixed effects model for predicting the probability of brook trout occupancy in each catchment [46]. The occupancy model was calibrated and validated using presence/absence data collected by government agencies in 12 states across the region. Together, the two models were used to predict potential future changes in stream temperature and brook trout occupancy under a series of climate change scenarios representing a range of air temperature increases.

Climate Change Vulnerability Assessment of Native Trout Species in the Crown of the Continent Ecosystem
The Crown of the Continent Ecosystem (CCE) is a biologically diverse region in the northern Rocky Mountains ranging from central Montana in the United States to southern British Columbia and Alberta in Canada (Figure 1). The CCE is home to two native salmonids, bull trout (Salvelinus confluentus) and westslope cutthroat trout (Oncorhynchus clarkii lewisi), that are under threat from multiple physical, biological and climatic stressors [47]. Researchers at the USGS Northern Rocky Mountain Science Center conducted a climate change vulnerability assessment (CCVA) of these two species to understand the relative risk of populations to climate change, invasive species and habitat loss. The goal was to provide this empirical information to natural resource managers and stakeholders to inform proactive conservation and restoration actions for improving native trout resilience and adaptation across the transboundary ecosystem.
Building on an approach described by Wade et al. [48], the CCVA incorporates the climate sensitivity, exposure and adaptive capacity of each species to quantify a series of relative risk scores based on empirical studies across space and time [49][50][51]. The input datasets for this assessment included geospatial characteristics (e.g., land use, hydrography), presence/absence data, demographic and hybridization metrics, habitat availability, climate conditions and modeled stream temperatures. Based on these input datasets, risk scores were calculated and assigned to conservation populations of westslope cutthroat trout (n = 497) and bull trout (n = 123) in the CCE. The risk scores were generated for four future climate change scenarios based on two emissions trajectories (Representative Concentration Pathways (RCPs), 4.5 and 8.5) and two time horizons (years 2035 and 2075) [52]. The area associated with each conservation population was delineated based on spawning and rearing habitat containing the known presence of genetically similar individuals (i.e., local populations). The results of this vulnerability assessment along with the input datasets provided a basis for understanding where native salmonid species are most at risk and which factors are the primary contributors to vulnerability.

Streamflow Conditions and Alteration in the Lower Mississippi-Gulf Region
The hydrologic alteration of rivers and streams can have large ecological consequences by changing the magnitude, timing, duration and frequency of freshwater flows [53]. In the Lower Mississippi-Gulf (LMG) region (Figure 1), streamflow alteration negatively impacts the water quality, habitat, shellfish and fisheries of coastal bays and estuaries along the Gulf coast [54]. Recently, resource managers and decision makers in this region have begun to take a holistic approach to restoring coastal and marine resources along the Gulf coast that includes the restoration of freshwater delivery from upstream drainage basins. To support these goals, the USGS and the U.S. Environmental Protection Agency (USEPA) embarked on a collaborative project to quantify the occurrence and magnitude of stream alteration in basins draining to the Gulf of Mexico.
For this ongoing project, researchers are generating a series of datasets related to streamflow conditions and the degrees of flow alteration using a variety of statistical analyses and hydrologic models. These datasets are being generated for both the USGS streamflow gages (n = 956) and all 12-digit hydrologic unit code (HUC12) basins (n = 9314) in the region. Input datasets include the drainage basin characteristics (e.g., land use, topography, hydrography), hydrologic indices (e.g., base-flow index, topographic wetness index) and climate variables (precipitation, air temperature) for both the gages and the HUC12 basins, as well as observed streamflow statistics for each gage. Output datasets include estimated streamflow quantiles (i.e., flow duration curves) of all HUC12 basins based on a neural network model [55] and the results of a long-term trend analysis based on observed flows at each gage. For both the input and output datasets, most variables contained time-varying values that were computed for each decade from the 1950s through the 2000s. As this project continues, additional datasets will provide a series of metrics representing the degree of streamflow alteration over time for each gage and HUC12 basin. Together, these datasets are meant to help local, state and federal agencies and decision makers to understand the spatial and temporal patterns of streamflow alteration and to prioritize basins for future flow restoration.

Web-Based Interactive Data Visualization Framework
Our goal in developing the ICE framework was to create a highly responsive user interface for exploring spatial patterns in large-scale environmental datasets and model outputs using any standard Web browser. Although it was originally developed for a specific dataset (stream temperature and brook trout occupancy in the northeast region), the underlying concepts and design of the ICE framework were sufficiently generalizable that it could be readily adapted to other datasets. Each adaptation of this framework resulted in incremental changes to its design and features due to the unique aspects of each project and the associated datasets. However, the core functionality and architecture were consistent across projects.

Features and Functionality
The design of the ICE framework was inspired by Shneiderman's Visual Information-Seeking Mantra: "Overview first, zoom and filter, then details on demand" [56]. With this mantra in mind, our goals were to (1) provide users with a high-level overview of broad spatial patterns across the region, (2) allow users to zoom to different spatial scales and also dynamically filter the dataset using one or more criteria and (3) enable users to see further details about any one specific feature. To achieve these goals, the user interface for each ICE application contains the following components:

1. A map showing the spatial features of the dataset along with optional base maps and other static layers (e.g., hydrography, political boundaries, etc.). The map can be panned and zoomed to view the dataset at different spatial scales and can be used to select and view the details of a specific feature (e.g., catchment).
2. A color variable selector, which allows the user to choose which variable from the dataset defines the color of each spatial feature on the map.
3. Crossfilters, which are interactive histograms of one or more variables as chosen by the user. Each crossfilter shows the distribution of its corresponding variable and allows the user to filter the dataset by interactively selecting an allowable range of values for that variable.
These components were combined into a user interface containing multiple coordinated views of the dataset, which is a data visualization strategy commonly used for exploring multivariate datasets [57,58]. Using a technique known as brushing [57], whenever the user defines or changes the filter range for one crossfilter, the other crossfilters as well as the map are immediately updated to reflect only the filtered portion of the dataset.
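The brushing-and-linking behavior described above can be sketched in a few lines. This is a minimal illustration, not the ICE implementation: the `LinkedViews` class, its method names, the variable names and the five synthetic catchments are all hypothetical, and the sketch assumes inclusive range filters that apply to every view alike.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, float]
View = Callable[[List[Record]], None]


class LinkedViews:
    """Hold one dataset plus the active brush ranges and redraw every view on change."""

    def __init__(self, records: List[Record]) -> None:
        self.records = records
        self.ranges: Dict[str, Tuple[float, float]] = {}
        self.views: List[View] = []

    def register(self, view: View) -> None:
        """Add a view (map or histogram) and draw it once with the current selection."""
        self.views.append(view)
        view(self.filtered())

    def filtered(self) -> List[Record]:
        """Return the records that fall inside every active range (bounds inclusive)."""
        return [
            r for r in self.records
            if all(lo <= r[var] <= hi for var, (lo, hi) in self.ranges.items())
        ]

    def brush(self, variable: str, lo: float, hi: float) -> None:
        """Set or move the filter range on one variable, then update all views."""
        self.ranges[variable] = (lo, hi)
        self._notify()

    def clear(self, variable: str) -> None:
        """Remove the filter on one variable, then update all views."""
        self.ranges.pop(variable, None)
        self._notify()

    def _notify(self) -> None:
        selection = self.filtered()
        for view in self.views:
            view(selection)


# Synthetic catchments, invented for illustration only.
catchments: List[Record] = [
    {"id": 1, "elevation_m": 620.0, "forest_pct": 95.0, "summer_temp_c": 14.2},
    {"id": 2, "elevation_m": 410.0, "forest_pct": 88.0, "summer_temp_c": 16.1},
    {"id": 3, "elevation_m": 150.0, "forest_pct": 60.0, "summer_temp_c": 18.4},
    {"id": 4, "elevation_m": 90.0, "forest_pct": 35.0, "summer_temp_c": 19.7},
    {"id": 5, "elevation_m": 300.0, "forest_pct": 82.0, "summer_temp_c": 17.5},
]

views = LinkedViews(catchments)
views.register(lambda sel: print("map shows ids", [int(r["id"]) for r in sel]))
views.register(lambda sel: print("elevations", [r["elevation_m"] for r in sel]))
views.brush("summer_temp_c", 12.0, 17.0)  # both views redraw
views.brush("forest_pct", 80.0, 100.0)    # second criterion narrows the same selection
```

Keeping the filter state in one place and pushing the same selection to every view is what keeps the views coordinated. Actual libraries may differ in detail, for example in whether a histogram also reflects its own brush.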
From this interface, the user can perform a number of specific tasks, including the following:
1. Exploring spatial patterns of different variables at multiple scales;
2. Identifying specific sets of features using multivariate criteria by defining filter ranges on one or more crossfilters;
3. Evaluating how spatial patterns change in response to filtering different variables;
4. Discovering relationships between variables by filtering the dataset using one variable and then observing how the distributions of other variables change in response; and
5. Viewing more details about any one spatial feature by selecting it on the map.
Together, these tasks were meant to facilitate the exploration of datasets in ways that will help users to generate new hypotheses and form a better understanding of the underlying environmental systems and processes. By interactively exploring the dataset, the user may discover interesting or unexpected patterns, which in turn will lead to more questions about the dataset. Furthermore, by seeing how both the spatial patterns and variable distributions (i.e., histograms) change in response to interactive filtering, the user may develop stronger mental models of the relationships between variables and their spatial distributions. This process can be especially useful for understanding how an underlying model works by focusing on the relationships between input and output variables.

Architecture and Implementation
In order for an interactive data visualization tool to be effective, it needs to provide a responsive user experience with minimal latency [13,59]. To achieve this, the ICE framework uses a client-based architecture (also known as a thick-client architecture) in which the application logic, data processing and visualization are all performed locally within the user's Web browser [60,61]. Unlike desktop software applications, which are executed entirely on a single computer, Web applications generally involve two systems: the server and the client. Traditionally, most web applications have used a server-based (also known as a thin-client) architecture, for which each user action triggers the client to send a request to the server, which retrieves and processes the data for that request and then returns a response containing the updated results and rendered visualization to the client. However, this round-trip transfer of information between the client and the server introduces a noticeable lag in the application, which can negatively impact the user experience and thinking process. To create more responsive user interfaces, client-based architectures emerged in the mid-2000s thanks to advances in Web standards and technologies that enabled Web browsers to perform complex application logic independent of the server. With a client-based architecture, when the user first visits the application's uniform resource locator (URL), the server sends all of the files necessary to run the application along with the raw data. Once it is loaded in the browser, the application performs all data processing and visualization rendering itself in response to each user action and without the need to communicate with the server. As a result, by loading the application and datasets up front, there is significantly less lag between each user action and the interface response.
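The difference between the two architectures can be illustrated with a small simulation. This is a sketch under stated assumptions rather than the actual ICE code: `StaticServer` and `ClientApp` are hypothetical stand-ins for a static file host and the in-browser application, and the file path and data values are invented.

```python
import csv
import io
from typing import Dict, List


class StaticServer:
    """Stand-in for a static file host: it only returns files and counts requests."""

    def __init__(self, files: Dict[str, str]) -> None:
        self.files = files
        self.requests = 0

    def get(self, path: str) -> str:
        self.requests += 1
        return self.files[path]


class ClientApp:
    """Stand-in for the browser application: loads the raw data once, then works locally."""

    def __init__(self, server: StaticServer, data_path: str) -> None:
        text = server.get(data_path)  # the only round trip to the server
        self.rows: List[Dict[str, float]] = [
            {key: float(value) for key, value in row.items()}
            for row in csv.DictReader(io.StringIO(text))
        ]

    def filter(self, variable: str, lo: float, hi: float) -> List[Dict[str, float]]:
        """Handle a user action entirely in memory, with no server request."""
        return [row for row in self.rows if lo <= row[variable] <= hi]


server = StaticServer(
    {"data/catchments.csv": "id,summer_temp_c\n1,14.2\n2,16.1\n3,18.4\n"}
)
app = ClientApp(server, "data/catchments.csv")
cool = app.filter("summer_temp_c", 12.0, 17.0)
warm = app.filter("summer_temp_c", 17.0, 21.0)
print(len(cool), len(warm), server.requests)  # prints: 2 1 1
```

However many filter actions follow, the request counter stays at one. That is the trade-off in this design: a larger initial download in exchange for interactions that do not wait on the network.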
During the development of each application, the free and open-source software (FOSS) libraries are bundled with the application source code, resulting in a set of static hypertext markup language (HTML), cascading style sheets (CSS) and JavaScript files that can be hosted by any standard Web server. In addition to these files, the datasets are also stored on the server using simple text files in comma-separated values (CSV) or JavaScript Object Notation (JSON) formats. Because there is no database or application running on the server, hosting costs are low and there is minimal maintenance required to keep these applications running aside from standard Web server administration and routine updates of the third-party libraries in order to incorporate the latest changes related to security and browser compatibility issues. The need for any Web server maintenance can be entirely eliminated by using low-cost, cloud-based file storage systems, such as Amazon Web Services (AWS) Simple Storage Service (S3), that can be configured to host client-based Web applications. The use of static files for hosting both the application and the datasets was a deliberate decision to ensure the longevity of each application. Web applications with server-side architectures and databases generally require more resources and long-term funding to support ongoing maintenance.
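As a rough illustration of what storing a dataset as static text files can look like, the sketch below writes a table of records to a CSV file and the variable metadata to a JSON file. The function name, the file names `dataset.csv` and `variables.json`, and the metadata layout are assumptions made for this example. They are not the file layout used by the ICE applications.

```python
import csv
import json
from pathlib import Path
from typing import Dict, List, Union

Value = Union[int, float, str]


def export_static_dataset(
    records: List[Dict[str, Value]],
    variables: Dict[str, Dict[str, str]],
    out_dir: Union[str, Path],
) -> List[Path]:
    """Write records to CSV and variable metadata to JSON; return the two file paths."""
    if not records:
        raise ValueError("records must contain at least one row")
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # One row per spatial feature, one column per variable.
    csv_path = out / "dataset.csv"
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(records[0].keys()))
        writer.writeheader()
        writer.writerows(records)

    # Labels and units the client could use to build its variable selector.
    json_path = out / "variables.json"
    json_path.write_text(json.dumps(variables, indent=2), encoding="utf-8")
    return [csv_path, json_path]
```

Calling `export_static_dataset(rows, metadata, "dist/data")` would leave two plain files that any static host can serve unchanged, with no database or server-side process involved.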
Instructions for obtaining the source code of the three case study applications as well as a starter template for creating new ICE applications from other datasets are provided in the Supplemental Material (Document S1) [64][65][66][67].

Results
In this section, we first describe the main features and functionality of the ICE framework using a simplified user interface developed solely for demonstration purposes. We then describe each of the three applications that were built using this framework as a series of case studies. For each case study, we focus on the unique aspects of its application as well as the specific ways in which that application has been utilized by different groups of users.

Demonstration of the ICE Framework
To demonstrate how the ICE framework works, we present a simplified version of the user interface that contains only the core features common to all ICE applications (Section 2.2.1). For this example, we used a subset of the stream temperature and brook trout occupancy dataset for the northeast region (Section 2.1.1) that includes only catchments within the Saco River watershed (HUC 01060002) in southern Maine. To illustrate the responsive and dynamic nature of the ICE interface, a screenshot video of this demonstration is also provided in the Supplemental Material (Video S1).
The user interface shows each of the three core ICE components: (1) a map showing the catchments, (2) a selection menu for choosing which variable is used to assign the color to each catchment (currently, the mean summer stream temperature) and (3) three crossfilters for the catchment elevation, percentage of forest cover and mean summer stream temperature (Figure 2). Catchments that do not have a color contain larger river segments, which were excluded from the stream temperature model and therefore have no value for the selected color variable (see Section 2.1.1). For each crossfilter, a histogram shows the distribution of the associated variable using 40 equal-width bins, with the height of each bar corresponding to the relative number of catchments within that bin. Vertical axes are not shown on the histograms because, as the user filters the dataset, the bars are automatically rescaled; thus, the tallest bar always has the same height as the chart.
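The binning and rescaling described above can be sketched as follows. This is an illustrative example only: the function names are hypothetical, and it assumes that the bin edges stay fixed over a given range of the variable while the counts change with the filtered data.

```python
from typing import List, Sequence


def equal_width_counts(
    values: Sequence[float], lo: float, hi: float, n_bins: int = 40
) -> List[int]:
    """Count values in n_bins equal-width bins spanning [lo, hi]."""
    if hi <= lo or n_bins < 1:
        raise ValueError("need hi > lo and at least one bin")
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for value in values:
        if lo <= value <= hi:
            # A value equal to hi belongs to the last bin rather than a bin past the end.
            index = min(int((value - lo) / width), n_bins - 1)
            counts[index] += 1
    return counts


def rescale_to_chart(counts: Sequence[int], chart_height: float) -> List[float]:
    """Scale bar heights so that the tallest bar always fills the chart."""
    tallest = max(counts, default=0)
    if tallest == 0:
        return [0.0] * len(counts)
    return [chart_height * count / tallest for count in counts]


# Five bins are used here only to keep the output short; the text above uses 40.
all_values = [0.0, 1.0, 1.0, 2.0, 10.0]
filtered_values = [2.0, 10.0]
print(rescale_to_chart(equal_width_counts(all_values, 0.0, 10.0, 5), 60.0))
print(rescale_to_chart(equal_width_counts(filtered_values, 0.0, 10.0, 5), 60.0))
```

Because the heights are relative to the tallest bar of the current selection, a fixed vertical axis would be misleading, which is consistent with the decision to omit it.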
Initially, the interface provides an overview of the broad spatial patterns in mean summer stream temperature as predicted by the model (Figure 2). Temperatures tend to be coolest in the headwaters at the northwest end of the watershed and warmer throughout the rest of the watershed. The three crossfilters show that catchment elevations are heavily skewed to the right, with most elevations being less than ~400 m; forest cover is skewed to the left, with highly forested catchments being most common; and the mean summer stream temperature in the majority of catchments is between 16 and 18 °C, with a relatively symmetric distribution.
The user can filter the dataset by interactively selecting (i.e., brushing) a range of values on one of the crossfilter histograms. Because the map and histograms are linked as multiple coordinated views, setting a filter on one crossfilter causes the map and other histograms to update and reflect only the filtered catchments within the dataset. Multiple filters can be set simultaneously to identify features that meet more than one criterion. For example, to identify which catchments provide coldwater habitat and are also heavily forested, the user can filter for catchments with mean summer stream temperatures less than 17 °C and forest cover greater than 80% (Figure 3). After setting the filters, the map shows 823 filtered catchments (out of 2519 total) meeting both criteria. Similarly, the elevation histogram is updated to reflect the distribution of only those filtered catchments.
Water 2020, 12, x FOR PEER REVIEW 9 of 20 only the filtered catchments within the dataset. Multiple filters can be set simultaneously to identify features that meet more than one criterion. For example, to identify which catchments provide coldwater habitat and are also heavily forested, the user can filter for catchments with mean summer stream temperatures less than 17 °C and forest cover greater than 80% (Figure 3). After setting the filters, the map shows 823 filtered catchments (out of 2519 total) meeting both criteria. Similarly, the elevation histogram is updated to reflect the distribution of only those filtered catchments. In addition to identifying catchments using multivariate criteria, crossfilters can also be used to dynamically explore the spatial distribution of a variable and to evaluate relationships between variables. Using the brushing technique, the user can modify a filter range by sliding it horizontally or by adjusting the start or end point to expand or shrink it. For example, a filter on the mean summer stream temperature can be shifted from an initial range of 12-15 °C to an intermediate range from 15-18 °C and finally to an upper range from 18-21 °C (Figure 4). As the filter range is shifted from left to right, the map shows how the spatial distribution of catchments changes as temperatures increase from low to high values. Similarly, the histograms for elevation and forest cover also update in response to the increasing temperatures.
Most catchments with low mean summer temperatures (12-15 °C) are in the headwaters of this basin and therefore have relatively high elevations, where forest cover is also relatively high ( Figure  4a). As the temperature filter is increased from the low to intermediate range, the elevation histogram shifts to the left towards lower values, and the forest cover histogram becomes less heavily skewed, indicating that forest cover is generally lower among catchments with intermediate temperatures (Figure 4b). When the temperature filter is shifted to the highest range (18-21 °C), the elevation histogram becomes more concentrated at lower values, and the forest cover histogram is more evenly In addition to identifying catchments using multivariate criteria, crossfilters can also be used to dynamically explore the spatial distribution of a variable and to evaluate relationships between variables. Using the brushing technique, the user can modify a filter range by sliding it horizontally or by adjusting the start or end point to expand or shrink it. For example, a filter on the mean summer stream temperature can be shifted from an initial range of 12-15 • C to an intermediate range from 15-18 • C and finally to an upper range from 18-21 • C ( Figure 4). As the filter range is shifted from left to right, the map shows how the spatial distribution of catchments changes as temperatures increase from low to high values. Similarly, the histograms for elevation and forest cover also update in response to the increasing temperatures.
Most catchments with low mean summer temperatures (12-15 °C) are in the headwaters of this basin and therefore have relatively high elevations, where forest cover is also relatively high (Figure 4a). As the temperature filter is increased from the low to intermediate range, the elevation histogram shifts to the left towards lower values, and the forest cover histogram becomes less heavily skewed, indicating that forest cover is generally lower among catchments with intermediate temperatures (Figure 4b). When the temperature filter is shifted to the highest range (18-21 °C), the elevation histogram becomes more concentrated at lower values, and the forest cover histogram is more evenly distributed and no longer dominated by heavily forested catchments (Figure 4c). Based on this interaction, which is near-instantaneous in the application, the user can quickly infer that stream temperatures are coolest in the high-elevation headwaters and warmest at lower elevations, especially where forest cover is also low. Overall, these results indicate a negative correlation between mean summer stream temperature and both elevation and forest cover, which are in fact two of the strongest predictor variables in the stream temperature model [44].
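The brushing interaction can be mimicked numerically by sliding a fixed-width temperature range across a dataset and summarizing the other variables for each position. The sketch below uses made-up catchment values chosen to resemble the pattern described above; the function names `brush` and `summarize` are hypothetical, and Python is used only for illustration.

```python
from statistics import mean

# Hypothetical catchments: (mean summer temp in deg C, elevation in m, forest cover in %)
catchments = [
    (12.8, 720.0, 95.0), (13.9, 640.0, 91.0), (14.6, 560.0, 88.0),
    (15.4, 430.0, 80.0), (16.7, 350.0, 72.0), (17.5, 300.0, 65.0),
    (18.3, 210.0, 55.0), (19.6, 140.0, 41.0), (20.4, 90.0, 30.0),
]


def brush(records, lo, hi):
    """Return records whose temperature falls inside the brushed range [lo, hi)."""
    return [r for r in records if lo <= r[0] < hi]


def summarize(records):
    """Mean elevation and mean forest cover of the currently filtered records."""
    return mean(r[1] for r in records), mean(r[2] for r in records)


# Slide a 3 deg C wide brush from left to right (12-15, 15-18, 18-21).
summaries = [summarize(brush(catchments, lo, lo + 3.0)) for lo in (12.0, 15.0, 18.0)]
```

With these illustrative values, both the mean elevation and the mean forest cover fall at each step of the sweep, which is the kind of negative relationship a user would read off the shifting histograms.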
In addition to exploring the spatial patterns and relationships between variables, the user can retrieve more details about a specific catchment by selecting it on the map and then opening a tabular summary for that catchment containing the values of all variables within the dataset (Figure 5).

Stream Temperature and Brook Trout Occupancy in the Northeast Region
The first application to be developed using the ICE framework focused on stream temperature and brook trout habitat in the northeast region of the United States (https://www.usgs.gov/apps/ecosheds/ice-northeast; Figure 6). This application was designed to explore local and regional spatial patterns of the predicted stream temperature and the probability of brook trout occupancy under current and future climate conditions. The application was created to help resource managers to identify catchments that provide coldwater habitat and then to prioritize those catchments for protection or restoration based on the sensitivity to future air temperature increases.

The datasets and models for this application were based on a high-resolution delineation of nearly 400,000 catchments across the region. Rendering the full set of catchments was not only infeasible due to the amount of geospatial data involved (100s of megabytes), but it would have also introduced too much detail and variability at broad spatial scales. To reduce the number of features that needed to be rendered on the map, the application was modified to aggregate catchments spatially by larger watershed units based on HUC delineations. The user can choose from varying resolutions ranging from six-digit up to 12-digit HUCs (HUC6 to HUC12). For each HUC, the area-weighted mean value is computed for the selected color variable based on all catchments within it. This average value is used to assign the color for each HUC. When a crossfilter is used to filter the dataset, the spatially aggregated values are recomputed in real-time using only the filtered catchments. To view local spatial patterns, the application can display the individual catchments within a single HUC at the request of the user.
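The spatial aggregation step can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's code: each catchment is assumed to carry a 12-digit HUC code and an area, all values are made up, and the function name `aggregate` is hypothetical. The sketch relies on the fact that HUC codes are nested, so a shorter prefix of a HUC12 code identifies the enclosing unit.

```python
from collections import defaultdict

# Hypothetical catchments; codes, areas and values are made up for illustration.
catchments = [
    {"huc12": "010100020301", "area_km2": 2.0, "temp_c": 14.0, "forest_pct": 90.0},
    {"huc12": "010100020301", "area_km2": 6.0, "temp_c": 18.0, "forest_pct": 60.0},
    {"huc12": "010100020302", "area_km2": 4.0, "temp_c": 16.0, "forest_pct": 85.0},
    {"huc12": "010200010101", "area_km2": 5.0, "temp_c": 20.0, "forest_pct": 40.0},
]


def aggregate(records, var, huc_digits):
    """Area-weighted mean of `var` per HUC at the given code length (6 to 12)."""
    weighted_sum = defaultdict(float)
    total_area = defaultdict(float)
    for r in records:
        huc = r["huc12"][:huc_digits]  # prefix of the code names the parent unit
        weighted_sum[huc] += r[var] * r["area_km2"]
        total_area[huc] += r["area_km2"]
    return {huc: weighted_sum[huc] / total_area[huc] for huc in weighted_sum}


# Color values for every HUC10 using all catchments.
all_huc10 = aggregate(catchments, "temp_c", 10)

# After a crossfilter is applied, recompute from the filtered catchments only.
forested = [r for r in catchments if r["forest_pct"] > 80.0]
filtered_huc10 = aggregate(forested, "temp_c", 10)
```

In this sketch, a HUC whose catchments are all filtered out simply drops from the result; how such units are displayed on the map is a design choice that the sketch does not address.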
In addition to the spatial aggregation, this application is also unique among the three case studies in that it is linked to an underlying model, which is updated periodically. Approximately once every 6 months, the stream temperature model is recalibrated using any new observations available in the crowd-sourced database as well as the most recent climate data (e.g., air temperature and precipitation). Predictions from the recalibrated stream temperature model are then used to update the brook trout occupancy model predictions. Finally, the updated predictions from both models are integrated into the dataset that is loaded by the web application.
Figure 6. Screenshot of the ICE Web application for stream temperature and brook trout occupancy in the northeast region showing the area-weighted mean elevation of each HUC10 basin and crossfilters for catchments' percentage of forest cover, mean summer stream temperature and the probability of brook trout occupancy under current climate conditions. Available at https://www.usgs.gov/apps/ecosheds/ice-northeast.
By linking this ICE application to its source database and models, we found that users were given an incentive to contribute their own data to the project. One group in particular, the Maine Water Temperature Working Group (MWTWG), is composed of representatives from various state, federal and tribal agencies, as well as universities and non-profit organizations from around the state of Maine. Each year, the MWTWG members collect new stream temperature data, which they upload to the database at the end of the field season. Using the ICE application, they can then access the latest model results, which are refined using their recently collected data. These results can then be used to support their own research projects and management needs. Therefore, this application demonstrates how data visualization tools can create a feedback loop between monitoring, modeling and management. This feedback loop can motivate users to not only use the model results but also contribute data back to the project in order to improve the quality of the model.

Climate Change Vulnerability Assessment of Native Trout Species in the Crown of the Continent Ecosystem
The second ICE application was developed for a climate change vulnerability assessment of two native trout species, bull trout and westslope cutthroat trout, in the CCE (http://ice.ecosheds.org/cce/; Figure 7). The design and functionality of this application were similar to those of the previous one, except that it did not include spatial aggregation due to the smaller number of features, and it allows the user to switch between two separate datasets, one for each species. For each dataset, the main output variables included a set of risk scores associated with changes in habitat, demographics, genetics and climate, all of which were also combined into an overall risk score. Risk scores were computed for four future climate change scenarios (RCPs 4.5 and 8.5, years 2035 and 2075). The datasets also included a large number of input variables that were used to compute each risk score. These input variables included metrics related to the degree of hybridization, demographics, presence/absence, habitat availability and climate exposure, which was based on the output of a stream temperature model of the region.
The CCE application was developed primarily for members of the Crown Managers Partnership (CMP), which is composed of fisheries biologists and resource managers from government and tribal agencies and non-profit organizations in Canada and the United States. In March 2018, this application was used by the CMP during a forum held in Lethbridge, Alberta [68]. During the forum, participants broke out into small working groups, each tasked with using the application to identify potential restoration actions for one species in a specific part of the region. At the end of the meeting, the recommendations from all groups were combined and provided a roadmap for priority actions to protect and restore critical habitats for native populations of the two trout species.
Use of this application during an in-person meeting demonstrates how interactive data visualization tools can support collaborative management and decision making by providing large, complex datasets that are easy to access and explore in a group setting. Without a tool like this, participants would have been limited to static charts and maps of the dataset contained in presentations, reports and publications. Those static visualizations would have covered only a small subset of the spatial scales and variables that could potentially be explored. However, with this tool, they were able to access and evaluate a large number of variables at multiple scales, which led to both a stronger understanding of the vulnerability assessment itself as well as informed management plans for prioritizing specific habitats for restoration and protection.

Streamflow Conditions and Alteration in the Lower Mississippi-Gulf Region
The third and most recent ICE application was created to explore streamflow conditions and degrees of alteration in basins draining to the Gulf of Mexico as part of a collaborative project led by the USGS and USEPA (https://www.usgs.gov/apps/ecosheds/lmg-restore/; Figure 8). Because of the ongoing nature of this project and the large number of datasets that will ultimately be generated, there was a need to add new datasets efficiently as they become available and to allow the user to switch between datasets that may contain different types of spatial features (e.g., USGS streamflow gages or HUC12 basin pour points). Therefore, the underlying code for the ICE framework was updated to have a more flexible structure for supporting multiple datasets with different spatial features. Furthermore, because many of the datasets for this project contained variables that varied by decade from the 1950s through the 2000s, we added support for time-varying values by including a slider (not shown) that allows the user to see how each variable changes over time on both the map and the crossfilter histograms. We also performed a major upgrade of the user interface by applying a more modern style and revising the layout of the interface components to provide a better user experience.
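One way to picture the time-varying support is as a dataset in which each variable holds one value per decade, with the slider selecting which value feeds the map and the crossfilters. The sketch below is only an illustration of that idea under assumed names: the feature identifiers, the `mean_flow` variable, its values and both helper functions are hypothetical and do not come from the application or its datasets.

```python
# Hypothetical layout: one value per decade for each spatial feature.
DECADES = [1950, 1960, 1970, 1980, 1990, 2000]

features = {
    "gage_A": {"mean_flow": [12.0, 11.5, 11.0, 10.2, 9.8, 9.1]},
    "gage_B": {"mean_flow": [4.0, 4.2, 4.1, 4.5, 4.8, 5.0]},
}


def values_for_decade(features, var, decade):
    """Return {feature id: value} for the decade selected on the slider."""
    i = DECADES.index(decade)
    return {fid: series[var][i] for fid, series in features.items()}


def filter_ids(values, lo, hi):
    """Feature ids whose value for the selected decade lies in [lo, hi]."""
    return sorted(fid for fid, v in values.items() if lo <= v <= hi)


# Moving the slider changes which features pass the same filter range.
in_range_1950s = filter_ids(values_for_decade(features, "mean_flow", 1950), 5.0, 10.0)
in_range_2000s = filter_ids(values_for_decade(features, "mean_flow", 2000), 5.0, 10.0)
```

Keeping the filter range fixed while the decade changes is what lets the same histogram and map show how conditions shift over time.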
The primary purpose of this ICE application is to serve as a central location for accessing and exploring the various datasets generated for the associated project. Each dataset, along with the code used to generate it, can be accessed directly through the USGS ScienceBase catalog [69]. However, in order to evaluate patterns and relationships both within and across these datasets, each user would need to download and process each dataset individually before manually combining them. Not only does this require certain skills and experience with processing complex data, but even for technical users with adequate skills, accessing and processing the data would still be time-consuming. Therefore, by making these datasets centrally available through a Web-based tool such as this ICE application, users with a wider range of backgrounds and skills can access and explore them more efficiently from a single, unified interface.
Because this application was recently released, we have yet to determine specifically how it is or will be used to support research and management efforts. However, based on the initial feedback we have received so far, potential users have already expressed interest in using the application as the project proceeds, especially once the upcoming datasets containing streamflow alteration metrics are completed and incorporated into the application. Our hope is that this tool will help users to understand where streamflow alteration is most significant and foster collaboration between stakeholders and decision makers as they develop management plans for improving freshwater delivery to the estuaries and wetlands along the Gulf coast.

Discussion
For large-scale environmental models and datasets to be effective in supporting management and decision making, we need new ways of making this information easier to access, explore and understand. The ICE framework demonstrates one approach to developing web-based interactive data visualizations for exploring the spatial patterns in environmental datasets and model outputs. Although we initially created the ICE framework for a specific dataset, we found that its primary components and architecture could be readily adapted to datasets from other research projects. Using this framework, we developed a series of Web applications designed to help users explore datasets and model results at multiple spatial scales, discover spatial patterns, evaluate relationships between variables and identify and prioritize areas for future management.
Based on our three case studies, we found that tools such as these can have a variety of broader impacts on research, management and decision making. From the northeast application, we found that, when linked to underlying datasets and models, these tools can play a central role in the iterative process of monitoring, modeling and management by motivating users to collect and contribute data to the project, which in turn improves the accuracy and quality of the underlying models. These tools can also provide a common platform for learning and exploring new datasets, which can be used to facilitate discussion and collaboration between stakeholders and decision makers across international jurisdictions, as demonstrated by the CCE application. Lastly, these tools can provide a unified interface for accessing and exploring multiple datasets that may be generated over the course of a single project and which otherwise may require substantial effort to process and analyze, as shown by the LMG application.
The use of interactive data visualizations for exploring geospatial datasets has been an area of research since as early as the 1990s [31,32]. However, only within the past decade or so has computer hardware coupled with new FOSS libraries and web standards made it possible to create interactive data visualizations of large datasets that can be accessed through a standard Web browser. While many data visualization tools have been created for exploring environmental datasets and models based on desktop software [33,70] or server-based Web architectures [24,34,35,71,72], the ICE framework demonstrates a new approach that uses a client-based architecture to create a responsive user experience for directly interacting with this kind of geospatial data.
One of the primary tradeoffs in using a client-based architecture is that the size of the dataset is limited to what can be reasonably downloaded, processed and rendered within the browser without causing excessive latency. Across our three case study applications, datasets ranged in size from approximately 100 kilobytes up to 50 megabytes, which we considered the upper limit for what can be supported by modern computer hardware. Although some datasets contained hundreds of variables, we selected only a subset of those variables in order to keep the dataset for each application to a reasonable size. To address this limitation, a server-side component could be added to the ICE framework that would generate datasets containing any combination of variables as chosen by the user. Although the number of variables would still be limited, this would provide greater flexibility for evaluating different combinations of variables depending on the interest of the user.
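The proposed server-side component could, in outline, select only the variables a user asks for and serialize them into a small payload for the browser. The following is a speculative sketch of that idea, not a description of any existing ICE component: the dataset, the variable names, the cap of three variables and the function `build_subset_csv` are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical full dataset held on the server: one row per catchment.
FULL_DATASET = [
    {"id": "c1", "elev_m": 610.0, "forest_pct": 92.0, "temp_c": 14.2, "slope_pct": 8.1},
    {"id": "c2", "elev_m": 150.0, "forest_pct": 35.0, "temp_c": 19.7, "slope_pct": 1.3},
]
MAX_VARIABLES = 3  # assumed cap that keeps the download small


def build_subset_csv(rows, variables, max_variables=MAX_VARIABLES):
    """Serialize only the requested variables (plus the id) as CSV text."""
    if len(variables) > max_variables:
        raise ValueError(f"at most {max_variables} variables can be requested")
    unknown = [v for v in variables if v not in rows[0]]
    if unknown:
        raise ValueError(f"unknown variables: {unknown}")
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["id", *variables],
        extrasaction="ignore",  # silently drop columns that were not requested
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


payload = build_subset_csv(FULL_DATASET, ["temp_c", "forest_pct"])
payload_bytes = len(payload.encode("utf-8"))  # size the client would download
```

Capping the number of variables per request keeps the payload within what the client can handle while still letting the user choose which combination to explore.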
In addition to the dataset size, the applications were also limited by the number and complexity of the geospatial features that are rendered on the map. In two of our case studies, we addressed this limitation by using spatial aggregation in the northeast application to reduce the overall number of features and by using simpler geometries in the LMG application (HUC12 pour points instead of polygons representing the basin areas). However, this limitation could also be addressed using alternative Web-based graphics technologies. Currently, the ICE framework renders geospatial features on the map using the scalable vector graphics (SVG) format, which is the simplest but also the least performant graphics standard available. However, Web-based maps can also be rendered using raster-based methods such as canvas or Web Graphics Library (WebGL), which can have better rendering performance and scalability than SVG at the expense of increased code complexity.
Beyond the technical limitations of the ICE framework, we also recognize that our understanding of its impacts on end users is limited. Although user feedback suggests that these tools are useful and engaging, this feedback was provided voluntarily and therefore may be biased if it was only provided by those who find the tools most useful to begin with. Controlled user surveys and experimental studies would be beneficial for helping us to determine if and how these tools improve understanding and influence decision making. User assessments of similar environmental data visualization tools have yielded mixed results regarding whether these types of tools improve understanding.
For example, Herring et al. [12] found that using an interactive data visualization tool for exploring local climate risks significantly changed the participants' attitudes towards climate change. Furthermore, van Hardeveld et al. [73] found that their interactive simulation system fostered cooperation among stakeholders and improved their understanding of the problems and actions related to peat management. In contrast, Xexakis and Trutnevyte [74] found that use of an interactive web application for simulating electricity supply scenarios did not improve user engagement and actually led to poorer user understanding. Similarly, Arciniegas et al. [75] found that an interactive map-based decision support tool was not considered helpful by study participants. Experimental user studies are therefore necessary to evaluate what impacts, if any, a tool has on user understanding and decision making.
In addition to understanding the impacts and effectiveness of these tools, user studies would also be helpful for improving the design of the ICE framework by helping to identify areas that cause confusion. Initial development of this framework was greatly improved thanks to feedback from a select group of users. However, this feedback was primarily obtained from other researchers and managers with similar technical skills and experience. More feedback from users with varied backgrounds would likely lead to a better tool that is both useful and effective across broader audiences.
Finally, as we continue to develop the ICE framework and adapt it to new datasets and models, there are a few key areas that we will focus on to expand and improve on its capabilities. Currently, the ICE framework is designed primarily to explore only the geospatial patterns of large-scale environmental datasets. However, many modeling datasets contain timeseries of observed and simulated values. Although some support for time-varying datasets was added in the LMG application, the addition of greater timeseries support will allow users to see not only the long-term average conditions of each spatial feature but also a complete time history. Furthermore, while the ICE framework is currently aimed at the exploration of static model outputs, it could also incorporate interactive modeling tools for performing scenario simulations of individual features. Lastly, finding ways of incorporating uncertainty in the visualizations will help to improve the use of these tools for decision making and management.

Conclusions
The ICE framework demonstrates a Web-based interactive data visualization approach to explore the spatial patterns in environmental datasets and model outputs. We applied this framework to datasets and models from three separate research projects, each focusing on a unique research topic and management issue in a specific region of North America. Based on our experience developing these applications, we found that not only can web-based data visualization tools be used for accessing, exploring and understanding large geospatial datasets, but these tools can also provide a number of broader impacts that go beyond any one user.
Web-based interactive data visualization tools such as the ICE framework can promote a democratization of environmental datasets and models by placing this information directly in the hands of stakeholders, managers, decision makers and other researchers. They can make large and complex datasets easier to access and understand for broader audiences who otherwise may not be able to analyze the raw data themselves. They can provide a central platform for shared learning and discovery, which in turn fosters discussion and collaboration between users. When linked to their underlying data sources and models, these tools can motivate users to contribute their own data to the project in order to improve the quality of the model results.
Although there are some limitations to this approach and the ICE framework specifically, further research into hybrid client-server architectures and the application of alternative rendering strategies is likely to increase the capabilities of this framework as well as improve its performance for handling larger datasets. Overall, the development and application of the ICE framework demonstrate that interactive data visualization tools can play a central role in the iterative process of environmental data collection, exploration, modeling and decision making.