The traditional flow of coastal ocean model data is from High-Performance Computing (HPC) centers to the local desktop, or to a file server where just the needed data can be extracted via services such as OPeNDAP. Analysis and visualization are then conducted using local hardware and software. This requires moving large amounts of data across the internet as well as acquiring and maintaining local hardware, software, and support personnel. Further, as data sets increase in size, the traditional workflow may not be scalable. Alternatively, recent advances make it possible to move data from HPC to the Cloud and perform interactive, scalable, data-proximate analysis and visualization, with simply a web browser user interface. We use the framework advanced by the NSF-funded Pangeo project, a free, open-source Python system which provides multi-user login via JupyterHub and parallel analysis via Dask, both running in Docker containers orchestrated by Kubernetes. Data are stored in the Zarr format, a Cloud-friendly n-dimensional array format that allows performant extraction of data by anyone without relying on data services like OPeNDAP. Interactive visual exploration of data on complex, large model grids is made possible by new tools in the Python PyViz ecosystem, which can render maps at screen resolution, dynamically updating on pan and zoom operations. Two examples are given: (1) Calculating the maximum water level at each grid cell from a 53-GB, 720-time-step, 9-million-node triangular mesh ADCIRC simulation of Hurricane Ike; (2) Creating a dashboard for visualizing data from a curvilinear orthogonal COAWST/ROMS forecast model.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited