**Advances in a Distributed Approach for Ocean Model Data Interoperability**

#### **Richard P. Signell and Derrick P. Snowden**

**Abstract:** An infrastructure for earth science data is emerging across the globe based on common data models and web services. As we evolve from custom file formats and web sites to standards-based web services and tools, data is becoming easier to distribute, find and retrieve, leaving more time for science. We describe recent advances that make it easier for ocean model providers to share their data, and for users to search, access, analyze and visualize ocean data using MATLAB® and Python®. These include a technique for modelers to create aggregated, Climate and Forecast (CF) metadata convention datasets from collections of non-standard Network Common Data Form (NetCDF) output files, the capability to remotely access data from CF-1.6-compliant NetCDF files using the Open Geospatial Consortium (OGC) Sensor Observation Service (SOS), a metadata standard for unstructured grid model output (UGRID), and tools that utilize both CF and UGRID standards to allow interoperable data search, browse and access. We use examples from the U.S. Integrated Ocean Observing System (IOOS®) Coastal and Ocean Modeling Testbed, a project in which modelers using both structured and unstructured grid model output needed to share their results, to compare their results with other models, and to compare models with observed data. The same techniques used here for ocean modeling output can be applied to atmospheric and climate model output, remote sensing data, digital terrain and bathymetric data.

Reprinted from *J. Mar. Sci. Eng.* Cite as: Signell, R.P.; Snowden, D.P. Advances in a Distributed Approach for Ocean Model Data Interoperability. *J. Mar. Sci. Eng.* **2014**, *2*, 194-208.

#### **1. Introduction**

Ocean modelers typically require many different types of input data for forcing, assimilation and boundary conditions, and routinely produce gigabytes or more of output data. Depending on which model is used, the horizontal coordinate of the output data may be on a regular, curvilinear, or unstructured (e.g., triangular) grid, while the vertical coordinate may be on a uniform or stretched grid with a number of different possibilities (e.g., sigma, sigma-over-z, s-coordinate, isopycnal). Ocean modelers therefore often spend large amounts of time on mundane data manipulation tasks such as searching for and reformatting data from external sources, writing custom readers for specific models so that results from different models can be compared and assessed, and responding to custom data requests from consumers of their model products. Better tools reduce the time spent on these tasks, leaving more time for modeling and analysis.

The U.S. Integrated Ocean Observing System (U.S. IOOS®) has been working on better tools to support not only its member organizations, but the entire ocean science community. U.S. IOOS (hereafter referred to simply as IOOS) is a collaboration between Federal, State, Local, Academic and Commercial partners to manage ocean observing and modeling systems to meet the unique needs of each region around the US [1–3]. Federal partners provide the "National Backbone", and 11 IOOS Regional Associations (RAs) build upon the backbone with local assets to create observational and modeling systems designed to be more than the sum of their parts, capable of responding to the societal needs of each individual region (e.g., harmful algal blooms, eutrophication, search and rescue, oil spills, navigation, mariculture) (Figure 1).

**Figure 1.** The 11 United States Integrated Ocean Observing System (U.S. IOOS) Regional Associations, reproduced with permission from © 2011 Dynamic Network Services—DynDNS.com Internet Guide.

In 2008, IOOS held a community modeling workshop attended by 57 members spanning federal, research and private sectors, including modelers and stakeholders. The workshop produced a report with nine specific recommendations to advance the state of ocean modeling in the US [4]. One of the recommendations was to "develop an implementation plan for a distributed, one-stop shopping national data portal and archive system for ocean prediction input and output data". The US Geological Survey (USGS) had been working on model data interoperability for their collaborative projects on sediment transport modeling [5–7] and in 2009 agreed to send one of their modelers to the U.S. IOOS Program Office, within the National Oceanic and Atmospheric Administration (NOAA), for a one-year detail to lead the effort.

The one-year project to develop model data interoperability for IOOS was remarkably successful. Leveraging technologies developed for the atmospheric community, a model data delivery and access system was implemented in all 11 IOOS RAs and at many of the National Backbone modeling centers [8]. The approach relied mostly on technologies that had grown from the community and emerged as community practices [9,10]. The system design allowed modelers to serve their data in a standardized manner via IOOS-approved web services without modifying their original data files or their models. Users were then able to access these standardized data streams using a variety of tools, from simple map-based browsing, to more sophisticated 3D visualization, to full scientific exploration on their desktop computers. With this success, future work to build on this infrastructure was recommended, including improved techniques for searching datasets, better support for unstructured grids and observational data, server-side subsetting for unstructured grids, more tools for common analysis tasks, and tools for scientific analysis and visualization environments in addition to MATLAB.

**Figure 2.** (**left**) The structured (curvilinear orthogonal) grid SLOSH model. (**right**) The unstructured (triangular) grid ADCIRC model.

In 2010, IOOS funded a Coastal and Ocean Modeling Testbed (COMT), with the goal of accelerating improvement in ocean forecasting through targeted model assessment and comparison projects. The initial COMT focused on Estuarine Hypoxia, Shelf Hypoxia and Inundation [11,12], and prioritized assessment of model data output from both curvilinear orthogonal grid models (SLOSH, ROMS, NCOM and HyCOM) and unstructured triangular grid models (ADCIRC, FVCOM and SELFE) (Figure 2). The COMT Cyberinfrastructure team was charged with developing and implementing technologies to meet these needs.

Here we report on significant improvements to the IOOS infrastructure, made since the system described in [8], that are relevant to ocean modelers and users of ocean model products. Many of these were developed in the COMT and other IOOS activities, while other components were developed external to IOOS in the international geoscience community. These include new standards for unstructured grid model output and for observational data (e.g., time series, profiles, trajectories), new services and access tools for consuming these standardized data, more analysis tools for MATLAB users, and new tools for Python users. These tools and techniques are not specific to IOOS, and should be of interest to anyone seeking more efficient distribution of, or access to, ocean modeling and observational data.

**Figure 3.** Schematic of the IOOS Coastal and Ocean Modeling Testbed (COMT) model data interoperability design. Non-standard model output and data files are converted into standardized and aggregated virtual datasets using the NetCDF Markup Language (NcML), a lightweight XML layer. A custom NcML template is developed for each type of model output (e.g., collections of SELFE files). Once the data has been standardized to Common Data Model feature types (by the use of CF-1.6 and UGRID-0.9 conventions), it can be distributed uniformly by appropriate services and consumed by standards-based clients, providing data interoperability for the user.
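To make the NcML step in Figure 3 more concrete, the following is a minimal Python sketch of how an aggregating NcML wrapper might be generated for a directory of model output files. It is an illustration under stated assumptions and not the COMT templates themselves: the helper `build_ncml`, the directory `/data/selfe_run01/`, the variable name `temp`, and the attribute values are all hypothetical. Only the NcML element names (`netcdf`, `aggregation`, `scan`, `variable`, `attribute`) and the `joinExisting` aggregation type come from the NcML schema.

```python
import xml.etree.ElementTree as ET

# XML namespace used by NcML 2.2 documents.
NCML_NS = "http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"


def _tag(name):
    """Return a namespace-qualified NcML tag name."""
    return "{%s}%s" % (NCML_NS, name)


def build_ncml(scan_dir, suffix=".nc", time_dim="time", variable_attrs=None):
    """Build an NcML string that aggregates files along an existing dimension.

    scan_dir       -- directory of model output files (hypothetical path)
    suffix         -- only files ending with this suffix are aggregated
    time_dim       -- name of the dimension the files are joined along
    variable_attrs -- {variable name: {attribute name: value}} metadata that
                      is overlaid on the files without modifying them
    """
    ET.register_namespace("", NCML_NS)  # serialize with a default namespace
    root = ET.Element(_tag("netcdf"))

    # Global attribute declaring the metadata convention of the virtual dataset.
    ET.SubElement(root, _tag("attribute"),
                  {"name": "Conventions", "value": "CF-1.6"})

    # Per-variable attributes added (or overridden) in the virtual dataset.
    for var_name, attrs in (variable_attrs or {}).items():
        var = ET.SubElement(root, _tag("variable"), {"name": var_name})
        for attr_name, attr_value in attrs.items():
            ET.SubElement(var, _tag("attribute"),
                          {"name": attr_name, "value": attr_value})

    # Join every matching file in scan_dir along the existing time dimension.
    agg = ET.SubElement(root, _tag("aggregation"),
                        {"dimName": time_dim, "type": "joinExisting"})
    ET.SubElement(agg, _tag("scan"), {"location": scan_dir, "suffix": suffix})

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical example: add CF metadata to a temperature variable.
    print(build_ncml(
        "/data/selfe_run01/",
        variable_attrs={"temp": {"standard_name": "sea_water_temperature",
                                 "units": "degree_Celsius"}},
    ))
```

The point of the design is that the original output files stay untouched: the XML layer supplies the missing metadata and presents the collection as one dataset, so a separate template per model type is enough to standardize many runs.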
