Most open repositories present a similar interface and workflow for publishing data resulting from different types of research methods. Publishing simulation datasets is challenging because simulations are iterative, generate many large files, and require detailed documentation. DesignSafe is a web-based open platform for natural hazards engineering research where users can run simulations on high-performance computing resources and curate and publish their data. Working closely with experts, we completed a data design project for the curation and representation of simulation datasets. The design involved the creation of a data and metadata model that captures the main processes, data, and documentation used in natural hazards simulation research. The model became the foundation for designing an interactive curation pipeline integrated with the rest of the platform's functions. In the curation interface, users are guided to move, select, categorize, describe, and register relations between files corresponding to the simulation model, input, and output categories. Curation steps can be undertaken at any time during active research. To engage users, the web interactions were designed to facilitate managing large numbers of files. The resulting data landing pages show the structure and metadata of a simulation process both as a tree and as a browsing interface, for understandability and ease of access. To evaluate the design, we mapped real simulation data to interactive mockups and sought experts’ feedback. After implementing a first release of the pipeline, we evaluated the data publications and made the necessary enhancements.
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.