Multi-Attribute Ecological and Socioeconomic Geodatabase for the Gulf of Mexico Coastal Region of the United States

Abstract: Strategic, data-driven conservation approaches are increasing in popularity as conservation communities gain access to better science, more computing power, and more data. High-resolution geospatial data that indicate ecosystem functions and economic activity can be very useful to conservation experts and funding agencies. A framework was developed for a data-driven conservation prioritization tool and a data visualization tool. The tools were then implemented and tested for the U.S. Gulf of Mexico coastal region defined by the Gulf Coast Ecosystem Restoration Council. As part of this tool development, priority attributes and data measures were developed for the region through 13 stakeholder charrettes with local, state, federal, and non-profit organizations involved in land conservation. This paper presents the measures that were developed to reflect stakeholder priorities. These measures were derived from openly available geospatial and non-geospatial data sources. The database contains 19 measures, aggregated into a 1 km² hexagonal grid and grouped by the overarching goals of habitat, water quality and quantity, living coastal and marine resources, community resilience, and economy. The measures provide useful data for a conservation planning framework in the U.S. Gulf of Mexico coastal region.


Summary
In response to the Deepwater Horizon oil spill in 2010, the United States government authorized the Resources and Ecosystems Sustainability, Tourist Opportunities and Revived Economies of the Gulf Coast States Act (RESTORE Act) to develop and implement a comprehensive strategy for the restoration and protection of the Gulf Coast Region (GCR) of the United States. The RESTORE Act established the Gulf Coast Ecosystem Restoration Council (Restore Council), which is responsible for implementing the protection and restoration of the GCR as described in its initial comprehensive plan. The Strategic Conservation Assessment of Gulf Coast Landscapes (SCA) Project, funded under the Restore Council's Council-Selected Restoration Component (Bucket 2), is intended to: (1) collate existing plans and priorities within an ecological and socioeconomic framework proposed by the Restore Council's initial comprehensive plan [1]; (2) develop tools and templates that can evaluate and strengthen existing land conservation proposals; and (3) develop spatial data layers that can be used to identify potential areas for land conservation projects. To accomplish these three objectives, the SCA project is developing a suite of three tools with which users can explore a catalog of existing conservation plans and projects across the GCR (Catalog Tool) [1], evaluate and strengthen conservation proposals (Conservation Prioritization Tool), and explore areas within the GCR based on conservation priorities (Conservation Visualization Tool). The SCA tool suite is available to individuals and organizations who are interested in maximizing conservation benefits across the GCR within an environmental, social, and economic context. The backbone of the SCA tool suite is the geospatial database presented in this paper.
This geodatabase of 19 ecological and socioeconomic measures for the GCR was compiled to assess the ecological and socioeconomic benefits that may be associated with the protection of different landscapes. The measures are aggregated within a single hexagonal grid layer with a resolution of 1 km². The geodatabase was developed by the SCA project for use in the Conservation Prioritization Tool (CPT) and the Conservation Visualization Tool (CVT). The CPT, currently in beta testing, is an online tool in which users can explore and compare the ecological and socioeconomic benefits of land conservation actions throughout the GCR. To assess the value of lands being considered for conservation, the CPT uses the geodatabase of 19 measures within a multi-criteria decision analysis framework. The CPT provides the user with summaries of raw geospatial data and utility values that may support conservation actions in identified project areas. The upcoming CVT will use the geodatabase to allow users to visualize where land conservation may achieve conservation objectives throughout the GCR.

Data Description
The geodatabase contains five categories of data, reflecting the five goals (Table 1) that the Gulf Coast Ecosystem Restoration (Restore) Council declared in its initial comprehensive plan [2]. The data cover the SCA region of interest, which aligns with the GCR identified by the RESTORE Act, an area of approximately 700,000 km² (Figure 1). The GCR encompasses the coastal parts of AL, LA, MS, and TX, and all of FL. This geographic extent was created using the coastal management zone area in the five Gulf states, plus a 40.2 km inland buffer. Each measure, reflected by a dataset within the geodatabase, is described in Table 2. More detailed information about each measure can be found in Appendix A.

Table 1 (fragment).
Build upon and sustain communities with capacity to adapt to short- and long-term changes.
5. Restore and Revitalize the Gulf Economy (GEC): Enhance the sustainability and resiliency of the Gulf economy.
1 In the remainder of the manuscript, RESTORE goals will be referenced by the defined acronyms.

Methods
The geodatabase was compiled from 19 measures that were developed either directly from existing datasets or by building datasets from the analysis of GIS data. All measures included in the geodatabase were identified in charrettes that the SCA project held across the GCR from March to May 2018 with land conservation stakeholders, including representatives of various RESTORE member and partner agencies and organizations that engage in conservation actions. In total, 13 charrettes were conducted within the GCR (Austin, Corpus Christi, and Galveston, Texas; New Orleans, Louisiana; Biloxi, Mississippi; Mobile, Alabama; St. Mark's and St. Petersburg, Florida), attended by a total of 176 stakeholders.
Across the charrettes, stakeholders proposed a total of 46 priority attributes that they considered important to land conservation in the GCR. Priority attributes were defined as key features that more specifically define conservation goals and can be quantified through measures. Stakeholders were asked to define priority attributes for land conservation within the framework of the five RESTORE goals (Table 1) and to suggest ways to measure each priority, along with relevant sources of data. Stakeholders then ranked each attribute from highest to lowest to indicate its priority for land conservation. In total, a list of 260 tentative measures was developed from the stakeholder-identified priority attributes. Suitable data sources for each measure were then located in consultation with experts in geospatial analysis and large-scale conservation planning. To be included in the database, a measure had to have Gulf-wide data availability and relevance to land conservation. Following expert consultation and data exploration, the SCA project approved 19 measures for inclusion in the database. Data for these measures were then derived from the openly available geospatial and non-geospatial data sources listed in Table 2.
Each measure was then processed to a 1 km² hexagonal grid using ArcGIS and the leaflet package in R [33-35]. Seven distinct methods were used to process the data into measures, depending on the data type of the source (i.e., raster or vector) and the type of measure being created (i.e., index, binary, percentage, count, or length). See Section 3.2 for more information regarding data processing.

Hexagon Grid
Characterizing and comparing heterogeneous data types on the fly across a geographic extent as large as the SCA region required subdividing the region into finer, relatively homogeneous sub-regions. Within each sub-region, variation in the data is minimal, which makes the subdivisions relatively easy to interpret and process. In the literature, this process of subdividing is often referred to as zone mapping [36]. Zone mapping can be achieved by using natural delineations (e.g., watershed or land cover), geopolitical delineations (e.g., county or city), or a fixed grid system. Subdividing geospatial data with a fixed grid system has a particular advantage for decision analysis: it represents the underlying data coarsely, without the complexities of data size.
In this work, the SCA region (Figure 1) was partitioned into an equal-area hexagonal grid (side length of 0.61 km and area of 1 km²; see Figure 2). Hexagonal grids are preferred over other geometrical shapes (e.g., squares or triangles) because their cells tile the region without gaps or overlaps and the center-to-center distances between neighboring cells are almost equal [36]. Furthermore, hexagonal cells have a topology that is symmetric, invariant, and of equal area, and they can be recursively partitioned into finer grids if or when higher resolution data types need to be represented. The geodatabase was stored within the hexagonal grid as an Esri Shapefile, which holds nontopological geometry and attribute information for the spatial features in the dataset. An Esri Shapefile consists of a main file (.shp), an index file (.shx), a dBASE table (.dbf), a projection file (.prj), and a metadata file (.xml) that is compliant with the Federal Geographic Data Committee standards [37]. These data can be read in commercial GIS tools such as ArcMap and in open source programs such as QGIS, R, and Python. The Esri Shapefile can also be converted into other file formats, such as GeoJSON, using open source tools.
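The relationship between a hexagon's side length and its area can be checked with a short calculation. The sketch below is illustrative only and is not the authors' grid-generation code: the function names, the flat-top orientation, the `hex_id` and `example_measure` properties, and the planar kilometre coordinates are all assumptions. For a regular hexagon, area = (3√3/2) × side², so an area of exactly 1 km² corresponds to a side of about 0.62 km, and the stated side of 0.61 km gives about 0.97 km², consistent with a nominal 1 km² cell.

```python
import json
import math


def hexagon_area(side_km: float) -> float:
    """Area (km^2) of a regular hexagon with the given side length (km)."""
    return 1.5 * math.sqrt(3) * side_km ** 2


def hexagon_side(area_km2: float) -> float:
    """Side length (km) of a regular hexagon with the given area (km^2)."""
    return math.sqrt(area_km2 / (1.5 * math.sqrt(3)))


def hexagon_feature(cx: float, cy: float, side: float, properties: dict) -> dict:
    """Build a GeoJSON-style Feature for one flat-top hexagon centred on (cx, cy).

    Coordinates here are planar kilometres for illustration; a real GeoJSON
    file would normally carry longitude/latitude instead.
    """
    # Six corners, counter-clockwise, starting at angle 0.
    ring = [
        [cx + side * math.cos(math.radians(60 * k)),
         cy + side * math.sin(math.radians(60 * k))]
        for k in range(6)
    ]
    ring.append(list(ring[0]))  # polygon rings must repeat the first vertex
    return {
        "type": "Feature",
        "properties": properties,
        "geometry": {"type": "Polygon", "coordinates": [ring]},
    }


# Side length that yields exactly 1 km^2, and the area implied by 0.61 km.
side = hexagon_side(1.0)
print(round(side, 4), round(hexagon_area(0.61), 4))

# One hypothetical grid cell carrying a made-up measure value.
feature = hexagon_feature(0.0, 0.0, side, {"hex_id": 1, "example_measure": 0.5})
print(json.dumps(feature)[:60])
```

A full grid would repeat `hexagon_feature` over a lattice of centres and attach the 19 measure values as properties of each cell.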

Data Processing Workflow
The methods used to process each measure were dependent on the type of source data (i.e., vector or raster) and the type of measure unit (i.e., index, binary, percentage area, count, or length). See Table 3 for a more detailed summary of the source data type and measure unit type for each of the 19 measures. Figures 3 and 4 provide a visualization of the general workflow used to produce measures from vector and raster source data, respectively. Appendix A provides more detailed steps for how we produced each measure.
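As a rough illustration of how the measure unit type changes the per-hexagon summary, the sketch below aggregates source features that have already been clipped to a single hexagon. It is not the authors' workflow (which used ArcGIS and R; see Appendix A): the `ClippedPiece` structure, the `aggregate` function, and in particular the area-weighted mean used for the "index" case are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClippedPiece:
    """Part of a source feature that falls inside one hexagon (hypothetical)."""
    area_km2: float = 0.0    # area of the clipped piece
    length_km: float = 0.0   # length of the clipped piece (line features)
    value: float = 0.0       # score carried by the source feature, if any


def aggregate(pieces: List[ClippedPiece], unit: str, hex_area_km2: float = 1.0) -> float:
    """Summarise the clipped pieces of one hexagon according to the measure unit."""
    if unit == "binary":
        # 1 if anything intersects the hexagon, otherwise 0.
        return 1 if pieces else 0
    if unit == "count":
        return len(pieces)
    if unit == "length":
        return sum(p.length_km for p in pieces)
    if unit == "percentage":
        # Share of the hexagon covered by the source features.
        return 100.0 * sum(p.area_km2 for p in pieces) / hex_area_km2
    if unit == "index":
        # Assumption: area-weighted mean of the source scores.
        covered = sum(p.area_km2 for p in pieces)
        if covered == 0:
            return 0.0
        return sum(p.value * p.area_km2 for p in pieces) / covered
    raise ValueError(f"unknown measure unit: {unit}")


# Two made-up pieces inside a 1 km^2 hexagon.
pieces = [
    ClippedPiece(area_km2=0.25, length_km=0.5, value=1.0),
    ClippedPiece(area_km2=0.25, length_km=0.25, value=0.0),
]
for unit in ("binary", "count", "length", "percentage", "index"):
    print(unit, aggregate(pieces, unit))
```

The clipping itself (the spatial join against hexagon boundaries) is left out here because it depends on the GIS library in use.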

Data Overview
Figures 5a-5e illustrate the five types of data measures (i.e., index, binary, percentage area, count, and length) present within the database, shown in their preprocessed states. Figures 6a and 6b illustrate the two types of source data (i.e., vector and raster) that were used in processing the database measures.

User Notes
Two potential sources of uncertainty need to be considered when using this database. The first is the time lag between the production of the source data and the creation of the database, which differs for each measure. As time progresses, the accuracy of each measure will inevitably decline, but the database used the most recent version of each source dataset to minimize any error induced by time lag. The second is the 1 km² resolution of the hexagonal grid. Because the measures were defined to describe general features of an area for conservation considerations, and most lands conserved in this region were roughly 1 km², the data resolution was adequate for assessing land conservation value. Caution is therefore advised when using this database for purposes other than assessing land conservation value.
The geodatabase is available as a shapefile at Scholar's Junction, an institutional repository of Mississippi State University. A readme file is available for download along with the database, in which the user can find instructions for converting the database from GeoJSON to other formats.

Supplementary Materials: Additional information about the Strategic Conservation Assessment of Gulf Coast Landscapes project for which this database was developed can be found at https://github.com/scatools and http://www.landscope.org/gulfcoast.

Author Contributions: S.S. conceived of the idea of this paper. A.S. and S.S. performed the writing. J.L., A.S., and S.S. led the data processing. J.R. led the stakeholder charrettes and provided rankings of priority attributes. K.E., J.R., and A.L. initiated discussions with experts on refining and using the data sources. K.E. and A.L. contributed to writing and revisions. All authors have read and agreed to the published version of the manuscript.

Conflicts of Interest:
The authors declare no conflicts of interest.

A1. Threat of Urbanization
Definition: Threat of urbanization (ToU) indicates the likelihood of the given project area or area of interest (AOI) being urbanized by the year 2060. A ToU score of zero indicates that the AOI is already urbanized, and a score of one indicates that there is no threat of urbanization. Scores between zero and one indicate the predicted threat in decreasing order: the higher the score, the lower the likelihood of urbanization. Source data came from the SLEUTH model [3-6].
3. Perform spatial join with hexagon boundaries to extract areas.

A4. Composition of Natural Lands
Definition: This attribute prioritizes rare habitat types and those that have been identified as conservation priorities in state and regional plans. Scores reflect the proportion (%) of each area of interest that is covered by a priority land cover. Source data came from USGS GAP (2011) [9] (Table A1).

Definition: A percent attribute that represents the change in peak flow during a 24 h 1 rainfall event of an HUC-12 watershed in its currently developed state relative to its hypothetical pre-developed state. The pre-developed state was created by replacing all anthropogenic land cover with the most dominant natural land cover found within the HUC-12 watershed. All land cover used in this measure came from the National Land Cover Database (NLCD) 2016 [14].
Workflow:
1. Perform spatial join of the vector data with hexagon boundaries.
Definition: An index that measures the intensity of light pollution (LP) within each hexagon. A score of zero indicates that the sky above the hexagon is already polluted/bright, and scores above zero up to one indicate light pollution in decreasing order. Source data came from Falchi et al. [19].
Workflow:
1. Reclassify raster values from 0-30 to zero (Z), low (L), medium (M), and high (H) as per the thresholds shown in Table A2.
2. Convert the Z, L, M, and H classes to vectors.
3. Crop the vector classes to hexagon boundaries and perform spatial joins to obtain the areas (A_Z, A_L, A_M, and A_H).
4. Calculate the LP index as shown in Equation (A4).

1. Clip the national heritage area data to hexagon boundaries.
2. Perform spatial join to obtain the area within each hexagon.

A14. Proximity to Socially Vulnerable Communities
Definition: This measure indicates the proximity to communities that are socially vulnerable according to the National Oceanic and Atmospheric Administration's (NOAA) Social Vulnerability Index. It is a binary attribute that represents the spatial relationship between a hexagon and areas that NOAA has identified as having medium or higher social vulnerability. Any area of interest that directly intersects, or is within a distance of one hexagon (1 km²) of, a socially vulnerable community scores a one, and areas of interest farther than one hexagon from a socially vulnerable community score a zero. Source data came from the Social Vulnerability Index of Coastal Communities, from NOAA's Office of Coastal Management [22,23].
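The binary scoring rule for this measure can be sketched on an abstract hexagonal grid. The example below is a minimal illustration under stated assumptions, not the authors' processing code: hexagons are addressed by axial coordinates (q, r), which is a common convention but is not described in the source, and "within a distance of one hexagon" is read as "shares an edge with a hexagon that intersects a vulnerable community". The function name and the sample cells are hypothetical.

```python
from typing import Dict, Iterable, Tuple

Hex = Tuple[int, int]  # axial coordinates (q, r); an assumed indexing scheme

# Offsets to the six edge-sharing neighbours of a hexagon in axial coordinates.
AXIAL_NEIGHBORS = [(1, 0), (1, -1), (0, -1), (-1, 0), (-1, 1), (0, 1)]


def proximity_scores(all_hexes: Iterable[Hex], vulnerable_hexes: Iterable[Hex]) -> Dict[Hex, int]:
    """Score 1 for hexagons on or adjacent to a vulnerable hexagon, else 0."""
    vulnerable = set(vulnerable_hexes)
    scores = {}
    for q, r in all_hexes:
        on_community = (q, r) in vulnerable
        next_to_community = any(
            (q + dq, r + dr) in vulnerable for dq, dr in AXIAL_NEIGHBORS
        )
        scores[(q, r)] = 1 if (on_community or next_to_community) else 0
    return scores


# A made-up row of four hexagons; only the first intersects a vulnerable community.
row = [(0, 0), (1, 0), (2, 0), (3, 0)]
print(proximity_scores(row, vulnerable_hexes=[(0, 0)]))
```

In a GIS workflow the same result would more likely come from a buffer-and-intersect operation on the hexagon polygons; the neighbour lookup here only mirrors the scoring logic.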
Definition: This measure indicates the number of points within a 25 km buffer radius of a hexagon, where the public can access places to engage in outdoor recreation, including boat ramps and access points to parks, wildlife management areas, wildlife refuges, and national estuarine research reserves. Besides boat ramps, access points to protected areas were identified by intersections between roadways and protected areas listed under the Protected Areas Database for the United States (PAD-US) 2.0 [7]. Roadways were acquired from the Topologically Integrated Geographic Encoding and Referencing (TIGER)/Line Shapefile by the United States Census Bureau (USCB) [32]. Public boat ramps were obtained from the Florida Fish & Wildlife Conservation Commission (FWC) [27], Alabama Dept. of Natural Resources (ALDNR) [28]