Quantifying the Representation of Plant Communities in the Protected Areas of the U.S.: An Analysis Based on the U.S. National Vegetation Classification Groups

Abstract: Plant communities represent the integration of ecological and biological processes and they serve as an important component for the protection of biological diversity. To measure progress towards protection of ecosystems in the United States for various stated conservation targets, we need datasets at the appropriate thematic, spatial, and temporal resolution. The recent release of the LANDFIRE Existing Vegetation Data Products (2016 Remap), with a legend based on the U.S. National Vegetation Classification (USNVC), allowed us to assess the conservation status of plant communities of the U.S. The map legend is based on the Group level of the USNVC, which characterizes the regional differences in plant communities based on dominant and diagnostic plant species. By combining the Group level map with the Protected Areas Database of the United States (PAD-US Ver 2.1), we quantified the representation of each Group. If the mapped vegetation is assumed to be 100% accurate, then using the Aichi Biodiversity target (17% of land in protection by 2020) we found that 159 of the 265 natural Groups have less than 17% in GAP Status 1 & 2 lands and 216 of the 265 Groups fail to meet a 30% representation target. Only four of the twenty ecoregions have >17% of their extent in Status 1 & 2 lands. Sixteen ecoregions are dominated by Groups that are under-represented. Most ecoregions have many hectares of natural or ruderal vegetation that could contribute to future conservation efforts, and this analysis helps identify specific targets and opportunities for conservation across the U.S.


Introduction
Ecologists have long recognized the importance of conserving biological diversity and the need for interdisciplinary approaches to address the challenge [1,2]. Myers [3] emphasized the potential for long-term evolutionary consequences of losing species and, in conjunction, recent research warns of the potential for ecosystem disruption as a result of species loss [4–6]. Over the last several decades, the scientific community worked to develop methods to quantify biodiversity [7,8] and to understand the threats to that diversity, including habitat loss [9], climate change [10], invasive species [11], disease [12] and interactions between multiple threats [13,14]. The international community has responded by coming together to establish the scientific basis for management and conservation actions at every level of organization (genes, species, ecosystems) [15] and to identify specific goals to help achieve sustainable development while maintaining biological diversity [16].

Table 1. U.S. National Vegetation Classification 8-level hierarchy, criteria, and example types at each level. For these analyses we focused on the Formation Class and the Group levels of the hierarchy [42].

With the release of the new Standard, the USNVC Hierarchy Revisions Working Group was tasked to complete a global set of upper formation level units for the USNVC Standard [42]. Development of the middle levels of the vegetation classification (Division, Macrogroup, and Group) was done through a broader partnership of experts who needed these units to support moderate resolution inventory and mapping efforts. Between 2012 and 2015, the Ecological Society of America (ESA) Panel on Vegetation Classification worked with the ESA Panel's Editor in Chief, NatureServe ecologists, and many state and federal ecologists to conduct a formal review and revision of the middle levels of the USNVC. In 2016, there was a formal release of vegetation types at all eight levels of the hierarchy for the conterminous U.S. (USNVC Ver 2.0).
Prior to the 2016 revisions to the USNVC, GAP and LANDFIRE relied on the Ecological Systems Classification System [45], which had been developed specifically to address the need for a vegetation classification system to support mapping. Over time, the two programs began collaborating on mapping vegetation with the goal of streamlining the process to support the goals of both GAP and LANDFIRE. In 2021, the LANDFIRE technical team released the first existing vegetation map for the conterminous U.S. based on the USNVC Group level classification [45]. The Group is the finest level of the classification that could be mapped with moderate resolution imagery, and it captures important regional differences in natural vegetation types (Table 1, Figure 1).
Here we present the GAP analysis based on that Group level map and the most current version of the Protected Areas Database of the U.S. (PAD-US Ver 2.1) [46]. We were specifically interested in addressing four major questions: (1) How well represented are the natural Groups in the existing conservation network? (2) What are the spatial patterns of representation? (3) Where are opportunities for increasing representation? In other words, where are natural types outside the current conservation network, and where are ruderal and plantation vegetation that might be restored to natural conditions? (4) Which agencies are currently managing most of the nation's vegetation resources?

Materials and Methods
The ecological models used to create the LANDFIRE Remap existing vegetation map are based on Landsat Enhanced Thematic Mapper Plus (ETM+) and Operational Land Imager (OLI) imagery, a suite of ancillary datasets (e.g., topography, climate, and soils) and the LANDFIRE Reference Database [47,48]. Seasonal image mosaics centered on 2016 were created using the USGS Earth Resources Observation and Science Center high performance computing systems and a best-pixel image compositing process [49], which was used to produce cloud-free image mosaics for each vegetation production unit area. Data from the National Land Cover Database [50] and the National Agricultural Statistics Service Cropland Data Layer [51] were used in mapping non-natural classes such as developed and agricultural types.
To map natural and semi-natural (ruderal) vegetation, the LANDFIRE technical team assembled vegetation plot data in the LANDFIRE Reference Database [52]. There were over 700,000 plots available for mapping existing vegetation in the conterminous U.S. The technical team screened the plots for recent disturbances and spectral outliers, labeled them with the dominant physiognomy (herbaceous, shrub, or tree), and assigned them to a USNVC Group using either an auto-key process based on species cover or a crosswalk based on other community classification information provided with the plot. Labeled plots were used in a classification tree model to create masks for the herbaceous, shrub, and treed areas. For each of those areas a new round of models was created, this time with the Group label as the response variable. Predictor variables included Landsat mosaics and derivatives and ancillary environmental variables. Specialized binary masks (e.g., sparsely vegetated, alpine, riparian) developed separately were used to map types in appropriate locations. The draft map for each vegetation production unit area was reviewed by several ecologists familiar with the area's vegetation. When necessary, post-processing was done to refine classes, either by applying decision rules to relabel pixels or by running a new model if the concept of the type needed to be refined based on new information or additional plot data. For a full description of the LANDFIRE Remap methodology see Picotte et al. [47].
The final map of the conterminous U.S. contains 499 mapped land cover classes at 30-m resolution, including 287 Groups representing natural (265) and ruderal (22) vegetation (Table S2).
An initial assessment of the Group level map has been conducted based on a subset of 10% of the plot data that were set aside prior to the modeling work. The contingency tables can be found at the LANDFIRE Program's website (LANDFIRE Program: Data Products-Data Quality-LF Remap EVT Agreement Assessment). The results of the assessment are preliminary and future assessment work is being discussed. While the plots were randomly selected from the plot database, they do not represent a stratified sampling across all types and geographies and therefore it is not possible to know the true precision of the mapping for all the Groups at this time. For our analyses, we assume that the map represents the best information available and take the next step in quantifying the level of representation at a national scale.
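As a rough illustration of how an agreement assessment of this kind can be summarized, the sketch below sets aside 10% of labeled plots at random and then computes overall and per-class agreement from a contingency table. It is not the LANDFIRE assessment code: the helper names `holdout_split` and `agreement`, the Group labels, and all counts are hypothetical.

```python
import random
from collections import Counter

def holdout_split(plots, fraction=0.10, seed=0):
    """Randomly set aside `fraction` of the plots before any modeling."""
    rng = random.Random(seed)
    shuffled = plots[:]
    rng.shuffle(shuffled)
    n_holdout = round(len(shuffled) * fraction)
    return shuffled[n_holdout:], shuffled[:n_holdout]

def agreement(pairs):
    """Summarize (reference_label, mapped_label) pairs.

    Returns the contingency table, the overall agreement, and the
    per-class agreement (share of reference plots of each class that
    the map labeled the same way).
    """
    table = Counter(pairs)
    total = sum(table.values())
    overall = sum(n for (ref, mapped), n in table.items() if ref == mapped) / total
    ref_totals = Counter()
    for (ref, _mapped), n in table.items():
        ref_totals[ref] += n
    per_class = {ref: table[(ref, ref)] / n for ref, n in ref_totals.items()}
    return table, overall, per_class

# Hypothetical plots: (plot_id, reference Group label).
plots = [(i, "G1" if i % 2 else "G2") for i in range(100)]
training, holdout = holdout_split(plots)

# Hypothetical held-out results as (reference, mapped) label pairs.
pairs = ([("G1", "G1")] * 4 + [("G1", "G2")]
         + [("G2", "G2")] * 3 + [("G2", "G1")] * 2)
table, overall, per_class = agreement(pairs)
```

Because a simple random holdout is not stratified by type or geography, the per-class values for rare Groups in a table like this would rest on very few plots, which is the limitation the text describes.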
To quantify the representation of plant communities, we started with the LANDFIRE Remap 2016 (LF 2.0.0) USNVC product for the conterminous U.S. [53] and combined these data with the most recent version of the rasterized Protected Areas of the U.S. (PAD-US Ver 2.1; Figure 2). Specifically, we used the ArcGIS Combine tool [54] to bring together the Group level map, the PAD-US management and GAP Status attributes, and a rasterized version of the Level II Environmental Protection Agency (EPA) Ecoregions [55]. We chose the EPA Ecoregions because LANDFIRE's prototyping work found that they provide ecologically meaningful boundaries that align with the distribution of natural vegetation types [48]. The resulting attribute table (Table 2) summarizes the unique combinations of Group, Management, GAP Status, and Ecoregion. Because the USNVC is truly hierarchical, we can use Table 2 to summarize information at coarser thematic resolutions of the classification (e.g., Macrogroup). The LANDFIRE snapshot of the USNVC descriptions dates to 2016, and we provide a linkage to the May 2021 version in Table S1.

Table 2. GAP Status Codes (USGS GAP 2020).

Status: Criteria and Examples
1: An area having permanent protection from conversion of natural land cover and a mandated management plan in operation to maintain a natural state within which disturbance events (of natural type, frequency, intensity, and legacy) are allowed to proceed without interference or are mimicked through management. Examples: National Parks, Wilderness Areas.
2: An area having permanent protection from conversion of natural land cover and a mandated management plan in operation to maintain a primarily natural state, but which may receive uses or management practices that degrade the quality of existing natural communities, including suppression of natural disturbance. Examples: National Wildlife Refuges, State Parks, The Nature Conservancy Preserves.
3: An area having permanent protection from conversion of natural land cover for the majority of the area, but subject to extractive uses of either a broad, low-intensity type (e.g., logging, Off Highway Vehicle recreation) or localized intense type (e.g., mining). It also confers protection to federally listed endangered and threatened species throughout the area. Examples: National Forests, BLM Lands, State Forests, some State Parks.
4: There are no known public or private institutional mandates or legally recognized easements or deed restrictions held by the managing entity to prevent conversion of natural habitat types to anthropogenic habitat types. The area generally allows conversion to unnatural land cover throughout or management intent is unknown. Examples: Unknown areas, private lands, developed or agriculture areas.

We assembled attributes from the spatial data in a coded workflow using Python (version 3.6.5) and the Python Data Analysis Library (pandas, version 1.1.2). The workflow is documented and available in a Jupyter notebook. We conducted analyses to quantify representation of plant communities throughout the conterminous U.S. Specifically, we compiled raster cell counts to produce the summaries reported in the sections that follow.
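The overlay-and-tally step described above can be sketched in plain Python. This is only an illustration under stated assumptions: the authors used the ArcGIS Combine tool and pandas, whereas the sketch below uses four tiny, invented, co-registered rasters (flattened to lists) and hypothetical names and codes (`combine`, `GROUP_TO_MACROGROUP`, the Group and manager labels).

```python
from collections import Counter

# Hypothetical co-registered rasters flattened to lists, one value per 30-m cell.
group   = ["G100", "G100", "G200", "G200", "G200", "G300"]
manager = ["NPS",  "BLM",  "BLM",  "PVT",  "PVT",  "USFS"]
status  = [1,      3,      3,      4,      4,      2]
ecoreg  = ["10.2", "10.2", "10.2", "10.2", "10.2", "6.2"]

def combine(*layers):
    """Count cells for every unique combination of values across layers."""
    if len({len(layer) for layer in layers}) != 1:
        raise ValueError("all layers must have the same number of cells")
    return Counter(zip(*layers))

# One entry per unique (Group, Management, GAP Status, Ecoregion) combination.
attribute_table = combine(group, manager, status, ecoreg)

# Because the classification is hierarchical, Group counts can be rolled up
# to a coarser level with a lookup (hypothetical Group -> Macrogroup mapping).
GROUP_TO_MACROGROUP = {"G100": "M10", "G200": "M10", "G300": "M20"}

macrogroup_cells = Counter()
for (grp, _mgr, _status, _eco), cells in attribute_table.items():
    macrogroup_cells[GROUP_TO_MACROGROUP[grp]] += cells

# Convert cell counts to area: one 30-m cell covers 900 m^2 = 0.0009 km^2.
CELL_AREA_KM2 = 30 * 30 / 1_000_000
macrogroup_km2 = {m: n * CELL_AREA_KM2 for m, n in macrogroup_cells.items()}
```

Keeping the table at the finest combination of attributes means every later summary (by Group, by ecoregion, by manager, or by a coarser USNVC level) is a re-aggregation of the same counts rather than a new overlay.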

Representation of Natural Groups in the Conservation Network
Representation of the individual natural USNVC Groups within the GAP Status 1 and 2 lands varies greatly, from <1 to 90 percent (Figure 3; Table S2). When multiple use lands (Status 3) are included in the calculation, there are a few Groups with greater than 90 percent representation. The Groups in the Desert and Semi-Desert USNVC Class have the lowest mean (14%) representation in Status 1 and 2 lands, and the Groups in the Polar and High Montane Scrub, Grassland & Barrens Class have the highest (75%). Mean representation for Forest and Woodland Groups is 16%, Shrub and Herb Groups is 18%, and Open Rock Vegetation is 25%. Of the 265 Groups representing natural vegetation, 159 Groups had less than 17% of their distribution on Status 1 and 2 lands, and 216 Groups had less than 30% in protection. If multiple use lands are included, 61 Groups still have less than 17% and 116 Groups have less than 30% of their mapped distribution within the Status 1, 2 & 3 lands.
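The representation figures above reduce to simple arithmetic on cell counts: the percentage of a Group's mapped cells that fall on lands of a given GAP Status, compared against the 17% and 30% targets. A minimal sketch follows, using invented counts and hypothetical Group codes and function names rather than values from the study.

```python
# Hypothetical cell counts per Group, keyed by GAP Status (1-4).
cells_by_status = {
    "G100": {1: 50, 2: 30, 3: 120, 4: 800},    # 8% in Status 1 & 2
    "G200": {1: 200, 2: 50, 3: 250, 4: 500},   # 25% in Status 1 & 2
    "G300": {1: 400, 2: 100, 3: 300, 4: 200},  # 50% in Status 1 & 2
}

def percent_represented(counts, statuses):
    """Percent of a Group's mapped cells on lands with the given statuses."""
    total = sum(counts.values())
    protected = sum(counts.get(s, 0) for s in statuses)
    return 100.0 * protected / total

def groups_below(target, statuses):
    """Groups whose representation is below `target` percent."""
    return sorted(
        g for g, counts in cells_by_status.items()
        if percent_represented(counts, statuses) < target
    )

below_17 = groups_below(17, statuses=(1, 2))
below_30 = groups_below(30, statuses=(1, 2))
below_30_multiple_use = groups_below(30, statuses=(1, 2, 3))
```

In this toy data, adding Status 3 lands moves one Group above the 30% line, which mirrors the direction of the change reported when multiple use lands are included.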

Spatial Patterns of Protection for USNVC Groups
The spatial patterns of representation of the mapped Groups in GAP Status 1 & 2, and 1, 2, & 3 lands are depicted in Figure 4. Areas of the Midwest dominated by human land use, shown in gray, are not included in our assessment. The ruderal and plantation types are mapped in pink. Notable locations of relatively high representation include the Everglades, the Warm Deserts, and portions of the Western Cordillera (Figure 4A). When the multiple use lands are included, the pattern changes in the West and in the Mixed Wood Shield of the upper Midwest, where representation is now much more extensive (Figure 4B). Ruderal vegetation is concentrated in the Mediterranean California, the Cold Deserts, and South and Central Semi-Arid Plains Ecoregions. Ruderal and plantation vegetation is distributed more evenly throughout the Southeastern Plains, the South-Central Arid Prairies, and Temperate Prairies. Examples of some of the more extensive ruderal types are given in Table S2. Eastern North American Temperate Forest Plantation is mapped extensively throughout the Southeastern USA Plains.

[...] Desert vegetation, and the U.S. Forest Service manages nearly 500,000 km² of Forest and Woodland vegetation. The majority of BLM and Forest Service lands are managed for multiple use. State agencies are responsible for managing over 145,000 km² of Forest and Woodland, nearly 60,000 km² of Shrub and Herb Vegetation, and nearly 46,000 km² of Desert and Semi-Desert Vegetation.

Representation of USNVC Groups by Level II Ecoregion
When stratified by the intermediate ecoregion level, the majority of Groups have less than 17% of their mapped distribution in GAP Status 1 & 2 lands (Figure 6, Table S1). The ecoregions with the highest number of Groups with very low representation are the South Central Semi-Arid Prairies and the Tamaulipas-Texas Semi-Arid Plain, where 63 of 116 and 37 of 39 Groups, respectively, have <1% representation. The Warm Deserts and Western Cordillera ecoregions have the most Groups with over 30% representation (35 of 76 and 25 of 124 Groups, respectively). In 16 of the 20 ecoregions, the majority of plant communities are under-represented at the 17% threshold. The Mixed Wood Shield, Everglades, Mississippi Alluvial and Southeast Coastal Plains, and Warm Deserts Ecoregions are the exceptions.
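Stratifying by ecoregion amounts to computing representation separately for each ecoregion and Group pair, then counting Groups per representation class within each ecoregion. The sketch below shows one way to do that with the standard library. The rows, Group codes, and cell counts are invented for illustration and do not come from Table S1.

```python
from collections import defaultdict

# Hypothetical rows of the combined table:
# (ecoregion, Group, GAP Status, cell count)
rows = [
    ("Warm Deserts", "G100", 1, 400), ("Warm Deserts", "G100", 4, 600),
    ("Warm Deserts", "G200", 2, 5),   ("Warm Deserts", "G200", 4, 995),
    ("Temperate Prairies", "G100", 3, 100), ("Temperate Prairies", "G100", 4, 900),
    ("Temperate Prairies", "G300", 2, 2),   ("Temperate Prairies", "G300", 4, 998),
]

total = defaultdict(int)      # cells per (ecoregion, Group)
protected = defaultdict(int)  # Status 1 & 2 cells per (ecoregion, Group)
for eco, grp, status, cells in rows:
    total[(eco, grp)] += cells
    if status in (1, 2):
        protected[(eco, grp)] += cells

# Tally, per ecoregion, how many Groups fall in each representation class.
summary = defaultdict(lambda: {"groups": 0, "<1%": 0, "<17%": 0, ">30%": 0})
for (eco, grp), n in total.items():
    pct = 100.0 * protected[(eco, grp)] / n
    s = summary[eco]
    s["groups"] += 1
    s["<1%"] += pct < 1
    s["<17%"] += pct < 17
    s[">30%"] += pct > 30

# Ecoregions where most Groups are under-represented at the 17% threshold.
majority_under = sorted(
    eco for eco, s in summary.items() if s["<17%"] > s["groups"] / 2
)
```

Note that the same Group can be well represented in one ecoregion and poorly represented in another, which is why the per-ecoregion counts of Groups do not sum to the national total of 265.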

Distribution of GAP Status Designation by Level II Ecoregion
Four of the twenty ecoregions meet the 17% target for protection (Status 1 & 2), specifically the Everglades, Mixed Wood Shield, Warm Deserts and Western Cordillera (Figure 7). None meet a 30% threshold. The Central USA Plains and Southeastern USA Plains have little protection and are dominated by intensive land use, whereas the South Central Semi-Arid Prairies and Tamaulipas-Texas Semi-Arid Plain Ecoregions have little protection but extensive non-converted vegetation. The percentage of Status 4 lands with non-converted vegetation varies greatly across ecoregions, from a low of 13% in the Central USA Plains to as much as 79% in the Tamaulipas-Texas Semi-Arid Plain Ecoregion.
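Both ecoregion-level measures in this section are ratios of cell counts: the share of an ecoregion in Status 1 & 2 lands, and the share of its Status 4 lands that still carry non-converted vegetation. The following sketch assumes that "converted" means developed or agricultural land cover, which is our reading of the text rather than a stated definition. The ecoregion names, function names, and counts are hypothetical.

```python
# Hypothetical cell counts per ecoregion: {(GAP Status, converted?): cells}.
# "converted" is assumed here to mean developed or agricultural land cover.
ecoregions = {
    "Ecoregion A": {(1, False): 150, (2, False): 50, (3, False): 100,
                    (4, False): 140, (4, True): 560},
    "Ecoregion B": {(1, False): 10, (2, False): 20, (3, False): 70,
                    (4, False): 720, (4, True): 180},
}

def status_share(counts, statuses):
    """Percent of an ecoregion's cells on lands with the given GAP statuses."""
    total = sum(counts.values())
    selected = sum(n for (s, _conv), n in counts.items() if s in statuses)
    return 100.0 * selected / total

def unconverted_share_of_status4(counts):
    """Percent of Status 4 cells that still carry non-converted vegetation."""
    status4 = sum(n for (s, _conv), n in counts.items() if s == 4)
    return 100.0 * counts.get((4, False), 0) / status4

# Ecoregions meeting the 17% target on Status 1 & 2 lands.
meets_17 = [e for e, c in ecoregions.items() if status_share(c, (1, 2)) >= 17]
```

In this toy data, "Ecoregion B" has little protection but a large share of non-converted vegetation on Status 4 lands, the kind of combination the text identifies as a potential conservation opportunity.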

Discussion
We chose the U.S. National Vegetation Classification to conduct our GAP analyses of ecosystem protection because it represents a national standard classification [41] and provides a common language for the natural resource agencies to communicate about management. In addition, it provides a hierarchical structure that links to global classification systems [56] and provides for meaningful scaling of classification units to meet the natural resource management needs of the end-users [34]. For example, the Groups represented in LANDFIRE Remap could be generalized to Macrogroups for an analysis at continental scales. At the same time, agencies such as the National Park Service have been collecting data, classifying, and mapping vegetation on their lands at the finest levels of the USNVC classification, the Alliance and Association [57]. The hierarchy allows for meaningful scaling of the resolution while maintaining the important ecological context necessary for management decisions.
This analysis is based on three nationally consistent datasets (LANDFIRE Existing Vegetation, Protected Areas Database of the U.S. and the USNVC) and could provide the base for monitoring success in meeting conservation targets through time. The three datasets used in this analysis are the culmination of years of data development, expert review and refinement of the methods, and they represent the best available information at a national extent for each of the themes (plant communities, protected areas and vegetation classification). However, no dataset is without error, so these analyses should be treated as a general guide for identifying specific conservation actions. In addition, not all USNVC Groups were mappable using LANDFIRE methods, and additional methods are needed to ensure their representation. As we learn more about novel and previously undescribed plant communities, we will need to incorporate that knowledge in future assessments. Although there are hundreds of thousands of plots describing vegetation, those data are generally not the result of a systematic sampling and therefore many vegetation types and geographies are under-surveyed. Ideally those gaps in the data will be filled through a nationally coordinated effort and through continued collaborations with ecologists across the U.S.
Finally, we note, as Scott et al. [24] did, that a Gap Analysis is not a substitute for a thorough ecological inventory and assessment of the nation's ecosystems. Understanding the drivers that affect the ecological condition of ecosystems, both within and beyond the protected area boundaries, will be essential to ensuring the persistence of natural ecosystems in these protected areas.
While the analysis is relatively straightforward, it does identify specific plant communities and ecoregions that are currently under-represented in the conservation network. We do not attempt to prescribe methods for expanding the network, although our analysis does provide some of the potential targets for those decisions. Criteria on threats and intactness [58,59] or habitats for potentially rare species [25] or climate refugia [20] could be used in combination with this analysis to refine the list of proposed actions while at the same time making sure the actions could benefit biodiversity throughout the U.S. In addition, our assessment treats all ecosystems as of equal interest; it may also be helpful to evaluate the protection status of those ecosystems most at risk [60].

Conclusions
While the raw percentage of land in the conservation network could be used as a metric of success, we chose to use plant communities at a level of thematic resolution that would ensure regional stratification of protection: the USNVC Groups. By capturing the full range of natural vegetation at the Group level in the conservation network, we are more likely to conserve the full range of biogeographic variation and to provide for conserving a full range of habitats across the country. Relative to the Aichi target of 17%, our analysis indicates that the majority of natural plant communities in the U.S. are currently underrepresented on Status 1 & 2 lands. As with other studies in the U.S., we show there is a spatial bias in the distribution of the protected lands [25,29,61], and therefore variation in the level of protection for a regionally specific element of biodiversity is high. Groups within the Polar and High Montane Scrub, Grassland and Barrens Class are well represented in the conservation network, while the mean representation for Groups in the Forest and Woodland and Desert and Semi-Desert Vegetation falls below the 17% threshold. The means for the Groups in the Shrub and Herb Vegetation and Open Rock Vegetation are slightly above the Aichi target but below 30%.
At the ecoregional extent, the majority of natural USNVC Groups are underrepresented. Our analysis shows that while a few ecoregions have low potential for adding to representation, there are ecoregions where the extent and distribution of natural, ruderal and plantation vegetation represent opportunities for increasing representation.