US EPA EnviroAtlas Meter-Scale Urban Land Cover (MULC): 1-m Pixel Land Cover Class Definitions and Guidance

This article defines the land cover classes used in Meter-scale Urban Land Cover (MULC), a unique, high resolution (one square meter per pixel) land cover dataset developed for 30 US communities for the United States Environmental Protection Agency (US EPA) EnviroAtlas. MULC data categorize the landscape into these land cover classes: impervious surface, tree, grass-herbaceous, shrub, soil-barren, water, wetland and agriculture. MULC data are used to calculate approximately 100 EnviroAtlas metrics that serve as indicators of nature’s benefits (ecosystem goods and services). MULC, a dataset for which development is ongoing, is produced by multiple classification methods using aerial photo and LiDAR datasets. The mean overall fuzzy accuracy across the EnviroAtlas communities is 88% and mean Kappa coefficient is 0.84. MULC is available in EnviroAtlas via web browser, web map service (WMS) in the user’s geographic information system (GIS), and as downloadable data at EPA Environmental Data Gateway. Fact Sheets and metadata for each MULC Community are available through EnviroAtlas. Some MULC applications include mapping green and grey infrastructure, connecting land cover with socioeconomic/demographic variables, street tree planting, urban heat island analysis, mosquito habitat risk mapping and bikeway planning. This article provides practical guidance for using MULC effectively and developing similar high resolution (HR) land cover data.


Introduction
Land cover (LC) data indicate the type, extent and configuration of the physical materials present at earth's surface (e.g., vegetation, built surfaces) and are essential to informed, effective stewardship of community landscapes, supporting decision making that integrates ecological, social, and economic factors. Toward this integration, the United States Environmental Protection Agency (US EPA) created EnviroAtlas (www.epa.gov/enviroatlas), a collection of interactive geospatial tools and resources that allows users to explore the many benefits people receive from nature, often referred to as ecosystem goods and services (EGS) [1]. Key components of EnviroAtlas are a multi-scaled Interactive Map, which provides easy access to EnviroAtlas data, the Eco-Health Relationship Browser, which shows linkages between ecosystems, the services they provide, and human health [2], and ecosystem services information and educational resources, including a range of lesson plans that educators may integrate into classrooms.
EnviroAtlas is organized at two spatial scales. A coarser national-scale component spans the conterminous US and builds on the US National Land Cover Dataset (NLCD) [3] with a 30×30 m pixel resolution. For a finer community-scale component, the EnviroAtlas team has developed Meter-scale Urban Land Cover (MULC) at 1×1 m per pixel resolution, to support analysis and visualization of ecosystem services at a fine spatial resolution that captures individual trees, buildings and roads (Figures 1 and 2). The term "meter-scale" indicates the general size range of the smallest identifiable features on the ground. This corresponds to objects approximately one to four meters in size. The size of the smallest detectable features varies, depending largely on the spectral and spatial contrast of the target against its background. Image quality, date and atmospheric conditions are also factors. Similar high spatial resolution (HR) land cover (LC) data products have been developed by other groups, translated to MULC, and incorporated into EnviroAtlas. These external sources are the University of Vermont Spatial Analysis Lab, Sonoma Veg Map, the State of Iowa, Chesapeake Conservancy, Central Arizona-Phoenix (CAP LTER), Oneida Total Integrated Enterprises (OTIE), University of Arkansas Center for Advanced Spatial Technologies, and the Missouri Resource Assessment Partnership (MoRAP). After external LC data are translated to the MULC system, such data are considered equivalent to MULC, and are accompanied by the full suite of EnviroAtlas community EGS metrics. External land cover data sources, and how those data are translated to MULC, are specified in metadata for each MULC community.
As of 2010, approximately 81 percent of the United States (US) population lived in "urban areas" (US Census terminology for communities with population > 2500) [4]. Expanding urbanization is one motivation for developing high spatial resolution urban LC data for EnviroAtlas Communities. By modelling community landscapes at the fine MULC spatial scale of individual streets, buildings, trees, and lawns, we are better able to quantify landscape properties and patterns that contribute to human well-being and healthy urban ecosystems (Figures 1 and 2) and these EGS may then be better represented in making community decisions and policy. Potential MULC users include planning, commerce, transportation, recreation and public health authorities; water, wildlife and natural resource managers; community decision makers, teachers, students and citizens.
The purpose of this paper is to define the EnviroAtlas MULC land cover classes, describe the processes used to generate MULC, and provide guidance to support the most effective use of MULC data. In the Materials and Methods section, we present the MULC design, aerial imagery and LiDAR data specifications, image classification methods and a fuzzy accuracy assessment method. Next, we define the MULC classes and their characteristics. The Results summarize statistics for 30 US EnviroAtlas communities. The Discussion highlights some MULC applications and practical guidance for interpreting MULC data.

Materials and Methods
The MULC classes are intended to represent common urban landscape composition and features that can be reliably identified in 1×1 m pixel, visible and near-infrared digital aerial photography by human aerial photo interpreters and by computer image classification algorithms. MULC classification design considerations include:
• encompass the LC types anticipated in US community landscapes;
• use LC classes that are broadly recognized and understood by users;
• simplicity;
• the size of discrete landscape objects and features readily classifiable in single date, 1×1 meter pixel imagery;
• minimal confusion between classes;
• broad range of potential applications.
[5]. An additional 1 km buffer extends outward to eliminate potential edge effects when calculating EGS metrics based on moving-window analyses. In three cases we have used county boundaries for data provided by external partners (Chicago, IL [comprising 10 counties], Los Angeles, CA, and Sonoma County, CA). The county is a convenient geographic unit; it typically leverages coordinated geospatial, financial, and administrative resources.

Input data
The input raster data stack for EPA-developed MULC typically consists of four-band aerial photography, Normalized Difference Vegetation Index (NDVI), and LiDAR data (height above ground and intensity layers) (Figure 3). Ancillary geospatial data layers (Table 1) are used as available, advantageous, and appropriate to overlay agriculture and wetlands on the classified product, and for performing post-classification error correction.
Imagery from the United States Department of Agriculture (USDA) National Agriculture Imagery Program (NAIP) [6] is the primary MULC aerial photography. It has multiple advantages:
• high spatial resolution: 1×1 m pixels (and finer in some recent imagery);
• free (no cost) and available for most of the US;
• updates every two to three years by state;
• adequate horizontal positional accuracy (<= 6 m by specification) (in our experience, NAIP is co-registered within about two meters of other HR image sources such as Google, Bing and ESRI);
• four spectral bands: three visible light bands (blue, green, red) and one near-infrared (NIR) band, which is used to derive a normalized difference vegetation index (NDVI).
NAIP imagery is acquired via internet download or external hard drives from the USDA Aerial Photography Field Office (APFO) or State sources. The standard data format is uncompressed 8 bit GeoTIFF with uncalibrated radiance represented by 256 grayscale levels in each band. Uncompressed data are used to retain maximum spatial and spectral fidelity needed in classification. NAIP images are typically tiled and distributed using the United States Geological Survey (USGS) 7.5 min quarter quadrangle topographic map series. A few MULC datasets adapted from external sources may use other HR aerial imagery as indicated in metadata.
While NAIP imagery is available across the entire conterminous US, airborne LiDAR data are not. We acquire LiDAR data as available from the USGS National Map [7], NOAA [8], and state and county geospatial data portals and personnel. LiDAR point clouds are interpolated into rasters of the following layers: digital elevation model (DEM) (all bare earth, ground points), digital surface model (DSM) (first point returns), height above ground (HAG) (also referred to as Normalized DSM, or nDSM; HAG = DSM - DEM) and return pulse intensity. (Note: five initial EnviroAtlas communities, namely Durham, NC; New Bedford, MA; Paterson, NJ; Portland, ME; and Tampa Bay, FL, were produced without LiDAR.)
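The two derived input layers named above can be illustrated with a minimal sketch. It assumes the bands and LiDAR surfaces are already co-registered NumPy arrays; the function names, the toy values, the zero fill for pixels where both bands are zero, and the clipping of negative heights are illustrative assumptions, not part of the MULC workflow. NDVI uses the standard definition (NIR - Red)/(NIR + Red).

```python
import numpy as np

def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    nir = nir.astype(np.float64)   # cast first: 8-bit values would overflow
    red = red.astype(np.float64)
    total = nir + red
    # Assumption: pixels where both bands are 0 (e.g., no-data) get NDVI = 0.
    return np.divide(nir - red, total, out=np.zeros_like(total), where=total != 0)

def height_above_ground(dsm, dem):
    """HAG (nDSM) = DSM - DEM. Clipping negatives to 0 is an assumption."""
    return np.clip(dsm - dem, 0.0, None)

# Hypothetical 2x2 rasters: 8-bit NAIP digital numbers and elevations in meters.
nir = np.array([[200, 60], [0, 120]], dtype=np.uint8)
red = np.array([[50, 60], [0, 40]], dtype=np.uint8)
dsm = np.array([[112.0, 100.5], [99.8, 103.0]])
dem = np.array([[100.0, 100.5], [100.0, 101.0]])

ndvi_layer = ndvi(nir, red)                 # 0.6, 0.0, 0.0, 0.5
hag_layer = height_above_ground(dsm, dem)   # 12.0, 0.0, 0.0, 2.0
```

Because NAIP digital numbers are uncalibrated, NDVI computed this way is a relative index that is not directly comparable between acquisitions.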

Image classification
EPA-developed MULC data are produced by classifying a raster dataset composed of NAIP aerial photos, NDVI and LiDAR HAG and intensity data (Figure 3). Externally developed MULC datasets are classified from similar raster layers as specified in metadata. We have used three different classification approaches: pixel-based supervised, object-based supervised, and object rule-based. The approach used in each community is described in the metadata. Figure 4 shows the overall workflow for producing MULC data. Pixel-based and object-based classification methods are discussed in [9]. For MULC datasets created by supervised classification methods, training samples are selected from within the community boundary being mapped.
The segmentation algorithms used in our object-based classification vary according to the software used: ArcGIS Desktop (v10.x) [10] and ArcGIS Pro (v2.x) (Segment Mean Shift) [11], ENVI (v5.x) (Watershed) [12], and eCognition (v9.x) (Multiresolution, Contrast Split) [13]. The analyst prepares to classify by studying existing land cover information to understand local vegetation, conditions and landscapes. NLCD and USDA Crop Land Data Layer [14] are particularly useful, in combination with HR imagery such as Google/Bing/ESRI satellite view, NAIP NIR, Google Street View and Bing Birdseye view.

Post-classification operations
The MULC data are reviewed after classification and errors are addressed in two ways. First, we perform as many edits as possible using GIS functions (e.g., conditional statements, convolution filtering). Ancillary spatial layers such as roads and building footprints are useful to mask and focus edits. We inspect each output layer to detect potential artifacts introduced by post-classification GIS functions. Second, we perform manual editing (on-screen, heads-up digitizing) to identify and recode remaining errors. Here the analyst interactively selects pixel groups (or polygons) for recoding from the incorrect to correct class. Manual editing is labor intensive and time consuming but can substantially improve the visual appearance.
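As an illustration of the first, automated kind of edit, the sketch below applies a 3×3 majority (focal mode) filter that removes isolated single-pixel "speckle". The text does not say which filters were used, so the choice of a majority filter, the window size, the edge padding and the tie-breaking rule (lowest class code wins) are all assumptions made for this example.

```python
import numpy as np

def majority_filter_3x3(classes):
    """Replace each pixel with the most frequent class in its 3x3 neighborhood."""
    rows, cols = classes.shape
    # Repeat edge pixels so border windows are also 3x3 (an assumption).
    padded = np.pad(classes, 1, mode="edge")
    # Nine shifted views of the raster, one per window position: (9, rows, cols).
    windows = np.stack([
        padded[dr:dr + rows, dc:dc + cols]
        for dr in range(3) for dc in range(3)
    ])
    labels = np.unique(classes)
    # Votes per class label in every window: (n_labels, rows, cols).
    votes = np.stack([(windows == label).sum(axis=0) for label in labels])
    # argmax returns the first maximum, so ties go to the lowest class code.
    return labels[votes.argmax(axis=0)]

# Hypothetical 5x5 patch of grass (70) with one stray impervious pixel (20).
land_cover = np.full((5, 5), 70)
land_cover[2, 2] = 20
smoothed = majority_filter_3x3(land_cover)
```

In this toy patch the stray pixel is recoded to grass. On real data such a filter can also erase legitimate small features, which is one reason each output layer is inspected for artifacts.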

Fuzzy accuracy assessment
We use a fuzzy approach [15] to assess the accuracy of the MULC classification. The motivation is to better accommodate the nonexclusive nature of land cover class membership: "The need for using fuzzy sets arises from the observation that all map locations do not fit unambiguously in a single map category. Fuzzy sets allow for varying levels of set membership for multiple map categories. A linguistic measurement scale allows the kinds of comments commonly made during map evaluations to be used to quantify map accuracy" [15].
An assessment analyst labels the land cover at each reference point (pixel) and assigns a fuzzy confidence value to the label on a scale from 1 (incorrect) to 5 (correct). For example, tree canopy over grass is a situation where both classes could be considered "correct". The sensor may capture both the canopy and the ground through thin canopy or canopy gaps. The analyst might assign a tree label with confidence of 4, and a grass label with a confidence of 3. Another situation is accommodating the continuum between grass and soil endmembers. Similarly, the fuzzy approach allows both agriculture and soil class labels to be considered correct for a barren crop field. Accuracy assessment results are presented in two confusion (error) matrices, each reporting producer's accuracy (the complement of omission error) and user's accuracy (the complement of commission error) for each class as well as an overall accuracy value. The fuzzy confusion matrix is less conservative and based on these fuzzy confidence evaluations; the non-fuzzy confusion matrix is more conservative and based on strict binary correct/incorrect class membership.
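The difference between the two scoring rules can be sketched as follows. The record layout, the example ratings and, in particular, the confidence threshold of 3 used to count a mapped class as fuzzy-correct are assumptions made for illustration; the text does not state the threshold applied in MULC assessments.

```python
# Each reference point holds the mapped MULC class and the analyst's
# confidence (1-5) for every candidate label (hypothetical records).
points = [
    {"mapped": "tree",  "ratings": {"tree": 4, "grass": 3}},
    {"mapped": "grass", "ratings": {"tree": 4, "grass": 3}},
    {"mapped": "soil",  "ratings": {"agriculture": 5, "soil": 4}},
    {"mapped": "water", "ratings": {"impervious": 5, "water": 1}},
]

FUZZY_THRESHOLD = 3  # assumption: ratings >= 3 count as acceptable

def strict_match(point):
    """Non-fuzzy rule: the mapped class must be the highest-rated label."""
    best_label = max(point["ratings"], key=point["ratings"].get)
    return point["mapped"] == best_label

def fuzzy_match(point, threshold=FUZZY_THRESHOLD):
    """Fuzzy rule: the mapped class only needs an acceptable rating."""
    return point["ratings"].get(point["mapped"], 1) >= threshold

strict_accuracy = sum(strict_match(p) for p in points) / len(points)  # 0.25
fuzzy_accuracy = sum(fuzzy_match(p) for p in points) / len(points)    # 0.75
```

Under these assumptions every strict match is also a fuzzy match, so fuzzy accuracy can never be lower than non-fuzzy accuracy for the same reference points.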
The MULC classification is compared to 500 to 700 randomly distributed photo-interpreted reference points (i.e., an initial target of 100 reference points per class, 5 to 7 classes per community). If rare classes (e.g., soil, water) are undersampled (n<50), additional reference points are collected to reach n>=50, stratified by class as indicated by the MULC classification. The NAIP imagery input to the classification serves as the primary photo-interpreted reference imagery. This assures spatial and temporal correspondence of the reference imagery and the MULC classification. Uninterpretable or ambiguous points may be removed from consideration (e.g., deep shadow or boundary between classes). Photo interpretation is aided by spatially linked displays of LiDAR-derived layers, NIR false color composite and other temporally appropriate high resolution imagery as noted above. Wetlands classes (woody and emergent) are not included in the accuracy assessment. Because remote identification of wetlands is complex and beyond the scope of our study, we assume that the ancillary wetlands data are reliable. However, reference pixels located in wetlands areas are assessed in terms of their non-wetlands, underlying MULC class.
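The top-up rule for rare classes amounts to simple bookkeeping, sketched below. The function name and the per-class counts are hypothetical; only the minimum of 50 points per class comes from the text.

```python
from collections import Counter

MIN_PER_CLASS = 50  # minimum reference points per class, as stated in the text

def additional_points_needed(mapped_labels, classes, minimum=MIN_PER_CLASS):
    """Return how many extra reference points each class needs to reach the minimum."""
    counts = Counter(mapped_labels)
    return {name: max(0, minimum - counts.get(name, 0)) for name in classes}

# Hypothetical first draw of 600 random points, labeled by their MULC class.
labels = (["impervious"] * 180 + ["tree"] * 210 + ["grass"] * 150
          + ["soil"] * 22 + ["water"] * 38)
shortfall = additional_points_needed(
    labels, ["impervious", "tree", "grass", "soil", "water"]
)
# shortfall == {'impervious': 0, 'tree': 0, 'grass': 0, 'soil': 28, 'water': 12}
```

The extra points would then be drawn at random from within the mapped extent of each undersampled class.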
The final quality assurance step is on-screen visual assessment of the classified MULC by multiple analysts at scales from 1:50,000 to 1:5,000. Known errors and uncertainties are described in the metadata for each community.

Definitions of MULC Classes
The standard EnviroAtlas MULC product is provided at a "Level 1" thematic resolution and is similar but not identical to the Anderson and NLCD Level 1 classes (Table 2) [3,16]. MULC data are published at Level 1; a structure of Level 2 classes is provided below in anticipation of potential future analyses requiring greater thematic specificity.
As discussed above, data are either created by EPA EnviroAtlas personnel or incorporated from external non-EPA sources. Externally produced data must meet these criteria:

• The classes can be unambiguously translated to the MULC system;
• The data are at the same or finer spatial resolution;
• The data are sufficiently contemporaneous with the EnviroAtlas period of study;
• The data have an overall target fuzzy accuracy >= 80%. (We perform the standard MULC accuracy assessment on externally developed LC data.)
To the first point, a dataset acquired from external sources that contains separate Building and Street classes can be unambiguously recoded into the MULC impervious surface class. However, a hypothetical Residential class defined as "50% impervious surface and 50% vegetation" cannot be used because impervious, trees, shrubs and grass are inseparably combined into a single class and cannot be unmixed.
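The recoding described in the first criterion can be sketched as a lookup ("crosswalk") from external codes to MULC codes. The external legend codes (1 = Building, 2 = Street, 9 = mixed Residential) and the function names are hypothetical; the MULC codes 20 (impervious) and 00 (unclassified) are those given in this article.

```python
import numpy as np

IMPERVIOUS = 20    # MULC Level 1 impervious surface
UNCLASSIFIED = 0   # MULC unclassified (00)

# Hypothetical external legend: 1 = Building, 2 = Street.
CROSSWALK = {1: IMPERVIOUS, 2: IMPERVIOUS}

def translate_to_mulc(external, crosswalk, fill=UNCLASSIFIED):
    """Recode an external class raster to MULC codes; unmapped codes get `fill`."""
    out = np.full(external.shape, fill, dtype=np.uint8)
    for source_code, mulc_code in crosswalk.items():
        out[external == source_code] = mulc_code
    return out

def untranslatable_codes(external, crosswalk):
    """List external codes that have no unambiguous MULC equivalent."""
    return sorted(set(np.unique(external).tolist()) - set(crosswalk))

# Code 9 stands for a mixed "Residential" class that cannot be unmixed.
external = np.array([[1, 2], [2, 9]])
mulc = translate_to_mulc(external, CROSSWALK)      # [[20, 20], [20, 0]]
problems = untranslatable_codes(external, CROSSWALK)  # [9]
```

Checking for untranslatable codes before recoding makes a failure of the first criterion explicit instead of silently leaving unclassified pixels.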

Unclassified
The Unclassified class (00) is available for special cases or unanticipated LC classes not present in the existing MULC system.

Water
The water class includes all natural and some anthropogenic surface waters: rivers, streams, canals, ponds, natural lakes, artificial lakes, dammed valley reservoirs, bays, estuaries and near-shore coastal waters. Note that wastewater treatment tanks, clarifiers, basins and sumps are labeled impervious surface, as are swimming pools, fountains and similar small anthropogenic water features. This distinction is based on their ecosystem services, which differ markedly from those of the surface waters listed above: they are not biologically active (wastewater treatment notwithstanding); they are closed systems without natural surface water exchange with the environment; and they are constructed features. The water class is most commonly confused with shadow, tree and dark impervious surface. Bright sun glint on water is confused with highly reflective classes such as soil or impervious surface. Turbid, sediment-laden brown or tan water is confused with soil. Shallow water is confused with soil, impervious and vegetation depending on the optical properties of the bottom substrate (e.g., sand, silt, rock, submerged vegetation). Water with floating vegetation may be misclassified as vegetation but is intended to be in the water class. Floating vegetation is assumed to be ephemeral, so the LC at such a point is better represented as water than as vegetation.
Lakes, ponds, tidal zones, estuaries and other water bodies that vary in extent and shoreline location over time are mapped according to how they appear in the imagery; i.e., at the date and time of image acquisition. If circumstances favor using a different shoreline (e.g., authoritative NOAA shoreline) this is indicated in the metadata.

Impervious Surface
An impervious surface prevents or substantially limits rainfall and other water from infiltrating into the soil. The impervious class includes paved roads, parking lots, driveways, sidewalks, roofs, swimming pools, patios, painted surfaces, wooden structures and most asphalt, concrete and paved surfaces. In MULC, dirt roads, gravel roads and railways are classified as impervious. These areas are compacted, disturbed and altered leading to a loss of perviousness. Except for bare rock, most impervious surfaces are anthropogenic and most pervious surfaces are natural (e.g., vegetation, soil). Bare rock is functionally impervious and is commonly confused spectrally with the impervious class, but in MULC it is assigned to the soil and barren class. Rooftops and roads that incorporate sand and clay materials are spectrally confused with soil but belong in impervious.
Level 1 MULC combines roads/pavements and buildings into one impervious class (20), rather than separate Level 2 roads (23) and buildings (24) because of the requirement for height information to classify buildings. The original MULC classes are designed to be classified from NAIP imagery, with or without LiDAR, because of patchy LiDAR availability. If height above ground or building footprints are available, one can separate roads and buildings. Solar panel farms (class 27) are a separate impervious Level 2 class. They represent a third type of impervious built surface after pavements and buildings/rooftops. Solar panels present an interesting EGS case in that biological functions continue beneath the artificial canopy. The panels provide shade and collect and distribute rain preferentially.

Soil-Barren
The soil and barren class ("soil") includes soil, bare rock, mud, clay, sand, barren (fallow) agricultural fields, construction sites, quarries, gravel pits, mine lands, recreational areas, golf course sand traps, ball parks, playgrounds, stream and river sand bars, sand dunes, beaches and other bare soil, sand, gravel and rock surfaces. Soil and barren includes natural areas with widely spaced or no vegetation cover, including the soil substrate of semiarid and arid rangeland, shrubland and desert. Unpaved dirt roads, gravel roads, and railways are typically semi-impervious, and are assigned to the impervious class unless otherwise noted.
Soil is a relatively rare class in humid temperate communities such as Milwaukee, WI, Pittsburgh, PA and Portland, ME. Soil is more common in arid communities such as Phoenix, AZ and Fresno, CA. Construction sites are a common soil surface in highly developed urban landscapes, and barren agricultural fields on the periphery. Soil is commonly confused with light impervious surfaces.

Tree
The tree class includes trees of any kind, from a single individual to continuous canopy forest. Trees are single stem woody perennial plants with a trunk, branches and leaves and height greater than 2 m. Signature characteristics of the tree class in NAIP imagery include greenness, high NIR reflectance, NDVI, a mottled textured canopy, tree crowns illuminated and shadowed on opposite facets, visible trunks, length of shadows and context. Signature characteristics of the tree class in LiDAR include height above ground, intensity, object shape, multiple LiDAR returns and canopy surface texture.
Level 1 MULC combines deciduous and evergreen trees in one tree class. Shrubs greater than two meters height are classified as tree unless otherwise indicated. Bamboo is botanically a grass (family Poaceae) but is classified as tree here if height >= 2 m. Tree is most commonly confused with water, dark impervious, shrub and grass.
Tree canopy pixels that extend over other LC surfaces such as streets, buildings and lawns are assigned to tree rather than the underlying class. The tree canopy is what the sensor "sees" in its direct line of sight. This convention reflects an EnviroAtlas emphasis on EGS and the importance of street trees in urban areas. Thus, where trees extend over a road, driveway, sidewalk or rooftop, the amount of underlying impervious surface (or grass, soil or water) will be underestimated. The horizontal surface area of tree canopy will be correctly estimated. If accurate road and building footprint data are available, one may compute the under-canopy extent of these obscured surfaces.
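Where footprint data are available, the under-canopy adjustment mentioned above can be sketched as a raster overlay. The tree code 40 is not stated in this article and is assumed here, as are the function name, the rasterized footprint mask and the toy values; the impervious code 20 and the 1 m² pixel area come from the text.

```python
import numpy as np

TREE = 40          # assumption: Level 1 tree code (not given in the text)
IMPERVIOUS = 20    # MULC Level 1 impervious surface
PIXEL_AREA_M2 = 1.0

def impervious_area_m2(mulc, footprint_mask, pixel_area_m2=PIXEL_AREA_M2):
    """Split impervious area into visible and tree-obscured parts.

    footprint_mask is True where road or building footprints are known to exist.
    """
    visible = mulc == IMPERVIOUS
    under_canopy = (mulc == TREE) & footprint_mask
    visible_area = float(visible.sum()) * pixel_area_m2
    hidden_area = float(under_canopy.sum()) * pixel_area_m2
    return {"visible": visible_area,
            "under_canopy": hidden_area,
            "total": visible_area + hidden_area}

# Hypothetical 3x3 scene: a street partly overhung by tree canopy.
mulc = np.array([[20, 20, 40],
                 [20, 40, 40],
                 [70, 70, 40]])
footprints = np.array([[True, True, True],
                       [True, True, False],
                       [False, False, False]])
areas = impervious_area_m2(mulc, footprints)
# areas == {'visible': 3.0, 'under_canopy': 2.0, 'total': 5.0}
```

In this toy scene, two of the five square meters of pavement are hidden beneath canopy and would be missed by a count of impervious pixels alone.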
Shrub
Shrubs are multiple stem woody perennial plants between 0.5 and 2 m in height. Shrubs are recognized in air photos by context (e.g., desert, rangeland, urban landscaping), the mottled texture of the canopy (compared to grass), and lesser shadows (compared to trees). Shrubs are recognized by height (and possibly shape) in LiDAR data.
In some land cover datasets, arid and semiarid natural shrub vegetation is mapped as undifferentiated shrubland (51). In that case, shrubs, soil, and grass are mixed in a single class, rather than as differentiated classes of shrub (52), grass (70) and soil (30). In EnviroAtlas, using shrub class (52) (individual shrubs) is preferred over shrubland (51). Shrub (52) is at a finer information granularity to support EGS analysis.

Grass-Herbaceous
The grass and herbaceous class ("grass") includes the graminoids, forbs and herbs lacking persistent woody stems. Grass includes residential lawns, golf courses, roadway medians and verges, park lands, transmission line and natural gas corridors, recently clear-cut forest areas, pasture, grasslands, and prairie grass. Small shrubs may fall into this category as noted above. It is also known as "low vegetation." For healthy, photosynthetically active grass, the principal identifying characteristics in NAIP imagery are greenness, high reflectance in the near infrared, high NDVI, urban context and a smoother image texture than tree and shrub canopy. Context helps in identifying grass (e.g., proximity to a building, athletic field or highway). NAIP imagery is collected in summer leaf-on conditions when grass may be green, or brown with heat and moisture stress. Sparse or brown grass is commonly confused with soil and impervious. Grass-soil confusion is greater in arid than in humid-temperate environments.
What to do with indeterminate pixels in NAIP imagery that could be either grass or soil? Sparse, brown or dead grass are spectrally like soil, and soil and grass intermix along a continuum. Guideline: if potential grass or soil pixels/polygons show above-background reflectance in the near-infrared band (indicative of photosynthetic activity), they are labeled grass. An operational assumption is that, except in arid regions, soil has the potential to support grass or other vegetation at some point during the growing season. The analyst consults other HR imagery from different dates to assess if grass is present at other times.
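The guideline can be expressed as a simple per-pixel rule, sketched below. How "background" NIR reflectance is determined is not specified in the text; here it is a single scene-specific value supplied by the analyst, which is an assumption, as are the function name and the toy values. The class codes 70 (grass) and 30 (soil) are those used in this article.

```python
import numpy as np

GRASS, SOIL = 70, 30  # MULC class codes given in the text

def label_grass_or_soil(nir, candidate_mask, background_nir):
    """Label ambiguous pixels: grass if NIR exceeds background, else soil.

    Pixels outside candidate_mask are returned as 0 (left for other rules).
    background_nir is a hypothetical, analyst-chosen digital number.
    """
    out = np.zeros(nir.shape, dtype=np.uint8)
    out[candidate_mask & (nir > background_nir)] = GRASS
    out[candidate_mask & (nir <= background_nir)] = SOIL
    return out

# Hypothetical 2x2 NIR digital numbers; the last pixel is not ambiguous.
nir = np.array([[150, 90], [110, 95]])
candidates = np.array([[True, True], [True, False]])
labels = label_grass_or_soil(nir, candidates, background_nir=100)
# labels == [[70, 30], [70, 0]]
```

A fixed threshold is only a starting point; as noted above, the analyst also consults imagery from other dates before settling on a label.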

Agriculture
Agriculture is a layer superimposed on the MULC classification. The USDA Common Land Unit (polygon) [17] and raster Cropland Data Layer (CDL) [14] are used to help identify agriculture polygons. Level 2 agriculture is labeled as row crops (80) if MULC pixels are classified as grass, shrub, or soil and fall within these ancillary agricultural datasets, and orchards (82) if classified as tree. (Note: The agriculture class numbering deviates slightly from standard MULC class numbering conventions due to a transcription error in the initial data upload.) Pasture is assigned to the grass class for two reasons: 1) difference in land management practices between row crops and pasture, and 2) difficulty differentiating pasture from non-cultivated grass.
The agriculture ("Ag") class is included in a MULC product if the most recent NLCD indicates agriculture greater than 5% within the EnviroAtlas community boundary. If agriculture is less than or equal to 5%, agriculture pixels (polygons) are labeled as whatever LC is on the ground when the NAIP imagery is acquired (grass, soil, shrub or tree), rather than as agriculture. Twenty of the EnviroAtlas communities have an agriculture class.
Pilant et Remote Sens (Basel). Author manuscript; available in PMC 2020 August 24.

Wetlands
As defined by Section 404 of the Clean Water Act: "Wetlands are areas that are inundated or saturated by surface or ground water at a frequency and duration sufficient to support, and that under normal circumstances do support, a prevalence of vegetation typically adapted for life in saturated soil conditions" [18]. Wetlands include swamps, marshes, bogs, and other wet and flooded areas [19,20]. Like agriculture, in MULC data, wetlands are delineated using the best available ancillary data, which to date have been the U.S. Fish and Wildlife Service National Wetlands Inventory (NWI) [21] and U.S.G.S. National Hydrography Dataset (NHDPlus v2) [22]. Classifying wetlands directly from imagery/LiDAR is beyond the scope of this study, and generally requires ground validation and ancillary data. Wetlands boundary polygons are overlaid on the MULC data; areas classified as tree are labeled woody wetland (91), and areas classified as grass-herbaceous are labeled emergent wetland (92). Treatment of shrub areas is indicated in the community metadata. Visual checks are performed for thematic and positional agreement of wetlands layers and underlying imagery.
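The overlay rule can be sketched on rasters as follows, assuming the wetlands polygons have already been rasterized to a boolean mask on the MULC grid. The tree code 40 is not given in this article and is assumed; the grass (70), woody wetland (91) and emergent wetland (92) codes are from the text. Shrub pixels are left unchanged because their treatment varies by community.

```python
import numpy as np

TREE = 40    # assumption: Level 1 tree code (not given in the text)
GRASS = 70
WOODY_WETLAND, EMERGENT_WETLAND = 91, 92

def overlay_wetlands(mulc, wetland_mask):
    """Relabel tree and grass pixels that fall inside the wetlands mask."""
    out = mulc.copy()
    out[wetland_mask & (mulc == TREE)] = WOODY_WETLAND
    out[wetland_mask & (mulc == GRASS)] = EMERGENT_WETLAND
    return out

# Hypothetical 2x3 scene; the top row and the last column lie inside NWI polygons.
mulc = np.array([[40, 70, 30],
                 [40, 70, 20]])
wetland_mask = np.array([[True, True, True],
                         [False, False, True]])
with_wetlands = overlay_wetlands(mulc, wetland_mask)
# with_wetlands == [[91, 92, 30], [40, 70, 20]]
```

Soil and impervious pixels inside the mask keep their original class, matching the rule that only tree and grass-herbaceous areas are relabeled.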

Results
Here we present statistics characterizing the MULC dataset. Table 3 summarizes size, population, year and accuracy statistics for 30 EnviroAtlas communities. MULC communities range considerably in both areal extent and population, with the largest community (Chicago) encompassing more than 14,000 km² and the smallest (Paterson, NJ) spanning just 47 km². The mean area is 3,139 km². Community population closely tracks areal extent, with a positive relationship. The largest community population is over 9.8 million (Los Angeles County, CA, 11,336 km²) and the smallest is just over 1,500 people (Woodbine, IA, 51 km²). The mean population for EnviroAtlas communities is 2.1 million people according to the 2010 U.S. Census [23].
The data used for classification in each community vary by availability, and typically the most recent available data are prioritized. Twelve of the 30 communities published to EnviroAtlas are based on 2010 NAIP imagery and most of the other communities are based on NAIP imagery from 2016 or earlier (Table 3). Twenty-four community datasets incorporate LiDAR, but, due to timing of LiDAR acquisition, only four of those communities have LiDAR matching the imagery collection dates. LiDAR is not collected as frequently as imagery, and collection years for both LiDAR and imagery often do not overlap. When imagery and LiDAR are of different dates and do not agree due to land use changes, the analyst typically defers to the imagery when making post-classification corrections and during the accuracy assessment process. Data limitations for each community are indicated in metadata. Table 4 summarizes fuzzy and non-fuzzy MULC accuracies by class for all existing EnviroAtlas communities. Table 5 is a confusion matrix constructed from 17,760 reference points for the 27 EnviroAtlas communities that have received both fuzzy and non-fuzzy accuracy assessments, illuminating the nature of interclass confusion.
EPA Author Manuscript

Class user and producer accuracies for all communities are generally high and increase between non-fuzzy and fuzzy assessments (Table 4). The class user accuracy, calculated by dividing the number of correct reference points (where both the row and column classes agree) for a class by the row total, indicates how well the land cover represents the class as defined by the reference points. The class producer accuracy, calculated by dividing the number of correct reference points for a class by the column total, indicates how well the class is represented in the classification. The most accurate class in MULC land cover is water. Twenty-eight communities have both fuzzy and non-fuzzy accuracy assessments. The mean overall accuracy across all EnviroAtlas communities is 88% fuzzy and 82% non-fuzzy. Overall fuzzy accuracy is always higher than overall non-fuzzy accuracy. Mean kappa values are 0.84 fuzzy and 0.77 non-fuzzy. The soil class has the lowest user accuracy (77.8%) and the grass class has the lowest producer accuracy (78.9%). Based on the fuzzy confusion matrix (Table 5), grass class confusion is mostly with soil and tree classes. Soil class confusion is mostly with grass and impervious. The shrub land cover class is mapped in only six (western) communities and consequently has fewer reference points than the other classes.
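The accuracy statistics described above can be computed from a confusion matrix as in the following sketch. It follows the row/column convention stated in the text (rows are mapped classes, columns are reference classes) and uses the standard Cohen's kappa formula; the function name and the two-class matrix are hypothetical, not values from Table 5.

```python
import numpy as np

def accuracy_metrics(confusion):
    """confusion[i, j] = points mapped as class i whose reference class is j."""
    confusion = np.asarray(confusion, dtype=np.float64)
    n = confusion.sum()
    correct = np.diag(confusion)           # row and column classes agree
    row_totals = confusion.sum(axis=1)     # points mapped as each class
    col_totals = confusion.sum(axis=0)     # reference points in each class
    overall = correct.sum() / n
    # Agreement expected by chance, from the row and column marginals.
    expected = (row_totals * col_totals).sum() / n ** 2
    return {
        "user": correct / row_totals,
        "producer": correct / col_totals,
        "overall": overall,
        "kappa": (overall - expected) / (1.0 - expected),
    }

# Hypothetical two-class matrix with 100 reference points.
metrics = accuracy_metrics([[45, 5],
                            [10, 40]])
# user = [0.9, 0.8]; producer = [45/55, 40/45]; overall = 0.85; kappa = 0.7
```

Kappa is lower than overall accuracy here because half of the observed agreement would be expected by chance given these marginals.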

Discussion
MULC underpins metrics that complement those derived from the national, 30 meter resolution land cover component in EnviroAtlas, offering decision makers, researchers and others the ability to evaluate ecosystem services and land cover characteristics at household/street, community, neighborhood (block group), city, and regional levels. MULC and other EnviroAtlas data have been used in a range of applications from regional to local scales. Portland, Oregon, city planners have used MULC to design street tree planting and green infrastructure for urban heat island mitigation [24]. In Durham, North Carolina, EnviroAtlas has been used to identify census block groups with low tree cover and vulnerable populations to explore how tree planting might benefit child development, overall public health, and environmental quality [25,26]. In Tampa Bay, Florida, a Health Impact Assessment (HIA) has demonstrated how MULC and EnviroAtlas metrics, tools, and data can assist decision makers in a health and wellness application [27]. See the EnviroAtlas use case page for more information (https://www.epa.gov/enviroatlas/enviroatlas-use-cases).
The high level of detail provided by MULC data contributes to diverse research ranging from environmental and public health to the economic benefits attributed to EGS. MULC data and derived EnviroAtlas community metrics support research including epidemiological studies on the salutogenic effects of natural environment exposure in urban areas [28], mosquito distribution analyses to assess vector-borne disease risk in Texas [29], and urban revitalization efforts in the Great Lakes region [30], among others. A bibliography of research using MULC and other EnviroAtlas data can be found on the EPA EnviroAtlas website: https://www.epa.gov/enviroatlas/enviroatlas-publications.

Uncertainty in MULC Data
In this section we discuss MULC interpretation and the origins of common uncertainties in MULC data. Map classification errors must be understood so that EGS metrics can be accurately estimated, because major map errors can translate into incorrect valuation of ecosystems [31]. A confusion matrix is just one expression of map accuracy, and users have varying needs that may prioritize map characteristics other than the statistical evaluation of reference points. MULC datasets have been developed using both pixel-based and object-based methods; in either case, land cover features are groupings of pixels that represent the real world. MULC users who reside in the communities represented in the MULC dataset series are likely to possess the best understanding of the actual (or expected) land cover types in their areas of interest. A map can have high statistical accuracy yet still contain errors in a particular area of local interest, which can cause users to lose confidence in the product. A map should therefore have good statistical accuracy and also accurately represent real conditions for local users. For that reason, we devote a large portion of the data development process to quality assurance (QA) to ensure that MULC datasets possess acceptable statistical accuracies and have minimal visual errors.

Evaluation and uncertainty in MULC
To evaluate MULC data, we recommend displaying MULC at 40-60% transparency over the source imagery (e.g., NAIP) basemap and viewing it at multiple zoom levels. This allows direct comparison of the MULC layer and source imagery.
Comparing with higher resolution (e.g., 0.1-0.5 m) imagery may add useful information. Displaying the MULC over a more recent image basemap may visually highlight sites of land cover change.
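To make the transparency recommendation concrete, the following minimal sketch shows what a GIS viewer effectively computes when a class color is drawn over an image pixel at a given transparency. The function name and the display colors are hypothetical and are not taken from the MULC legend.

```python
def blend_pixel(class_rgb, image_rgb, transparency):
    """Blend a land cover class color over an image pixel.

    transparency = 0.0 shows only the class color; 1.0 shows only the image.
    """
    if not 0.0 <= transparency <= 1.0:
        raise ValueError("transparency must be between 0 and 1")
    opacity = 1.0 - transparency
    return tuple(
        round(opacity * c + transparency * i)
        for c, i in zip(class_rgb, image_rgb)
    )


# Hypothetical display colors; not the official MULC legend.
TREE_GREEN = (0, 120, 0)
image_pixel = (200, 180, 160)
blended = blend_pixel(TREE_GREEN, image_pixel, transparency=0.5)
```

At mid-range transparency both the class color and the underlying image texture contribute to the displayed pixel, which is why values around 40-60% make mismatches between the map and the imagery easy to see.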
There are multiple types of uncertainty and errors in high resolution land cover data for consideration when evaluating data accuracy and quality:

1. True misclassification (e.g., the image pixel is composed of soil but the map labelled it grass);

2. Non-exclusive class membership (the pixel is a mixture of soil and grass);

3. Inter-observer error (the map developer and the accuracy assessor use different criteria (e.g., pixel color and brightness) for labeling an ambiguous pixel as either soil or grass);

4. Severity of misclassification (for a specific user application, mistaking grass for tree may be less significant than mistaking soil for tree [i.e., vegetated versus non-vegetated] or water for impervious).

Items 2 and 3 are allowed greater flexibility as a result of the fuzzy accuracy assessments employed for MULC datasets.
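The effect of a fuzzy assessment on items 2 and 3 can be sketched as follows. This example assumes a simple fuzzy scheme in which the assessor records a primary reference label and, for ambiguous points, one acceptable alternate label; that scheme, the function name, and the reference points are assumptions made for illustration and may differ from the rules actually used for MULC.

```python
def overall_accuracy(points, fuzzy=False):
    """Fraction of reference points whose map label is judged correct.

    Each point is (map_label, primary_label, alternate_label_or_None).
    Non-fuzzy: the map label must match the primary reference label.
    Fuzzy: a match with the alternate label also counts as correct.
    """
    correct = 0
    for map_label, primary, alternate in points:
        if map_label == primary:
            correct += 1
        elif fuzzy and alternate is not None and map_label == alternate:
            correct += 1
    return correct / len(points)


# Invented reference points for illustration only.
points = [
    ("grass", "grass", None),
    ("grass", "soil", "grass"),   # mixed pixel: either label is acceptable
    ("soil", "grass", "soil"),    # ambiguous brown grass
    ("tree", "grass", None),      # true misclassification
    ("water", "water", None),
]
strict = overall_accuracy(points)
relaxed = overall_accuracy(points, fuzzy=True)
```

Under this assumed scheme the fuzzy score can never be lower than the non-fuzzy score, because every strict match is also a fuzzy match. True misclassifications (item 1) remain errors in both.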
Sources of errors and uncertainty in MULC data include: ambiguous class membership (e.g., the grass-soil continuum); shadows; image quality and dynamic range; radiometry and solar geometry differences between image acquisition dates for a community; different image and LiDAR acquisition dates; low quality or missing LiDAR data (e.g., non-returns over water); errors or misalignment of ancillary data.

Grass-Soil confusion
Grass and soil show the greatest class confusion (Tables 4 and 5). A pixel may be on a continuum between all grass and all soil, and brown, senescent grass is spectrally and texturally similar to soil. Another factor is potential differences in heuristic thresholds used by the MULC data developer and accuracy assessor to distinguish grass from soil. There are ambiguous or borderline cases for distinguishing grass and soil, especially in semi-arid and arid locales; e.g., Los Angeles, CA and Phoenix, AZ. When grass is brown, sparse, stressed or senescent, the green and near infrared reflectance are reduced and more resemble a soil spectral signature. NAIP data are collected in summer when non-irrigated vegetation may be brown and water stressed. Because grass and soil land cover have different ecosystem services and functions, we strive to differentiate them in a community's MULC. An EnviroAtlas MULC convention is to assume that most soils (especially in more humid ecoregions) are capable of supporting some amount of grass or other low vegetation, so the classification algorithms are tuned to favor a grass label under ambiguous circumstances. The MULC developer examines ancillary aerial and street imagery from multiple dates to see if grass is present at other times of year to determine algorithm thresholds for discriminating grass from soil.
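The convention of favoring grass in ambiguous cases can be illustrated with a simple threshold rule. The sketch below assumes a vegetation index (NDVI, computed from near infrared and red reflectance) and two threshold values; the article does not state which spectral measures or thresholds the MULC algorithms use, so the index, the thresholds, and the function names are all hypothetical.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def label_grass_or_soil(nir, red, soil_below=0.10, grass_above=0.25):
    """Return (label, needs_review) for a low, non-woody pixel.

    soil_below and grass_above are hypothetical thresholds; in practice they
    would be tuned per community.
    """
    value = ndvi(nir, red)
    if value >= grass_above:
        return "grass", False
    if value < soil_below:
        return "soil", False
    # Ambiguous band: favor grass, but flag for a check against other dates.
    return "grass", True


green_lawn = label_grass_or_soil(nir=0.50, red=0.10)
bare_lot = label_grass_or_soil(nir=0.30, red=0.27)
brown_grass = label_grass_or_soil(nir=0.30, red=0.21)
```

The review flag stands in for the manual step described above, in which imagery from other dates is examined to decide whether grass is present at other times of year.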

Soil-Impervious confusion
Impervious surfaces misclassed as soil is the second most common misclassification of soil and one of the major sources of low soil user accuracy (Table 5). Light (high albedo) impervious surfaces and soil are spectrally and texturally similar and thus easily confused in classification. Light impervious surfaces can include parking lots, paved roads, compacted dirt roads and light-colored roofs. LiDAR intensity may be useful for differentiating soil and impervious which may be otherwise inseparable in four band optical imagery.

Grass-Tree confusion
Grass misclassed as tree is the second most common misclassification of the grass class and one of the major sources of low producer accuracy for grass (Table 5). Grass and tree classes overlap spectrally in imagery, though trees are usually darker and more textured. One common error is speckling of apparent grass pixels amidst an otherwise continuous tree canopy. Such pixels may be true grass pixels visible through canopy gaps, but commonly they are produced by bright, well-illuminated sun-facing facets of tree canopy. This effect may be amplified with low sun angle and mixed tree heights. LiDAR height and intensity layers usually help clarify tree-grass confusion but may also overcorrect speckling errors and precipitate additional corrections. If LiDAR is unavailable or inadequate, tree canopy speckle can be reduced using smoothing and majority convolution filters.
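A majority filter of the kind mentioned above can be sketched in a few lines. This is a minimal, single-pass 3x3 version written for a small grid of class labels; the window size, the tie rule, and the function name are choices made for this illustration and are not necessarily those used in MULC production.

```python
from collections import Counter


def majority_filter(grid):
    """Apply one pass of a 3x3 majority filter to a 2-D grid of class labels.

    Each cell takes the most common label in its 3x3 neighborhood (clipped at
    the grid edges). A cell keeps its own label when that label ties for the
    most common.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            counts = Counter(window)
            top = max(counts.values())
            if counts[grid[r][c]] < top:
                out[r][c] = counts.most_common(1)[0][0]
    return out


T, G = "tree", "grass"
canopy = [
    [T, T, T, T],
    [T, G, T, T],
    [T, T, T, G],
    [T, T, T, T],
]
smoothed = majority_filter(canopy)

# A genuine grass patch bordering trees is left unchanged by the filter.
lawn = [
    [G, G, G],
    [G, G, G],
    [T, T, T],
]
lawn_filtered = majority_filter(lawn)
```

The filter removes isolated grass pixels inside continuous canopy, but it cannot tell sunlit canopy facets from true grass seen through small canopy gaps, so both are relabeled as tree.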

Shadows
Shadows are a common source of error in high resolution imagery. A pixel in shadow receives only indirect sunlight, which alters its spectral signature and lowers the signal-to-noise ratio. (With fewer incident photons, fewer reflected photons reach the sensor.) Shadows are common at the edges of buildings, shrubs, trees, and forest patches, and in steep topography. As expected, shadowed features tend to be misclassified among the darker classes: tree, impervious, and water. Shadows on water are commonly misclassified as impervious. Such water errors are sometimes correctable using NHDPlus or NWI layers and by thresholding on low reflectance values in the NIR band.
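The water correction described above can be sketched as a simple rule that combines an ancillary water mask with a low-NIR test. The sketch assumes the mask has already been rasterized from NHDPlus or NWI polygons onto the same grid as the classification; the NIR threshold and the function name are hypothetical.

```python
def recover_shadowed_water(labels, nir, water_mask, nir_max=0.05):
    """Relabel impervious cells as water where ancillary data and NIR agree.

    labels, nir and water_mask are 2-D lists on the same grid. nir_max is a
    hypothetical reflectance threshold that would need tuning per image.
    """
    out = [row[:] for row in labels]
    for r, row in enumerate(labels):
        for c, label in enumerate(row):
            if label == "impervious" and water_mask[r][c] and nir[r][c] <= nir_max:
                out[r][c] = "water"
    return out


labels = [
    ["water", "impervious", "impervious"],
    ["grass", "impervious", "tree"],
]
nir = [
    [0.02, 0.03, 0.04],
    [0.40, 0.20, 0.35],
]
water_mask = [
    [True, True, False],
    [False, True, False],
]
corrected = recover_shadowed_water(labels, nir, water_mask)
```

Requiring both conditions is a deliberately conservative choice: an impervious cell inside the water mask with high NIR reflectance (for example, a bridge deck) stays impervious, and a dark impervious cell outside the mask is not relabeled.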

Conclusions
We define a classification system for US EPA EnviroAtlas Meter-scale Urban Land Cover (MULC). At 1×1 m pixel size, MULC data support community mapping, planning, modeling and decision making at high spatial resolution as fine as individual trees, buildings and roads. MULC data and more than one hundred sustainability, health, and ecosystem goods and services metrics have been developed for 30 US communities. MULC and other EnviroAtlas data are free and accessible via web browser in EnviroAtlas, as web services, and by download through EnviroAtlas and the EPA Environmental Data Gateway. MULC data are suitable for many applications including tree planting, green infrastructure siting, watershed protection and modeling, urban heat island and stormwater runoff mitigation and mosquito habitat risk mapping. Data and information updates are available at EPA EnviroAtlas. We hope that the guidelines presented here help MULC users and support similar high spatial resolution mapping efforts.

Land cover class definitions (table fragment):

Grass-herbaceous: Graminoids, forbs and herbs lacking persistent woody stems; includes residential lawns, golf courses, roadway medians and verges, park lands, transmission line and natural gas corridors, recent forest clear-cuts, meadows, pasture, grasslands and prairie grass. Also known as "low vegetation." Grass classified in wetland areas is recoded to emergent wetlands (92).

Wetlands: Wetlands polygons are overlaid on classified MULC using best available data (e.g., NWI, NHD+). Grass recodes to emergent wetland. Trees recode to woody wetland. Soil, water and impervious classes remain unchanged. Treatment of shrubs is indicated in metadata.