Data Descriptor

A Comprehensive Parcel-Level Dataset on Farmland Assessment: Addressing Grid-Cell Data Bias Estimation

1
Institute for Coastal Adaption and Resilience (ICAR), Old Dominion University, 800 W. 46th St., Norfolk, VA 23508, USA
2
Department of Applied Economics, Utah State University, Logan, UT 84322, USA
*
Author to whom correspondence should be addressed.
Data 2025, 10(1), 10; https://doi.org/10.3390/data10010010
Submission received: 6 September 2024 / Revised: 10 January 2025 / Accepted: 13 January 2025 / Published: 17 January 2025

Abstract

Grid-cell data are increasingly used in research due to the growing availability and accessibility of remote sensing products. However, grid-cell data often fail to represent the actual decision-making unit, leading to biased estimates in socio-economic analysis. To address this problem, this paper presents a comprehensive parcel-level dataset for Salt Lake County, Utah, spanning from 2008 to 2018. This dataset combines detailed spatial and temporal data on land ownership, land use, and preferential farmland tax assessments under the Greenbelt program. Compiled from multiple geospatial sources, the dataset includes nearly 200,000 parcel-year observations, providing valuable insights into landowner decision-making and the impact of tax abatement incentives at the decision-making level. This resource is beneficial for researchers, educators, and practitioners in sustainable development, environmental studies, and farmland conservation.

1. Summary

As remote-sensing land cover products become increasingly prevalent in assessing the impact of land management and conservation policy interventions, a critical decision involves selecting the appropriate spatial unit: grid cells or polygons (points are also a potential spatial unit, typically derived as the centroid of homogeneous grid cells or polygons). Grid-cell units are widely favored by scholars for their ease of collection, processing, and analysis (e.g., [1,2,3]), as well as their ability to produce balanced data in panel-data settings. However, the use of grid cells has long been criticized because they do not represent actual decision-making units. Land use decisions are typically based on land ownership or administrative units, rather than grid cell units [4,5,6,7]. Recent studies have demonstrated that using grid cells (or points derived from cells) that are either too large or too small relative to the spatial scale at which land use and land cover change decisions occur can result in biased estimates of treatment effects [8,9]. This underscores the importance of compiling parcel-level data when assessing the impact of land use policies on landowners’ decision-making and behavioral changes. Yet, existing parcel-level datasets often lack the necessary temporal depth, spatial detail, or integration with relevant policy instruments to support rigorous causal analysis (e.g., [10]). While the U.S. National Resources Inventory (NRI) provides valuable data on land use and natural resource conditions, it is based on a statistically derived sample rather than comprehensive parcel-level data. The NRI collects data at intervals, with surveys conducted every five years between 1982 and 1997 and annually from 2000 to 2017. However, the locations of these sample segments are confidential, and the segments themselves are discontiguous, limiting their utility for parcel-specific land use analysis.
Additionally, while parcel-level GIS data may be publicly accessible in many U.S. counties, access challenges remain due to inconsistent availability, high costs in some regions, and the fragmented nature of such data, as there is no centralized nationwide source. Historical records pose another challenge, as parcel boundaries and ownership often change over time, and such changes are rarely systematically documented.
To address these limitations, we introduce a comprehensive, annual, parcel-level dataset that incorporates preferential farmland tax assessment information, enabling researchers to evaluate policy impacts on land use decisions with greater precision. By overcoming the biases inherent in grid-level approaches and providing a consistent temporal framework, this dataset facilitates robust, policy-relevant analyses. Moreover, it offers a structured and temporally consistent resource for historical tracking and causal analysis, addressing many of the challenges associated with existing datasets. By bridging gaps in data availability, usability, and temporal coverage, this dataset provides a valuable tool for advancing research and informing policy in land use, conservation, and farmland management strategies.
This study helps address this critical need by compiling a comprehensive annual georeferenced parcel-level dataset for Salt Lake County, Utah, spanning the years 2008 to 2018. The dataset integrates spatial and temporal information on land ownership, land use, and preferential farmland tax assessments from multiple geospatial sources, thereby providing a detailed depiction of landowner decision-making regarding tax abatement incentives under the Greenbelt program. As a key farmland conservation policy in Utah, the Greenbelt program offers preferential tax assessments to encourage farmland preservation. Although we rely on secondary data sources, we adopt a rigorous scientific approach, compiling nearly 200,000 parcel-year observations into the dataset. This data resource is a valuable contribution to the scientific community, including researchers, educators, and practitioners in fields such as sustainable development, environmental studies, and the social sciences associated with farmland conservation in the United States.

2. Data Description

This dataset is designed as an accessible resource, enabling both academics and non-specialists, particularly those unfamiliar with GIS software, to quickly grasp the key information on farmland conversion in the study area. Therefore, the dataset contains two data files. One is the tabular data in comma-separated values (CSV) format, containing information on 27 variables with 183,940 observations spanning the years 2008 to 2018. The dataset includes variables determining the eligibility of preferential farmland tax assessment and important variables influencing farmland conversion. The other is a parcel boundary shapefile linked to the tabular data by a unique parcel identifier (labeled “ID”), which can also be linked to the 2008 parcel boundary shapefile. Table 1 provides a description of all variables included in the tabular data.

2.1. Greenbelt Status and the Eligibility Variables

The Greenbelt program is a voluntary program that landowners in Utah can participate in when their parcels meet the criteria set by the preferential tax assessment program [11]. To qualify for the program, land must meet three key criteria related to use, size, and productivity: it must be actively farmed for at least two consecutive years before the tax year in question, meet a five-acre minimum (or have specific exemptions for smaller parcels), and generate agricultural income above 50% of the county’s average. In causal inference, establishing causation typically relies on treatment assignment being random. However, a voluntary program like this does not meet this requirement. As a result, to identify the causal effect of participating in the Greenbelt program on farmland protection, one potential strategy is to use program eligibility variables to construct instrumental variables that are less susceptible to manipulation by the landowners while controlling for factors influencing landowners’ decisions to enroll in the program.
Figure 1 illustrates the relationship between the Greenbelt designation variable and two eligibility variables, Ag.elgb (agricultural use and size) and EVI.elgb (productivity), in 2017 (the year selection is arbitrary for illustration purposes). As defined in Table 1, Ag.elgb is a binary indicator representing Greenbelt eligibility based on the previous two years’ land use and the current year’s land size criteria. EVI.elgb denotes the share of a parcel meeting the productivity criterion, with spatial variation in productivity captured using the MODIS Enhanced Vegetation Index (EVI). A more detailed explanation of these variables is presented in Section 3.2 below, and relationships between key variables are shown in Figure 2. Nearly all Greenbelt parcels align with at least one of the eligibility layers. Although not all eligible parcels are enrolled in this voluntary program, 51% of Greenbelt-designated parcels meet both the productivity and agricultural-use-and-size requirements. Additionally, over 90% of these parcels meet the agricultural use criterion, while 56% fulfill the productivity requirement. Some parcels, particularly in the northern region of the county, do not meet productivity eligibility; however, exceptions are made for landowners who provide evidence that the failure was due to circumstances beyond their control or that the land is involved in an approved agricultural practice. Ownership information in the dataset can be used to further control for these exceptions.

2.2. Farmland Development and Its Potential Drivers

Table 2 shows summary statistics of key factors that may affect the conversion of farmland to urban development. Variable Y represents the annual rate of farmland conversion. On average, just under 33% of each parcel in the dataset is converted to developed land annually (this differs from the estimate in the original publication because the variable here is not inverse hyperbolic sine transformed). About 10% of the parcels are Greenbelt designated. Most parcels are 15 acres or smaller (1 acre ≈ 0.405 hectares), with the largest parcel exceeding 2000 acres. The average agricultural acreage per parcel is about six acres, with a maximum of over 1800 acres. The majority of the parcels are located about a quarter mile (1 mile ≈ 1.609 km) from a developed area. Around 27% of the agricultural land has productivity above the county average, and 26% of the parcels have been in agricultural use within the past two years while meeting the size criterion, either by being five acres or larger or by sharing an owner with other eligible land.
Figure 2 presents a correlation plot matrix illustrating the relationship between farmland development and its potential drivers. The negative correlation between Greenbelt designation and farmland development suggests that enrollment in the Greenbelt program is associated with lower rates of farmland development. Negative correlations are also observed between Y and other factors, such as Area, AgAcre, Dist, EVI.elgb, and Ag.elgb. While we cannot make causal claims based on correlation alone, the negative relationships imply that larger parcels and those with more agricultural acreage are associated with lower rates of farmland development. Similarly, parcels located farther from developed land, and those meeting the productivity and agricultural use requirements (EVI.elgb and Ag.elgb), tend to be associated with lower rates of development.
In summary, the preliminary analysis of nonparametric relationships among key variables suggests that Greenbelt designation is generally associated with less farmland development. These patterns are also consistent with key hypotheses of monocentric city and urban growth models: parcels with higher agricultural productivity or those located farther from cities are less likely to be developed.

3. Methods

3.1. Data Collection

Parcel boundary, ownership information, and Greenbelt status from 2008 to 2018 are sourced from the Salt Lake County Assessor’s Office [12]. Land cover and crop type information is compiled from the Cropland Data Layer (CDL) raster at a 30 m resolution from 2008 to 2018 [13]. Finally, crop productivity and program eligibility information are derived from the 16-day interval 250 m MODIS-EVI raster data from 2008 to 2018 (retrieved from [14]) and processed in combination with the CDL data.

3.2. Data Processing

Step 1: The decision to use the 2008 parcel boundary as the base reference is primarily driven by the need for a consistent spatial framework to track land use changes over time. Since parcel boundaries often change due to subdivision, consolidation, or reclassification, relying on a single, stable baseline year enables a uniform comparison of the same geographic areas across multiple years. This approach also simplifies panel-data analysis, as each parcel can be followed consistently through time, even if it subsequently splits into smaller parcels or merges into larger ones. However, using a fixed baseline year does mean that certain nuances in later years may not be fully captured. For instance, a parcel that splits into two distinct properties will still be represented by its original 2008 boundary. While this might mask some spatial details of later-year subdivisions or mergers, it allows researchers to focus on the temporal evolution of land use and policy impacts within a stable spatial boundary. In essence, the trade-off is between spatial precision in later years and analytical consistency across the entire study period.
To do this, yearly shapefiles from the assessor’s office for years 2009 to 2018 are overlaid with year 2008 to extract parcel information for years 2009 to 2018 linked to the parcel boundary in 2008. ID is a unique identifier variable for the parcels. It was created because there are duplicated and missing parcel identifiers (labeled as Parcel.ID08) for different parcel boundaries in the unprocessed shapefile from the Salt Lake County Assessor’s Office. ID ranges from 1 to the number of parcels in the 2008 County Assessor’s office parcel shapefile. OwnID is a unique identifier string indicating parcel ownership over the study period. This variable was created using a multi-step data processing and string-matching procedure designed to reconcile differences in naming conventions, typographical errors, and data entry inconsistencies. Parcel boundary changes typically occur during land development, resulting in either smaller parcels or merged larger parcels being created. While parcel boundary changes are overlooked to preserve consistent boundaries across the study period, these parcels retain their Greenbelt designation as long as they contain any Greenbelt parcel(s) after splitting or merging.
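The exact string-matching procedure behind OwnID is not detailed in this section. As an illustration only, the sketch below shows one way near-identical owner names might be grouped. The helper names `normalize_owner` and `assign_owner_ids`, the 0.9 similarity threshold, and the sample owner names are all hypothetical, and Python's standard `difflib` module stands in for whatever algorithm was actually used.

```python
import re
from difflib import SequenceMatcher


def normalize_owner(name: str) -> str:
    """Upper-case, drop punctuation, and collapse whitespace in an owner name."""
    cleaned = re.sub(r"[^A-Z0-9 ]", " ", name.upper())
    return " ".join(cleaned.split())


def assign_owner_ids(names, threshold=0.9):
    """Give near-identical owner names the same integer ID (greedy matching).

    Hypothetical helper: each name is compared with the representative of every
    group found so far and joins the first group whose similarity ratio
    reaches `threshold`; otherwise it starts a new group.
    """
    representatives = []  # one normalized name per owner group
    ids = []
    for raw in names:
        norm = normalize_owner(raw)
        for owner_id, rep in enumerate(representatives, start=1):
            if SequenceMatcher(None, norm, rep).ratio() >= threshold:
                ids.append(owner_id)
                break
        else:
            representatives.append(norm)
            ids.append(len(representatives))
    return ids


# Hypothetical owner strings with punctuation, case, and spelling differences.
owners = ["Smith Farms, LLC", "SMITH FARMS LLC", "Smith Farm LLC", "Jones Family Trust"]
owner_ids = assign_owner_ids(owners)
```

Greedy matching of this kind depends on the order of the records and on the threshold, so a production workflow would likely need manual review of borderline matches.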
Step 2: Extracting the CDL land cover information related to developed land and agricultural land and applying it to the base parcel boundary map. Overlaying the 2008 parcel boundary shapefile on the CDL raster data, we then extract crop type and land cover information from the CDL within each parcel for the years 2008 to 2018. The CDL provides highly detailed land cover classifications as well as hierarchical crop classification of cropland cover per 30 m pixel, which allows us to count the number of pixels in a parcel corresponding to specific types of crops and land use (i.e., Ag, G, I as described in Table 1). The number of pixels corresponding to the various development stages (i.e., open space, low intensity, medium intensity, and high intensity) was converted from Ag and G and is indicated as D.o, D.l, D.m, and D.h. Agricultural land here refers to land cover classified as cropland (including idle cropland) or grassland/rangeland. Knowing the number of pixels corresponding to different land uses can determine whether a parcel has agricultural use (AgUse), as well as its agricultural acreage (AgAcre). This also enables tracking changes from agricultural land to developed land by comparing pixel classification values within a parcel from year to year (Y). The variable Dist is created by measuring the distance (in miles) from the centroid of a parcel to the nearest developed pixel.
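The Dist calculation described above can be sketched as follows. This is a simplified illustration rather than the authors' GIS workflow: it assumes the parcel centroid and the developed pixel centers are already available as (x, y) coordinates in a projected coordinate system measured in meters, and the function name and sample coordinates are hypothetical.

```python
import math

METERS_PER_MILE = 1609.344


def nearest_developed_distance_miles(centroid, developed_pixels):
    """Distance in miles from a parcel centroid to the closest developed pixel.

    Assumes `centroid` and every entry of `developed_pixels` are (x, y) pairs
    in a projected coordinate system measured in meters.
    """
    cx, cy = centroid
    nearest_m = min(math.hypot(px - cx, py - cy) for px, py in developed_pixels)
    return nearest_m / METERS_PER_MILE


# Hypothetical coordinates: the closest developed pixel is 400 m east of the centroid.
dist = nearest_developed_distance_miles(
    (1000.0, 2000.0), [(1400.0, 2000.0), (1000.0, 3500.0)]
)
```

For a county-wide raster, a spatial index or a distance transform would be far more efficient than this brute-force minimum.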
Step 3: Creating two important eligibility variables for Greenbelt designation (i.e., receipt of a preferential tax assessment). We use two variables (Ag.elgb and EVI.elgb) to capture the three criteria outlined in the Greenbelt Act [11] regarding land size, previous years’ land use, and land productivity. The variable Ag.elgb is created by tracking agricultural use of a parcel in the previous two years and taking into account the current year’s parcel size. Specifically, for any given parcel $i$ in year $t$, $\text{Ag.elgb}_{it} = 1$ if $\text{AgUse}_{i,t-2} > 0$, $\text{AgUse}_{i,t-1} > 0$, and $\text{parcel size}_{it} \geq 5$ acres under the same ownership; $\text{Ag.elgb}_{it} = 0$ otherwise. Consequently, Ag.elgb captures land size and previous years’ land use criteria.
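As a minimal sketch of this rule, the function below flags a parcel as eligible when it had agricultural use in each of the two preceding years and is at least five acres in the current year. The function name, the dictionary layout, and the sample values are hypothetical, and the same-ownership provision for smaller parcels is deliberately left out.

```python
def ag_eligible(ag_use, parcel_acres, year, min_acres=5.0):
    """Return 1 if the parcel meets the use and size criteria in `year`, else 0.

    ag_use       -- dict mapping year -> agricultural-use measure (AgUse)
    parcel_acres -- dict mapping year -> parcel size in acres
    Missing years are treated as ineligible. The same-ownership provision for
    smaller parcels described in the text is not modeled here.
    """
    used_two_years = ag_use.get(year - 2, 0) > 0 and ag_use.get(year - 1, 0) > 0
    large_enough = parcel_acres.get(year, 0.0) >= min_acres
    return int(used_two_years and large_enough)


# Hypothetical parcel: farmed in 2015 and 2016 but not 2017, 6.2 acres throughout.
ag_use = {2015: 12, 2016: 9, 2017: 0}
acres = {2015: 6.2, 2016: 6.2, 2017: 6.2, 2018: 6.2}
elig_2017 = ag_eligible(ag_use, acres, 2017)  # both prior years farmed
elig_2018 = ag_eligible(ag_use, acres, 2018)  # 2017 was not farmed
```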
The EVI.elgb variable is created by combining CDL and MODIS-EVI data to represent the percentage of a parcel meeting the productivity criterion (see Figure 3). This is achieved by comparing crop productivity (measured by crop-specific EVI) at the 30 m pixel level to the crop-specific county average. EVI is available every 16 days with a spatial resolution of 250 m and a range of −1.0 to +1.0; higher EVI values represent denser vegetation coverage. The yearly EVI value for Salt Lake County, Utah, is extracted from MODIS-EVI by selecting the maximum value for each year in each 250 m pixel within the county. The logic is that crop productivity is measured when the crop is being harvested, which is reflected in the maximum EVI value. To identify the type of crop represented by the EVI images, the CDL pixel image is overlaid with the EVI image. This allows us to determine the corresponding EVI value for each CDL pixel. We then use this information to calculate the county average EVI for each crop grown in Salt Lake County each year. By comparing the EVI value of each CDL pixel in a given year to the county average, we determine its crop productivity eligibility. Specifically, for any given pixel $n$ in year $t$, $\text{EVI.elgb}_{nt} = 1$ if $\text{EVI}_{nt}$ is not less than the county average EVI in year $t$. Once the pixel’s eligibility is determined, the percentage of eligible pixels in a parcel is used to create the EVI.elgb variable.
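The pixel-to-parcel aggregation can be illustrated with a short sketch. It assumes the raster overlay has already been done, so each 30 m CDL pixel arrives as a tuple of parcel ID, crop type, and annual maximum EVI for a single year. The function name, the tuple layout, and the sample values are hypothetical, and the result is expressed as a fraction between 0 and 1 rather than a percentage.

```python
from collections import defaultdict
from statistics import mean


def evi_eligible_share(pixels):
    """Share of each parcel's pixels whose EVI is at least the crop's county mean.

    `pixels` is a list of (parcel_id, crop, evi) tuples for a single year,
    where `evi` is the annual maximum EVI assigned to that 30 m CDL pixel.
    """
    # County-wide average EVI for each crop type in this year.
    by_crop = defaultdict(list)
    for _, crop, evi in pixels:
        by_crop[crop].append(evi)
    county_mean = {crop: mean(values) for crop, values in by_crop.items()}

    # Flag each pixel, then average the flags within each parcel.
    flags = defaultdict(list)
    for parcel_id, crop, evi in pixels:
        flags[parcel_id].append(1 if evi >= county_mean[crop] else 0)
    return {parcel_id: sum(f) / len(f) for parcel_id, f in flags.items()}


# Hypothetical pixels for one year: (parcel ID, crop, annual maximum EVI).
sample = [
    (1, "alfalfa", 0.75), (1, "alfalfa", 0.25),
    (2, "alfalfa", 0.5), (2, "corn", 0.5),
]
shares = evi_eligible_share(sample)
```

Here the county mean is computed from the sample itself; with real data it would be computed over every pixel of that crop in the county for the given year.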

3.3. Computation of Development Metrics

Only parcels with agricultural land at time $t-1$ are included in the sample for a particular year during our study period from 2010 to 2018. The underlying assumption is that agricultural land is being converted to developed land, and once developed, it is irreversible. To measure and track parcel development over time, we introduce the annual land development rate variable ($Y_{it}$) as outlined in Equation (1). For a summary and correlation of the corresponding statistics, please refer to the variable Y in Table 1 and Figure 2.
$$Y_{it} = \frac{d_{it}}{ag_{i,t-1}} \qquad (1)$$
where $Y_{it}$ is the rate of change from agricultural land to developed land on parcel $i$ between year $t-1$ and $t$ for each parcel, $ag_{i,t-1}$ is the area of all types of agricultural land (i.e., Ag, G, and I) on parcel $i$ in year $t-1$, and $d_{it}$ is the land area converted from agricultural land in year $t-1$ to all types of developed land (i.e., D.o, D.l, D.m, and D.h) in year $t$.
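A minimal sketch of the development rate follows: the area converted to developed land in year t divided by the agricultural area in year t−1. Because every pixel covers the same area, pixel counts can stand in for areas. The function name and the sample counts are hypothetical.

```python
def development_rate(ag_prev_pixels, converted_pixels):
    """Annual development rate: converted area divided by prior-year agricultural area.

    Returns None when the parcel had no agricultural land in year t-1, since
    such parcels are excluded from the sample for year t.
    """
    if ag_prev_pixels <= 0:
        return None
    return converted_pixels / ag_prev_pixels


# Hypothetical parcel: 40 agricultural pixels in year t-1, 10 of them developed by year t.
y = development_rate(40, 10)
excluded = development_rate(0, 0)
```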

3.4. Data Quality

Because the underlying satellite data provide both high-frequency and extensive coverage, missing data are not a significant concern in this dataset. The remotely sensed productivity observations are collected every 16 days, ensuring multiple data points per year and minimizing temporal gaps. Additionally, each CDL pixel measures 30 m by 30 m (approximately 0.22 acres), which allows multiple pixels to represent a single parcel. Given that the average parcel size in our dataset is about 14.46 acres (see Table 1), each parcel is well covered by numerous pixels. This combination of frequent observations and fine spatial resolution ensures that ample data are available to create reliable annual variables with minimal missing information. The only remaining concern relates to potential classification errors in the raw land cover and land productivity data. For instance, the source satellite data may occasionally misidentify certain land cover types or productivity values. However, addressing such classification errors is beyond the scope of this paper. It is worth noting that both MODIS EVI and the USDA Cropland Data Layer are widely used and highly cited in the literature, reflecting their acceptance and reliability in scientific research.
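The coverage argument can be checked with a short calculation. The only value introduced here is the standard conversion factor of 4046.8564224 square meters per international acre. A 30 m pixel covers roughly 0.22 acres, so a parcel of average size contains about 65 such pixels.

```python
SQ_METERS_PER_ACRE = 4046.8564224  # international acre

# Area of one 30 m by 30 m CDL pixel, in acres.
pixel_acres = (30 * 30) / SQ_METERS_PER_ACRE

# Number of 30 m pixels in a parcel of the mean size cited above (14.46 acres).
pixels_per_average_parcel = 14.46 / pixel_acres
```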

4. Value of the Data

4.1. Key Insight from the Data

The original comprehensive analysis on the effect of Greenbelt designation on agricultural land protection, based on this dataset, has been published in [7]. We find that an unintended effect of the Greenbelt designation leads to more conversion of agricultural land to development than the protection it provides. On one hand, Greenbelt parcels with less than five acres of agricultural area experience a modest protective effect, resulting in an annual farmland-to-urban conversion rate about 1.3% lower than that of non-Greenbelt parcels. On the other hand, larger agricultural parcels enrolled in the Greenbelt program are more susceptible to conversion, with their annual conversion rate approximately 24% higher than that of comparable non-Greenbelt parcels. These results contribute to the ongoing debate on taxation-based farmland preservation by highlighting a policy loophole that allows agriculture to serve as a secondary use, ultimately undermining the intended protective effects of the program.

4.2. Potential Application of the Data

The following examples describe potential applications and insights that can be gained from the dataset as avenues for research and exploration in the context of urban growth control and farmland protection.
While our dataset focuses on Utah’s Greenbelt Act, the underlying methodology can be readily adapted to examine similar preferential farmland tax assessment programs in other states. Beyond Utah, nine states allow tax-preferred parcels to be used primarily for non-agricultural purposes, including Alabama, Alaska, Delaware, Maine, North Carolina, Pennsylvania, Rhode Island, Vermont, and Virginia. Direct comparisons across these states are feasible by applying the same data processing techniques and utilizing similar data sources. Parcel boundary information can be obtained from local county assessor offices, and the satellite-based datasets employed—such as MODIS EVI and the USDA Cropland Data Layer—are available nationwide. By following consistent protocols for data collection, integration, and analysis, researchers can conduct cross-state comparisons that yield valuable insights into the relative effectiveness of different farmland assessment policies in preserving agricultural land.
These data can be used to test hypotheses concerning the theory of urban growth, in particular the impact of a preferential farmland tax assessment program on urban sprawl characterized by low-density development, strip development, scattered development, and/or leapfrog development. The dataset is presented at parcel and landownership levels, making it superior to many other pixel-level datasets that lack land-ownership information. Importantly, the land-ownership data have been processed using a string-searching algorithm to match land parcels under the same ownership (across years). Given that there are nearly 200,000 parcel-year observations, it would be impossible to perform this matching manually. The data can be easily connected to Salt Lake County Assessor annual data to obtain parcel-level real estate information on property type, land value, building value, gross value, and tax value. Additionally, the data can be connected with other geospatial data, such as forest and natural vegetation land cover changes. Combined with ecosystem service assessment models, the indirect impact of a preferential farmland tax assessment program on the provision of ecosystem services can be explored. For example, if urban growth displaces forested or vegetated parcels, ecosystem functions like carbon sequestration, wildlife habitat connectivity, and stormwater regulation may be compromised. By linking our parcel-level data with ecosystem service models (e.g., InVEST, ARIES), researchers can quantify how preserving farmland through tax incentives maintains or enhances these services. Such analyses might involve comparing scenarios where farmland remains in production (thereby retaining surrounding natural vegetation) against scenarios where the same land is developed into urban or suburban areas. 
In this way, the dataset enables a more holistic understanding of how farmland protection policies influence not only agricultural outputs but also the broader ecological benefits provided to communities.
In addition, development practitioners can utilize the data to identify long-term trends in farmland-related activities for areas under development risks. By analyzing parcel-level data over the years 2008 to 2018, practitioners can observe patterns in how agricultural land is being converted to developed land. Interested readers can refer to Table B6 in the supplementary document published in [7]. This information helps practitioners understand the pace and extent of urban development pressure on agricultural land. Additionally, the data allows practitioners to pinpoint specific areas where farmland is most vulnerable to development pressures. This insight is crucial for designing targeted interventions and policy measures aimed at preserving agricultural land, optimizing land use planning, and balancing development needs with farmland conservation. By identifying these trends, practitioners can make informed decisions to promote sustainable development practices and mitigate the loss of valuable agricultural resources.
Finally, the simple format of the dataset makes it a convenient and useful resource for instructors of statistics and econometrics courses. With clearly defined variables and a straightforward structure, it is easily accessible for students learning to apply statistical and econometric techniques. The dataset spans multiple years and includes various variables, providing opportunities to practice data cleaning, manipulation, and analysis. Instructors can use it to teach regression analysis, hypothesis testing, time series analysis, and panel analysis, with real-world applications in farmland conversion, land use, and urban development. For example, instructors might assign a project where students explore the relationship between parcel size and development probability. Another exercise could involve testing whether parcels closer to developed land face higher development pressures, prompting students to consider control variables and fixed effects. This hands-on experience helps students bridge the gap between theoretical knowledge and practical application, fostering a deeper understanding of statistical principles and empirical research skills. Additionally, the dataset can support student projects, encouraging independent investigation and critical thinking, making it an invaluable teaching tool for enhancing the learning experience in statistics and econometrics courses.

5. User Notes

To effectively track parcel development changes over time, it is essential to maintain consistent parcel boundaries when creating the dataset. As a result, the information associated with split parcels is not retained in our data. Instead, the information is aggregated to the 2008 parcel boundary. For analyses focused on parcels with unchanged boundaries, one can select samples with a value of one in the “NoChange” variable. For studies focused on analyzing changes in parcel boundaries (i.e., land ownership), the 2008 parcel boundary information provided in this dataset can be easily updated by connecting to Salt Lake County Assessor data for other years using the “Parcel.ID08” variable, which corresponds to the parcel ID in the assessor data. The variable “Total” is larger than the combined number of pixels in the agricultural (Ag, G, and I) and developed (D.o, D.l, D.m, and D.h) categories because pixels classified as background, open water, forest, and wetlands are not included in those variables.
The dataset has been processed into a simplified format, making it accessible for analysis using tools like Excel or open-source software such as R. For example, once the dataset is loaded into the R environment, a simple function like summary() can generate the descriptive statistics presented in Table 1. Additionally, using the ggpairs() function from the GGally package, users can easily create a correlation plot similar to Figure 2. This simplicity ensures that the dataset is user-friendly and facilitates quick exploration and visualization for researchers and practitioners.
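For readers working outside R, a rough Python analogue of this workflow is sketched below using only the standard library. It is an illustration under stated assumptions: the file name in the comment, the `Year` column, the `summarize` and `pearson` helpers, and the three demo rows are hypothetical, while `ID`, `Y`, `Dist`, and `NoChange` follow the variable names used in the text.

```python
import csv
import io
from statistics import mean


def summarize(csv_file, columns):
    """Return (per-column summary, rows) for the numeric `columns` of a CSV file object."""
    rows = list(csv.DictReader(csv_file))
    summary = {}
    for col in columns:
        values = [float(r[col]) for r in rows if r[col] not in ("", "NA")]
        summary[col] = {"n": len(values), "mean": mean(values),
                        "min": min(values), "max": max(values)}
    return summary, rows


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Tiny in-memory stand-in for the real file; with the actual data one would pass
# open("parcels.csv", newline="") instead (the file name is hypothetical).
demo = io.StringIO(
    "ID,Year,Y,Dist,NoChange\n"
    "1,2017,0.50,0.10,1\n"
    "2,2017,0.25,0.30,1\n"
    "3,2017,0.00,0.50,0\n"
)
stats, rows = summarize(demo, ["Y", "Dist"])

# Keep only parcels whose boundaries never changed (NoChange equal to one).
stable = [r for r in rows if r["NoChange"] == "1"]

# Correlation between development rate and distance to developed land.
r_y_dist = pearson([float(r["Y"]) for r in rows], [float(r["Dist"]) for r in rows])
```

The demo rows are constructed so that Y falls linearly with Dist; they are not drawn from the dataset and say nothing about the real correlation.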
One clear limitation of this dataset is the reliance on a fixed baseline year, as this can limit the ability to capture every detail of parcel changes over time. For example, if a parcel splits into two separate plots after 2008, it will still be represented by its original 2008 boundary. This approach may obscure some spatial variations arising from later subdivisions or mergers, but it preserves a stable spatial reference that allows researchers to consistently track land use trends and policy effects over the entire study period. Ultimately, this approach strikes a balance between maintaining analytical consistency and forgoing some spatial precision in subsequent years.

6. Conclusions

This dataset addresses a critical gap in land use research by overcoming the limitations of grid-cell data, which often fail to represent actual decision-making units and introduce biases in socio-economic analyses. By providing annual parcel-level data for Salt Lake County, Utah, spanning 2008 to 2018, and integrating spatial and temporal details—such as land ownership, agricultural productivity, and tax status—it offers the opportunity for policy impact evaluation at the appropriate scale.
Beyond addressing these biases, the dataset facilitates policy-relevant research by enabling long-term tracking of farmland conversion and the effectiveness of conservation incentives. Its accessibility and straightforward format also make it a practical tool for researchers, practitioners, and educators, opening new opportunities for rigorous analyses that inform more effective land management and conservation strategies. This contribution bridges critical gaps in existing data and sets a benchmark for future studies in land use policy evaluation.

Author Contributions

Conceptualization, W.Y.S. and M.L.; data curation, W.Y.S.; formal analysis, W.Y.S.; methodology, W.Y.S. and M.L.; supervision, M.L. and A.J.C.; validation, M.L.; visualization, W.Y.S.; writing—original draft, W.Y.S. and M.L.; writing—review and editing, W.Y.S., M.L. and A.J.C. All authors have read and agreed to the published version of the manuscript.

Funding

This work was jointly supported by the USDA National Institute of Food and Agriculture (NIFA)’s Agriculture and Food Research Initiative (AFRI) Grants (2022-67024-36736 and 2024-67024-42700) and Hatch Capacity Grant (UTA-01785). This research was also supported by the Utah Agricultural Experiment Station, Utah State University, and approved as journal paper number 9851.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original data presented in the study are openly available in Mendeley Data at https://data.mendeley.com/datasets/h8gwx4r2bn/1 or https://doi.org/10.17632/h8gwx4r2bn.1 (accessed on 11 June 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Deng, X.; Huang, J.; Uchida, E.; Rozelle, S.; Gibson, J. Pressure cookers or pressure valves: Do roads lead to deforestation in China? J. Environ. Econ. Manag. 2011, 61, 79–95. [Google Scholar] [CrossRef]
  2. Li, M.; Wu, J.; Deng, X. Identifying drivers of land use change in China: A spatial multinomial logit model analysis. Land Econ. 2013, 89, 632–654.
  3. Li, M.; De Pinto, A.; Ulimwengu, J.; You, L.; Robertson, R. Impacts of road expansion on deforestation and biological carbon loss in the Democratic Republic of Congo. Environ. Resour. Econ. 2015, 60, 433–469.
  4. Deng, X.; Huang, J.; Rozelle, S.; Uchida, E. Growth, population and industrialization, and urban land expansion of China. J. Urban Econ. 2008, 63, 96–115.
  5. Lichtenberg, E.; Ding, C. Local officials as land developers: Urban spatial expansion in China. J. Urban Econ. 2009, 66, 57–64.
  6. Li, M. The effect of land use regulations on farmland protection and non-agricultural land conversions in China. Aust. J. Agric. Resour. Econ. 2019, 59, 643–667.
  7. Siu, W.Y.; Li, M.; Caplan, A.J. Unintended effects of preferential tax assessment on farmland protection: Evidence from Utah’s farmland assessment act. J. Agric. Appl. Econ. Assoc. 2023, 2, 737–752.
  8. Avelino, A.; Baylis, K.; Honey-Rosés, J. Goldilocks and the raster grid: Selecting scale when evaluating conservation programs. PLoS ONE 2016, 11, e0167945.
  9. Blackman, A.; Leguízamo, E.; Villalobos, L. Points, cells, or polygons? On the choice of spatial units in forest conservation policy impact evaluation. Environ. Res. Lett. 2024, 19, 054046.
  10. Irwin, E.G.; Geoghegan, J. Theory, data, methods: Developing spatially explicit economic models of land use change. Agric. Ecosyst. Environ. 2001, 85, 7–24.
  11. Utah State Legislature. Farmland Assessment Act; Utah State Legislature: Salt Lake City, UT, USA, 1987. Available online: https://le.utah.gov/xcode/Title59/Chapter2/59-2-P5.html (accessed on 11 October 2023).
  12. Salt Lake County Assessor. Parcel Data; Salt Lake County Assessor: Salt Lake City, UT, USA, 2019. Available online: https://slco.org/assessor/parcel-data/ (accessed on 11 October 2023).
  13. USDA-NASS. USDA National Agricultural Statistics Service Cropland Data Layer; USDA-NASS: Washington, DC, USA, 2019. Available online: https://www.nass.usda.gov/Research_and_Science/Cropland/SARS1a.php (accessed on 1 September 2024).
  14. Didan, K. MOD13Q1 MODIS/Terra Vegetation Indices 16-Day L3 Global 250m SIN Grid V006; NASA EOSDIS Land Processes Distributed Active Archive Center: Sioux Falls, SD, USA, 2015.
Figure 1. Greenbelt program enrollment criteria and Greenbelt designation.
Figure 2. The correlation of the drivers of farmland development. Note: “***” if the p-value is <0.001, “**” if the p-value is <0.01, “*” if the p-value is <0.05, “.” if the p-value is <0.10, and “ ” otherwise.
Figure 3. The workflow diagram for the EVI.elgb variable. Note: Square boxes represent inputs, blue circles represent processes, and the yellow box represents output.
Table 1. Name, type, and description of columns in the tabular data.

| Column Label | Description | Type | Sources |
| --- | --- | --- | --- |
| ID | A unique parcel identifier | Categorical | Authors’ Calculation |
| Parcel.ID08 | Parcel identifier that corresponds to the 2008 shapefile boundary | Categorical | Salt Lake County Assessor Office |
| OwnID | Unique landowner identifier | Categorical | Authors’ Calculation |
| Year | Year of the data | Categorical | Salt Lake County Assessor Office |
| GB | Indicates whether a parcel is enrolled in the Greenbelt tax break program | Dummy | Salt Lake County Assessor Office |
| NoChange | Indicates whether there are any changes to the boundary between 2010 and 2018 | Dummy | Authors’ Calculation |
| Zip | ZIP Code (a system of postal codes used by the US Postal Service) | Categorical | Automated Geographic Reference Center |
| Area | Size of land in acres | Continuous | Authors’ Calculation |
| AgAcre | Size of the agricultural land in acres | Continuous | Authors’ Calculation |
| Ag | Number of 30 by 30 m pixels that are agricultural land | Continuous | USDA National Agricultural Statistics Service Cropland Data Layer |
| G | Number of 30 by 30 m pixels that are grassland/rangeland | Continuous | USDA National Agricultural Statistics Service Cropland Data Layer |
| I | Number of 30 by 30 m pixels that are idle cropland | Continuous | USDA National Agricultural Statistics Service Cropland Data Layer |
| D.o | Number of 30 by 30 m pixels that changed from Ag, G, or I in the previous year to developed-open density in the current year | Continuous | USDA National Agricultural Statistics Service Cropland Data Layer |
| D.l | Number of 30 by 30 m pixels that changed from Ag, G, or I in the previous year to developed-low intensity in the current year | Continuous | USDA National Agricultural Statistics Service Cropland Data Layer |
| D.m | Number of 30 by 30 m pixels that changed from Ag, G, or I in the previous year to developed-medium intensity in the current year | Continuous | USDA National Agricultural Statistics Service Cropland Data Layer |
| D.h | Number of 30 by 30 m pixels that changed from Ag, G, or I in the previous year to developed-high intensity in the current year | Continuous | USDA National Agricultural Statistics Service Cropland Data Layer |
| O | Number of 30 by 30 m pixels that are developed-open density | Continuous | USDA National Agricultural Statistics Service Cropland Data Layer |
| L | Number of 30 by 30 m pixels that are developed-low density | Continuous | USDA National Agricultural Statistics Service Cropland Data Layer |
| M | Number of 30 by 30 m pixels that are developed-medium density | Continuous | USDA National Agricultural Statistics Service Cropland Data Layer |
| H | Number of 30 by 30 m pixels that are developed-high density | Continuous | USDA National Agricultural Statistics Service Cropland Data Layer |
| Total | Number of pixels in the parcel | Continuous | USDA National Agricultural Statistics Service Cropland Data Layer |
| EVI.elgb | Percent of the land that meets the productivity criterion | Continuous | Authors’ Calculation |
| Ag.elgb | Takes the value of one if the parcel was in agricultural use in the last two years and either the parcel is 5 acres or larger or the same owner has other eligible land | Dummy | Authors’ Calculation |
| AgUse | Takes the value of one if the parcel was in agricultural use in the last two years, based on whether any pixels within the parcel were Ag, G, or I in the previous two years | Dummy | Authors’ Calculation |
| Dist | Distance to the closest urban boundary in miles | Continuous | Authors’ Calculation |
| UFAA | Urban Farming Assessment Act (UFAA) indicator: takes the value of one if the parcel was in food crop production use in the last 2 years, the year is post-2013, and the food production acreage is larger than two acres but less than five; zero otherwise | Dummy | Authors’ Calculation |
| Y | Rate of change from agricultural use to development at the parcel level | Continuous | Authors’ Calculation |
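
As a reading aid for the column definitions in Table 1, the sketch below shows two things: how a count of 30 by 30 m pixels translates into acres, and one possible way to encode the Ag.elgb rule as described. It is not the authors' code. The helper names `pixels_to_acres` and `ag_elgb` are hypothetical, the sample rows are invented, and the reading of "other eligible land" (the same owner holds another parcel that is in agricultural use and is at least 5 acres) is an assumption. The dataset's own AgAcre and Ag.elgb columns may have been constructed differently.

```python
# Illustrative sketch only; not the authors' code.
SQ_M_PER_ACRE = 4046.8564224               # one international acre in square metres
PIXEL_ACRES = (30 * 30) / SQ_M_PER_ACRE    # one 30 m x 30 m pixel, about 0.2224 acres


def pixels_to_acres(n_pixels):
    """Convert a count of 30 m x 30 m pixels to acres."""
    return n_pixels * PIXEL_ACRES


def ag_elgb(parcels):
    """Return {ID: 0 or 1} following the Ag.elgb description in Table 1.

    Each parcel is a dict with keys ID, OwnID, Area (acres), and AgUse (0/1).
    Assumption: "other eligible land" means the same owner holds a different
    parcel that is in agricultural use and is at least 5 acres.
    """
    # Collect, per owner, the parcels that qualify on their own.
    qualifying_by_owner = {}
    for p in parcels:
        if p["AgUse"] == 1 and p["Area"] >= 5:
            qualifying_by_owner.setdefault(p["OwnID"], set()).add(p["ID"])

    flags = {}
    for p in parcels:
        owned = qualifying_by_owner.get(p["OwnID"], set())
        has_other = bool(owned - {p["ID"]})   # a qualifying parcel other than this one
        flags[p["ID"]] = int(p["AgUse"] == 1 and (p["Area"] >= 5 or has_other))
    return flags


# Hypothetical rows for a single year (values invented for illustration).
sample = [
    {"ID": "a", "OwnID": "o1", "Area": 6.0, "AgUse": 1},
    {"ID": "b", "OwnID": "o1", "Area": 2.0, "AgUse": 1},
    {"ID": "c", "OwnID": "o2", "Area": 2.0, "AgUse": 1},
    {"ID": "d", "OwnID": "o3", "Area": 9.0, "AgUse": 0},
]
flags = ag_elgb(sample)
```

In this toy example, parcel "b" is under 5 acres but is flagged because its owner also holds parcel "a", while parcel "c" is not flagged because its owner has no other qualifying land.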
Table 2. Summary statistics.

| Variable | N | Mean | St. Dev | Min | Max |
| --- | --- | --- | --- | --- | --- |
| Y (%) | 183,940 | 32.60 | 45.50 | 0 | 100 |
| GB (1 if yes) | 183,940 | 0.10 | 0.29 | 0 | 1 |
| Area (acre) | 183,940 | 14.46 | 53.13 | 0 | 2147 |
| AgAcre (acre) | 183,940 | 6.26 | 33.03 | 0 | 18,556 |
| Dist (mile) | 183,940 | 0.28 | 0 | 0 | 7.52 |
| EVI.elgb (%) | 183,940 | 26.90 | 37.20 | 0 | 100 |
| Ag.elgb (1 if yes) | 183,940 | 0.26 | 0.434 | 0 | 1 |

Note: Adapted from [7].
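
The columns of Table 2 (N, mean, standard deviation, minimum, maximum) are standard descriptive statistics. The sketch below shows how such a row could be computed for one variable. The function name `summarize` and the sample values are hypothetical. It uses the sample standard deviation, which is an assumption, since the source does not say whether the sample or the population formula was used.

```python
import statistics


def summarize(values):
    """Return N, mean, sample standard deviation, min, and max, skipping missing values."""
    vals = [v for v in values if v is not None]
    return {
        "N": len(vals),
        "Mean": statistics.mean(vals),
        "St. Dev": statistics.stdev(vals),   # sample SD (n - 1 denominator); an assumption
        "Min": min(vals),
        "Max": max(vals),
    }


# Hypothetical GB values for six parcel-year observations, one of them missing.
gb = [0, 0, 1, 0, 1, None]
row = summarize(gb)
```

Running the same function over each column of the tabular data would give a table with the same layout as Table 2.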
