Mapping the Potential for Biofuel Production on Marginal Lands: Differences in Definitions, Data and Models across Scales

Abstract: As energy policies mandate increases in bioenergy production, new research supports growing bioenergy feedstocks on marginal lands. Consequently, there has been an increase in published work that uses Geographic Information Systems (GIS) to map the availability of marginal land as a proxy for bioenergy crop potential. However, despite the similarity in stated intent among these works, a number of inconsistencies remain across studies that make comparisons and standardization difficult. We reviewed a collection of recent literature that mapped bioenergy potential on marginal lands at varying scales, and found that there is no common working definition of marginal land across these works. Specifically, we found considerable differences in mapped results that are driven by dissimilarities in definitions, model framework, data inputs, scale and treatment of uncertainty. Most papers reviewed here employed relatively simple GIS overlays of input criteria and distinct thresholds identifying marginal land, and gave few details describing accuracy and uncertainty. These differences are likely to be major impediments to the integration of studies mapping marginal lands for bioenergy production. We suggest that there is a continuing need for spatial modeling of bioenergy, and that further scholarship is needed to compare across countries and scales to understand the global potential for bioenergy crops.

studies based their estimates on a compilation of existing literature. Hoogwijk et al. [22] combined the estimates reported in literature published by Hall et al. [21] and Houghton et al. [23] for a total estimate of 430 to 580 Mha of degraded lands available for biomass production globally. More recently, in China two studies provided estimates of available land for biofuels based on the existing literature and government reports [9,24]. What these studies lack, however, is spatially explicit information on exactly where these lands can be found.
To visualize the potential spatial patterns of bioenergy production on marginal land, there has been an increase in published work that uses quantitative and spatially explicit models within Geographic Information Systems (GIS) to map the availability of marginal land for bioenergy crops. However, despite the similarity in stated goals among these works (mapping marginal land for bioenergy production), there remain a number of inconsistencies across studies that make comparisons and standardization difficult, and to date none of the current methods are widely accepted [25].
In this paper we shed light on the practical challenge of mapping bioenergy potential across scales by reviewing recent literature that uses geospatial technology to explicitly map marginal, abandoned or degraded lands specifically for the purpose of planting bioenergy crops. We reviewed papers that use geospatial techniques to spatially link placement of bioenergy production to marginal land. We used the Google Scholar database and citation tracing to identify papers on bioenergy and marginal lands published over a 20-year period from 1993 to 2013. Our search words included: bioenergy, biofuel, biomass, biodiesel, marginal land, abandoned land, degraded land, land, GIS, spatial, and scale. Articles were removed if they did not meet all of the following criteria: (1) target second generation bioenergy feedstocks; (2) focus on land that may be categorized as marginal; and (3) use spatially explicit (GIS) techniques. Papers that used mapped results from other studies were removed from the list (for example, Zhuang et al. [26] use the work published by Cai et al. [27] in their analysis of marginal land for algae production). Our search criteria narrowed our review to 21 papers from years 2008 to 2013. Of the studies reviewed here, five were conducted at a global scale, seven were national, eight were regional, and one was conducted at the local (city) scale (Table 1). The specific bioenergy crops targeted by the studies also varied. Some addressed bioenergy generally, others focused on single crops, and others concentrated on a suite of crops specific to the region. To highlight the diversity of methods for even one geographic region, we examined China as a comparative case study, in part because of the relatively large number of studies in this region meeting the criteria listed above.
Through this lens we developed a framework to evaluate the other work based on several factors: we first examined the various definitions of marginal land given, then investigated how each working definition is implemented in spatial models through model choice, data selection, scale, and treatment of uncertainty. Our goal was to provide a framework to help researchers evaluate existing marginal land scholarship, and to suggest possible changes that might be made to modeling protocols so that more realistic comparisons across projects can be made.
Campbell et al. [28] Global Abandoned agriculture lands "areas that have been abandoned to crop and pasture due to the relocation of agriculture and due to degradation from intensive use" bioenergy crops none (1) (2) none none
2008 Field et al. [29] Global Abandoned cropland, abandoned pastureland "land that was previously used for agriculture or pasture but that has been abandoned and not converted to forest or urban areas" biomass energy none (1) (2) none none
2011 Cai et al. [27] Semi-Global Marginal agricultural land "has low inherent productivity for agriculture, is susceptible to degradation, and is high-risk for agricultural production" 2nd generation biofuel feedstocks and LIHD (low input high diversity) perennial grasses
Odeh et al. [33] National (Australia) Marginal agricultural lands here the definition "is based on precipitation since the most limiting factor for dryland agricultural intensification is low water availability caused by low amount of precipitation. However, the marginal agricultural lands may be characterized by degraded soils, particularly the saline soils" Pongam (perennial tree) and Indian mustard (oilseed crop) (14) none none (2)
2011 Schweers et al. [34] National (China) Degraded and abandoned land "land degradation is a long-term loss of ecosystem function and services, not least production, caused by disturbances from which the system cannot recover unaided" generic bioenergy crops (6) (6) (5) (4) none
2011 Swinton et al. [13] National (USA) Marginal land non-crop land at "the extensive margin, where land quality is low enough that the value of biomass produced just covers its cost of production" cellulosic biofuels none (2) none none
2011 Zhuang et al. [26] National (China) Marginal land "land that has relatively poor natural condition but is able to grow energy plants, or land that currently is not used for agricultural production but can grow certain plants" 5 spp. bioenergy crops (13)

Defining Marginal Land
The term "marginal land" is currently so intertwined with discussions surrounding bioenergy that its definition might be assumed to be specific and certain. Yet the definition of marginal land has varied across time, space and discipline to meet multiple management goals [25,46]. The concept of marginal land made an appearance in the literature in the 19th century when David Ricardo identified variations in the desirability of land based on its proximity to essential resources, such as water supply or food markets [47]. Ricardo's land rent theory posited that certain land units have a comparatively higher value if located in proximity to essential resources. Following Ricardo, economist Johann Heinrich von Thünen elaborated on the theory of location value, and by the early 1900s an economic concept of marginal land had emerged [48]. Peterson and Galbraith [49] discussed marginal land in a theoretical framework, wherein rational land use decisions stem from and respond to changes on the "extensive margin," where revenue is equal to the costs of production. They also highlighted the importance of socio-economic factors, including cost of living, landholding size, accessibility to credit, and land tenure policies. This concept of the extensive margin has been well studied since [50-53] and is still recognized by the USDA as playing a role in land use change today [54].
More recently the definition of marginal land began to take on an explicit spatial characteristic. Beginning as early as the late 1960s, researchers used non-GIS forms of spatial overlay to map land capability [55]. By the 1980s several studies began to map physically marginal lands for the purpose of inventorying underproductive agricultural land, often with the goal of taking vulnerable and risky lands out of production [56]. These studies were largely based on locating soils with physical restrictions and production constraints. In 1988, a regional study in Minnesota was conducted to find "marginal agricultural land" to be targeted for a cropland set-aside program based on erodible soils and poor land productivity [57]. In 1990, Breuning-Madsen performed an overlay mapping assessment of wet and droughty soils and those on steep slopes marginal for agricultural use in Denmark to isolate areas of the country most economically vulnerable to low yields [58]. A decade later, researchers in Poland set out to map "marginal agricultural lands" and "less favored farming areas", respectively, based on biophysical characteristics, the latter as part of a mandatory inventory requirement before joining the European Union [59,60]. Such studies relied solely on biophysical definitions of marginal land, including poor soils, poor drainage, and steep slopes.
Currently, the term marginal land is often referenced within bioenergy research. This emergent conversation is evidenced by Google Scholar search statistics, which show that since 1993 there has been an increasing number of papers addressing marginal lands, biofuels, GIS and any combination of these. Within the literature, the initial jump may arguably be credited to Hall et al. [21], who in 1993 highlighted the potential benefits of biomass production to restore degraded lands. Interest in the topic expanded further after publication of a paper by Tilman et al. [7], and especially in 2008 with both the emergence of the food versus fuel debate [61-63] and the publication of the first studies to map the global extent of abandoned or degraded agricultural lands for bioenergy production [28,29]. However, as the subject becomes an increasing focus in the literature, the working definitions of marginal land become increasingly diverse, making comparisons between studies and standardization of estimates difficult.
Each of the papers reviewed here offers a definition of marginal land as it relates to bioenergy. However, there is a common distinction between the general definition of marginal land referred to in the introduction of each paper, and the working definition of marginal land as implemented in the methods by way of crop choice, input criteria and modeling framework. General definitions of marginal land are fairly generic by nature and are therefore relatively consistent across studies. The definition offered by Gelfand et al. [17] is a typical example, where they describe marginal lands as "those poorly suited for food crops because of low productivity due to inherent edaphic or climatic limitations or because they are located in areas that are vulnerable to erosion or other environmental risks when cultivated". The general agreement is that although these lands are unsuited for conventional agricultural crops, they are conversely well suited for bioenergy crops (barring major limitations like steep slopes that prevent mechanized farming). For example, Gopalakrishnan et al. [40] define marginal land as "marginal for conventional crops but not marginal for biofuel crops or other functions, based on economic, soil health, and environmental criteria". General definitions overwhelmingly describe lands that are not prime for conventional crops because they are high risk for economic payoff owing to low productivity resulting from climate or soil limitations.
In contrast, the working definitions of marginal land implemented in the papers differ considerably between studies. Working definitions of marginal land vary by nation and target crop, which in turn drive input criteria and modeling framework. Thus direct comparisons between published outputs are difficult. Moreover, few areas in the world have been mapped with multiple methods, which limits the opportunities for practical comparison. One exception is China, which is currently under enormous pressure from energy and food security policies not to grow bioenergy crops on agricultural land. For this reason, bioenergy potential in China has been mapped several times and in different ways. As a precursor to developing a framework for evaluating the larger body of work, it is useful and illustrative to examine how different studies map bioenergy potential on marginal land in China.

A Comparative Case Study: Mapping Marginal Lands for Bioenergy in China
Of the studies reviewed here, five reported results for the People's Republic of China (three at the national scale and two at the global scale). Each used different working definitions of marginal land: they focused on different crops, used different input criteria, models and assumptions, and all had different mapped results.
Lu et al. [35] mapped the national potential for growing Chinese Pistache (Pistacia chinensis) on marginal lands in China. Their general definition of marginal land is land "unsuitable for crop production, but ideal for the growth of energy plants with high stress resistance. These lands include barren mountains, barren lands and alkaline lands". They used a GIS overlay analysis to first target marginal land uses suitable for planting bioenergy crops, including sparse forest, natural grassland, and unused land (alkaline, bare, shoal/bottom lands). Then they characterized three levels of suitability for Chinese Pistache production based on eco-environmental requirements (temperature, precipitation, soil and slope), and modified their results with social, economic and environmental constraints, including sensitive and protected areas, national reserves, and cultivated lands. Before producing their final mapped results, they employed a minimum mapping unit of 200 ha. All data are referenced in Table 2. This method yielded 19.9 Mha of available land (2.08% of China's total land area).
Zhuang et al. [26] mapped the national potential for five bioenergy species, including Jerusalem artichoke (Helianthus tuberosus L.), Chinese Pistache, Chinese castor oil (Jatropha curcas L.), Cassava (Manihot esculenta), and Tung Tree (Vernicia fordii). They generally defined marginal land as "land that has relatively poor natural condition but is able to grow energy plants, or land that currently is not used for agricultural production but can grow certain plants". They used a GIS overlay analysis with binary thresholds that first identified marginal lands based on land use, terrain (slope < 25% and elevation specific to species), climate (specific to species) and soil (excluding sand, sandy gravelly, saline and alkalized soils, and accounting for soil depth). Then they modeled the optimum location on the identified marginal lands for each species based on the eco-environmental requirements. Land uses considered suitable for bioenergy included woodlands (shrub land, sparse forest land), grasslands and barren lands (including shoal/bottomland, saline and alkaline land, and bare land) (Table 2). The results suggested that 43.75 Mha were available for these five species (4.57% of China's total land area).
Schweers et al. [34] performed a national study in China examining the potential for general bioenergy production on degraded and abandoned land. They qualified that "land degradation is a long-term loss of ecosystem function and services, not least production, caused by disturbances from which the system cannot recover unaided". They used a GIS overlay analysis with multiple input criteria that defined degraded land as the loss of net primary productivity (NPP) between 1981 and 2003. They calculated abandoned land by subtracting the maximum percentage of grassland and cultivated land between 1700 and 2000 from the corresponding values in 2005, where negative change values indicated abandoned land. Lands suitable for bioenergy crops included land cover with mixed vegetation and cropland, land cover with mixed grassland, forest or shrub land, lands with closed to open shrub land, lands with closed to open herbaceous vegetation, and lands with sparse vegetation. Land covers unsuitable for bioenergy crops included croplands, forested areas, wetlands, urban areas, water, snow and ice, bare and undefined lands. They also excluded conservation areas, including protected areas identified by the World Database on Protected Areas (WDPA), areas of high biodiversity, areas with high percent organic carbon content, and steep slopes (Table 2). The authors also conducted ground verification in two regions, including GPS, photos, and interviews, with 60% of the verification points confirming the remote suitability assessment. This method yielded 39.1 Mha for bioenergy crops when conservation areas were not excluded, or 20.2 Mha when conservation areas were excluded (4.09% or 2.11% of China's total land area, respectively).
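The abandoned-land calculation described for Schweers et al. [34] can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the function name `abandoned_share`, the per-cell percentage inputs, and the sample values are all hypothetical.

```python
def abandoned_share(historical_pct, pct_2005):
    """Return the percentage points of a cell treated as abandoned.

    historical_pct: percentages of grassland plus cultivated land in the cell
                    for years between 1700 and 2000 (hypothetical inputs).
    pct_2005:       the same percentage for 2005.
    """
    change = pct_2005 - max(historical_pct)
    # A negative change means less land is in use than at the historical peak.
    return -change if change < 0 else 0.0


# Hypothetical cells: (historical percentages, 2005 percentage)
cells = {
    "cell_a": ([40.0, 65.0, 55.0], 50.0),  # below the historical peak of 65
    "cell_b": ([10.0, 20.0, 30.0], 35.0),  # above every historical value
}
abandoned = {name: abandoned_share(hist, now) for name, (hist, now) in cells.items()}
```

Under these assumptions only `cell_a` registers abandonment, because its 2005 value falls below its historical maximum.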
Milbrandt and Overend [31] mapped the potential for lignocellulosic biomass plants on marginal land in APEC countries, including China. They defined marginal land as land "characterized by poor climate, poor physical characteristics, or difficult cultivation. They include areas with limited rainfall, extreme temperatures, low quality soil, steep terrain, or other problems for agriculture". They employed a GIS overlay analysis that used linear combination with multiple input criteria using binary thresholds. The analysis focused primarily on land cover, soils and slope (Table 2). Bare and herbaceous areas (not in use or with only moderately intensive pastoralism) were targeted, and lands with intensive and extensive pastoralism were excluded. Lands with moderate and steep slopes and lands with soil problems (e.g., coarse, sandy, acidic) or shallow soils were considered marginal lands suitable for planting bioenergy crops. After marginal land was mapped based on the above criteria, they excluded protected lands, deserts, cold regions, ice/glacier areas, water features, forests, agricultural lands, and urban areas, as well as herbaceous and bare lands under intensive and extensive pastoralism. The analysis predicted that 51 Mha (5.34% of China's total land area) should be available for bioenergy production.
Finally, Cai et al. [27] modeled the potential for second-generation biofuel feedstocks and low-input high-diversity (LIHD) mixtures of native perennials for most of the globe on marginal agricultural land that "has low inherent productivity for agriculture, is susceptible to degradation, and is high risk for agricultural production". Compared with the four studies previously mentioned, the authors used a more complex model that employed Fuzzy Logic Modeling (FLM) and land cover exclusion. Their model identified marginally productive lands based on soil productivity, slope, and climate (soil temperature regime and humidity index) (Table 2). Input criteria were evaluated by applying a membership function to each criterion based on empirical knowledge, thereby converting quantitative values to qualitative ratings on the level of land productivity (low, marginal, or regular). Criteria were then aggregated into probabilities of land belonging to a category of land productivity, which yielded a final land productivity score. After the marginal lands were mapped, the authors developed four scenarios of marginal lands available for bioenergy crops that progressively included: (1) mixed crop and marginally productive natural vegetation; (2) marginally productive cropland; (3) marginally productive grasslands, savanna, and shrub lands (assumes regularly productive regions of these areas are excluded for pasture or for future crops); and (4) regular land that is used for mixed crop and vegetation and grassland, savanna, and shrub land with either regular or marginal productivity, while removing marginally productive pastureland possibly accounted for in the grassland class in scenario 3. For China specifically, mapped results yielded 52-213 Mha of marginal agricultural land, depending on scenario (5.43% to 22.26% of China's total land area).
In summary, despite similarity of intent, these five examples from China illustrate considerable differences in working definition, including input criteria, modeling framework, and mapped results. Mapped results ranged from 2.08% to 22.26% of China's total land area. All five projects used some representation of land cover, but each model used a different land cover dataset (Table 2). In addition, studies using nearly the same dataset (with different publication dates but identical thematic classes) differ in which thematic classes they considered marginal. For example, while both Zhuang et al. [26] and Lu et al. [35] initially targeted shrub land as a marginal land class, Lu et al. [35] ultimately excluded the shrub land category from their study, citing China's policies on forestry that state shrub land should not be modified for other purposes. Another differentiating factor regarding land cover is the inclusion or exclusion of existing cropland. Both Schweers et al. [34] and Cai et al. [27] included cropland in their analysis, while the others did not.
Even among studies mapping within the same country and using the same national data sources, there are both subtle and large differences between input criteria and parameters that impact mapped land area. In the present analysis, both Zhuang et al. [26] and Lu et al. [35] used nearly identical datasets (they provide one of the closest comparisons of the papers reviewed here), but have different mapped results: 43.75 and 19.9 Mha, respectively. This is in large part due to the fact that the former mapped marginal land for five bioenergy species and the latter mapped for only one species. However, differences may also be attributed to variances in both input data and parameters applied. For example, Lu et al. [35] used 2000 land use data from the RESDC, while Zhuang et al. [26] used land use data from the same source and at the same scale, but for a different year, 2008. In addition, Lu et al. [35] ultimately excluded shrub lands from their analysis, while Zhuang et al. [26] included them. The two studies also obtained their soil and terrain data from the same sources (RESDC for soil and SBSC for terrain), but reported two different mapping scales: 1 km and 1:100 k, respectively, for soil, and 1:250 k and 1:100 k, respectively, for terrain. When targeting marginal lands, Lu et al. [35] and Zhuang et al. [26] both included alkaline lands as mapped in the RESDC land use database. However, Zhuang et al. [26] then went on to exclude "seriously alkalized soil and saline soil" as reported in the RESDC soils database. Because the two studies mapped for different species, it is hard to say how influential these additional differences in input criteria are; however, it may be assumed that the mapped results would still not be identical.
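The area shares quoted in this case study can be cross-checked with simple arithmetic. The sketch below back-calculates the total land area of China implied by each reported pair of values; the area and percentage figures are those quoted above, while the function name `implied_total_mha` and the layout of the table are illustrative assumptions.

```python
# (study, reported area in Mha, reported share of China's land area in %)
reported = [
    ("Lu et al. [35]", 19.9, 2.08),
    ("Zhuang et al. [26]", 43.75, 4.57),
    ("Milbrandt and Overend [31]", 51.0, 5.34),
    ("Cai et al. [27], lowest scenario", 52.0, 5.43),
    ("Cai et al. [27], highest scenario", 213.0, 22.26),
]


def implied_total_mha(area_mha, share_pct):
    """Total land area (Mha) implied by a mapped area and its percentage share."""
    return area_mha / (share_pct / 100.0)


implied = {study: implied_total_mha(area, pct) for study, area, pct in reported}
for study, total in implied.items():
    print(f"{study}: implied total land area of about {total:.0f} Mha")
```

Each pair implies a total of roughly 955 to 958 Mha, so the reported shares appear to rest on a common reference area and can be compared directly.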

Mapping Potential for Bioenergy
These examples from China illustrate considerable differences in mapped results that were driven by differences in definitions, model framework, and data input. Next we evaluated the broader literature and further evaluated possibilities for commonalities. Our review included 21 papers from years 2008 to 2013; five were conducted at a global scale, seven were national, eight were regional, and one was conducted at the local (city) scale (Table 1).

Model Selection
Geographical Information Systems (GIS) are a powerful set of tools for site suitability models that incorporate numerous input datasets [64,65]. GIS-based site suitability analyses are increasingly used to identify potential locations for integrating renewable energies into the landscape [66,67]. The majority of these analyses are based in Multi-Criteria Evaluation (MCE), the underlying principle of which is to synthesize complex problems by examining the coincidence of factors among multiple spatially co-registered variables [68].
GIS suitability analyses can range from very simple GIS linear overlays with binary thresholds, to very complex models that incorporate binned thresholds, weights, standardization, fuzzy logic, or all of the above. The oldest and most commonly used form of GIS-based land suitability analysis relies on linear combinations of spatially referenced input criteria, also known as "Boolean" overlays. This methodology has a long history [69] and has the benefit of being transparent and easy to follow. These models typically result in distinct thresholds of suitability, i.e., a parcel of land is either suitable for planting bioenergy crops or is not. The resulting suitability maps are often further restricted by the application of constraints. For example, marginal lands otherwise potentially suitable for bioenergy crops are modified by the elimination of unsuitable land (e.g., urban areas or water). Another category of geospatial land suitability models comprises those that incorporate Weighted Linear Combination (WLC) [68]. Models that use WLC standardize each criterion layer, which can then be weighted based on its importance as decided by selected stakeholders and/or experts in the field. Therefore, when the layers are combined, the low suitability of one criterion can be compensated by the high score of another. In contrast to Boolean and WLC models, which often employ distinct thresholds of suitability, models that incorporate fuzzy set theory assign continuous grades of suitability [70,71]. In a fuzzy set, the concept of suitability, or "membership", is not definitive because all objects belong to the suitability set in varying degrees. Using membership transformation functions, input criteria are given standardized "fuzzy" membership values, which vary continuously between 0 and 1. Values approaching 1 are considered more suitable and values approaching 0 are considered less suitable; sites with a value of 0 are definitely not suitable and sites with a value of 1 definitely are suitable. While set thresholds can also be employed in fuzzy models to define where the 0 and 1 values begin, this approach more realistically incorporates the continuous nature of biophysical and economic variables [68]. For both WLC and fuzzy models, Boolean exclusion criteria can also be employed to further limit the results to feasible areas.
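The contrast between a Boolean overlay, a fuzzy membership function, and a WLC score can be illustrated for a single map cell. The sketch below is a simplified illustration rather than the procedure of any reviewed study: the function names, the linear membership shape, the precipitation breakpoints, and the weights are all assumptions; only the 25% slope threshold echoes values reported above for China.

```python
def boolean_overlay(slope_pct, soil_ok, max_slope=25.0):
    """Boolean overlay: a cell passes only if every criterion passes."""
    return slope_pct < max_slope and soil_ok


def linear_membership(value, zero_at, one_at):
    """Fuzzy membership changing linearly from 0 at zero_at to 1 at one_at."""
    t = (value - zero_at) / (one_at - zero_at)
    return max(0.0, min(1.0, t))  # clamp to the 0..1 membership range


def weighted_linear_combination(scores, weights):
    """WLC: weighted average of criterion scores standardized to 0..1."""
    total = sum(weights.values())
    return sum(scores[name] * weights[name] for name in weights) / total


# Hypothetical cell: 20% slope, 450 mm annual precipitation, usable soil
cell = {"slope_pct": 20.0, "precip_mm": 450.0, "soil_ok": True}
scores = {
    # Steeper slopes score lower: 0 at 25% slope, 1 at 5% slope (assumed)
    "slope": linear_membership(cell["slope_pct"], zero_at=25.0, one_at=5.0),
    # Wetter cells score higher: 0 at 300 mm, 1 at 600 mm (assumed)
    "precip": linear_membership(cell["precip_mm"], zero_at=300.0, one_at=600.0),
}
weights = {"slope": 0.4, "precip": 0.6}  # assumed stakeholder weights

is_suitable = boolean_overlay(cell["slope_pct"], cell["soil_ok"])
wlc_score = weighted_linear_combination(scores, weights)
```

In this hypothetical cell the Boolean overlay returns a plain yes, while the WLC score shows the compensation effect: the weak slope score is partly offset by the stronger precipitation score.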
The majority of the studies reviewed here used relatively simple GIS suitability models that relied on linear combinations of spatially referenced input criteria with distinct thresholds, i.e., "Boolean" overlays. The primary difference between the studies that used the linear combination overlay method is simply the succession in which input criteria were added to the model. For example, five of the twenty-one studies reviewed here first mapped for species suitability based on biophysical crop requirements, then mapped for land availability based on marginality [32,35,38,39]. Three other studies first mapped marginal land, and then mapped the potential suitability of that land for specific bioenergy species [26,33,37]. The eight studies that mapped for generic bioenergy species first mapped marginal lands, then employed land use constraints, or masks, where bioenergy crops should not be planted [27-31,34,36,45]. Finally, three studies first targeted marginal lands for generic bioenergy species based on multiple input criteria, but employed no land cover constraints [13,40,44]. For example, one of these studies performed a "hierarchical" GIS overlay, mapping in succession lands that were considered physically marginal (using slope, rock fragment, bedrock depth, flooding, and ponding), biologically marginal (using temperature, moisture, soil erosion, soil depth, sand content, production, CEC, EC, sodicity, pH, drainage, water table, and soil restriction), environmentally-ecologically marginal (using soil organic carbon trend, slope, erosion, wetland), and finally lands that were economically marginal (from breakeven price or yield). In this way they targeted marginal lands by defining what they are considered to be, but no specific land cover constraints were employed.
The remaining three studies reviewed here used more complex methods for mapping marginal land. Tenerelli and Carver [43] combined a suite of GIS suitability methods. After identifying bioenergy crop types (perennial grasses, short rotation coppice and short rotation forestry) and ecological requirements based on crop typology, they both standardized and weighted input criteria. At the same time they excluded constraints (e.g., built-up areas, natural habitats, land with high ecological value, and highly productive agricultural land) based on a binary evaluation. These variables (the standardized and weighted criteria, the binary constraints, and the uncertainty and sensitivity analyses) were then used to generate the land capability index specifically for the bioenergy crops being studied. This tailored land capability index was then combined with an existing land capability classification to derive the final land allocation for bioenergy crops.
Only two studies reviewed here incorporated fuzzy set theory into their suitability models. Wu et al. [39] used the FAO's Agro-Ecological Zone (AEZ) method to map marginal land for Jatropha, and for this the authors employed a fuzzy membership approach. First they sorted three soil quality indicators into five sequential bins. Then, based on the crop's soil quality requirements, they identified the most suitable range of each soil quality factor, with a fuzzy membership equal to 1.0, and a transition zone with membership values between 0.5 and 1.0. They then aggregated the degree of membership over the three soil quality factors into one index with three categories of land (suitable, moderately suitable, and unsuitable for Jatropha plantation). Cai et al. [27] delved much deeper into fuzzy set theory for their analysis. As described above, the authors first identified marginally productive land using a fuzzy logic model based on the soil rating for plant growth (SRPG) index. The model applied a fuzzy membership to input criteria based on empirical or expert knowledge, thereby converting quantitative land productivity values into qualitative ratings. Criteria were then aggregated by fuzzy rule inference to determine the probabilities of land belonging to a category of land productivity, and a final land productivity index was generated. Steps that were based on empirical or expert knowledge were then iteratively calibrated through a learning process that incorporated existing land use. Finally they overlaid mapped marginally productive land with existing land cover and designed four land availability scenarios for bioenergy crops.
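A membership scheme of the kind described for Wu et al. [39] might look like the following sketch. It is not their implementation: the linear interpolation in the transition zone, the use of the minimum as the aggregation rule, the category cut-offs, and all soil factor names and ranges are assumptions made for illustration.

```python
def membership(value, optimal, transition):
    """Membership of 1.0 inside the optimal range, falling linearly to 0.5
    at the outer edge of the transition zone, and 0.0 beyond it.

    optimal and transition are (low, high) tuples, with the transition
    range enclosing the optimal range.
    """
    opt_lo, opt_hi = optimal
    tr_lo, tr_hi = transition
    if opt_lo <= value <= opt_hi:
        return 1.0
    if tr_lo <= value < opt_lo:
        return 0.5 + 0.5 * (value - tr_lo) / (opt_lo - tr_lo)
    if opt_hi < value <= tr_hi:
        return 0.5 + 0.5 * (tr_hi - value) / (tr_hi - opt_hi)
    return 0.0


def classify(memberships):
    """Aggregate factor memberships into one of three land categories."""
    index = min(memberships)  # assumed rule: the most limiting factor decides
    if index >= 1.0:
        return "suitable"
    if index >= 0.5:
        return "moderately suitable"
    return "unsuitable"


# Hypothetical soil quality factors: value, optimal range, transition range
factors = {
    "ph": (6.5, (6.0, 7.5), (5.0, 8.5)),
    "depth_cm": (40.0, (50.0, 200.0), (30.0, 200.0)),
    "organic_matter_pct": (1.2, (1.0, 5.0), (0.5, 8.0)),
}
grades = {name: membership(v, opt, tr) for name, (v, opt, tr) in factors.items()}
category = classify(grades.values())
```

Here the shallow soil depth falls in the transition zone and therefore limits the cell to the middle category, even though the other two factors are optimal.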
One modeling step commonly employed among the studies reviewed here, and in GIS-based land suitability analyses in general, was the implementation of land cover constraints, or restrictions. However, not all restriction criteria are alike. Restrictions that eliminate an area from the analysis can be classified as either "hard" or "soft" [67,72]. Hard restrictions are land covers like urban areas and ice/snow, where it can generally be agreed that large bioenergy plantations cannot be planted. For this reason, hard restrictions were generally consistent among the studies reviewed here. This is in contrast to soft restrictions, which may include croplands, pasturelands, and shrub lands: lands where there may be legitimate reasons that bioenergy crops should not be planted, but where planting remains feasible. For example, Lovett et al. [32] first mapped nine absolute factors that precluded any opportunity for energy crop planting, then mapped two secondary factors, where planting perennial biomass crops would not be encouraged, but also not necessarily excluded. Soft restrictions may be modified over time and can also vary regionally (for example, shrub lands may be considered a hard restriction in China, but a soft restriction elsewhere), which makes them harder to standardize, especially at international and global scales [67].
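The distinction between hard and soft restrictions, and its regional variation, can be expressed as a small lookup. The sketch below is illustrative only: the land cover class names, the region keys, and the three status labels are assumptions, and the regional table encodes just the shrub land example given above.

```python
# Land covers assumed to be hard restrictions everywhere
HARD_EVERYWHERE = frozenset({"urban", "ice_snow", "water"})

# Additional hard restrictions by region (hypothetical policy table)
REGIONAL_HARD = {"china": frozenset({"shrubland"})}

# Land covers where planting is feasible but may be discouraged
SOFT = frozenset({"cropland", "pasture", "shrubland"})


def restriction_status(land_cover, region):
    """Return 'excluded', 'discouraged', or 'available' for a land cover class."""
    regional_hard = REGIONAL_HARD.get(region, frozenset())
    if land_cover in HARD_EVERYWHERE or land_cover in regional_hard:
        return "excluded"      # hard restriction: removed from the analysis
    if land_cover in SOFT:
        return "discouraged"   # soft restriction: flagged, not removed
    return "available"


statuses = {
    ("shrubland", "china"): restriction_status("shrubland", "china"),
    ("shrubland", "elsewhere"): restriction_status("shrubland", "elsewhere"),
}
```

Keeping soft restrictions as a flag rather than a mask leaves the choice of whether to exclude them to each scenario, which is one way mapped results could be made easier to compare across regions.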

Biophysical Data Inputs
Of the studies reviewed here, soil, land cover and topography (i.e., slope) were the most common input criteria used. Of these, land cover was the most frequently used input criterion (used by 18 out of 21 studies), and was most often employed as a masking or elimination factor in the analysis. For example, it was generally assumed that bioenergy crops will not be grown on lands classified as water, ice or snow. Soil variables were also a common input factor, with 17 out of 21 of the studies reviewed here using at least one soil variable as an input criterion. For instance, Milbrandt and Overend [31] used 10 soil variables to target marginal lands potentially suitable for bioenergy crops, including soil texture, fertility, pH, etc. Topography (particularly slope) was the third most considered factor, with more than half (14 of 21) of the studies using slope as an input criterion defining land marginality. Primarily, slopes too steep for planting bioenergy crops were excluded; for example, three studies mapping marginal land in China excluded slopes > 25% [26,34,35].
Among the 18 studies that used land cover as an input criterion, 13 different land cover datasets were used, and among the 17 studies to use soil, 16 different soil datasets were used (Table 3). Moreover, for “current” land cover datasets, publication dates span over a decade, ranging from the “end of last millennium” (released 1997) (IGBP DisCover) to 2008 (CDL).

Socio-Economic Data Inputs
Several studies include what they term “socio-economic” factors, usually used as constraints in the model; however, these data were often inconsistently utilized. In general, “socio-economic” criteria consisted of varying aspects of land use/land cover, as decided by the scope and region of the study.
For example, Lovett et al. [32] incorporated the following “socio-economic” factors into their multi-criteria suitability analysis: urban boundaries and cultural heritage sites comprising doorstep greens, millennium greens, historic parks and gardens, monuments, registered battlefields, and world heritage sites. Wu et al. [39] also incorporated “social-economic constraints” by excluding all but 2% of land used for food production and all but 50% of barren, grass and open forestlands from their mapped results. The only socio-economic constraint used by Lu et al. [35] was shrub land, citing that this land use cannot be used for other purposes based on Chinese policy. In short, the socio-economic criteria employed were subjective and non-standardized. Despite the common use of the term “socio-economic”, these factors may be better categorized as land cover/land use criteria.

Other Data
A small number of papers used other input criteria that were less common between studies. For example, polluted areas were considered in two of the papers reviewed here. Fahd et al. [37] point out that some lands become marginal when excess pollution is generated by human-dominated processes (including illegal disposal of liquid and solid waste); they therefore used polluted areas as a primary input into their overlay model. In the United States, Gopalakrishnan et al. [40] included in their map of marginal land what they termed “environmentally degraded land”, including land with brownfield sites, areas with water contamination, and areas with excessive irrigation.
Other less common input layers included highly localized datasets, such as roadways and riparian corridors, which were mapped by Gopalakrishnan et al. [40], but only qualitatively mentioned elsewhere [44]. Projected changes in climate were infrequently used. Odeh et al. [33] mapped marginal land in Australia at first by isolating annual precipitation to 300-600 mm/yr, and then made adjustments for climate change based on six emissions scenarios.

Choice of Data Threshold
Even amongst studies that used the same datasets for input criteria and the exact same thematic classes, differences can arise when setting the exact thresholds that determine marginality. Topography (slope) is a prime example of how a threshold decided on by the authors or a “panel of experts” can vary widely between studies and influence mapped results. In the studies reviewed here, primarily slopes too steep for planting were excluded from the analysis; however, exact thresholds can be subjective and therefore results differ considerably. For example, in their working definition of marginal lands, Lovett et al. [32] excluded slopes > 15%, while Gopalakrishnan et al. [40] included slopes > 15%. Milbrandt and Overend [31] included slopes between 8% and 30%. Fiorese and Guariso [38] and Gelfand et al. [17] excluded slopes > 20%, and three studies in China, including Schweers et al. [34], Zhuang et al. [26] and Lu et al. [35], excluded slopes > 25%. Studies with numerous levels of marginality had multiple thresholds for slopes, including Kang et al. [44], who used a physical definition of marginal lands that excluded slopes > 30%, and an environmental-ecological definition that excluded slopes < 8%. Cai et al. [27] identified eight slope classes for their fuzzy analysis.
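The effect of the threshold choice alone can be shown with a short sketch that applies the maximum-slope cut-offs cited above (15%, 20%, 25% and 30%) to one set of cells. The ten slope values are invented for illustration and do not come from any dataset used in the reviewed studies.

```python
def retained_fraction(slopes_pct, max_slope_pct):
    """Fraction of cells kept when slopes steeper than the threshold are excluded."""
    kept = sum(1 for s in slopes_pct if s <= max_slope_pct)
    return kept / len(slopes_pct)


# Hypothetical slope values (percent) for ten cells
slopes = [2, 5, 9, 12, 16, 18, 22, 24, 27, 35]

for threshold in (15, 20, 25, 30):
    share = retained_fraction(slopes, threshold)
    print(f"exclude slopes > {threshold}%: {share:.0%} of cells retained")
```

With identical input data, the retained share in this example more than doubles between the strictest and the most permissive threshold.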

Choice of Thematic Class
Considering data layers of the same subject and scale, there were often differences across studies in the thematic classes used in the analysis. The majority of studies used soil (17 out of 21) and land cover (18 out of 21) as input criteria to determine marginal land, but in non-standardized ways. Certain land cover datasets included less common classifications specific to those datasets, which makes comparisons problematic. For instance, both Tenerelli and Carver [43] and Lovett et al. [32] used the Land Cover Map (LCM) 2000 (Table 3), but Lovett et al. [32] used only the LCM2000 grassland classification, whereas Tenerelli and Carver [43] used land cover classes exclusive to the LCM2000, including arable cereals, horticulture, improved grassland, and set-aside grassland [73]. An Italian study by Fahd et al. [37] used the CORINE (Table 3) land cover dataset to isolate non-irrigated arable lands. In the United States, Gopalakrishnan et al. [40] used the 56 m resolution 2007 USDA Cropland Data Layer (CDL) (Table 3) and targeted the thematic category “idle and fallow cropland” for their model, which they assumed includes Conservation Reserve Program (CRP) lands (i.e., set-aside lands). In sum, the categories listed above were not commonly available in all land cover datasets, making cross comparison between studies difficult.
One of the most common soil-related input criteria in the studies reviewed here was land capability class (LCC). In the U.S., the Natural Resources Conservation Service (NRCS) describes LCC as “a system of grouping soils primarily on the basis of their capability to produce common cultivated crops and pasture plants without deteriorating over a long period of time” [74]. Several adaptations of this system have been used worldwide [55,75]. Across datasets, LCC is derived from a combination of criteria, including erosion risk, soil depth, wetness, slope and climate [76]. Low LCC values designate prime agricultural land with few restrictions and higher values represent lands with increasing restrictions for agricultural production. In the studies reviewed here, LCC was most often used as a restrictive layer to exclude prime agricultural land from the analysis; however, varying numeric classes were implemented. For example, in the U.K. Lovett et al. [32] restricted marginal land to LCC levels 3 and 4 from the Agricultural Land Classification (ALC), in Canada Liu et al. [36] restricted their analysis to Land Suitability Rating System (LSRS) classes 4 through 6, in the U.S. Gelfand et al. [17] restricted marginal lands to NRCS LCC levels 5 through 7, and in Italy, Tenerelli and Carver [43] restricted their final mapped result to predetermined Agricultural Land Capability Map classes 3 through 5. In sum, none of the studies that implemented LCC into their model used the same thematic classes.

Issues of Data Representation
Finally, many of the GIS suitability models reviewed here assumed that input criteria exhibit crisp boundaries between what is marginal land and what is not. However, as in the case of physical variables like slope, the reality is often more continuous. Sometimes uncertainty in the thresholds delineating marginal lands can be addressed in the method of analysis by mapping input criteria in a continuous manner. For example, Tenerelli and Carver [43] standardized their input criteria according to their compatibility with the ecological requirements of the crops. Based on expert knowledge, numeric input criteria (e.g., degree days, pH, rainfall and slope) were standardized using continuous benefit or cost functions, and nominal input criteria (e.g., qualitative values of soil depth and soil texture) were standardized using a ranking approach. Instead of binary thresholds, these methods employed more continuous grades of marginality that more accurately resemble the continuous nature of the input data.
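One way to picture this kind of standardization is the sketch below. It assumes simple linear benefit and cost functions clipped to the range 0 to 1, and an evenly spaced ranking for nominal classes. The functional forms, ranges and class orderings are illustrative assumptions, not those reported by Tenerelli and Carver [43].

```python
def benefit(x, lo, hi):
    """Linear benefit function: 0 at or below `lo`, 1 at or above `hi`."""
    return min(1.0, max(0.0, (x - lo) / (hi - lo)))


def cost(x, lo, hi):
    """Linear cost function: higher raw values give lower scores."""
    return 1.0 - benefit(x, lo, hi)


def rank_score(label, ordered_labels):
    """Score a nominal class by its position in a worst-to-best ordering."""
    return ordered_labels.index(label) / (len(ordered_labels) - 1)


# Hypothetical cell: rainfall in mm/yr, slope in percent, nominal soil depth
scores = {
    "rainfall": benefit(600, 300, 900),
    "slope": cost(10, 0, 40),
    "soil_depth": rank_score("medium", ["shallow", "medium", "deep"]),
}
print(scores)
```

Each criterion ends up on a common 0 to 1 scale, so a cell just past a nominal threshold receives a slightly lower score instead of being dropped outright.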

Issues of Scale
The papers reviewed here present work from global, national, regional and local scales. As is common with most GIS models, it is clear that the scale of analysis impacts the way in which the working definition of marginal land is structured, as it impacts model choice, data availability and selection, and resolution. For example, available datasets for national studies ranged in spatial resolution from 56 m to 1 km, while global datasets varied in resolution from 300 m to 5 arc-minutes. In addition, different modes of scale were reported, including ratio (1:100 k), pixel size (1 km), and spatial resolution (30 arc-seconds), making direct comparisons between studies even more difficult.
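To compare resolutions reported in angular units with those reported as pixel sizes, an approximate conversion can be sketched as follows. The constant of about 111.32 km per degree is an approximation for the equator, and the cosine correction applies only to the east-west width of a cell; both are simplifying assumptions, and map scale ratios such as 1:100 k cannot be converted to a pixel size in this way.

```python
import math

KM_PER_DEGREE = 111.32  # approximate length of one degree at the equator


def arcsec_to_km(arcsec, latitude_deg=0.0):
    """Approximate east-west ground width (km) of a cell given in arc-seconds."""
    degrees = arcsec / 3600.0
    return degrees * KM_PER_DEGREE * math.cos(math.radians(latitude_deg))


print(f"30 arc-seconds at the equator: {arcsec_to_km(30):.2f} km")
print(f"5 arc-minutes at the equator: {arcsec_to_km(5 * 60):.2f} km")
print(f"5 arc-minutes at 45 degrees latitude: {arcsec_to_km(5 * 60, 45):.2f} km")
```

On this approximation a 30 arc-second dataset is close to the 1 km pixel size quoted for other products, while a 5 arc-minute dataset is roughly ten times coarser.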
In general, global datasets are available at a relatively coarse resolution and are also often outdated, sometimes produced over a decade before the study using the data was conducted. The largest differences in land cover datasets were apparent in the global studies examined, where four global land cover datasets were used among the five global studies reviewed here. The analysis by Cai et al. [27] used the International Geosphere-Biosphere Programme (IGBP) land cover dataset released in 1997 [77,78], which is available at 30 arc-seconds. Both Field et al. and Campbell et al. [28,29] used both past land use and current land use (1700 to 2000) from the HYDE database at 5 arc-minutes [79] to map abandoned pasture and cropland, along with a 2004 MODIS land cover product to mask current unsuitable land covers. Nijsen et al. [30] used the Global Land Cover 2000 dataset (GLC2000) with a 1-km resolution [80]. Milbrandt and Overend [31] used 2000 GAEZ land cover data available at 5 arc-minutes. In sum, of the “current” global land cover datasets used in the studies reviewed here, one was released in 1997, two have a publication date of 2000, and studies that used MODIS datasets had varied dates including 2004, 2006 and 2008. These examples illustrate the challenge global modeling has in acquiring timely data.
The other drawback of global studies is the coarse resolution of the datasets. Nijsen et al. [30] attempted to address this issue by downscaling the coarse GLASOD soils database (a database of human-induced soil degradation from 1945 to 1990) from a minimum mapping unit (MMU) of 5652 km² to a 5 arc-minute scale, using several spatially explicit databases with finer resolution. Other studies relied on currently available soil datasets, yet no two global studies reviewed here used datasets from the same source. Cai et al. [27] used two datasets: the Harmonized World Soil Database (HWSD), available at 30 arc-seconds, to get a soil productivity rating for each pixel, and 16 indices on soil temperature regime (STR) available from USDA-NRCS. Milbrandt and Overend [31] used the GAEZ soils dataset with a resolution of 5 arc-minutes.
Global studies were not alone in these differences. There were also international differences between national datasets, as methodologies to generate national datasets vary by country. For instance, of the papers discussed here, many national studies used national-level soils datasets. Lovett et al. [32] used the NatMap 1000 database for their study in the United Kingdom; Odeh et al. [33] used the CISIRZ dataset available for Australia, and both Lu et al. [35] and Zhuang et al. [26] used the Soil Map of China (RESDC) (Table 2).
Only one study reviewed here was conducted at a local scale. For their city-level analysis in Pittsburgh, Niblick et al. [45] used very localized datasets, such as commercial zoning at the parcel level. Their analysis also employed a greenways feature class, which included agricultural easements, forested floodplains, designated greenways, land trust properties, rivers buffered by 100 ft., conservation streams buffered by 50 ft., sensitive slopes, wetlands 1 acre or more buffered by 50 ft., golf courses, parks and trails. This type of dataset better represents real world limitations to bioenergy plantations; however, such data are not available for global-level analyses.

Issues of Uncertainty
A complete accounting of error in any GIS analysis is important, yet not often completed. Issues of accuracy, error propagation and sensitivity are especially important because proponents of bioenergy production often cite these studies, more often quoting the upper bound rather than the lower bound of the mapped results [81]. To best incorporate error and uncertainty in land suitability analysis, four components should be examined: (1) accuracy of input criteria; (2) validation of final results; (3) error propagation, or the compounding of errors in input datasets through the model; and (4) sensitivity of the model outputs to inputs [82-85]. The first two, accuracy and validation, are more widely appreciated in the literature. The last two, uncertainty analysis and sensitivity analysis, are increasingly used to evaluate the effect of error propagation and model uncertainties, as well as the relative importance of sources of uncertainty [83,84,86,87]. These methods can help decision-makers evaluate the utility of input data, as well as the risk of assuming a particular modeled scenario [83]. We reviewed our target papers for their attention to these issues. At least eight studies reviewed here failed to mention accuracy or uncertainty at all [13,17,26,35,36,38,39,41]. As these studies mapped crops that have yet to be planted, this is not surprising. However, there are extensive protocols developed in the GIScience discipline that outline methods of evaluating the quality of input criteria and understanding possible error propagation and model sensitivity [82,84,88].
Though most studies reviewed here do not cite the accuracy of their input datasets, metadata reveal that the reported accuracy of land cover datasets ranged from 67.1% (GlobCover) to 87% (CORINE) (Table 2). The accuracy of land cover datasets can have profound impacts on mapped results. Fritz and See [89,90] highlighted thematic uncertainty as well as spatial uncertainty in global land cover maps. When comparing GlobCover to MODIS land cover data, both input criteria used in the studies reviewed here, they found the combined forest and cropland disagreement to be 893 Mha. As Field et al. [29] pointed out, the MODIS land cover dataset does not distinguish between grassland and pasture, a potentially important distinction for planting bioenergy crops.
Regarding final mapped results, uncertainties were often addressed only qualitatively in the paper's discussion or conclusion. For example, Field et al. [29] qualified the uncertainty in their results, saying that while “the regional distribution of agriculture and pastures is relatively certain, the uncertainty for this abandoned area estimate is substantial (probably ± 50% or more)”. Likewise, Gopalakrishnan et al. [40] qualitatively acknowledged that uncertainty arises from classification of marginal land and from using data layers at varying scales (scales of input data used in their analysis range from 10 m road and riparian buffers to soils data with a minimum mapping unit of 617 ha). Owing to the coarse resolution of inputs, some global-level analyses, including Field et al. [29] and Campbell et al. [28], added disclaimers to their mapped results, suggesting that general estimates of spatial distribution should not be prescribed at the local level.
Cai et al. [27] assert that concerns of uncertainty are inherently addressed in their methods, saying “FLM (Fuzzy Logic Modeling) is used to treat the uncertainty of the global data sets and the fuzzy nature inherent in land classification according to multiple criteria. FLM has been proven to be a powerful tool to address data variability, imprecision, and uncertainty and to treat the ambiguity and uncertainty involved in generating realistic continuous classifications”. However, global-level analyses such as this one still face a difficult obstacle when their results are compared to local-level analyses. In a recent study, mapped results from Cai et al. [27] were calibrated by Fritz et al. [81], who downgraded the 2011 estimates based on statistical adjustments derived from crowdsourcing Google Earth images. Their estimates reduced available land area by between 264 and 376 Mha, depending on scenario, which might suggest that global studies overestimate land available for bioenergy production.
One study reviewed here verified its results with existing datasets that may be proxies for marginal lands. The authors compared their results to existing Conservation Reserve Program (CRP) lands, as well as Land Capability Classes 5 through 8, arguing that these designations have in the past been used to quantify marginal land. They argued that their hierarchical analysis was more comprehensive than just using LCC or CRP land alone, in part because their results showed more area being mapped. Another study by Schweers et al. [34] conducted ground surveys to verify their mapped results in two regions, using GPS data, photos, interviews, and government data. They found that only 60% of the ground verification sites (out of 19 locations overall) totally agreed with the remotely sensed assessment. They also reported that using 2005 land cover data did not capture more recent land use changes as observed in their 2009 field survey, and that the resolution of the DEM (30 arc-seconds) used to derive slope was insufficient to reveal subtle nuances of the landscape.
Only one study quantitatively addressed both uncertainty and sensitivity. Tenerelli and Carver [43] conducted an extensive sensitivity and uncertainty analysis, acknowledging that results can be affected by both input and model errors, and that uncertainty propagates from the input criteria and parameters to the final output. To assess error propagation, their uncertainty analysis employed Monte Carlo simulations based on data accuracy. Their sensitivity analysis, which assessed how each input affects the model, involved two methods. The first removed each of the input criteria one-by-one through a jack-knife approach. The other was a sensitivity simulation on the criteria weights based on a Monte Carlo approach. The results showed which criteria are more or less influential to the mapped results, and how sensitive each criterion is to the input parameters, including applied weights.
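The two sensitivity methods described above can be sketched in a few lines of Python. This is only an illustration of the general techniques, not the procedure of Tenerelli and Carver [43]: the weighted linear combination, the criterion scores and weights, the ±20% uniform perturbation and the number of runs are all assumptions.

```python
import random


def wlc(scores, weights):
    """Weighted linear combination, normalized by the sum of the weights."""
    total = sum(weights.values())
    return sum(scores[k] * weights[k] for k in weights) / total


def jackknife(scores, weights):
    """Change in the suitability score when each criterion is removed in turn."""
    base = wlc(scores, weights)
    changes = {}
    for name in weights:
        reduced = {k: w for k, w in weights.items() if k != name}
        changes[name] = wlc(scores, reduced) - base
    return changes


def monte_carlo_weights(scores, weights, spread=0.2, runs=1000, seed=0):
    """Range of scores when every weight is perturbed by up to +/- `spread`."""
    rng = random.Random(seed)
    results = []
    for _ in range(runs):
        perturbed = {k: w * rng.uniform(1 - spread, 1 + spread)
                     for k, w in weights.items()}
        results.append(wlc(scores, perturbed))
    return min(results), max(results)


# Hypothetical standardized scores and weights for one cell
scores = {"slope": 0.8, "soil": 0.4, "rain": 0.6}
weights = {"slope": 0.5, "soil": 0.3, "rain": 0.2}

print(jackknife(scores, weights))
print(monte_carlo_weights(scores, weights))
```

The jackknife output indicates which criterion moves the score most when dropped, while the Monte Carlo range indicates how far the score can shift under plausible changes to the weights.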

Discussion
We examined a collection of recent literature that describes the mapping of bioenergy potential across space and scale. Projects that targeted second generation bioenergy feedstocks, focused on land that may be categorized as marginal, and used spatially-explicit GIS models were evaluated for commonalities and differences in methodology. While most papers provided similar general definitions for marginal land, their working definitions (those which are implemented in a spatial modeling framework) differed greatly. The concept of “marginal land” is often assumed to be static, yet our review suggests that the concept is better understood as relative: considered in proportion or in relation to something else. For example, many papers reviewed examined lands that are marginal when compared to agricultural lands. These lands might not be marginal when compared to their ability to provide wildlife habitat or other ecosystem services. Alternatively, many working definitions in the papers reviewed here were shaped by what data a researcher can acquire, suggesting that difficult-to-map, but nonetheless important, landscapes might be absent from the discussion.
We found no common working definition of marginal land across all of these studies, including considerable differences across models, input data, scales and validation methods. One country, China, provided a case study to examine potential comparisons. Despite the potential similarity of intent, the examples from China illustrated considerable differences in mapped results that were driven by differences in crop choice, model framework, data inputs, scale and treatment of uncertainty. These differences were echoed throughout the broader literature.

Modeling Framework
The Geographic Information System framework is a useful and flexible one in which to perform suitability modeling. Most papers reviewed here employed relatively straightforward GIS overlays using linear combination of input criteria with distinct thresholds identifying land as either marginal or not. Models specifying varying degrees of marginality were less common [27,35,39,43]. However, as this study shows, the thresholds determining marginality vary greatly between studies and are far from exact. Although simple GIS overlays have the benefit of being transparent and easy to replicate, they may not be the best measure of the dynamic concept of marginal land. Therefore, there is a need to incorporate the natural continuity of datasets, either by standardization of input criteria or incorporation of Fuzzy Set Theory, in ways that more effectively represent this fluid subject.
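The simple overlay approach discussed above amounts to a logical AND across thresholded layers, as in the following minimal sketch. The cell attributes, the slope limit, the LCC range and the excluded land covers are hypothetical defaults chosen for illustration and do not reproduce any one study's working definition.

```python
def is_marginal(cell, max_slope=25, lcc_range=(4, 7),
                excluded_cover=("urban", "water")):
    """Binary overlay: a cell is marginal only if it passes every criterion."""
    return (cell["slope_pct"] <= max_slope
            and lcc_range[0] <= cell["lcc"] <= lcc_range[1]
            and cell["cover"] not in excluded_cover)


# Hypothetical cells with slope (percent), land capability class, land cover
cells = [
    {"slope_pct": 10, "lcc": 5, "cover": "grassland"},
    {"slope_pct": 28, "lcc": 6, "cover": "grassland"},
    {"slope_pct": 12, "lcc": 2, "cover": "cropland"},
    {"slope_pct": 8, "lcc": 4, "cover": "urban"},
]

strict = [is_marginal(c) for c in cells]
relaxed = [is_marginal(c, max_slope=30) for c in cells]
print(strict, relaxed)
```

The rule is transparent and easy to replicate, but the second cell flips from excluded to marginal when the slope limit moves from 25% to 30%, which illustrates how sensitive a crisp overlay is to its thresholds.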

Data Availability
We found a wide range of data choices for input criteria, and these choices were rarely consistent across studies. For example, most studies across scales used some kind of land cover and soil dataset, yet we found that 13 different land cover datasets and 16 different soil datasets were used. This fact alone is enough to significantly influence mapped results. Additionally, differences in data thresholds, thematic class selection, and ways in which data are represented mean that analyses with identical data can yield different results. We also found that lack of appropriate data was an additional driver of differences between studies. For example, Kang et al. [44] initially identified 30 key variables for their analysis; however, only 21 were applied because data for the other nine desired variables (e.g., nutrient loss, biodiversity, resilience, resistance, buffer-zones or corridors, and placeholders for “other restrictions”) were not readily available or easily quantified. Socio-economic factors were the least consistent category of input criteria, highlighting that standardization was especially lacking in this category.

Scale of Analysis
We reviewed projects that focused on global, national, regional and local scales. The scale of analysis clearly impacted the way in which a working definition of marginal land is structured, as it impacts model choice, data availability and selection, and resolution. In some projects the data matched the scale of analysis, in the manner that global studies used global datasets, but in other projects there was a mismatch. Schweers et al. [34] performed their analysis at a national scale, but because they were mapping abandoned lands they used global-level datasets. This makes results more comparable to global estimates, but less so to national ones.
National studies excel at addressing national policies and bioenergy targets. They also have the flexibility of either mapping for specific bioenergy species, e.g., Miscanthus, or for a set of generic bioenergy crops, such as perennial grasses or short rotation forestry. Input criteria for national studies are necessarily available at a national or global scale; therefore the data is generally more coarse than that available for regional studies, but can be of higher resolution than global studies. Datasets are often country-specific and are not internationally standardized.
Studies conducted at a regional scale are best suited for incorporating detailed datasets, including fine scale soils data (e.g., the Soil Survey Geographic Database (SSURGO), 1:24,000) as well as road and riparian boundaries [40]. Regional studies also have the advantage of being able to incorporate socio-economic inputs at a more realistic level of analysis [38]. This allows regional studies to better address specific management goals and mapping for a specific bioenergy species. Methodologies for studies at regional scales vary from standard binary GIS overlays [37] to very complex analyses [43]. However, regional studies can be limited in that the results are often crop-specific and it is therefore difficult to extrapolate models to larger areas.

Uncertainty
Most of the papers reviewed here provided few details describing accuracy and uncertainty, suggesting great scope in the future for this kind of work. A review of the metadata revealed that the accuracies of input land cover data ranged from 67.1% to 87%, yet these numbers were rarely reported in the works themselves and the impacts of propagation of error were seldom addressed. The results of accuracy and uncertainty analyses are especially important since the high end of the range of mapped estimates is often the most widely cited. Because these estimates have the potential to influence policy and real-world investments, it is especially important to ensure they are truly representative of the land resource availability. At the most basic level, studies can express these uncertainties by offering a range of mapped marginal land areas (as adjusted by soft constraints or accuracy analyses) that may be suitable for bioenergy crops instead of one exact number. However, there is still a danger that only the highest value of the range estimate gets cited in future literature promoting bioenergy.

Conclusions
The challenges in planning for bioenergy mandates globally are large and numerous. Foremost among them is the need to determine where to plant bioenergy feedstocks to meet energy mandates while ensuring sustainable food production and environmental protection. The breadth of the projects reviewed here underpins the benefits of spatial modeling and GIS, even when simply implemented, in projecting where bioenergy might be planted. However, the considerable differences in definitions, models and data revealed in this review allow limited potential for comparison across studies, as well as for synthesis work that quantifies global biofuel potential. Some of the differences highlighted in this work might be minimized with standardization, through the use of similar datasets and similar analyses, and some might be minimized with better application of existing protocols, such as common accuracy assessment practices. Thus, the mapping of bioenergy potential is ready for a meta-analysis or shared examinations that use common data and protocols.
Challenges also remain in effectively addressing the bioenergy land use dilemma. The studies reviewed here individually and broadly reiterate the importance of refining theoretical estimates of bioenergy suitability with real world conditions that reflect an explicit understanding of the balance between fuel, food and conservation. Lands that may be identified as marginal and mapped as such based solely on physical conditions may not necessarily be available for bioenergy plantation when considering economic, social, or environmental factors [91]. However, incorporating these variables in a GIS environment is not always straightforward. Hard-to-depict land uses such as the presence of pastoralists, the use of land for cultural purposes, or for biodiversity protection are difficult to capture via remote sensing techniques (which excel in mapping land cover) and thus are not often included in broad-scale geospatial datasets. These mapping limitations make effective understanding of the tradeoffs between food, fuel and land elusive, but nonetheless tremendously important, and worthy of much more applied research.
This study is the first to systematically review projects that map bioenergy potential on marginal lands. Our goal was to provide a framework to help researchers evaluate existing scholarship in mapping marginal land. We have identified areas where understandable differences in definitions, models, data and applications result in differences in mapped results. We suggest that there is tremendous future need for spatial modeling of bioenergy, yet further work should be done to allow for comparative work across countries and scales and understanding of cumulative global potential for bioenergy crops.

Author Contributions
Sarah M. Lewis and Maggi Kelly conceived and designed the paper together; Sarah M. Lewis reviewed the literature and framed the discussion together; Sarah M. Lewis led the drafting of the article; Maggi Kelly helped revise the article.