Relevance of the Cell Neighborhood Size in Landscape Metrics Evaluation and Free or Open Source Software Implementations

Landscape metrics constitute one of the main tools for the study of the changes of the landscape and of the ecological structure of a region. The most popular software for landscape metrics evaluation is FRAGSTATS, which is free to use but is not free or open source software (FOSS). Therefore, FOSS implementations, such as QGIS’s LecoS plugin and GRASS’ r.li modules suite, were developed. While metrics are defined in the same way, the “cell neighborhood” parameter, specifying the configuration of the moving window used for the analysis, is managed differently: FRAGSTATS can use values of 4 or 8 (8 is default), LecoS uses 8 and r.li 4. Tests were performed to evaluate the landscape metrics variability depending on the “cell neighborhood” values: some metrics, such as “edge density” and “landscape shape index”, do not change, while others, for example “patch number”, “patch density”, and “mean patch area”, vary up to 100% for real maps and 500% for maps built to highlight this variation. A review of the scientific literature was carried out to check how often the value of the “cell neighborhood” parameter is explicitly declared. A method based on the “aggregation index” is proposed to estimate the effect of the uncertainty on the “cell neighborhood” parameter on landscape metrics for different maps.


Introduction
In recent decades, landscape metrics have been one of the main tools for the study, quantification, and possibly the parametrization of the changes of the landscape and of the ecological structure of a territory [1,2]. Landscapes consist of natural or anthropogenically modified mosaics of patches, and the spatial arrangement of these patches is called landscape pattern [3]. The quantification of spatial heterogeneity over multiple spatial and temporal scales is necessary to clarify the relationships between ecological processes and spatial patterns [4,5]. Landscape metrics became standard and gained the status of an indispensable instrument for those who investigate landscape change dynamics not only in ecology [6,7] but also in many other disciplines such as soil protection and water management [8][9][10]. Many metrics can be calculated, but some of them are considered particularly robust and significant and are used more frequently in scientific literature [2,6]. It is a common practice to compare different landscapes according to the results of some of these indices, for example patch number (NP) or mean patch size (MPS) [11]. However, we argue that the results of some of the most used and reliable indices can be severely affected by the settings used during the processing [11] and, if these settings are not accurately described in the scientific works, this can create a cascade effect that affects the comparability of the results. In particular, we realized that the cell neighborhood size is one of the most sensitive parameters, but a literature analysis highlighted that not many works have investigated its influence. Additionally, each software provides information about the setting of cell neighborhood size with different emphasis.
The most popular software for landscape metrics evaluation is FRAGSTATS [12], which is free to use but not free or open source software (FOSS). Therefore, FOSS implementations, such as QGIS's LecoS plugin and GRASS' r.li modules suite, were developed. Landscape metrics are defined in the same way in all these software packages. However, the cell neighborhood parameter, specifying the configuration of the moving window used for the analysis, is managed differently: FRAGSTATS can use values of 4 or 8 (8 is default), LecoS uses 8 and r.li 4.
Are software users aware of which cell neighborhood size they are using and of the effects it can have on the results? How reliable are the comparisons among different studies that do not mention the cell neighborhood size settings? Is it possible to safely compare the results of works carried out with different software?
To answer these questions, the aims of our work are: (i) to investigate how the cell neighborhood size affects the results of landscape metrics calculation using both test maps and real maps; (ii) to examine the transparency of the settings of the cell neighborhood size in different software; (iii) to understand if the cell neighborhood size settings are usually reported in scientific papers and, therefore, how reliable the comparisons of different works developed with the same or different software are.
The paper is organized as follows: Section 2 illustrates the literature review and its results; Section 3 describes the materials and methods used for the analysis, the software, the metrics, and the maps used for the tests; Section 4 illustrates the results; Section 5 describes a new method to evaluate the sensitivity of landscape metrics to the cell neighborhood size and finally the conclusions and future developments are presented in Section 6.

Literature Review
An analysis of the scientific literature for papers regarding the use of FRAGSTATS, the LecoS add-on, or the r.li/r.le modules in GRASS GIS was carried out to determine if the cell neighborhood parameter was mentioned or explained in scientific papers describing applications using landscape metrics.
The Scopus database was used, searching for the term "cell neighborhood" coupled with "Fragstats", "r.li", "r.le", "LecoS", "landscape patch", and "landscape metrics" as keywords. The joint use of these terms yielded no results, so the keywords have been used separately.
The papers for the current review were chosen following these conditions: (i) the articles must be research papers, peer reviewed, although with no minimum limit on the number of citations; (ii) at least one of the FRAGSTATS, LecoS, or r.li/r.le software must have been used in the study; (iii) more than one landscape metric must have been used; (iv) no limit has been set for the year of publication.
In the Scopus archive the term "landscape metrics" produces in excess of 37,000 results. Refining the search by adding the keyword "Fragstat" narrows the result to 800 papers. Adding the terms "lecos" and "landscape metrics" refines the result to six papers; finally, adding "landscape metrics" with "r.li" refines the result to four papers. For further analysis the pool of results was narrowed to 93 papers. These papers can be classified according to their purpose:
• papers that deal with landscape fragmentation as an issue in itself;
• papers that deal with landscape fragmentation as an issue for habitat connectivity;
• papers that investigate which landscape metric could explain other metrics;
• papers that investigate the size of a moving window algorithm and how it affects landscape analysis;
• papers that present new software for landscape metrics calculation, basing their structure and methods on FRAGSTATS.
Table 1 shows the most used landscape metrics, with the numbers of times the selected metric was used in the reviewed papers as absolute values and percentages. The value of the cell neighborhood parameter was specified in eight different papers: seven chose a Moore definition [13][14][15][16][17][18][19][20] and one a von Neumann definition [21]. The 16 papers which exclusively employed LecoS [22][23][24][25][26][27][28][29][30][31][32][33][34][35][36][37] used an 8-cells neighborhood rule, as it is the only setting available in the add-on; a similar hypothesis is applicable to the four works which used r.li in GRASS [38][39][40][41], which uses a 4-cells neighborhood definition. The papers which did not specify the definition of the neighborhood in FRAGSTATS are assumed to have used an 8-cells definition, since this is the default setting [42], although it is not possible to prove this assumption. Results in Section 4 demonstrate that some of the most commonly used metrics are affected by the choice of this parameter.
Therefore, the fact that in most of the research papers analyzed the cell neighborhood configuration is specified neither explicitly nor implicitly (through the indication of the software used) could make the results difficult to repeat and interpret.
The full list of the examined papers, a table indicating the papers citing each software, and a breakdown of the landscape metrics cited by each paper are available as additional material of this paper (see Supplementary Materials).

Materials and Methods
To analyze the effect of the cell neighborhood size on the values of landscape metrics, three different software packages were selected (Section 3.1) to evaluate a common set of metrics (Section 3.2) on artificial (Section 3.3) and real (Section 3.4) maps.

Test Software
Tests on artificial (Section 3.3) and real (Section 3.4) maps were carried out using three different software systems: FRAGSTATS [12], GRASS GIS [43], and QGIS [44]. The rationale behind this choice is that using software in the public domain (FRAGSTATS) or available as free and open source software (FOSS) (GRASS GIS and QGIS) allows the replication of the experiments described in this paper. Additionally, FOSS source code is available for analysis. Therefore, possible discrepancies between metrics' values can be resolved by examining the code.
According to its manual [45], "FRAGSTATS is a spatial pattern analysis program for quantifying the structure (i.e., composition and configuration) of landscapes.". FRAGSTATS [12] has been the reference software for the evaluation of landscape metrics since its inception in 1995. It is available as an executable in the public domain for the MS Windows operating system only. It is written in Microsoft Visual C++, and the use of a 32-bit address space limits its capacity to process large maps. Its source code is not accessible, but its features are well documented in the manual available on-line and its widespread use in landscape and ecology research speaks to its accuracy and reliability in metrics evaluation. Units of input maps are assumed to be meters. Input maps can only contain signed integer values and must have square cells with sides larger than $10^{-3}$ m. FRAGSTATS can evaluate 36 area and edge metrics, 79 shape metrics, 46 core area metrics, 17 contrast metrics, and 64 aggregation metrics, for a total of 242 different metrics and 392 parameters describing them [45]. Additionally, FRAGSTATS can evaluate 9 diversity metrics.
GRASS GIS [43] is a multi-purpose FOSS GIS for geospatial data production, analysis, and mapping [46] available under the GNU General Public License (GPL), used in research and education [47]. It is part of the Open Source Geospatial Foundation (OSGeo). GRASS is mainly written in ANSI C, with some parts in C++ and, more recently, in the Python programming language; it is highly modular and easily scriptable. Besides the source code, it is available as ready to install packages for the MS Windows, Apple Mac OSX, and Linux operating systems. Limitations on data size are mostly related to the capacity of the operating system to deal with large files [48], with the possibility of enabling large file support (LFS). Landscape analysis in GRASS is carried out using the r.li modules suite, which can evaluate 10 patch indices and 7 diversity indices and provides a graphical user interface for the configuration of the analysis. A previous implementation of landscape metrics evaluation, called r.le, is used in old GRASS versions.
QGIS is a user friendly open source geographic information system (GIS) licensed under the GNU General Public License [44], part of the Open Source Geospatial Foundation (OSGeo). It is written in the C++ language using the Qt toolkit for its interface. Plugins can be written either in C++ or Python. It is available as source code and packages for the Linux, Unix, Mac OSX, MS Windows, and Android operating systems. The maximum size for maps depends on the combination of operating system and Geospatial Data Abstraction Library (GDAL)/OGR Simple Features Library (OGR) limitations for the data type in use. The Landscape ecology statistics (LecoS) plugin performs landscape analysis in QGIS using the Python numpy library [49]. LecoS, as of version 3.0.0, can evaluate 20 landscape metrics, 8 zonal statistics and 3 diversity indexes.
Other modules to calculate landscape metrics are integrated in the most popular GIS desktop software: V-Late and PA4 in ArcGIS (set of GIS programs created by the Environmental Systems Research Institute, ESRI), Pattern and Texture modules in IDRISI, and the API called Land-metrics DIY developed by Zaragozí et al. [50] and released under the GPL license. Recently, a new R package dedicated to calculation of landscape metrics was released; with landscapemetrics it is now possible to perform landscape analysis on rasters inside the R environment [51].

Landscape Metrics
Landscape metrics are indexes used to describe and quantify the spatial characteristics of the landscape, described as a set of categorical data, typically representing land use [6]. These indexes are useful to study the evolution of the landscape features over time [7,41,52,53] or to compare different areas [6].
The great variety of metrics can be classified according to the spatial level they are dealing with: patches, classes of patches, or the whole landscape [54]. Commonly, in landscape ecology, a patch is defined as a discrete area of homogeneous environmental conditions at a specific scale [45]. These metrics fall into two categories: those that quantify the composition of the map without reference to spatial attributes (for example, the number of patches) and those that quantify the spatial configuration of the map, such as spatial arrangement, position, or orientation of the patches [54].
Landscape metrics calculated at the patch level describe simple statistics about each land use patch (area, perimeter, number of patches, ...) and serve primarily as the computational basis for other landscape metrics [45]. Landscape metrics calculated at the class level report information about all the patches of a given type. These metrics are useful to quantify the amount and spatial configuration of each patch type and thus are useful to estimate the fragmentation of each patch type in the landscape [45]. Finally, metrics calculated at the landscape level provide information about the pattern of the landscape as a whole: like class metrics, they may be integrated by simple or weighted averaging or may reflect aggregate properties of the patch mosaic [45].
For the purpose of this work, a limited number of landscape metrics was selected among the many available according to the following criteria: (i) the availability for calculation across software; (ii) the use of the cell neighborhood parameter for their calculation; (iii) the relative simplicity of their formulation, in order to understand the effect of the cell neighbor parameter on the result and to be able to manually evaluate their values for simple maps.
Patch number (NP) is the number of patches of each type; it is a dimensionless metric. It is defined as:
$$NP = N$$
where N is the number of patches in the k-th category.
Patch density (PD) equals the number of patches of the corresponding patch type divided by the total landscape area [m²], multiplied by 10,000 and 100 (to convert to 100 hectares). The unit of measure is number of patches per 100 hectares. It is expressed as:
$$PD = \frac{N}{A} \cdot 10^{4} \cdot 100$$
where:
• N is the number of patches in the landscape;
• A is the total landscape area in [m²];
• $10^{4}$ and 100 are constants used to express the index per 100 ha.

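Mean patch size (MPS) is the average area of all the patches of a given type. It is measured in square meters. When used in combination with NP, MPS gives information about how the patches of a given land use class are growing or merging over time [53]. It is defined as:
$$MPS = \frac{1}{NP} \sum_{i=1}^{NP} a_{i}$$
where:
• NP is the number of patches;
• $a_{i}$ is the area of the i-th patch of the given type (symbol introduced here to write the average explicitly).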
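The dependence of PD and MPS on the number of patches can be made concrete with a short Python sketch. It follows the verbal definitions above; the function names and the example landscape (100 ha, with one class covering 50 ha) are hypothetical and chosen only for illustration.

```python
def patch_density(n_patches, landscape_area_m2):
    """Patches per 100 ha: N / A * 10^4 * 100, with A in square meters."""
    return n_patches / landscape_area_m2 * 1e4 * 100


def mean_patch_size(class_area_m2, n_patches):
    """Average patch area of one class, in square meters."""
    return class_area_m2 / n_patches


# Hypothetical landscape: 1,000,000 m2 (100 ha) where one class covers
# 500,000 m2, counted either as 5000 separate patches or as a single patch.
landscape_area = 1_000_000.0
class_area = 500_000.0
for n in (5000, 1):
    pd = patch_density(n, landscape_area)
    mps = mean_patch_size(class_area, n)
    print(f"NP={n}: PD={pd:.1f} patches/100 ha, MPS={mps:.1f} m2")
```

Both functions take NP as an input, so any setting that changes how cells are grouped into patches propagates directly to PD and MPS.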
Edge density (ED) equals the sum of the lengths (m) of all edge segments involving the corresponding patch type, divided by the total landscape area (m²), multiplied by 10,000 (to convert to hectares). The unit is meters per hectare. ED > 0. This index is useful in ecological studies dealing with ecotone species. It is expressed as:
$$ED = \frac{\sum_{k=1}^{m} e_{ik}}{A} \cdot 10^{4}$$
where:
• k is the category of the patches;
• m is the total number of different categories of patches;
• n is the number of boundary edges for the patch;
• $e_{ik}$ is the total length of boundary edges for the k-th category of patches;
• A is the total landscape area;
• $10^{4}$ is a constant to convert the index to [m/ha].
Landscape shape index (LSI) measures the perimeter-to-area ratio for the landscape as a whole. All edge segments (m) within the landscape boundary involving the corresponding patch type are divided by the square root of the total landscape area (m²). LSI > 1, dimensionless. This index is a measure of the overall geometric complexity of the landscape and is defined as:
$$LSI = \frac{0.25 \, E}{\sqrt{A}}$$
where:
• E is the sum of the lengths of all the boundary edges of the patches;
• A is the sum of all the areas of the patches;
• 0.25 is an adjustment coefficient.
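As an illustration of the two edge-based definitions above, the following Python sketch computes ED and LSI for a small raster stored as a list of rows. It is a minimal sketch under stated assumptions, not the implementation used by any of the three programs: the function names are hypothetical, A is taken as the total landscape area, and the outer border of the map is assumed not to count as edge.

```python
import math


def class_edge_length(grid, cls, cell_size):
    """Total length of the cell sides separating `cls` cells from other cells.

    Assumption: the outer border of the map is not counted as edge.
    """
    rows, cols = len(grid), len(grid[0])
    sides = 0
    for r in range(rows):
        for c in range(cols):
            here = grid[r][c] == cls
            # Look right and down only, so each shared side is counted once.
            if c + 1 < cols and here != (grid[r][c + 1] == cls):
                sides += 1
            if r + 1 < rows and here != (grid[r + 1][c] == cls):
                sides += 1
    return sides * cell_size


def edge_density(edge_length_m, landscape_area_m2):
    """Edge length per hectare: E / A * 10^4."""
    return edge_length_m / landscape_area_m2 * 1e4


def landscape_shape_index(edge_length_m, landscape_area_m2):
    """LSI = 0.25 * E / sqrt(A)."""
    return 0.25 * edge_length_m / math.sqrt(landscape_area_m2)


# Hypothetical 4 x 4 chessboard with 10 m cells.
board = [[(r + c) % 2 for c in range(4)] for r in range(4)]
edge = class_edge_length(board, 1, 10.0)
area = 4 * 4 * 10.0 * 10.0
print(edge, edge_density(edge, area), landscape_shape_index(edge, area))
```

None of these functions needs to know how cells are grouped into patches, which is consistent with ED and LSI being independent of the cell neighborhood.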
Aggregation index (AI) equals the number of like adjacencies involving the corresponding class, divided by the maximum possible number of like adjacencies, which is achieved when the class is maximally clumped into a single, compact patch, multiplied by the proportion of the landscape comprised of the corresponding class, summed over all classes and multiplied by 100 (to convert to a percentage). Unit: percent, range 0 ≤ AI ≤ 100 [45]. It is useful to quantify spatial patterns and fragmentation of the landscape. For a single patch class i it is defined as:
$$AI = \frac{g_{ii}}{\max(g_{ii})} \cdot 100$$
where:
• $g_{ii}$ is the number of like adjacencies between pixels of patch class i based on the single-count method;
• $\max(g_{ii})$ is the maximum number of like adjacencies between pixels of patch class i based on the single-count method;
• 100 is a constant used to express the index as a percentage.
Cell neighborhood is defined as the nearest cells to a specific one. Only cells in the neighborhood are used to evaluate the metrics above. There are two possible concepts of "cell neighborhood" (CN): the von Neumann definition, which considers the four adjacent cells in the cardinal directions, and the Moore one, which considers both the corners and edges of adjacent cells (Figure 2). While von Neumann neighborhood considers 4 cells, Moore neighborhood uses 8 cells for the calculations. FRAGSTATS allows the user to choose between the two definitions [42].
The evaluation of some landscape metrics is influenced by the choice of this parameter. Cells of the same land use class that touch only at a corner are considered part of the same patch in an 8-cells neighborhood, while they are not considered such using a 4-cells neighborhood algorithm.
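The class-level aggregation index can be sketched in Python as follows. The count of like adjacencies follows the single-count method named above. The closed form used for the maximum number of like adjacencies is not given in this paper: it is an assumption taken from the FRAGSTATS documentation as recalled here, and all function names are hypothetical.

```python
import math


def like_adjacencies(grid, cls):
    """Single-count like adjacencies: pairs of side-sharing cells both in cls."""
    rows, cols = len(grid), len(grid[0])
    g = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != cls:
                continue
            # Look right and down only, so each pair is counted once.
            if c + 1 < cols and grid[r][c + 1] == cls:
                g += 1
            if r + 1 < rows and grid[r + 1][c] == cls:
                g += 1
    return g


def max_like_adjacencies(n_cells):
    """Like adjacencies of the most compact shape made of n_cells cells.

    Assumed closed form: n is the side of the largest square that fits,
    m the number of remaining cells.
    """
    n = math.isqrt(n_cells)
    m = n_cells - n * n
    if m == 0:
        return 2 * n * (n - 1)
    if m <= n:
        return 2 * n * (n - 1) + 2 * m - 1
    return 2 * n * (n - 1) + 2 * m - 2


def aggregation_index(grid, cls):
    """AI in percent for one class, or None when it is undefined."""
    n_cells = sum(row.count(cls) for row in grid)
    g_max = max_like_adjacencies(n_cells)
    if g_max == 0:
        return None  # e.g. a class made of a single cell
    return 100.0 * like_adjacencies(grid, cls) / g_max
```

Only side-sharing cells enter the count, so in this sketch the result does not depend on whether patches are later delineated with 4 or 8 neighbors.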
For this reason, using the same map in different software will lead to different results for some landscape metrics. While LecoS, r.le, and r.li allow the use of one single type of neighbors rule, in FRAGSTATS the default option is the 8-cells Moore definition but the user can select the 4-cells von Neumann one.
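The effect of the two neighborhood rules on patch delineation can be reproduced with a small flood-fill sketch in Python. This is an illustrative implementation using only the standard library, not the code of FRAGSTATS, LecoS, or r.li; the names `VON_NEUMANN`, `MOORE`, and `count_patches` are hypothetical.

```python
from collections import deque

# Row/column offsets of the neighbors considered by each rule.
VON_NEUMANN = [(-1, 0), (1, 0), (0, -1), (0, 1)]
MOORE = VON_NEUMANN + [(-1, -1), (-1, 1), (1, -1), (1, 1)]


def count_patches(grid, cls, offsets):
    """Count connected groups of `cls` cells under the given neighbor rule."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    patches = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if grid[r0][c0] != cls or seen[r0][c0]:
                continue
            # A new, unvisited cell of the class starts a new patch.
            patches += 1
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            while queue:
                r, c = queue.popleft()
                for dr, dc in offsets:
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < rows and 0 <= cc < cols
                            and grid[rr][cc] == cls and not seen[rr][cc]):
                        seen[rr][cc] = True
                        queue.append((rr, cc))
    return patches


# Two cells of class 1 touching only at a corner.
diagonal = [[1, 0],
            [0, 1]]
print(count_patches(diagonal, 1, VON_NEUMANN))  # 2
print(count_patches(diagonal, 1, MOORE))        # 1
```

The same map thus yields a different number of patches depending only on the list of offsets passed to the function.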

Artificial Maps
Four artificial maps were created to test the sensitivity of the different metrics to the variation of the cell neighborhood (CN) value and to understand which patch configurations are more susceptible to it. The four maps are binary, with value 1 indicating the patches and value 0 the background. The maps are square, with sides of 100 cells, for a total of 10,000 cells, and each pixel measures 10 × 10 map units.

Real Maps
A map of the forest coverage in the Val di Fassa, Italy, in 2006 was used for the tests on real maps. Val di Fassa is a valley in the Alps, in the north east of the Trentino region in Northern Italy (Figure 7). This area was affected by a marked expansion of the forested area in the last century [6,41,51,52] and is currently being investigated in the Trentinoland project, which is analyzing the forest cover of the whole Trentino region in Northern Italy [56].
Val di Fassa has an area of 567.28 km² (56,728 ha), of which 132.36 km² (13,236 ha, 23.33%) were covered by forest in 2006. The region boundaries are 5157094 N, 5132029 S, 721983 E, 699351 W, in the ETRS89/UTM 32 N (EPSG 25832) coordinate reference system. The raster map has a 10 m resolution, with 2509 rows and 2265 columns, for a total of 5,682,885 cells.
Two maps were derived from this map by applying a low pass filter with a 3 × 3 pixels window, assigning the third quartile for the first map (Figure 8, center) and the first quartile for the second one (Figure 8, right), with the aim of testing how the fragmentation of the maps influences the variation of the landscape metrics depending on the CN value. The application of a low-pass filter leads to less fragmented maps. In Section 5 these maps will be used to develop an index predicting how different values of CN influence landscape metrics values when CN is unknown.
These two maps cover the same area and have the same resolution as the original one. All these maps were created by automatic classification of aerial images using the same classification scheme. An area is considered forest if 20% of the ground area is covered by trees and it is larger than 2000 m² with a minimum width of 20 m.
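The quartile filter used to derive the two smoothed maps can be sketched in Python as follows. The paper does not describe the treatment of border cells or the exact quartile convention, so both are assumptions here: border cells are left unchanged, and the quartiles of the nine sorted window values are taken at positions 2 (first quartile) and 6 (third quartile). The function name is hypothetical.

```python
def quartile_filter(grid, which):
    """3 x 3 moving-window filter assigning a quartile to the central cell.

    which=1 assigns the first quartile, which=3 the third quartile.
    Assumptions: border cells keep their original value; quartiles are the
    sorted window values at index 2 (first) and 6 (third).
    """
    index = {1: 2, 3: 6}[which]
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = sorted(
                grid[r + dr][c + dc]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
            )
            out[r][c] = window[index]
    return out
```

On a binary map, under these assumptions, the third quartile marks a cell as forest when at least three cells of its window are forest, while the first quartile requires at least seven; in both cases isolated forest pixels away from the border disappear.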

Results
For each test map described in Sections 3.3 and 3.4, the landscape metrics listed in Section 3.2 were evaluated using GRASS GIS, FRAGSTATS, and QGIS. Values for the artificial maps described in Section 3.3 are reported in Tables 2-5.
For the first artificial test map (Figure 3), configured as a chessboard, Table 2 shows that a 4-cells CN (GRASS, FRAGSTATS using 4 cells) recognizes the 5000 patches corresponding to single pixels and correctly evaluates the corresponding metrics, while with an 8-cells CN (QGIS, FRAGSTATS using 8 cells) one single patch is detected. This difference influences the patch density (PD) and mean patch size (MPS) metrics because they depend on the number of patches. Conversely, edge density (ED) and landscape shape index (LSI) do not change using different CN values because the total length of the edge segments and the total landscape area do not depend on how cells are grouped into patches.
For the second artificial test map (Figure 4), with two large patches separated by a single straight edge, Table 3 shows that all the landscape metrics values are the same regardless of the value of CN used. This is due to the fact that the map contains only two large compact patches with no fragmentation and a single edge. For the third artificial test map (Figure 5), with a large patch covering half of the map, Table 4 reports that none of the landscape metrics change with different CN choices. The reason is the same as for the previous map. Table 5 shows that both choices for CN correctly detect the only patch, composed of a single pixel, present in the fourth artificial test map (Figure 6). All the landscape metrics values are the same for the different CN values.
Finally, results for the real forest map are reported in Table 6. Values provided by QGIS for mean patch size and edge density differ from those calculated by FRAGSTATS with the same CN value by 4 orders of magnitude, by factors of $10^{4}$ and $10^{-4}$, respectively. This is due to the fact that areas in QGIS are expressed in square meters, while GRASS and FRAGSTATS use hectares. Therefore, the QGIS values of mean patch size and edge density were rescaled to hectares in Table 6. The same procedure is applied to all the subsequent results from QGIS. It is evident from Table 6 that some metrics, such as number of patches, patch density, and mean patch size, change when different values of CN are used. Edge density (ED) and landscape shape index (LSI) values are independent of the choice for CN.
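The rescaling between square meters and hectares described above amounts to two conversions, sketched here in Python. The function names and the sample values are hypothetical; only the factors of 10,000 come from the text.

```python
M2_PER_HA = 10_000.0


def mps_m2_to_ha(mps_m2):
    """Mean patch size: square meters to hectares (divide by 10^4)."""
    return mps_m2 / M2_PER_HA


def ed_per_m2_to_per_ha(ed_m_per_m2):
    """Edge density: m per square meter to m per hectare (multiply by 10^4)."""
    return ed_m_per_m2 * M2_PER_HA


# Hypothetical values as they might be reported with areas in square meters.
print(mps_m2_to_ha(500_000.0))     # 50.0
print(ed_per_m2_to_per_ha(0.125))  # 1250.0
```

Area appears in the numerator of MPS and in the denominator of ED, which is why the two factors are reciprocal.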

Aggregation Index
The literature review of Section 2 revealed that in many cases the value of CN is not explicitly stated. Sometimes it is possible to infer the value by the indication of the GIS software used for the processing but even then there can be some ambiguity, such as for the case of FRAGSTATS which can use two different values and the default depends on the software version.
For this reason, it was investigated whether it is possible to find a parameter indicating how the uncertainty on the CN value reflects on the significance of the landscape metrics. Tests on artificial maps (Section 4, Tables 2-5) showed that the more fragmented a map is, the more the values of landscape metrics change for different values of CN. Therefore, the aggregation index (AI), defined in Section 3.2, can be used as an indicator of the importance of the knowledge of CN for the analysis of a map. Its value varies between 0 and 100; it is 0 when the patch class is completely disaggregated, i.e., there are no like adjacencies; it is 100 when the patch class is completely aggregated in a single compact patch. Since tests on the maps described in Sections 3.3 and 3.4 have shown that for a given value of CN landscape metrics do not depend on the software used to evaluate them, all the values reported in this section were evaluated using FRAGSTATS.
The first tests to investigate a possible relationship between the AI and the variation of the values of landscape metrics using different CN values were carried out on the artificial maps described in Section 3.3. Table 7 shows the values of the AI for the four artificial maps, the number of patches obtained using CN values of 4 and 8, and their variation as a percentage. For the first raster, with a chessboard configuration, disaggregation is maximum and the AI is 0, while the difference in the number of patches for the two values of CN is nearly 100%. Conversely, the second and third artificial maps have an AI above 99% and the number of patches does not depend on CN (Tables 3 and 4). Finally, for the last artificial map the AI is undefined, since it is not possible to evaluate the AI for a class consisting of a single cell. These first tests show (Table 7) that for maps with low values of AI the use of different values of CN leads to large variations of even a simple metric such as the number of patches, while for maps with high AI values the choice for CN becomes irrelevant.
The four artificial maps were built to represent extreme configurations; therefore, this hypothesis was tested on real maps. The three maps for the forest in Val di Fassa in 2006, described in Section 3.4, represent a sequence of maps with increasing patch compactness, as shown in Figure 8. Table 8 reports landscape metrics evaluated using 4-cells CN and 8-cells CN for the three maps. AI values do not depend on the choice of CN; therefore, the AI can be used to assess the effect of the variation of CN on landscape metrics. Figure 10 shows the variation of the number of patches, patch density, and mean patch size metrics as a function of AI, both in the cases of 4 and 8 cells CN. The difference between values obtained with the two values of CN decreases when the AI increases for all the tested indexes.
In fact, the differences for the number of patches and the patch density decrease from 51.6% to 14.0%, while the difference for mean patch size, negative because the mean patch size increases with CN, varies from −107.4% to −16.7% (Table 9). These tests show that using a different value for the CN leads to large differences for landscape metrics for fragmented maps, while the metrics values are similar for maps with more aggregated patches. Therefore, the aggregation index can be used as an indicator to assess the suitability of the values of the landscape metrics for making comparisons when the CN value is unknown.
Table 9. Differences for number of patches, patch density (PD) (number of patches/100 ha) and mean patch size (MPS) (ha) for the two values of CN (4 or 8 cells) for the map representing the forest coverage in Val di Fassa in 2006 and the two maps obtained by applying a low pass filter with a 3 × 3 pixels window, assigning the first and the third quartile. Columns: Original Map, Third Quartile, First Quartile.
To further test this hypothesis, the number of patches was evaluated for three forest maps of Val di Fassa in 1954, 1974, and 1994 with the two values for CN. The results are reported in Table 10; for these maps the difference in the number of patches is higher when the AI is lower. When the AI is lower than 95% the difference in the number of patches ranges from 37% to 60%, while for the highest value the difference is 14%.
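The percentage differences discussed in this section can be computed with a one-line function, sketched here in Python. The paper does not state the formula explicitly; the sketch assumes that the difference is taken relative to the value obtained with the 4-cells CN, which matches the sign convention described above (negative when the metric increases with CN) and the "nearly 100%" reported for the chessboard map. The function name is hypothetical.

```python
def relative_difference(value_cn4, value_cn8):
    """Percentage difference between the 4-cells and 8-cells results.

    Assumption: the difference is expressed relative to the 4-cells value.
    """
    return (value_cn4 - value_cn8) / value_cn4 * 100.0


# Chessboard map: 5000 patches with 4-cells CN, 1 patch with 8-cells CN.
print(round(relative_difference(5000, 1), 2))  # 99.98
```

A metric that grows when moving from 4 to 8 cells, such as mean patch size, gives a negative value with this convention.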

Conclusions
This study provides an assessment of the importance of the knowledge of the cell neighborhood value when using landscape metrics. Tests on specially crafted maps have shown that CN can affect the values of some metrics with variations up to 5000%. For real maps the variation is not so dramatic but in some cases it can still reach 100% (Table 6). Therefore, the knowledge of the value used for the cell neighborhood (CN) parameter is fundamental, especially if metrics' values are compared across studies from different research groups using different software, even if the same classification scheme is used. Nevertheless, software users often seem to underestimate the importance of the effects that CN can have on landscape metrics and the selected CN value is often omitted.
When the CN is unknown, tests in Section 5 showed that it is possible to use the values of the aggregation index to assess the effect that the uncertainty on CN can have on some landscape metrics, thus providing an indication of the reliability of comparison with other metrics' evaluations. The determination of an analytical link between the AI and the significance of CN for landscape metrics evaluation is under way. This could lead to the identification of a threshold AI value below which the knowledge of the CN is essential. From the user perspective, the choice of cell neighborhood is implicit in the choice of the software. A 4-cells CN provides more evident trends for some metrics, see e.g., Figure 10. The main recommendation is to use the same cell neighborhood when comparing results. Moreover, researchers should clearly indicate the CN values when presenting their results.

Appendix A. Available Metrics
Feature tables for FRAGSTATS 4.2, GRASS 7.6 (r.li), and QGIS 3.4 (LecoS 3.0.0). Names of metrics differ between software systems; the names in the following tables are from the FRAGSTATS 4.2 manual [45].
Table A1. Software features: cell neighborhood and available area and edge metrics for FRAGSTATS, GRASS (r.li), and QGIS (LecoS).
Table A2. Software features: available shape metrics for FRAGSTATS, GRASS (r.li), and QGIS (LecoS).