Evaluation of Driving Forces of Land Use and Land Cover Change in New England Area by a Mixed Method

Understanding the driving forces of land use/cover change (LUCC) is a prerequisite for mitigating and managing the effects and consequences of LUCC. This study analyzes the drivers of LUCC in New England, USA. It combines a meta-study, GIS, and machine learning to identify the important factors of LUCC in the area. First, we conducted a meta-study of the research on LUCC in the New England area, focusing specifically on the analysis of driving forces. The meta-analysis revealed that LUCC studies in the research area were closely related to many other research topics, and that population and economic factors were the most frequently mentioned drivers of LUCC. The drivers of LUCC in this study area over the past several decades have been relatively well analyzed. However, studies of the main driving forces of recent LUCC are lacking. Second, the determinants of LUCC in recent years were quantitatively assessed using the random forests (RF) model along with geospatial data processing. Two planning regions in Connecticut and one planning region in Massachusetts were selected as case study areas. The investigated variables included environmental and biophysical variables, location measures of infrastructure and existing land use, political variables, and demographic and social variables. These drivers were examined for their relations with LUCC processes, and their importance as driving forces was ranked by the RF method. The results show both consistency and inconsistency between the meta-analysis and the RF method. We found that this mixed method can enhance our understanding of the driving forces of LUCC and improve the selection of important drivers for modeling LUCC. With more solid information, better land management advice for sustainable development may also be provided.


Introduction
Land use/cover change (LUCC) is a main component of global environmental change and is largely an intended or unintended outcome of human activities. LUCC has profound effects on climate, biodiversity, soil condition, water flows, and ecosystem services at local, regional, and global scales [1–4], which in turn affect land-use decisions. A better understanding of the processes, trends, causes, and consequences of LUCC is of crucial importance to land planners, ecologists, and others [5,6]. Moreover, identifying the driving forces behind the dominant LUCC is essential for establishing management strategies and policies that mitigate or prevent negative effects of LUCC, and for predicting future changes using models [7].
[...] half of the land was cleared to create croplands and pastures, and the remaining forest areas were cut extensively for timber for construction [23]. Forest clearance and agricultural expansion reached a peak in the mid-1800s, with up to 60-80% of the land being cleared. Around the same time, farming began to decline and secondary forests started to regrow on abandoned agricultural land. As quickly as the land was cleared, much of the land reverted to forest in the early 1990s. New England has been regarded as a primary example of forest transition, which refers to the expansion of forest areas. However, New England has been facing forest loss and more complex land cover change in recent years [24,25]. The recent land cover change and forest change in New England are not completely unknown. What remain uncertain, though, are the driving forces of land change during the last two decades.
The objective of our study is to identify and characterize the most relevant factors responsible for LUCC in the New England area, especially in recent years, through a new framework that involves a meta-study, GIS, and machine learning. The New England area is an important region for land transition and has a diverse landscape. We address the following questions: (1) How and why is land use/cover changing in the New England area according to previous studies? (2) What are the parameters that drive LUCC, and which are the most relevant driving forces? (3) What are the main drivers of recent LUCC in the New England area, and how consistent are the results of the meta-study and the RF method? (4) What are the advantages and disadvantages of the mixed method in our case study?

Study Area
The study area, the New England region, ranges from 41°50′ N to 47°29′ N latitude and from 66°54′ W to 73°45′ W longitude. It is located in the Northeastern United States and includes the states of Connecticut (CT), Rhode Island (RI), Massachusetts (MA), New Hampshire (NH), Vermont (VT), and Maine (ME) (Figure 1). On the one hand, New England is heavily forested, and New Hampshire is the second most forested state in the United States. On the other hand, New England includes some states with high population density, such as Rhode Island and Massachusetts. The meta-study focused on land use/cover change studies whose study areas are located in New England, cover the whole of New England, or contain parts of it. Meanwhile, two planning regions in Connecticut (Northeastern and SouthCentral) and one planning region in Massachusetts (MetroWest) were selected as case studies for quantifying the determinants of recent LUCC with the random forests approach (Figure 2). These three planning regions allow a comparison between a relatively rural area (Northeastern) and an urban area (SouthCentral) within the same state (CT), and a comparison between two relatively urban areas (SouthCentral and MetroWest) in two different states (CT and MA). The planning regions are the focus of the RF approach because they provide a convenient scale for examining important land transition processes while still representing a feasible area for data collection and analysis, which makes it possible to evaluate most of the available drivers.
ISPRS Int. J. Geo-Inf. 2020, 9, 350

Meta-Study
As one study put it, "the term meta-studies includes meta-analyses, systematic reviews, and other secondary studies that aim to synthesize case-study findings" [18]. The meta-study in this research does not include a rigorous statistical treatment of case studies. Robust statistical treatment is usually not possible in land use studies because case studies differ in spatial scale, time period, and design, because empirical case studies are complex, because data and methods are diverse, and because researchers from different disciplines bring different perspectives [18]. Meta-studies have been used to assess, for instance, long-term urbanization across the globe [26], wetland conversion [27], drivers of tropical deforestation [28], causes of desertification [29], and tropical agriculture intensification [30]. However, meta-studies of LUCC processes and driving forces are still limited.
In our meta-study, we collected case study information on land use/cover change from peer-reviewed articles, reports, book chapters, conference proceedings, and PhD theses. We conducted a comprehensive search of the Elsevier Scopus and Google Scholar databases using combinations of the English terms "land use change", "land cover change", "land use/cover change", "New England", "Northeastern", "Maine", "Vermont", "Massachusetts", "Rhode Island", "Connecticut", and "New Hampshire". The studies were first examined to ensure that the study site is in, is part of, or contains the New England area. Around 100 papers were collected. We then restricted ourselves to studies covering a suitable time interval, that is, studies that at least include LUCC after the 1980s. Finally, we excluded studies that deal only with the impact of LUCC (e.g., [31]), with changes in forest composition (e.g., [22]), or with methodology (e.g., [32,33]). In total, 54 papers were selected, all of which are listed in the Supplementary Materials.
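The screening steps described above can be expressed as a sequence of simple filters. The following is a minimal sketch under stated assumptions: the `Paper` record, its field names, the topic labels, and the four example records are hypothetical and serve only to illustrate the order of the steps; the cutoff year of 1980 is one possible reading of "after the 1980s".

```python
from dataclasses import dataclass

# Hypothetical record layout; field names are illustrative, not from the study.
@dataclass
class Paper:
    title: str
    in_new_england: bool      # site is in, is part of, or contains New England
    latest_year_covered: int  # last year of LUCC covered by the paper
    topic: str                # "lucc", "impact_only", "forest_composition", "methodology"

# Topics excluded in the final screening step.
EXCLUDED_TOPICS = {"impact_only", "forest_composition", "methodology"}

def screen(papers, min_year=1980):
    """Apply the location, time-interval, and topic filters in that order."""
    located = [p for p in papers if p.in_new_england]
    recent = [p for p in located if p.latest_year_covered >= min_year]
    return [p for p in recent if p.topic not in EXCLUDED_TOPICS]

# Made-up candidates, one failing each filter and one passing all of them.
candidates = [
    Paper("A", True, 2010, "lucc"),
    Paper("B", False, 2005, "lucc"),
    Paper("C", True, 1975, "lucc"),
    Paper("D", True, 2001, "methodology"),
]
kept = screen(candidates)
print([p.title for p in kept])
```

Keeping each filter as a separate step makes it easy to report how many papers remain after every stage of the screening.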

The Random Forests (RF) Approach
The RF approach is an ensemble method that was classified as a machine learning approach at the end of the 1990s [34]. It is a popular and efficient algorithm for both classification and regression problems. It is based on bootstrap aggregation, which can overcome the overfitting problem of the single decision tree model, and it can rank the explanatory variables using the random forests importance score [19]. Recently, the RF approach has received increasing interest in land-cover classification using multispectral and hyperspectral satellite sensor imagery [35,36] and lidar and radar data [37–40]. Meanwhile, its ability to detect variable importance has also received growing attention. For example, Wang, Q. et al. [41] applied the RF approach to identify drivers of cropland and urban land changes in Jiangxi province, China, and confirmed its capability for assessing the complex drivers of LUCC. Despite its popularity and efficiency, RF has rarely been applied to the analysis of the spatial driving forces of LUCC.
The RF method was chosen for evaluating the driving forces of LUCC in the research area for several reasons. First, it has been shown to deal well with complex nonlinear problems and to handle different types of variables, such as numerical data (e.g., population density) and categorical data (e.g., soil types). Second, it can describe the data faithfully without strong model assumptions: it does not require the dependent variable to be normally distributed and is insensitive to collinearity. Third, it can be trained in parallel, so training can be fast even on a large LUCC dataset. Fourth, the RF method can achieve good prediction accuracy. Finally, the results of the RF method, especially the feature importance, are easy to interpret. The feature importance generated by the RF method is normalized to a value between 0 and 1 by dividing each value by the sum of all feature importance values, so that the normalized values sum to one. A higher value indicates a larger impact on LUCC.
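The normalization of feature importance described above amounts to dividing each raw score by the sum of all scores. A minimal sketch follows; the function name, the driver names, and the raw values are hypothetical and are not taken from the study.

```python
def normalize_importances(raw):
    """Scale non-negative raw importance scores so that they sum to one."""
    total = sum(raw.values())
    if total <= 0:
        raise ValueError("importance scores must have a positive sum")
    return {name: score / total for name, score in raw.items()}

# Hypothetical raw scores for three drivers (illustrative values only).
raw_scores = {
    "distance_to_developed": 6.0,
    "population_density": 3.0,
    "aspect": 1.0,
}
normalized = normalize_importances(raw_scores)
print(normalized)
```

Because the normalized values sum to one, each value can be read directly as the share of total importance attributed to that driver.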

Driver Classification and Data
Previous studies distinguish two kinds of factors that affect LUCC processes: proximate causes and underlying driving forces. Proximate causes are the actual processes of LUCC, such as deforestation or urbanization, whereas underlying driving forces are the basic societal or environmental factors that cause these changes, such as income or elevation variation (for more information, see [28,42,43]). The two are sometimes mixed together, which makes it difficult to generalize driving factors. In fact, proximate causes may represent the outcomes of human decisions, which may in turn be affected by the underlying driving factors. For example, a synthesis of urban expansion studies worldwide found that the observed urban land expansion depended mainly on the interaction with demographic and economic drivers, such as population and GDP growth [26]. Hence, our study focuses on the underlying driving forces, which we divided into economic, demographic, political, biophysical, and location drivers (Table 1).
Site-specific studies are needed to comprehensively assess the driving forces in New England. Three planning regions in CT and MA were selected for this study. To ensure the best data quality, we used different data sources. The thematically consistent land use/land cover (LULC) datasets of the years 1990 and 2010 for MetroWest, MA were created by post-classification processing of the original LULC maps from MassGIS [https://www.mass.gov/orgs/massgis-bureau-of-geographic-information], and the thematically consistent LULC datasets for Northeastern, CT and SouthCentral, CT were created by post-classification processing of the original LULC maps from CLEAR [https://clear.uconn.edu/index.htm] (Figure 3). The explanatory variables used in the RF importance assessment of drivers are listed in Table 2. They were also collected from MassGIS (for MetroWest, MA) and CLEAR (for Northeastern, CT and SouthCentral, CT) at the same spatial resolution of 30 m × 30 m as the LULC maps.
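Given two thematically consistent LULC grids on the same 30 m × 30 m lattice, a per-pixel change indicator can be derived by comparing class codes. The sketch below illustrates the idea with plain nested lists; the class codes, the 3 × 3 grids, and the function name are hypothetical, and a real workflow would read the rasters with GIS software rather than hard-code them.

```python
def change_map(lulc_start, lulc_end):
    """Return a grid of 1 (class changed) or 0 (class unchanged) per pixel."""
    if len(lulc_start) != len(lulc_end):
        raise ValueError("grids must have the same number of rows")
    result = []
    for row_a, row_b in zip(lulc_start, lulc_end):
        if len(row_a) != len(row_b):
            raise ValueError("grids must have the same number of columns")
        result.append([int(a != b) for a, b in zip(row_a, row_b)])
    return result

# Hypothetical class codes: 1 = forest, 2 = developed, 3 = agriculture.
lulc_1990 = [[1, 1, 3],
             [1, 3, 3],
             [2, 2, 3]]
lulc_2010 = [[1, 2, 3],
             [1, 2, 3],
             [2, 2, 2]]

changed = change_map(lulc_1990, lulc_2010)
changed_pixels = sum(map(sum, changed))

# Each pixel covers 30 m x 30 m = 900 m^2; 10,000 m^2 = 1 ha.
changed_area_ha = changed_pixels * 30 * 30 / 10_000
print(changed, changed_pixels, changed_area_ha)
```

A binary change grid of this kind can serve as the response variable, with the explanatory variables sampled at the same pixels.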

Meta-Study Results
The analysis was limited to LUCC reported in case study areas ranging from small towns to large multi-state regions within or involving the New England (NE) area. Grouping the cases by study area, we found that 14 (26%) of all cases cover the whole of NE (including study areas slightly larger than NE; for example, one study includes NE and New York state [44]); 16 (30%) include more than one NE state but do not cover the entire region; 7 (13%) are located in MA; 6 (11%) in CT; 4 (7%) in ME; 3 (6%) in RI; 3 (5%) in NH; and 1 (2%) in VT (Figure 4). More than half of the cases (56%) are at regional scales. The remaining 24 cases (44%) are at the state level, and among the six states, MA and CT have the largest numbers of cases. To test whether there are differences among case locations, the information on LUCC and its underlying drivers was analyzed by case study location. Four location groups were defined: the whole NE, partial NE, the three southern states of NE (CT, MA, and RI; south NE), and the remaining three states of NE (north NE).
After reviewing all 54 papers, we summarized the land cover types involved in each paper, the proportion of the paper devoted to LUCC analysis, and the extent of the driver analysis (Table 3). Cases involving all land cover types are about equally widespread (50% to 55%) across the location groups, except for north NE, where they account for a higher proportion (88%). Cases that analyzed only forest land (30% of all cases) or other land cover types (15% of all cases, such as urban land and protected areas) make up nearly half of all cases. Regional variations across the cases are also significant. Forest-only studies are more common among the NE and partial NE cases, whereas cases dealing with other land types occur only in partial NE and south NE. This is reasonable, since south NE is more developed than north NE and the other land types are mainly urban land and impervious landscape. Papers with less than 50% of their content devoted to LUCC and papers with more than 50% hold similar shares, except in south NE, where a higher proportion of cases devote more than 50% of their content to LUCC. Papers with less than 50% LUCC content mainly focus on research topics closely related to LUCC, such as forest composition, nonnative species, and ecosystems. In addition, not all cases include an analysis of driving forces: nearly one fourth of all cases (24%) do not mention any explanatory factors of LUCC. Cases that include only a simple analysis of driving forces (e.g., only a few sentences about one or two explanatory factors) account for 25% to 50% of the cases in each of the four location groups. Only 43% of all cases involve a relatively comprehensive driver analysis, and this share does not vary considerably across the four location groups.
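The shares reported above follow directly from the case counts. As a worked check, the sketch below recomputes them from the counts given in the text; the dictionary keys and variable names are illustrative labels, not terms from the study.

```python
TOTAL_CASES = 54

# Case counts by study-area location, as reported in the text.
cases_by_location = {
    "whole NE": 14, "partial NE": 16,
    "MA": 7, "CT": 6, "ME": 4, "RI": 3, "NH": 3, "VT": 1,
}

def share(count, total=TOTAL_CASES):
    """Return count as a percentage of total."""
    return 100.0 * count / total

regional = cases_by_location["whole NE"] + cases_by_location["partial NE"]
state_level = sum(n for name, n in cases_by_location.items()
                  if name not in ("whole NE", "partial NE"))
south_ne = sum(cases_by_location[s] for s in ("CT", "MA", "RI"))
north_ne = sum(cases_by_location[s] for s in ("ME", "NH", "VT"))

for label, count in [("regional", regional), ("state level", state_level),
                     ("south NE", south_ne), ("north NE", north_ne)]:
    print(f"{label}: {count} cases, {share(count):.1f}%")
```

Recomputing shares from raw counts in this way is a quick consistency check on rounded percentages.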

Among all the study cases, we selected those that include a driver analysis and then summarized these 41 cases by driver study method and by the occurrence of specific underlying driving forces (Table 4). The driver study methods fall into two general types: descriptive methods and quantitative methods. Papers with descriptive methods discuss possible driving forces of LUCC based on observations, reviews of related materials, or personal experience, whereas papers with quantitative methods quantify the associations between LUCC (or land cover classes) and driving factors. Cases with descriptive methods (30, or 73%) are about three times as common as cases with quantitative methods (11, or 27%), and they show comparatively low variation across locations. The three most commonly mentioned drivers are population, economics, and topography. Population here includes all similar expressions, such as population density, population growth, and population change. Economics includes some vague terms, such as economic activity and social-economic drivers in the descriptive cases, and some specific terms, such as per capita income and median household income in the quantitative cases. Topography mainly includes elevation and slope. Population is considered in 85% of all cases and is the most pronounced driver in the partial NE cases (90%). Economics is considered in slightly more than three fifths of all cases (63%). Economic drivers are reported more often in the NE and partial NE cases than in the single-state cases. Topography is reported less often than the former two drivers, accounting for 39% of all cases. Cases with topography-driven LUCC appear to be the least widespread in north NE (14%).

Results of the Random Forest Approach
The RF approach was employed to assess the driving forces of LUCC in terms of the importance of individual explanatory variables, that is, their predictive power for capturing the changes (Figure 4). The driver "Distance to developed area (DD)" was consistently ranked as the most important explanatory variable at all three sample sites. Its contributions to LUCC were 20.44% for MetroWest, MA, 14.00% for SouthCentral, CT, and 23.61% for Northeastern, CT. In contrast, the driver "Protected area (PA)" was ranked as the least important variable at the two sample sites in CT, with a relative importance lower than 0.3%, and the driver "Aspect (AS)" was ranked as the least important variable (0.29%) for MetroWest, MA. The driver importance values at the two CT sites are more similar to each other than to those at the MA site. Besides DD, distance to water (DW) and distance to roads (DR) also play an important role in explaining land change at the two CT sites. At these sites, the location drivers explain more land change than social-economic drivers such as house density change (HUSC) and population density change (POPC). In contrast, the social-economic factors HUSC, population density (POPL), and POPC show a strong relationship with land change in MetroWest, MA, with relative importance values of 15.78%, 15.28%, and 7.41%, respectively (Figure 5). Hence, driver importance does vary among locations.
The results show that the RF method may serve as an effective tool for evaluating the relationships between land changes and explanatory variables. The RF model predicts whether the land type at each pixel changed over the study period using different drivers that could affect this change, such as population density and distance to roads. Cross-validation is widely accepted as a model evaluation method that offers a robust measure of predictive power [45]. To make use of all the data, 10-fold cross-validation was adopted here to evaluate the performance of the RF method. We split the data evenly into ten folds; each time, we trained the random forest model on nine folds and tested its performance on the remaining fold. This process was repeated ten times so that every fold served as the test fold once. The averaged evaluation metric was used as the indicator of overall model performance. We calculated the prediction of land use change for the three research areas, Northeastern, SouthCentral, and MetroWest; the accuracies from the 10-fold cross-validation are 96.20%, 95.33%, and 95.10%, respectively. Hence, the random forest model, with all the available explanatory variables, correctly predicts the change status of more than 95% of the pixels in each of the three planning regions.
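Ranking the drivers and comparing driver groups is a matter of sorting and summing the importance values. The sketch below uses only the MetroWest values quoted in the text, so the list is incomplete and the values do not sum to 100%. The assignment of each driver to a group is our illustrative assumption, in particular the treatment of aspect as a biophysical variable.

```python
# Relative importance (%) quoted in the text for MetroWest, MA. Drivers not
# quoted in the text are omitted, so these values do not sum to 100.
metrowest = {"DD": 20.44, "HUSC": 15.78, "POPL": 15.28, "POPC": 7.41, "AS": 0.29}

# Assumed grouping for illustration (AS treated as biophysical).
DRIVER_GROUP = {
    "DD": "location",
    "HUSC": "socio-economic",
    "POPL": "socio-economic",
    "POPC": "socio-economic",
    "AS": "biophysical",
}

# Rank drivers from most to least important.
ranked = sorted(metrowest.items(), key=lambda item: item[1], reverse=True)

# Sum the importance of the quoted drivers within each group.
group_totals = {}
for driver, value in metrowest.items():
    group = DRIVER_GROUP[driver]
    group_totals[group] = group_totals.get(group, 0.0) + value

for driver, value in ranked:
    print(f"{driver}: {value:.2f}%")
for group, value in group_totals.items():
    print(f"{group}: {value:.2f}%")
```

Summing by group gives a compact way to compare, for example, location drivers with social-economic drivers across sites, provided the same drivers are included for every site.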
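The 10-fold procedure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in the study: the learner is a stand-in that always predicts the majority class (it is not a random forest), and the pixel features and labels are synthetic. In practice, the `fit` and `predict` functions would wrap a random forest implementation.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(features, labels, fit, predict, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, and average the accuracy."""
    folds = k_fold_indices(len(labels), k, seed)
    accuracies = []
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        model = fit([features[j] for j in train_idx],
                    [labels[j] for j in train_idx])
        hits = sum(predict(model, features[j]) == labels[j] for j in test_idx)
        accuracies.append(hits / len(test_idx))
    return sum(accuracies) / k

# Stand-in learner (NOT a random forest): always predicts the majority class.
def fit_majority(train_x, train_y):
    return max(set(train_y), key=train_y.count)

def predict_majority(model, x):
    return model

# Synthetic pixels: one feature (a distance in metres); label 1 = changed.
features = [[d] for d in range(0, 1000, 10)]             # 100 pixels
labels = [1 if i % 5 == 0 else 0 for i in range(100)]    # 20 changed, 80 unchanged
mean_accuracy = cross_validate(features, labels, fit_majority, predict_majority)
print(f"mean 10-fold accuracy: {mean_accuracy:.2f}")
```

The majority-class baseline is also a useful reference point: when unchanged pixels dominate, a high overall accuracy is only informative relative to the accuracy obtained by always predicting "no change".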

Land Use/Cover Change Studies in New England Area from the Meta-Study
Understanding land use/cover change, although not a new topic of concern, plays an important role in the study of global environmental change. Past works show that the study of land use/cover change in the New England area is closely associated with multiple other research areas, such as urbanization, vegetation dynamics, nonnative species, and climate change. LUCC study is not only an important research area in its own right, but also serves as a foundation for multidisciplinary studies and as a connection between research topics. For example, one study assessed water quality variability in Sebago Lake, Maine, in relation to climatic factors and land use change, and indicated an influential role of land use change in determining water quality [1]. Most of the past LUCC studies in the study area are multidisciplinary, and nearly half of them (48% of all studies) devote less than 50% of their content to LUCC analysis, relying mainly on LUCC information from previous LUCC studies. Even the remaining studies, which devote more than 50% of their content to LUCC analysis, hardly offer a comprehensive LUCC analysis. In addition, many studies (30% of all studies) considered only forest land, which reflects the fact that forest is the major land class in the research area. Therefore, a comprehensive LUCC study in the research area is still needed; it could also provide supporting information for related multidisciplinary studies.
To better understand LUCC, an in-depth analysis of the relationships between the changes and their causes (the driving forces) is needed. Knowledge of these relationships may greatly improve the modeling and projection of various kinds of LUCC and help prevent possible negative influences. Analysis of the driving forces in the research area is still limited. Nearly one fourth of the previous studies (24%) did not mention the drivers of LUCC, and one third of all studies (33%) discussed the drivers in only a few sentences. Even the remaining 43% of studies barely provided a truly comprehensive analysis of driving forces. Among the studies that mentioned the drivers of LUCC, the descriptive method (73%) was the major research technique. In the research area, an association between LUCC and population growth is evident over the long term. Associations between LUCC and economy and topography can also be found, with variations. However, the association between LUCC and population growth is not a simple linear relationship. For example, urban sprawl and deforestation in the past several decades were directly associated with population growth, whereas more recently they have been more closely related to low-density residential development. These important contributions from past studies only mapped the broad relationships between LUCC and its driving forces. Moreover, the studies that focused on LUCC and included a relatively comprehensive driver analysis account for only 30% of all studies.

Consistency and Difference of the Meta-Study and the RF Method
RF explained over 95% of the LUCC for all three planning regions in our case study. The importance of LUCC drivers in the three planning regions within the research area was evaluated and compared. In general, the RF model results show that LUCC is the aggregated effect of anthropogenic and biophysical drivers, which is consistent with the findings from the meta-study. On the other hand, the importance of anthropogenic and biophysical drivers varies to some extent among the three planning regions, a variation that was hardly detected by the meta-study. The RF feature importance indicated that socioeconomic factors, including population density and house density, were the predominant determinants of LUCC in the relatively developed regions such as the SouthCentral and the MetroWest, whereas biophysical factors were relatively more important in explaining LUCC in the Northeastern, CT. A finding consistent between the meta-study and the RF method is that location factors, such as distance to developed area and distance to road, were important drivers for all three planning regions. However, the location drivers were barely mentioned in the previous qualitative studies in this research area. In summary, the findings from the meta-study and the RF method show both consistency and inconsistency.
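To make the idea of ranking drivers by RF feature importance concrete, the following is a minimal, self-contained sketch of a random forest with impurity-based (mean decrease in Gini) importance. It is not the software or configuration used in this study. The driver names, the synthetic data, the rule that generates land change, and all parameter values (number of trees, depth, features per split) are assumptions chosen only for illustration.

```python
import random

# Hypothetical driver names; the data below are synthetic, not the study's data.
DRIVERS = ["distance_to_road", "population_density", "unrelated_noise"]


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(rows, labels, features):
    """Return (feature, threshold, impurity decrease) of the best split, or None."""
    parent, n, best = gini(labels), len(rows), None
    for f in features:
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            gain = parent - (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or gain > best[2]:
                best = (f, t, gain)
    return best


def grow(rows, labels, depth, n_total, max_features, rng, importance):
    """Grow one tree recursively, adding weighted impurity decreases to importance."""
    if depth == 0 or len(set(labels)) < 2:
        return
    # Random feature subset at each split, as in a random forest.
    features = rng.sample(range(len(rows[0])), max_features)
    split = best_split(rows, labels, features)
    if split is None or split[2] <= 0.0:
        return
    f, t, gain = split
    importance[f] += len(rows) / n_total * gain
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    for part in (left, right):
        grow([r for r, _ in part], [y for _, y in part],
             depth - 1, n_total, max_features, rng, importance)


def forest_importance(rows, labels, n_trees=30, max_depth=3, max_features=2, seed=0):
    """Normalized mean-decrease-in-impurity importance over bootstrapped trees."""
    rng = random.Random(seed)
    n = len(rows)
    importance = [0.0] * len(rows[0])
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        grow([rows[i] for i in idx], [labels[i] for i in idx],
             max_depth, n, max_features, rng, importance)
    total = sum(importance)
    return [v / total for v in importance]


# Synthetic cells: change is made likely near roads and slightly more likely
# where population density is high (an assumed rule, for illustration only).
data_rng = random.Random(42)
rows, labels = [], []
for _ in range(150):
    dist, pop, noise = data_rng.random(), data_rng.random(), data_rng.random()
    p_change = (0.8 if dist < 0.5 else 0.1) + (0.15 if pop > 0.7 else 0.0)
    rows.append((dist, pop, noise))
    labels.append(1 if data_rng.random() < p_change else 0)

scores = forest_importance(rows, labels)
ranking = sorted(zip(DRIVERS, scores), key=lambda item: item[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because the synthetic change rule depends most strongly on the distance variable, that variable should rank first. With real data, the ranking would depend on the variables included, their correlations, and the importance measure chosen.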

Model Suitability and Reliability
Identification of the driving forces of LUCC is a challenging task, due to the complexity and nonlinearity of LUCC, the diversity of drivers at different temporal and spatial scales, and the interactions and feedbacks among drivers. A mixed-method approach may provide more comprehensive information. In our study, the meta-study provided an initial overview of the research content. The strength of the meta-study lies in its capability to pool results from individual studies to generalize the driving forces of LUCC in the research area. However, related LUCC studies in the New England area were rare and were conducted at different temporal and spatial scales, and differences in research focus and methods could lead to bias. Nevertheless, the geospatial analysis in the meta-study may provide information about rates of LUCC and spatial distributions of drivers, which may be combined with the RF model results to quantify associations between LUCC and driving factors. The quantitative RF model may reduce the subjectivity that weakens the meta-study. In general, quantitative methods are often seen as more reliable than qualitative methods because they provide objective quality criteria [46,47]. At the same time, much qualitative spatial data representing indirect drivers is needed for the quantitative study of LUCC. Therefore, a mixed-method approach that combines qualitative and quantitative analyses should benefit the research by including both direct and indirect driving forces, which may increase the probability of discovering important drivers of LUCC. In addition, it may compensate for the methodological shortcomings of one method through the advantages of another [48]. Furthermore, the mixed-method approach allows the findings of different methods to be compared for discordance and consistency. In summary, information provided by multiple sources may be more powerful than that from a single source.
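The comparison of consistency and discordance between two methods can be sketched as a simple set comparison. The snippet below is an illustration only: the function name `compare_driver_sets` and the category labels are hypothetical, and the two example sets are a loose coding of driver categories for demonstration, not the dataset or results of this study.

```python
def compare_driver_sets(method_a, method_b):
    """Summarize agreement between the driver sets identified by two methods."""
    shared = method_a & method_b
    union = method_a | method_b
    return {
        "shared": sorted(shared),                # drivers found by both methods
        "only_a": sorted(method_a - method_b),   # drivers found only by method A
        "only_b": sorted(method_b - method_a),   # drivers found only by method B
        # Jaccard index: 1.0 means identical sets, 0.0 means no overlap.
        "jaccard": len(shared) / len(union) if union else 1.0,
    }


# Illustrative placeholder sets (assumed coding, not the study's results).
meta_study_drivers = {"population", "economy", "topography"}
rf_drivers = {"population", "housing", "location"}

summary = compare_driver_sets(meta_study_drivers, rf_drivers)
for key, value in summary.items():
    print(key, value)
```

Drivers that appear in only one set flag candidates that a single method would have missed, which is the kind of complementary evidence the mixed-method approach is meant to expose.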

Conclusions
The study started with a meta-study to reveal general information about LUCC in New England and then focused on the analysis of driving forces. By combining geospatial analysis results from the meta-study with the RF model, this mixed method provides a condensed analysis of the driving forces of LUCC from both qualitative and quantitative data in the study area. Our study revealed that population and economic factors are important drivers of LUCC in the research area, with a complex nonlinear relationship. The value of the mixed method is that it provides a more comprehensive identification of the drivers by complementing the findings of one method with the advantages of another. For instance, in our study the RF model indicated the importance of the location factors, which was barely noticed in the meta-study. Hence, this mixed method may be meaningful to LUCC study by enhancing understanding of the driving forces and providing criteria for selecting drivers when modeling the changes. Moreover, better land management advice for sustainable development could be given on the basis of the more solid information obtained with the mixed method. However, the mixed method is not feasible for all studies. It depends heavily on previous studies, which must be numerous enough to support a systematic review. In addition, the availability of important data, such as population and economic datasets, is a prerequisite for a quantitative analysis of driving forces. Moreover, although the new approach aims to improve our ability to understand the driving forces of LUCC, linking findings from different perspectives is always challenging. In future work, with the ongoing improvement of data, the mixed method should be able to better serve LUCC studies. In addition, research on a more standardized way to combine qualitative and quantitative methods is also necessary.