Towards Modelling Future Trends of Quebec’s Boreal Birds’ Species Distribution under Climate Change

: Adaptation to climate change requires prediction of its impacts, especially on ecosystems. In this work we simulated the change in bird species richness in the boreal forest of Quebec, Canada, under climate change scenarios. To do so, we ﬁrst analyzed which geographical and bioclimatic variables were the strongest predictors for the spatial distribution of the current resident bird species. Based on canonical redundancy analysis and analysis of variance, we found that annual temperature range, average temperature of the cold season, seasonality of precipitation, precipitation in the wettest season, elevation, and local percentage of wet area had the strongest inﬂuence on the species’ distributions. We used these variables with Random Forests, Multivariate Adaptive Regression Splines and Maximum Entropy models to explain spatial variations in species abundance. Future species distributions were calculated by replacing present climatic variables with projections under different climate change pathways. Subsequently, maps of species richness change were produced. The results showed a northward expansion of areas of highest species richness towards the center of the province. Species are also likely to appear near James Bay and Ungava Bay, where rapid climate change is expected.


Introduction
During the last two decades, climate change has been receiving growing attention in environmental studies, and it has been named as one of the main possible drivers of ecological change in the upcoming decades [1][2][3]. In its 2013 assessment report, the Intergovernmental Panel on Climate Change (IPCC) concluded that most of the natural systems on earth will be affected at different levels of intensity [4]. Terrestrial ecosystems have to adapt to early springs and fauna and flora have already started migrating, in most cases towards higher latitudes or elevations, where topography allows [4][5][6][7]. The IPCC conclusions highlight that a large number of ecosystems will be disturbed in the coming years, and that climate change impacts on the environment will encompass, amongst others, melting of ice caps, thawing of permafrost and vulnerability and instability of slopes in mountainous areas [4]. Parmesan (2006) reports on the many consequences of anthropogenic climate change on fauna and flora, highlighting the modification in spatial distribution of species. Similar research projects completed in North America, specifically for bird species, conclude that migration in the direction of higher latitudes or elevations has been observed when topography allows it [8][9][10]. This hypothesis is corroborated by studies developed in Antarctica, the Arctic and northern hemisphere temperate areas [3,10]. Furthermore, if temperatures continue to rise, particularly in the boreal areas, forest ecosystems may be facing considerable changes [11][12][13] which could in consequence facilitate shifts and reductions in suitable bird habitats [14].
Located in east-central Canada, the province of Quebec is the largest province in the country with a surface area of over 1.6 million km 2 . Its territory comprises various bioclimatic domains, from temperate forests to arctic tundra. Approximately 70% of Québec's territory is covered by the boreal forest [15]. It creates a belt over 1000 km wide between the 48th and 58th parallels, which comprises four distinct bioclimatic domains, going from deciduous and mixed forest stands within the Northern Temperate zone in the south, to the Arctic zone in the north, where no tree grows [16]. Quebec's boreal forest also provides critical habitat for birds, nesting from 150 to 300 migrant bird species, which could be greatly affected by climate change in the years to come [17].
In order to estimate the influence of climate change on the spatial distribution of boreal avian fauna, it is essential to identify the important predictors associated with the birds' spatial distribution. Following the above-mentioned literature concerning the effects of climate change on the habitats of bird species and possible consequent migrations [3,[8][9][10][11][12][13][14], we assume that the spatial distribution of species is highly correlated to climate, more particularly to precipitation and average annual temperature. Thus, bioclimatic projections should favor a migration of species towards higher latitudes or elevations. Moreover, this research aims to widen the knowledge about boreal bird species' responses to climate change, to help decision makers adapt their conservation policies. In order to accomplish this goal, we have divided our methodology in two steps as follows. First, we identified the explanatory variables linked to the spatial distribution of 37 boreal bird species via statistical modeling methods applied to multivariate data, such as redundancy canonical analysis (RDA) and variation partitioning. After the first part of the analysis, we calculated different species distribution models (SDMs) via bioclimatic modeling techniques and identified future trends in boreal avian fauna for the province of Quebec under different climate scenarios.

Study Area, Species Abundance, and Predictors Data
The geographic area subject of this study is the boreal forest of Quebec, Canada (44. (Figure 1). The two statistical modeling methods implemented in our research require different inputs; the redundancy canonical analysis (RDA) needs a dataset where the observations are noted by sampling site. Since data are available in raster format, for the whole province, a random sampling was done to reduce the number of observation points. In total, fifty sites were randomly selected in the bioclimatic domains of Quebec (Figure 1), using the random uniform point creation tool in ArcGIS [18]. The attribute table, to which the bird abundance data and explanatory variables are linked, was then exported to be used in the R statistical software [19] for multivariate analysis. In order to project the bird species distribution using bioclimatic models, the full dataset, corresponding to the study area, was used. These projections were obtained using the biomod2 package, in R, which takes raster layers as inputs.
Birds' abundance data were provided by eBird [20]. The eBird dataset contains, for Quebec only, more than 1.3 million entries. Even though these data do not go through any rigorous standardization and validation process, the enormous quantity of data available allows us to minimize biases related to data quality. The thirty-seven bird species used in this research (Appendix A), were selected by browsing a list of boreal birds [21]. The only criterion of selection was that the species is resident to the boreal forest of Quebec. From each record, spatial coordinates and the number of observations were selected for analysis. Multivariate statistical analyses (RDA and variation partitioning) require another step. In order to allow a random sampling of abundance for every species using the same sites, the maximum distribution area was interpolated for each species. This interpolation produced a raster layer of abundance for each species based on real observations, simulating a convex distribution hull, but working in a matrix environment composed of cells with values. In order to achieve that, a nearest neighbor interpolation method was used in ArcGIS [18]. The output is a raster layer showing the maximum area of distribution for each species with abundance values for the cells located inside the convex hull, giving 37 raster layers. Interpolation was necessary because multivariate statistics From each record, spatial coordinates and the number of observations were selected for analysis. Multivariate statistical analyses (RDA and variation partitioning) require another step. In order to allow a random sampling of abundance for every species using the same sites, the maximum distribution area was interpolated for each species. This interpolation produced a raster layer of abundance for each species based on real observations, simulating a convex distribution hull, but working in a matrix environment composed of cells with values. In order to achieve that, a nearest neighbor interpolation method was used in ArcGIS [18]. The output is a raster layer showing the maximum area of distribution for each species with abundance values for the cells located inside the convex hull, giving 37 raster layers. Interpolation was necessary because multivariate statistics require a data table with the same sites for the 37 species. The results of the interpolation of bird sightings give abundances ranging from 0 (absence) to 720 (highest abundance found, corresponding to the Surf Scoter sighted at the Île-aux-Basques bird sanctuary). A Shapiro-Wilk normality test, with the null hypothesis stating the data follows a normal distribution and the alternative hypothesis implying an skewed distribution, allows to conclude that the bird sightings are very skewed [22]. A log transformation of response variables was executed to reduce skewness and, thus, improve the strength of statistical tests. For the second step, bioclimatic modeling, raw observations were used as input (latitude-longitude) because species are modelled one at a time.
The goal of our multivariate analysis is the identification of spatialized variables influencing the spatial distribution of boreal birds. In order to do so, the selection of explanatory variables, also called predictors or independent variables, is of paramount importance. From the variables that affect birds' distribution, only the significant ones are used in the forecasts. The explanatory variables' coordinates are in the world geodesic system of 1984, or WGS84, and are available in raster format (with an extent of 2112 lines by 2721 columns). The area goes from 51.1 • W to 79.78 • W in longitude and from 44.98 • N to 62.58 • N in latitude. The spatial resolution is 0.0083 decimal degree by cell, or 787 m by 787 m, under one square kilometer. The bioclimatic data were provided by the WorldClim database and the variables were named BIO1 to BIO19 (Table 1) (Hijmans et al., 2005). The data were derived from monthly temperature and precipitation to give rise to more relevant variables in ecology, such as average summer temperature, annual range of precipitation, precipitation seasonality, and so on. The nineteen bioclimatic variables were considered initially, in order to determine which were associated to bird species distribution. During the selection of variables, some of them were discarded because of their low contribution to the models.  [4,23]. In its fifth report, the IPCC defined a new set of climate scenarios, named Representative Concentration Pathways (RCPs). These scenarios are named by their total radiative forcing for year 2100 in comparison with 1750, preindustrial reference year. The scenarios, named RCP 2.6 , RCP 4.5 , RCP 6.0 and RCP 8.5 represent a radiative forcing of 2.6 Wm −2 , 4.5 Wm −2 , 6.0 Wm −2 , and 8.5 Wm −2 respectively [24]. Climate simulations are available to be used, with atmospheric CO 2 concentrations of 421 ppm (RCP 2.6 ), 538 ppm (RCP 4.5 ), 670 ppm (RCP 6.0 ) and 936 ppm (RCP 8.5 ) for the year 2100 [4,24].
Elevation data were provided by the National Topographic Data Base of Canada [25]. This raster layer displays elevation, in meters, for the whole province of Quebec. Shorelines, riverbanks, oceans and estuaries have a value of zero. The median sea level (MSL) is assigned according to the geodesic height reference system of 1928 and the vertical resolution is one meter [25]. Anthropogenic disturbances dataset for Quebec was provided by Global Forest Watch Canada and is comprised of four type of disturbances: (1) linear (roads, pipelines, high voltage lines); (2) polygons (harvested forest, oil/gas well, agricultural land); (3) reservoirs; and (4) active mines [26]. No reservoir, mine or polygonal disturbance is found on the sampled sites, therefore only linear disturbances are kept for the analyses. Finally, forest inventory dataset was taken from the Canadian Forest Inventory of 2001 [27]. From this feature layer, two fields were extracted to a raster format: (1) a map showing the percentage of forest area for each cell; and (2) a map showing the percentage of wet area per cell. All the variables were grouped in three categories: the resource variables, containing forest and elevation datasets, the bioclimatic variables, BIO1 to BIO19, and the anthropogenic variables, containing linear disturbances.
The datasets required for the analysis were all provided by open data sources [20, 23,[25][26][27]. The tools used for analysis were the R 3.1.0 [19] statistical programming language and the ESRI ArcGIS for Desktop 10.1 [18] for some cartographic transformations and map production.

Methods
In order to predict the future spatial distribution of boreal bird species, a first stage consisted of identifying the variables linked to the spatial distribution of the 37 species through a multivariate analysis aiming to distinguish the shared predictors for the species under study. These variables being identified, a second step is the projection of species future distribution. In total, 37 species distribution models (SDMs) are generated, one for each species.

Statistical Analyses
The redundancy canonical analysis was first used on every variable [28]. After this step, collinear variables are successively removed from the model, in order to have a variance inflation factor (VIF) under 10 for each of the variables kept in the model. In order to keep only the more significant variables, a progressive bidirectional stepwise selection is executed. This type of approach does a selection of the most relevant variables in regards to the Akaike information criterion (AIC) and to the representatively p-value [29]. A stepwise selection using R squared as a selection criteria was used, in order to compare the result of these two selection approaches [29]. These three analyses are repeated until only the significant variables, with a VIF under 10, are integrated to the model.
A redundancy canonical analysis is an asymmetrical analysis needing a response dataset Y and an explanatory dataset X. RDA projects sites, response variables and explanatory variables in one ordination plot, allowing a simpler reading and understanding of the relations between Y and X [28,30]. Many tests were run on the results of the RDA in order to verify the significance of the results. The canonical axes and predictors were submitted to permutation tests to calculate the F-statistic and its associated probability. In order to model the spatial distribution of boreal birds, we need a set of explanatory variables that are independent from one another. Thus, it is necessary to reduce collinearity between variables. Collinearity exists when two or more explanatory variables are correlated. The bioclimatic variables provided by WorldClim [23], being based on temperature and precipitation, are all derived from the same data, and hence are highly correlated. We calculated the variance inflation factors (VIFs) of variables, by executing, successively, linear regressions of each explanatory variable on all other explanatory variables. The VIF for variable j is obtained with the following Equation [28]: A maximal threshold is fixed to 10, since a threshold of 5 does not allow, in this case, preservation of enough environmental variables. The removal of highly correlated variables, showing a high variance inflation factor, allows us to reduce the collinearity amongst them, and it also enables to better identify the individual influence of explanatory variables. Removed variables are the ones showing the highest VIF and being the least significant from an ecological point of view.
In order to better identify the factors influencing the spatial distribution of boreal birds, the variables are split in three groups: (a) bioclimatic variables; (b) the resource variables such as elevation, percentage of forest and percentage of wet area; and (c) anthropogenic linear disturbances [29]. Variation partitioning [30][31][32][33] identifies individual and joint effects of explanatory groups on variations of the response, as well as the proportion of unexplained variations.

Bioclimatic-Based Species Distribution Models
Bioclimatic modelling approaches were used to produce species distribution models (SDMs), using the R package {biomod2}, a platform for building, calibrating and evaluating species distribution models, or niche models [34].
To begin, it is essential to define the two concepts behind these static models: the pseudoequilibrium postulate and the niche model. The pseudo-equilibrium postulate is defined by Guisan et al. (2005) as follows: "As both species and environmental data are usually sampled during a limited period of time or/and space, models fitted using these can only reflect a snapshot view of the expected relationship. A convenient working postulate is to assume that the modelled species is in pseudo-equilibrium with its environment" [35,36]. The niche concept states that species distribution models, and particularly maps produced by them, lay on a combination of Grinnell's ecological niche, where a species can be found at any place where the environmental conditions for its reproduction are found, and the realized niche of Hutchinson, where a species is excluded from its fundamental niche by predation or competition [36]. This combination is due to the fact that SDMs use real sightings to identify the ecological niche of a species. However, the observations are the fruit of species competition and predation; thus, the realized niche serves indirectly to the generation of models.
For the species distribution models, three algorithms were used: RF, MARS and MAXENT (Table 2). Initially, for each species, the presence data were combined with 10,000 pseudo-absences randomly generated on the whole study area. Then, these presence/absences were overlapped to the environmental variables and divided in a training (80%) group, used for the creation of a niche model, and a test (20%) group, used for validation by discrimination metrics. To reduce the possibility of errors, the minimal area under the receiver operating characteristic (ROC) curve threshold, or AUC (Area Under the Curve), was set at 0.7; it is important to keep in mind that a random classification will have an AUC score of 0.5 while a perfect classification will have an AUC of 1. Through multiple runs, multiple models were calculated. Those meeting the AUC criterion were kept and others were excluded. For each of the remaining models, a cut-off threshold was calculated based on their ROC curve. These thresholds were reported in the output of the BIOMOD_Modeling() function of the package {biomod2} of R. The ROC curve is shaped by the False Positive (FP) and False Negative (FN) rates of predictions obtained by applying different thresholds to model outputs. The optimization of performance of the model based on its ROC curve should consider the prevalence (p) of positive cases as well as relative costs of FP and FN prediction errors [37,38]. These factors allow to calculate the slope m of the tangent to the ROC curve at the optimum point using Equation (2) [37,38]: If the ROC plot is a smooth curve, then the optimum point is where the curve has slope m; if ROC plot is not smooth but stepwise, then the optimum point is where the plot first touches a line with slope m as the line is moved down from top-left corner of the plot area [37,38]. Bearing in mind that each point on the ROC plot is the result of application of a threshold, once the optimum point on the plot is found, its corresponding threshold will be the desired ROC threshold of the model [38].
In the next step, future environmental conditions were added to the model to replace present climate data and generate future distribution maps. In order to define presence or absence of a species, an ensemble modelling method was used [34]. This method allows to calculate the mean of projections from the different algorithms used. A committee averaging algorithm was applied, where the probabilities of each model were transformed into binary data using the respective model's cut-off threshold. Each method voted in favor or against the presence of the species. Finally, this score was projected to the interval from 0 (consensual absence) to 1000 (consensual presence). This method allows us to adapt binary decisions of presence/absence to uncertainties related to projections in the same map, exhibiting the inherent uncertainties to the different bioclimatic modelling algorithms [39]. A value of 500 means that 50% of the models forecast an absence and 50%, a presence [40]. Figure 2 shows a simplified version of the bioclimatic modelling method used for each of the 37 species. Parallelograms show inputs/outputs, rectangles, processes, and the cylinder represents the climate database. The shaded grey rectangle represents the core of the model, the niche model.

Variable Selection
By including all variables in the first RDA and calculating VIFs, some variables showed an extremely high inflation factors, near 2000 (mean annual temperature-BIO1, mean temperature of coldest quarter-BIO11, lowest temperature of coldest month-BIO6). The R-squared is 0.854, but the adjusted R-squared is 0.735. A permutation test applied on the variables showed that the percentage of forested area (pct_for), precipitation of wettest month (BIO13), mean daily range (BIO2), isothermality (BIO3) and highest temperature of warmest month (BIO5) are significant (p < 0.05). The permutation test applied to canonical axes showed that only the axes 1, 7, 11 and 15 are significant (p < 0.01) after 299 permutations.
In order to reduce the number of variables and to strengthen the analyses, collinear variables were removed one by one, manually. The variables kept are elevation, linear disturbances, percentage of forested area, percentage of wet area and BIO7, BIO11, BIO15 and BIO16. The mean VIF of these eight variables is 5.24, with 11.7 as a maximal value (BIO11), which we found acceptable, because BIO11 is slightly over the fixed threshold, even though literature sometimes recommends threshold

Variable Selection
By including all variables in the first RDA and calculating VIFs, some variables showed an extremely high inflation factors, near 2000 (mean annual temperature-BIO1, mean temperature of coldest quarter-BIO11, lowest temperature of coldest month-BIO6). The R-squared is 0.854, but the adjusted R-squared is 0.735. A permutation test applied on the variables showed that the percentage of forested area (pct_for), precipitation of wettest month (BIO13), mean daily range (BIO2), isothermality (BIO3) and highest temperature of warmest month (BIO5) are significant (p < 0.05). The permutation test applied to canonical axes showed that only the axes 1, 7, 11 and 15 are significant (p < 0.01) after 299 permutations.
In order to reduce the number of variables and to strengthen the analyses, collinear variables were removed one by one, manually. The variables kept are elevation, linear disturbances, percentage of forested area, percentage of wet area and BIO7, BIO11, BIO15 and BIO16. The mean VIF of these eight variables is 5.24, with 11.7 as a maximal value (BIO11), which we found acceptable, because BIO11 is slightly over the fixed threshold, even though literature sometimes recommends threshold values of 15, and even 20 [28]. The RDA was then applied once again, this time only on the non-collinear variables. The adjusted R 2 of the model is 0.58. Afterwards, a bidirectional selection of variables led to the suppression of linear disturbances and percentage of forested area. The final analysis was then applied to six variables, these being the four bioclimatic variables (BIO7, BIO11, BIO15 and BIO16), the percentage of wet areas and elevation. At this stage, it became clear that BIO11, or mean temperature of coldest quarter, is the variable that influences the spatial distribution of species the most. The second most important variable is BIO16, or precipitations of wettest quarter. On the other hand, two variables were identified to negatively influence the presence and abundance of species: the seasonal variation of precipitation (BIO15) and annual range of temperatures (BIO7). One of the interesting facts shown by the RDA analysis is the species and sites grouping. The technique allows to distinguish the precipitation and temperature gradients found in Quebec. Table 3 provides information about each sample site (row0 to row49). A permutation test run on the explanatory variables allowed us to test for the significance of the six selected variables ( Table 4). Each of the variables integrated to the model have high significance (p = 0.01), with the exception of pct_wet, that is less significant than the others (p = 0.07). Finally, the same test, but run on the canonical axes, showed that the first three axes are highly significant (p < 0.005) ( Table 5) and that the fourth axis is to some degree significant (p = 0.071).

Variation Partitioning
For variation partitioning, the variables were initially separated into three groups. The four bioclimatic variables form the group of climate variables and the variables pct_wet and elevation form the group of resource variables, that is to say a group of variables that are indirectly linked to the species foraging behaviors. Since linear disturbances are dropped from the model, the group of anthropogenic variables was dropped as well. Bioclimatic variables are responsible for 53% of the total variation, directly (40%) or with the interaction with resource variables (13%). The resource variables, in turn, are only accountable for 18% of the total variation, directly (5%) or with the interaction of bioclimatic variables (13%). Finally, the residuals represent 41% of the total variation.

Bioclimatic Modelling
Averaging presence/absence projections for the thirty-seven bird species model gives rise to the maps presented in Figures 3 and 4a-d. For a better view and the possibility of zooming, these maps are also made available online at the address given in Supplementary Materials. Figure 3   The binary maps are produced by the classification of committee averaging maps, which have values from 0 to 1000, where 1000 is a presence forecast by all models. The threshold value of 500 was used to split between 0 and 1. The classification interval can be expressed as follows: [0:500] = 0, [500:1000] = 1. By adding the values of the 37 different maps, the result is a map where values are stretched from 0, or complete absence of the 37 species, to 37, where every species modelled is forecasted to be present. These results highlight an overall expansion of bird habitats towards north. In addition, there seem to be some discontinuities: some new habitats emerge in locations that are not connected to the larger habitable zone in the south. Two particular examples noticeable in Figure 4a are the James Bay area, located at the extreme west and Ungava Bay, located in the North-East of the province. Emergence of habitat in those two key areas could be explained by their proximity to the ocean, justified by the continentality principle [41]. According to Hirschi et al., the change in continentality can almost be explained by one factor, which is an increase of 3.5 • C in the temperatures of the coldest month. This variable goes in accordance with our identification of BIO11, or the temperature of the coldest quarter, as a key factor for bird species distribution. While most of the bird species are presently distributed along the Saint-Lawrence River Valley, in the south of the province, there is a chance that the species will move towards higher latitudes or longitudes, in any of the selected climate scenarios and time periods. Maps displayed on the right panel of Figure 4a-d represent the change in species richness, calculated as the difference between future (2050 and 2070) and current species richness distribution maps. Accordingly, a pixel with a value of 37 represents a "new presence" for the whole 37 species, while a pixel with a value of −37 would represent a loss in all 37 species. Negative change is shown in red, and positive change is displayed in different shades of green.
"new presence" for the whole 37 species, while a pixel with a value of −37 would represent a loss in all 37 species. Negative change is shown in red, and positive change is displayed in different shades of green.

Discussion
The redundancy canonical analysis (RDA) shows that spatial distribution of the 37 bird species is linked to bioclimatic factors, such as maximum and minimum temperatures and precipitation in the wettest quarter. Even though the initial analysis includes numerous predictors, such as anthropogenic disturbances, forest cover and wet areas, only four bioclimatic variables, elevation and wet areas explain boreal bird abundances. Bioclimatic variables explain 53% of boreal bird spatial distribution, and resource variables, 5%. Thus, it is possible to state that, for our data, anthropogenic disturbances do not seem to be linked to spatial distribution of boreal birds.
The core of the model calculates suitability of every pixel for the species of the study, and positive and negative effects of explanatory variables are defined in the same frame. The analysis revealed that the factor with the strongest effect on suitability of locations-and hence on presence of species therein-is the mean temperature of the coldest quarter. The higher the temperature of the coldest quarter at a given location, the more likely it is to have resident birds there. This is of course particular to the area and species of the present study. Examples of studies in other areas are given in the next paragraph. In our study, the second most important factor in the abundance of birds is the precipitation of the wettest season. Higher values of this variable are associated with higher probabilities of the presence of birds. On the other hand, it was identified that seasonal variations in precipitation and annual temperature range are two variables that negatively influence the presence of resident species.
Each location has its own biogeographical characteristics and influential factors that have to be identified in independent analyses, though similarities exist between studied cases as well. A study of indicators of breeding bird richness in the Canadian province of British Columbia [42] identified annual evaporative demand, climate moisture deficit, and mean elevation as important explanatory variables. In another study on the range shifts of birds in Finland and northern Norway [10] mean temperature of April-June, precipitation in April-June, mean temperature of the coldest month, and precipitation in December-February were the important climate variables, while elevation range and mean altitude above sea level were influential topographic variables. Another study on bird species abundance in the U.K. [43] used mean temperature and total rainfall in two periods from April to July and from December to February, in addition to land cover as pertinent variables. A study on breeding wetland birds in the Czech Republic [44] explained birds' distribution by temperature and precipitation in March-April as important climate variables, and by habitat and topography variables. Another study focusing on a single subspecies (Lagopus muta helvetica) in the Swiss Alps [45] found mean July temperature as the most important bioclimatic variable that explained suitability of areas in multiple scales. The same study noted that in smaller geographical scales, annual precipitation, July water budget, and July cloud cover were also important explanatory variables. In comparison, our analysis of factors affecting resident bird species richness in Quebec identified geographic conditions as well as two temperature variables (mean temperature of the coldest quarter and annual temperature range) and two precipitation variables (precipitation in the wettest quarter and precipitation seasonality). Indicating cold/wet limits and respective measures of dispersion, our selected bioclimatic variables refer to intervals of variation of temperature and precipitation. It is noteworthy that temperatures of coldest months were also used in the cases of Finland-Norway [10] and the U.K. [43], but not in the cases of British Columbia [42] and the Czech Republic [44]. In the case of British Columbia there is an indirect reference to the higher temperatures in the variable 'annual evaporation demand' [42]. In contrast, our study found the temperature of the colder quarter to be the most influential factor in habitat suitability. Such difference is reasonable considering that the climate conditions of Quebec are more similar to those of Finland than British Columbia.
One must remain cautious with the results, since bird sightings are usually made on the sides of roads or in disturbed areas, such as cities or countryside. That means, in areas that are far from reach, it is unlikely that anyone passes and reports a sighting. In particular, we have very scarce data for the North of Quebec, and none for remote areas. It must be considered that such lack of records does not mean complete absence of species in these areas. Furthermore, although data provided by eBird are very large and comprehensive, these datasets are highly biased by a lack of standardization in the process of data collection. We must keep in mind the sensitivity of the analysis to the spatial resolution of data. Thus, random sampling gives rise to sites with different properties varying with the scale or spatial resolution of datasets. Nevertheless, these results allow us to consider bioclimatic variables as important drivers of boreal bird spatial distribution, allowing the use of bioclimatic modelling techniques.
Thus, for the second stage of this project, the species distribution modelling applied to bird species, only the bioclimatic variables and elevation were used. The models agree with the hypothesis that climate change would induce a shift in the spatial distribution of boreal bird species. This idea is in agreement with the northern biodiversity paradox, which states that even though climate change will be a major cause of extinction of species, boreal areas, however, will see an increase in species richness and in biodiversity [46]. The results of a similar study show that, for 80% of the climate scenarios used, North America should see an 11% net loss of animal species under B1 scenario (similar to RCP4.5) and a 17% net loss of animal species under A2 scenarios (high emission, compared to the middle point between RCP6.0 and RCP8.5) [2].
Results of the present study show that, with climate change, bird species richness in southern Quebec is likely to increase remarkably. As the climate continues to change in the scenarios considered, the suitability of northern areas will increase gradually, such that zones of high avian species richness will expand towards higher latitudes. The results of multivariate analysis showed that boreal bird species are strongly linked to climate. Thus, as bioclimatic models exhibit, climate change could induce changes in the spatial distribution of these species. Taking into account that BIO11 is strongly related to bird species distribution, the changes in continentality could explain the forecasted increase in species presence around the James and Ungava Bays, both located in the Arctic Ocean, which experiences a slight change in continentality, explained by the temperatures of the coldest month.
The prediction maps also indicate that, for the same duration, the more intense climate change scenario (RCP8.5) leads to further expansion of the zone of high avian species richness. This is particularly noticeable in the prediction map of 2070, in which larger areas north of the province become zones of higher avian biodiversity.
The expected increase of biodiversity in Quebec, especially in the southern part in the near future, has an important implication for environmental decision making and policy. In a zone of high biodiversity, disturbed or destroyed habitat will influence a larger number of species and cause a strong environmental impact. In this context, preservation of intact land and protection of suitable habitats for these species will be of higher importance than before.
The species distribution models produced in this research are static and lie on a pseudoequilibrium postulate, defined by Guisan and Zimmermann (2000). Indeed, every static model lies on the premise that species distribution patterns are in equilibrium or pseudo-equilibrium with the environment, since static models cannot manage disequilibrium or dynamic equilibrium exhibited by ecosystems. Dynamic and stochastic modelling approaches, such as individual-based modelling or cellular automata, allow us to deal with such conditions, since they rely on an ascending (bottom-up) approach, modelling at the individual scale, allowing feedbacks and non-linearity, while the present methods rely on a descending (top-down) approach, where species distribution areas are identified using statistics and environmental variables, in contrast with behavioral traits and everyday preferences of species.
The method developed here consists of a combination of multivariate statistics and species distribution modelling. It could easily be adapted to other species and/or study areas. The first step, including redundancy canonical analysis, removal of collinearity, bidirectional selection and variation partitioning, allowed us to select relevant explanatory variables to be used in the second step, in order to generate parsimonious models, instead of including many irrelevant variables.

Conclusions
This research presents a multivariate analysis that allows us to identify independent variables explaining the spatial distribution of 37 bird species found in the boreal forest of Quebec. Subsequently, Random Forests, Multivariate Adaptive Regression Splines and Maximum Entropy models were used to explain spatial variations in species abundance, in order to calculate future species distributions under different climate change pathways. The key variables identified include annual range of temperature, mean temperature of coldest quarter, precipitation seasonality, precipitation in the wettest quarter, and elevation. The variable the most closely linked to modelled species is BIO11, or mean temperature of the coldest quarter. This is easily explained by the fact that the modelled species are boreal species. The results of multivariate analysis showed that boreal bird species are strongly linked to climate. Thus, as bioclimatic models exhibit, climate change could induce changes in the spatial distribution of these species. Results indicated that within the boreal forest of Quebec, the mean temperature of coldest quarter (BIO11) is strongly related to bird species distribution and explains the forecasted increase in species presence around James Bay and Ungava Bay, which are located in the Arctic Ocean. Mechanistic and stochastic models taking into account forest fragmentation and natural and anthropic disturbances could better represent the complexity inherent to forest ecosystems and, thus, take into consideration interactions and possible feedbacks with other species or with the environment. The analyses and results presented in this study could support environmental decision making, concerning the design of future protected areas to maintain biodiversity.