Characterizing Off-Highway Road Use with Remote-Sensing, Social Media and Crowd-Sourced Data: An Application to Grizzly Bear (Ursus Arctos) Habitat

Characterizing roads is important for conservation since the relationship between road use and ecological impact can vary across species. However, road use is challenging to monitor due to limited data and high spatial-temporal variability, especially for unpaved roads, which often coincide with critical habitats. In this study, we developed and evaluated two methods to characterize off-highway road use across a large management area of grizzly bear (Ursus arctos) habitat using: (1) a ‘network-based’ approach to connect human activity hotspots identified from social media posts and remotely detected disturbances and (2) an ‘image-based’ approach, in which we modeled road surface conditions and travel speed from high spatial resolution satellite imagery trained with crowd-sourced smartphone data. To assess the differences between these approaches and their utility for characterizing roads in the context of habitat integrity, we evaluated how behavioural patterns of global positioning system (GPS)-collared grizzly bears were related to road use characterized by these methods compared to (a) assuming all roads have equal human activity and (b) using a ‘reference’ road classification from a government database. The network- and image-based methods showed similar patterns of road use and grizzly bear response compared to the reference, and all three revealed nocturnal behaviour near high-use roads and better predicted grizzly bear habitat selection compared to assuming all roads had equal human activity. The network- and image-based methods show promise as cost-effective approaches to characterize road use for conservation applications where data is not available.


Introduction
Roads are one of the primary drivers of human-caused environmental change affecting terrestrial and aquatic ecosystems globally [1]. There is growing recognition that the ecological implications of roads are complex and far-reaching, and an entirely new scientific field of road ecology has emerged to better understand the direct and indirect effects of roads on ecological systems and wildlife populations [2]. In particular, there is mounting evidence that indirect and cumulative effects of roads on wildlife behaviour and health may be especially pervasive [2]. Central to this understanding is the need to characterize roads according to their physical attributes (e.g., width, road surface) and how they are used by people (e.g., traffic volume, travel speed, vehicle type). Although physical attributes tend to be simple to measure and are frequently documented during construction, road use characteristics related to human activity are challenging to measure and can be spatially highway, unpaved roads across a large management area of grizzly bear habitat in Alberta, Canada using remote sensing, social media, and crowd-sourced data collected from mobile devices. The first method was a 'network-based' approach using circuit theory to estimate the intensity of human activity across the road network by connecting hotspots identified from social media posts, public databases, and remotely detected disturbances. The second method was an 'image-based' approach, in which we modeled road surface conditions and travel speed from spectral attributes derived from high spatial resolution satellite imagery that was trained with crowd-sourced data voluntarily collected from smartphones.
Our primary research objectives were threefold. First, we described the two novel methods (network-based and image-based) to characterize road use in critical wildlife habitat where traffic data is not available. Second, we compared these two methods to each other and a 'reference' road classification derived from a detailed public database managed by the provincial government. We note that the reference classification does not represent a true ground-truth, but rather represents the best currently available characterization approach, which is known to be imperfect yet is still rarely available for most regions. Thus, it serves as an informative comparison. Third, we evaluated the utility of all three characterization methods in the context of modeling wildlife (i.e., grizzly bear) behaviour and habitat use compared to the assumption that all off-highway roads have equal human activity.
An in-depth evaluation of how roads and road use influence grizzly bear behaviour and habitat use was beyond this scope of this study, and has been covered elsewhere (e.g., [3,5,17,19,32]). Rather, we sought to understand the degree to which the three road characterization approaches (network-based, image-based and reference) captured information about roads that is meaningful when evaluating their impact on wildlife, and specifically whether the two new approaches (network-based, and image-based) yielded useful characterizations in the absence of detailed road data (e.g., the reference classification). Since grizzly bears are expected to elicit behavioural shifts in response to traffic, including increased nocturnal activity [17], we therefore evaluated if the three road characterizations influenced daytime and nighttime movement and habitat selection patterns observed in grizzly bears based on GPS collar locations. For comparison of the three characterization approaches, each approach placed individual road segments into rankings of low, medium, or high human activity levels, which are described in the subsequent sections.

Study Area
This study focused on unpaved roads within the Yellowhead Bear Management Area (BMA 3), covering approximately 28,774 km 2 in west-central Alberta, Canada for an interior population of grizzly bears. The area follows a distinct east-west gradient with the western extents comprised of alpine and sub-alpine sections of the Canadian Rocky Mountains, whereas central and eastern sections are comprised of montane forest as well as upper and lower foothills [33]. The central and eastern portions outside of Jasper National Park are highly fragmented by unpaved road networks constructed for oil and gas exploration, agriculture, timber harvesting, coal mining, and wildfire suppression ( Figure 1). Recreational activities are varied and widespread throughout the region and include hiking, camping, hunting, biking and off-road motorsports (e.g., ATVs, motorcycles). Many of the unpaved roads have 'unimproved' surfaces and are built for short term resource extraction activities rather than permanent access, and recreation activities along these roads may be inhibited by gates, barriers, seasonal closures or complete road deactivation once industrial activities cease. Other unpaved roads built as throughways to towns or to access mines are more permanent with improved gravel surfaces, ongoing maintenance and uninhibited access. along these roads may be inhibited by gates, barriers, seasonal closures or complete road deactivation once industrial activities cease. Other unpaved roads built as throughways to towns or to access mines are more permanent with improved gravel surfaces, ongoing maintenance and uninhibited access.

Provincial Road Data
The Government of Alberta maintains an authoritative and publicly available database of roads classified into 15 categories-five paved categories and 10 unpaved categories. We downloaded the road network encompassing the entire study area (Figure 1) from the GeoDiscover Alberta data portal in April 2018. About 13,800 km of unpaved roads existed within the study area, making up 85% of all roads. The road network data was split into individual segments at each road junction or intersection. These segments became the unit of classification for all three classification approaches.

Classification Method
For the reference classification, we grouped the ten unpaved road categories into a three-class ranking based on the authors' knowledge of the area and the provincial road database, to be compared with the network-and image-based classifications. We ranked "two-lane gravel roads" as high use, "one-lane gravel roads" as medium use, and all other unpaved road classes ("unimproved", "truck-trail", "temporary", "unclassified", "road", "winter-road", "ford-winter-xing", "ferry-route") as low use. The two-and one-lane gravel roads tend to be permanent access routes while the other classes tend to represent temporary or unmaintained roads. Two-lane roads generally connect population centres while one-lane roads serve as ancillary branches to access the temporary or unmaintained roads, which are typically constructed for forestry activities and are often abandoned or decommissioned after logging is complete.

Provincial Road Data
The Government of Alberta maintains an authoritative and publicly available database of roads classified into 15 categories-five paved categories and 10 unpaved categories. We downloaded the road network encompassing the entire study area (Figure 1) from the GeoDiscover Alberta data portal in April 2018. About 13,800 km of unpaved roads existed within the study area, making up 85% of all roads. The road network data was split into individual segments at each road junction or intersection. These segments became the unit of classification for all three classification approaches.

Classification Method
For the reference classification, we grouped the ten unpaved road categories into a three-class ranking based on the authors' knowledge of the area and the provincial road database, to be compared with the network-and image-based classifications. We ranked "two-lane gravel roads" as high use, "one-lane gravel roads" as medium use, and all other unpaved road classes ("unimproved", "truck-trail", "temporary", "unclassified", "road", "winter-road", "ford-winter-xing", "ferry-route") as low use. The two-and one-lane gravel roads tend to be permanent access routes while the other classes tend to represent temporary or unmaintained roads. Two-lane roads generally connect population centres while one-lane roads serve as ancillary branches to access the temporary or unmaintained roads, which are typically constructed for forestry activities and are often abandoned or decommissioned after logging is complete.

Human Activity Data
For use in the network-based classification, we derived locations of human activity representing three types: (1) general activities (e.g., recreation and commerce); (2) oil and gas exploration and development; and (3) forest harvesting. The latter two represent the main natural resource extraction activities that occur in the region. We located general activities from posts to multiple social media platforms, specifically: Flickr, Facebook, Google and Wikilocs. We chose to combine platforms in order to capture a larger number of users, and account for the possibility of different usage patterns and activity types for different platforms [26].
In addition to locations of activity, we also quantified the intensity of activity at each location as follows. For Flickr data, we used their application programming interface (API) to extract photos for every centre-point in a 0.005-degree grid over the study area and counted the number of unique users that had posted photos within 500 m of each point. For Facebook data, we used their API to search for every named location within the study area and extracted the coordinates and the number of 'check-ins' for each location. For Google data, we used their API to locate every named place in the study area and then programmed a web browser to extract the number of reviews for each named place. For Wikilocs, a website of user-shared trails locations and information, we manually downloaded each available GPS track in the study area, extracted the start location of the track and the number of times the track had been downloaded. Initial locations and intensity values for each social media platform are shown in Supplementary Materials: Figure S1.
We filtered all locations to select those within 1 km of an unpaved road and 'snapped' the remaining locations to the nearest unpaved road. Therefore, posts within 1 km of each other were combined into a single location with a summed intensity value. We scaled the intensity values to the maximum of the study area for each respective social media platform, and combined posts from the four platforms into a single overall intensity value ( Figure  2a). This was done to account for the fact that we did not restrict the time frame of location or intensity data, but rather included all data available on the date that data extraction occurred (approximately December 2018). The data, therefore, represent the cumulative activity across the landscape from when social media use became widespread (roughly 2005~2010) through approximately 2018. By combining datasets only after scaling each dataset separately, we sought to account for the fact that some platforms have been around longer than others and had different numbers of locations with data (Supplementary Materials: Table S1).
To identify locations of oil and gas activity, we extracted active well sites from the Alberta Biodiversity Monitoring Institute's 2016 Human Footprint database [34] (Figure 2b). For locations of recent forest logging, we used satellite-detections of forest harvesting that occurred between 2014 and 2016 from the Landsat-derived forest change detection product originally developed in 2015 by [35] and updated in 2018 ( Figure 2c).

Classification Method
We used "circuit theory" to develop a road classification approach based on the connection between known locations of human activity along the road network. Circuit theory is a connectivity model based on the movement of electricity through a connected network and has been applied across a variety of fields [36]. The circuit is simply a network of nodes connected by resistors. In our case, the nodes are locations of human activity and the resistors are the road network. When voltage is applied across a resistor (i.e., road) connecting two nodes (i.e., activity locations), the amount of current that flows between the two nodes is a function of the voltage applied, the resistance between the two nodes and the configuration of resistors between them. We used an open source software called 'Circuitscape' to apply circuit theory using spatial raster grids [37]. Specifically, we used Circuitscape version 4.0, programmed in Julia [38], which enabled processing at 5 m spatial resolution [39]. For a more detailed description of how circuit theory is applied to spatial raster grids, we refer readers to [36,37].
For each of the three types of activity, we iteratively applied voltage between each activity source and each of the activity locations (i.e., nodes; Figure 2). For general activities derived from social media posts, we defined sources as any post with an intensity >10th percentile that also overlaid nighttime lights visible in satellite imagery captured by the To identify locations of oil and gas activity, we extracted active well sites from the Alberta Biodiversity Monitoring Institute's 2016 Human Footprint database [34] (Figure  2b). For locations of recent forest logging, we used satellite-detections of forest harvesting that occurred between 2014 and 2016 from the Landsat-derived forest change detection product originally developed in 2015 by [35] and updated in 2018 ( Figure 2c).

Classification Method
We used "circuit theory" to develop a road classification approach based on the connection between known locations of human activity along the road network. Circuit theory is a connectivity model based on the movement of electricity through a connected network and has been applied across a variety of fields [36]. The circuit is simply a network of nodes connected by resistors. In our case, the nodes are locations of human activity and the resistors are the road network. When voltage is applied across a resistor (i.e., road) connecting two nodes (i.e., activity locations), the amount of current that flows between the two nodes is a function of the voltage applied, the resistance between the two nodes and the configuration of resistors between them. We used an open source software called 'Circuitscape' to apply circuit theory using spatial raster grids [37]. Specifically, we used Circuitscape version 4.0, programmed in Julia [38], which enabled processing at 5 m spatial resolution [39]. For a more detailed description of how circuit theory is applied to spatial raster grids, we refer readers to [36] and [37]. Input voltage was always held constant and a mask was applied to the study area such that the road network had a constant resistance of 1.0 and all grid cells outside the road network had infinite resistance (value of 0), which effectively allowed current to only flow along the road network. This meant that the initial current output of the Circuitscape algorithm was based solely on the spatial configuration of the activity sources, the activity locations and the road network. However, in the case of the current derived from the social media locations, the current output was then weighted by multiplying the current by the intensity value of the activity location used for each iteration. We summed the currents produced for each activity type and extracted this value for each road segment (Supplementary Materials: Figure S2). We then split the road segments into three categories using the summed current values, still for each activity type, based on percentiles of all observed summed values within each activity type as follows: low ≤ 10th percentile > medium < 90th percentile ≥ high. Finally, we combined the three activities into a single ranking: if two or more of the individual activity rankings were low, the overall ranking was low use; if two or more were high, the overall ranking was high use; otherwise, the overall ranking was medium use.

App-Based Road Data
We collected data on road surface conditions and travel speed for a subset of the study area using a free mobile app called RoadLab-Pro, originally developed by the World Bank (see progressana.com accessed on 28 June 2021). When activated, the app uses the GPS receiver in a mobile device to automatically track location and travel speed and uses the device's accelerometer to estimate road surface roughness. Six different users working for industrial and non-profit organizations operating in the region agreed to collect data for this project during daily work routines between June 2018 and March 2019 (e.g., checking oil and gas well sites, conducting wildlife surveys, etc.). Data coverage totalled approximately 1000 km of roads and 771 unique road segments, which we used as training data for the image-based classification.

Satellite Data
For use in the image-based classification, we obtained orthorectified (Level 3A) Rapid-Eye satellite imagery acquired between 8 August and 17 September 2017. The RapidEye constellation of commercial satellites produces moderately high-resolution images (5 m pixels) with the capability of frequent imaging (revisit time of 1.0-5.5 days) over large areas [40], which is a useful platform for road monitoring in rural areas [21]. RapidEye imagery provides five bands covering the visible blue (440-510 nm), green (520-590 nm) and red (630-685 nm) wavelengths, as well as a red-edge band (690-730 nm) and a near-infrared band (NIR; 760-850 nm) that are each highly sensitive to green vegetation.
We mosaicked the RapidEye imagery into a single multi-band image, thus representing conditions during late summer 2017. From the multi-band RapidEye imagery, we generated a suite of seven vegetation indices (Table 1) related to vegetation cover and soil composition in forested environments [41,42], with the expectation that variation in vegetation and soil reflectance would be associated with road surface conditions and margins that, in turn, are associated with use intensity. For example, heavily used roads are required to be constructed with gravel and have wider margins cleared of tall vegetation. Lightly used or temporary roads are more likely to be unimproved tracks cut into the soil with narrower margins and vegetation regrowth on or along the road surface. We also generated four terrain variables from a 30 m digital elevation model produced from the National Aeronautics and Space Administration (NASA) Shuttle Radar Topography Mission (SRTM) ( Table 1). In addition to the simple terrain variables of elevation and slope, we included insolation (a measure of primary energy received from the sun accounting for variation in elevation and aspect; calculated in ArcMap 10.6) and a topographic wetness index (a measure of near-and at-surface water accumulation) under the assumption that these represent patterns of snow melt and soil conditions which in turn influence where different types of road are constructed. For example, primary roads may be less likely to traverse a steep north slope which will hold snow and may become impassable in winter. We calculated descriptive statistic metrics for the seven vegetation indices and four terrain variables for a 25 m buffer around each road segment using Feature Extraction 2.7 (FETEX) software [43] (Supplementary Materials: Table S2). The 25 m buffer was used to account for roads of differing widths, road margins, and spatial errors that may exist in the GPSderived road vector data and orthorectified satellite imagery. The FETEX 2.7 software was developed to characterize and classify agricultural parcels, but is applicable to any vector overlay and provides a computationally efficient means to extract a variety of spatial metrics. In addition to the descriptive statistics from the vegetation indices and terrain variables, we also used FETEX 2.7 to compute textural and semi-variogram metrics (Supplementary Materials: Table S2) from the NIR band, which is sensitive to green vegetation. Within the FETEX program, we set the parcel perimeter buffer to 2 pixels and did not specify a minimum parcel size. For the texture metrics extraction, we set the greylevel co-occurrence matrix (GLCM) grey levels at 64 and for the semi-variogram metrics extraction we set the lag size to 20 pixels and the random pixels percentage to 15. All other Remote Sens. 2021, 13, 2547 8 of 21 settings were left as defaults see [43]. Semi-variogram metrics detail the spatial patterns of pixel values, and are useful when performing classification from high resolution imagery in forested landscapes [44,45]. We expected that these metrics would be helpful to distinguish wider roads with homogenous surfaces and margins (i.e., lower roughness, higher speeds) from narrower roads with variable surfaces that may be occluded by overstory vegetation (i.e., higher roughness, lower speeds).

Classification Method
Using training data collected from RoadLab Pro and the FETEX metrics derived from RapidEye imagery as predictors, we classified speed and roughness of road segments using supervised random forest classifications programmed with the caret package in R [46]. Prior to modelling, continuous speed and roughness training data were grouped into three discrete classes (low, medium, and high) based on visual inspections of histograms of the distribution of each variable. The three speed classes were separated at 40 and 65 kph (Supplementary Materials: Figure S3) and the roughness classes at 2.2 and 3.4 (unitless index values; Supplementary Materials: Figure S4). Predictions were made for all road segments longer than 250 m. For segments shorter than 250 m, we assigned the majority predicted value of the longest segment they adjoined. The final three-class road classification of low use, medium use and high use combined the speed and roughness classifications by first ranking each separate prediction from 1-3 (low to high), then summing the two rankings, and finally re-ranking the summed values as low use = 2-3, medium use = 4, and high use = 5-6.
Prior to classification, Pearson's correlation coefficient (r) was used to assess multicollinearity of all spectral predictors. A single metric from groups of highly correlated variables (r > 0.8) were included as predictors in classifications, resulting in a total of 38 possible predictors remaining from the initial 106 candidate predictors. In order to reduce data dimensionality all predictor metrics were standardized to have a mean of 0 and standard deviation of 1 [47,48]. All models were generated using 10-fold cross-validation and were assessed based on the overall accuracy using the out-of-bag error estimate [49].

Road Classification
We first compared the classification agreement between the network-based, imagebased and reference approaches. Since classification was performed at the level of individual road segments, but segments can differ in length, we calculated the classification agreement as a function of road length (i.e., the total km of road segments with class agreement or disagreement between approaches).

Grizzly Bear Response to Roads
We obtained GPS locations for 56 grizzly bears collected between March and October from 2014 to 2017, resulting in a total of 83 unique "bear-years" (i.e., an individual bear observed in a given year; bears observed for multiple years were considered as multiple bear-years). Grizzly bears were captured and collared by biologists from the fRI Research Grizzly Bear Program (Hinton, Alberta) as part of long-term research and to inform provincial recovery efforts. Individuals were fit with Followit Iridium satellite collars programmed to collect locations every 30 min or hourly. All animal capture and handling followed the Canadian Council of Animal Care Guidelines and were permitted annually through the University of Saskatchewan Animal Care Committee. See [50] for detailed information about grizzly bear capture and handling.
We evaluated population level movements and habitat selection of grizzly bears relative to non-classified roads ("Any road") and roads classified using the three methods described above. More specifically, we compared how movement rate changed and habitat selection varied by time of day-nocturnally and diurnally-for each road class as a function of distance to road. We predicted that grizzly bears would alter their behaviour when using areas within~1000 m of high-use roads by increasing nocturnal activity due to some perceived risk attributed to human activity levels. If grizzly bears were responding to high-use roads in this manner, we expected movement rates would change and selection would be stronger at night [17].
For each bear-year, when 10 or more consecutive GPS relocations were available, we calculated the movement rate (m/hR) and turning angle between each relocation (i.e., step) using the amt package in R [50]. To analyze changes in typical movement patterns near roads of different classes, we first modeled movement rate across all relocations using a generalized additive model (GAM). Grizzly bear movement tended to follow a diurnal pattern, with reduced activity at night and increased movement during the day, especially in the morning and evening [51], and some variation between age-sex-reproductive classes ( Figure 3). Movement also tended to be greater when the turning angle between steps was low, i.e., when grizzly bears are moving more linearly (Figure 3). Based on these observed trends, we modeled movement rate as a function of age-sex-reproductive class, hour of the day and turning angle. We smoothed hour of the day using a cyclic cubic regression spline and included an interaction with age-sex-reproductive class. We cosine-transformed turning angle and used a cubic regression spline for smoothing.
Once this model was created, we analyzed deviations from expected movement rates (i.e., model residuals) within 1000 km of unpaved roads in general (i.e., "Any road'"), and for each class within each of the three road-use classifications approaches. We were particularly interested in how deviations differed during the day and at night to determine if grizzly bears appeared to be more nocturnal around high-use roads.
We also analyzed habitat selection of grizzly bears at the level of the home range (third-order) by generating 100% minimum convex polygons (MCPs) for each bear and year. Because we were interested in grizzly bears' response to unpaved roads, we clipped out areas of the home range where paved roads were closer than unpaved roads. We then created separate datasets for those GPS locations that occurred during day-time hours, which included twilight, versus those that occurred during night based on sunrise, sunset, and civil twilight tables [52]. For each dataset and use location, we generated 10 random 'available' locations within each MCP and extracted the distance to the nearest road of each of the three classes and for each classification approach. We then transformed each distance variable using an exponential decay of the form 1-(e -αd ) where d was the distance in meters and α was set to either 0.02 (fine scale, within~200 m) or 0.002 (large scale, withiñ 2000 m). In doing so, the effects of roads would be localized to within meters (fine scale) or hundreds of meters (large scale). To ensure that our sample population of grizzly bears had each road type available, we calculated for each animal the proportion of available locations where the nearest road was classified as low, medium, or high. We then removed those individuals that did not have all three road classes available or where availability was less than 5%. step) using the amt package in R [50]. To analyze changes in typical movement patterns near roads of different classes, we first modeled movement rate across all relocations using a generalized additive model (GAM). Grizzly bear movement tended to follow a diurnal pattern, with reduced activity at night and increased movement during the day, especially in the morning and evening [51], and some variation between age-sex-reproductive classes (Figure 3). Movement also tended to be greater when the turning angle between steps was low, i.e., when grizzly bears are moving more linearly (Figure 3). Based on these observed trends, we modeled movement rate as a function of age-sex-reproductive class, hour of the day and turning angle. We smoothed hour of the day using a cyclic cubic regression spline and included an interaction with age-sex-reproductive class. We cosine-transformed turning angle and used a cubic regression spline for smoothing. Once this model was created, we analyzed deviations from expected movement rates (i.e., model residuals) within 1000 km of unpaved roads in general (i.e., "Any road'"), and for each class within each of the three road-use classifications approaches. We were particularly interested in how deviations differed during the day and at night to determine if grizzly bears appeared to be more nocturnal around high-use roads.
We also analyzed habitat selection of grizzly bears at the level of the home range (third-order) by generating 100% minimum convex polygons (MCPs) for each bear and year. Because we were interested in grizzly bears' response to unpaved roads, we clipped out areas of the home range where paved roads were closer than unpaved roads. We then created separate datasets for those GPS locations that occurred during day-time hours, which included twilight, versus those that occurred during night based on sunrise, sunset, and civil twilight tables [52]. For each dataset and use location, we generated 10 random 'available' locations within each MCP and extracted the distance to the nearest road of each of the three classes and for each classification approach. We then trans- We used generalized linear mixed-effects models (GLMM) to estimate populationlevel RSF model coefficients with the lme4 package in R version 3.6.2 [53,54]. To ensure independence of our sample population and account for unbalanced sample sizes, we specified animal ID as a random intercept for all models [55]. For each dataset (i.e., day and night) and scale, we fit (1) a null (intercept-only) model, (2) a model that included distance to road without classification as a fixed effect, and (3) a model that also included a random slope and intercept for distance to road by road classification type. We used Akaike information criteria (AIC) scores to identify the model that best fit the data, with the best fitting model identified as that having the lowest AIC score and was not within two AIC units of a model with fewer parameters [56].

Road Classification
Overall accuracies for speed and roughness were 66% and 55%, respectively ( Table 2). For both speed and roughness, the semi-variogram metrics were the most important in the models, comprising five of the ten most important variables for each (Supplementary Materials: Figures S5 and S6). The vegetation indices tended to be more important than the terrain metrics. The most frequently high-ranking vegetation indices were BARESOIL (Bare Soil Index) and RENDVI (Red-Edge Normalized Difference Vegetation Index). DEM (Elevation) was the only terrain metric appearing in the 10 most important variables. Generally, all three classifications estimated the length of road in each ranking to be low > medium > high, however, the spatial patterns of road rankings differed between the classifications (Table 3, Figure 4). Overall pairwise class agreement for the three classifications ranged from 50-55%. The reference classification resulted in fewer total km of roads classified as high use (838 km) compared to the network-based (1973 km) and the image-based (2314 km) classifications. There was poor class agreement for high use between the network-and image-based methods (<40%), and more of the road network that ranked as high in the reference classification was ranked as high in the Image-based classification (73%) compared to the network-based classification (60%). The majority of the other road segments ranked as high by the image-and network-based approaches were ranked as medium in the reference classification. The image-based classification had less medium use roads (4132 km) compared to the network-based and reference classifications (5548 km and 5572 km, respectively). Across all three classifications, there was more interclassification between medium and low rankings compared to medium and high rankings, and it was rare (<5%) for a road segment to be low in one classification and high in another (Table 3). Table 3. Three-way confusion matrix showing class agreement between the network-based, image-based and reference road characterization approaches for the three classes low use, medium use and high use. Values are in km of road segments classified to each road use. Agreement is calculated as the percent of the total km of road segments for a given row or column that were classified the same between the two respective approaches.

Grizzly Bear Movement
Analysis of grizzly bear movement rates within 1000 m of roads showed evidence of increased nocturnal activity. Compared to typical movement rates across the entire study area, when grizzly bears were within 1000 m of any unpaved road (i.e., not classified), movement was slightly greater than expected at night and slightly less than expected during the day ( Figure 5). Within 1000 m of high use roads, the trend was more pronounced for all three classifications, and it was strongest for high use roads as classified by the image-based approach ( Figure 5). In this case, the deviation from expected movement rates was 3-4 times higher compared to the medium-and low-use roads.
increased nocturnal activity. Compared to typical movement rates across the entire study area, when grizzly bears were within 1000 m of any unpaved road (i.e., not classified), movement was slightly greater than expected at night and slightly less than expected during the day ( Figure 5). Within 1000 m of high use roads, the trend was more pronounced for all three classifications, and it was strongest for high use roads as classified by the image-based approach ( Figure 5). In this case, the deviation from expected movement rates was 3-4 times higher compared to the medium-and low-use roads. Figure 5. Fitted generalized additive models (GAM's) of the deviation from expected grizzly bear movement rates (m/hr) by hour of the day for each road class within each of the three classification approaches, as well as for all unpaved roads ("Any road"; i.e., non-classified). Shading represents the 95% confidence interval.

Grizzly Bear Selection
Our sample population to fit GLMM models consisted of 23 and 21 grizzly bears and 56,575 and 26,003 GPS relocations for the day and night datasets, respectively. Our preliminary evaluation indicated that irrespective of the specific road classification method, models fit better where road class was considered. Because of this, we used AIC to compare only the three classification methods for each dataset to identify the model that best fit the data (Table 4). We also compared the effects of model coefficients for all road classification methods and unclassified roads (Figures 6 and S7).
Based on model AIC scores and weights (AICw), the reference classification method best fit the data (i.e., lowest AIC score) regardless of time of day and scale (Table 4). Between the network-and image-based classifications, the network-based method better fit the data during the day and the image-based better fit the data at night regardless of scale.
The relative probability of habitat selection was highest near high use roads for all three classification approaches at the 2000 m decay scale ( Figure 6). For the reference and image-based classifications, selection was higher near low-use roads compared to medium-, whereas the opposite was true for the network-based approach. When roads were not classified, the probability of selection was similar to low-use roads classified by the Figure 5. Fitted generalized additive models (GAM's) of the deviation from expected grizzly bear movement rates (m/hr) by hour of the day for each road class within each of the three classification approaches, as well as for all unpaved roads ("Any road"; i.e., non-classified). Shading represents the 95% confidence interval.

Grizzly Bear Selection
Our sample population to fit GLMM models consisted of 23 and 21 grizzly bears and 56,575 and 26,003 GPS relocations for the day and night datasets, respectively. Our preliminary evaluation indicated that irrespective of the specific road classification method, models fit better where road class was considered. Because of this, we used AIC to compare only the three classification methods for each dataset to identify the model that best fit the data (Table 4). We also compared the effects of model coefficients for all road classification methods and unclassified roads ( Figure 6 and Figure S7). Table 4. Comparison of generalized linear mixed-effects models describing the habitat selection of grizzly bears based on road classification method, time of day and scale using Akaike information criteria scores (AIC), where lower AIC scores indicates better model fit within a group. ∆AIC is the difference between the AIC score of that model and the lowest AIC score of the group. AIC w is the AIC weight (0-1 possible), an indication of the probability that a given model is the best model of the group. Sample sizes: Night = 21 grizzly bears (23,003 "used" locations), Day = 23 grizzly bears (56,575 "used" locations).   Based on model AIC scores and weights (AIC w ), the reference classification method best fit the data (i.e., lowest AIC score) regardless of time of day and scale (Table 4). Between the network-and image-based classifications, the network-based method better fit the data during the day and the image-based better fit the data at night regardless of scale.

Time of Day
The relative probability of habitat selection was highest near high use roads for all three classification approaches at the 2000 m decay scale ( Figure 6). For the reference and image-based classifications, selection was higher near low-use roads compared to medium-, whereas the opposite was true for the network-based approach. When roads were not classified, the probability of selection was similar to low-use roads classified by the reference and image-based methods, and the medium-use roads from the network-based method. All trends were generally similar, although less pronounced, at the 200 m scale (Supplementary Materials: Figure S7).

Discussion
We found that the three road classification methods had moderate class agreement and showed similar broad spatial patterns of road use. Despite some inconsistencies, all three methods improved grizzly bear habitat selection modelling compared to using unclassified roads. Additionally, grizzly bears showed similar selection and behavioural responses across the three methods: greater selection near high use roads irrespective of time of day, but a stronger shift in movement rate near high use roads between daytime (lower) versus nighttime (higher) compared to low and medium use roads.

Road Classification
While the reference is not a true ground-truth, it is encouraging that the network-and image-based approaches based on remote observation showed at least some consistency with the reference classes-more than half of the nearly 14,000 km of unpaved roads in the study area were characterized into the same class across the three methods ( Table 3). The Government of Alberta has devoted immense resources to produce and maintain a provincial database covering a wide range of road types, and there are few places in the world with such detailed information on unpaved rural and resource roads. However, even when such detailed road data are available, they may not be maintained at a frequency to permit analysis of temporal changes, which may be critical to habitat monitoring (e.g., [21]). For example, data inconsistencies can arise when data are collected at different time periods or in different jurisdictions. The network-and image-based approaches developed in this study show promise as cost-effective methods to characterize road use for wildlife conservation applications, although more work is needed to better understand their relative strengths and weaknesses. The relatively consistent response of grizzly bears in our models using all three methods suggests that the road segments for which there was consistent classification agreement are influencing grizzly bear behaviour in this region and even imperfect characterization may be sufficient to improve wildlife modeling.
Within this study area, given the lack of traffic data associated with rural roads across the region, it is not possible to determine with certainty whether any of the classifications represent human road use better than the others. The fact that all three show some agreement and provide more information than treating all unpaved roads as equivalent (e.g., Figure 5, Table 4) suggests that each of the methods could be used or adapted to the data availability of a given application. Each classification approach comes with advantages and disadvantages. For example, the reference classification is a simple approach when existing road databases provide additional attribute information that can be used as a proxy for road use activity. This study suggests that, in this region, the reference classes are appropriate elements to include when building grizzly bear habitat models for management purposes, and future research in this region should explore how other species respond to the reference classes. We encourage future studies to test classification systems available for other regions from public or open source datasets.
When pre-existing road classes are not available, or data are found to be inadequate or outdated, the image-based approach is likely the most objective and easily replicable across regions since it relies on spectral information derived from satellites and ground data that can be collected using widely available technology. The fact that the spatial distribution of high-, medium-, and low-use roads predicted using this approach was similar to the other two approaches suggests that there is a relationship between spectral reflectance patterns of roads captured by satellite and broad patterns of human activity.
The accuracy of predicting surface roughness and travel speed using moderately-high resolution RapidEye imagery and the FETEX-derived spectral variables was relatively low (Table 2), although we note that accurate predictions of road roughness and travel speed were not primary objectives in this study. However, this shows that more work is needed to predict these individual road features. Deep learning algorithms of higher spatial resolution imagery, both of which are becoming increasingly accessible, would likely improve results. Currently, there are few examples of successful satellite-based monitoring of road conditions (e.g., [30]), and while this study shows promise, more work is needed to develop highly accurate predictions. The development of such predictions could enable cost-effective monitoring of changes in road use over time, as well as other applications for monitoring road conditions for the purpose of maintenance or emergency response planning. Our study suggests that capturing spatial spectral variability (e.g., FETEX-derived semi-variogram metrics) may be particularly useful for characterizing unpaved roads with remotely sensed imagery at this spatial scale. It is important to note that classifications developed for one region may not be applicable to another region due to differences in road surface materials, as well as differences in the relationships between travel speed, road roughness, spectral reflectance and human activity. Furthermore, additional research is needed to better understand the conditions under which spatial spectral variability is related to road use and conditions. For example, the relationship may not hold in desert conditions where roadside vegetation is sparse or non-existent, or in urban conditions where roadsides are highly variable and managed for objectives unrelated to road use.
The network-based approach provides a framework from which to build innovative methods to translate the vast amount of spatially explicit human activity information shared on social media and available in other public and open databases into meaningful information to improve our understanding of human-environment interaction. Our application of Circuitscape-an algorithm developed to predict habitat connectivity-is a novel approach to quantifying road network activity. The method is uniquely designed to capture pinch points in the road network and, when parametrized with data of known locations of human activity (i.e., social media data), may be especially useful for identifying high-use road segments.
While social media databases represent a massive and rapidly growing source of information on the nature and intensity of human activity, such datasets are not without their limitations and potential biases. Age, socio-economic condition and other geo-demographics likely influence patterns in social media use, which may differ across platforms [26,57,58]. People engaging in activities where location is sensitive (e.g., hunting, fishing) may be less likely to make geo-tagged posts, whereas photogenic events such as blooming flowers or wildlife sightings could lead to a disproportionate number of posts relative to visitors at specific locations [26]. The latter may help explain why less of the road network was classified as low use with the network-based approach compared to the image-based and reference methods. A biased under-classification of low-use roads may also explain the reversed pattern in relative habitat selection between low and medium use roads in the network-based approach compared to the other two methods.
Different data sources will likely capture different activities. We sought to combine data from multiple social media platforms, along with satellite data and public records, in order to account for a wide range of activities. However, combining data through scaling and classification requires subjective decisions to be made about the relative contribution of each dataset to overall human activity and what level of activity is considered "high" or "low".

Grizzly Bear Response to Roads
Our results suggest that incorporating road classification into habitat models provides additional information to explain grizzly bear response to roads compared to treating all roads as equal. We observed different patterns of movement and habitat selection for roads with different predicted human activity levels. Grizzly bears appear to be more active near high use roads at night, as indicated by greater movement rates ( Figure 5) and stronger selection ( Figure 6) for all classification approaches. Indeed, a recent study of more than 2500 individuals showed that grizzly bears inhabiting human-dominated landscapes do not necessarily avoid areas with greater human activity, but rather shift toward more nocturnal behavioural patterns [59]. Our results support this conclusion and suggest that the deviation from typical diel movement observed in Figure 5 does not reflect avoidance of high use roads, but rather distinct behavioural patterns. For example, low movement rates during the day could be an indication of foraging behaviour, or more cautious movements near high use roads, and rapid movement at night may indicate that grizzly bears are using these as travel corridors during that time. The fact that the probability of selection is greatest near high use roads ( Figure 6 and Supplementary Materials: Figure S7) suggests that foraging, rather than caution, may be driving the lower daytime movement rates observed near high use roads. This selection of high use roads contrasts with findings by [17] in southern Alberta, where they found grizzly bears avoided roads with high traffic volumes (>100 vehicles per day) during the day. It is possible that traffic volumes on roads classified as high use in our study are still lower than those in more populated southern Alberta, or that the human activity datasets in our study do not fully capture traffic patterns.
Nonetheless, it is concerning that grizzly bears may be especially attracted to high use roads for foraging. Studies in western Alberta and eastern British Columbia have shown that more than 85% of all human-caused grizzly bear mortalities occurred within about 100 m of a road (most were shot, not hit by vehicles) [5]. If high use roads are more strongly associated with grizzly bear space-use, this could increase the likelihood of negative human-bear interactions. It is possible that high use roads are more attractive to bears as a result of site selection, disturbance, or for both reasons. Roads with greater traffic require wider road margins, resulting in more disturbance and more light penetration, both which can lead to enhanced production of grizzly bear foods [11]. Roadkill of small mammals and ungulates, which can attract bears, may be more abundant along the high use roads in this study area. While these roads are expected to have the greatest travel speeds and traffic volumes of all unpaved roads in the area, speed and traffic are still moderate or low compared to highways and our study shows they do not elicit avoidance in grizzly bears. Such roads with moderate traffic volume, but insufficient to elicit avoidance, are expected to have the largest number of ungulate and small mammal carcasses due to these animals behavioural response to roads [60]. Larger and more frequently travelled roads may also be more likely to be constructed within already existing attractive habitat, for example in flatter terrain and along riparian corridors, which are preferred by grizzly bears [11,61]. By incorporating spatially explicit estimates of food resources (e.g., from satellite imagery and modeling) along with road classifications into analyses of grizzly bear movement, future studies may reveal whether thresholds of human activity and food resources at which grizzly bear behaviour changes exist. A better understanding of the timing and nature of potential co-occurrence of grizzly bears and people along roads would enhance the toolkit of land managers to set policies such as temporary and permanent road access closures [5], site selection and construction guidelines for new road development and public outreach and education.
Analysis at finer temporal scales would likely yield a more nuanced understanding of human activity along road networks and their influence on wildlife behaviour. In this study, we used GPS collar data for grizzly bear movements from 2014-2017, and sought to represent cumulative human activity for approximately this period (i.e., active oil and gas wells in 2016, forest logging from 2014-2016 and cumulative social media activity as of late 2017). While our analysis of grizzly bear behaviour tested the assumption that human activity varied between daytime and nighttime, it is also expected that human activity would vary seasonally and year-to-year. All the datasets used in this study could be analyzed for separate years, while most of the social media datasets could also be analyzed sub-annually since they contain timestamps. Analyzing oil and gas or logging activity at sub-annual temporal scales would be more challenging since data on the timing of activities that may drive more frequent traffic or large-truck traffic are not available, although separate analysis for each human activity by type may show different broad scale grizzly bear responses based on the type of traffic associated with each activity. Such analysis was beyond the scope of this study, which had the primary objective of describing and comparing the newly developed road characterization methods. However, it is important to note that analysis at finer temporal scales and by type of human activity may reveal different patterns of human activity and grizzly bear behaviour in this study area.

Conclusions
This study demonstrated that remotely sensed and crowd-sourced datasets can be used to estimate human use of unpaved roads within critical wildlife habitat to better understand the behaviour of threatened wildlife and quantify impacts. We found evidence of nocturnal behaviour by grizzly bears near high-use roads that was not evident when roads where unclassified. Perhaps most importantly, for grizzly bears in this study area, including any of the three classifications examined for this research improved the fit of movement and habitat selection models. This underscores the importance of characterizing road attributes related to indirect effects on wildlife (e.g., human activity), even for unpaved rural roads. When existing road databases include road attribute information (i.e., the reference in this study), this can be explored to develop proxies for the hypothesized indirect effects specific to a region. When such information is not available, the networkbased and image-based classifications developed in this study provide examples of how emerging large datasets can be utilized for wildlife research and conservation where infrastructure development is widespread.
Supplementary Materials: The following are available online at https://www.mdpi.com/article/10 .3390/rs13132547/s1, Table S1: Number of data observations for each social media platform used to derive locations of recreation activity in the Network-based classification; Table S2: Codes and description of the spectral, textural and semi-variogram based features produced by the FETEX 2.7 software. Figure Figure S2: Current flow outputs from the Circuitscape algorithm along the road network for (a) recreation activity represented by social media posts, (b) oil and gas activity represented by active well sites and (c) recent forest logging activity detected by satellite. Higher current flow along a given road segment was interpreted as greater human activity. Activity sources where travelers to activity locations are likely to have originated, and are thus the source nodes to which each activity location (i.e., destination node) was iteratively connected during the Circuitscape runs for each respective activity type. Figure S3: Histogram of road travel speeds calculated by the RoadLab-Pro app for all road segments with data. The low cutoff (40 km/hr) and high cutoff (65 km/hr) represent the manually selected speeds at which to define speed classes for the Random Forest training (low < 40 < medium < 65 < high). Based on these cutoffs, 18% of the training data was classified as low, 53% as medium and 29% as high. Figure S4: Histogram of road surface roughness (unitless index) calculated by the RoadLab-Pro app for all road segments with data. The low cutoff (2.2) and high cutoff (3.4) represent the manually selected roughness values at which to define roughness classes for the Random Forest training (low < 2.2 < medium < 3.4 < high). Based on these cutoffs, 30% of the training data was classified as low, 47% as medium and 23% as high. Figure S5: Variable importance estimates for the top 20 variables in the final trained Random Forest algorithm predicting road travel speed. Variable abbreviation descriptions can be found in Table S2; Figure S6: Variable importance estimates for the top 20 variables in the final trained Random Forest algorithm predicting road surface roughness. Variable abbreviation descriptions can be found in Table S2. Figure S7: The relative probability of grizzly bear habitat selection as a function of road distance at the 200 m scale by time of day (rows), classification method (columns) and whether roads were classified (colored lines; Low, Medium, or High human activity) or non-classified (dashed line; Any road).  Institutional Review Board Statement: All animal capture and handling followed the Canadian Council of Animal Care Guidelines and were permitted annually through the University of Saskatchewan Animal Care Committee.

Informed Consent Statement: Not applicable.
Data Availability Statement: GPS collar data and raw locational RoadLab-Pro data are not publicly available due to privacy and safety concerns. Non-sensitive data (e.g., summarized social media locations and summarized road condition data) can be made available upon request.