The Influence of Noise, Vibration, Cycle Paths, and Period of Day on Stress Experienced by Cyclists

Urban and transport planners need to assess the stressful conditions experienced by cyclists, considering that highly stressful situations can discourage people from cycling as a transport mode. Therefore, this study has two objectives: (1) to present a method for monitoring stress and other environmental factors along cycling routes using smart sensors; and (2) to analyze the influence of noise, vibration, presence of cycle paths, and the period of the day on stress experienced by cyclists. Data were collected in the city of São Carlos, Brazil, using stress and noise sensors, accelerometers, and Global Positioning System (GPS) receivers. First, heat maps generated from the data made it possible to identify critical points of stress along the routes. In addition, the results of a logistic regression model were analyzed to identify the influence of the studied variables on stress. Although high levels of noise increased the odds of experiencing stress by 4%, very uncomfortable vibrations increased the odds by 14%, and the presence of cycle paths reduced the odds by 8%, an analysis of p-values and odds ratio confidence intervals shows, at a 95% confidence level, that only the period of the day had a statistically significant influence on stress. In this case, the odds of experiencing stress increased by 24% in the afternoon rush hour compared to the morning rush hour.


Introduction
Rapid growth of traffic in cities causes problems concerning congestion, excessive noise, and air pollution. One solution to these problems could be to encourage sustainable transport modes such as cycling. However, for cycling to be widely accepted, good conditions ensuring safety and comfort should be established to minimize stress [1]. Although other means of transport where users are in a protected environment (e.g., vehicles) are subjected to a certain degree of environmental noise, stressful situations experienced by cyclists may be influenced more directly by environmental noise. Few studies have questioned whether there is any relationship between these environmental factors and stress responses that can be objectively measured.
Stress experienced by cyclists has been previously studied subjectively (through surveys) and according to factors external to cyclists, such as the flow and speed of motorized vehicles and infrastructure characteristics [2-5]. One of the first attempts was the bicycle stress level proposed by Sorton and Walsh [6]. The concept of level of service was introduced later [7-11] and subsequently [...]
• Is it possible to identify, directly and objectively, critical points of stress along cycling routes?
• What is the importance of external variables such as noise, vibration, presence or absence of cycling infrastructure, and the period of the day on stress experienced by cyclists?
Therefore, the objectives of this study are as follows: (1) to elucidate a method of monitoring stress from objective measurements intrinsic to the cyclist and generate maps showing the critical points of stress; and (2) to analyze, using logistic regression models, the relationship of stress measurements with noise, vertical acceleration (pavement surface roughness), presence of cycle paths, and period of the day.

Method
In this section, we introduce the equipment used to monitor cycle paths, a description of the routes selected, the method developed for processing and fusing information obtained from each sensor, and the process for analyzing the importance of environmental variables in stress using logistic regression models.

Equipment for Stress, Noise, and Vibration Measurements
The sensors used in the stress, noise, and vibration measurements, which were adapted to bicycles and cyclists to record different input data, are presented next. A smartband designed to monitor physiological changes in real-world conditions [60] was used to measure stress responses in real cycling scenarios. Placed around the wrist of the cyclist's nondominant hand, the smartband combines skin conductivity level (SCL) and skin temperature data to identify stress peaks. These data are georeferenced using a Global Positioning System (GPS) receiver. Both the conductivity and temperature data were acquired at 10 Hz, and the GPS data at 1 Hz.
To measure noise, a noise sensor [61,62] was adapted to a backpack to be easily carried by the cyclist during the ride. The noise sensor assembly consists of the sensor, a data acquisition and storage system, and a coupled GPS to georeference the measurements along the path. The acquisition rate of the data supplied by the sensor was 8 Hz, and of the GPS was 1 Hz. Once the collections are made, the acquisition system is connected to the internet network to upload the data to a server for processing, and then the data are downloaded.
For the vibration measurements, a system was designed to measure the magnitude of vibrations along the route. The data were collected using a mountain bike (MTB) equipped with a conventional smartphone attached to the frame, just below the saddle. The smartphone records the vertical acceleration data using the accelerometer embedded into the device, and its position is registered simultaneously using the device's GPS. Two Android applications were used to collect the data: one to measure vertical acceleration (Accelerometer Analyzer) and the other to record position (Geo Tracker). The accelerometer acquisition rate was 50 Hz and the GPS rate was 1 Hz. A GoPro video camera was adapted to attach to the shoulder straps of a backpack carried by the cyclist so that the sources of stress could be identified visually.

Routes Selected
The proposed method for monitoring stress and environmental characteristics was used in São Carlos, a medium-sized city (approximately 230,000 inhabitants) in the state of São Paulo, Brazil. Two routes were selected for data collection (Figure 1). Route 1 corresponds to a path of 17.1 km, which includes 4.2 km of cycle paths (red lines in Figure 1). Route 2 corresponds to a path of 5.7 km, without specially designed cycling infrastructure. The days selected for analysis were 12-14 September 2017, during the morning rush hour (between 7:00 a.m. and 9:00 a.m.) and the afternoon rush hour (between 5:00 p.m. and 7:00 p.m.).
Route 1 crosses a considerable part of the city, with segments of cycle paths disconnected from each other. This route is adjacent to a main avenue with medium traffic flow and medium speed limits. The topography of this route has medium to low slopes. Route 2 is in a central area of the city, without cycling infrastructure, and has excellent potential to attract cyclists due to the privileged location of this area in the city. This route also presents medium to low slopes. Weather conditions in São Carlos are characteristic of a humid subtropical climate zone, with an average temperature of 19.3 °C in September. Rainfall in this month is low, and during the campaign days there was no rain. The cyclists were regular riders.

Processing and Fusing Sensor Information
In order to fuse the sensor information, the time records of the sensors, obtained separately, were matched with GPS data. The data collected in this study (over 3 days, 2 periods of the day, and 2 routes) resulted in a large number of records obtained at different acquisition rates and stored in different formats. This required computational tools that could extract the information automatically and efficiently. In the case of the noise sensor, for example, text-format data with more than 1 million rows were classified and combined with GPS information. To process and fuse the information obtained from the sensors, code for reading, processing, and analyzing the data was developed in R, a freely available language and environment for statistical computing. The general scheme of the algorithm developed for processing and merging the information is presented in Figure 2. According to this scheme, the data are first stored, read, classified, and processed independently for each sensor. The data are then merged and aggregated over time to generate a database that is the basis of the statistical analysis. Each stage of processing is explained in detail in the following subsections.
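The fusion step can be illustrated with a minimal Python sketch (the study's own code was written in R and is not reproduced here). It assumes timestamps are expressed in seconds from the start of a ride, averages each sensor stream within whole seconds, and joins the result onto the 1 Hz GPS fixes. The function names, field names, and the use of a plain arithmetic mean are illustrative assumptions; noise levels, in particular, are aggregated logarithmically in the study.

```python
from collections import defaultdict
from statistics import mean

def aggregate_per_second(samples):
    """Average (timestamp_s, value) samples that fall in the same whole second."""
    buckets = defaultdict(list)
    for t, value in samples:
        buckets[int(t)].append(value)  # int() floors non-negative timestamps
    return {sec: mean(vals) for sec, vals in buckets.items()}

def fuse(gps_fixes, **sensor_streams):
    """Join per-second sensor aggregates onto 1 Hz GPS fixes (time, lat, lon)."""
    per_second = {name: aggregate_per_second(s) for name, s in sensor_streams.items()}
    rows = []
    for t, lat, lon in gps_fixes:
        sec = int(t)
        row = {"time": sec, "lat": lat, "lon": lon}
        for name, agg in per_second.items():
            row[name] = agg.get(sec)  # None when a sensor has no sample that second
        rows.append(row)
    return rows

# Hypothetical values, for illustration only.
gps = [(0.0, -22.010, -47.890), (1.0, -22.011, -47.891)]
skin_conductance = [(0.0, 2.0), (0.5, 4.0), (1.2, 6.0)]
rows = fuse(gps, scl=skin_conductance)
```

Keying every stream on the whole second gives all sensors a common index regardless of their native acquisition rate, which is why the GPS rate (1 Hz) is a natural resolution for the unified database.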

Stress Data Processing
To process the stress sensor data, a preprocessing phase was performed consisting of the following steps: cleaning the signal, eliminating artifacts, and smoothing the signal. After this preprocessing, the skin conductance and temperature data were combined with GPS data through an algorithm implemented in the Quantum Geographic Information System (QGIS) program for peak detection and georeferencing. This algorithm generates .shp files as output with the georeferenced information of the stress peaks. The files can be imported into a geographic information system (GIS) to be analyzed later in the form of maps or combined with the information of the other sensors [63].
Stress peaks were calculated from the first derivatives of the skin conductance and skin temperature signals. These derivatives indicate whether the slope of each signal is increasing or decreasing. To detect a peak (stress event), it is necessary to know whether the level of skin conductance is increasing, which is scored +1, and whether the skin temperature is decreasing, which is scored −1. At the end of the evaluation, the 2 columns with these scores were analyzed together. A stress peak is identified if the signal shows a decrease in skin temperature 3 s after the skin conductance level has significantly increased [16].
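The detection rule described above can be sketched as follows. This is a simplified illustration, not the QGIS algorithm used in the study: it takes the sign of the first difference of each signal and flags a sample when skin conductance is rising and skin temperature is falling 3 s later. The function names are hypothetical, and the threshold that defines a "significant" increase is not given in the text, so any rise is counted here.

```python
def slope_signs(signal):
    """Sign of the first difference: +1 rising, -1 falling, 0 flat."""
    return [(b > a) - (b < a) for a, b in zip(signal, signal[1:])]

def stress_events(scl, temp, rate_hz=10, lag_s=3):
    """Indices where SCL rises and skin temperature is falling lag_s seconds later."""
    lag = rate_hz * lag_s
    scl_sign = slope_signs(scl)
    temp_sign = slope_signs(temp)
    events = []
    for i, sign in enumerate(scl_sign):
        j = i + lag
        if sign == 1 and j < len(temp_sign) and temp_sign[j] == -1:
            events.append(i)
    return events

# Hypothetical 1 Hz series: conductance rises at index 1, temperature drops 3 s later.
scl = [1.0, 1.0, 2.0, 2.0, 2.0, 2.0, 2.0]
temp = [30.0, 30.0, 30.0, 30.0, 30.0, 29.0, 29.0]
events = stress_events(scl, temp, rate_hz=1)
```

With real 10 Hz data, a sustained rise would flag several consecutive samples, so a practical version would also need to merge adjacent detections into a single event.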

Noise Data Processing
For the noise sensor data, the algorithm steps are as follows: (1) read and classify the noise sensor data, (2) match the time records of the noise sensor and GPS data, (3) aggregate the noise levels per second, and (4) classify these levels according to Table 1 [62]. To aggregate the noise data, a logarithmic sum of the sound levels was made, using the following equation:
$$L_{Aeq} = 10 \log_{10}\left(\frac{1}{n}\sum_{i=1}^{n} 10^{l_i/10}\right) \quad (1)$$
where LAeq is the noise value aggregated per second, l_i is the ith noise value collected by the sensor, and n is the number of values collected in that second.
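The logarithmic aggregation can be sketched in a few lines of Python. The sketch assumes the usual energy-average form of the equivalent level, in which each reading is converted to relative energy, averaged over the n readings in the second, and converted back to decibels. If a plain logarithmic sum were intended instead, the division by n would be dropped. The function name is hypothetical.

```python
import math

def laeq(levels_db):
    """Energy-average a list of dB(A) readings into one equivalent level."""
    n = len(levels_db)
    energy = sum(10 ** (level / 10) for level in levels_db)
    return 10 * math.log10(energy / n)

# Hypothetical readings within one second.
quiet_and_loud = laeq([60.0, 70.0])
constant = laeq([65.0] * 8)
```

Because the average is taken over energy rather than over decibel values, a short loud event dominates the result: the two readings above give a level closer to 70 dB than to their arithmetic mean of 65 dB.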

Vibration Data Processing
To process the accelerometer data, the algorithm steps are as follows: (1) read the accelerometer and GPS data, (2) match the time records of the accelerometer and GPS data, (3) calculate the root mean square (RMS) value per second, and (4) classify these values according to Table 2. For the RMS calculation, the following equation was used:
$$RMS = \sqrt{\frac{1}{N}\sum_{i=1}^{N} x_i^{2}} \quad (2)$$
where RMS is the root mean square value, N is the number of acceleration values in the interval, and x_i is the ith vertical acceleration value.
Once the data from the stress, noise, and acceleration sensors were processed and combined, a unified database was constructed with the following variables: time, coordinates, noise, acceleration, presence or absence of infrastructure, and time of day. This unified database was used for both the descriptive statistical analysis and the logistic regression models.
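The per-second RMS step for the accelerometer data can be sketched as below. The sketch assumes samples arrive as (timestamp in seconds, vertical acceleration in m/s²) pairs and that any gravity offset has already been removed; neither detail is stated in the text, and the function names are hypothetical.

```python
import math
from collections import defaultdict

def rms(values):
    """Root mean square of a list of acceleration values."""
    return math.sqrt(sum(x * x for x in values) / len(values))

def rms_per_second(samples):
    """Group (timestamp_s, acceleration) samples by whole second and take the RMS."""
    buckets = defaultdict(list)
    for t, accel in samples:
        buckets[int(t)].append(accel)
    return {sec: rms(vals) for sec, vals in sorted(buckets.items())}

# Hypothetical samples spanning two seconds.
samples = [(0.00, 1.5), (0.02, -1.5), (1.00, 3.0), (1.02, 4.0)]
per_second = rms_per_second(samples)
```

Squaring before averaging makes the measure insensitive to the sign of the acceleration, so upward and downward jolts of the same size contribute equally.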

Logistic Regression Models
Logistic regression models are used to model binary responses in which there are 2 types of output: "success" or "failure." They are a type of generalized linear model (GLM) [65], which has 3 components: (i) the random component, which identifies the response variable and, in this case, has a binomial distribution; (ii) the systematic component, which specifies the explanatory variables of the model; and (iii) the link function (in this case, the logit function, Equation (3)), which relates the response to the explanatory variables through a prediction equation having a linear form (Equation (4)) [66].
The proposed model explores the relationship between the stress response and 4 explanatory variables: noise, vibration, presence of cycle paths, and period of the day (morning and afternoon peak traffic times). The continuous variables noise (LAeq) and vibration (VA) were categorized according to Tables 1 and 2 and were treated as factors in the model. The model is represented by Equation (5):
$$\text{logit}\left[P(\text{Stress} = 1)\right] = \alpha + \beta_1\,LAeq + \beta_2\,VA + \beta_3\,PCI + \beta_4\,Period \quad (5)$$
where LAeq corresponds to the category of noise level (Table 1), VA is vertical acceleration classified according to the category of vibration level (Table 2), PCI corresponds to the presence (1) or absence (0) of cycle paths, and Period corresponds to the period of the day (0 for the morning and 1 for the afternoon).
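To make the model structure concrete, the sketch below evaluates the probability of stress for one dummy-coded record by applying the inverse logit to the linear predictor. The intercept is a hypothetical value chosen only for illustration. The slopes are the natural logarithms of the odds ratios reported in this paper (1.04, 1.14, 0.92, and 1.24), and only one level of each factor is shown. The function and variable names are assumptions.

```python
import math

def stress_probability(alpha, betas, record):
    """Inverse logit of alpha + sum(beta_j * x_j) for one dummy-coded record."""
    eta = alpha + sum(betas[name] * value for name, value in record.items())
    return 1.0 / (1.0 + math.exp(-eta))

ALPHA = -1.5  # hypothetical intercept, not taken from the study
BETAS = {
    "noise_high": math.log(1.04),
    "va_very_uncomfortable": math.log(1.14),
    "pci": math.log(0.92),
    "period_afternoon": math.log(1.24),
}

morning = {"noise_high": 0, "va_very_uncomfortable": 0, "pci": 0, "period_afternoon": 0}
afternoon = dict(morning, period_afternoon=1)

p_morning = stress_probability(ALPHA, BETAS, morning)
p_afternoon = stress_probability(ALPHA, BETAS, afternoon)
```

Whatever intercept is chosen, switching Period from 0 to 1 multiplies the odds p / (1 − p) by 1.24, although the change in the probability itself depends on the baseline.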
In logistic regression models, odds and odds ratio analyses are used to determine the importance of the explanatory variables in the model. For our model, the odds of response 1 (i.e., the presence of stress) are calculated as the exponential function of the linear predictor, as shown in Equation (6):
$$\frac{\pi(x)}{1-\pi(x)} = e^{\alpha + \beta x} = e^{\alpha}\left(e^{\beta}\right)^{x} \quad (6)$$
where π(x) is the probability of stress for a value x of the explanatory variable X. In this expression, for every 1-unit increase in X, the odds value is multiplied by e^β. Consequently, if β = 0, then e^β = 1 and the odds value does not change as X changes.
To select the variables to be included in the model, two considerations must be balanced: the model should be complex enough to provide a good fit, but simpler models (with fewer variables) may be easier to interpret [66]. Based on these considerations, a stepwise variable selection algorithm was implemented. To assess how well the model with the selected variables fits the data, the Cox and Snell and Nagelkerke measures and the Akaike information criterion (AIC) were used [67,68]. The Cox and Snell and Nagelkerke measures, known as pseudo-R², are analogous to the R² calculated in traditional regression analyses. Their values lie between 0 and 1, and the closer the value is to 1, the better the model fits the data.
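The multiplicative effect of a 1-unit increase on the odds follows directly from the exponential form of the odds. A short derivation, written for a single predictor with intercept α (the single-predictor notation is an assumption made for illustration):

```latex
\frac{\text{odds}(x+1)}{\text{odds}(x)}
  = \frac{e^{\alpha + \beta (x+1)}}{e^{\alpha + \beta x}}
  = e^{\beta}
```

For a binary predictor such as the period of the day, this ratio is the odds ratio between the two categories. An odds ratio of 1.24, for example, corresponds to β = ln 1.24 ≈ 0.215 and is read as a 24% increase in the odds.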
The AIC judges a model by how close its fitted values tend to be to the true expected values. The optimal model is the one whose fitted values tend to be closest to the true outcome probabilities, that is, the model that minimizes AIC = −2(log likelihood − number of parameters in the model) [66]. Thus, the lower the AIC value, the better the fit of the model to the data. Finally, the selected model was validated by dividing the database into two parts: 70% of the data for training the model and 30% for validation.
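The fit measures and the validation split can be sketched as follows, using the standard definitions of the AIC and of the Cox and Snell and Nagelkerke pseudo-R² in terms of the log-likelihoods of the null (intercept-only) and fitted models. The function names, the random seed, and the example log-likelihood values are hypothetical.

```python
import math
import random

def aic(log_lik, n_params):
    """AIC = -2 * (log likelihood - number of parameters)."""
    return -2.0 * (log_lik - n_params)

def cox_snell_r2(ll_null, ll_model, n):
    """Cox and Snell pseudo-R2 from null and fitted log-likelihoods."""
    return 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)

def nagelkerke_r2(ll_null, ll_model, n):
    """Cox and Snell value rescaled so that its maximum is 1."""
    max_r2 = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell_r2(ll_null, ll_model, n) / max_r2

def train_validation_split(records, train_fraction=0.7, seed=0):
    """Shuffle a copy of the records and cut it into training and validation parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical log-likelihoods for a sample of 200 records.
cs = cox_snell_r2(ll_null=-100.0, ll_model=-90.0, n=200)
nk = nagelkerke_r2(ll_null=-100.0, ll_model=-90.0, n=200)
train, validation = train_validation_split(range(10))
```

The Nagelkerke value is always at least as large as the Cox and Snell value because the latter cannot reach 1 for a binary response, which is why the two are usually reported together.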

Results and Discussion
In this section, the results of applying the method to monitor stress as described above are presented. In the following subsections, maps with the results of the descriptive analysis of stress and of the logistic regression models are shown.

Stress Maps
The results of stress peaks and duration of stress (DOS) from the stress sensor can be represented and analyzed descriptively using maps. In these maps, the duration of stress can be associated with the intensity of the stress peak and can be represented in heat areas. Larger areas and more intense colors indicate higher concentrations of stress at certain points. In these points of greater intensity, considered as critical stress points, planned interventions that improve the cyclist's comfort could be proposed. Figure 3 shows the results of the heat maps of the routes evaluated combining all the days and periods. Figure 3 shows some highlights of high concentrations of stress where it would be interesting to make some improvements that reduce the levels of stress experienced by cyclists. Points A to D highlighted in the figure represent locations with a predominance of stress peaks in areas where the cycling infrastructure begins and ends without a transition or connection to the road. This missing connection forces an abrupt incorporation into mixed traffic, which causes high levels of stress. Actual images of these points extracted from video recordings are presented in Figure 4a. The upper left image shows a cyclist crossing the street to access the cycle path that begins at the median strip. The end of the cycling infrastructure without any connection to allow the cyclist to safely get onto the road can be observed in Figure 4a (upper right image).
In addition, the lack of safe spaces for cyclists and points of conflict at intersections (points E and F in Figure 3) can result in a high concentration of stress points. In general, high-frequency stress points were observed in the vicinity of intersections. Images at point F show the lack of space and unsafe traffic conditions for cyclists at this intersection (left and right images of Figure 4b, also captured from videos recorded during actual rides). Table 3 shows the results of descriptive statistics for the days selected. The results in this table include the mean, median, maximum, and minimum values, the first and third quartiles, and standard deviations, separated by morning and afternoon periods. It can be observed that the duration of stress was, on average, 8.3% higher in the afternoon period, and the mean noise values were slightly higher (3.7%) in the afternoon period.
Concerning noise levels, the mean values correspond to moderate noise (Table 1) along the routes. For vertical acceleration, the mean value was 1.5 m/s² for the two periods. This acceleration value is classified as uncomfortable vibration, as shown in Table 2.

Results of Logistic Regression Models
The stress measurements were modeled using a logistic regression model as a function of the environmental variables noise (Noise), vertical acceleration (VA), presence or absence of cycling infrastructure (PCI), and period of the day (Period). A total of 19,007 records were analyzed, considering four categorical predictors: noise (three categories), vertical acceleration (six categories), presence (1) or absence (0) of cycling infrastructure, and period of the day (morning, 0; afternoon, 1).
First, an analysis was carried out to determine which variables should be included in the model using a stepwise variable selection algorithm. Table 4 presents the fit results of the logistic regression models for each variable and for combinations of variables. The fit was assessed using the Cox and Snell and Nagelkerke measures and the AIC. In this table, lower AIC values indicate a better fit, and for the Cox and Snell and Nagelkerke measures, the closer the value is to 1, the better the fit.

In Table 4, it can be observed that the model with all variables (Noise, VA, PCI, and Period) has the lowest AIC value and the best fit values according to the Cox and Snell and Nagelkerke criteria (4.20 × 10⁻³ and 6.49 × 10⁻³), which puts this model in first place in the selection. The model with three variables (VA, PCI, and Period) presented fit values similar to those of the model with all four variables (4.03 × 10⁻³ and 6.23 × 10⁻³). Although a simpler model may be desirable, the simpler model did not improve the fit values. In addition, the model with more variables allows a more in-depth analysis of the variables that can influence stress and of their importance.
Based on the above, we selected the logistic regression model that includes all variables for subsequent analysis. Regarding the validation process, after obtaining the estimated stress probabilities and classifications, the accuracy (i.e., the number of correct estimates divided by the total number of cases considered for validation) obtained for the selected model was 0.78 (i.e., 78%), with a 95% confidence interval of (0.7714, 0.793). A summary of the results obtained with the selected model is presented in Table 5.
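The accuracy and its confidence interval can be illustrated with a minimal sketch that uses the normal approximation to the binomial proportion. The text does not state how the interval was computed, so the approximation is an assumption. The validation set is taken as 30% of the 19,007 records (about 5,702), and the count of correct classifications is a hypothetical value chosen only to give an accuracy near 0.78.

```python
import math

def accuracy_with_ci(n_correct, n_total, z=1.96):
    """Accuracy and its normal-approximation confidence interval."""
    p = n_correct / n_total
    half_width = z * math.sqrt(p * (1.0 - p) / n_total)
    return p, (p - half_width, p + half_width)

N_VALIDATION = 5702  # assumed: 30% of 19,007 records, rounded
N_CORRECT = 4460     # hypothetical count, for illustration only

accuracy, (ci_low, ci_high) = accuracy_with_ci(N_CORRECT, N_VALIDATION)
```

With a validation set of this size the interval is only about two percentage points wide, which is consistent with the width of the interval reported above.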
Even though loud noise increased the odds of experiencing stress by 4%, very uncomfortable vibrations increased the odds by 14%, and the presence of cycle paths decreased the odds by 8%, the analysis of p-values and odds ratio confidence intervals showed, at a 95% confidence level, that only the period of the day had a statistically significant influence on stress. In this case, the odds of experiencing stress increased by 24% in the afternoon rush hour compared to the morning rush hour.
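The reasoning behind this conclusion can be sketched as follows: an odds ratio is treated as significant at the 95% level only when its confidence interval excludes 1. The sketch uses a Wald interval on the coefficient scale. The coefficients are the logarithms of the odds ratios reported above, whereas the standard errors are hypothetical values chosen for illustration, since Table 5 is not reproduced here.

```python
import math

def odds_ratio_ci(beta, std_err, z=1.96):
    """Odds ratio and Wald confidence interval from a coefficient and its standard error."""
    lower = math.exp(beta - z * std_err)
    upper = math.exp(beta + z * std_err)
    return math.exp(beta), (lower, upper)

def is_significant(ci):
    """True when the interval excludes an odds ratio of 1 (no effect)."""
    lower, upper = ci
    return not (lower <= 1.0 <= upper)

# Standard errors below are hypothetical, not taken from the study.
noise_or, noise_ci = odds_ratio_ci(math.log(1.04), std_err=0.06)
period_or, period_ci = odds_ratio_ci(math.log(1.24), std_err=0.04)
```

Under these assumed standard errors, the interval for noise spans 1 while the interval for the period of the day lies entirely above it, which mirrors the pattern described in the text: a positive point estimate alone does not establish an effect.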

Limitations of the Research
The findings of the study, however, were limited by the small sample. In addition to a greater number of people, more days would provide better statistical support and give more power of generalization to the conclusions.

Future Directions
Further studies using the methodology presented in this paper can include larger samples and subjects with different characteristics, such as gender, age, and other individual characteristics, in addition to various infrastructure types and conditions. In addition, external variables, such as weather (e.g., heat waves, rain, wind) and environmental conditions (e.g., air pollution) should be included.

Conclusions
A method to evaluate stress experienced by cyclists using objective measurements of physiological parameters, such as skin conductance and temperature, was proposed in this study. The main objective was to investigate the relationships between stress and noise, vibration, presence or absence of infrastructure, and period of the day. This method was validated in a medium-sized Brazilian city with only a few segments of cycling infrastructure.
Regarding the proposed methodology, this study contributes a new approach for planners and for bicycle transport operations. Unlike other approaches in which stress is inferred from characteristics extrinsic to the cyclist (such as track width and general characteristics of the infrastructure), this approach monitors parameters intrinsic to the user, such as emotions. From this perspective, stress level indicators are direct measurements of physiological responses in cyclists along cycle paths. This approach takes advantage of technological resources to extract user information through sensors and allows this information to be used in an integrated way to improve the cycling infrastructure.
Regarding the analysis of the model outcomes, the only possible conclusion was related to the period of the day. The results of the models suggest that there may be differences in stress levels between the morning peak hour and the afternoon peak hour. According to these results, the odds of experiencing stress increased by 24.3% in the afternoon peak compared to the morning peak.