How Does Operational Environment Influence Public Transport Effectiveness? Evidence from European Urban Bus Operators

Abstract: Public transport systems’ effectiveness is a well-recognized pillar of their sustainability. In this study, we employed order-m efficiency estimators to investigate the effectiveness of 57 bus public transport operators that provide services in both large and medium-sized European cities. Their effectiveness was simulated through a tailored production model and was evaluated against critical exogenous variables, which were mostly extracted from the Eurostat database. Results showed that the effectiveness of the examined operators is generally satisfactory. Our research suggests that certain exogenous factors significantly affect operators’ effectiveness and thus create either advantageous or disadvantageous operational environments for maintaining public transport sustainability. Among these factors, household size, unemployment and car ownership rates were found to be unfavorable to bus public transport operations. In contrast, the presence of university students and of metro systems in cities creates a favorable operational environment for bus public transport effectiveness. These findings assist in the identification of sustainable development policies that would contribute both to public transport sustainability and to the fulfillment of wider community goals. Our findings also rationalize benchmarking exercises in the public transport industry, since they enable fair performance comparisons between systems that seek to incorporate successful management practices to improve their sustainability.


Introduction
Sustainable transport systems seek to provide efficient services while minimizing the environmental impact [1]. Public transport (PT) sustainability is typically assessed against four categories of indicators, which are (a) environmental, (b) social, (c) economic and (d) system effectiveness indicators [2,3]. The latter category, which is an addition to the three basic pillars of sustainability, has two essential components: (a) cost-effectiveness, i.e., the relationship between PT physical inputs (such as vehicles, labor and energy) and consumed PT services (ridership) and (b) service effectiveness, i.e., the relationship between produced PT services (such as vehicle or seat kilometers) and consumed PT services [4,5].
In this context, to ensure their sustainability, PT agencies, among others, need to come up with successful management practices that will enable them to improve both effectiveness components. Cost-effectiveness reflects the PT agencies' point of view [6]. It is evaluated through traditional indicators, such as the operating expense per passenger or passenger-km, which, however, are not always linked to customer-oriented issues [6,7]. Service-effectiveness better represents the passengers' point of view, since the consumption of PT services is seriously dependent on their quality [8]. The delivered quality of PT services, however, does not always coincide with the perceived or desired quality from PT customers [9]. Delivered quality can be measured objectively by indicators, such as PT service frequency and speed, but in the case of perceived and desired quality, customer satisfaction and travel preferences surveys have to be employed in order to determine subjective evaluations of PT services' aspects, such as accessibility, cost, comfort, travel time [7][8][9][10]. Therefore, in order to obtain a comprehensive evaluation of PT effectiveness, PT agencies develop performance measurement systems, which reflect both PT service providers' and passengers' viewpoints and thus incorporate both traditional and quality of service indicators [6,11,12]. To this end, any PT management practices that seek to increase effectiveness should aim to utilize available PT resources in order to both decrease operating expenses and improve quality of services so as to attract and sustain an important share of travelers.
However, in the PT sector, as in any other industry, effectiveness levels are shaped not only by management practices but also by the external environment in which the respective organizations operate. The exogenous or external variables are the variables that may influence the execution and effectiveness results of PT operations without being under the direct control of the respective PT agencies [13][14][15]. Variables such as the type of service area [16], type of ownership [17,18], structure of PT network [12,17,19], population density [20][21][22] and traffic congestion [17,23] are only some of the external variables that scholars have discussed regarding their influence on PT performance (effectiveness is one of the two basic elements of PT performance, the other being efficiency [4,5]). In this respect, the operational environment of PT systems should be defined by all the exogenous variables that affect the production of PT services. If the operational environment is not properly considered, evaluations of PT effectiveness, and thus of sustainability, will probably be biased and lead to misguided decision-making, especially when benchmarking comparisons and exchanges of best practices are performed among PT systems in different cities and operational environments.
This study focused on PT system effectiveness as a discrete component of PT sustainability. We evaluated the effectiveness of 57 bus PT operators in Europe and investigated the impact of the exogenous environment on their effectiveness scores. We employed advanced non-parametric efficiency measurement methods, namely order-m measures [24], to indicate the exogenous determinants of bus PT effectiveness. In contrast to the traditional non-parametric techniques, i.e., data envelopment analysis (DEA) and free disposal hull (FDH), which have been widely applied in PT related literature [25], the order-m method facilitates more robust and reliable inclusion of external factors to the production processes explored [15,26]. During the last decade, this method was gradually adopted in order to investigate the impact of exogenous variables on production processes, which refer to a wide range of industries and disciplines including, inter alia, the educational sector [27], the financial sector [28], hospitals [29] and public services [30]. To the best of our knowledge, the order-m method is a relatively unexplored technique in PT research with very few applications so far [14]. Non-parametric techniques, in general, also avoid the weaknesses that are associated with indicators' comparisons (Fox's paradox) [31][32][33], which have been previously used for PT sustainability assessments [2,3].
The remainder of the paper is organized as follows. The next section reviews the external factors that influence PT effectiveness along with the corresponding analysis tools that have been employed. Section 3 analyzes the production model we built to evaluate PT system effectiveness and gives the basic mathematical formulae for applying the order-m methodology in this study's context. The data used for this study are presented in Section 4. Section 5 is devoted to the results and their discussion. Section 6 presents our conclusions.

Literature Review
Though the studies that focus on PT sustainability assessment are few [2], they explicitly highlight the importance of PT effectiveness in the overall appreciation of system sustainability. Specifically, Miller et al. [3] used four PT effectiveness indicators, i.e., PT occupancy rates, reliability, annual PT trips per capita and modal split, to demonstrate the role of effectiveness in evaluating the sustainability of various PT modes in Vancouver, Canada. De Gruyter et al. [2] used a similar list of PT effectiveness indicators for measuring the PT sustainability in 26 cities of Asia and the Middle East. The authors emphasize the fact that operational environment characteristics, such as urban form and land use densities of the examined cities, may greatly determine the PT sustainability assessment results [2].
The studies that have investigated the impact of exogenous variables on PT effectiveness can be divided into two categories: (a) studies that attempt to demonstrate the magnitude and the sign of exogenous influence on PT effectiveness indicators using regression analysis techniques and (b) benchmarking exercises, where non-parametric methods (DEA and FDH) are usually employed in order to simulate the cost and service effectiveness of different PT agencies and estimate performance scores and rankings.
Regarding the first category of studies, PT effectiveness is usually investigated by examining the relationship between PT ridership indicators and external variables. The respective regression models use dependent variables, such as PT mode shares for commuting or for other travel purposes [34], the number of daily or annual PT boardings in absolute terms or per capita [20,22,35–37] and PT supply and demand figures, such as the number of seat-kms and passenger-kms per capita [21]. In his meta-analysis, Leck [38] reviewed regression models that were developed at local, state and national levels in the USA and showed that PT use had a statistically significant and positive relationship with residential density, employment density and mix of land uses. Paulley et al. [23] synthesized findings from related studies in the UK and concluded that income and private car ownership rates are fundamental exogenous variables for determining PT demand. Ingvardson and Nielsen [36] developed linear regression models to interpret PT ridership per capita across 48 European cities and found that cities with higher economic inequality showed lower PT ridership. Cordera et al. [39], however, detected a positive impact on bus PT ridership during the period of economic recession in Spain, where high unemployment rates appeared. Schwanen et al. [40] analyzed the Dutch national travel survey data and proved that income is negatively correlated with PT use for commuting but positively correlated for leisure related trips. In the same study, it is found that women travel by PT, bicycle and on foot more often than men. Albalate and Bel [21] analyzed PT supply and demand data from 45 European cities using regression analysis models and found that certain institutional and geographical variables, such as being a political capital and contracting out to private firms, influence supply and have some influence on demand as well. Taylor et al. [20] investigated PT demand in 256 urbanized areas in the USA and observed that PT operations are favored in cities with, among others, a high percentage of college students and immigrants. Kitamura et al. [37] discovered no significant correlation between PT patronage and household sizes in the San Francisco Bay Area. With regard to commuting trips, Schwanen et al. [40] did not find a statistically significant correlation between age and choice of transport mode. The negative correlation between car ownership rate and PT use has been highlighted in many studies [20,21,35,40–42]. Souche [22] selected the cost of using a private car as one of the independent variables of her model and found that it is positively correlated with PT demand.
The studies of the second category compare one or more performance aspects among PT systems, where each performance aspect is determined by the composition of a particular production model. These production models may reflect PT efficiency, effectiveness or combined expressions of PT performance. The effect of external factors on DEA/FDH performance scores is determined by performing one-stage and/or two-stage analysis. In the one-stage process, the exogenous variables are included as inputs or outputs of the production models, while in the two-stage process DEA/FDH performance scores are regressed on exogenous variables so as to explain performance differences among compared units. More specifically, Kerstens [43] developed DEA and FDH production models to explore the performance of more than 100 urban PT operators in France. He used a two-stage analysis and found that population density, network speed and age of fleet did not considerably influence PT performance. In the same study, the private ownership of operators and the duration of the PT contract, among others, created a favorable operational environment for PT. Viton [17] analyzed supply and demand data of 217 bus PT providers in the USA using DEA techniques. He employed both one- and two-stage analyses to estimate the impact of traffic congestion, terrain, stop spacing, average fleet age, extent of service and ownership (public/private) status on PT performance but found no strong correlations for his hypotheses. A great variety of external variables, such as the predominant activity of cities (i.e., industrial or concentrated on services), the geographical extension of cities, population density, number of cars, income per capita and age of the population, were tested against the DEA performance scores of 15 bus transport providers of cities located in Catalonia, Spain, but no significant exogenous impact on performance was detected [18].
Boame [13] discovered that the average commercial speed had a statistically significant and favorable effect on the performance of 30 Canadian bus PT systems and justified this on the grounds of lower congestion and costs that appeared in their networks. Karlaftis [44] analyzed data from 256 US bus PT systems and built DEA models to evaluate their efficiency, effectiveness and combined performance over a five-year period. Under a one-stage analysis setting, he highlighted the positive relationship between efficiency and effectiveness as well as the existence of scale economies, indicating the role of bus fleet size in shaping PT performance ratings. For Norway, Odeck [45] performed two-stage DEA analysis to examine whether the effectiveness of 33 bus operators is affected by the size of the providers, their ownership status and the type of service area (urban or rural) and concluded that none of these factors exerted a significant influence. Using one- and two-stage DEA methods, Pina and Torres [46] proved that the deregulation of the PT sector and the introduction of competitive tendering favored the performance of 73 bus PT systems globally. Sampaio et al. [47] developed a DEA model to compare the cost-effectiveness of 19 metropolitan PT systems globally and stressed the importance of power partition among the components of the administrative agency of the transport system as well as the diversity of tariff structure for achieving better performance results. Merkert et al. [48] performed a DEA global benchmarking exercise to compare the performance of 58 bus rapid transit systems and pointed out population density, status of ownership and number of stations as serious explanatory factors of cost-effectiveness. Georgiadis et al. [14] developed a conditional non-parametric setting to highlight population density as a critical exogenous determinant of bus PT effectiveness within a sample of 34 multimodal PT systems worldwide.

Methodology
Our methodological framework relies on a tailored production model and synthesizes four analysis settings. To simulate the effectiveness of bus PT operators, we built a production model that incorporates both cost-effectiveness and service effectiveness dimensions and thus integrates both PT agencies' and communities' objectives (Section 3.1). Then, we used order-m efficiency estimators, originally developed by Cazals et al. [24], to obtain bus PT effectiveness scores and rankings (Section 3.2). We used the conditional non-parametric analysis framework, proposed by Daraio and Simar [15,26], to indicate and describe the influence of exogenous factors on bus PT effectiveness (Section 3.3). Within this framework, we adopted a more recent and reliable approach for selecting the bandwidth required to calculate the conditional order-m measures [49]. Finally, in order to highlight the statistically significant exogenous factors in our nonparametric environment, we implemented a consistent test of significance, which was originally proposed by Racine et al. [50] and Racine [51]. The following sub-sections present in more detail our methodological concept.

Production Model
In order to evaluate bus PT effectiveness and investigate the exogenous factors that influence it, we first need to define a production model that demonstrates the basic production processes of cost-effectiveness and service effectiveness. To construct this model, we selected the number of bus vehicles as an indicator of physical inputs ($x_1$), the number of total vehicle kilometers as an indicator of the produced services ($x_2$) and the number of passenger boardings as a variable that represents the amount of consumed services ($y$). The simultaneous combination of all these three variables into a production model allows the development of a combined effectiveness production model [14,33]. In terms of cost-effectiveness (operator's point of view), our model's inputs, i.e., the number of bus vehicles and vehicle kilometers, reflect a great deal of the expenses required for producing PT services, while our model's output, i.e., passenger trips, is associated with the acquired revenue. In terms of service effectiveness (passenger's point of view), the model's inputs are related to the delivered quality of PT services, while the model's output is related to the extent to which PT services are attractive to the travelers. Specifically, vehicle kilometers, among others, represent the temporal availability of PT services and along with number of vehicles demonstrate the capacity availability of the system. The size of the bus fleet, on its own, also reflects the capability of the operators to more successfully respond to vehicle breakdowns and maintain the required number of scheduled buses. This enables them to keep higher service reliability rates and allows for a better distribution of capacity availability within PT networks, especially if these extend in multiple directions to accommodate the integration of various districts.
Though the passenger kilometers, as a production model output, would better describe the consumption of PT services, our model uses passenger boardings instead. This is because the use of passenger kilometers would be associated with two shortcomings, one theoretical and one practical. First, passenger kilometers would be very much correlated with one of our model's inputs, i.e., vehicle kilometers, and thus our results would be biased in favor of the comparatively large bus PT operators [52]. Secondly, passenger kilometers is a variable that is less often reported by bus PT operators [53], and thus we would not be able to compile a critical sample of observations for our analysis. We selected an output-orientation for this production model, since the overall sustainability objective of PT systems should be to achieve higher patronage figures by enhancing the management (utilization, quality) of their available resources and not necessarily by increasing them.

Bus PT Effectiveness Estimation
In a non-parametric environment, we may assume that a group of Decision Making Units (DMUs) can produce a set of outputs by utilizing a set of inputs following a so-called production process. The production set $\Psi$ includes the points that represent the operation of DMUs [15] (Equation (1)):
$$\Psi = \left\{ (x, y) \in \mathbb{R}_+^{p} \times \mathbb{R}_+^{q} \mid x \text{ can produce } y \right\} \tag{1}$$
In this study we treated each European bus PT operator as one DMU, and as stated before we have $p = 2$ inputs and $q = 1$ output. Feasibility of the vector $(x, y)$ implies that, within the operator under examination, it is physically possible to convert the input quantities $x_1, \ldots, x_p$ into the output quantities $y_1, \ldots, y_q$. DEA [54] and FDH [55] methods compare the performance of $i = 1, 2, \ldots, n$ DMUs, with input items $(x_1, \ldots, x_p)$ and output items $(y_1, \ldots, y_q)$, against an efficient frontier. In the case of DEA, this frontier is constructed by the linear combination of points that stand for the best performing DMUs, which are found within $\Psi$. Contrary to the DEA frontier, the FDH efficient frontier does not rely on the convexity assumption, which implies that when two observations $(x, y)$ and $(x', y')$ are feasible, then all the linear combinations that lie between them are also feasible. According to Daraio and Simar [26], the production process can also be described by the joint probability measure of $(X, Y)$ on $\mathbb{R}_+^{p} \times \mathbb{R}_+^{q}$. Equation (2) gives the corresponding probability function $H_{XY}(\cdot, \cdot)$:
$$H_{XY}(x, y) = \text{Prob}(X \le x, Y \ge y) \tag{2}$$
In this paper, we consider an output orientation for bus PT system effectiveness assessment, since the objective of the production model (Section 3.1) is to achieve ridership figures (outputs) that are as high as possible for a given level of vehicles and vehicle-kms (inputs). In the output-oriented framework, Equation (2) can be decomposed as follows [15] (Equation (3)):
$$H_{XY}(x, y) = \text{Prob}(Y \ge y \mid X \le x)\, \text{Prob}(X \le x) = S_{Y|X}(y \mid x)\, F_X(x) \tag{3}$$
where $S_{Y|X}(y \mid x) = \text{Prob}(Y \ge y \mid X \le x)$ and $F_X(x) = \text{Prob}(X \le x)$.
Therefore, for each DMU $(x, y)$, the output-oriented efficiency score represents the relative increase of outputs that this DMU would require so as to reach the best performing observations within its sample (Equation (4)) [15]:
$$\lambda(x, y) = \sup \left\{ \lambda \mid S_{Y|X}(\lambda y \mid x) > 0 \right\} \tag{4}$$
DEA and FDH methods suffer from the curse of dimensionality, and their results may be biased due to extreme and outlier observations [15]. This mostly happens because both DEA and FDH are full frontier methods, i.e., their production sets enclose all the $(x, y)$ observations. Responding to these drawbacks, Cazals et al. [24] originated the concept of partial frontier methods. They labelled them as order-m frontiers because they allow the assessment of a DMU against $m$ other DMUs, where $m \le n$. These DMUs operate with inputs $X \le x$. Therefore, the order-m method provides for a less strict assessment of DMUs compared to the DEA and FDH techniques. The order-m output efficiency score is defined as follows [26] (Equation (5)):
$$\lambda_m(x, y) = E\left[ \max_{i = 1, \ldots, m} \min_{j = 1, \ldots, q} \frac{Y_i^{j}}{y^{j}} \,\Big|\, X \le x \right] = \int_0^{\infty} \left[ 1 - \left( 1 - S_{Y|X}(u y \mid x) \right)^{m} \right] du \tag{5}$$
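To make the difference between the full (FDH) frontier and the partial (order-m) frontier concrete, the following is a minimal Python sketch of both output-oriented scores. It is an illustration under stated assumptions, not the implementation used in this study (the calculations reported here were made in R): the function names, the Monte Carlo approximation of the expectation in Equation (5), the number of draws and the toy data are all hypothetical.

```python
import numpy as np

def fdh_output_score(x0, y0, X, Y):
    """Full-frontier (FDH) output score for a unit with inputs x0 and outputs y0.

    X has shape (n, p) and Y has shape (n, q). The score is the largest
    proportional output expansion observed among units that use no more of
    any input than the evaluated unit.
    """
    dominating = np.all(X <= x0, axis=1)           # units with X_i <= x0
    if not dominating.any():
        return np.nan
    ratios = np.min(Y[dominating] / y0, axis=1)    # min over output dimensions
    return ratios.max()

def order_m_output_score(x0, y0, X, Y, m, n_draws=2000, seed=0):
    """Monte Carlo approximation of the order-m output score.

    Repeatedly draws m units (with replacement) among those with X_i <= x0,
    takes the best output ratio in each draw and averages over draws.
    n_draws and seed are illustrative choices, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    dominating = np.all(X <= x0, axis=1)
    if not dominating.any():
        return np.nan
    ratios = np.min(Y[dominating] / y0, axis=1)
    draws = rng.choice(ratios, size=(n_draws, m), replace=True)
    return draws.max(axis=1).mean()

# Hypothetical toy data: two inputs (vehicles, vehicle-km), one output (boardings).
X = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
Y = np.array([[1.0], [4.0], [2.0]])
full = fdh_output_score(X[2], Y[2], X, Y)
partial = order_m_output_score(X[2], Y[2], X, Y, m=3)
```

Because each draw contains only m peers, the order-m score never exceeds the FDH score, which is why the partial frontier gives a less strict assessment and is less sensitive to a single extreme observation.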

Assessment of Exogenous Influence
Daraio and Simar [15,26] used the probabilistic approach of Equation (2) in order to explore the influence of other exogenous variables $Z \in \mathbb{R}^{r}$, which do not participate as inputs or outputs of the production process but possibly affect its results. Equation (6) gives the conditional production process, which they proposed, and represents the probability of a DMU $(x, y)$ to operate under lower efficiency levels when compared with other DMUs, which function under the same set of exogenous factors $Z = z$:
$$H_{XY|Z}(x, y \mid z) = \text{Prob}(X \le x, Y \ge y \mid Z = z) \tag{6}$$
Within this setting, Equation (7) gives the conditional order-m output efficiency score [15]:
$$\lambda_m(x, y \mid z) = \int_0^{\infty} \left[ 1 - \left( 1 - S_{Y|X,Z}(u y \mid x, z) \right)^{m} \right] du \tag{7}$$
In order to estimate $\hat{\lambda}_{m,n}(x, y \mid z)$, we may utilize the following non-parametric smoothed estimator of $S_{Y|X,Z}(y \mid x, z)$ [15] (Equation (8)):
$$\hat{S}_{Y|X,Z,n}(y \mid x, z) = \frac{\sum_{i=1}^{n} I(X_i \le x, Y_i \ge y)\, K\!\left( \frac{z - Z_i}{h_n} \right)}{\sum_{i=1}^{n} I(X_i \le x)\, K\!\left( \frac{z - Z_i}{h_n} \right)} \tag{8}$$
where $I(\cdot)$ is the indicator function, and $K(\cdot)$ and $h_n$ are the kernel and the bandwidth, which are needed for smoothing on $Z$. In this research we used the Epanechnikov kernel function when $Z$ was a continuous variable. The Aitchison and Aitken [56] kernel function was used when $Z$ had unordered discrete values, and the Wang and Ryzin [57] kernel function was used when $Z$ had ordered discrete values. We used the process developed by Simar et al. [49] to estimate the bandwidth $h_n$. The required calculations were made in R version 3.0.2 software (R Foundation for Statistical Computing, Vienna, Austria) [58] with the "np" package [59].
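As a hedged illustration of the smoothed estimator in Equation (8) for a single continuous exogenous variable, the Python sketch below weights each peer by an Epanechnikov kernel on its distance from the evaluated value of the exogenous variable. It is not the R "np" code used in this study. The function names are hypothetical, the bandwidth `h` is simply passed in by the caller (the data-driven bandwidth selection of Simar et al. [49] is not reproduced), and the conditional order-m score is approximated by kernel-weighted Monte Carlo draws.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def conditional_survival(x0, y0, z0, X, Y, Z, h):
    """Kernel-smoothed estimate of Prob(Y >= y0 | X <= x0, Z = z0).

    X: (n, p) inputs, Y: (n, q) outputs, Z: (n,) one continuous exogenous
    variable, h: bandwidth chosen by the caller (an assumption of this sketch).
    """
    w = epanechnikov((z0 - Z) / h)
    dom = np.all(X <= x0, axis=1)
    denom = np.sum(w * dom)
    if denom == 0:
        return np.nan
    num = np.sum(w * dom * np.all(Y >= y0, axis=1))
    return num / denom

def conditional_order_m(x0, y0, z0, X, Y, Z, h, m, n_draws=2000, seed=0):
    """Monte Carlo approximation of the conditional order-m output score.

    Peers with X_i <= x0 are drawn with probability proportional to their
    kernel weight, so units with exogenous values far from z0 are rarely
    (or never) used as benchmarks.
    """
    rng = np.random.default_rng(seed)
    w = epanechnikov((z0 - Z) / h) * np.all(X <= x0, axis=1)
    if w.sum() == 0:
        return np.nan
    ratios = np.min(Y / y0, axis=1)
    idx = rng.choice(len(Z), size=(n_draws, m), replace=True, p=w / w.sum())
    return ratios[idx].max(axis=1).mean()

# Hypothetical toy data: one input, one output, one exogenous variable.
X = np.array([[1.0], [1.0], [1.0]])
Y = np.array([[1.0], [2.0], [3.0]])
Z = np.array([0.0, 0.5, 5.0])
s = conditional_survival(X[0], np.array([2.0]), 0.0, X, Y, Z, h=1.0)
lam_z = conditional_order_m(X[0], Y[0], 0.0, X, Y, Z, h=1.0, m=50)
```

In this toy example the third unit lies outside the kernel window around the evaluated exogenous value, so it receives zero weight and cannot act as a benchmark, even though it has the highest output.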
In order to determine the impact of each external factor (see Section 4.2) on the production model process of Section 3.1, we followed the method proposed by Daraio and Simar [15,26]. We produced scatter plots that combined the ratios $Q_m^{z}$ (Equation (9)) and the $Z$ values. In the scatter plots we demonstrated the relationship between these two sets of variables by their smoothed nonparametric regression line $\hat{g}(z)$ (Equation (10)). This line was estimated by the simple smoothed nonparametric regression estimator of Nadaraya [60] and Watson [61]:
$$Q_m^{z} = \frac{\hat{\lambda}_{m,n}(x, y \mid z)}{\hat{\lambda}_{m,n}(x, y)} \tag{9}$$
$$\hat{g}(z) = \frac{\sum_{i=1}^{n} K\!\left( \frac{z - Z_i}{h} \right) Q_{m,i}^{z}}{\sum_{i=1}^{n} K\!\left( \frac{z - Z_i}{h} \right)} \tag{10}$$
In our output-oriented setting, if the regression line $\hat{g}(z)$ is increasing along with $Z$, then we may characterize the influence of $Z$ as favorable to the production process. If it is decreasing, then this influence will be unfavorable. If $\hat{g}(z)$ remains straight, i.e., $\lambda_m(x, y \mid z) = \lambda_m(x, y)$, then $Z$ will have no impact on the production process.
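The Nadaraya-Watson smoother of Equation (10) can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes a Gaussian kernel and a bandwidth `h` supplied by the caller (no cross-validation is performed here), and the function name and the synthetic ratios are hypothetical rather than taken from the study's data.

```python
import numpy as np

def nadaraya_watson(z_grid, Z, Q, h):
    """Nadaraya-Watson regression of the ratios Q on the exogenous values Z.

    Uses a Gaussian kernel; its normalizing constant cancels in the ratio.
    Returns the fitted regression line evaluated at each point of z_grid.
    """
    z_grid = np.asarray(z_grid, dtype=float)
    Z = np.asarray(Z, dtype=float)
    Q = np.asarray(Q, dtype=float)
    u = (z_grid[:, None] - Z[None, :]) / h    # (grid points, observations)
    K = np.exp(-0.5 * u**2)                   # Gaussian kernel weights
    return (K @ Q) / K.sum(axis=1)

# Synthetic, hypothetical ratios that rise with Z (a "favorable" pattern).
Z = np.linspace(0.0, 1.0, 11)
Q_rising = 2.0 * Z + 1.0
Q_flat = np.ones_like(Z)
grid = np.linspace(0.0, 1.0, 5)
g_rising = nadaraya_watson(grid, Z, Q_rising, h=0.3)
g_flat = nadaraya_watson(grid, Z, Q_flat, h=0.5)
```

Reading the fitted line is then a matter of its slope: a rising line over the observed range corresponds to a favorable exogenous factor in the output-oriented setting, a falling line to an unfavorable one, and a flat line to no influence.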
In this paper, we used the Gaussian kernel function and the least squares cross-validation criterion to select the bandwidth $h$ in Equation (10) [8]. We performed the estimations of $\hat{g}(z)$, and we produced the respective scatter plots by employing the R version 3.0.2 software [58] and the "np" package [59]. Furthermore, we followed the nonparametric significance test of Racine et al. [50] and Racine [51] to indicate the exogenous factors that significantly influence the modeled production process. This test is analogous to a simple t-test in a parametric regression setting [59], and when it returns a p-value lower than 0.05, it highlights a statistically significant influence of $Z$ on the variation of the $Q_m^{z}$ ratios. This conditional non-parametric environment for assessing exogenous influence avoids the theoretical and practical shortcomings of the one-stage and two-stage DEA methods, which have been extensively applied in PT DEA literature (Section 2). Specifically, in the case of the one-stage method, Daraio and Simar [15] described two disadvantages: (a) we have to assume beforehand the positive or negative role of an external indicator in the production process so as to appropriately allocate it to the production model, and (b) we have to assume the free disposability of the augmented production set $\Psi$. Regarding the two-stage approach, (a) Boame [13] noted that results may be biased if the variables used in the first stage are highly correlated with the second-stage variables, and (b) Simar and Wilson [62,63] proved that bootstrap algorithms are necessary to remove bias from the results of such regression analyses.

Data
The data of our study were mainly compiled from two databases: (a) the Union Internationale des Transports Publics (International Association of Public Transport) Digest (UITPD) database [53] was our main source for the supply and demand variables of European bus PT operators, and (b) the Eurostat Urban Audit Database (EUAD) [64] was our main source for the exogenous variables, which define the operational environment where these PT operators provide services. The reference year for both data extractions is 2009. In this year, we were able to observe a satisfactory availability of data in both databases regarding the bus PT operators in the UITPD and their corresponding cities in the EUAD so as to examine as many exogenous variables and operators as possible. This data-driven approach resulted in the selection of $n = 57$ European bus PT operators for further analysis.

Bus PT Effectiveness Variables
The UITPD is a global PT database of supply and demand figures concerning operator members of the UITP. The UITPD contains basic annual service variables for PT operators, such as the type of PT service (e.g., urban, rural), number of staff, passengers, passenger-km, vehicle-km, number of road/rail vehicles, number and length of PT lines. These variables are reported only for fixed routes and not for night or other on-demand routes. Access to the UITPD is only allowed for UITP subscribers. In this study we compiled a sample of $n = 57$ European bus PT operators who either systematically report full annual data in the UITPD or announce annual performance reports publicly. Table 1 provides some basic statistics of the three PT variables we used in order to simulate the production model for evaluating the effectiveness of these operators. Table 1 figures indicate that our sample represents both large and small sized PT operators.

Exogenous Variables
The EUAD collects 170 variables and calculates 60 indicators that cover several aspects of life in over 1500 European cities [65]. EUAD information was collected for two basic spatial reference levels in each European city: (a) the city and (b) the functional urban area (FUA). The city area is determined by its administrative boundaries, while the FUA includes the city and its commuting zone. Appendix A reports the population figures of FUAs for all the 57 cities we considered in our analysis [64]. Each of these cities is assigned to a single bus PT operator in our sample. Of the total of 57 cities, 46 come from the largest European countries, i.e., France (22), Germany (8), Italy (8) and Spain (8). Almost half of the cities (28) range between approximately 150,000 and 500,000 residents. Eleven cities exceed 1 million citizens, and the largest of them are in Germany and Spain (Table A1, Appendix A). Table 2 reports the descriptive statistics for the 16 exogenous indicators ($Z_1$ − $Z_{16}$), which we tested regarding their influence on the effectiveness of the $n = 57$ bus PT operators. We considered the five major groups of factors that affect PT ridership, i.e., the output of our production model. These groups are [20]: (a) regional geography (urbanization, population density, population, climate etc.), (b) metropolitan economy (GDP, unemployment etc.), (c) population characteristics (age distribution, percent of immigrants and college students etc.), (d) auto/highway system (congestion, parking availability, car ownership rates etc.) and (e) PT systems characteristics (fares, transit modes, service frequencies etc.). Because there are no standardized databases at the European level that fully cover all these aspects, we used the EUAD to select a short yet comprehensive list of external variables that accounts, as much as possible, for these five groups of indicators.
Therefore, Table 2 variables refer to demographics ($Z_1$ − $Z_6$), living conditions ($Z_7$, $Z_8$), economy ($Z_9$, $Z_{10}$), urban transport ($Z_{11}$ − $Z_{14}$) and other secondary factors ($Z_{15}$, $Z_{16}$) that characterize the $n = 57$ PT operational environments in the cities under examination. We underline the fact that we did not include variables that describe PT services' characteristics, such as frequency, fares and reliability, since these variables cannot be considered as exogenous given that they are more under the control of PT operators. No strong correlations (coefficients up to 0.5) were found among Table 2 variables, and we were able to examine them independently from each other [15]. The variables mainly derive from FUAs, considering that FUAs better capture the service areas of the $n = 57$ European bus PT operators, since urban bus networks usually extend beyond the administrative boundaries of cities and are used by the commuters who reside in cities' suburbs. Exceptions were made when certain exogenous indicators were only measured by the EUAD at the city level, at the metropolitan level or at the Nomenclature of territorial units for statistics (NUTS)-3 level. We considered previous research findings (Section 2) in formulating our initial hypotheses for all 16 exogenous variables regarding the sign of their influence on bus PT effectiveness (last column of Table 2). In short, urbanized areas with increasing population numbers should generally favor PT [14,20,22]. There were studies [40] that highlighted comparatively greater PT use by women. Population density has been repeatedly associated with beneficial impacts on PT performance [14,38,48]. Cities with comparatively lower GDP and higher unemployment rates probably create more favorable PT operational environments [23,39]. Percentages of college students and foreigners are generally positive exogenous factors [20]. A lot of studies have demonstrated the negative influence of car ownership rates on PT [20,21,35,40–42].
We selected an unfavorable impact for the bus fleet size due to the possible existence of economies of scale [44]. The exogenous indicators, which had been either associated with mixed results in previous studies or had been comparatively less explored, were assigned with ambiguous impact. In our analysis, if cities or FUAs had a missing value for the exogenous variable under examination, then these cities or FUAs were temporarily kept out from the analysis of the respective exogenous variable.
Table 3 presents partial frontier (order-m) performance scores for the combined effectiveness production model of the $n = 57$ European bus PT operators. Performance scores are obtained from Equation (5) for each bus PT operator separately (Table 1 figures) and appear as $1/\lambda_m$ values to ease our discussion. To save space, we present a summary of them. We tested several values of $m$ to monitor the percentage of bus PT operators that were left out of the order-m frontier ($1/\lambda_m > 1$). Since this percentage remained very stable for $m > 30$, we can finally select $m = 40$ to discuss the obtained results [15,24]. Under $m = 40$, 20 bus PT operators were found that have effectiveness scores equal to or greater than one ($1/\lambda_m \ge 1$) and therefore can be considered as best-performing examples among their peers in our sample. This best-practice group of operators represents 35% ($20/57$) of all 57 bus PT operators. The worst performing operator should increase its ridership by 198.51% ($\lambda_m = 2.985$), without using more inputs (i.e., vehicles and vehicle km), in order to reach the best performing operators for its case. On average, however, a bus PT operator produces a level of ridership (output) that is equal to 0.784 times the expected value of the maximum level of ridership of $m = 40$ other operators drawn from the population of operators using a level of input that is equal to or less than that of the operator under evaluation.
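The arithmetic behind these score interpretations can be checked with a short Python sketch. The function names are hypothetical, and the only inputs are the figures quoted in the text (an output-oriented score of 2.985 for the worst performing operator, and 20 best-practice operators out of 57).

```python
def reported_score(lam_m):
    """Score as reported in the discussion: the inverse of the output-oriented score."""
    return 1.0 / lam_m

def required_output_increase_pct(lam_m):
    """Percentage by which output would need to rise, with inputs held fixed."""
    return (lam_m - 1.0) * 100.0

def best_practice_share_pct(n_best, n_total):
    """Share of operators lying on or above the order-m frontier, in percent."""
    return 100.0 * n_best / n_total

worst_reported = reported_score(2.985)                # roughly 0.335
worst_increase = required_output_increase_pct(2.985)  # roughly 198.5 percent
share = best_practice_share_pct(20, 57)               # roughly 35 percent
```

A reported score below one therefore signals room for ridership growth at the current input level, while a score of one or more marks an operator that performs at least as well as the expected best of its m drawn peers.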
Table 3 also shows that these findings do not deviate greatly for the other values of m that were tested and thus indicate a generally satisfactory PT effectiveness level within our European sample. The next section discusses whether specific exogenous factors can explain the contrasts observed in the effectiveness scores of these bus PT operators.

Table 4 presents the results of the methodology explained in Section 3.3 and shows the influence that the 16 selected exogenous variables (Table 2) exert on the combined effectiveness of the 57 European bus PT operators. Conditional order-m scores were estimated for each bus operator separately (Equation (7)) using the unconditional ones discussed in Section 5.1. The third and fifth columns of Table 4 interpret the scatter plots of the ratios (Equation (9)) along with the smoothed nonparametric regression lines (Equation (10)). These scatter plots are given in Appendix B (Figures A1-A4) for the exogenous indicators that had a statistically significant influence on bus PT effectiveness (fourth column of Table 4).
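The smoothing step applied to the scatter plots can be sketched as follows. This is a minimal illustration that assumes a Nadaraya-Watson estimator with a Gaussian kernel and a hand-picked bandwidth; the estimator and bandwidth selection actually used are those of Equation (10) and Section 3.3, which are not reproduced here. The names `nw_smooth`, `z`, `ratio` and `grid` are hypothetical, and `ratio` is assumed to hold the per-operator ratios of Equation (9).

```python
import numpy as np

def nw_smooth(z, ratio, grid, bandwidth):
    """Nadaraya-Watson regression of score ratios on one exogenous variable.

    z: (n,) values of the exogenous variable, ratio: (n,) score ratios,
    grid: points at which the smoothed line is evaluated.
    """
    z = np.asarray(z, dtype=float)
    ratio = np.asarray(ratio, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Gaussian kernel weights between every grid point and every observation.
    u = (grid[:, None] - z[None, :]) / bandwidth
    weights = np.exp(-0.5 * u ** 2)
    # Locally weighted average of the ratios at each grid point.
    return (weights @ ratio) / weights.sum(axis=1)
```

The slope of the smoothed line over a given range of the exogenous variable is what is read as a favorable or unfavorable influence in that range; which direction counts as favorable depends on the orientation of the scores and follows Section 3.3.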

Exogenous Impact on Bus PT Effectiveness
The first six exogenous variables in Table 4 account for the population characteristics of the examined European cities. In line with our initial hypothesis, total population has a favorable effect on bus PT effectiveness, but this effect was not found to be statistically significant. The effect of "Women per 100 men", though statistically significant, does not follow a uniform pattern and varies between favorable and unfavorable across the range of values that appear in our sample. The percentage of active population, i.e., between 20 and 64 years old, appears to have no bearing on bus PT effectiveness. Additionally, the annual population change (percentage difference compared to the previous year) does not have any obvious effect on PT effectiveness. This may be because this indicator expresses a momentary change and is unable to capture the growth or shrinkage dynamics of cities. The proportion of non-EU foreigners was also explored, since past studies considered it a proxy for income inequality and poverty, but no significant influence was found. Given the result for "Students in higher education per 100 resident population aged 20-34", we reconfirm that the presence of university students in cities is accompanied by more favorable operational environments for bus PT and encourages relatively higher patronage.
Regarding living conditions in European cities, the case of population density confirms our initial hypothesis, since it is associated with a favorable influence on bus PT performance. However, this influence is not statistically significant. This may be because the EUAD does not report population density for cities or FUAs, so the figures used in this study refer to the respective metropolitan regions, which do not always coincide with the urban PT service areas under examination. Living conditions, however, do have an impact on bus PT performance, since the "Average size of households" is an unfavorable factor with statistical significance.
In contrast to our initial hypothesis, rising unemployment rates are clearly associated with a negative influence on bus PT effectiveness. This indicates that the sustainability of European bus PT systems is generally favored under high employment activity. GDP had no statistically significant impact. This may be because GDP indicators probably do not adequately reflect the expenses required for travelling by PT and other transport modes in relation to the disposable income of residents. Regarding the transport system characteristics of the cities, in line with our hypothesis, the number of registered private cars per 100 inhabitants has a negative impact on bus PT combined effectiveness. Public operators were found to be more successful than private ones. However, this result is probably too generic, because this variable fails to incorporate the various PT organizational and management regimes, which are often determined by PT service contracts regardless of whether the operators are private or public. Unfortunately, we were not able to locate any standardized database from which to construct variables that account for such contracting arrangements. The impact of bus operators' size on their systems' effectiveness was measured through the proxy variable of bus fleet size. The results indicate the existence of economies of scale. Thus, for fleet sizes of up to 1200 vehicles, the operational environment is favorable, but for larger fleet sizes it deteriorates. The mix of PT modes was found to be a significant factor and demonstrates the creation of a favorable operational environment in cities where bus, tram and metro networks coexist.
Finally, our results also showed that neither the capital status nor the climate of European cities exerts any important influence on bus PT combined effectiveness.

Consolidation
Empirical findings indicated seven significant exogenous factors (one variable is excluded for the reasons explained in Section 5.2) and demonstrate that the operational environment does influence bus PT effectiveness and thus sustainability. We can now return to the very essence of our production model to explain the variations in bus PT effectiveness scores that were observed among operators (Section 5.1). If bus PT operator A presents higher effectiveness than bus PT operator B, then we may assume that operator A achieves higher ridership figures using the same or fewer inputs than operator B because of (a) management practices on the utilization of vehicles and route scheduling, and/or (b) the quality of PT services, and/or (c) an operational environment that is more favorable than that of operator B. In other words, bus PT demand depends on both endogenous (points a and b) and exogenous factors (point c). As such, the challenge for PT operators is to design and implement measures that increase their patronage by improving their cost and service effectiveness while also considering the properties of their operational environment. In this context, we recognize two main fields for utilizing our research results.
First, our findings could assist in the formulation of sustainable development policies that would contribute both to increasing PT demand and sustainability and to the fulfillment of wider community goals. Specifically, it was made evident that policies which reduce private car use, along with their well-known environmental and social benefits, could also increase PT effectiveness. University students are recognized as a group of travelers that PT systems should further embrace in order to obtain effectiveness and sustainability gains. The intensification of employment may be emphasized as a policy that favors both economic and PT systems' sustainability. The introduction of rail PT systems generally supports sustainable urban mobility lifestyles, and, in line with this notion, our results also showed that it favors greater use of bus PT services. The conditional non-parametric setting we adopted also allowed us to highlight the range of values (Table 4) where such exogenous factors could be either beneficial or detrimental to PT effectiveness. Therefore, the design of the above policies could be more specific, since a set of corresponding quantitative targets may be more easily determined.
Secondly, the consideration of the seven exogenous factors that we found to be significant may act as a prerequisite for conducting fair benchmarking comparisons among different PT systems. Specifically, the examination of these indicators among different cities allows the identification of benchmarking peers, i.e., bus PT agencies whose operational environments are as similar as possible. Therefore, any comparatively lower or higher effectiveness of such bus PT agencies will be due only to their unsuccessful or successful management practices, which need to be identified or (in the latter case) exchanged within their benchmarking networks. Under the combined effectiveness model we built, such successful management practices would lead to comparatively higher effectiveness levels without the need to increase PT resources (i.e., vehicles and vehicle km). They may, for instance, refer to the improvement of the quality of PT services (e.g., fares, customer care, speed, comfort) and/or to a more sophisticated utilization of the available PT resources (e.g., modification of depots, span of services, scheduling, network design) in order to increase service effectiveness (PT passengers' viewpoint) and cost-effectiveness (PT agencies' viewpoint). In this benchmarking context, the comparatively best-performing PT operators would also be associated with management practices that enable them to adapt their PT operations more successfully to their exogenous environment and PT demand characteristics.
The utilization of these seven exogenous factors as peer-grouping criteria for European PT benchmarking exercises provides important benefits, since (a) they are continuously monitored and reported by the EUAD and thus enable single bus PT organizations to perform customized benchmarking comparisons with other systems, and (b) they are determined under the combined effectiveness model, which, as previously explained, stands for some of the most basic sustainability objectives of PT systems.
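The peer-grouping idea can be sketched in Python as a nearest-neighbor search over standardized exogenous indicators. This is an illustration only: the study does not prescribe a distance measure, so the use of z-scores with Euclidean distance, the function name `nearest_peers` and the layout of the `factors` array (one row per operator, one column per significant exogenous factor) are all assumptions.

```python
import numpy as np

def nearest_peers(factors, target, k=3):
    """Return indices of the k operators most similar to `target`.

    factors: (n, q) array, one row per operator and one column per
    exogenous indicator (hypothetical layout).
    """
    factors = np.asarray(factors, dtype=float)
    mean = factors.mean(axis=0)
    std = factors.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant indicators
    # Standardize so that indicators in different units are comparable.
    zscores = (factors - mean) / std
    distance = np.linalg.norm(zscores - zscores[target], axis=1)
    distance[target] = np.inf  # an operator is not its own peer
    return np.argsort(distance)[:k]
```

Effectiveness scores would then be compared only within the returned peer group, so that remaining differences can more plausibly be attributed to management practices rather than to the operational environment.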

Conclusion
In this study, we applied advanced non-parametric analysis methods, namely order-m conditional and unconditional efficiency measures, to evaluate a basic component of PT sustainability, i.e., PT systems' effectiveness, and to investigate its exogenous determinants. Our dataset was representative enough, since it was composed of 57 bus PT operators that are active in large and medium sized European cities, where all combinations of road and rail PT modes are present. Our production model incorporated variables that reflect the combined effectiveness of PT systems and thus encompasses a basic pillar of PT sustainability.
Our results indicated that the examined European bus PT operators generally present a satisfactory level of effectiveness. We also highlighted that seven specific exogenous factors may have a meaningful influence on such bus PT effectiveness scores. Among these exogenous indicators, household size, unemployment and car ownership rates are negatively associated with PT effectiveness, while the presence of university students and metro networks supports the creation of more favorable operational environments for PT sustainability. These exogenous indicators correspond to specific topics where appropriate sustainable development policies could improve both the quality of life in local communities and the sustainability of the respective bus PT systems. Moreover, these indicators can function as initial peer-grouping criteria for selecting comparable benchmarking partners. Within such PT benchmarking networks, the identification and exchange of successful practices will both increase PT systems' effectiveness and contribute to the improvement of their sustainability.
Due to certain limitations of the EUAD and other data sources we used, we were not able to sufficiently explore important exogenous factors such as population density, income inequality and transport system characteristics, as well as PT contracting arrangements and network design features, which in certain cases may also act as exogenous factors in some bus PT systems. Another limitation of our study is that, due to breaks in time series in the PT and Eurostat databases we used, we obtained results for only one year (2009). We were therefore not able to repeat our analysis for consecutive years so as to further establish our findings and account for economic and technological changes that occurred afterwards and may have also affected the bus PT sector. This is especially true for the results we acquired regarding external indicators such as GDP, unemployment rates, car ownership rates and possibly population change. The year 2009 was characterized by a global economic crisis, and this may have been reflected in these indicators' figures. The consideration of more recent years would probably have enabled us to investigate a different distribution of values that would possibly be less skewed and, along with the 2009 results, would return a more representative estimation of these indicators' influence on bus PT effectiveness. Forthcoming research should develop additional production models that reflect alternative expressions of PT effectiveness and the other basic sustainability aspects of PT systems, in order to further investigate their interrelationships and how they are influenced by the respective PT operational environments.

Funding: This research received no external funding.
Acknowledgments: The authors would like to thank the three anonymous reviewers for their valuable comments and suggestions on previous versions of this paper.