Sustainability Assessment of Public Transport, Part I—A Multi-Criteria Assessment Method to Compare Different Bus Technologies

This article takes the perspective of Swedish regional transport authorities and focuses on the public procurement of bus transport. Many of these county-level public organizations aim to contribute to a transition involving the continued marginalization of fossil fuels and improved sustainability performance. However, there are several renewable bus technologies to choose among, and it can be difficult to know which alternative (or combination) is preferable. Prior research and the authors’ experiences indicate a need for improved knowledge and supportive methods on how sustainability assessments can support public procurement processes. The purpose of this article is to develop a multi-criteria assessment (MCA) method to support assessments of public bus technologies’ sustainability. The method, which was established in an iterative and participatory process, consists of four key areas and 12 indicators. The article introduces the problem context and reviews selected relevant prior research on green or sustainable public procurement and sustainability assessments. The process and the MCA method are then presented and discussed in light of advice for effective and efficient sustainability assessments. In the companion article (Part II), the MCA method is applied to assess several bus technologies involving biodiesel, biomethane, diesel, electricity, ethanol and natural gas.


Introduction
Transportation systems are linked to essential environmental, health and resource challenges [1], and many organizations take action to improve sustainability performance (see, for example, References [2][3][4]). A rapid transition is needed to reach objectives regarding, for example, renewable energy, climate impact and air quality in the EU and in countries like Sweden [5][6][7][8]. For many types of transportation, there is a wide array of possible alternative technologies involving, for example, biodiesel, biomethane, electricity and ethanol [9][10][11].
This article deals with the generic problem of how to assess transport technologies' sustainability performance (including technical and short-term economic aspects). It takes as its starting point the perspective of Swedish regions, which are county-level authorities that, among other duties, are responsible for procuring public transport in their role as regional public transport authorities. The focus is on buses, as they have a dominant position among the public transport modes in Sweden [12], as in many parts of the world [13].
Regarding bus fuels in public transport in Sweden, a major transition has occurred during the last two decades. At the beginning of the 21st century, the bus fleet was almost totally driven on fossil fuels, whereas in 2017, more than 60% of the buses used renewables [14]. The regions, specifically the regional public transport authorities, have been key actors in this transition [11]. Via green public procurement (GPP) or sustainable  [19].
The described palette of renewable technologies provides great opportunities for the continued marginalization of fossil fuels. However, when the regions (or other actors) ask which alternative, or which combination of alternatives, is best suited for them, the question can be challenging to answer because it encompasses many sub-questions. Generally, there seems to be a great need for improved knowledge and supportive methods when it comes to including environmental/sustainability assessments in public procurement processes [20] and in other types of related appraisals [21,22]. This is not least relevant in connection with transportation [15,23,24]. These observations are in line with the authors' experience of working within the Swedish Biogas Research Center (BRC) and in related projects, where we have been approached by regions, municipalities and biogas sector organizations requesting support on how to compare different bus transport technologies. In addition, biofuel producers and distributors and other stakeholders (see, for example, Figure 1 in Reference [11]) have shown great interest in developing these kinds of assessments.

Aim and Scope
The purpose of this article is to develop a multi-criteria assessment (MCA) method to support assessments of public bus technologies' sustainability. The method shall:

• Be based on existing knowledge of practitioners and scientists
• Be adapted to the context and input from stakeholders, particularly Swedish regions' challenges related to the procurement of bus services and their views on sustainability
• Include common indicators such as monetary costs and function/quality [20] but broaden the scope in relation to many existing methods
• Cover essential areas regarding sustainability and the most relevant aspects for the different technologies to be assessed
• Be relatively simple to use with a reasonable number of indicators to facilitate data collection and overview
• When used, provide results for a wide range of indicators without weighting, thus leaving users to decide, based on their own preferences (such as local conditions and prioritized objectives), whether any indicators are more important.

This article is the first (Part I) of two associated articles and presents the process of establishing the MCA method. Following the introduction, Section 2 provides a literature review on green or sustainable public procurement (GoSPP) and environmental or sustainability assessments (EoSA), focusing on MCA and transport systems. Section 3 gives an overview of the working process, while Section 4 presents the outcome: the established MCA method. Finally, there is a concluding discussion in Section 5. In Part II, the method is applied to compare different bus technologies (Dahlgren and Ammenberg, 2020; "Part II" is hereafter used to refer to this second article).

Green or Sustainable Public Procurement
Cheng et al. [20] conducted an extensive literature review on GoSPP and found an increased focus on it in recent years, both in practice and research (cf. [25], focusing on the private sector). Key procurement actors, like local authorities, struggle with implementation, and improvements are required regarding the follow-up of requirements.
Budget constraints are commonly emphasized as the main barrier to GoSPP [20,26,27], due to additional costs of products or services with superior sustainability performance (cf. [22]) and a need for additional staff resources [15,28]. Conflicting objectives were also found to be a common barrier in several studies [28]. A broader and more long-term perspective is needed to account for sustainability impacts and shift the direction toward more comprehensive socio-economic outcomes [28][29][30]. Accordingly, based on life cycle costing (LCC), it has been established that GPP within the EU has led to decreased costs for the purchasing organizations. A short-term cost focus makes it, in general, very difficult to address sustainability issues [15].
A lack of sufficient competency is another frequently mentioned barrier [20,28]. Michelsen and de Boer [31] found, in a large study among Norwegian municipalities and counties, that only about 5 percent of the municipalities "... felt they had sufficient competence to formulate demands on environmental performance and evaluate the information they received in the suppliers' offers" (p. 163). Testa et al. [32] even concluded that knowledge and training are more important than economic resources.
Perera et al. [33] refer to a study from 2007 in which hundreds of tools, including SPP tools, were identified, among them guidelines, handbooks, databases and software. This report does not include the characteristics of these tools but indicates that a large share consists of unique tools developed by individual organizations to be applied in their particular context. Several studies highlight the need for simple methods and guidance [20,28,32,34]. In line with this, Cheng et al. [20] noted the lack of official guidance on green public procurement. Thomson and Jackson [35] emphasize the need to develop new models to quantify the wider impacts of procurement decisions. About 70% of the Norwegian municipalities in the study by Michelsen and de Boer [31] emphasized a need for templates, including standardized environmental requirements. The EU has provided a training toolkit on GPP, with criteria for ten key sectors including transport, but Palmujoki et al. [36] found that it has potential for further improvement, for example, by including more detailed criteria.
Some articles deal with EU policy and practical implications. Luttenberger and Luttenberger [29] state that the 2014 EU Directive on Public Procurement implies that technical specifications can relate to sustainability impacts at any stage of the life cycle of a product or service, and that the considered costs can cover acquisition, use, maintenance, end of life and environmental externalities. Aldenius and Khan [15] mention the two options within the EU, minimum compliance criteria and award criteria, where the latter can give additional points to the bidder in the tender. They state that minimum requirements dominate regarding renewable fuels in the Swedish bus sector and found two types of such requirements to be used: functional (e.g., a maximum amount of CO2 that can be released) and specific (e.g., that a specific fuel has to be used) (cf. Reference [37]).
In the reviewed literature, there is limited detailed information on methods or tools for comparison and evaluation of different types of products and services, including transport. Therefore, the literature review was extended to encompass publications on sustainability assessments that could provide more relevant information considering this article's purpose.

Environmental or Sustainability Assessments
There has been a focus on sustainable development for several decades [38] and much has been written about its meaning and implementation [39][40][41][42]. Despite abundant "sustainability initiatives" within organizations of different types, Waas et al. [43] emphasize an implementation gap (as do [44] and other researchers) and an urgent need to focus more on implementation in essential decision-making processes, stating, "... sustainable development must be considered as a decision-making strategy" ([43], p. 5513). In line with many others, they argue for the need for environmental or sustainability assessment (EoSA) methods that can support and influence decision-making towards sustainability.
Sustainability assessment methods are intended to provide decision-makers with knowledge on sustainability implications of different actions or alternatives [45]. There are several different methods for the integration of sustainability into decision making [46]. As we see the sustainability assessment of bus technologies as a complex or wicked problem [47,48] and acknowledge value pluralism [49], it seemed reasonable to use a multi-criteria approach. In practice, the process of establishing an MCA method and applying it often incorporates several of the other methods (or techniques) described by Bueno et al. [46], such as combining an MCA with cost-benefit analysis [50,51]. For example, our application of the MCA method has been influenced by sustainability indicators used in other sustainability evaluation tools, and results from LCC and LCA (Life Cycle Assessment/Analysis) studies were used. There are many different types of MCA methods in the literature, for example, References [24][51][52][53][54][55]. Belton and Stewart [56] define multi-criteria decision analysis as "an umbrella term to describe a collection of formal approaches which seek to take explicit account of multiple criteria in helping individuals or groups explore decisions that matter". A strength of MCA methods is that they allow for the inclusion of several different types of indicators [21], both of a quantitative and qualitative character [57][58][59], and can involve expert assessments and participatory processes [52,60,61]. They can be used to handle large amounts of information and to select and structure the most relevant information, thereby providing more holistic evaluations [51,52] and simultaneously facilitating overview and communication [54,62]. In addition to the mentioned strengths, there are also drawbacks cited in the literature regarding MCA, for example:

• The use of several indicators with different directions or units can imply that some effects are "double"-counted [63,64], that is, in case indicators are overlapping, which can be hard to avoid
• Critique regarding arbitrariness and subjectivity [63].

4. Weighting, meaning that different indicators/criteria can be assigned a weight in relation to their relative importance
5. Assessment and analysis
6. Presentation and interpretation of results, recommendations.

These steps are also similar to the MCA-based decision-making process illustrated by Foxon et al. [74] and Oltean-Dumbrava et al. [75]. There are several specified methods for MCA, for example, to assign weights and calculate scores, that can be relatively advanced, such as the analytic hierarchy process developed by Saaty [76]. Hüging et al. [21] have reviewed assessment methods and conclude that there is a lack of suitable methods for sustainability assessments and a specific demand for simple but broad approaches (cf. [22,75]). This has influenced our aim, and recommendations from the mentioned studies (and others, see below) have influenced the MCA method development process.

Effective and Efficient Sustainability Assessments
The literature on sustainability assessments, including MCA, contains much advice regarding methods and processes. Based on their review, Waas et al. [43] (drawing on original references, such as Baker and McLelland [77]) present four categories of effectiveness: substantive (e.g., to achieve the intended outcome), normative (e.g., social learning), procedural (e.g., that the process is open, fair and objective) and transactive (efficient use of resources, including time). Furthermore, it is emphasized that the ideal-typical sustainability assessment needs to integrate "top-down/expert-driven" and "bottom-up/stakeholder-driven" approaches, make use of different kinds of knowledge and create opportunities for learning. Oltean-Dumbrava et al. [75] present practical requirements for good multi-criteria decision-making tools, such as consistency and logical soundness, user-friendliness and good visual presentation of results. By combining advice from several articles [43][74][78][79][80][81][82], criteria for suitability and usability are listed below:

1. Comprehensiveness and relevance: the indicators should cover economic, environmental, social and technical aspects in order to ensure that account is being taken of progress towards sustainability objectives (cf. [44]). The indicators should be relevant in relation to the studied problem and the context of the study (democratic, good stakeholder participation). The indicators should allow grading in relation to sustainability, that is, provide results on the sustainability performance.
2. Practicability: a reasonable number of indicators that are straightforward and possible to use, considering the time frames and resources available for the assessment and which form a practicable set for the purposes of the decision.
3. Applicability: the indicators should be applicable for every alternative under consideration and interpretable. Reference values can facilitate interpretation.
4. Tractability: there should be sufficient reliable data (numerical or qualitative data should be available to enable the estimation).
5. Transparency: the indicators (including criteria/scales) should be easy to understand and chosen in a transparent way, not least to enable stakeholders to clearly identify what is being considered, to understand the criteria/scales used and to propose other criteria for consideration.
6. The indicators should be predictable in response, sensitive and responding to relevant changes or differences of performance.
7. The indicators should not be (strongly) correlated.
8. The indicators should be acceptable from an ethical perspective.

The advice and these criteria have been considered in the MCA method development process. They are referred to in the coming sections and discussed at the end of this article and in Part II. It can be challenging to establish a method with a strong set of indicators in relation to all criteria. The selection process commonly involves different tradeoffs [43,83].

Process Description
The process of establishing the MCA method was participatory (or collaborative, [84]) and iterative, with different actors being involved throughout the process. It was initiated in 2017 by the authors, who created a first version of the MCA method in the form of a presentation dealing with key areas, key questions and indicators to possibly include. This was done based on previous MCA method establishment experiences [54,62,85], an initial literature review of other studies dealing with assessments of bus technologies and discussions. From that point, the method has been developed iteratively in cooperation with several stakeholders and experts, as presented in Table 1, in parallel with extended literature reviews.

Table 1. Actors that have been involved in the multi-criteria assessment (MCA) method development process, more actively as participants (P) in research projects and stakeholders (S) that have been given the opportunity to provide input at meetings and conferences.

| Project Participants (P, p) and Involved Stakeholders (S) | Relevance, Competences | Comment |
| --- | --- | --- |
| Region Östergötland (P), part in BRC | Environmental strategist with long-term experience regarding bus technologies, sustainability issues and public procurement processes. In later stages, an energy and climate strategist was also involved. | Has participated in the whole MCA establishment process and has been part of several workshops dealing with indicators, scales and results. |
| Other regions (p) | Long-term experience regarding public bus transports and other relevant issues. | The regions of Gotland, Kalmar and Jönköping participated in later stages of the process (the last two years). They, for example, provided input at a dedicated workshop. |

During 2018, the then-existing version of the MCA method was further developed by students supervised by the authors: first by one master's thesis student (30 ECTS) and then by a group of four students taking a project course (12 ECTS per student) (further information in the acknowledgements). The students provided further knowledge on relevant indicators and data, and their work involved interactions with several of the actors shown in Table 1 and others within BRC. After the student contributions, the MCA establishment process continued with further literature studies and fine-tuning of the set of indicators. The project participants listed in Table 1 provided input during a dedicated workshop, where the indicators were discussed, and they have regularly been involved in the process via BRC project meetings. In the final stages of the process, when the MCA method was established and had been applied, the pre-final drafts of both articles (Parts I and II) were reviewed by a group of 10 selected experts. They were selected to include competency in sustainability systems analysis (specifically, multi-criteria analysis, LCA, LCC and energy analysis); transportation, especially regarding fuels other than biomethane, to complement the competency profile of the authors; sociotechnical systems; and environmental innovations.

The Assessed Bus Technologies
Seven different kinds of technologies are used in Swedish public buses: biomethane, diesel, electricity, ethanol, FAME, HVO and natural gas [86]. They formed the basis for the development of the MCA method and are the alternatives assessed in Part II.
The FAME used is almost exclusively produced from rapeseed [87], meaning that the focus was on RME (Rapeseed Methyl Ester). Regarding electric buses, both slow-charging options, which use less infrastructure but larger batteries, and fast-charging options, which use more infrastructure but smaller batteries, have been considered. These technologies/fuels are further specified in Part II. The assessment has focused on 12-m-long Euro VI buses.

Selection of Key Areas, Indicators and Scales for Assessment
Several key areas were selected for inclusion in the MCA method, considering what issues were focused on in the relevant literature regarding transport and sustainability, the specific technologies and other relevant contexts [83,88]. The participants and stakeholders influenced the outcome in different ways. The participating regions were central, providing input on what is characteristic of a "sustainable" or resource-efficient bus technology. To be able to make assessments for each key area, several indicators were selected and scales were defined for assessment. This process was iterative, where indicators have been, for example, suggested and discussed, then kept, revised, combined or discarded; the process was similar for the scales. In some cases, already-existing indicators and scales for assessment have been used, while other indicators and scales have been defined more from scratch.
For each indicator, we defined five-step scales using quantitative intervals or qualitative descriptions, ranging from very poor to very good. In some cases, when it was not seen as reasonable to use five steps, a three-step scale was used in which the levels poor and good were removed. In addition, a simple three-step scale was used to indicate the uncertainty of the assessor: "*" referred to high uncertainty (not certain), "**" referred to some uncertainties and "***" referred to low uncertainty (rather certain).
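The scales described above can be sketched as a small data structure. This is only an illustrative sketch: the names `Grade`, `Certainty`, `Assessment` and `THREE_STEP`, as well as the numeric values attached to the levels, are assumptions introduced here and are not part of the published method.

```python
from dataclasses import dataclass
from enum import IntEnum


class Grade(IntEnum):
    """Five-step scale from the text; the numeric values are illustrative only."""
    VERY_POOR = 1
    POOR = 2
    SATISFACTORY = 3
    GOOD = 4
    VERY_GOOD = 5


class Certainty(IntEnum):
    """Assessor certainty, shown as one to three asterisks."""
    HIGH_UNCERTAINTY = 1  # "*"
    SOME_UNCERTAINTY = 2  # "**"
    LOW_UNCERTAINTY = 3   # "***"


# Levels that remain when an indicator uses the reduced three-step scale.
THREE_STEP = (Grade.VERY_POOR, Grade.SATISFACTORY, Grade.VERY_GOOD)


@dataclass(frozen=True)
class Assessment:
    indicator: str
    grade: Grade
    certainty: Certainty
    three_step: bool = False  # True when "poor" and "good" are not available

    def __post_init__(self):
        if self.three_step and self.grade not in THREE_STEP:
            raise ValueError(
                f"{self.indicator}: {self.grade.name} is not on the three-step scale"
            )

    def label(self) -> str:
        """Render the grade with its uncertainty marker, e.g. 'Good **'."""
        text = self.grade.name.replace("_", " ").capitalize()
        return f"{text} {'*' * self.certainty}"
```

Keeping the grade and the uncertainty marker as separate fields mirrors the text, where the assessor's uncertainty is reported alongside the grade and is not folded into it.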

Additional Steps in the MCA Process
The previously described general MCA method (see Section 2) has been followed to a large extent, although we have not conducted any weighting (Step 4 is dealt with in the concluding discussion). The steps concerning data collection (3), assessment and analysis (5) and presentation and interpretation of results, recommendations (6) are mainly described in the associated Part II article, where the MCA method is applied.

The MCA Method for Sustainability Assessment of Bus Technologies
An overview of the established MCA method, including key areas, key questions and indicators, is provided in Table 2. The four key areas are based on the dimensions of sustainable development and are in line with Oltean-Dumbrava et al. [75], Foxon et al. [74] and others. In the following sub-sections, the different areas are presented with their indicators and scales for assessment, which are introduced and motivated. Generally, the indicators and scales have been selected and defined to focus on aspects where the bus technologies perform differently, that is, excluding issues where they perform similarly, which is clarified in the following descriptions and in the concluding discussion.

Technical Performance
For any bus technology, it is a key requirement that it works well from a technical perspective [88], being able to provide the desirable transport function and additional services during the contracted period (commonly around 10 years in Sweden). In cases where public organizations take part in technical development projects, this could differ and technical indicators can then be reformulated, deprioritized or removed. Regarding comfort, accessibility and security, the standard Bus Nordic (see Bus Nordic, Common Nordic bus procurement requirements, version 2018) is commonly used, which includes predefined requirements that each service provider must fulfill (cf. [89]). As these requirements are the same for any technology, they have not been included in the MCA method.
Two technical indicators were selected. The first indicator concerns technological maturity (or technological reliability or readiness, for example, see Reference [90]). This indicator was incorporated to assess the stage of development and implementation of bus technologies. Generally, a technology can be more or less developed, ranging from early research stages to mature solutions that have been used in society for many years. Newer technologies are prone to come with technological challenges, both directly related to their function and related to less established support networks [91]. It is thus considered favorable if a technology is well-established, both nationally and internationally. Table 3 presents the qualitative scale for the indicator technological maturity. It has been assumed that the technology is the same, or similar, during the contract period. In the context of long-term assessments, however, a new technology can have larger development potential [91].

Satisfactory
Relatively new technology, commercially implemented and proven to work well in some cases, in conditions similar to the national/regional context. More limited support networks compared to the levels of good and very good. Some uncertainties regarding the performance, for example, regarding operational availability, energy use, replacement of critical components or needs of maintenance.

Poor
New technology, tested in several cases or commercially implemented in some cases with different conditions from the national/regional context. Very limited support networks. Large uncertainties regarding the performance, for example, regarding operational availability, energy use, replacement of critical components or needs of maintenance.

Very poor
Possibly coming technology but not developed enough to be seen as a reasonable alternative from a technical perspective.

The second technology indicator is oriented towards the daily operational duties. For efficient use, it is relevant to consider if and to what extent necessary stops influence the ability to perform the duties. Therefore, we included the indicator daily operational availability, considering the range and time for refueling or recharging; see the scale in Table 4.

Table 4. Scale for the indicator "daily operational availability."

Very Good
Refueling or recharging is conducted during the night (or during another period with low demand) and results in a vehicle range that is sufficient to carry out the daily duties without any additional stops for refueling/recharging.

Satisfactory
Refueling or recharging is conducted during the day (or during another period with relatively high demand) but without significant negative impact on the wanted timetables or any need for additional vehicles due to refueling/recharging.

Very Poor
Refueling or recharging is conducted during the day (or during another period with high demand), significantly influencing the wanted timetables negatively or leading to needs of additional vehicles due to refueling/recharging.
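As a rough illustration, the three qualitative descriptions in Table 4 can be read as a simple decision rule. The sketch below is one possible reading, not the authors' procedure; the function names, the parameters and the idea of deriving the need for a daytime stop from a range-versus-distance comparison are all assumptions made for the example.

```python
def needs_daytime_stop(range_km: float, daily_distance_km: float) -> bool:
    """Hypothetical helper: a daytime stop is needed when one tank/charge cannot cover the day."""
    return range_km < daily_distance_km


def grade_availability(daytime_stop: bool,
                       timetable_impact: bool = False,
                       extra_vehicles: bool = False) -> str:
    """Map the three Table 4 descriptions to a level on the three-step scale."""
    if not daytime_stop:
        # Refueling/recharging fits in the night or another low-demand period.
        return "Very good"
    if timetable_impact or extra_vehicles:
        # Daytime stops disturb timetables or require additional vehicles.
        return "Very poor"
    # Daytime stops are needed but cause no significant disturbance.
    return "Satisfactory"


# Hypothetical example: a bus with a 250 km range on a 300 km daily duty.
stop = needs_daytime_stop(range_km=250, daily_distance_km=300)
print(grade_availability(stop))                         # Satisfactory
print(grade_availability(stop, timetable_impact=True))  # Very poor
```

In practice, judging whether a timetable impact is "significant" remains a qualitative call by the assessor; the booleans here only record that judgment.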

Economic Performance
All procurers of public transport need to consider the costs involved; they should try to be cost-efficient and maximize the transport service level within set budget frames [92]. This key area focuses on costs directly linked to the vehicles and their use and costs related to infrastructure. Thus, the perspective is relatively narrow and short term, not broader socio-economic and long term. However, the broader environmental performance assessment compensates for this to some extent.
The first economic indicator is the total cost of ownership (TCO), including costs for purchasing the bus, fuel and maintenance/repair and considering the vehicles' residual value. Many sustainability assessments consider the costs; some use TCO or life-cycle costs (LCC), which are similar when the LCC focuses on direct costs but differ when the LCC also considers indirect costs and externalities, while others focus on some of these costs or are not that transparent regarding what has been included [30,83,88,93]. Even if the TCO in focus is most often mainly borne by the service provider (depending on the type of contract; there might, for example, be indexed costs (i.e., not fixed costs) where the regions must pay more in case of increasing fuel prices), it strongly influences the price offered by the providers and thus influences the procurer's budget. Table 5 shows the quantitative scale for the indicator total cost of ownership, grading each bus technology's cost against the average/median cost (of all the studied technologies in our case, of all offerings in a practical case). The percentage levels are loosely based on the costs according to Ecotraffic [94], which compared Euro VI buses of all the relevant technologies apart from electric buses. However, the scale could be adapted and based on offered price differences in a procurement rather than using these percentages. There are also other options for such an indicator/scale, for example, to specify absolute cost intervals (e.g., in SEK/km) for each level.

Table 5. Scale for the indicator "total cost of ownership."

Very good
The costs are at least 15% lower than the average/median cost.

Good
The costs are at least 5% lower than the average/median cost but not lower than 15%.

Satisfactory
The costs are average, within a range of 5% from the average/median cost.

Poor
The costs are at least 5% higher than the average/median cost but not higher than 15%.

Very poor
The costs are at least 15% higher than the average/median cost.
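The grading in Table 5 can be illustrated with a short sketch. It assumes that the reference cost is the median of the supplied costs and that a cost lying exactly on a 5% or 15% boundary falls into the more extreme level, since the table wording leaves that open. The function names and all cost figures are hypothetical.

```python
from statistics import median


def tco(purchase, fuel, maintenance, residual_value):
    """Total cost of ownership: purchase, fuel and maintenance/repair minus residual value."""
    return purchase + fuel + maintenance - residual_value


def grade_tco(cost, reference):
    """Grade a cost against a reference (average or median) cost using the Table 5 bands."""
    deviation = 100.0 * (cost - reference) / reference  # percent; negative means cheaper
    if deviation <= -15:
        return "Very good"
    if deviation <= -5:
        return "Good"
    if deviation < 5:
        return "Satisfactory"
    if deviation < 15:
        return "Poor"
    return "Very poor"


def grade_all(costs):
    """Grade every technology against the median of the supplied costs."""
    reference = median(costs.values())
    return {name: grade_tco(cost, reference) for name, cost in costs.items()}


# Hypothetical costs in an arbitrary unit; the median is 100.
example_costs = {"A": 80.0, "B": 92.0, "C": 100.0, "D": 110.0, "E": 120.0}
print(grade_all(example_costs))
```

With these invented figures, technologies A to E land on very good, good, satisfactory, poor and very poor, respectively. Swapping `median` for `statistics.mean` gives the average-based variant that the table also allows.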
The indicator need for investments in infrastructure concerns the level of investment a technology requires. These investments can be related to new infrastructure but also to maintenance or expansion of existing infrastructure. It is an indicator that has also been part of several similar studies [88,93]. These costs have been separated from the total cost of ownership, as they normally (at least in a Swedish context) are not borne by the bus service provider as a part of the publicly procured bus service contract. In this study, these investments may concern, for example, infrastructure for storage and transport of fuels, for refueling and for recharging. However, we have not included infrastructure or facilities for the production of fuel/electricity or any parts upstream from the production. Table 6 shows the qualitative three-step scale. In the assessment, it can be logical to compare the costs for infrastructure with the total cost of ownership when deciding what is minor and significant (or in-between).

Table 6. Scale for the indicator "need for investments in infrastructure."

Value Scale Definition

Very good
No investments in infrastructure are needed for this technology.

Satisfactory
Minor but acceptable investments in infrastructure are needed for this technology.

Very poor
Significant investments in infrastructure are needed for this technology.
Cost stability concerns the risk of significantly increased or unexpected costs during the contract period but also the chance of lower costs. Such increased costs can, for example, be due to scarce resources and increased demand for some fuels or to unplanned repairs. The costs of public bus transport in Sweden have increased significantly since 2010 [95]. This is partly explained by costs that we have not considered, like salaries, as they are assumed to be independent of bus technology, but costs related to vehicles and fuels have also contributed to this development [16,95]. Depending on the contract between the public transport authority and the service providers, changed costs have different implications. The focus of the indicator is on changed costs borne by the public transport procurer; the scale is shown in Table 7. In many cases, it may be seen as uncontroversial to avoid or lower the economic risks, in line with the suggested indicator, and it is in any case wise to learn about existing risks. Nevertheless, there may be negative implications if such an assessment results in excessively low margins for the service providers or leads them to take an unreasonable share of the risks (in cases where it is difficult to avoid/lower risks and still provide the services wanted) [96]. In addition, chances of high profits for the service providers may be disliked by taxpayers (seen as inefficient management of public funds, although this depends on how profits are used). The scale could be complemented to cover such cases as well.

Table 7. Scale for the indicator "cost stability."

Very good
The costs related to vehicles or fuels are expected to significantly decrease during the time period of the service contract. There is a good chance of costs significantly below the expected budget level.

Good
The costs related to vehicles or fuels are expected to slightly decrease during the time period of the service contract. There is a good chance of costs below the expected budget level.

Satisfactory
The costs related to vehicles or fuels are expected to remain stable during the time period of the service contract. There is a good chance of costs in line with the expected budget level.

Poor
The costs related to vehicles or fuels are expected to slightly increase during the time period of the service contract. There is a risk of costs above the expected budget level.

Very poor
The costs related to vehicles or fuels are expected to significantly increase during the time period of the service contract. There is a risk of costs significantly above the expected budget level.

Environmental Performance
In line with the introduction, Swedish regional transport authorities commonly want bus technologies with a favorable environmental performance. There are many possible impact categories [97] or indicators [79,98] that could be considered, of which a few were chosen to reach a reasonable total number.
The first indicator deals with non-renewable primary energy efficiency from a well-to-wheel perspective, in line with the reasoning of Gustafsson et al. [99]. The Swedish Association of Local Authorities and Regions provides yearly statistics for comparing different regions' performance [100], in which the energy measure focuses on vehicle energy use. However, this narrow systems perspective does not account for important energy and environmental issues, as 'efficient buses' may be associated with high energy use and environmental impact (cf. [30]). The scale is defined in Table 8, based on the findings of Gustafsson et al. [99] regarding the energy use of buses in Swedish regions.

Table 8. Scale for the indicator "non-renewable primary energy efficiency."

Value Scale Definition

Very good
The bus technology uses less than 1 kWh of non-renewable primary energy/vehicle kilometer.

Good
The bus technology uses between 1 and 1.5 kWh of non-renewable primary energy/vehicle kilometer.

Satisfactory
The bus technology uses between 1.5 and 2 kWh of non-renewable primary energy/vehicle kilometer.

Poor
The bus technology uses between 2 and 2.5 kWh of non-renewable primary energy/vehicle kilometer.

Very poor
The bus technology uses more than 2.5 kWh of non-renewable primary energy/vehicle kilometer.
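
As a minimal sketch, the scale in Table 8 could be applied as below. The function name is hypothetical, and because the table uses "between" for adjacent intervals, the assignment of the shared boundary values (1.5, 2 and 2.5 kWh) to the better level is an assumption made here for illustration.

```python
def grade_primary_energy(kwh_per_vehicle_km):
    """Grade well-to-wheel non-renewable primary energy use (Table 8).

    Assumption: values exactly on a shared boundary (1.5, 2.0, 2.5)
    are assigned to the better of the two adjacent levels.
    """
    if kwh_per_vehicle_km < 1.0:
        return "Very good"
    if kwh_per_vehicle_km <= 1.5:
        return "Good"
    if kwh_per_vehicle_km <= 2.0:
        return "Satisfactory"
    if kwh_per_vehicle_km <= 2.5:
        return "Poor"
    return "Very poor"
```

The input is expected to be a well-to-wheel figure, not vehicle energy use alone, in line with the systems perspective argued for above.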
Greenhouse gas (GHG) emission savings is an issue of great environmental importance [101] that receives much attention from decision makers [102] and is commonly included in similar MCA methods, for example, References [83,88,93], and other relevant studies. The indicator deals with the amount of carbon dioxide equivalent emissions (g CO2-eq) that can be reduced in comparison with a baseline diesel bus technology reference. As for energy, a well-to-wheel perspective should be used, including all emissions related to the transport service (fuels, vehicles and infrastructure). The scale (Table 9) was formulated with even steps in the range of 0 to 100%. It can be noted that the EU's renewable energy directive (Directive (EU) 2018/2001 of the European Parliament and of the Council of 11 December 2018 on the promotion of the use of energy from renewable sources) requires 60% GHG emission savings for biofuels used from 2015 to 2020, matching the lower level of the interval for good. It is recommended to use results from life-cycle assessments based on ISO 14040 and ISO 14044 [103], that is, with system expansion to include a broad range of relevant climate issues. The LCA methods used for the calculations in the directive above do not account for essential issues, and system expansion, as recommended by the ISO standards, gives more accurate results. Information about GHG emissions is also relevant for getting a rough understanding of other global/regional environmental impact categories, as it indirectly provides information about emissions of NOx and SO2, which are linked to fossil fuel use [54].

Table 9. Scale for the indicator "greenhouse gas emission savings."

Scale Definition
Compared to the diesel bus technology reference (of 1241 g CO2-eq/vehicle kilometer 1), the GHG emissions savings are:

Very good
80% or higher (x ≥ 80%).
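
The savings calculation behind this indicator can be illustrated as follows. The reference value of 1241 g CO2-eq per vehicle kilometer and the thresholds of 80% (very good) and 60% (lower bound for good) come from the text. The thresholds at 40% and 20% are assumptions inferred from the statement that the scale uses even steps between 0 and 100%, and the function names are hypothetical.

```python
DIESEL_REFERENCE_G_PER_VKM = 1241.0  # g CO2-eq per vehicle kilometer, from Table 9


def ghg_savings_percent(emissions_g_per_vkm, reference=DIESEL_REFERENCE_G_PER_VKM):
    """Return well-to-wheel GHG savings relative to the diesel reference, in percent."""
    return 100.0 * (reference - emissions_g_per_vkm) / reference


def grade_ghg_savings(savings_percent):
    """Grade GHG savings.

    Stated in the source: "Very good" at 80% or higher, "Good" from 60%.
    Assumed here (even 20-point steps): 40% and 20% as the lower thresholds.
    """
    if savings_percent >= 80.0:
        return "Very good"
    if savings_percent >= 60.0:
        return "Good"
    if savings_percent >= 40.0:  # assumed threshold
        return "Satisfactory"
    if savings_percent >= 20.0:  # assumed threshold
        return "Poor"
    return "Very poor"
```

A technology emitting more than the diesel reference yields negative savings, which this sketch places in the lowest level.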
Air pollution consists of pollutants that commonly cause health problems in city areas as well as environmental problems, like carbon monoxide (CO), hydrocarbons (HC), nitrogen oxides (NOx) and particles (PM). The transport sector is a significant contributor to local air pollution [105], causing serious negative health effects in many cities worldwide [106]. For simplicity, we have focused on tailpipe emissions, in line with EU emission standards for buses. However, there are also some limited considerations of the broader life-cycle emissions for electricity [107], as the very good level requires 'clean and safe' renewable energy sources, meaning that highly polluting electricity production and production linked to risks of nuclear radiation receive a lower grade [108][109][110]. Furthermore, it is assumed that all bus technologies contribute similar amounts of road wear particles, which are therefore not included. The scale is based on the EU emission standards (Euro V and VI), also considering the latest Swedish legislation concerning low emission zones (SFS 2018:1562). The scale for the indicator air pollution is shown in Table 10.

Table 10. Scale for the indicator "air pollution."

Very good
The buses have no tailpipe emissions AND the electricity is 100% produced from renewable sources with very low health-impacting emissions, like electricity produced from water, wind or solar power, AND the electricity is NOT at all produced from nuclear power (i.e., associated with risks of nuclear radiation).

Good
The buses fulfil the requirements for Low Emission Zone 3 in Sweden, meaning that they: are driven by 100% electricity or fuel cells, OR are driven by gas engines fulfilling the Euro VI requirements, OR are chargeable hybrid vehicles fulfilling the Euro VI requirements.

Satisfactory
The buses fulfil the requirements for Euro VI.

Poor
The buses fulfil the requirements for Euro V.

Very poor
The buses do NOT fulfil the requirements for Euro V.
Noise is a common problem area for many cities, where transports are among the most prevalent sources [111,112], causing disturbances for people and wildlife [113]. There is an abundance of literature dealing with transport-related noise, for example, References [111,114,115], with a relatively large focus on modeled noise or measurements from test environments, which may provide significantly different results compared to real-life situations. Noise levels are important for understanding health impacts, but so are psychological or psychophysiological factors for the people concerned [116]. Nevertheless, it was still found reasonable to include noise levels in the MCA method. So-called A-weighted sound levels (denoted as dBA units) are commonly used to adjust the measurements to the sensitivity of the human ear, with a focus on the range between 1 and 4 kHz. The dBA unit is widely used in assessments of transportation-related noise and is commonly assumed to be the default unit [117]. Larsson and Holmes [118] studied noise related to different bus technologies and chose to use the dBA scale, with the motivation that it is most commonly used in socio-economic noise studies. However, although they did not use the dBC scale (but provided relevant dBC data), they acknowledge that it is also very relevant for paying more attention to low-frequency noise, which can be of relevance for transportation and several other sources (the difference between the dBC and dBA sound levels is used for information about low frequencies). It seems to be widely supported that transport noise in the range from about 10 Hz to 200 Hz can cause indoor noise annoyance [119,120], while higher frequencies contribute more to outdoor problems [121]. It should also be noted that differently defined maximum (peak) and equivalent (average) levels are used in regulations and noise studies, such as socio-economic studies of transport noise [118,122]. According to Andersson et al. [123], based on a study of some Swedish regions, data on 24-h average sound levels are important for socio-economic noise estimations, while maximum sound levels are not as relevant. Relevant EU noise level requirements are established by EU regulation 540/2014, with maximum levels (as dBA from the vehicle, to be measured in accordance with the latest amendments; see EU regulations/amendments from 2016 and onwards) set for buses (focusing on vehicle type M3, which holds more than eight passengers and has engine power greater than 250 kW) for the years 2016-2020 (80 dBA), 2020-2024 (78 dBA) and after 2024 (77 dBA). Braun et al. [124] have studied different vehicle noise sources and conclude that the four major ones are the engine, intake system, exhaust system and tire/road system. Assuming similar weight and speed for the assessed bus technologies, we have focused on noise from the vehicle, that is, excluded noise from tires or roads in the MCA method. However, it is important to note that for speeds exceeding 50 km/h, noise from the tires/road dominates [118]. Thus, in areas where buses commonly have a speed around 50 km/h or higher, it is less important to focus on noise from the engines, meaning that the choice of bus technology is not of great importance regarding noise annoyance. The scale was formulated considering all this information, as shown in Table 11, with levels chosen in relation to existing regulations.

In addition to the previously mentioned environmental indicators, bus technologies can have other local/regional impacts on land and aquatic environments, which are therefore included as an indicator. Production of fuel, electricity, vehicles and infrastructure may involve both positive and negative (local) effects, which can vary significantly depending on the choice of technology. For example:

• Fossil fuels may involve a wide range of negative, local environmental impacts [125][126][127].
• Biofuels or electricity produced from food waste, aquatic biomass or other relevant feedstocks may involve recycling of nutrients and reducing eutrophication [128].
• Biofuels produced from straw may lead to too low soil organic carbon levels, being negative regarding soil fertility, while there are also several examples leading to improved soil fertility [62,129].
• Feedstock production may be linked to a positive and/or negative impact on species and ecosystems nearby, for example, ecological farming favorable for biodiversity in contrast to farming involving pesticides [130].
A qualitative scale was chosen for the indicator local/regional impact on land and aquatic environments, shown in Table 12.

Table 12. Scale for the indicator "local/regional impact on land and aquatic environments."

Scale Definition
Focusing on Local/Regional Impact on Land/Soil, Water Resources and Aquatic Environments, Biodiversity/Ecosystems and Other Relevant Local/Regional Impacts that Are Not Clearly Covered by Any Other Indicator:

Very good
The bus technology is found to be very beneficial from a local/regional environmental perspective: -There are significant positive environmental effects AND -There are no significant negative environmental effects

Good
The bus technology is found to be beneficial from a local/regional environmental perspective: -There are relevant positive environmental effects, together judged to be clearly more important than the negative effects AND -There are some negative (but still acceptable) environmental effects

Satisfactory
The bus technology is found to have no or neutral effects from a local/regional environmental perspective: -There are no significant environmental effects OR the negative and positive effects are judged to be of similar importance (where the negative are acceptable)

Poor
The bus technology is found to be negative from a local/regional environmental perspective: -There are relevant negative environmental effects, together judged to be clearly more important than the positive effects AND -There are some positive environmental effects

Very poor
The bus technology is found to be very negative from a local/regional environmental perspective: -There are significant negative environmental effects AND -There are no significant positive environmental effects

Social Performance
In addition to the economic and environmental performance, social aspects have been considered.
Energy security is defined by the International Energy Agency (IEA) as "the uninterrupted physical availability at a price which is affordable" (cf. [131,132]). In recent decades there has been a re-emerged interest in energy security, driven by rising demand, disrupted supplies and the push towards de-carbonization [133] and in both Europe [134] and Sweden [135], issues regarding energy security receive significant attention. The focus of this indicator is on the physical availability of primary resources cf. [136] and on where the production of fuels and electricity takes place since fuel and electricity prices are covered by the economic performance indicators. A simple geographical perspective has thus been used to consider to what extent different bus solutions contribute to energy security on local/regional, national and international levels. The scale is presented in Table 13. As Sweden is an EU member country influenced by the EU energy security strategy [137], the EU level was set as the satisfactory level also including the closely connected Schengen area in order to include countries like Norway, which Sweden has close links and long-term good relations with. The resource considerations also include resources for electricity production (such as coal, oil, energy crops, etc.) where renewable sources such as wind, water or solar power are considered to be local to the electricity production site. Table 13. Scale for the indicator "energy security."

Value Scale Definition

Very good
More than 90% of the used fuel or electricity is produced within the actual region 1 , based on resources from this region.

Good
More than 90% of the used fuel or electricity is produced within the nation, based on resources of national origin.

Satisfactory
More than 90% of the used fuel or electricity is produced within countries that are geographically close to the nation, that the nation has long-term and stable business relations with, based on resources from those countries.

Poor
More than 90% of the used fuel or electricity is produced in countries that are not geographically close to the nation but which the nation has long-term and stable business relations with, based on resources from those countries.

Very poor
More than 90% of the used fuel or electricity is produced within countries that are not geographically close to the nation and which the nation does not have long-term and stable business relations with, based on resources from those countries.

1 Referring to the term region as used in Sweden, corresponding to county. Other areas could be used.
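
One way to operationalize the geographical scale in Table 13 is sketched below. The origin category names are hypothetical labels for the five levels, and two simplifying assumptions are made that the table itself does not spell out: each share of fuel or electricity is classified by a single origin category covering both production and resources, and the levels are read cumulatively, so that energy from the region also counts toward the national level, and so on.

```python
# Origin categories ordered from closest to most distant (hypothetical labels).
LEVELS = [
    ("region", "Very good"),
    ("nation", "Good"),
    ("nearby_partner", "Satisfactory"),   # geographically close, stable relations
    ("distant_partner", "Poor"),          # not close, stable relations
    ("distant_other", "Very poor"),       # not close, no stable relations
]


def grade_energy_security(shares, threshold=0.90):
    """Grade energy security from origin shares given as fractions summing to 1.

    Assumption: cumulative reading of Table 13, returning the best level
    at which more than `threshold` of the energy is covered.
    """
    cumulative = 0.0
    for origin, grade in LEVELS:
        cumulative += shares.get(origin, 0.0)
        if cumulative > threshold:
            return grade
    return "Very poor"  # fallback if the shares do not reach the threshold
```

For instance, a fuel that is half regional and 45% from elsewhere in the nation would be graded "Good" under this reading, since more than 90% is of national origin.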
This geographical orientation is also related to employment, even if we decided not to explicitly include it. For example, regional resource management (or production) and regional production of fuels/electricity will likely be linked to regional employment. In addition, Ekener-Petersen et al. [138] found that both fossil fuels and biofuels can cause significant negative social impacts and emphasized the need for social performance requirements in procurements of fuels. In this context, it is important to consider the country of origin, since there can be important differences concerning human and labor rights and occupational health. There are databases such as the Social Hot Spots Database that can be used for these kinds of assessments [138]. However, we chose not to explicitly include such considerations, as the performance may vary significantly within countries and sectors and between different production sites. Nevertheless, a recent report by the International Trade Union Confederation (ITUC) [139] indicates that EU countries, in general, are performing relatively well regarding human rights. Thus, indirectly, the above-suggested scale takes social performance into account in areas beyond those mainly targeted.
Public organizations like regions and municipalities are important actors for several sociotechnical systems, like the systems for the management of energy, transportation, waste and water [140]. As there are commonly important links between such systems involving public transportation [141], a second indicator was included to consider if and how the choice of bus technology influences the mentioned sociotechnical systems-the indicator sociotechnical system services (see Table 14). This assessment can include links to sociotechnical systems outside the specific area studied (such as a certain region). Links to agricultural systems are not included here as they are covered in the indicator local/regional impact on land and aquatic environments via involving nutrient management and soil impact. Table 14. Scale for the indicator "sociotechnical system services."

Very good
The bus technology is linked to regional/municipal sociotechnical systems of waste management, wastewater management and/or energy and significantly facilitates their function and/or economic viability.

Satisfactory
The bus technology is not linked to regional/municipal sociotechnical systems of waste management, wastewater management and/or energy or does not significantly influence their function and/or economic viability.

Very poor
The bus technology is linked to regional/municipal sociotechnical systems of waste management, wastewater management and/or energy and is significantly problematic regarding their function and/or economic viability.

Concluding Discussion
There are several technologies for transportation, ranging from fossil systems to biofuels, electricity and other future options. There is a demand for modern, well-functioning and cost-effective technologies with superior environmental or sustainability performance to reduce climate impact, air pollution, resource depletion and other urgent challenges. On the one hand, the now-existing palette of renewable technologies brings great opportunities for the continued marginalization of fossil fuels and improved sustainability performance [104]. On the other hand, it is difficult to know what technology is preferable, that is, to systematically compare the different alternatives. This article deals with the assessment of transport technologies, focusing on the public procurement of bus transports by Swedish regional authorities, which have used 'green or sustainable public procurement' (GoSPP) to achieve a major transition of the bus fleet but also, to some extent, of other types of vehicles [142].
Within the field of sustainability assessments, researchers emphasize the existence of implementation gaps and argue for a need to focus more on implementation in central decision-making processes [43,44]. Similarly, prior research on GoSPP and the authors' practical experiences, point out a need for improved knowledge and supportive methods on how environmental/sustainability assessments can support public procurement processes. Our study addresses these challenges via the establishment of a multi-criteria assessment (MCA) method for assessments of public bus technologies' sustainability. The multi-criteria approach was chosen as we see sustainability assessment of bus technologies as a complex problem and acknowledge value pluralism.
To guide the MCA method development process and for reviewing the outcome, a literature review on advice for effective and efficient sustainability assessments was conducted (see Section 2.2). Below, our method development process and the resulting MCA method are discussed in relation to keywords and criteria (in italic) from this review. The process of establishing the MCA method has been described in detail for transparency, providing more information on management, actors and indicators than most studies reviewed. The process has been iterative and participatory, as it engaged staff from a wide range of actors. The method results from a process governed by this study's researchers based on literature reviews and input from project participants, other stakeholders and a few students. This mix of actors and sources has influenced the key areas, indicators and scales. Thus, it can be difficult to link a certain part of the method to a specific actor, source or part of the process, which limits the transparency. All the involved actors have importantly contributed to the development. Of course, an enlarged group would have brought a broader competence base, which could have improved the resulting MCA method [43]. A different composition of the group would presumably also have led to a different method [62,63]. We have tried to establish an MCA method that works well for all the assessed fuels and electricity to cover all technologies' strengths and weaknesses. However, it may have caused bias that the project has been based within the Biogas Research Center and involved people with special competence and interests in biogas solutions. Still, several participating organizations, such as energy utility companies, regions and municipalities, have broader interests. The truck and bus manufacturer can be described as a more neutral actor in this respect, as this company sells trucks and buses designed for all the covered fuels and electricity. 
In addition, the final review round, for example, involving experts on other transport technologies, was intended to reduce potential biogas/biomethane bias. In relation to the four categories of effectiveness presented by Waas et al. [43], this section has dealt with several of the procedural components and normative components since the participants learnt from the process. Substantive and transactive effectiveness is discussed in Part II.
For the method to be useful and efficient in relation to the aim, it is important to consider the choice of indicators and scales from several different angles. Intentionally, the method consists of both quantitative and qualitative indicators, as the purpose has been to focus on the essential issues rather than on what can be easily measured or quantified [43,143]. In this respect, the method is different from many other assessment methods for similar transport contexts, which are more quantitatively oriented [24,83,144]. As researchers, we wanted to avoid weighting, otherwise commonly applied in MCA/MCDM (Multi-Criteria Decision Making) projects, as this is a more 'political step' than other parts of the process and as the importance of the indicators can vary between different regions. For example, noise may be highly relevant in large city contexts but less important in the countryside. We also wanted to contribute to broadened requirements in public procurement processes [145] without including too many indicators (practicability). This is a balancing act, as more comprehensive methods will cover additional sustainability aspects but can involve overcomplicated assessments and difficulties concerning interpretation and decision making. The method consists of 4 key areas and 12 indicators. The relatively low number of indicators, without any decision trees or weighting, should make it easier to understand the logic and facilitate the practical assessment, interpretation and visualization. However, it is certainly possible to further simplify the use, for example, by providing a tool as exemplified by Lindfors and Ammenberg [23]. We believe that the method is broad enough to cover many essential areas, thereby helping in avoiding unintended problem shifting [146,147]. We have tried to reduce the correlation between the indicators but there are (certainly) overlaps, for example, environmental impacts related to total costs via taxes and similar issues. 
This is difficult to avoid. Several of the criteria for effective and efficient assessments are discussed in Part II, as they are closely linked to the actual assessment process and results-for example, the time needed for assessments, data availability and quality and clarity of results. One should also add that good management of uncertainties is required (part II in Reference [148]).
Reviewing the method, it is also relevant to focus on essential issues that may not be well covered by the chosen indicators. An overarching comparison with other MCA methods or sustainability assessments related to transportation shows many similarities. This is quite natural, as they influenced our choices. However, even if the same or similar indicators are included, the level of detail may vary, particularly related to scales for assessment and data collection. For example, our method includes a relatively simple and qualitative scale for infrastructure investments (cf. [93]), while others have a more quantitative and detailed approach (cf. [149]). Several studies with a similar focus have a broader consideration of social issues (cf. [26,73,88]), for example, dealing with availability/mobility and comfort. However, for the buses procured by the regions, there will not be any important differences in this respect, no matter the technology/fuel, why it was not seen as relevant to focus on (the procurement is set to match a pre-decided function or level of service that is not supposed to be influenced by choice of service provider or transport technology). The method could have been complemented with a larger focus on safety (cf. ibid.), considering risks related to, for example, vehicles, infrastructure, the production chain and energy systems. However, it may be problematic to assess risk levels, especially for new technologies, due to a lack of data. In the current version, risks related to nuclear power are indirectly included via the indicator "air pollution," due to the requirements for "very good," which is maybe not fully logical. 
During the course of the project, discussions also focused on whether and how to include the following:
• the use of scarce natural resources and use of primary or secondary resources and implications
• flexibility related to the existence of back-up fuels (e.g., biodiesel-diesel; biomethane-natural gas)
• more detailed health effects and costs.
However, for various reasons, we decided not to directly include such indicators, which can be added in other cases or in improved versions, although some of these areas are indirectly covered, such as health costs via air pollution. Public acceptance was also discussed but left out as it was indicated by the involved actors that most customers do not care about the actual technology as long as it is renewable, which was based on previous customer surveys.
Although the method has been developed in a Swedish context, it addresses a challenge of general relevance-how to conduct sustainability assessments of transport technologies. The 4 key areas and 12 indicators would probably be relevant for assessment of buses across the globe. We find the indicators dealing with technical and economic performance to be generally applicable, while some adjustment can be reasonable for a few of those dealing with environmental and social performance: for example, to adapt the levels of the scales to fit local conditions and available technology (such as levels of energy efficiency, air pollution and noise) and adjust the scale on energy security considering relevant political and trade alliances and so forth.
Finally, we would like to stress that methods such as the one proposed can be used to find the best future transport technology or fuel but due to limited raw material and production potentials and other issues, we need a smart combination of several renewable fuels. Thus, it may be wise to also study and focus on what combination of technologies is most efficient.