Sustainable Urban Transportation Criteria and Measurement—A Systematic Literature Review

Abstract: Sustainable transportation plays a key role in social participation and climate change. However, no universally accepted set of criteria for tracking the progress of urban sustainable transportation projects currently exists; one reason for this is the absence of a standardized lexicon for sustainability measurement elements. Therefore, this paper conducts a systematic literature review and analysis of sustainable transportation criteria using 21 papers from journals listed in the German rating system JOURQUAL3 (JQ3) and published between 2010 and 2020. The paper thus develops a unified vocabulary for sustainability measurement elements that is structured into a hierarchy. The goal (sustainable transportation) presides over the three sustainability dimensions, followed by objectives (e.g., minimization of traffic clogging), criteria (e.g., congestion), and indicators (e.g., cost of traffic congestion). Within the hierarchy, the main criteria for urban multimodal sustainable transportation are identified: 13 social, 11 economic, and 9 environmental. The three main criteria used most in the literature exclusively concern the environment. Future research is recommended to assess the interrelations between the criteria, as their assignment to sustainability dimensions is ambiguous in the existing literature. This paper helps mobility managers when making decisions about urban transportation concepts and when overseeing projects.


Introduction
Transportation carries both benefits and drawbacks for society. On the one hand, mobility, or the ability to move, is a basic human need that enables social participation. Notably, the act of moving, or transportation [1], created 5% of the gross value added and approximately 5% of the jobs in the European Union (EU) in 2016 [2]. On the other hand, growing traffic volumes have led the transportation sector to contribute significantly to overall greenhouse gases (GHGs). The transportation sector's contribution was nearly 25% of all GHG emissions in 2017 in the EU and had increased by more than 28% in absolute terms compared to 1990. By 2017, about 71% of transportation emissions were caused by road transport, with over 60% produced by cars in the EU [3]. Thus, car traffic alone is responsible for almost 11% of all GHG emissions in the EU. This demonstrates how society's prevailing car dependency contributes significantly to high GHG emission rates [4]. As GHGs are responsible for global warming [5], transportation plays a key role in reaching the goal in the Paris Agreement of keeping the global temperature rise below 2 °C [6,7]. Further direct and indirect negative effects caused by the current transportation system include an increase in inequality regarding accessibility, congestion (which harms public health [4]), unsustainable resource use [8], and accidents, among others [9].
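The figure of almost 11% can be retraced from the three shares quoted above. The following is a minimal sketch, assuming that the "over 60%" refers to the car share of road transport emissions and using the rounded lower-bound values from the text; the variable names are illustrative.

```python
# Rounded shares as quoted in the text (EU, 2017)
transport_share_of_ghg = 0.25   # "nearly 25%" of all GHG emissions
road_share_of_transport = 0.71  # "about 71%" of transportation emissions
car_share_of_road = 0.60        # "over 60%"; ASSUMPTION: share of road transport

# Chain the shares to obtain the car share of all GHG emissions
car_share_of_ghg = (
    transport_share_of_ghg * road_share_of_transport * car_share_of_road
)
print(f"Cars: roughly {car_share_of_ghg:.3f} of all EU GHG emissions")
```

Because the car share is given only as a lower bound, the product of about 0.107 is consistent with the "almost 11%" stated above.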
Thus, the promotion of sustainable transportation concepts is of the utmost importance for maintaining mobility, while also managing transportation in an environmentally sound, socially acceptable, and economically efficient manner. Sustainable transportation concepts
RQ1: Based on a systematic literature review, which methodological approach supports the selection of the main criteria that focus on sustainable urban transportation projects?
RQ2: How do the criteria, objectives, and indicators in sustainable transportation differ in their meaning and how can they be brought into a systematic and logical hierarchy?
RQ3: What are the main criteria for each sustainability dimension that are relevant for an urban multimodal transportation project?
By identifying the main criteria, this review aims to operationalize sustainable transportation to assess the sustainability of urban multimodal transportation projects in a target-oriented and practical manner. Furthermore, this paper extends the scope of existing reviews by proposing both a definition for the measurement elements and a novel systematization of the elements into a unified measurement hierarchy. The main criteria include objectives and a specific indicator for conceptualizing what urban sustainable transportation projects should include. The results of this systematic review will be helpful to researchers and practitioners during the planning and the implementation phase of urban multimodal sustainable transportation projects.
The review is structured as follows: Section 2 introduces the concept of sustainable transportation and highlights its importance. The systematic literature review process is presented in Section 3, and the review is conducted in Section 4 and produces a unified hierarchy of terms used in the literature for measuring transportation sustainability as well as a set of criteria. The paper ends with a discussion including limitations and providing suggestions for further research.

Sustainable Transportation
To explain the criteria for sustainable transportation, a general understanding of the terms "sustainable transportation" and "sustainability" must be established. Definitions of sustainability are generic; the best-known definition was stated in the Brundtland report [32]: Sustainability is the preservation of human life in the future [33]. This definition's lack of practical applicability spurred the development of more operational definitions. The majority of these divide the concept into economic, social, and environmental dimensions [34]. Intersections exist between these three dimensions, including social cohesion (socio-economic), economic efficiency (environmental-economic), and environmental responsibility (social-environmental) [35]. The integration of all the described elements is called sustainability [36], and none of the three dimensions should predominate [37,38]. Although some authors have argued that the environmental dimension, which comprises planetary boundaries and, therefore, also holds the preconditions for the social dimension [39], is the most important, within this study, sustainability is understood as the equal balance of criteria regarding social factors, the environment, and the economy.
Sustainable transportation also lacks a universally accepted definition. Nonetheless, most definitions still integrate social, environmental, and economic aims and maintain the importance of all three [12]. For enhancing the operationality of sustainable transportation, Banister developed the best-known concept [11]. The elements of his concept are listed below and are complemented by statements from other relevant authors as follows:
• The encouragement of a modal shift [1,4], which means that car travel needs to be reduced and more environmentally sound transportation modes such as bicycles and public transportation should be reinforced [8,36,40]. Shifting is encouraged by improving the cycling and walking infrastructure, promoting multimodality, limiting car parking spots, and imposing higher fees for the use of roads [41]. Hence, it is often the result of certain other mobility measures [42].
• A reduction in the need to travel to reduce the number of trips as well as a reduction in the distance travelled per trip [4]. The measures to avoid traffic are linked to urban planning [41].
• Increased efficiency levels for transportation systems [4]. An improvement is reached by introducing shared vehicle ownership and low emission vehicles [41].
• Digitalization, such as smart applications, or mobility as service solutions [43] that present various intermodal offers and multimodalities for passenger transportation [1]; this means that a bundle of mobility options is offered to consumers to choose from [44], encouraging a modal shift away from automobile use.
These measures can be integrated into the Avoid-Shift-Improve (ASI) framework for a sustainable mobility paradigm [41]. Different dimensions of mobility exist, such as urban [45] and rural. This paper analyzes the urban dimension and focuses on multiple transportation modes. Therefore, other aspects of sustainable transportation, such as infrastructure planning, are beyond this paper's scope and are not included in this overview.

Materials and Methods
A systematic literature review was conducted to select and analyze the papers. This is the most widely known type of review. It systematically and transparently searches, evaluates, and synthesizes the evidence in research, often by the use of specific guidelines [46].
To this end, the method primarily follows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines (for more information, see [47]) to ensure scientific rigor, as these have already been used in over 60,000 reports [48]. The PRISMA steps for a systematic literature review are identification, screening, checking for eligibility, and the inclusion of selected papers [47].
Since the PRISMA guidelines were originally developed for reviewing health interventions [48], elements from a systematic review process developed by Centobelli et al. [49] are further included. Their process is based on suggestions for applications in social sciences and business administration and adopts suggestions by Pittaway et al. [50], Petticrew and Roberts [51], and Easterby-Smith et al. [52]. More specifically, the division of the methodology into a review phase and an analysis phase and the steps of "material comprehensive research" and "selection of papers" are adopted from Centobelli et al., as is the descriptive analysis.
Wherever the specificity of the topic demands it, this novel approach of combining the existing guidelines is adapted to the specific requirements for answering the research questions by original modifications. Thus, the specific purpose of this study leads to a definition of keywords, filters, and the timespan, as well as criteria for paper selection, that differ from those proposed by Centobelli et al. Another difference between Centobelli et al. and this work is the finality of the systematic review: instead of developing research questions from the review, this review answers the previously defined questions, as is the purpose of most systematic literature reviews [51].
Furthermore, analysis is usually narrative in systematic literature reviews and accompanied by tables [46]. However, this analysis is not merely narrative, but a metasummary that combines qualitative (narrative) with quantitative elements. Therefore, the frequency of each finding (i.e., the inclusion of a criterion) is summarized [53] in a way originally proposed by the authors. Thus, it differs from other guidelines such as Realist And Meta-narrative Evidence Syntheses: Evolving Standards (RAMESES) [54].
The resulting synthesized and extended methodology is constructed as follows: The first step of this novel method focuses on the identification of relevant papers published in highly ranked journals based on topic-related keywords (material comprehensive research), and step two deals with the analysis of their abstracts and full body texts (paper selection). The eligibility criteria are a focus on the assessment of sustainable transportation in the abstract, the listing of indicators or criteria in the paper, the inclusion of at least one sustainability dimension, and a focus on holistic urban transportation projects.
The second phase (analysis) uses content analysis to analyze the remaining papers descriptively. For this aim, a narrative synthesis is conducted by formulating categories for the descriptions, analyzing each category, and synthesizing across all studies [51]. Furthermore, the most used social, environmental, and economic criteria that are relevant for holistic multimodal urban sustainable transportation projects are extracted. The analysis uses a criteria refinement process by selecting the criteria in accordance with their compliance with given qualitative requirements. They are further grouped, ranked, and classified into the sustainability dimension they were mostly assigned to by the reviewed authors.
The proposed methodology is shown in Figure 1. The phases are described in more detail in the following sections.
An important issue in conducting systematic literature reviews is assessing the internal validity (frequently used synonymously with quality [55]), i.e., searching for methodological bias [51]. Commonly, the appraisal of the examined studies is an integral part of data extraction [51]. Thus, in order to reduce bias, the following measures were taken:

• The review process adhered to the PRISMA checklist, as using checklists is proposed in the literature [53,56].
• A rating of the journals from which the papers were extracted was defined as a prerequisite, namely of at least B/C in JQ3, as checking journal rankings is a common implicit quality measure in management research [55].
• A straightforward method for finding the most relevant main criteria, namely counting, was applied to avoid misinterpretation of the results, which is recognized as important [56]. However, the grouping remains subjective to a certain degree, which lies in the nature of the subject.

Material Comprehensive Research
The first phase of identifying papers included the selection of databases, the definition of meaningful keywords, and the determination of the time horizon. Additionally, only papers written in English were considered.
For the database selection, only journals ranked B/C or higher in JQ3, a rating system by the German Association of University Teachers of Business Administration [57], were considered. This ensured the quality of the papers included in the review process, which is an important requirement [58]. Most of these journals are listed in Web of Science (WoS). Therefore, WoS was the first database used for data retrieval. To filter for results that appear in JQ3, a search string containing each journal's international standard serial number (ISSN) combined with "AND" operators was created in the advanced search module. The remaining journals that were not found in WoS were searched for manually with the same keyword combinations in the remaining databases, namely Emerald, Springer, AISEL (Association for Information Systems eLibrary), The Journal of Fixed Income (JFI), and ScienceDirect.
Then, keywords for retrieving data were defined. An initial list of term sets for "sustainability" and "measurement" was adopted from Mura et al. [32]. These were supplemented by searching for synonyms in the Cambridge Dictionary and by comparing the keywords with additional words used by authors of research papers similar to the one presented here. All terms serving as keywords were grouped into three sets concerning sustainability, indicators, and transportation. The search string was constructed with "AND" operators for combining the sets and "OR" operators for the terms within one set, as shown in the first step in Figure 1. This resulted in a total of 96 possible keyword combinations. For the search in WoS, the search string containing all keywords was combined with the ISSN search string. For the ScienceDirect, JFI, and Springer databases, the keyword combination was too long. In these cases, only "sustainab* AND indicator AND transport*" was used for data retrieval.
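The construction of the search string can be illustrated with a short sketch. The term sets below are hypothetical placeholders, since the full keyword lists appear only in Figure 1; only "sustainab*", "indicator", and "transport*" are taken from the text. With these placeholder sets the sketch yields 12 combinations rather than the 96 reported for the actual sets.

```python
from itertools import product

# HYPOTHETICAL term sets for illustration; only the first term of each set
# is taken from the text, the others are placeholders.
term_sets = {
    "sustainability": ["sustainab*", "green"],
    "measurement": ["indicator", "criteri*", "metric"],
    "transportation": ["transport*", "mobility"],
}


def build_query(sets):
    """Join the terms within each set with OR, then join the sets with AND."""
    groups = ["(" + " OR ".join(terms) + ")" for terms in sets.values()]
    return " AND ".join(groups)


def keyword_combinations(sets):
    """List every combination that picks exactly one term from each set."""
    return [" AND ".join(combo) for combo in product(*sets.values())]


query = build_query(term_sets)
combos = keyword_combinations(term_sets)
print(query)
print(len(combos), "keyword combinations")
```

The number of combinations is the product of the set sizes, which is how a figure such as 96 arises from three sets of terms.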
Regarding the timespan, to gain a broad overview of current discourse, research contributions within the 10-year timeframe from 2010 until 2020 were evaluated. The researcher conducted the searches on 6 January 2021.
With this initial query, 1269 papers found in WoS were collected, and an additional 101 results were found on the other databases.

Selection of Papers
Not all papers were relevant to the systematic literature review because some did not address the assessment of sustainable transportation. For this reason, the analysis procedure for paper selection was refined in the screening phase by reviewing the abstract of each paper. Papers that were outside the scope, especially those belonging to subject areas other than transportation, were excluded. After this refinement process, a total of 94 documents were subjected to the second selection phase.
The eligibility criterion was the general thematic focus of the papers. The whole text body of the 94 papers was read, and only those with a clear thematization of elements central to sustainability assessment (e.g., criteria or indicators) were included. Furthermore, papers that were overly specific (e.g., concerning only one propulsion technology) and those containing criteria without the indication of any sustainability dimension were excluded. After this final selection step, 21 papers remained, none of which were identified as biased, as all authors reported the sources and methodology of criteria set composition.
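The selection flow described above can be summarized numerically. This is a minimal sketch using only the counts reported in the text; it assumes that the WoS results and the results from the other databases do not overlap, as no duplicate removal is reported.

```python
# Counts as reported in the text
identified = {"Web of Science": 1269, "other databases": 101}
after_screening = 94   # papers remaining after abstract screening
included = 21          # papers remaining after the full-text eligibility check

# ASSUMPTION: no overlap between the databases
total_identified = sum(identified.values())
excluded_at_screening = total_identified - after_screening
excluded_at_eligibility = after_screening - included

print(f"Identified:              {total_identified}")
print(f"Excluded at screening:   {excluded_at_screening}")
print(f"Excluded at eligibility: {excluded_at_eligibility}")
print(f"Included:                {included}")
```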

Descriptive and Content Analysis and Criteria Refinement Process
These 21 papers were analyzed descriptively according to publication year and journal. Furthermore, the content of each paper was analyzed in terms of the topic, sustainability dimensions included, and hierarchical levels used for sustainability measurement. What the criteria were used for and how they were collected were also analyzed.
To refine the criteria, requirements were defined, including relevance, measurability, the avoidance of redundancy, and validity (adopted from Lyytimäki et al. [59]). The remaining criteria were then split or combined according to their meaning to group them into main criteria. In the last step, the main criteria were assigned to the sustainability dimension where they were found most often in the reviewed literature or where they best fit.

Descriptive Analysis
This analysis considered the distribution of the papers over time and across journals. A combination of both distributions is presented in Figure 2. Of the 21 papers selected, 81% were published in the second half of the researched timespan, indicating the increasing importance of sustainability over time. The peak was reached in 2019, when almost a third of all the relevant papers were published. The analysis of papers across journals shows that the Journal of Cleaner Production and Transportation Research Part D dominated the research in this subject, as more than half of all papers (six for each journal) were published in one of these two journals. Two journals had two papers each, and the other journals were represented by only one paper each. A list of the journals used and their respective databases for the two selection steps is given in Table 1. As is evident, both the Journal of Cleaner Production and Transportation Research Part D were also dominant in the first selection stage. A list of the final papers selected for content analysis, including the journals in which they were published, is provided in Table 2.

Analysis of Papers
A brief overview of the research topic of each paper, incorporated sustainability criteria, hierarchical levels (i.e., the partition of dimensions), usage of the criteria, and methods for indicator selection is given in Table 3. Almost all the authors used certain criteria to assess the sustainability of given transportation projects. The geographical areas of application are distributed globally, from Europe to Asia and the Americas. Most of the authors focused on urban projects. Only Bojkovic et al. [12] thematized the sustainability of country-wide transportation systems. Most of the papers (81%) considered the social, environmental, and economic dimensions of sustainability. (Note that the wording is slightly different in the papers. Castillo and Pitfield [63], for instance, name the social dimension "equity and social inclusion"). More than half (52%) added one or more additional dimensions, such as technology, efficiency, or system effectiveness. Only Gössling [64] included walking in the sustainability evaluation. Together with Castillo and Pitfield [63], only two papers concerned cycling, which is interesting as those are claimed to be forms of active and, therefore, healthy transportation [4].
The prevailing manner of data collection was a literature review, which was used in 95% of all the papers. Some authors added expert interviews for refinement. Most of the authors used criteria to evaluate the sustainability performance of transportation alternatives. Only one paper used criteria for a general explication of what sustainable transportation means.
As explained in Section 4.2.1 and as shown in Table 3, there were differences between the papers regarding the partition of dimensions (or hierarchical levels) and the terms used for the levels. These differences are partly due to the varying number of hierarchical levels applied and the variable terms used in the literature.
Therefore, an approach for unifying the hierarchical levels and sublevels used in the literature and their definitions is presented as follows:
• As expressed in the 2030 Agenda for Sustainable Development, the economy, society, and the environment are dimensions of sustainability [78]. In the reviewed literature, most of the authors also used the term dimension [12,69,70,74,75] or category [60,61,67,68] for social, economic, and environmental issues in addition to a few others.
• Associated or similar topics are bundled into different categories [68], which are called themes [12,36,69] or enablers [65] in the reviewed papers and which occur according to specific dimensions.
• The difference between a goal and an objective is that an objective indicates a direction (to minimize or to maximize) [79,80]. This term is not defined at all in the literature.
• As recommended by Litman [81], the sustainability goals are on a higher hierarchical level than the objectives. Both serve as guiding principles to select indicators [31], whereas the goal is the overall aim to be measured, namely sustainable transportation.
• Attributes and criteria are used synonymously and serve as performance measures of the objective that they characterize and operationalize [15,80,82]. They must be measurable, understandable, and operational to be able to clarify the objectives they represent. Hence, criteria must not be ambiguous; that is, a meaning is assigned to each level of achievement that a criterion indicates [80], but the criteria can also be qualitative expressions of objectives [82]. The terms used synonymously for criteria in the literature vary the most, including, e.g., impact [27,74], indicator (without a unit given) [12,61,74,75], parameter [64], theme [36], or critical success factor [70].
• The indicators serve as a scale according to which a project's contributions to the different criteria are measured [26,31]. It is indispensable for indicators to be measurable [18]. This is the least ambiguous term in the literature.
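The unified levels listed above can be pictured as a nested data structure. The following is a minimal sketch, not the paper's own formalization: the class and field names are illustrative, the optional theme level is omitted for brevity, and the placement of the example under the economic dimension is an assumption made only for illustration. The objective, criterion, and indicator are the examples given in the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class Indicator:
    name: str  # measurable scale for a criterion


@dataclass
class Criterion:
    name: str  # performance measure of an objective
    indicators: list[Indicator] = field(default_factory=list)


@dataclass
class Objective:
    name: str
    direction: str  # "minimize" or "maximize"
    criteria: list[Criterion] = field(default_factory=list)


@dataclass
class Dimension:
    name: str  # social, economic, or environmental
    objectives: list[Objective] = field(default_factory=list)


@dataclass
class Goal:
    name: str
    dimensions: list[Dimension] = field(default_factory=list)


def paths(goal):
    """Yield one (goal, dimension, objective, criterion, indicator) tuple per indicator."""
    for dim in goal.dimensions:
        for obj in dim.objectives:
            for crit in obj.criteria:
                for ind in crit.indicators:
                    yield (goal.name, dim.name, obj.name, crit.name, ind.name)


goal = Goal("sustainable transportation", [
    Dimension("economic", [  # ASSUMPTION: dimension chosen for illustration only
        Objective("minimization of traffic clogging", "minimize", [
            Criterion("congestion", [Indicator("cost of traffic congestion")]),
        ]),
    ]),
])

for path in paths(goal):
    print(" > ".join(path))
```

Keeping the direction on the objective rather than on the criterion mirrors the distinction drawn above: only objectives indicate whether something is to be minimized or maximized.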

Criteria Refinement Process
After gaining a broad overview of the areas of research in sustainable transportation, the criteria themselves were further refined. For the following steps, indicators were rephrased as criteria by omitting measurement units because the use of these terms varies across papers.

1. Requirement check
In the reviewed papers, 474 criteria were mentioned in total. The number of criteria used in a single paper ranged from 6 to 73, with an average number of 22 and a median of 16 criteria used per paper. For each criterion and its underlying objective, a check was performed against the following requirements:
• The relevance, or the usefulness of the criteria for the target of designing a holistic urban sustainable transportation concept;
• The measurability, or the availability of reliable data;
• Acceptability, as the criteria must be based on valid and trustworthy data;
• Avoidance of redundancy, given that the same subject must not be described by two or more criteria in the same paper (adopted from Lyytimäki et al. [59]).
Most of the criteria that were excluded failed to meet the relevance requirement. Some of the used sets of criteria were applied to freight transportation, and although some of the criteria were also valuable for transit, others were too specific to be relevant to multimodal urban transportation. This was decided on with a view to the number of papers included for analysis. For this reason, the number of criteria in the paper by Kumar and Anbanandam [65] dropped from 73 to only six after the check. A broad focus for the criteria was another reason for exclusion due to irrelevance; this was the case for the criterion facilitation of education and public participation [63] and impacts to sites of historical and architectural importance [36].
In total, 158 criteria were not relevant, 71 were not measurable, 17 were unacceptable, and 20 were redundant. Consequently, from an initial total of 474 criteria retrieved from the 21 remaining papers, 266 were excluded, and 208 criteria remained after the requirement check. Table 4 provides an overview of the number of criteria before and after the requirement check for each paper.
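The outcome of the requirement check can be retraced with a short tally. This sketch uses only the counts reported above; the dictionary keys are shorthand labels for the four requirements.

```python
initial_criteria = 474

# Number of criteria excluded per failed requirement, as reported in the text
excluded = {
    "not relevant": 158,
    "not measurable": 71,
    "unacceptable": 17,
    "redundant": 20,
}

total_excluded = sum(excluded.values())
remaining = initial_criteria - total_excluded

print(f"Excluded: {total_excluded}, remaining: {remaining}")
```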

2. Grouping of criteria into main criteria
Criteria concerning similar topics were grouped together, and main criteria were defined as collective terms. To accomplish this, criteria with two or more different meanings were first split up; this applied to the criteria of noise and vibration [67]; safety and health [71]; safety, health, and security [74]; and speed and ease of service [77]. This process resulted in the creation of five additional criteria.
Secondly, for authors using two or more criteria with almost identical meaning in the same dimension, their criteria were combined. One example is the combination of transit accidents with the reduction in impacts of accidents [62], as both criteria relate to accidents in a broader sense. Since the term "crash" implies both accidental and intentional or irresponsible events and is therefore of broader meaning, it was used instead of "accidents", as recommended by Stewart and Lord [83]. Additionally, concentrations of air pollutant emissions and air pollutant emissions per capita [36] were also combined. In total, the number of considered criteria was reduced by 11 because of combinations.
In total, 202 criteria remained for the step of grouping into main criteria based on the meaning of each criterion. This step resulted in 38 main criteria used as collective terms for specific elements with the same general meaning. The following paragraphs provide a brief explanation of some of the main criteria and their elements.
In some cases, specific pollutants were used as criteria for air pollution (e.g., PM10 or NOx [66]). In other cases, the focus was too broad and also included land and water pollution [68]. Therefore, air pollution was selected as a main criterion that is neither too broad nor too narrow. The criteria represented by the main criterion of energy consumption include energy use [27], intensity [72], efficiency [70], fossil fuel (energy) consumption [36], and natural resource consumption [63]. The main criterion of noise comprises noise pollution [74], noise perception [72], noise/decibel level(s) [70], and noise minimization [63].
Greenhouse gas emissions are linked to air pollution as well as to climate change. Safety measures aim to limit safety impacts [68], adopt safety standards [70], mitigate risk, and increase perceived safety [64], among other goals. Health benefits [64], health risks [62], injury severity level [70], traffic casualties, and effects of air pollution [74] are examples of criteria expressed through the main criterion of health. Operating costs deal with implementation [62], maintenance [76], and fuel costs [66]. These costs do not include the initial investment costs. Travel time is described mainly as commuting time [36] and the time to reach the next public transportation hub (e.g., [12]). Accessibility is described as the share of the population living less than 500 m away from the next public transportation hub [36] and measures the distance between public transportation stops and residential areas [84]. Accessibility can also be specified in relation to disabled and elderly people [72]. The grouping of the remaining similar criteria into the other main criteria proceeded analogously.
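The grouping and counting step can be sketched as a simple mapping from paper-level criteria to main criteria. The example below uses only the energy consumption and noise criteria named above, with their reference numbers; the resulting counts therefore cover these named examples only and are not the totals of the review. The names `mentions`, `GROUPING`, and `count_main_criteria` are illustrative.

```python
from collections import Counter

# Tally of criteria across the refinement steps, as reported in the text
after_requirement_check = 208
added_by_splitting = 5
removed_by_combining = 11
criteria_for_grouping = (
    after_requirement_check + added_by_splitting - removed_by_combining
)

# (criterion as named in a reviewed paper, reference number)
mentions = [
    ("energy use", 27),
    ("energy intensity", 72),
    ("energy efficiency", 70),
    ("fossil fuel (energy) consumption", 36),
    ("natural resource consumption", 63),
    ("noise pollution", 74),
    ("noise perception", 72),
    ("noise/decibel level(s)", 70),
    ("noise minimization", 63),
]

# Mapping of each paper-level criterion to its main criterion
GROUPING = {
    "energy use": "energy consumption",
    "energy intensity": "energy consumption",
    "energy efficiency": "energy consumption",
    "fossil fuel (energy) consumption": "energy consumption",
    "natural resource consumption": "energy consumption",
    "noise pollution": "noise",
    "noise perception": "noise",
    "noise/decibel level(s)": "noise",
    "noise minimization": "noise",
}


def count_main_criteria(items):
    """Count how often each main criterion is mentioned."""
    return Counter(GROUPING[criterion] for criterion, _ref in items)


counts = count_main_criteria(mentions)
for main_criterion, n in counts.most_common():
    print(main_criterion, n)
```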

3. Assignment of main criteria to the sustainability dimensions
After defining the main criteria, the criteria were assigned to the sustainability dimensions where they were mentioned most often in the reviewed literature. Only the main criterion of travel time was assigned to a dimension (social) other than the one where it was mentioned most often (economic) because some of the reviewed papers focused on freight transport. Thus, time was seen as a cost factor for the service provider rather than as the time passengers spent in the vehicle. Travel time was, therefore, assigned to the social dimension as this more appropriately described the overall goal of sustainable passenger transportation. In this step, the criteria that were never mentioned in any of the three sustainability dimensions (without counting the intersections such as socioeconomics) in the literature were excluded. The criteria excluded for this reason were frequency [60], customer satisfaction [70], alternative propulsion technology, public transport, and societal cost [36]. After this final refinement, 33 main criteria with 197 mentions remained. These main criteria were assigned to one of the three dimensions and ranked according to the number of mentions in Table 5. The main criterion of noise was ambivalent because it was applied not only to the environmental but also to the social dimension in the literature. Furthermore, depending on the author's point of view, travel time belonged to both the social and the economic dimensions. The same applied to affordability, depending on the perspective the article was considering (provider or user). In summary, 13 main criteria belonged to the social dimension, 11 to the economic dimension, and 9 to the environmental dimension. Regarding the share of mentions (197 in total), or the number of times each main criterion was cited, those belonging to the social dimension were mentioned the most (42%), followed by the environment (37%), and then the economy (21%).
The three most mentioned main criteria all belonged to the environmental dimension. This conforms with the supposition that the environment is of existential importance to the other two dimensions [39].
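The assignment rule described above (majority of mentions, with a manual exception for travel time) can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the review: the mention log, the function names `assign_dimensions` and `mention_shares`, and the `overrides` mapping are all hypothetical, and the counts are invented for demonstration only. It also assumes that the share of mentions per dimension is computed after each criterion has been assigned to a single dimension.

```python
from collections import Counter

# Hypothetical mention log: (criterion, dimension it was mentioned under).
# These entries are illustrative only and are NOT the counts from the review.
mentions = [
    ("air pollution", "environmental"),
    ("air pollution", "environmental"),
    ("air pollution", "social"),
    ("noise", "environmental"),
    ("noise", "environmental"),
    ("noise", "social"),
    ("affordability", "social"),
    ("affordability", "social"),
    ("affordability", "economic"),
    ("travel time", "economic"),
    ("travel time", "economic"),
    ("travel time", "social"),
]

# Manual exception, mirroring the travel-time case described in the text.
overrides = {"travel time": "social"}


def assign_dimensions(mentions, overrides=None):
    """Assign each criterion to the dimension where it is mentioned most often.

    Ties are resolved in favor of the dimension seen first (an arbitrary
    choice made for this sketch).
    """
    overrides = overrides or {}
    counts = {}
    for criterion, dimension in mentions:
        counts.setdefault(criterion, Counter())[dimension] += 1
    assignment = {}
    for criterion, counter in counts.items():
        majority = counter.most_common(1)[0][0]
        assignment[criterion] = overrides.get(criterion, majority)
    return assignment


def mention_shares(mentions, assignment):
    """Share of all mentions that falls to each dimension after assignment."""
    totals = Counter(assignment[criterion] for criterion, _ in mentions)
    return {dim: count / len(mentions) for dim, count in totals.items()}


assignment = assign_dimensions(mentions, overrides)
shares = mention_shares(mentions, assignment)
```

Because the assignment depends on which papers contribute mentions, rerunning such a tally on a different corpus may move a criterion to another dimension.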

Classification of Main Criteria into Hierarchical Levels of Sustainability
The resulting 33 main criteria were brought into a uniform hierarchical structure, as proposed in Section 4.2.1. Indicators were proposed to exemplify the criteria to be measured, and the underlying objective of each criterion was extracted from the systematic literature review. The sustainable transportation measurement hierarchy is shown in Figure 3. As mentioned, the criteria did not always fit into only one dimension, and some had multiple impacts simultaneously.
In the same way that the allocation of a criterion to a dimension is not unambiguous, the relationships between the criteria and their objectives are also complex and can be neutral, complementary (unilateral or mutual), or conflicting [85]. For instance, as seen in the grouping process, traffic congestion is closely related to travel time. Air pollution also has health impacts and is closely linked to GHG emissions and, hence, to CO2 emissions. The main criterion, "non-motorized modes", also has an impact on these indicators.
All this is, in turn, reinforced by congestion [64]. Furthermore, energy consumption and fuel costs are linked. Price is another ambiguous criterion. Increases in price mean higher revenues for service providers, but they also reduce the affordability for users. More generally, improvements in economic outcomes frequently cause deterioration in social or environmental performances [86]. For this reason, it must be mentioned that, before choosing criteria for decision analysis, it is important to consider possible interrelations and especially tradeoffs. Selecting a high number of criteria without considering each one's possible impact on the other criteria does not necessarily improve the decision-making process. Analyzing the relationships in more depth remains an important research topic; Figure 3 provides the basis for the analysis of dependencies between the main criteria by selecting the most relevant and most cited ones.
The measurement elements presented are the outcome of the first step in building a composite indicator, as proposed by the Organization for Economic Co-Operation and Development (OECD). They are part of the theoretical framework for enhancing the understanding of sustainable transportation [87]. To apply the criteria to a specific case, they must be selected from Figure 3 by checking the data availability [88] concerning the specific use case and the consistency of the set of criteria selected afterwards [89], among other elements.

Discussion
The main objectives of this paper expressed through RQ1-3 were (i) to develop a methodology for main criteria selection that focuses on sustainable urban transportation based on a systematic literature review process, (ii) to define and structure the terms used to measure sustainability in transportation in a logical hierarchy (e.g., criteria, objectives, and indicators), and (iii) to determine the main criteria for each sustainability dimension that are relevant to an urban multimodal transportation project.
For aim (i), a twofold systematic literature review process was performed. The keywords for data retrieval were given, and the condition to only include journals ranked B/C or higher in JQ3 was defined to ensure the quality of the reviewed literature. This is a novel approach to this research topic and implies that journals with a significant contribution that are not included in this ranking might be missing. It is one reason that only 21 of the more than 1300 papers identified in the first stage were included. As such, one field for future research is to extend the review to papers beyond JQ3, with alternative quality criteria defined and applied. In addition, (external) validity checks, especially a sensitivity analysis of the criteria grouping, should be performed to assess the robustness of the method and its results.
To meet the relevance requirement, the papers analyzed in this review needed to focus on urban transportation projects, intentionally excluding studies on other spatial structures. Consequently, analyzing criteria for regional or national projects is beyond the scope of this study. These spatial structures likely lead to different sets of criteria. However, the set of criteria proposed in this study serves as an initial draft for analyzing various spatial structures; structures such as those in rural areas can be expanded through future analyses.
Another limitation concerns the defined timespan of 10 years for the paper retrieval. The literature published before 2010 was excluded from the review to ensure the topicality of the reviewed papers. This timespan could be prolonged to determine whether relevant papers were published before and whether the resulting set of criteria is different. The same applies to the papers excluded because they were published after the date the review was conducted (6 January 2021). A requirement check for the criteria found in the 21 remaining papers was proposed and conducted in relation to relevance, acceptability, measurability, and non-redundancy. All four formulated requirements helped to identify key criteria and should be applied to future criteria analyses. The relevance check in particular led to the exclusion of numerous negligible criteria. Grouping the key criteria into main criteria involves a degree of subjectivity: other authors might have phrased the main criteria differently, even if their meaning remained the same. Other authors are, thus, encouraged to repeat the grouping to assess the validity of the process.
Overall, the systematic literature review and analysis, the paper exclusion and inclusion criteria, and the criteria refinement process are easily transferable to a broad field of studies.
Regarding aim (ii), a hierarchical structure for the measurement elements was proposed. It starts with the highest level (overall goal), after which the sustainability dimension is determined (social, economic, or environmental). If needed, categories are added before the objectives, criteria, and indicators are defined successively. The definitions of the elements are as follows: objectives are subgoals within a dimension that should be minimized or maximized. Criteria serve as performance measures. Indicators add measurement units to the criteria. The proposal of a transparent lexicon adds value to this research field because this is the first attempt to develop a common understanding of the measurement hierarchy. By defining and structuring pertinent terms and bringing them into a hierarchy, practitioners gain an enhanced understanding of what the concept of sustainable transportation consists of. This leads to a target-oriented approach to planning and implementing sustainable urban transportation projects.
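The hierarchy described above (goal, dimension, objective, criterion, indicator) maps naturally onto a nested data structure. The following Python sketch is one possible representation, not the authors' implementation: the class and field names are hypothetical, the optional category level is omitted for brevity, and the dimension and unit in the example are placeholders chosen for illustration. Only the objective, criterion, and indicator names in the example are taken from the paper's abstract.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Indicator:
    name: str
    unit: str  # indicators add a measurement unit to a criterion


@dataclass
class Criterion:
    name: str  # criteria serve as performance measures
    indicators: List[Indicator] = field(default_factory=list)


@dataclass
class Objective:
    name: str
    direction: str  # "minimize" or "maximize"
    criteria: List[Criterion] = field(default_factory=list)


@dataclass
class Dimension:
    name: str  # "social", "economic", or "environmental"
    objectives: List[Objective] = field(default_factory=list)


@dataclass
class Goal:
    name: str
    dimensions: List[Dimension] = field(default_factory=list)

    def paths(self) -> Iterator[Tuple[str, str, str, str, str]]:
        """Yield one (goal, dimension, objective, criterion, indicator) tuple per indicator."""
        for dim in self.dimensions:
            for obj in dim.objectives:
                for crit in obj.criteria:
                    for ind in crit.indicators:
                        yield (self.name, dim.name, obj.name, crit.name, ind.name)


# Example entry. The dimension ("economic") and the unit ("EUR per year")
# are assumptions made for this sketch, not values stated in the text.
goal = Goal(
    name="sustainable transportation",
    dimensions=[
        Dimension(
            name="economic",
            objectives=[
                Objective(
                    name="minimization of traffic clogging",
                    direction="minimize",
                    criteria=[
                        Criterion(
                            name="congestion",
                            indicators=[
                                Indicator("cost of traffic congestion", "EUR per year")
                            ],
                        )
                    ],
                )
            ],
        )
    ],
)

hierarchy_paths = list(goal.paths())
```

Keeping each level as its own type makes it explicit that an indicator cannot exist without a criterion, nor a criterion without an objective.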
In relation to aim (iii), a total of 33 main criteria were identified in 197 mentions, of which 11 belong to the economic dimension, 13 to the social dimension, and 9 to the environmental dimension. The shares of mentions of the criteria in the economic, environmental, and social dimensions were 21%, 37%, and 42%, respectively. The three most mentioned main criteria were air pollution, energy consumption, and noise, all of which belong to the environmental dimension. The assignment of a criterion to a sustainability dimension followed the majority principle. However, if other papers are reviewed, criteria assignment and the resulting hierarchy might change, given that the criteria's associations were not unambiguous in the literature. The set of criteria identified here tells mobility managers and planners who want to develop more sustainable options which criteria to use according to the available data. The main criteria set allows for the sustainability evaluation of different transportation projects before and during implementation and facilitates the following points:
• deciding on and prioritizing different transportation alternatives or policies, such as the impact estimation of different transportation modes; and
• tracking progress over time and benchmarking the sustainability of an existing transportation project.

Conclusions
Future research can build on the outcome of this systematic literature review by applying the resulting set of criteria to the sustainability assessment of urban transportation alternatives. For example, the criteria could be used in a multicriteria decision analysis to determine which urban transportation alternative out of a given set is the most sustainable. A variety of such analysis methods and tools are available and have been applied in the past. However, thus far, the criteria proposed in this paper have not been combined for such an assessment, which indicates the need for further investigation. Different weights are often assigned to the criteria in a multicriteria analysis (e.g., to express priorities). The optimal assignment of weights to the different criteria proposed in this review is another area for future investigation. Furthermore, data availability and reliability are crucial criteria selection requirements for conducting multicriteria analysis. As the data availability may vary due to regional differences, the criteria selection from the set proposed here can be tailored to the needs of a given study.
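To make the idea of a weighted multicriteria comparison concrete, the sketch below uses a simple weighted sum with min-max normalization. This is only one of many multicriteria methods and is not prescribed by this review: the alternatives ("A", "B", "C"), the raw values, the weights, and the function names are all hypothetical and chosen purely for illustration.

```python
def normalize(values, higher_is_better):
    """Min-max normalize values to [0, 1], where 1 is always the best score."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All alternatives perform equally on this criterion.
        return [1.0 for _ in values]
    if higher_is_better:
        return [(v - lo) / (hi - lo) for v in values]
    return [(hi - v) / (hi - lo) for v in values]


def weighted_sum_scores(alternatives, higher_is_better, weights):
    """Return one aggregate score per alternative (weights should sum to 1)."""
    names = list(alternatives)
    scores = {name: 0.0 for name in names}
    for criterion, better_high in higher_is_better.items():
        raw = [alternatives[name][criterion] for name in names]
        for name, value in zip(names, normalize(raw, better_high)):
            scores[name] += weights[criterion] * value
    return scores


# Hypothetical inputs: three alternatives scored on three main criteria.
alternatives = {
    "A": {"air pollution": 10, "travel time": 30, "affordability": 2},
    "B": {"air pollution": 20, "travel time": 20, "affordability": 4},
    "C": {"air pollution": 30, "travel time": 40, "affordability": 3},
}
# Whether a larger raw value is desirable for each criterion.
higher_is_better = {
    "air pollution": False,
    "travel time": False,
    "affordability": True,
}
# Illustrative weights expressing priorities; they are assumptions.
weights = {"air pollution": 0.5, "travel time": 0.25, "affordability": 0.25}

scores = weighted_sum_scores(alternatives, higher_is_better, weights)
ranking = sorted(scores, key=scores.get, reverse=True)
```

A weighted sum treats the criteria as independent and fully compensatory, so it does not capture the interrelations and tradeoffs between criteria; other methods may be more appropriate where those matter.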
In future works on the topic, special attention must be given to the interactions and, especially, tradeoffs between the criteria because most criteria are not assigned to only one sustainability dimension in the existing literature. Uncovering tradeoffs is a particularly important area for future research, and special emphasis should be given to theory-building and practical implications.
Conflicting goals in sustainability are also an important issue when providing new transportation solutions in business ecosystems with various partners from different organizations. The importance each actor allots to each sustainability criterion might vary. The recognition of such conflicts and their resolution represent another topic for investigation.
To conclude, transportation plays a key role in minimizing the impact of climate change. It is, therefore, of the utmost importance to make transportation more sustainable. This review provides a foundation for much-needed future research on how to make urban transportation projects environmentally sound, socially acceptable, and economically efficient.

Data Availability Statement:
No new data were created or analyzed in this study. Data sharing is not applicable to this article.