Measurement Quality Appraisal Instrument for Evaluation of Walkability Assessment Tools Based on Walking Needs

Walking is a sustainable commute mode, and walkability is considered an essential sign of sustainable mobility. To date, many walkability assessment tools have been developed to assess walkability conditions across the world. However, there is a paucity of comprehensive methods to assess current walkability tools based on walking needs and to ensure that all walking requirements are included. Thus, researchers and experts are unable to systematically select the most comprehensive tool. The present study attempts to develop a system to evaluate the quality of the existing tools. The instrument focuses on factors related to walking needs that are frequently observed in all types of walkability assessment tools. Hence, a pilot measurement quality appraisal instrument (MQAI) is developed and tested by a research team with planning and public health backgrounds. The final MQAI is tested with suitable reliability, criterion validity, and content validity tests. Most appraisal scales display moderate to high reliability for both audits and questionnaires. The MQAI appears ready for use in several applications, including meta-analyses and systematic reviews. Additionally, the MQAI can be used by practitioners and planners to identify the most comprehensive and efficient assessment tools based on their needs.

The existing physical activity tools and walkability assessment tools aim to assess the walking environment and improve recreational spaces for health advancement in societies [31]. Walkability assessment tools use audits [13,32–34] and questionnaires [35,36] to collect the required data. To perform an audit, the streets are split into segments, and each segment is examined by one or more evaluators. In audits, a set of qualitative judgments or quantitative measurements is designated for each assessment item. Like audits, questionnaires are effective instruments for assessing pedestrian environments. Questionnaires are used to evaluate the perceptions of neighborhood residents towards walking and cycling facilities in their area.
According to Litman [37], walkability is considered an essential indicator of sustainable mobility. Typically, researchers and practitioners from various domains, including urban planning, transport planning, urban design, and public health, have an interest in the topic of walkability. They are also the main users of walkability assessment tools. There are many walkability assessment tools, and selecting the best one is a challenging task. Furthermore, there is no guideline or systematic method to help these users select the most appropriate walkability assessment tool. They need to ensure that the tool they select is comprehensive and sufficiently detailed, because future investments in infrastructure may depend on this assessment. Thus, if an inappropriate tool is used, undesirable consequences may follow. Each type of walkability assessment tool uses certain indicators to assess the walking environment and urban design-related factors. Walking needs are extremely diverse, and thus it is important to ensure that the assessment tools consider as wide a range of urban design-related factors as possible. Consequently, there is a need to develop an instrument to appraise the strength of assessment tools in evaluating walking needs. Currently, there is a paucity of research dedicated to examining the measurement quality of walkability assessment tools [11]. The present study aims to develop a measurement quality appraisal instrument (MQAI) to evaluate walkability assessment tools based on walking needs. This paper presents the development process of the MQAI. To demonstrate this process, the MQAI was applied to several walkability assessment tools, and its reliability, validity, and applicability were examined. 
The successful development of MQAI ensures planners and researchers can efficiently employ this tool for choosing the most appropriate walkability assessment tool among the candidate tools.

Walking Needs
Various walking needs and their contributory urban design variables affect people's decision to walk. Accessibility is among the most cited walking needs that must be met to motivate people to walk. Accessibility simply refers to the ease of reaching desired services and activities [4,6–10]. Several urban design factors affect the accessibility needs of walking, including, but not limited to, availability/completeness of the sidewalk network, number of destinations, proximity to transit points, presence/number of barriers, and public spaces.
Safety is another important walking need that is frequently found in the literature. Safety of walking refers to whether an individual feels safe from the danger of falling due to wet conditions, the hazard of conflicts with vehicles, and the threat of crime [2,4,11–14,38]. Urban design factors that may affect safety from crime include lighting, landscape and trees, and vacant buildings. Design factors that may contribute to safety from traffic include signage, signals, and pedestrian crossings. Safety from falling can also be affected by surface, materials, and lighting.
A considerable amount of literature has been published on comfort as an important need for walking. Comfort refers to a person's level of satisfaction, ease, and pleasure [4,5,17]. The design factors that may affect the comfort needs of walking include landscape and trees, the presence of traffic calming features, canopies, and drinking fountains. Pleasurability is also an important need for walking. Pleasurability simply refers to whether an individual experiences an enjoyable and interesting area for walking [4,5,20]. The presence of a varied streetscape, architectural elements, and outdoor dining areas can affect the pleasurability level of pedestrians. Table 1 illustrates the walking needs and the urban design factors that affect these needs.
The existing walkability assessment tools have various factor classifications. In walkability assessment tools, the major groups of assessment items are street facilities, sidewalk characteristics, land use, and road attributes. Street facilities include signage, signals, drinking fountains, surveillance, and items related to the disabled [33,34,68]. Sidewalk characteristics include items such as sidewalk completeness, the width of the sidewalk, presence/number of barriers (obstacles), and surface/material of the sidewalk [69–71]. Land use is another frequently used grouping that contains a mixture of land use, undesirable land uses, and destinations [72,73]. The walkability assessment tools also use items related to road attributes, including traffic calming features, street width, cleanliness, lighting, and directness of walkways/routes [71,74]. Table 2 presents walking needs-related factors based on the major factor classifications in the existing walkability assessment tools. The walking needs information obtained from the literature and summarized in Tables 1 and 2 was used to develop a comprehensive instrument to assess current tools based on walking needs. 
This instrument can assess the quality of the existing walkability assessment tools and determine their capability for assessing pedestrian environments. Such an instrument also can act as a decision-making system for selecting the most appropriate assessment tool for evaluating the walking environments.

Methods
As previously mentioned, this paper presents the development process of the MQAI. This process included two main parts: (1) pilot version development and (2) final version development. Each part involved a series of assessments and techniques. The development process of the MQAI is shown in Figure 1.

Identifying Walking Needs and Developing the Pilot MQAI
A literature review was conducted to identify the walking needs and the widest range of their contributory urban design factors. The walking needs information extracted from this literature (refer to Tables 1 and 2) was used to develop a pilot MQAI (refer to Appendix A) to assess the current tools based on walking needs. Table 3 lists the key characteristics of the MQAI. This tool is based on a point system in which each point corresponds to a specific condition. In this system, the worst and best conditions receive the lowest and highest points, respectively. This method facilitates a systematic comparison among the walkability assessment tools and allows for determining the tools' capability for evaluating walkability. To assess each item, the evaluator must select one of four responses: 'no assessment' (the tool does not assess the indicator); 'simple assessment' (the tool simply assesses the availability of an indicator and does not assess its quality); 'partial assessment' (the tool assesses the availability in addition to the quality but does not provide a complete assessment of the quality); or 'complete assessment' (the tool presents a complete assessment of both the availability and the quality of the indicator). The 'no assessment', 'simple assessment', 'partial assessment', and 'complete assessment' conditions receive zero, one, two, and three points, respectively. These four response levels allow for simultaneously assessing both the availability of design factors and the quality of their assessment in the tools. The score of each measurement scale is computed as the sum of the points assigned to its items. Appendix A shows the scoring pattern and related explanations. To investigate the content validity of the proposed MQAI, meetings were held with a panel of experts, which included two experts in urban transport planning and public health. 
The outcomes of these meetings were minimal changes to the content of some scales and/or the attached explanations. The pilot version of the MQAI was produced from the results of this step.
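The four-level scoring scheme described above can be illustrated with a minimal Python sketch. The response labels and point values come from the text; the function name `scale_score` and the example ratings are hypothetical and are not data from the study.

```python
# Points attached to the four response levels described in the text.
RESPONSE_POINTS = {
    "no": 0,        # 'no assessment'
    "simple": 1,    # 'simple assessment'
    "partial": 2,   # 'partial assessment'
    "complete": 3,  # 'complete assessment'
}


def scale_score(responses):
    """Sum the points of the responses given to the items of one scale."""
    return sum(RESPONSE_POINTS[r] for r in responses)


# Hypothetical ratings for a two-item scale (illustration only).
example_responses = ["simple", "complete"]
print(scale_score(example_responses))  # 1 + 3 = 4
```

This sketch covers only the unweighted sum per scale; item weights are introduced later in the development process.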
A criterion validity test was conducted in this step. Two pedestrian environment assessment tools, including one audit and one questionnaire, were assessed utilizing the pilot version of MQAI by the research team (authors). Each member of the research team was benchmarked relative to the team leader (first author). The average level of agreement was 41.5%.
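As an illustration of benchmarking a rater against the team leader, the sketch below computes a simple percentage agreement. It assumes that agreement means the share of items on which a rater chose exactly the same response as the reference rater; the text does not spell out the calculation, so this is one plausible reading. The function name `percent_agreement` and the ratings are hypothetical.

```python
def percent_agreement(rater, reference):
    """Percentage of items on which `rater` matches `reference` exactly.

    Both arguments are equal-length sequences of item points (0 to 3).
    """
    if len(rater) != len(reference):
        raise ValueError("both raters must score the same items")
    if not reference:
        raise ValueError("at least one item is required")
    matches = sum(a == b for a, b in zip(rater, reference))
    return 100.0 * matches / len(reference)


# Hypothetical points for four items (illustration only).
leader = [0, 1, 3, 3]
member = [0, 1, 2, 3]
print(percent_agreement(member, leader))  # 3 of 4 items match: 75.0
```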
Once the assessment of criterion validity of the MQAI pilot version was completed, the outcomes of this evaluation were discussed in a series of meetings in which both the research team and experts were involved. These meetings engaged the experts in discussion and in the development of a refined list of suitable MQAI appraisal items. During the meetings, the research team and experts confirmed the purpose and scope of the MQAI. They also ensured that the widest range of appraisal items was included in the proposed instrument. Thus, a few changes were implemented, such as adding more explanations to the description of the responses to better clarify the differences between answer categories (refer to Table 4). Additional improvements included adding an instruction on how to respond to the appraisal items.
Step-by-step instructions were provided to aid users in selecting a suitable answer among 'No', 'Simple', 'Partial', and 'Complete' (refer to Appendix A). Additionally, a graphical scale was provided to help the users recognize the right response (refer to Figure 2).

Table 4. An example of an added explanation to a given question in MQAI.

Answer | Description | Point
No assessment | The tool does not assess the condition of the path | 0
Simple assessment | The tool simply assesses the path condition * in the study area | 1
Partial assessment | The tool assesses the path condition, and one of the following path condition issues: (1) material ** used; and (2) slope *** | 2
Complete assessment | The tool assesses the path condition and all the following path condition issues: (1) material used; and (2) slope | 3

* Poor (several weeds, breaks, and holes), moderate (a few weeds, breaks, and holes), good (very few weeds, breaks, and holes), under repair.
** Flat segmented concrete slabs, paving stones, Portuguese mosaic, rustic natural stones, slippery material (smooth ceramic tiles), rough material (hydraulic tiles, interlocked blocks, flattened concrete), regular, firm, antiskid, and anti-vibration material (high strength paving).
*** Flat or gentle, moderate slope, steep slope.

Final Version Development
The research team and panel of experts assessed the significance of each tool item. They rated the importance of the items by utilizing a five-point scale varying between 'not important' and 'very important'. The median score for each item was calculated to determine the weight of the items. In order to gain the consensus of the research team, the team computed the agreement level for the importance of every factor. Then, the weight for each item was adjusted based on the number of items in each category.
The formula used was [weight − expected weight]. The expected weight is the score that would be assigned if the items contributed equally to a category. For instance, if two items are to be weighed, the expected weight is 2.50 for each; and if four items are to be weighed, the expected weight is 1.25. The inter-quartile range (IQR) was calculated for these modified weights to assess the degree of consensus among the evaluators on the scored importance of items. Items with an IQR < 1 correspond to a high level of consensus among the evaluators.
The final version of the MQAI was tested for criterion validity and reliability. To assess the criterion validity of the MQAI, the degree of correlation and agreement with a reference rater was investigated for individuals with a background in, and familiarity with, urban planning and urban design. For each rater, the agreement level was calculated with respect to the leader of the research team. A total of eight students registered in a Master of Science advanced urban planning course participated in this step. Two tools were selected by the team leader and were classified based on the MQAI% interpretation section (Appendix A) as poor (20 ≤ MQAI% < 40) and regular (40 ≤ MQAI% < 60). A tool was given to each student, and they were asked to complete the assignment in four days.
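The weight adjustment and the consensus check can be sketched in Python as follows. Two assumptions are made here and are not stated in the text: the expected weight is taken as 5 divided by the number of items in the category (which reproduces the quoted values of 2.50 for two items and 1.25 for four), and the IQR is computed over the adjusted importance ratings of the individual evaluators using the inclusive quartile method of `statistics.quantiles`. All function names and ratings are hypothetical.

```python
from statistics import quantiles

SCALE_MAX = 5  # assumption: top of the five-point importance scale


def expected_weight(n_items):
    """Weight each item would get if all items contributed equally."""
    return SCALE_MAX / n_items


def adjusted_ratings(ratings, n_items):
    """Apply [weight - expected weight] to each evaluator's rating."""
    baseline = expected_weight(n_items)
    return [r - baseline for r in ratings]


def iqr(values):
    """Inter-quartile range (third quartile minus first quartile)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1


def high_consensus(ratings, n_items):
    """True when the IQR of the adjusted ratings is below 1."""
    return iqr(adjusted_ratings(ratings, n_items)) < 1


# Hypothetical ratings by five evaluators for one item in a two-item category.
ratings = [4, 4, 4, 5, 4]
print(expected_weight(2))           # 2.5
print(high_consensus(ratings, 2))   # IQR is 0, so consensus is high
```

Subtracting a constant does not change the spread of the ratings, so under this reading the adjustment affects the weights but not the IQR itself.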
In order to test the reliability, two raters were asked to evaluate six walkability assessment tools (three audits and three questionnaires). The users of walking assessment tools are mainly from the domains of urban planning, transportation planning, and public health. Thus, two raters were selected, namely an urban and transport planner and a public health expert. The main goals of this step were: (1) to verify the inter-rater degree of agreement for each of the four levels of answers employed in the MQAI; and (2) to assess the inter-rater degree of agreement for each of the six tools. The inter-rater reliability was tested by using Kappa, which is a statistical measure of inter-rater reliability.
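For readers unfamiliar with the statistic, the sketch below computes Cohen's weighted Kappa for two raters over the four ordinal response levels (0 to 3). The linear weighting scheme is an assumption made for illustration; the text does not specify which weights were used. The function name `weighted_kappa` and the ratings are hypothetical.

```python
def weighted_kappa(rater_a, rater_b, n_categories=4):
    """Cohen's weighted Kappa with linear disagreement weights.

    Ratings are integers in 0..n_categories-1 (here the MQAI points 0 to 3).
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must score the same non-empty set of items")
    n = len(rater_a)
    k = n_categories

    # Observed joint proportions of (rater A level, rater B level).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n

    # Marginal proportions of each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Weighted observed and chance-expected disagreement.
    obs_dis = 0.0
    exp_dis = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)  # linear weight (assumption)
            obs_dis += w * observed[i][j]
            exp_dis += w * row[i] * col[j]

    if exp_dis == 0:
        raise ValueError("Kappa is undefined when both raters use one level")
    return 1.0 - obs_dis / exp_dis


# Hypothetical points given by two raters to six items (illustration only).
planner = [0, 1, 2, 3, 2, 1]
health_expert = [0, 1, 3, 3, 2, 0]
print(round(weighted_kappa(planner, health_expert), 2))
```

With linear weights, a disagreement between adjacent levels (for example 'partial' versus 'complete') is penalized less than a disagreement between 'no' and 'complete'.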

Results
Based on the IQR definition, an IQR of less than one indicates a high level of agreement, and an IQR of more than one indicates a low level of agreement. Thus, sixteen factors exhibited high levels while five factors exhibited moderate levels of consensus (Table 5). All items in the 'sidewalk', 'land use and destinations', and 'road attributes' categories exhibited high levels of agreement, while the items with moderate levels belonged to the 'street facilities' category. The final version of the MQAI was tested for criterion validity and reliability. As shown in Table 6, the total baseline agreement level between the evaluators and the team leader was 82%. The lowest agreement belonged to the sidewalk scale (75%). Agreement values for the other three scales were 79% for land use and destinations, 83% for street facilities, and 88% for road attributes. The Spearman correlations were 0.78 for the regular tool and 0.92 for the poor tool. The average MQAI% was 38% for the poor tool and 43% for the regular tool. The difference in MQAI% between the 'poor' and 'regular' tools was statistically non-significant at the 5% level. Additionally, there was no statistically significant difference in MQAI% between the tools assessed by the research team leader and the tools assessed by the individuals (p-value = 0.3 for the poor tool; p-value = 0.1 for the regular tool). Table 7 reveals the inter-rater agreement level for each of the four levels of response employed in the MQAI. Table 8 presents the reliability data by appraisal type and includes the number of questions evaluated within each component. With respect to the questions assessed in audits and questionnaires, averages of 69.84% and 73%, respectively, corresponded to a high agreement (≥75%) between the raters. The aggregated results of the inter-rater agreement level for each of the tool types are shown in Table 9. 
The weighted Kappa values for the four scales varied based on the tool type, and the K values for the audits were in the moderate to good range. Concerning the questionnaires, the K values ranged from fair/moderate to very good. The overall inter-rater reliability for the audits and questionnaires was 70% and 73%, respectively.

Discussions
The baseline agreement level for the overall instrument was 82% for persons with a background in urban planning and urban design with respect to the team leader. The land use and destinations, street facilities, and road attributes scales had agreement levels in the range of 79-88%. The sidewalk scale had the lowest value, that is, a 75% agreement level. The main reason for this is that this scale includes only two items; therefore, missing an item has a larger influence on the agreement level.
The improvement of the final version of the MQAI compared to the pilot version was demonstrated by testing the final version with two raters with planning and public health backgrounds. This improvement might be related to the added instructions and item details. A simple check on the reliability results shows that the Kappa value differs for the same scale between audits and questionnaires. For example, the sidewalk scale attained a lower K value in audits than in questionnaires. A possible explanation for this is the inherent difference between assessment in audits and in questionnaires, in addition to a lack of knowledge in a specific field of expertise. During testing of the MQAI instrument, the team noted that raters had difficulty choosing the 'partial' response. However, the raters did not experience any difficulty in assigning the other response categories. The interpretation skill of raters was further significantly improved through in-depth training and supervision.
The results also showed that 'poor' tools are easier to assess than 'regular' tools. The total score given to a 'poor' tool by the raters was very similar to that of the team leader. Based on the classification of the tools proposed in this study, a 'poor' tool is one that considers only a few urban design factors. Hence, the raters could easily score items as 'no' or 'simple'.
Both practitioners and academic researchers can employ the MQAI to select the most suitable walkability assessment tool. Walkability assessment tools help decision-makers to identify shortcomings in living environments. Decision-makers then decide on improvement strategies for a living environment with undesirable walking conditions. These strategies may include financial and cultural aspects, which may impact the everyday life of the residents. It is vital that sufficient investment be allocated to an area with inadequate walking conditions. A better walking infrastructure encourages people to walk and, in turn, increases the overall walking level of residents in a neighborhood. Thus, choosing a suitable walkability assessment tool that assesses the walking environment accurately is of great importance. Moreover, this choice can indirectly affect plans for improving the walking conditions in a neighborhood. The employment of the MQAI enables practitioners to (1) classify the walkability assessment tools, (2) select the most suitable one, and finally (3) identify walkability shortcomings within neighborhoods using the selected tool.
Researchers in academia can also benefit from the MQAI. Researchers in the domains of urban planning, transport planning, and public health need a comprehensive tool for assessing the walkability condition in a certain area and linking this condition with the overall walking level in that area. Typically, this relationship is assessed using traditional statistical methods. However, the abundance of walkability assessment tools, in the form of both audits and questionnaires, makes it challenging for these researchers to pick the most appropriate one, which can truly reflect the walking condition within a certain area. Thus, the MQAI can help them choose the most comprehensive tool that can capture the details of the walking environment and find the associations between this environment and overall walking and physical activity levels.

Conclusions
In recent decades, walkability assessment tools have been developed to assess the suitability of a walking environment for pedestrians. These tools use numerous environmental factors to assess built environments. To date, several reviews have been published on walkability assessment tools, and they highlighted challenges faced by extant studies [11,69–71]. However, a system for assessing walkability assessment tools based on walking needs has been lacking. The present study developed and tested an instrument to appraise walkability assessment tools based on walking needs. The main goal of the proposed instrument is to assess whether the walkability assessment tools consider the walking needs and urban design-related factors. This tool can serve as a decision-making system for researchers and practitioners to select the most appropriate assessment tool for evaluating the walking environment.
The present instrument can be used for meta-analyses and systematic reviews. This instrument is easy to use for planners and public health experts. The MQAI can aid practitioners and researchers in selecting the tool to assess pedestrian environments at both the neighborhood and street scale based on their priorities. The instrument considers the majority of the walking needs to assess the existing tools. However, planners can select the required items based on their priorities and adjust the proposed MQAI based on their selected items. Additionally, the instrument can serve as a base to develop future walkability assessment tools. The MQAI can be utilized to decide whether the design of a new walkability assessment tool adheres to the walking needs of diverse pedestrian groups. The reliability and validity of the MQAI were not tested on virtual assessment tools. However, to keep abreast of new technological advancements, this tool can also be employed to assess the recently released virtual assessment tools. Additionally, the methodology employed in this study can be followed to develop similar tools for assessing virtual walkability/bikeability tools. The MQAI can also inspire future decision-making tools to select the best assessment tools that involve physical environment indicators.

• If you need to assess the tool many times to judge, there is a high probability that the response is 'Partial'.
• A different approach to determine whether the response is 'Partial' or 'Complete' is to check the scale below. As presented, the endpoints of the scale are marked with 'Complete' and 'No'. Hence, the 'Partial' response is the whole space between 'Simple' and 'Complete'.

Mathematical Calculation
Mathematically, the MQAI% score is defined as follows:

$$\mathrm{MQAI\%} = \frac{\sum_{i} P_i \times W_i}{12} \times 100$$

Here, MQAI% = strength of the tool of interest to assess the environmental factors, P_i = point given by the rater to the indicator of interest, W_i = relative weight of each indicator, 12 = total achievable points by each tool (12 = ∑ 3 ×
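A minimal sketch of this calculation is given below. It assumes that MQAI% is the weighted sum of the item points divided by the 12 achievable points and expressed as a percentage; this is a reading of the symbol definitions above, since the equation itself is incomplete in the text. The function name `mqai_percent`, the points, and the weights are hypothetical.

```python
TOTAL_ACHIEVABLE_POINTS = 12  # total achievable points stated in the text


def mqai_percent(points, weights):
    """Weighted sum of item points as a percentage of the achievable total.

    points  -- point given by the rater to each indicator (0 to 3)
    weights -- relative weight of each indicator (hypothetical values here)
    """
    if len(points) != len(weights):
        raise ValueError("each indicator needs one point and one weight")
    weighted_sum = sum(p * w for p, w in zip(points, weights))
    return 100.0 * weighted_sum / TOTAL_ACHIEVABLE_POINTS


# Hypothetical example: four indicators with unit weights, so that the
# maximum weighted sum is 3 * 4 = 12.
points = [3, 2, 1, 0]
weights = [1.0, 1.0, 1.0, 1.0]
print(mqai_percent(points, weights))  # (3 + 2 + 1 + 0) / 12 * 100 = 50.0
```

Under the interpretation bands quoted in the Methods, a value of 50 would fall in the 'regular' range (40 ≤ MQAI% < 60).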

Answer | Description | Point
No assessment | The tool does not assess the condition of the path | 0
Simple assessment | The tool simply assesses the path condition * in the study area | 1
Partial assessment | The tool assesses the path condition, and one of the following path condition issues: (1) material ** used; and (2) slope *** | 2
Complete assessment | The tool assesses the path condition and all of the following path condition issues: (1) material used; and (2) slope | 3

* Poor (several bumps, cracks, holes, and weeds), moderate (a few bumps, cracks, holes, and weeds), good (very few bumps, cracks, holes, and weeds), under repair.
** Flat segmented concrete slabs, paving stones, rustic natural stones, and Portuguese mosaic, slippery material (smooth ceramic tiles), rough material (hydraulic tiles, interlocked blocks, flattened concrete), regular, firm, antiskid, and anti-vibration material (high strength paving).
*** Flat or gentle, moderate slope, steep slope.

Land Use and Destinations
3. How does the tool assess the mixture of land use?

Answer | Description | Point
No assessment | The tool does not assess the mixture of land uses and activities | 0
Simple assessment | The tool simply assesses the availability of various land uses and activities in the study area | 1
Partial assessment | The tool assesses the availability of land uses and activities in the study area and determines the number of each activity such as residential, retail/commercial, office, public, and/or industrial | 2
Complete assessment | The tool assesses the availability of land uses and activities in the study area and determines the number of each activity and overall desirable land use planning | 3

4. How does the tool assess the undesirable land uses (e.g., dilapidated buildings, abandoned buildings, and rights of way of utilities and rail)?

Answer | Description | Point
No assessment | The tool does not assess the undesirable land uses | 0
Simple assessment | The tool simply assesses the availability of various undesirable land uses in the study area | 1
Partial assessment | The tool assesses the availability of undesirable land uses in the study area and determines the number of each undesirable land use | 2
Complete assessment | The tool assesses the availability of undesirable land uses in the study area and determines the number of each undesirable land use and the overall undesirable land use planning | 3

5. How does the tool assess the destinations (e.g., local facilities, parks, public transport, services, shops, vehicle parking facilities, and bike parking facilities)?

Answer | Description | Point
No assessment | The tool does not assess the destinations

Complete assessment | The tool assesses the availability of landscape and trees and two to three of the following issues in the landscape and trees: (1) vertical clearance of tree branches; (2) placement of trees in the furnishing zone; and (3) distance between trees

11. How does the tool assess buffers?

Answer | Description | Point
No assessment | The tool does not assess the buffers | 0
Simple assessment | The tool simply assesses the availability of buffers and barriers along the street such as on-street parking | 1
Partial assessment | The tool assesses the availability of buffers and barriers along the street such as on-street parking and type of buffer(s) OR width of a buffer | 2
Complete assessment | The tool assesses the availability of buffers and barriers along the street such as on-street parking and type of buffer(s) AND width of a buffer | 3

12. How does the tool assess benches and sitting areas?

Answer | Description | Point
No assessment | The tool does not assess the benches and sitting areas | 0
Simple assessment | The tool simply assesses the availability of benches and sitting areas | 1
Partial assessment | The tool assesses the availability of benches and sitting areas, and one to two of the following issues in the benches and sitting areas: (1) placement of benches in the furnishing zone; (2) distance from the curb; (3) space for parking a wheelchair or stroller; and (4) distance between the benches | 2
Complete assessment | The tool assesses the availability of benches and sitting areas, and three to four of the following issues in the benches and sitting areas: (1) placement of benches in the furnishing zone; (2) distance from the curb; (3) space for parking a wheelchair or stroller; and (4) distance between the benches | 3

13. How does the tool assess surveillance?

Answer | Description | Point
No assessment | The tool does not assess the surveillance | 0
Simple assessment | The tool simply assesses the availability of surveillance in the study area | 1
Partial assessment | The tool assesses the surveillance and active * OR passive ** surveillance | 2
Complete assessment | The tool assesses the surveillance and active AND passive surveillance | 3

* CCTV and security patrols.
** Active frontages, façade solid-void ratio, windows, verandas, and gardens.

14. How does the tool assess the items related to the disabled?

Answer | Description | Point
No assessment | The tool does not assess the items related to the disabled * | 0
Simple assessment | The tool assesses one of the following issues in the items related to disabled individuals: (1) accessible drinking fountain; (2) accessible toilet; (3) tactile pavement; (4) curb cut; (5) accessible signage and signals; and (6) elevator next to the sky-bridge | 1
Partial assessment | The tool assesses two to four of the following issues in the items related to disabled individuals: (1) accessible drinking fountain; (2) accessible toilet; (3) tactile pavement; (4) curb cut; (5) accessible signage and signals; and (6) elevator next to the sky-bridge | 2
Complete assessment | The tool assesses five to six of the following issues in the items related to disabled individuals: (1) accessible drinking fountain; (2) accessible toilet; (3) tactile pavement; (4) curb cut; (5) accessible signage and signals; and (6) elevator next to the sky-bridge | 3

* Accessible drinking fountain, accessible toilet, tactile pavement, curb cut, accessible signage and signals, and elevator next to the sky-bridge.

Answer | Description | Point
No assessment | The tool does not assess the streetscape characters * | 0
Simple assessment | The tool assesses the availability of one of the following streetscape characters: (1) architectural elements; (2)

Answer | Description | Point
No assessment | The tool does not assess driveways | 0
Simple assessment | The tool only assesses the availability of the driveways | 1
Partial assessment | The tool assesses the availability of the driveways and driveway width * OR availability of warning facilities ** | 2
Complete assessment | The tool assesses the availability of the driveways and driveway width AND availability of warning facilities | 3

* More than a garage, equal to a garage, and less than a garage.
** Special paving, signs, auditory warning, and mirrors.

Answer | Description | Point
No assessment | The tool does not assess the transit points | 0
Simple assessment | The tool simply assesses the availability of transit points along the path | 1
Partial assessment | The tool assesses the proximity to transit points * OR accessibility of transit stations ** in the study area | 2
Complete assessment | The tool assesses the proximity to transit points AND accessibility of transit stations in the study area | 3

* The proximity of transit stations to popular landmarks such as squares, towers, and malls.
** Connectivity and continuity of walkways to transit stations.

Answer | Description | Point
No assessment | The tool does not assess traffic calming features * | 0
Simple assessment | The tool assesses the availability of one of the following traffic calming features: (1) roundabouts; (2)

How does the tool assess road attributes?

Answer | Description | Point
No assessment | The tool does not assess road attributes * | 0
Simple assessment | - | 1
Partial assessment | The tool assesses the number of lanes OR street width | 2
Complete assessment | The tool assesses the number of lanes AND street width | 3

* Number of lanes and street width.