Developing participatory analytics techniques to inform the prioritization of cycling infrastructure

This research investigates an approach that builds on previous investigations of public participation geographic information systems (PPGIS; Sieber 2006). These systems and technologies are currently used in research and practice to engage cyclists and collect information to inform cycling infrastructure prioritisation. In Australia, PPGIS have had mixed uptake by local councils, which typically obtain feedback from citizens on predetermined cycle routes already planned by transport planners. However, such tools do not capture citizens' priority routes, nor do they provide metrics on the likelihood that more people will cycle if a particular new piece of bicycle infrastructure is incorporated into the existing cycling network. It is in this context that this research endeavours to make a new contribution to the development and evaluation of a participatory planning support system.
The research proposes an approach for collecting data about cyclists' (and potential cyclists') priorities to help inform these decisions through a participatory analytics approach.
The study seeks to answer the following two research questions (RQs).
RQ1: What are the existing passive and active participation mechanisms for collecting data on cycling behaviour and preferences?
RQ2: How can these data be integrated within a participatory planning support system to better understand and communicate priorities for existing and potential future cyclists?
These research questions will be examined in the context of the city of Sydney, Australia. The remainder of the paper is structured as follows. First, it will describe the background to the research in terms of data analysis approaches, data collection and digital approaches to cycling participation. It will then outline the method of the three stages of this research used to investigate these questions: a web survey, tool development and interviews. Following this, it will discuss the results, the relevance of the findings and future implications for research and practice.

Data analysis approaches
The process of data analysis is commonly perceived to be a highly skilled activity that requires training in specific tools, as well as a high level of literacy and numeracy to do well. It can be time-consuming and tedious, and mistakes can easily be made. Consequently, the process of data analysis is often excluded from public participation processes. Further, it can be argued that assigning analysis tasks to communities may be an inefficient use of a limited resource (particularly if researchers, experts and practitioners are already trained to take on these tasks; Coghlan and Brydon-Miller 2014). However, rather than adopting a deficit model (i.e. considering community members to lack skills or knowledge in particular areas), many researchers are finding ways to build on the wealth of community knowledge by engaging community members in accessible analysis opportunities (Coghlan and Brydon-Miller 2014). Such avenues have been explored by Filonik et al. (2014, p. 7), who use the term 'participatory data analytics' to describe 'development of interfaces to support collaborative, community-led inquiry into data'. Furthermore, the field of visual analytics specifically explores how visual interactive interfaces can support analytical reasoning (Thomas and Cook 2006). In this context, the purposeful and opportunistic use of data for better city planning should be considered.

Data collection
In many ways, citizens may be contributing to data analysis processes for city planning indirectly. In Australia, and in many parts of the world, application services such as Strava (an application used to record physical activity), TomTom, and mobile phone providers collect user data and re-package it for various urban planning purposes, such as for use by transportation planning authorities. The movements of users are distilled into metrics which help inform infrastructure investment scenarios. A potential issue with this form of participation has been described as a 'big data divide' (Andrejevic 2014). This is characterised by a lack of access to individuals' own passively-collected data, a lack of the analytical capabilities needed to understand how other organisations may be using it, and a lack of means to validate the correctness of the assumptions and insights generated by its use (Andrejevic 2014). On the other hand, purposeful data collection, such as national censuses and household travel surveys, is cross-sectional but potentially more likely to be aligned directly to citizens' needs. However, these methods do not capture the same intricate details about the network's use on a day-to-day basis or in response to certain events. As such, the participatory analytics approach described in this paper involves engaging end-users in interacting with organic (or 'passive') and purposeful (or 'active') data collected in a variety of ways, and in designing and evaluating the digital interfaces within a cycling context in the city of Sydney, Australia.

Digital approaches to cycling participation
As found by Falco and Kleinhans (2018), academic literature has only provided a limited overview of digital participation platforms (DPPs) to date. This literature covers only a few demographic and spatial contexts and does not account for the significant volume of DPPs that have been used and developed in planning practice globally over the past decade. There has been a significant uplift in the technology available to perform digital engagement exercises. The authors highlight 113 active DPPs, which they identify, analyse, and classify within a citizen-government relationship typology. This chapter focuses on DPPs related to cycling.
One term related to DPPs, specifically focusing on geographic data collection, is public participation geographic information systems (or PPGIS), which were originally developed and applied in the context of cities 20 years ago as a platform to better connect citizens to the planning process (Carver et al. 2001). These are geographic information systems that enable local knowledge production through the creation of spatial data by local and non-government groups. In the context of bicycle planning, one popular approach to PPGIS for collecting public opinion is a 'pinboard' approach. This approach allows users to drop a pin at an explicit location and provide a combination of tags (for example: like/dislike, safe/unsafe, attractive/unattractive) and comments.
One influential example of this for cycling has been Shareabouts (2012). The earlier examples of PPGIS generally do not provide a clear pathway for how and where this citizen data is transformed into outcomes of a later analytical exercise. In addition to this, when participants provide feedback, they are exposed to limited contextual data which may be useful in assisting them in making an informed comment.
On the other hand, the planning support systems (PSS) described after this generally do not integrate direct citizen feedback. Rather, these tools rely on existing survey data, passive data, and network characteristics. As well as this, many of the available PPGIS or PSS tools focus on one particular aspect of the network, such as one type of cycling infrastructure or one type of cycling incident (e.g., safety).
As such, this research examines a potential digital intervention that combines both active and passive citizen data in a single interface. This interface is intended to be used by practitioners as a planning support system and by citizens as a participatory tool to view other citizens' voices and explore the effect of alternative investment priorities on the network.

Method
The method for this case study follows the workflow outlined in this section. The case study is designed to investigate the active and passive data traces that are currently used by cyclists and those potentially interested in cycling. These are then combined into a technical tool. This is then shown to experts and those interested in giving more detailed input. Conceptually, this process then reiterates until multiple aspects of participation (active long, active short and passive) are achieved. As such, several participation outlets are explored to design the final tool and inform local planning decisions.
This research uses a mixture of methods to allow a combination of numerical measurements and in-depth exploration. RQ1 relates to existing passive and active participation mechanisms, and to priorities for existing and potential cyclists in Sydney. Exploring this required both quantitative insights, to rank and order these, and qualitative insights, to explore more overarching themes and issues. As such, a survey instrument was used to collect this data. In order to understand RQ2, and whether participation inputs have been used in a meaningful way, more detailed qualitative feedback was required. This was obtained through more in-depth, one-on-one interviews.
Each stage of the research process analyses the previous stage: the digital tool analyses the participant input, and the interviews analyse the digital tool. Following this, the interviews and overall research are analysed.

RQ1: Existing passive and active participation mechanisms and priorities
The first stage of this study was to conduct a web survey to capture elements which may influence participation in cycle planning in Sydney. This survey was created using the ArcGIS Online Survey123 platform, which allows spatial data to be captured alongside traditional survey questions. A snowball sampling method was used: the researcher sent the survey link to other researchers, transport planning practitioners, cycling advocacy groups, and local and state government workers to distribute to potential participants. The survey began on May 11th, 2020 and finished on June 14th, 2020, when a sample of 280 was achieved. The first 250 respondents participated within the first two weeks. The full survey can be found in Appendix 1. Initial background questions of survey participants included the following.
• Cycling previous experience, trip purposes, trip frequency, and physical and navigational confidence.
• General engagement in cycle planning (through apps, surveys, meetings) in both active and passive terms.
• Satisfaction with current network and participation mechanisms.
• Preferred engagement format (passive or active).
• Observations of changes to cycling behaviour as a result of COVID-19 (expanded separately to this chapter; see Anonymous, 2020).
Further to these background questions, a participation exercise was also embedded in the survey. This involved two components. Firstly, there was a prioritisation exercise, in which participants ranked the importance of the following factors:
2) Creating a continuous network that doesn't stop and start
3) Designing safer environments in areas where crashes and collisions have occurred
4) Building and upgrading infrastructure where the most people cycle
5) Building and upgrading infrastructure to specifically support commuter cyclists
6) Building infrastructure where the gradient or slope is easy to ride
7) Building infrastructure in critical locations that will encourage new people to ride bicycles
8) Building infrastructure close to train stations
This was assessed using a Likert scale for each criterion, as well as a question asking for the single most important factor to focus on. The second component of the participation exercise was to ask participants to draw their top cycling improvement for Sydney. Here, participants were asked to draw one specific shape with some accompanying text (see Figure 2). The only restriction was that only one drawing was permitted; part of the research was to analyse how people responded to this and the various forms by which participants commented.
The criteria included the following:
• Priority should be given to areas with limited off-street cycleways.
• Priority should be given to areas that connect the existing network of off-street cycleways.
• Priority should be given to improving the safety of areas where cyclists have crashed.
• Priority should be given to areas where the highest volumes of cyclists use the network.
• Priority should be given to corridors that are currently used for commuting purposes.
• Priority should be given to areas within a 10-minute cycle of a train station.
• Priority should be given to areas with a cyclable gradient.
• Priority should be given to areas which will potentially convert short car trips to bike trips.
• Priority should be given to areas where citizens voted or drew ideas for improvements.
Following the creation of these nine spatial layers, the layers were collated into hexagonal units for the city of Sydney. Each hexagonal unit received a score of 0, 1, or 2, depending on how well they fit the above criteria.
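The scoring step above can be illustrated with a minimal sketch. It assumes that each hexagonal unit receives a score of 0, 1, or 2 per criterion and that the composite index is the plain sum of those scores; the paper does not state the aggregation rule, so this is an assumption. The criterion identifiers and the functions `composite_score` and `rank_hexes` are hypothetical names introduced for illustration only.

```python
# Hypothetical identifiers for the nine criteria listed above
# (the paper gives the criteria in prose, not as named layers).
CRITERIA = [
    "limited_off_street", "network_connection", "crash_safety",
    "cyclist_volume", "commuter_corridor", "near_station",
    "cyclable_gradient", "short_car_trips", "citizen_votes",
]


def composite_score(hex_scores):
    """Sum the per-criterion scores (each 0, 1 or 2) for one hexagonal unit.

    Assumption: the composite index is an unweighted sum, and a criterion
    with no recorded score counts as 0.
    """
    total = 0
    for name in CRITERIA:
        score = hex_scores.get(name, 0)
        if score not in (0, 1, 2):
            raise ValueError(f"{name}: score must be 0, 1 or 2, got {score}")
        total += score
    return total


def rank_hexes(scores_by_hex):
    """Return hexagon ids ordered from highest to lowest composite score."""
    return sorted(
        scores_by_hex,
        key=lambda hex_id: composite_score(scores_by_hex[hex_id]),
        reverse=True,
    )


# Made-up example data for two hexagonal units.
example = {
    "hex_a": {"crash_safety": 2, "near_station": 1},
    "hex_b": {"citizen_votes": 2, "cyclist_volume": 2, "cyclable_gradient": 1},
}
ranking = rank_hexes(example)
```

Under these assumptions the maximum possible composite score is 18 (nine criteria, each scoring at most 2).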

Dashboard development
An interactive tool was developed to communicate the results of the participatory prioritisation index. This is described in Figure 4. The tool was developed using Bokeh (bokeh.org), a Python visualisation library.
The tool allows users to do the following:
• Select individual councils and view a heatmap of prioritisation in the area.
• Toggle each prioritisation criterion to be 'more important' or 'less important'.
• Investigate which road areas score most highly on a customised prioritisation index.
• Investigate the impact of citizen votes on the prioritisation index.
• View scorecards of how particular hexagonal units perform on the index.
• View details on the metric calculations.
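One way to picture the 'more important' and 'less important' toggles is as multipliers applied to each criterion score before summing. The sketch below is only an illustration of that idea: the multiplier values, the `WEIGHTS` table, and the `weighted_index` function are assumptions, not details taken from the tool itself.

```python
# Hypothetical multipliers for each toggle state; the actual values used
# by the tool are not given in the paper.
WEIGHTS = {"more": 2.0, "neutral": 1.0, "less": 0.5}


def weighted_index(hex_scores, toggles):
    """Customised prioritisation index for one hexagonal unit.

    hex_scores: mapping of criterion name -> score (0, 1 or 2).
    toggles:    mapping of criterion name -> "more", "neutral" or "less";
                criteria that are not listed are treated as "neutral".
    """
    return sum(
        score * WEIGHTS[toggles.get(name, "neutral")]
        for name, score in hex_scores.items()
    )


# Made-up example: a user who values safety more and station access less.
scores = {"crash_safety": 2, "near_station": 2, "cyclable_gradient": 1}
toggles = {"crash_safety": "more", "near_station": "less"}
custom = weighted_index(scores, toggles)   # 2*2.0 + 2*0.5 + 1*1.0
default = weighted_index(scores, {})       # all criteria neutral
```

Keeping the toggle states separate from the stored scores means the index can be recomputed for any combination of toggles without touching the underlying layers.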
One of the unique features of this tool was the ability to view the most relevant citizen comments for each hexagon. This was done by selecting the comment relating to the participant-drawn shape that had the highest proportion of its total area fitting within the hexagon. As such, a prioritisation process was also embedded in comments, allowing the most relevant comments about specific areas to be shown in those areas, and more general comments to be shown in all others around them.
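The comment-matching rule described above can be sketched as follows. To stay self-contained, the sketch uses axis-aligned rectangles as stand-ins for both the hexagonal cells and the participant-drawn shapes; a real implementation would compute intersections on the actual geometries with a GIS library. The `Rect` type, the function names, and the example comments are all hypothetical.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle used as a simplified stand-in for a polygon."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def area(r):
    return max(0.0, r.xmax - r.xmin) * max(0.0, r.ymax - r.ymin)


def overlap_area(a, b):
    """Area of the intersection of two rectangles (0.0 if they are disjoint)."""
    width = min(a.xmax, b.xmax) - max(a.xmin, b.xmin)
    height = min(a.ymax, b.ymax) - max(a.ymin, b.ymin)
    return width * height if width > 0 and height > 0 else 0.0


def most_relevant_comment(cell, comments):
    """Pick the comment whose shape has the largest share of its own area in the cell.

    comments: list of (text, shape) pairs. Returns None if no shape overlaps.
    """
    best_text, best_share = None, 0.0
    for text, shape in comments:
        total = area(shape)
        if total == 0:
            continue  # degenerate shapes cannot be compared by area share
        share = overlap_area(shape, cell) / total
        if share > best_share:
            best_text, best_share = text, share
    return best_text


# Made-up comments: one very local shape, one drawn across the whole city.
comments = [
    ("Fix this intersection", Rect(4, 4, 6, 6)),
    ("Better links across the whole city", Rect(-100, -100, 100, 100)),
]
local_pick = most_relevant_comment(Rect(0, 0, 10, 10), comments)
distant_pick = most_relevant_comment(Rect(50, 50, 60, 60), comments)
```

In this toy example the small, local shape wins in the cell it falls inside, while the city-wide shape is the only candidate elsewhere, which mirrors the behaviour described in the text.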

Semi-structured interviews
The third component of this research involved semi-structured interviews giving feedback on the first component (survey) and second component (tool) used in this study.
Theoretically, this is to understand the lengthy, deliberative input that could occur at the later ends of this process to inform reiterations of the participation process until a satisfactory process is achieved. In a practical sense, this also allowed an understanding of whether end-users considered the results of the survey to be useful, which aspects were surprising, and their thoughts on extensions and potential applications of the interactive tool.

Interview sampling
Invitations were sent to approximately 20 of the survey respondents and an additional ten people who were actively engaged in cycle planning or public participation exercises in Australia. A sample of professionals with local, state, and federal government experience in transport; private consulting experience in transport; and experience in digital public participation technology roll-out were interviewed. As well as this, there were individuals with experience in cycle advocacy, the organisation of group and social rides, and participation in local community and public hearing events. Several participants had significant amounts of anecdotal evidence of successful and failed cycling participation initiatives in the city. It is worth noting that many participants had years of experience with various combinations of these areas, so describing them as 'one category' would not do them justice. In total, 15 interviews summarised this knowledge base and provided a detailed database of thoughts on the survey and tool.

Interview structure
The interviews were held during a period of restricted public movement in Sydney, Australia, due to the COVID-19 pandemic. As such, face-to-face research was not possible; one-on-one interviews were conducted over video calls (Microsoft Teams). The interviews were semi-structured and lasted between 45 and 60 minutes.
Firstly, participants described their background, what interested them in cycling, their work experience related to the study, and any other interesting personal anecdotes related to the study. Secondly, participants were shown a series of slides describing participatory analytics, passive data, and actively collected data. Following this, they were shown survey results and asked questions about their thoughts on particular elements, as well as the open-ended question, 'What did you find interesting about this survey?' Participants then watched a 10-minute video describing the full functionality of the tool (a video is also provided to complement this research in Appendix 5). Following this, an open discussion was held with the following prompts:
• What did you think about the tool or process so far?
• What do you think is the advantage of using this tool or process, over it not being used?
• What improvements do you think could be made to the tool or process?
• What additional data could be used in this tool or process?
• Do you think the data used in this tool or process represents all cyclists?
• Do you think you could use this tool or process in your work? How?
Following the collection of interview responses, key quotes were documented and arranged into themes where they were notable or created patterns among the responses.
These themes and quotes are presented in the results section.

Results
This section describes the results of the initial web survey (n = 280) and the longer one-on-one interviews (n = 15). The full survey results can be found in Appendix 3.

Participant backgrounds
A total of 280 responses were collected in the survey period between May 11th and June 14th, 2020. The majority of responses occurred within the first week of the survey becoming available. The majority of respondents were aged 25-54, in age ranges 35-44 (30%), 45-54 (25%) and 25-34 (24%) respectively. Participants were skewed towards male (65%) in terms of gender. The majority of participants were full-time employed (74%). In terms of education, a high proportion of the sample had tertiary education at bachelor's level or above: 24% had at least an undergraduate degree and a further 61% had more specialised education beyond that (e.g., graduate certificate, master's, PhD). The geographic spread of survey participants can be found in

Cycling habits
The main travel purposes by bicycle for participants were for fun and enjoyment (88%), for exercise (80%), and commuting to and from work (76%). The main frequency of cycling was every day or close to every day (36%), or at least a few times a week (36%). A total of 78% of respondents rated themselves as 'confident' or 'very confident' physically riding a bicycle, but only 60% were confident navigating with a bicycle.

Engagement and satisfaction with cycle planning in Sydney
One of the key questions of this exercise was around engagement in bicycle infrastructure planning. The main forms of engagement in cycle planning were as follows:
• Participating in the Australian Census (64%)
• Giving feedback to a local council (55%)
• Posting on social media (55%)
• Giving feedback to a state government body (31%)
• Attending advocacy group meetings and events (20%)
When asked what proportion of their rides they logged using applications (such as Strava), there was an interesting split between almost every ride or all rides being logged (30%) and no rides being logged (33%). Furthermore, rides which were logged were mostly recreational (20%), or a mix of recreational and commuting (32%).
In terms of satisfaction with existing participation mechanisms, participants were generally not satisfied (42.5%) or very unsatisfied (11.79%) with mechanisms that were in place. A large number were also neutral (36%). Participants preferred the engagement format to be active participation. In terms of desired participation, the largest proportion wanted to participate 'actively with no time' (55%) -these participants wanted very quick ways to contribute to improvements to the network. As well as this, 31% of participants were willing to give up longer periods for active engagement. Interestingly, less than 1% wanted nothing to do with influencing the network. Furthermore, 13% of participants wanted their passive data to be primarily used as their preferred form of participation. On average, participants rated Sydney's cycling network 3.8 out of 10 when prompted.
Notably, only six out of the 280 respondents gave the network a score of eight or above.

Investment prioritisation
When asked, 'to you personally, how important are the following factors in improving the

H) Building infrastructure close to train stations

Participant comments
As well as rating priorities, participants could draw one spatial shape on a map, representing their top idea for cycling investment in Sydney. There was a median distance of 3.8 km between a respondent's postcode and the centre point at which they commented.
Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 2 December 2021
As there was a vast range of scales at which participants drew their response, filtering by area was found to be a useful way to identify areas and types of comment that were being made. Figure 7 demonstrates the top 200 comments covering the smallest area, which allowed key corridors to be highlighted. Figure 8 shows a heatmap of the comments, summed into individual hexagonal units across the city (showing five or more comments only). Figure 9 shows the results of integrating these comments into the composite index.
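The two filtering steps described above (keeping the comments with the smallest drawn area, and counting comments per hexagonal unit with a minimum threshold) can be sketched as below. The field name `area_km2`, the function names, and the example data are hypothetical; the defaults of 200 comments and five or more comments per unit follow the figures described in the text.

```python
from collections import Counter


def smallest_comments(comments, n=200):
    """Return the n comments whose drawn shapes cover the smallest area.

    comments: list of dicts with an 'area_km2' key (hypothetical field name).
    """
    return sorted(comments, key=lambda c: c["area_km2"])[:n]


def hex_comment_counts(comment_hexes, min_count=5):
    """Count how many comments touch each hexagonal unit.

    comment_hexes: one list of hexagon ids per comment. A comment is counted
    at most once per hexagon. Units below min_count are dropped, matching
    the 'five or more comments only' filter described for the heatmap.
    """
    counts = Counter(
        hex_id for hexes in comment_hexes for hex_id in set(hexes)
    )
    return {hex_id: c for hex_id, c in counts.items() if c >= min_count}


# Made-up example data.
drawn = [
    {"id": 1, "area_km2": 5.0},
    {"id": 2, "area_km2": 0.2},
    {"id": 3, "area_km2": 1.0},
]
two_smallest = [c["id"] for c in smallest_comments(drawn, n=2)]

touches = [["a", "b"]] * 5 + [["b"], ["c"]]
heat = hex_comment_counts(touches)
```

Sorting by area before truncating keeps the most spatially specific suggestions, which is what allows key corridors to stand out from city-wide comments.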

Interview results
After showing the results of the survey and interactive tool to practitioners, the following key feedback themes were identified.

Who would benefit most from using this tool?
Overwhelmingly, the majority of interviewees responded that local governments were the main user group that would benefit from using this tool and process. One

What current features do you like about the tool?
Several participants stated that they enjoyed seeing everything in the same interface. One stated, 'Love that you have combined lots of different things in one place'. Another stated that it was 'awesome to be able to put everything together like that'. One participant had 'never before seen' so many data sources in one cycling tool.
Some participants stated that it was easy to understand. For example, one Another positively highlighted feature was the comment matching system, which provides the most relevant comment for each cell alongside the score. One participant described this as 'a really effective way to combine qualitative and quantitative information'.

What can be added to the tool?
Some participants suggested including benchmarks of other cities or strategic visions. One participant wanted to know how each cell compared to 'best practice around the world, in cities such as Auckland, San Francisco and London'. Another would like to have seen integration of where the councils or government want the city to be in terms of strategic objectives, but also including proposed projects and how these match the participatory prioritisation work.
Another key suggestion articulated was to enable citizens to see the full array of
In order to improve the tool, several participants suggested adding wider economic benefits, such as congestion reduction, air pollution reduction, and population health improvements, into the costing module. Many interviewees also mentioned school drop-offs and suggested considering additional prioritisation criteria that account for encouraging children to cycle to school. As well as this, many considered access to schools as more important than some included criteria; several participants considered access to stations less important than access to schools.
Additional limitations included by several participants were that although there is a 'propensity to cycle' metric, the survey and tool do not capture all would-be cyclists. Several highlighted the need to deliberately find and seek opinions of those who do not and will never cycle, as well as those who would consider cycling.

Other insights into participation
Several participants explicitly showed their agreement with the concept of the participatory analytics exercise, one stating that 'the analysis has even more legitimacy because it has been through a participatory process'. Some were critical of existing participation mechanisms, one stating that 'a lot of people are not aware of things until once they start digging holes in the ground'.
One participant responded that it is 'a lot of work to build a successful cycling participation exercise -partnership is needed amongst government agencies, advocacy, or user groups and other parties such as UberEATS [who can send internal messages to delivery partners]'.
A respondent also stated that 'government adoption is a barrier to participation exercises. These digital approaches are still quite novel and challenge the role of the expert. Experts are better at solutions; however, the community is much better at identifying problems'. One participant suggested using virtual reality headsets, saying that many more non-cyclists engaged when using this participation mechanism than would engage otherwise. Another participant noted the effectiveness of the 'Shared Spaces' web map in New South Wales, which allowed them to see other participant comments. As a consultant, this participant could download the web map as a data set to use in their own transport planning work.

Discussion
This research investigated two key questions. Firstly, what are the existing passive and active participation mechanisms and priorities for cyclists and potential cyclists in Sydney? Secondly, how can digital tools that consolidate these participation inputs in a meaningful way to both citizens and practitioners be designed and evaluated? There were three stages to the research: an initial survey with 280 respondents, the creation of a prioritisation index and a tool integrating the responses, and feedback through 15 in-depth interviews about the previous two stages.
The novel contributions of this work can be seen as an extension of previous works done in PPGIS and PSS introduced in the initial sections of this paper, including a range of passive and active data sources in a single interface that have been given additional contextual information from survey data and in-depth follow-up interviews. It also provides applied research with specific, real-world generated information highlighting key desires and corridors from citizens, as well as from the multi-criteria heat map, for the specific geographic area of Sydney, Australia.
In answering the first aspect of the research question, it was found that the process was successful in enabling participants to identify key areas for improvement in the city. These participants identified very clear priorities of off-street and separated cycleways, as well as creating a continuous network. In general, the preference for separated cycleways related strongly to research identified in the initial review (Aldred et al. 2017). Participants also articulated strong preferences for additional considerations of promoting cycling among younger, school-aged populations rather than more 'strategic' objectives such as access to train stations. A clear preference for active participation mechanisms (86%) was articulated, as opposed to a reliance on existing data and passive data. Though many of the survey participants took part in the Census and other participation mechanisms, participation was surprisingly low across many of the mechanisms, which did not match the citizens' stated preferences for active engagement. In particular, given the reliance on official data sources, such as the Census and travel surveys, for calibrating the existing body of research on cycling PSS, it was surprising to find that only 64% of participants stated that they responded to the Census. This has implications for how this data is used, as well as for the importance of a data fusion exercise as performed in this research.
Given the preference for active participation shown by participants, it was unsurprising that the results revealed a general dissatisfaction with existing participation mechanisms, highlighting that there is a desire for more active processes to be employed by practitioners. In terms of the forms of passive data being used, such as logging rides through mobile applications, just over one third of the sample population did not log their rides at all. Those that did log their rides showed a skew toward recreational purposes. This presents further challenges for the use of passive data and, again, highlights the need for a mixture of data sources to be used.
In answering the second aspect of the research question, it was found that the bicycle PSS developed was successful in consolidating participation inputs in a meaningful way to citizens and practitioners. This was evidenced through several factors. Firstly, it was found that drawing explicit spatial shapes (rather than points on a map) was an effective way to highlight key suggestions which fed into a comment-weighting system in the prioritisation index. This allowed strong visuals as to the locations of the comments, as well as assigning relevant comments to particular hexagonal cells to assist with diagnosis of issues highlighted through the index.
Secondly, the prioritisation index was well understood by interview participants and seen as useful, particularly among local government users and for both internal and external advocacy. Through interviews, participants also, again, suggested features such as adding in access to schools, adding in the ability to vote again after the index has been formulated, and measuring against international best practice as well as strategic, government-led benchmarks.
The simple combination of multiple data and analytical layers, including the framing of active and passive participation, was seen by some participants as an important step in increasing the legitimacy of data-driven engagement exercises and the participatory prioritisation process. However, a key limitation was the lack of engagement from certain groups. Non-cyclists, potential cyclists, and existing cyclists from other groups (such as recreational cyclists, children who cycle, or food delivery drivers/couriers) were not well recorded through either active or passive engagement exercises in this study. These should be key additions in future research, which should discover how to include and engage those that would not have responded to the survey, been a data point in the study, or attended an interview. These points should be considered by government and researchers as they design and use such tools.
Future research should also work to better understand socio-technical challenges for local councils, the proposed most likely end users, and others in moving beyond PPGIS approaches into integrating these with bicycle PSS approaches similar to this study. This would investigate the bottlenecks in the adoption of such participatory-analytics-driven PSS tools.

Conclusion
As many cities around the world grapple with the challenge of improving active transport infrastructure, there is an increasing need for both participatory and evidence-based planning tools which can be used to retrofit our cities with better bicycle infrastructure. In this paper, a bicycle infrastructure planning support tool is presented which takes into account both passive and active citizen engagement. The research builds upon the body of work in PPGIS and PSS to develop and evaluate an interactive tool which aims to prioritise future bicycle infrastructure as directly informed by the users of this infrastructure, the local community.
It is also worth reflecting that, during this research, there was an exponential increase in the number of publicly available applications using human movement and activity data. This was supercharged by the need for individuals and authorities to understand movement and activity data in the context of the COVID-19 pandemic. Large technology companies are leading the way (for example, Facebook, SafeGraph, Cuebiq, Google, and Mapbox) by sharing passive mobility data to help assist with planning during this crisis. As such, at a time when passive data is being increasingly shared with the public and government and behaviours are being assumed, it could not be more important to emphasise the integration of active participation within this, to create a balanced approach and to match the desires of the participants in this study.
High-quality cycling infrastructure can be a significant investment, and constrained government budgets must balance competing priorities. Thus, investment decisions must be guided by sound analysis and transparent decision support tools. Effective citizen engagement improves the quality of information being developed and ensures that the government can proactively deal with emerging issues. The tool developed in this research enables citizens to identify priorities for themselves and share in decision-making, thereby assuming more ownership of solutions and more responsibility for their implementation. This fosters a sense of mutuality and empowerment, which reduces the risks of implementation and strengthens the resilience of future transport infrastructure.

'Signage. And better apps. The rms map is ok but needs to be significantly more detailed. Google maps is ok but doesn't do a great job of finding the fastest cycleway route'.
'Still think there is not enough signage alerting motorists to be aware of cyclists -have never seen a sign advising motorists to be 1.5m clear of a cyclist'.

Appendix 5
Video describing prioritisation tool
The following media is also available to complement this research.
Link removed for peer review. Author details are included.