Modelling Public Transport Accessibility with Monte Carlo Stochastic Simulations: A Case Study of Ostrava

Abstract: Activity-based micro-scale simulation models for transport modelling provide better evaluations of public transport accessibility, enabling researchers to overcome the shortage of reliable real-world data. Current simulation systems face simplifications of personal behaviour, zonal patterns, non-optimisation of public transport trips (choice of the fastest option only), and do not work with real targets and their characteristics. The new TRAMsim system uses a Monte Carlo approach, which evaluates all possible public transport and walking origin–destination (O–D) trips for k-nearest stops within a given time interval, and selects appropriate variants according to the expected scenarios and parameters derived from local surveys. For the city of Ostrava, Czechia, two commuting models were compared based on simulated movements to reach (a) randomly selected large employers and (b) proportionally selected employers using an appropriate distance–decay impedance function derived from various combinations of conditions. The validation of these models confirms the relevance of the proportional gravity-based model. Multidimensional evaluation of the potential accessibility of employers elucidates issues in several localities, including a high number of transfers, high total commuting time, low variety of accessible employers and high pedestrian mode usage. The transport accessibility evaluation based on synthetic trips offers an improved understanding of local situations and helps to assess the impact of planned changes.


Introduction
Increasing traffic is an inherent symptom of vigorous urban development and its prosperity, but is concurrently one of the main factors that contribute to the deterioration of the urban environment and the endangerment of the sustainability of urban development. Public transport (PT) represents the main sustainable mode of urban mobility [1] and improves social equity and cohesion. In many countries, PT continues to represent an important share of transport, especially in cities.
Local governments aim to improve PT's utilisation and attractiveness to decrease the volume of individual car transport and to motivate people to shift their transport modes towards more environmentally friendly means. Many empirical studies (e.g., [2][3][4][5][6][7]) help to discern which local factors influence PT usage and which advantages and impedances shape the behaviour of commuters. Ingrained habits and social perceptions play a larger role than economic reasons, and a coherent urban sum of CDRs aggregated into cells of certain points in time. The strengths of such data lie in their quantity and fine temporal resolution; nevertheless, the main issue for commuting analyses is the missing link between the individual properties of phone carriers and the purpose of their mobility. Mobility motivation is generally judged only from temporal behaviour, and therefore lacks clarity. Furthermore, the following constraints should be taken into account: missing information about the full route, limited data availability due to high costs and other restrictions, real data from one mobile operator and estimated data from other operators, only currently active SIM cards being recorded, difficulties eliminating robots and non-authorised access to mobile phones, limited spatial resolution due to the size of cells, and cell balancing.
On the other hand, questionnaires and interviews where socio-demographic characteristics, including travel behaviour and travel purposes, are recorded may provide a more comprehensive dataset. They offer a deep understanding by means of thorough characterisation of individual social and economic status, car ownership, income, preferences, family conditions, pro-environmental behaviour, etc. Travel behaviour is also influenced by a person's system of values, e.g., postmodernist values, their relationship to environmental awareness (including perception of congestion, and perception of pollution), and how these are reflected in pro-environmental behaviour [42]. City-wide or regional interviews typically produce small datasets, vis-à-vis representative samples for each individual group of people. Thus, it is difficult to make inferences for small localities and assess the local transport situation using these data.
The evaluation of PT conditions is more complicated than the evaluation of individual car transport (ICT). Under ordinary conditions (e.g., no congestion), various metrics of ICT (e.g., travel time, distance, cost, fuel) are highly correlated; therefore, it may be enough to use any one of them to analyse commuting conditions [43]. The situation for PT is different and requires person-based accessibility measures to address the many person-related constraints and preferences [44]. There are many factors that can act as substantive impedances for PT usage, namely travel time, cost, walking distance, and waiting time, and usually they are not correlated simply. Furthermore, PT travel conditions may differ significantly on the outbound and return trips and thus require evaluation as a round trip [45].
To summarise, many difficulties are faced in obtaining and utilising large, unbiased representative datasets on human movements. We can substitute such datasets by utilising simulations of human behaviour following certain models with an appropriate stochastic component, which captures the variability and uncertainty of conditions, including the not always rational behaviour of real commuters.
To overcome these issues, a new, micro-scale, Monte Carlo simulation-based model for evaluation of local commuting conditions and potential accessibility is introduced. The aim of this paper is to highlight some specific features of the system, such as the utilisation of personal characteristics, activity-driven modelling (i.e., chaining of activities with fixed and soft temporal intervals or starts), and optimised PT trips selected from the full set of possibilities within the given time interval and k-nearest stops according to several criteria, as well as a discrete choice of targets, and implementation into the distributed client-server software enabling large computations.
The paper is organised as follows: the first section gives an overview of related works, underlining some existing limitations. The second section summarises the outputs of the travel survey in Ostrava. The third section presents the design of the simulation model. The fourth section describes the implementation of the modelling system and settings for alternative models. The fifth section provides a validation of the models. The sixth section presents the results of potential accessibility modelling in Ostrava. The discussion and conclusions summarise the main features of the introduced modelling approach and provide a comparison to existing systems.

Microsimulation Modelling
Stochastic microsimulations of activity patterns can be based on different principles, and are currently dominated by agent-based modelling (ABM). ABM controls the behaviour of agents with a set of rules, the status of agents and external stimuli, which enables their coordination and interaction, which is useful for the integration of social behaviour and building dynamic systems. The disadvantages of ABM lie in the complexity of the systems and subjective choices, but also in performance limitations [46] and dealing with space [47]. Microsimulation modelling may reach the level of individual persons and vehicles. Usually, ABM is focused on individual vehicle modelling, while PT modelling remains a minor topic. The following examples demonstrate the development of this type of modelling and the level reached, and also comment on their simplifications and accompanying issues.
One of the first microsimulation models was RAMBLAS [15], which simulated daily activity patterns for regional planning in the Eindhoven region (NL). A synthetic population based on Monte Carlo simulations and activity agendas randomly drawn from the national distribution were used [44]. It was focused only on individual vehicle transport.
The ILUMASS project [18] integrated an activity-based simulation model of urban traffic flows, and a microscale ABM of household and firm development, to obtain the resulting changes in land and housing markets and the environmental impacts. Sub-models were connected via files that caused excessive computing times and limited operational usage [48].
Lovelace et al. [49] used a microsimulation of individual commutes to the nearest employment centre. The model was optimised to minimise differences between simulation results and census data. Simulated origin-destination (O-D) pairs were aggregated to display travel patterns and evaluate distances to employment centres, as well as to identify important destinations, etc. However, only driving and walking transport modes were modelled.
Greulich et al. [50] developed a multimodal ABM with a flexible framework where individual agents could reschedule their trips when unexpected events occurred. Another multimodal transport model was developed by Dobler and Lämmel [51]. Their hybrid approach enabled the combination of macro-scaled demand models with micro-scaled, force-based, and agent-based models, where the latter was meant to represent active modes of transport (walking, cycling) [12].
One of the most rapidly developing projects, MATSim, provides a modular framework to implement large-scale, agent-based transport simulations. MATSim includes demand modelling, agent-based mobility simulations, iterative processing to reach an optimum model, and methods to analyse the outputs [19]. MATSim has been applied in numerous studies [52], e.g., for an accessibility evaluation in the metropolitan area of Zurich [53], travel behaviour under the condition of limited available survey data in China [54], transport energy demand modelling in Croatia [55], and cordon toll policy using mobile phone records in Barcelona [41]. While, at the beginning, the microsimulation was focused on the driving mode, recent development has enabled PT modelling (detection of frequent transfer locations in Seoul [56], PT accessibility for Singapore [57], locating transport facilities [58], and a comparison of fixed and flexible PT [59]) and new modes such as carsharing (e.g., the impact of parking price policy on free-floating carsharing in Zurich [60]) or autonomous vehicles (e.g., transport policy optimisation for autonomous vehicles in Zug [61]). Accessibility studies using MATSim vary in employed datasets, travel modes and focus. For the evaluation of PT, typically no time restrictions or temporal variability were taken into account. The missing temporal changes and neglected variability in schedules were also criticised by Zhang et al. [62].
Liu and Zhou [63] applied ABM for modelling capacity-constrained transit service, though no personal preferences or social behaviour were utilised in this model. Social behaviour (e.g., job competition) was included in the ABM model by Huang [44]. Another advantage of his model was the integration of schedules with anticipated delays. Nevertheless, he simplified the model to include only job opportunities within a fixed Euclidean walking distance of 1 km, and neither real employers nor residential locations were taken into account. Personal behaviour was deduced from the national census data using a one-size-fits-all approach.
ABM is not the only approach to microsimulation modelling. Hybrid models represent a combination of aggregated and disaggregated models. The IRPUD model [48] is a dynamic simulation model of intraregional locations and mobility decisions in an urban region where the travel behaviour of individual households is modelled at a zonal level. It was used to simulate scenarios of fuel price increases and different combinations of activities and policies. As part of the solution, they modelled job accessibility in the Dortmund region, but only for individual transport.

Zonal Model Considerations
The classic modelling concept is usually based on four-step modelling (FSM) [11]. In FSM, zonal tessellation (usually TAZ) of the area is applied, and the simulated trips are distributed equally over the whole study area. A comprehensive overview of different transport models and approaches was provided by Ortúzar and Willumsen [10]. If the origin-destination matrices are built for zones and not for locations, they can provide only general spatial information [12], which limits the usage of address-based specific information, but, conversely, makes it simpler to model spatial interactions using, e.g., gravity models. Besides gravity models, discrete-choice models play an important role in the improved utilisation of observed individual choices. These models interpret more influential variables than gravity models, enabling better adaptation [64]. However, discrete-choice models are unable to integrate different sizes and attractiveness of individual targets, as opposed to what is available with gravity models using generalised distance-decay functions (e.g., [27]), sometimes also combined with a Huff model ([65][66][67]). Both models are based on zones, typically TAZ ([7,68]). A common shortcoming of zonal models is that they cannot fully utilise individual point-based targets and their properties, such as location, capacity, and time window and constraints, to improve the accuracy of transport modelling.

Study Area
The model was tested and evaluated for Ostrava, the centre of the Moravian-Silesian region situated in northeastern Czechia. The Ostrava XXL PT zone (Ostrava PT greater zone) has 401,000 inhabitants (2015) and an area of 530 km². The urban PT network consists of 531 stops and 81 routes (bus, tram, and trolleybus) operating around 7150 trips on any given weekday.
In 2014, a questionnaire was conducted by a group of university researchers, including the authors of this paper, in order to understand the travel behaviour of respondents and their travel diaries. The respondents were asked to describe their usual daily trips based on starting and ending points, trip duration, means of transport used, purpose, and frequency. The total number of questionnaires completed was 534 (0.2% of the population, which is not high, and makes it difficult to draw firm conclusions). The Pencil And Paper Interview (PAPI) method was applied, and the sampling quota of respondents was stratified according to gender, age and education level [69]. The city was divided into 13 zones to ensure an even territorial distribution of respondents. The main class of respondents was full-time workers (46%), followed by retirees (20%), students (17%), self-employed (7%), persons on parental leave (6%) and unemployed (5%). The composition of the sample corresponded to the distribution of the general population by gender (male 48%), age (14% in 15-24, 67% in 25-64, 19% in 65+), and highest achieved level of education (tertiary 23%, secondary with graduation 39% and 28% without graduation, primary 10%). The deviation from the required stratification was below 1% for gender and age categories. The places of residence, origins, and destinations of all trips on weekdays and weekends (a total of 9959 points) were geocoded using Google Geocoding API [70].
The survey results were aggregated according to the following population groups: retired, employed, self-employed, unemployed, and students. The frequency analysis of recorded trips provided a quantitative assessment of group priorities for travel purpose, destination, time, and transport mode.
A reconstruction of all day trips enabled the discovery of typical scenarios (Day Activity Pattern according to [10]), daily movements (Figure 1) and the temporal distribution of activities (Figure 2). The evaluation of these distributions is necessary to draw random samples during Monte Carlo stochastic simulations of trips, similar to, for example, the Portland model in [71]. Noticeably, the patterns reflect differences in working and non-working days, and in population groups (Figure 1). According to the survey results, three quarters of daily movements are simple return trips to one destination. This figure corresponds well to the 70% found in Auckland [10]. Almost 90% of workers' daily activities on weekdays (WD) are related to work, including 11% of trip chains connecting working and shopping. Pensioners declared shopping as the most prominent activity both for weekdays and weekends (WK) (40-50%). Different activities on WD and on WK were reported by students. They declared almost 90% of daily activities to be linked with studies and only a small portion were combined with sport or shopping, while, on WK, they engaged mainly in sport, work or shopping. Visiting family and friends is a specific weekend activity for all groups, except for students.
Sustainability 2019, 11, x FOR PEER REVIEW 6 of 24
The temporal distribution of the beginning of travel activities for employees (Figure 2) shows the dominant peak of commuting to work early in the morning on weekdays and the multimodal distribution of commuting to work during weekends, where work shifts are more noticeably imprinted in the pattern (13:00 h for the second shift and 21:00 h for the third shift). Bimodal distribution is typical for escorting family members, usually children, on weekdays. Sport is typically an afternoon activity, while shopping is indicated by a secondary small peak in the morning hours for employees on non-working days, rather than before work.

Distance-Decay Impedance Functions
A profound understanding of trip distribution and the calibration of gravity models requires a distance-decay function (DDF) evaluation. These functions are represented by the relative distribution functions of travel distance and travel time constructed in the case of sufficient sample size for each combination of transport mode, purpose, personal economic activity, type of day and urban category (Figure 3a,b).
Differences between transport modes are clearly visible (Figure 3a). As expected, walking is limited to short distances and 90% of walks are shorter than 2 km. The cumulative distribution of urban PT trips continuously increases up to 8 km, which corresponds to the longest distances between borders of dense peripheral settlements and the city centre. Beyond this limit, distances grow rapidly, but such trips rarely occur. Driving is used for longer trips; the median is 6.4 km and the 9th decile is 13.7 km. The longest recorded car trip corresponds to the diameter of the study area.
Density of urbanisation (Figure 3b) also indicates a significant influence on the expected length of trip. Urban land cover categories include dense urbanised (city centre or compact settlement) and sparse urbanised (family houses, and mixed residential, commercial and industrial zones) settlements. However, few observations for sparse urbanised areas were recorded. It is clear that trip distances in dense urbanised areas are noticeably shorter (by approximately a third) than those for sparse urbanised areas, where suburbs are also included.
For all groups, various regression functions (i.e., exponential, power, Weibull, gamma, lognormal, and Box-Cox) were tested. Discussion on the behaviour and testing of such functions can be found in, e.g., [72][73][74]. In the majority of groups analysed, the best approximation was reached using the Weibull function [75] (Equation (1)): where x stands for distance, and a and b are parameters for optimisation, with the approximate interpretation that a is more related to the scale of trips, while b corresponds to the shape of the function and variability of trip length. This function was used for approximation of all group DDFs; the differences lie only in the a and b parameters. DDFs characterise groups' different perceptions of distance and willingness to travel. The same approach was applied to fit functions of travel time instead of distance. For the sake of simplicity, these functions are also referred to as DDFs rather than by a separate name such as temporal-decay function.
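The role of the two parameters can be illustrated with a short sketch. Because Equation (1) is not reproduced in this text, the form f(x) = exp(-(x/a)^b) used below is an assumption (one common Weibull-type decay), and the parameter values for the two modes are hypothetical, not the fitted survey values.

```python
import math


def weibull_decay(x, a, b):
    """Assumed Weibull-type decay: value 1 at x = 0, falling towards 0.

    a (scale) stretches the curve along the distance axis,
    b (shape) controls how abruptly the willingness to travel drops.
    """
    if x < 0:
        raise ValueError("distance must be non-negative")
    return math.exp(-((x / a) ** b))


# Hypothetical parameters for illustration only (distances in km).
walking = {"a": 1.0, "b": 1.2}
driving = {"a": 8.0, "b": 1.5}

for distance_km in (0.5, 2.0, 8.0):
    print(
        distance_km,
        round(weibull_decay(distance_km, **walking), 3),
        round(weibull_decay(distance_km, **driving), 3),
    )
```

With such parameters, the walking curve decays within a few kilometres while the driving curve remains high over the same range, which mirrors the qualitative differences between modes described above.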

Concept of Modelling
The TRAMsim simulation system is based on the concept outlined in Figure 4 and is explained below. The simulation system is controlled by a set of parameters that should be fitted to the expected behaviour of the commuting community. The process of modelling is subdivided into four main phases.
First, a population of interest is selected. It may represent a current resident population, population prediction or a selected category of person. The population choice influences the important parameter settings for simulations, including the number of simulations, maximal walking distance, number of proximal stops, description of chaining activities and related time settings (e.g., earliest time to leave home, latest time to return home, time tolerance, time interval requested before or after the activity, time distribution of activities). Important aspects of activities and behaviour have been explained in previous work ([10], p. 482). Within our concept, activities can be described as soft activities with unknown targets and unknown requested times, where only an activity type is known, or fixed activities where the target and/or requested time is known. For both soft and fixed activities, the potential set of targets is known, and each target is characterised by a precise location, size or capacity, and temporal restrictions.
After the population selection, the scenario is selected. The scenario consists of a chain of activities with a defined order. Each activity has an indicated minimal duration and may also have a fixed time plan (e.g., required starting time), which is suitable for work shifts, cultural and sporting events, etc.
Simulations for each origin begin with the search and evaluation of all possible transport connections to all destinations of the given type within one hour before and after the planned activity. A multithreaded process searches for connections between the specified number of stops closest to the origin and to the destination. The search respects the parameters for the given type of simulation, scenario and personal features. If no connection is found, the time interval is extended sequentially by one hour.
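The widening of the search window can be sketched as follows. This is a minimal illustration, not the TRAMsim code: the function name, the representation of a timetable as a list of departure times in minutes since midnight, the symmetric extension on both sides, and the cap on the number of extensions are all assumptions.

```python
def search_connections(departures, planned, max_extensions=5):
    """Return (connections, half_width) for a window around `planned`.

    The window starts at +/- 60 minutes and is widened by 60 minutes
    per side until at least one departure falls inside it or the
    (assumed) maximum number of extensions is exhausted.
    """
    half_width = 60
    for _ in range(max_extensions + 1):
        found = [d for d in departures if abs(d - planned) <= half_width]
        if found:
            return found, half_width
        half_width += 60
    # Nothing found: report the widest window that was actually tried.
    return [], half_width - 60


# Toy timetable: departures at 05:00 and 08:00.
timetable = [300, 480]
print(search_connections(timetable, planned=360))  # activity at 06:00
print(search_connections(timetable, planned=600))  # activity at 10:00
```

In the second call no departure lies within one hour of 10:00, so the window is extended once and the 08:00 departure is returned.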
Walking to and from stops is included as an obligatory part of each PT trip (similar to [25]). Furthermore, walking directly (hereafter referred to as pedestrian mode) between the origin and the destination is evaluated, which is important for shorter trips.
All found connections (D_1a to D_nz) are compared, and, according to the personal/purpose travel optimality criterion (fastest, shortest, with minimum changes, random, latest, earliest), the most suitable trip from the given origin to each destination is selected (D_1 to D_z). In this way, the set of potential trips is reduced to one optimal trip from the given origin to each destination.
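The reduction to one trip per destination can be sketched as below. The `Trip` record, its field names and the criterion labels are hypothetical and serve only to illustrate the selection logic for the six criteria listed above.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Trip:
    destination: str
    departure: int   # minutes since midnight
    duration: int    # minutes, including walking and waiting
    distance: float  # km
    transfers: int


# Each criterion maps a trip to the value that should be minimised.
CRITERIA = {
    "fastest": lambda t: t.duration,
    "shortest": lambda t: t.distance,
    "min_changes": lambda t: t.transfers,
    "earliest": lambda t: t.departure,
    "latest": lambda t: -t.departure,
}


def best_trip_per_destination(trips, criterion, rng=None):
    """Keep exactly one trip for every destination found in `trips`."""
    by_destination = {}
    for trip in trips:
        by_destination.setdefault(trip.destination, []).append(trip)
    if criterion == "random":
        rng = rng or random.Random()
        return {d: rng.choice(c) for d, c in by_destination.items()}
    key = CRITERIA[criterion]
    return {d: min(c, key=key) for d, c in by_destination.items()}


candidates = [
    Trip("D1", 400, 30, 5.0, 1),
    Trip("D1", 410, 25, 6.0, 2),
    Trip("D2", 405, 40, 9.0, 0),
]
for name in ("fastest", "min_changes", "latest"):
    print(name, best_trip_per_destination(candidates, name)["D1"])
```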
Next, the gravity value for each destination from the given origin is calculated. The gravity model utilises the size of the target, trip duration (including walking and waiting time) and the appropriate distance-decay impedance function. The DDF for modelling is selected according to the mode of transport, target categories, economic activities of the respective person, type of day, and category of the location. The resulting gravity value is obtained by multiplying the attractiveness by the target weight (Equation (2)): where G_ij stands for the gravity value, A_ij is the attractiveness of the j-target from the i-origin, f() is the appropriate distance-decay impedance function, and t_ij is the travel time between i and j locations. The attractiveness of a j-target is directly proportional to the size of the target, analogous to the Huff probability model [65,76] for shopping gravity models, where the size of a target is typically expressed as the area of the given shop. It is calculated for the given type of person from the trip duration transformed by the DDF. The target size is a parameter that directly influences the size of the interaction; in the case of commuting to work, the number of employees is used [77].
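A possible reading of this step is sketched below. Since Equation (2) is not reproduced in this text, the product of target size and decayed travel time is an assumption, as are the Weibull-type form of f() and all parameter values; the employer names are invented for the example.

```python
import math


def decay(travel_time, a, b):
    """Assumed Weibull-type impedance applied to travel time (minutes)."""
    return math.exp(-((travel_time / a) ** b))


def gravity_value(size, travel_time, a, b):
    """Assumed gravity value: target size weighted by the decayed time."""
    return size * decay(travel_time, a, b)


# Hypothetical employers: (number of employees, travel time in minutes).
employers = {"plant": (4000, 45), "hospital": (1500, 15), "office": (300, 10)}
params = {"a": 30.0, "b": 1.5}  # illustrative, not fitted values

for name, (employees, minutes) in employers.items():
    print(name, round(gravity_value(employees, minutes, **params), 1))
```

Under these assumptions a large but distant employer can receive a gravity value comparable to that of a smaller employer nearby, which is the trade-off the gravity model is meant to capture.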
The system currently implements three regimes of selection for the destination from the given origin from the full set of targets: fully random, maximal gravity and proportional gravity. Fully random selection utilises a uniform distribution, where each destination possesses the same probability of selection using a Monte Carlo drawing. Such an option may be a relevant alternative for small settlements where differences in distance can be neglected. The maximal gravity mode selects the target with the maximal gravity function value. This option is related to distance-decay utility maximisation [78] and assumes that everyone selects one's job purely according to minimal travel impedance and maximal size of employer. Correspondingly, during simulations, the appropriate employer for each given origin is always selected. The proportional gravity mode intends to select targets proportionally to the distribution of the gravity value for all targets. It is implemented using a Monte Carlo drawing from the sum of the gravity values. All procedures are fully described in Appendix A.
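The three regimes can be illustrated with a short sketch. The function and mode names are hypothetical, and the cumulative-sum drawing shown for the proportional mode is one standard way to sample proportionally to weights; the procedures in Appendix A may differ in detail.

```python
import random


def select_destination(gravity, mode, rng):
    """Pick one destination from a {name: gravity value} mapping."""
    names = list(gravity)
    if mode == "random":
        # Uniform drawing: every destination is equally likely.
        return rng.choice(names)
    if mode == "max_gravity":
        # Deterministic: always the destination with the highest value.
        return max(names, key=gravity.get)
    if mode == "proportional":
        # Draw a point on [0, sum of values] and find its segment.
        draw = rng.uniform(0, sum(gravity.values()))
        cumulative = 0.0
        for name in names:
            cumulative += gravity[name]
            if draw <= cumulative:
                return name
        return names[-1]  # guard against floating-point round-off
    raise ValueError(f"unknown selection mode: {mode}")


rng = random.Random(42)
values = {"A": 1.0, "B": 9.0}
draws = [select_destination(values, "proportional", rng) for _ in range(10000)]
print(draws.count("B") / len(draws))  # close to 0.9 for these values
```

The maximal gravity mode would return "B" for every simulated person from this origin, whereas the proportional mode still sends roughly one in ten to "A", which preserves variety in the simulated destinations.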
By employing one of these selection options, one destination for the given origin is selected and one simulated trip is finished. Trip characteristics are recorded, including start time, finish time, travel time, total travel time (including walking), distance, number of changes, price, walking distance to/from the stop, and transport mode (public transport/walking). The simulations are repeated until the required number is reached, yielding large trip tables of person-commuting values.
The process can be iterated by selecting another possible scenario for the given population, or restarted by selecting another population (typically another person category).

TRAMsim Implementation
Simulations are performed in a system called TRAMsim. As was mentioned previously, the most frequent variants of simulation machines are ABM models. However, ABM models are generally not considered to be designed for extensive simulations [79]. In this case study, procedural modelling was preferred because of the massive extent of searching for PT connections and the requirements for parallel and distributed processing. The procedural modelling was based on the execution of a sequence of activities undertaken by the population. The simulation of behaviour for more persons or variants of individual behaviour, reflecting different conditions (changed randomly or systematically), is performed by repeated executions of the model, as is done with ABM. The heterogeneity of the environment is also implemented in a similar way to ABM.
Achieving results within a reasonable timeframe requires massive parallel data processing. A new parallel-processing solution with an automatically extensible client-server system using Microsoft SQL Server (MS SQL Server) was developed. MS SQL Server is a relational database server developed by Microsoft; it is used for the storage and maintenance of travel data and for programming the application in the Transact-SQL language [80]. The three-layer model includes a central server, auxiliary local SQL servers and client PCs (Figure 5). Generation of the travel requirements is done using the local copy of the underlying data to eliminate any overloads of the central server. During the subsequent receipt of the results, the local server writes the results onto both servers. This method significantly reduces the load on the central server, which allows for an increase in the number of clients by a factor greater than 100. The programming code is stored in the supplementary file S1.
A database application generates random transport requests, according to the expected behaviours and parameters of the system. The transport requests are processed by the subsystem for the searching, evaluation and optimisation of PT connections. This subsystem utilises a dynamic-link library (DLL) provided by CHAPS Ltd., the administrator of the National Information System on Regular Public Passenger Transport Timetables, to search for all possible PT connections in the PT time schedules. This enables a selection from the full set of trips based on different priorities in order to minimise travel time, distance, cost, number of transfers and waiting time within the given time period. The system distributes requests among a large number of clients and stores the results in the database server. Currently, the system does not model the uncertainty of transport timetables (e.g., delays due to congestion). This is a result of the first implementation being done in Ostrava, where delays in PT are usually minimal and timetables are well respected (93% within a 3 min tolerance [81]).

TRAMsim Testing
For testing and validation, the simple Home-Workplace-Home scenario was implemented and analysed. Multipurpose trips were not included in the case study; however, the same model can be implemented to analyse trips of that nature. In TRAMsim, origin locations can be simulated, but currently they are substituted by fixed centres of small urban polygons. This corresponds to an intermediate approach in activity-based models [10]. The locations of residents (Home) were represented by 280 median centres of Basic Settlement Units (BSUs). The median centres were calculated using the distribution of address points in the given BSU representing a residential centre of the settlement unit.
Presently, the demand from any given origin was fixed to establish the same population demand for all localities, in order to not mirror the current population situation (number, structure), but, instead, a potential of accessibility. The number of demands from each location equals the number of simulations done for that location.
Instead of synthesising workplace locations inside a destination zone, current opportunities were mapped in sufficient detail and destinations were selected from the real set of localised targets using one of the destination-selection criteria (typically proportional gravity).
Contrary to other studies (e.g., [7,44]), we mapped all significant workplaces (generally 50+ employees) in the pilot area (Ostrava + 20 km buffer), identified their locations as potential destinations, and estimated the number of employees as a basis for the evaluation of attractiveness. Various registries of employers and companies were explored for this purpose. The final set of workplaces was derived from the registry of the local labour office and was complemented by evidence from other sources (mainly the register of companies) and downscaled to cover all premises according to their relative size based on experts' estimations. Such estimated values are obviously sources of uncertainty; nevertheless, they are used only to calculate gravity values and do not act as constraints limiting the number of workers commuting to a given employer (no doubly constrained gravity model is applied). Optionally, work regimes (i.e., the start and end of a shift) can be recorded, similarly to other types of targets, where temporal constraints are recorded.
According to [82], the three stops nearest to the origin and the destination (nine combinations in total) were selected to search for various transport connections. In the following example, 977,760 O-D transport requests were processed and, for each request, about 25 connection variants were evaluated to find an optimal transport connection. Additionally, a pedestrian-mode trip between each origin and destination was evaluated.
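The nine stop combinations per O-D request can be enumerated as in the sketch below. The stop tuples, the use of straight-line distance to rank stops, and the function names are assumptions for illustration; the source does not state how stop proximity is measured.

```python
from itertools import product


def nearest_stops(point, stops, k=3):
    """Return the k stops closest to `point` (x, y) by straight-line distance.

    Each stop is a hypothetical (name, x, y) tuple.
    """
    def squared_distance(stop):
        return (stop[1] - point[0]) ** 2 + (stop[2] - point[1]) ** 2

    return sorted(stops, key=squared_distance)[:k]


def stop_pairs(origin, destination, stops, k=3):
    """All combinations of the k stops nearest the origin and the destination."""
    return list(product(nearest_stops(origin, stops, k),
                        nearest_stops(destination, stops, k)))


# Made-up stops on a line; origin at x = 0, destination at x = 12.
stops = [("A", 0, 0), ("B", 1, 0), ("C", 2, 0),
         ("D", 10, 0), ("E", 11, 0), ("F", 12, 0)]
pairs = stop_pairs((0, 0), (12, 0), stops)
print(len(pairs))
```

Each pair would then be passed to the connection search, and the pedestrian-mode trip is evaluated separately for the same origin and destination.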

Two Alternative Models for Testing
The number of PT stop combinations indicates a demanding computation, similar to other studies ([44,83,84]). The question is whether this effort is necessary, or whether similar results can be obtained with substantially lower computational effort. Two alternative models were built, and their results were analysed. The two commuting models were designed based on simulated movements to (1) one hundred randomly selected large employers with more than 100 employees (the random-above-limit model, e.g., [85]) and (2) one hundred proportional gravity-selected employers. The first model is implemented much more simply than the second, because it does not require frequency analysis, optimisation of the DDF, or estimation of the weight of the target. It was assumed that elimination of the distance-decay effect within a city would have a negligible impact. In both models, the same settings were applied: departure from home after 16:15 h, 15 min anticipated preparatory time before work, eight hours work, and return home before 22:00 h. The latest travel connection before the requested arrival time was preferred in order to minimise waiting time. This meant that, if working hours started at 08:00 h, the connection prior to and closest to the optimal arrival time of 07:45 h was selected. The beginning of working hours was randomised from the most probable interval between 06:00 and 08:00 h, according to the survey results (Figure 2). The maximal walking distance to a PT stop was 5 km, so as not to strictly limit accessibility in peripheral villages.
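The rule for choosing the outbound connection can be illustrated as follows. Times are minutes after midnight; the uniform draw of the shift start, the connection tuples and the function names are assumptions made for this sketch, since the source only states that the start was randomised within 06:00 to 08:00 h according to the survey.

```python
import random

PREP_MIN = 15  # preparatory time before work, as in the model settings


def random_shift_start(rng, earliest=6 * 60, latest=8 * 60):
    """Draw a shift start between 06:00 and 08:00 (uniform draw is an assumption)."""
    return rng.randint(earliest, latest)


def requested_arrival(shift_start_min):
    """Latest acceptable arrival at the workplace."""
    return shift_start_min - PREP_MIN


def latest_connection(connections, arrive_by_min):
    """Pick the connection arriving closest to, but not after, `arrive_by_min`.

    `connections` is a list of hypothetical (departure_min, arrival_min) tuples.
    Returns None when no connection arrives in time.
    """
    feasible = [c for c in connections if c[1] <= arrive_by_min]
    return max(feasible, key=lambda c: (c[1], c[0]), default=None)


# Shift starts at 08:00, so the requested arrival is 07:45 (465 min).
connections = [(400, 440), (420, 462), (430, 470)]
chosen = latest_connection(connections, requested_arrival(8 * 60))
print(chosen)
```

The connection arriving at 07:50 is rejected because it misses the requested arrival time, so the one arriving at 07:42 is selected, which keeps waiting time before the shift as short as possible.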
Results of the models differ substantially from one another, including in the related accessibility evaluations. Differences in the distribution of individual values can be seen both in centrally situated BSUs, as well as in those in a peripheral position (Velka Polom in Figures 6 and 7). The proportional gravity model provides a much lower mean and median, as well as a more homogeneous distribution and fewer outliers than the random-above-limit model for both BSUs (Figure 6). This is verified in the maps (Figure 7), where the random-above-limit model often selects quite distant targets, contrary to more compact selection in the proportional gravity model. It is necessary to validate both models and decide if the differences are significant.

Model Validation
The validation of both models is challenging because comprehensive information is necessary for the simulation of individual activities and is almost never available with a sufficient range [49]. Such models cannot conceptually be fully validated against data, due to mismatching in the granularity, and abundance of simulation outputs and traffic count data [86,87]. Wegener and Spiekermann [48] recommend combining the statistical calibration of models with expert judgement and plausibility analysis. In this case, the validation of models was based on the comparison between the simulation outputs and frequencies of each travel time reported in travel diaries aggregated by distribution functions. The only modification of the survey data was to exclude reported trips where targets were outside of Ostrava. For validation, both graphic and statistical tools were applied. The results (Figure 8) clearly show a significant deviation from the random-above-limit model, where the share of large trips is much higher than the distribution of time from the travel survey (notice the rounding of time values by respondents). On the other hand, the results of the proportional gravity model seem to coincide with the original data quite well (Figure 9).
A good correlation between the survey results and the simulation outputs of the proportional gravity model was verified via statistical testing. Neither dataset follows a normal distribution (Kolmogorov-Smirnov and Shapiro-Wilk tests, p < 0.001). According to the results of the non-parametric tests, as well as the t-test (Table 1), there is no significant difference between the two distributions. This indicates that the simulations based on the described proportional gravity model are realistic.
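A comparison of two travel-time samples can be sketched with a distance between their empirical distribution functions. The source does not name the non-parametric tests it used, so the two-sample Kolmogorov-Smirnov statistic below is a stand-in chosen for illustration, and the travel times are made-up values rather than survey or simulation data.

```python
from bisect import bisect_right


def ecdf(sorted_sample, x):
    """Empirical distribution function: share of observations <= x."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in a + b)


# Hypothetical one-way travel times in minutes.
survey_times = [10, 15, 20, 20, 30, 30, 45, 60]
simulated_times = [12, 14, 21, 24, 28, 33, 41, 58]
print(ks_statistic(survey_times, simulated_times))
```

A small statistic means the two distributions track each other closely; turning it into a p-value requires the KS distribution, which is left out of this sketch.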
We can conclude, based on the random-above-limit model analysis, that the simplification of a simulation based only on a selection of large employers is not appropriate, even within a city.

Potential Accessibility Evaluation in Ostrava
The proportional gravity model was applied to evaluate potential accessibility in Ostrava using PT and walking for the granularity of BSUs. This analysis demonstrated the variability of outputs from microsimulation modelling, as well as their usefulness for understanding the underlying factors and thoroughly explaining observed differences in accessibility results.
Without the availability of an individual vehicle, it is possible to commute to some of the employers from every origin, but under highly variable conditions. During simulations, 94% of available employers were used as destinations for simulated commutes. This means that the remaining 6% of employers may be less accessible via PT, or they may be located in the "shadow" of a competing large employer located within very close proximity. The share of accessible workplaces from each origin exhibits significant differences in the supply of workplaces (Figure 10a). The locations with the richest variety of available employers (>70%) are in the centre of Ostrava, as well as in some fragmented zones around it. A comparison between the share of accessible workplaces and employers' locations (black dots) shows clear differences in the pattern, indicating the importance of attractiveness and PT conditions within the evaluation. Peripheral parts of the city show an obviously lower supply of accessible employers, but again the situation is heterogeneous. A low supply of employers is recognised in Jistebnik (SW corner, 39%), in several municipalities on the west border (less than 60%, Zbyslavice, Čavisov, Horní Lhota) and in the northeast corner around Bohumín (less than 60%). The underlying reasons for this can be understood using Figures 10 and 11.
Jistebnik has a low number of employers within the municipality and in nearby accessible surroundings, thus the local labour market does not create a sufficient labour demand. During simulations, due to poor PT conditions (e.g., many transfers, low frequency, long walking distance to a stop), local employers within walkable distance were frequently selected (confirmed by the high share of pedestrian mode trips, more than 20% in Figure 10b) or workers commuted to more distant employers under poor conditions (see the extreme number of changes per day in Figure 11b). Surprisingly, in Jistebnik, the total travel time per day was not worse than in the majority of other BSUs, which is explained by a smoothing effect of the mean between results for local commuting (including the dominating pedestrian mode) and distant commuting.
Bohumín shows some similarities, such as the low variety of employers connected with a very high share of pedestrian mode use; however, it boasts good travel times and a low number of changes (Figure 11). These attributes reveal a very close commuting network (built with local urban PT and pedestrian modes), in which Bohumín creates its own sufficient labour market with minimal requirements for commuting to other parts of the Ostrava XXL zone.
Finally, municipalities on the western border of Ostrava XXL show a low usage of the pedestrian mode, which indicates the absence of local employers. However, PT conditions are good (comparable to other geographically similar locations), especially when considering an acceptable average number of changes.
It is possible to analyse the situation in other municipal areas (BSUs) by employing the same technique. A high incidence of pedestrian mode usage is not only found in the centre of Ostrava, where the high supply of local employers and slower PT in the densely urbanised area leads to a preference for walking, but also in the eastern BSUs, close to external employers in the nearby city of Havířov.
Several BSUs (e.g., Vřesina) offer a good variety of employers, but exhibit a poor average travel time and a high number of changes. According to average travel time, a poor commute of more than 90 min per day exists in some locations, including in several isolated zones (14 BSUs), mainly on the borders, which represent only 1% of the population. More surprisingly, however, the second worst category (60-90 min commute) covers not only the majority of suburban areas, but also some internal parts of the municipality of Ostrava. These areas include 54% of BSUs, with 51% of the population (almost 190,000 people). This reveals some troublesome commuting durations using PT, which can generally be attributed to more than one change per trip.
The results indicate locations within Ostrava with a potential for improvement in PT service. Results were not influenced by adaptation strategies of residents, including relocation for better access to employers and other facilities, using individual transport or some shared modes of transport, changes in employer due to poor accessibility, etc.

Discussion and Conclusions
Stochastic microscale simulation-based modelling offers a more thorough and detailed understanding of local commuting conditions and PT issues. Usually, studies in this area are based on schemes of evaluation of respondents' opinions on travel conditions (e.g., the multidimensional Rasch model in [88]). Even though this represents a direct way to evaluate customer satisfaction, there are questions about how to reach objectively comparable commuting conditions. Synthetic data can substitute or supplement existing real-world data, respecting existing PT conditions, targets' property constraints, and personal and opportunity factors. For accessibility studies, trip simulation provides the benefit of a multidimensional evaluation of local PT conditions, including an assessment of total travel time per day, number of changes, walking distance, waiting time, and the frequency of utilised stops or links [89].
This paper introduces a new approach to activity-based microsimulation modelling using stochastic Monte Carlo simulations to generate synthetic trips, and its implementation using the TRAMsim system. Three main outputs are presented: (1) a new concept of modelling and its implementation in TRAMsim, (2) the results of the testing and validation of two commuting models and (3) a potential accessibility evaluation of workplaces in Ostrava.
A key decision for modelling is the choice of a population of interest. It is possible to focus on specific groups within the population, and to analyse and predict accessibility specifically for them (e.g., [67] for children, youth and seniors). Personal characteristics are described in more detail than in other studies, where usually only income level and car ownership are documented. Characteristics are derived from a local travel survey. The population must be described in terms of location (spatial distribution), socioeconomic status, travel preferences and behaviour. Similar to discrete-choice models [10], observation of individuals is recommended. Specific outputs include a description of all important scenarios (daily activity patterns) with identified activities, the temporal distribution of these activities and optimised distance-decay impedance functions, corresponding to different purposes. The scenarios may consist of more than one activity, respect the opening hours of businesses, and consider the temporal variability of the attractiveness of the destination.
Activities are linked with a set of real-world targets equipped with real properties, i.e., the locations (coordinates) of all premises, temporal constraints and preferences, and sizes or capacities. Trip destinations are selected only from this set of targets, which enables the provision of a more accurate evaluation of travel. The selection of targets is managed by one of three options, the proportional gravity model being the validated model for Ostrava. In this system, PT trips always include walking, as complete door-to-door trips are evaluated.
All public transport connections are searched for not only for the closest stop from the origin or destination (a common setting in many other studies) but also for the predefined set of nearest stops, where all stop-stop combinations are evaluated. This is a commonly overlooked issue, even though, especially in dense urban environments, walking distances to various stops frequently differ only slightly, but the performance (transport frequencies, and available modes and links) may differ significantly. For example, Ivan [90] confirmed that only about 50% of transit users in our region use the closest stop.
In addition, several options for the method of travel are available. Even though the majority of studies use only the fastest connection from the given departure time (e.g., [25]), this option is not preferred for all people and all purposes. Our model deals with a specified time interval and finds not just the fastest connection within this interval, but also the latest, the earliest, the shortest, a random and a "comfort" option. The "comfort" option is appropriate for travellers with disabilities, accompanying other persons, with luggage or shopping bags, etc. The latest connection should be preferred by passengers aiming to minimise waiting time before a fixed-start event. These variants enable a better simulation of passenger behaviour, based on their personal properties and purposes.
The model is implemented in TRAMsim, a database application with an automatic extensible client-server system suitable for massive parallel processing. TRAMsim simulates the selection of targets, searches for optimal transport connections and generates synthetic sets of probable trips, using both PT and pedestrian mode, optimised according to personal characteristics and detailed mapping of opportunities for activities.
Within the Ostrava case study, two models for commuting to work were tested. Comparison of the test results for these two models (random selection of a large employer, and proportional gravity selection from all employers) proved that the computational load during simulations cannot be decreased by the simplification of the model based on a random-above-limit approach, even within a city, and shows a good correspondence of the proportional gravity model to the survey results. This is a significantly positive result, as some other microsimulation studies have not succeeded in validation of their models (e.g., in [7], both gravity and discrete-choice models failed in validation).
The successfully validated proportional gravity model was used for simulations. Synthetic trips were employed for the mapping and evaluation of potential workplace accessibility in Ostrava.
Accessibility was evaluated with a set of selected indicators suitable for PT, including the share of accessible employers, pedestrian mode usage, total commuting time, and number of changes. The accessibility was evaluated for the Home-Work-Home scenario representing a full daily tour, including walking to/from stops and pedestrian mode. Thus, the total travel time includes all travel time, walking time and waiting time. The evaluation of such a daily travel time budget eliminates a common issue in PT accessibility studies, where outbound and return trips differ significantly. The low variety of employers found in simulations indicates a limited offer, usually related to restricted transport options, where only some requested targets are accessible. The pedestrian mode usage, combined with other indicators, is able to distinguish relatively closed local catchments with limited accessibility of distant destinations. A high number of changes indicates higher discomfort for travellers, which may impede PT usage for some groups of people. Such multidimensional evaluations enable a better rendering of the local situation, identify specific issues in some localities, and evaluate the impact of planned transport changes.
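The four indicators can be derived from a table of simulated trips roughly as sketched below. The record fields (`origin`, `employer`, `mode`, `daily_time_min`, `changes`), the function name and the sample rows are hypothetical; they stand in for the trip tables produced by the simulations.

```python
from collections import defaultdict
from statistics import mean


def accessibility_indicators(trips, n_employers):
    """Aggregate simulated Home-Work-Home tours into per-origin indicators."""
    by_origin = defaultdict(list)
    for trip in trips:
        by_origin[trip["origin"]].append(trip)

    indicators = {}
    for origin, rows in by_origin.items():
        indicators[origin] = {
            # distinct employers reached from this origin / all mapped employers
            "share_accessible_employers": len({r["employer"] for r in rows}) / n_employers,
            # share of tours made entirely on foot
            "pedestrian_share": sum(r["mode"] == "walk" for r in rows) / len(rows),
            # total daily travel time, including walking and waiting
            "mean_daily_time_min": mean(r["daily_time_min"] for r in rows),
            # number of changes per day
            "mean_changes": mean(r["changes"] for r in rows),
        }
    return indicators


# Made-up trip records for two origins.
trips = [
    {"origin": "A", "employer": "E1", "mode": "pt", "daily_time_min": 60, "changes": 2},
    {"origin": "A", "employer": "E2", "mode": "walk", "daily_time_min": 40, "changes": 0},
    {"origin": "B", "employer": "E1", "mode": "pt", "daily_time_min": 100, "changes": 4},
]
result = accessibility_indicators(trips, n_employers=4)
print(result["A"])
```

Reading the indicators together, rather than one at a time, is what allows a closed local catchment (high pedestrian share, low variety, short times) to be told apart from a poorly served periphery (low variety, many changes).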
In this case study, potential accessibility was evaluated with a monotone demand and does not reflect the current local population. The reason for this is that models with a synthetic current population (e.g., [15,44]) evaluate the current accessibility conditions, but they conserve the existing population and results are influenced by a different structure of residents. For example, zones populated mainly by retirees may exhibit good accessibility, due to pensioners' prioritised commuting needs, such as shops, when, in reality, that same zone may have inadequate accessibility to employers or other types of destinations. Another issue with this kind of modelling is how to properly evaluate sparsely populated areas or locations under development. In such situations, it is possible to synthesise a standardised or target population. Using the same standardised population for all tested locations would enable the mapping of potential accessibility unbiased by the current population and its profile.
The main disadvantage of simulation modelling can be seen in the great requirement for fine-grained local data and computation loads. This is a common weakness [48,83,84], which still impedes the operational usage of such tools. Spiekermann and Wegener [91,92] described theoretical, empirical, practical and ethical limits for increasing the resolution of these models. In the TRAMsim case study, the model does not try to capture the full complexity of relationships between the population, land use and transport to simulate traffic flows in the real world. The focus of the simulation is on accessibility studies with partly fixed conditions, which decreases the uncertainty of the modelling and the computational load.
Potential weaknesses of the TRAMsim model can be seen in several instances. TRAMsim is unable to model social interaction. Neither activities shared across household members nor the redistribution of activities among members can be implemented in an automatic manner. The social behaviour of the synthetic persons is not included, because the current focus of the system is on the potential accessibility evaluation and, therefore, no interactivity of the current population is modelled, similar to job competition in multiagent systems in [44]. Also, the gravity potential is evaluated for each target independently, and no synergic effect (sojourns) for clusters or chains of targets can be utilised. Additionally, no uncertainty in travel times or schedules (e.g., due to congestion) is implemented. For the presented case study, this is a marginal problem, due to the positive fact that PT in Ostrava currently exhibits only small delays in scheduled times.
TRAMsim offers inspiration for improvements in PT modelling, which may also be applied in other simulation systems to reach more accurate modelling results. Improved understanding of local PT issues helps to adapt PT policy, to decrease real inconveniences impeding the wider usage of PT, and to decrease transport stress in modern cities.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.

Appendix A. Explanation of the Regimes of Selection for the Destination from the Given Origin
Let $O$ be a set of origins and $D$ be a set of destinations: For each element $o_i$ of the set $O$, there exists an ordered set $D_i$ of destination elements, ordered by the value of the gravity function $g$.
The $g$ function returns the gravitational value, which represents the gravity between origin $o_i$ and destination $d_{i,j}$. The $g$ function is defined by:

$$g(o_i, d_{i,j}) = a(o_i, d_{i,j}) \, f\big(t(o_i, d_{i,j})\big)$$

where function $a$ returns the attraction of destination $d_{i,j}$ from origin $o_i$, function $f$ is the appropriate distance-decay impedance (temporal-decay) function, and function $t$ returns the travel time between the $o_i$ and $d_{i,j}$ locations. For each element $o_i$ of the set $O$, there also exists an ordered subset $B_i \subseteq D_i$ of elements of set $D_i$ where the gravitation ratio is greater than the randomly determined level. Set $B_i$ is defined by:
where $r$ is the random number $r \in [0, 1)$ from a uniform distribution on the interval $[0, 1)$, $n$ is the number of elements in set $D_i$, and $j$ is the index of element $d_{i,j}$ of the ordered set $D_i$. For random choice of destination $d_{i,j}$ from origin $o_i$:
where $n$ is the number of elements of set $D_i$, and $r$ is the random number $r \in [0, 1)$ from a uniform distribution on the interval $[0, 1)$. For maximal gravity choice of destination $d_{i,j}$ from origin $o_i$:
where $n$ is the number of elements of set $D_i$. For proportional gravity choice of destination $d_{i,j}$ from origin $o_i$: