Trip Chaining Model with Classification and Optimization Parameters

In order to model the complex requirements of users travelling in an urban environment, the relevant parameters for creating activity chains have to be identified. In this study, travel related parameters were collected and grouped into two main types: classification parameters and optimization parameters. In the case of optimization parameters, further grouping was performed where general and comfort parameters were introduced. Additionally, the possible values and data sources of the parameters were identified. A utility function was created to take into account the optimization parameters and the weights. Weights related to comfort optimization parameters were aggregated to decrease the number of required settings by the users. Finally, the features of the proposed optimization algorithm are described. With the identified parameters, aggregated weights, and elaborated utility function, activity chains can be optimized for users with different requirements.


Introduction
Recent developments in the field of travel behavior are dealing with topics like activity-based trip analysis, mode choice modeling, travel demand management, and flexible mobility options. About 20 years ago, Bhat and Singh [1] developed an analytical framework to identify the travel patterns of workers by estimating the commuting mode choice, the number of stops, and arrival periods. In the same period, Wen and Koppelman [2] found empirical results to demonstrate the connection of individual parameters to activity location choice and tour formulation. Bowman and Ben-Akiva [3] presented a daily activity scheduling concept where activity and travel related decisions were handled together with transport mode, time choice, and activity location. Islam and Habib [4] investigated the effect of socio-demographic characteristics on activity chains and found that several characteristics played a major role in influencing trips. Many other aspects of activity chains have been studied such as the comparison of travel behavior by men and women by McGuckin and Murakami [5]; the analysis of activity chains in specific regions by Subbarao and Krishna Rao [6]; the assessment by age group by Golob and Hensher [7]; and travel patterns of specific user groups [8]. These papers highlight that activity-based modeling needs to include several types of parameters.
Mazzula [9] analyzed user responses by applying an activity-based approach, where stated preference and revealed preference results were combined. Random utility models were used to simulate travel behavior and potential choice alternatives. In order to define traveler profiles, Pronello and Camusso [10] used factor analysis and cluster analysis. The study showed how significant constraints such as necessity, time saving, and transport supply determine a behavioral change, while Prillwitz and Barr [11] tried to assess the role of attitudes for travel decisions. The results demonstrated the usefulness and limitations of segmentation approaches and underlined the need for more comprehensive mobility style frameworks. Haustein and Hunecke [12] worked on the creation of useful segmentations of travel groups who share similar attitudes and preferences. In their contribution, attitudinal, socio-demographic, geographical, and behavioral segmentations are compared to provide sustainable travel choices.
The research question arises of how to define a suitable set of parameters and model the activity chain optimization with a utility function. Several researchers have dealt with aspects of trip chaining, activity scheduling, and travel behavior, where user related parameters and grouping options were also investigated. However, the representation of detailed user requirements in an activity chain optimization framework has not yet appeared. Therefore, in this paper, a model with a set of parameters and a utility function is introduced. Such a detailed description of trip chaining parameters and the related utility function with weights has not been realized, thus the contributions of this paper provide a real added value to the literature. With this achievement, it will be possible to create more advanced activity chains taking into account user requirements. The main contributions of this paper are:

• To provide a detailed classification of parameters related to trip chaining.
• To identify the types, potential values, and data sources of the parameters.
• To aggregate weights related to parameters, so that user settings become simpler.
• To create a utility function for the activity chain optimization.

The rest of this paper is structured as follows. Section 2 presents the literature review. In Section 3, the parameters are defined, which were separated into two groups: classification and optimization parameters. In Section 4, the model is elaborated, where the utility function connects with the weights and parameters. In Section 5, the implications, limitations, and realization options are discussed. Section 6 provides the conclusions.

Literature Review
Trip chaining, often called activity chain optimization, has been investigated by Timmermans et al. [13], who described and analyzed travel patterns. They proved that travel patterns are, in general, independent from spatial settings. Liao et al. [14] modeled the activity-travel scheduling problem to predict short-term effects of travel information systems and travel demand management. The authors developed a multi-state supernetwork, where the temporal dimension was also included when selecting the locations of activities. More importantly, personal preferences were taken into account, which supported optimal solutions for the travelers. Buliung et al. [15] explored the spatial variety of activity patterns and highlighted the importance of the flexibility of activities; however, no algorithm was developed to provide solutions to the travelers. Balaji et al. [16] worked on a hybrid approach that combined customer prioritization with optimization algorithms. In their model, users were clustered, and optimal routes were assigned using the analytic hierarchy process (AHP). Hafezi et al. [17] developed a method for modeling the daily activity patterns of individuals. The dependencies between activity type, activity frequency, and socio-demographic characteristics were taken into account while employing a random forest model. Kang and Recker [18] proposed an algorithm for daily activity scheduling using the location selection problem, where the locations of activities were chosen using predetermined and alternative locations. In order to define the best solutions, a utility function was introduced. The problem of the huge search space was in this case solved with dynamic programming. Hilgert et al. [19] developed a mobility assistance system that gathers information from timetables and a real time information system. Furthermore, it knows the user plans and can reorganize weekly activity schedules according to personal preferences. They included both personal and network related parameters.
In order to collect data from activities and user preferences, traditional survey methods and automatic data collection methods can be applied [20,21]. The answers of the questionnaires can be analyzed by AHP in order to determine the preferences of user groups [22]. However, these methods require time and human resources. A current application for automatic data collection is GTPlanner [23], which takes into account personal preferences when planning routes for users, and also provides information on their trips.
Lawton [24] claimed that there were four possible sources of information to build up or feed activity-based models, which may be used to identify parameters of an optimization model: household surveys (revealed preference) to study activities that influence travel demand; stated response surveys to investigate activity-travel patterns; longitudinal panel surveys; and retrospective surveys of activities to explore long term behavior (e.g., household location decisions). The combination of these methods with information technology supported information may help to identify personal parameters to establish a proper activity chain optimization model. In connection with this, an important aspect of a survey by Frignani [25] was the attempt to capture activity-travel planning attributes. The planning attributes were focused on timing and the constraints of planning decisions and explored whether user decisions regarding transportation mode are mainly driven by routine, while the choice of start time of activities is more individual and impulsive. In addition, Artenze et al. [26] developed a latent-class user model for tourists, where they used activity location-based parameters and trip-based parameters (i.e., tourist attraction values, time-use characteristics and point of interest (POI) attributes). With a multi-attribute utility function, personalized optimal tours were offered for the users. This approach was also utilized in the current research.
Relevant papers were collected (Table 1), which cover the aspects of activity chain optimization, especially the general goal setting (modeling, with algorithm development), the used network (multimodal, with activity types), the applied calculation method (optimization, with utility function), and the types of parameters included (classification, with optimization parameters). Artenze [27] placed emphasis on providing personalized advice for travelers. The main idea was to find out travel parameters based on choices, where empirical testing was performed based on a travel choice experiment; however, no optimization was performed. Nijland et al. [28] developed an activity-based model, where daily agendas were modeled based on a web survey with reported activities. The research analyzed the effects of planned activities on the decision to schedule an activity, but no optimization was realized. Another activity travel scheduling model was created by Miller and Roorda [29] based on travel diaries. Their aim was to understand the process of how travelers schedule and reschedule activities with a utility maximization approach, however, several features were lacking such as flexibility and multimodality. Chowdhury and Scott [30] examined the influence of the built environment on trip-chaining behavior with regression models. They took into account personal and household characteristics, and a few attitudinal variables, but did not use detailed optimization parameters. Their focus was rather on the modeling of accessibility, and not on the optimization of trips during the day using a utility function, which is present in our model.
Dib et al. [31] worked on a route planning problem in a practical way. They developed route planning methods in multimodal transportation networks using genetic algorithms and variable neighborhood search methods. In contrast to traditional algorithms, this approach was fast enough for practical routing applications. However, the approach was presented on a theoretical network and did not consider daily activities and optimization parameters. Ghiani et al. [32] solved the traveling salesman problem with heuristic algorithms to generate optimal activity chains. Here, the implementation of daily activity optimization was presented, however, neither flexibility nor a complex utility function were elaborated. Nuzzolo and Comi [33] created a method of how to choose paths in multimodal travel networks. The method used an individual traveler utility function, which allowed personal preferences to be included, although daily activity chains and complex optimization parameters were not considered.
Västberg et al. [34] developed a dynamic discrete choice model for daily activity travel planning including individual preferences and generating a utility function. Additionally, time-space constraints were taken into account, but personal and optimization parameters were not. One of the most complex solutions was provided by Pougala et al. [35], who elaborated a scheduling method for daily activities where a complex utility function with flexible activities was included using a mixed integer programming approach. They covered four transportation modes and 11 activity types; however, classification parameters were not considered.
Malik and Kim [36] created an optimal travel route recommendation mechanism to predict the best routes for tourists based on neural networks and particle swarm optimization. In their route optimization, a complex utility function was created, and five main optimization factors were included; however, activity types did not play a role, and only a limited number of optimization parameters were considered. Another excellent approach was elaborated by Charypar and Nagel [37], who applied genetic algorithm to provide activity plans, where a complex utility function was created taking into account the preferences of the users. Their utility function included the time and the location of the activity, however, multimodality and activity types were not handled.

Definition of Classification and Optimization Parameters
In order to model the complex requirements of users regarding an urban activity chain, the possible optimization parameters were identified. In the literature, the main typical optimization parameters are time, cost, and comfort. Furthermore, the parameter type, component type, possible values, and data sources were created for grouping the parameters.
Parameter Type: Two types of parameters can be introduced (Figure 1). The parameter type describes whether the parameter is a classification parameter or an optimization parameter. The detailed descriptions of the parameters follow the order of the parameter types, which are actually strongly linked to the component type.
The classification parameters are not used directly in the optimization process, but are crucial inputs for the classification of users into user groups. The creation of user groups facilitates user decisions about setting the weights for the optimization parameters. The user groups possess predefined settings of the weights, where the weights provide only an initial setting and the values can be changed by personal preferences.
The optimization parameters are used in the optimization process. Their two main groups are the general optimization parameters without weights (with exception of time and cost) and the comfort optimization parameters with weights. Usually, general optimization parameters are parameters with fixed or predefined values, where weighting cannot be defined (e.g., opening times). Parameters present directly in the utility function are in italics.

Component Type: Three types of component were identified: the user, the trip, and the location (Figure 2). Most parameters clearly belonged to one component type, but some parameters influenced more component types, therefore they were placed in the intersections. The user includes classification and optimization parameters, which depend on the individual user. The trip contains optimization parameters and is divided into sub-types according to the transportation modes, as transportation modes have specific parameters. The location consists of those optimization parameters, which are connected to the location of the activity.

Possible Values: In the case of optimization parameters with numeric values, the quantification and creation of categories is easy, as exact values can be assigned to the categories (e.g., prices). The quantification is also possible for optimization parameters with a textual value set by assigning artificially created value categories. In some cases, the optimization parameters can only be categorized by applying heuristic considerations or the exact values of the categories can be learned by collecting a large number of examples (e.g., crowding). In the case of optimization parameters with weights, the parameters have the following possible values: low, medium, high. Low values represent "good" features, while high values represent "bad" features.

Data Source: The data source refers to the origin of the parameters, which can originate from the user (by setting the requested values), from the application (by collecting and evaluating usage statistics), or from external sources (by receiving data or datasets). The external sources can be represented by a transport operator, a municipality, a social media provider, a POI database, or other databases.

Classification Parameters
In the following section, the parameters are grouped by the parameter type. The comfort optimization parameters are further divided by the component type.
The classification parameters and their attributes are identified in Table 2, and are mainly connected to the user component type.

• Age, gender, occupation, income, car ownership, family status: The basic socio-economic data, which are required to categorize users into user groups.

• Number of daily trips: Average number of trips during a day (e.g., users with family tend to make more daily trips, while pensioners probably make fewer daily trips).
• Flexibility: Average number of flexible activities during a day (e.g., users with flexible working hours and students tend to have more flexible activities).
• Number of changes: Average number of changes in daily activity plans (e.g., younger people tend to change their mind and have new unplanned events during the day).
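The role of user groups, which supply predefined weights that the user may then change, can be sketched as follows. The group names, weight names, and default values are hypothetical; only the idea of group defaults overridden by personal preferences comes from the text.

```python
from dataclasses import dataclass, field

# Hypothetical user groups with assumed default weights (illustrative values only).
GROUP_DEFAULTS = {
    "student": {"delay_sensitivity": 2, "price_sensitivity": 5, "biking_preference": 4},
    "businessman": {"delay_sensitivity": 5, "price_sensitivity": 1, "biking_preference": 1},
}


@dataclass
class UserProfile:
    group: str                                      # result of the classification step
    overrides: dict = field(default_factory=dict)   # personal changes to the defaults

    def weights(self) -> dict:
        """Start from the group's predefined weights, then apply personal changes."""
        merged = dict(GROUP_DEFAULTS[self.group])
        merged.update(self.overrides)
        return merged


# A student who likes biking more than the assumed group default.
user = UserProfile(group="student", overrides={"biking_preference": 5})
user_weights = user.weights()
```

Copying the group defaults before applying the overrides keeps the shared defaults unchanged, so one user's settings do not leak into another user's profile.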

General Optimization Parameters
The general optimization parameters were identified. Most of these parameters were without weights, with the exception of time and cost. These parameters are mainly connected to both the user and the location component type (Table 3).

Comfort Optimization Parameters
The comfort optimization parameters connected to the trip component type are described in Table 4.

• Weather ($p_9$): Measure for the actual daily average weather situation measured by the temperature and the humidity (e.g., rainy, windy).

Finally, the comfort optimization parameters connected to the location component type were identified (Table 5).

Elaboration of the Method
Utility functions were introduced in order to combine the values of the optimization parameters and to support the creation of activity chains. The utility functions consist of optimization parameters and weights. Weights related to comfort optimization parameters are aggregated weights.

Aggregated Weights
The aggregated weights were introduced to decrease the number of required settings by the users. They influence the relevance of more optimization parameters, thus the modeling of typical user requirements is present. The possible values of the aggregated weights can be between one and five. These values are predefined by the user groups (average values), but can be changed by the user (Figure 3). The utility functions ($u(p, w)$) regarding comfort optimization parameters were formalized, creating the mathematical context of dependencies between optimization parameters and aggregated weights. The following aggregated weights were defined:

• Routine ($w_r$): Measure of willingness to differ from well-known routes; this weight has a general effect on several parameters (e.g., willingness to make detours, if it is beneficial) and is a super aggregation with an effect on delay sensitivity, lifestyle, quality sensitivity, price sensitivity, and area sensitivity.
• Delay sensitivity ($w_1$): Average delay tolerated by the user, which depends on the congestion and the incident probability of the chosen trip (e.g., users with high delay sensitivity should avoid congested routes).
$$u_1(p, w) = p_{15} \cdot w_1 + p_{16} \cdot w_1 \cdot w_r \qquad (1)$$
• Lifestyle ($w_2$): Measure for environmental consciousness and security features (e.g., rather using more eco-friendly transportation modes and avoiding dangerous areas).
$$u_2(p, w) = p_{10} \cdot w_2 + p_{27} \cdot w_2 \cdot w_r \qquad (2)$$
• Quality sensitivity ($w_3$): Measure for taking comfort features, price ranges, and parking space into account (e.g., businessmen tend to use cars and visit places with higher prices).
$$u_3(p, w) = p_{11} \cdot w_3 + p_{30} \cdot w_3 + p_{24} \cdot w_3 \cdot w_r \qquad (3)$$
• Price sensitivity ($w_4$): Willingness to pay for a certain trip, which includes traffic tolls and parking fees (e.g., workers may travel longer distances, where no traffic toll has to be paid).
$$u_4(p, w) = p_{14} \cdot w_4 + p_{29} \cdot w_4 \cdot w_r \qquad (4)$$
• Area sensitivity ($w_5$): Measure for taking features regarding ratings and the area of the location into account such as the city area and location area (e.g., users tend to visit restaurants in the city center, but a recreational activity rather close to a park).
$$u_5(p, w) = p_{26} \cdot w_5 + p_{23} \cdot w_5 + p_{25} \cdot w_5 \cdot w_r \qquad (5)$$
• Biking preference ($w_b$): Measure of the willingness of using a bike during trips (e.g., students tend to bike more often); this weight has a general effect on biking related parameters.
• Biking habits ($w_6$): Requirements of the users regarding road quality, biking routes, and weather (e.g., many users prefer built roads and good weather).
$$u_6(p, w) = p_{12} \cdot w_6 + p_{13} \cdot w_6 + p_9 \cdot w_6 \cdot w_b \qquad (6)$$
• Car preference ($w_c$): Measure of the willingness of using a car during trips (e.g., businessmen tend to use their own cars more often); this weight has a general effect on car related parameters.
• Car habits ($w_7$): Requirements of the users regarding road quality and weather (e.g., certain users do not use their cars in winter).
$$u_7(p, w) = p_{13} \cdot w_7 + p_9 \cdot w_7 \cdot w_c \qquad (7)$$
• PT preference ($w_p$): Measure of the willingness of using PT during trips (e.g., younger people prefer public transportation, because they can utilize their time more efficiently by reading on the vehicles); this weight has a general effect on PT related parameters.
• PT habits ($w_8$): Requirements of the users regarding number of transfers, crowding, and vehicle types including cleanliness, comfortable seats, heating and air conditioning (e.g., users do not prefer old vehicles without air conditioning during the summer).
• Walking preference ($w_w$): Measure of the willingness to walk during trips (e.g., young people tend to walk more); this weight has a general effect on walking related parameters.
• Walking habits ($w_9$): Requirements of users regarding pavement quality, street type, and weather (e.g., certain users prefer nice road with trees and good weather).
$$u_9(p, w) = p_9 \cdot w_9 + p_{21} \cdot w_9 + p_{20} \cdot w_9 \cdot w_w \qquad (9)$$
• Special needs ($w_{10}$): The need for special services such as modern low floor vehicles, need to avoid stairs or slopes and accessibility of locations (e.g., users with wheelchairs do not like to visit places without ramps).
$$u_{10}(p, w) = p_{22} \cdot w_{10} + p_{18} \cdot w_{10} + p_{28} \cdot w_{10} \qquad (10)$$
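The comfort utility functions listed above share one pattern: some parameters are multiplied by the aggregated weight alone, and others are additionally scaled by a super weight (routine or a mode preference). The following sketch encodes that pattern as a lookup table. The table layout, the function name `comfort_utility`, and the sample values are assumptions for illustration; the parameter indices follow Equations (1) to (7), (9), and (10) as given in the text.

```python
# Per utility index: (plainly weighted parameter indices,
#                     parameter indices also scaled by a super weight,
#                     key of that super weight or None).
# The utility for PT habits (w_8) is left out because its terms are not given in the text.
TERMS = {
    1: ((15,), (16,), "r"),
    2: ((10,), (27,), "r"),
    3: ((11, 30), (24,), "r"),
    4: ((14,), (29,), "r"),
    5: ((26, 23), (25,), "r"),
    6: ((12, 13), (9,), "b"),
    7: ((13,), (9,), "c"),
    9: ((9, 21), (20,), "w"),
    10: ((22, 18, 28), (), None),
}


def comfort_utility(i, p, w, super_w):
    """Evaluate u_i(p, w) from parameter values p, weights w, and super weights."""
    plain, scaled, key = TERMS[i]
    value = sum(p[k] * w[i] for k in plain)
    if key is not None:
        value += sum(p[k] * w[i] * super_w[key] for k in scaled)
    return value


# Illustrative inputs (assumed): every parameter at "medium" (coded 2),
# every aggregated weight at 3, routine at 2, mode preferences at 1.
p = {k: 2 for k in range(9, 31)}
w = {i: 3 for i in TERMS}
super_w = {"r": 2, "b": 1, "c": 1, "w": 1}

u1 = comfort_utility(1, p, w, super_w)    # 2*3 + 2*3*2
u10 = comfort_utility(10, p, w, super_w)  # three plainly weighted terms
comfort_sum = sum(comfort_utility(i, p, w, super_w) for i in TERMS)
```

Keeping the index sets in a table rather than in separate functions makes it easier to check each entry against its equation.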

Utility Function
The main utility function was defined as the sum of the products of optimization parameters and weights. The optimization parameters were weighted, where weights represent the personal preferences of the users. In the case of comfort optimization parameters, the weights were grouped into aggregated weights, so that users could express their requirements. The optimization parameters were values retrieved from external data sources, whereas weights and aggregated weights were set by the user. The value of time, the cost, and the value of comfort were different between user groups. During the optimization, the utility function is minimized. The minimization of time and cost is a well-known operation. In the case of comfort parameters, the possible values were defined in such a way that low values represent ideal conditions and high values represent not preferred conditions.
$$\min u(p, w) = p_{\mathrm{time}} \cdot w_{\mathrm{time}} + p_{\mathrm{cost}} \cdot w_{\mathrm{cost}} + \sum_{i=1}^{m} u_i(p, w)$$
• $p$: Optimization parameters.
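A small worked example may help to show how the main utility function ranks alternatives. The two alternatives, their parameter values, the units (minutes and a generic currency), and the precomputed comfort terms are all assumed for illustration.

```python
def total_utility(p_time, w_time, p_cost, w_cost, comfort_terms):
    """Weighted time plus weighted cost plus the sum of the comfort utilities u_i."""
    return p_time * w_time + p_cost * w_cost + sum(comfort_terms)


# Assumed inputs: time in minutes, cost in a generic currency unit,
# and already evaluated comfort utilities for each alternative.
alternatives = {
    "bus": total_utility(30.0, 1.0, 2.0, 2.0, [18, 12]),  # 30 + 4 + 30
    "car": total_utility(15.0, 1.0, 8.0, 2.0, [24, 20]),  # 15 + 16 + 44
}

# The utility function is minimized, so the smallest value wins.
best = min(alternatives, key=alternatives.get)
```

With these assumed numbers the bus is preferred despite the longer travel time, because its cost and comfort terms are lower.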

Optimization Algorithm
The utility function supports the creation of the optimization algorithm. In general, optimization algorithms can be divided into two basic categories. The first type is exact algorithms [38,39], which search the whole solution space and provide a globally optimal solution, however, in most cases with considerably more processing time. The second type is heuristic algorithms, which use specific rules to speed-up the solution. This implies that not the whole solution space is searched, thus they usually provide only a nearly optimal solution. However, with proper settings, it is acceptable for practical applications [40].
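The difference between the two categories can be illustrated on a tiny routing instance. The sketch below compares an exhaustive search with a nearest-neighbour rule, which is used here only as a simple stand-in for a heuristic and is not the method proposed in this paper. The distance matrix is made up.

```python
from itertools import permutations

# Hypothetical symmetric distance matrix for four locations (node 0 is the start).
DIST = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]


def tour_cost(order):
    """Cost of a closed tour that starts and ends at node 0."""
    path = [0, *order, 0]
    return sum(DIST[a][b] for a, b in zip(path, path[1:]))


def exact_tour():
    """Exact approach: enumerate every visiting order (factorial growth)."""
    return min(permutations(range(1, len(DIST))), key=tour_cost)


def nearest_neighbour_tour():
    """Heuristic approach: always move to the closest unvisited node."""
    unvisited, current, order = set(range(1, len(DIST))), 0, []
    while unvisited:
        current = min(unvisited, key=lambda n: DIST[current][n])
        order.append(current)
        unvisited.remove(current)
    return tuple(order)


exact_cost = tour_cost(exact_tour())
heuristic_cost = tour_cost(nearest_neighbour_tour())
```

The exhaustive search inspects all orders and cannot be beaten, while the heuristic inspects only one order, so its cost can never be lower than the exact one.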
For the optimization of activity chains, a special heuristic algorithm is to be applied. In the case of transportation related problems, the genetic algorithm (GA) framework has been successfully applied to activity scheduling problems [41], such as the travelling salesman problem (TSP) [42], the travelling salesman problem with time-windows (TSP-TW) [42], and the vehicle routing problem (VRP) [43], which are classified as NP-hard problems [44]. These kinds of problems are usually harder to solve as the size of the network grows; however, when using the GA framework, solutions can be calculated in a reasonable amount of time.
The optimization algorithm uses this GA framework that iteratively solves the TSP-TW problem for different combinations. Thus, it provides a set of possible solutions, which are evaluated based on the elaborated utility function. After running the algorithm for several iterations, a nearly optimal solution can be derived for the planned activity chain of the user.
The functioning of the algorithm can be described in the following steps:

• Data input: This part is especially supported by the classification and optimization parameters, which were discussed in detail in Section 3. They provide the main input for the optimization algorithm. During the creation of activity chains, it is assumed that the user is already aware of the activities and other parameters, which are provided to the algorithm in advance.
• Creation of alternatives: Priority is one of the most important optimization parameters. Based on its value, if an activity is flexible, then the demanded service may be available in more places. The algorithm has to find these alternative locations, so that better alternatives can replace the original activity locations. However, if an activity is fixed, then the activity location cannot be changed and thus optimization cannot be performed for this activity.
• Calculation of the utility function: With the original and alternative locations of activities, the utilities between the activity locations can be calculated. The utility function, which provides the ranking of different alternatives, was discussed in detail in Sections 4.1 and 4.2.
• Optimization algorithm: Based on the provided utility function, the GA calculates the best scenarios, which results in an optimized set of activity locations based on the provided classification and optimization parameters. The GA framework runs several times to find possible solutions. It is not ensured that the global optimum will be reached; however, with good parameter settings, the solution can be quite close.
• Visualization: The proposed activity chain has to be shown on a map, where the optimal activity locations are present, and the daily route is available.
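The steps above can be sketched with a deliberately simplified genetic algorithm. Several assumptions are made: the order of activities is fixed, time windows are ignored, straight-line distance stands in for the utility function, and the activities, coordinates, operators, and GA settings are all hypothetical. The paper leaves the concrete operators and settings to future work, so this is not the authors' implementation.

```python
import math
import random

# Hypothetical chain: each activity lists candidate (x, y) locations.
# Fixed activities have one candidate; flexible ones have several.
ACTIVITIES = [
    [(0.0, 0.0)],                          # home (fixed)
    [(5.0, 1.0), (2.0, 2.0), (8.0, 8.0)],  # shopping (flexible)
    [(3.0, 3.0)],                          # work (fixed)
    [(9.0, 0.0), (4.0, 4.0)],              # restaurant (flexible)
]


def utility(chromosome):
    """Stand-in for u(p, w): total straight-line distance (lower is better)."""
    points = [ACTIVITIES[i][g] for i, g in enumerate(chromosome)]
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def random_chromosome(rng):
    """One gene per activity: the index of the chosen candidate location."""
    return [rng.randrange(len(candidates)) for candidates in ACTIVITIES]


def crossover(a, b, rng):
    """One-point crossover between two parents."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(chromosome, rng, probability):
    """Re-draw each gene with the given probability."""
    return [rng.randrange(len(ACTIVITIES[i])) if rng.random() < probability else g
            for i, g in enumerate(chromosome)]


def run_ga(population_size=20, generations=30, mutation_probability=0.2, seed=0):
    rng = random.Random(seed)
    population = [random_chromosome(rng) for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=utility)
        parents = population[: population_size // 2]  # truncation selection
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng),
                   rng, mutation_probability)
            for _ in range(population_size - len(parents))
        ]
        population = parents + children               # parents survive unchanged
    return min(population, key=utility)


best = run_ga()
```

Because the better half of each generation is carried over unchanged, the best utility found never gets worse from one generation to the next. A fuller version would also evolve the visiting order and check time windows.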

Discussion
In this study, a well-defined utility function was created to support the optimization of activity chains. The main limitation of the study is the lack of realization, which will be done in a later stage of the research, however some considerations are discussed in this section.
A crucial point of the development of the planned algorithm is the specific setting of the GA framework, where the genetic operators have to be initiated, which are the selection, the mutation, and the crossover operators. Moreover, parameters of the GA framework also have to be set, which are the population size, the mutation probability, the crossover probability, and the number of generations. The exact realization of these steps requires an extensive analysis and testing of the proposed framework, which are part of the future research directions.
When realizing the optimization algorithm with the proposed utility function, a series of experiments need to be conducted to analyze the effectiveness of the algorithm. Thus, comparisons between the heuristic and the optimal solutions will be provided. In addition, a sensitivity analysis is needed to check how the setting of each parameter changes the results of the optimization.
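A one-at-a-time sweep is one simple way such a sensitivity analysis could be set up. In the sketch below, the alternatives, their parameter values, and the baseline weights are made up, and comfort is collapsed into a single term for brevity.

```python
# Hypothetical alternatives described by (time, cost, comfort) parameter values.
ALTERNATIVES = {
    "bus": {"time": 30.0, "cost": 2.0, "comfort": 3.0},
    "car": {"time": 15.0, "cost": 8.0, "comfort": 1.0},
    "bike": {"time": 25.0, "cost": 0.0, "comfort": 2.0},
}
BASE_WEIGHTS = {"time": 1.0, "cost": 2.0, "comfort": 3.0}


def utility(params, weights):
    """Sum of the products of parameters and weights (lower is better)."""
    return sum(params[k] * weights[k] for k in weights)


def best_alternative(weights):
    """Alternative with the minimal utility under the given weights."""
    return min(ALTERNATIVES, key=lambda name: utility(ALTERNATIVES[name], weights))


def one_at_a_time(weight_name, values):
    """Vary one weight, hold the others at baseline, record the chosen alternative."""
    results = {}
    for value in values:
        weights = {**BASE_WEIGHTS, weight_name: value}
        results[value] = best_alternative(weights)
    return results


time_sweep = one_at_a_time("time", [1, 2, 3, 4, 5])
```

With these assumed numbers the preferred alternative switches from bike to car as soon as the time weight rises from 1 to 2, which is the kind of threshold a sensitivity analysis is meant to expose.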
The utility function could be applied in any location where the following data are available: a map of the city with routes provided by a map operator, the timetable of public transportation provided by the transport operator through an interface, city specific parameters provided by local authorities, and the set of activities provided by the travelers. If some data are not present, the algorithm would still be functional; however, the optimum would be calculated with fewer parameters.
By collecting personal information, the real weights of the users can be acquired. This could be achieved through extraction from usage statistics (e.g., average waiting time for the bus) or by letting the user choose the value of the weight parameter, which is adapted during real usage (e.g., number of daily trips). As a consequence, how well a user fits a user group can be analyzed (e.g., a certain user likes biking more than the average of their user group). Finally, case studies could be carried out that would include the logging of user trips and the comparison of the original and optimized activity chains.

Conclusions
In this paper, a model with a set of parameters was introduced and grouped into classification parameters (to classify users into user groups) and optimization parameters (to provide utilities to the optimization algorithm). The parameters were connected to the user, the chosen transportation mode, or to the location type of the activity. Aggregated weights were assigned to the optimization parameters, which represent the preferences of the users. A utility function was also elaborated to provide input for the optimization algorithm about the preferences of the user. As a conclusion, it was observed that some parameters were easy to include in the optimization algorithm (e.g., time), but some were hard to quantify or collect. In order to realize the optimization framework in real circumstances, a huge amount of external information is required to feed the model.

Funding:
The research reported in this paper was supported by the BME Artificial Intelligence FIKP grant of EMMI (BME FIKP-MI/SC).