Solving the Urban Transit Routing Problem Using a Cat Swarm Optimization-Based Algorithm

.


Introduction
The urban transport system is an important feature of today's urban areas. The design of high-quality urban transport systems comprises a major issue for modern cities due to their development, it and affects both pollution and environmental matters. In the corresponding literature, two main types of urban transport systems are defined [1]: (i) public transportation networks and (ii) private transportation networks.
Nowadays, people throughout the world recognize the importance of having efficient urban public transport systems [2]. As a matter of fact, many different transportation means are included in the public transportation system, for example, busses, metro, trams, etc. Efficient public transportation systems can reduce negative effects of private transportation networks [3].
The quality of bus services is mainly determined both by the minimization of waiting and in-vehicle travel times as well as the reduction of en route vehicle changes. Indeed, it is extremely difficult to have an ideal bus service, that is, a bus service which satisfies all passengers' needs while simultaneously maintaining all operator costs in an acceptable level [4]. Usually, throughout the world, it is the responsibility of bus companies to design efficient bus routes and schedules for an urban area. On the other hand, there are also some cases where the local authorities have the responsibility to The formalism used to model the UTRP in this paper is the same with the one used in all contributions presented in the previous section. Moreover, we use an equivalent fitness function (see Section 3.1.3) and assess the resulting solutions' quality using analogous performance criteria. We decided to do so to be able to compare with other approaches' results on a fair basis. The criteria on which the assessment of the resulting bus transportation networks are based are five items. The 1st criterion is each passenger's average in-vehicle travel time or alternatively the total time for all passengers needed to cover their transport demands. The 2nd criterion is the percentage of passengers been able to travel directly from their origin to their destination. The 3rd criterion is the percentage of passengers who must make only one transfer to go from their origin to their destination. The 4th criterion is the percentage of passengers who must make exact two transfers to go from their origin to their destination. Finally, the 5th criterion is the percentage of passengers who must make three or more transfers to travel from their origin to their destination or are not able to reach their destination using public network at all.
The objective function implemented (the value of which must be maximized) is the sum of two conflicting parameters. The first is the passenger cost which is represented by the total journey time overall passengers. The second is the operator cost which is represented by the total length of the route Algorithms 2020, 13, 223 4 of 27 set. There are two cases investigated in the current contribution. According to the 1st case, designing each route set is completely based on passengers' profit (operator's cost is absent). According to the 2nd case, the route set design also takes into account the operator's profit (see Sections 3.1.3 and 3.1.4).

Urban Transit Routing Problem (UTRP)
The UTRP aims at developing a set of vehicle routes for an urban transit network while meeting all passengers' and/or operator's constraints. To model a transportation network, bus stops are modeled as adjacent nodes and, as a result, are linked by an edge. Several nodes connected by edges form a route. Several edges that connect nodes constitute a transportation path. In Figure 1, we present Mandl's Swiss public transit network [25]. This network has fifteen nodes and twenty-one edges. Traveling times between nodes are defined by the numbers on the edges.
Algorithms 2020, 13, x FOR PEER REVIEW 4 of 27 designing each route set is completely based on passengers' profit (operator's cost is absent). According to the 2nd case, the route set design also takes into account the operator's profit (see Sections 3.1.3 and 3.1.4).

Urban Transit Routing Problem (UTRP)
Τhe UTRP aims at developing a set of vehicle routes for an urban transit network while meeting all passengers' and/or operator's constraints. To model a transportation network, bus stops are modeled as adjacent nodes and, as a result, are linked by an edge. Several nodes connected by edges form a route. Several edges that connect nodes constitute a transportation path. In Figure 1, we present Mandl's Swiss public transit network [25]. This network has fifteen nodes and twenty-one edges. Traveling times between nodes are defined by the numbers on the edges.  [25].
A candidate bus route (transportation path) constitutes several nodes connected by edges. We can define a valid bus route among nodes 8, 14 and 7 since these nodes are connected by existing edges of the network. However, nodes 9, 11 and 14 cannot form a bus route since there are not any existing edges that directly connect these nodes of the network. A route set is formed by one or more transportation paths (valid bus routes). Finally, all the routes of a route set that have been superimposed constitute the route network.
It is obvious that a route network should contain all nodes of the current transit network but may not include all its edges. As a matter of fact, we can say that each route network constitutes a subgraph of the existing road network (graph). An ideal route network, that is, a network satisfying all traveling demands, is a network with paths that connect each node of the network with every other node. To construct an efficient route network, we must acquire precise estimations of all travel demands. To realize these estimations, one should [44]: • Undertake public and private vehicle analysis. • Carry out surveys on the local population.
A candidate bus route (transportation path) constitutes several nodes connected by edges. We can define a valid bus route among nodes 8, 14 and 7 since these nodes are connected by existing edges of the network. However, nodes 9, 11 and 14 cannot form a bus route since there are not any existing edges that directly connect these nodes of the network. A route set is formed by one or more transportation paths (valid bus routes). Finally, all the routes of a route set that have been superimposed constitute the route network.
It is obvious that a route network should contain all nodes of the current transit network but may not include all its edges. As a matter of fact, we can say that each route network constitutes a subgraph of the existing road network (graph). An ideal route network, that is, a network satisfying all traveling demands, is a network with paths that connect each node of the network with every other node. To construct an efficient route network, we must acquire precise estimations of all travel demands. To realize these estimations, one should [44]: • Undertake public and private vehicle analysis.

•
Carry out surveys on the local population. However, these estimations are difficult to make, since travel demands are continuously modified and are very sensitive to factors like quality of service, pricing policy, etc. In the ideal case, the most travel demanding routes should be fulfilled with short travel paths and few vehicle transfers. In this way, however, the service level of all other routes will probably be affected. There are also several other factors that affect designing an effective route network, such as the local government's transport management policies, the local area's street environment, etc. [45].

Input Data and Optimization Criteria
The input instances that are used in current work are the Mandl's Swiss transit network [25] and four transit networks that were contributed by Mumford [13]. The data used by the CSO-based algorithm as input are described in the following:

•
Data that concern the connections among the network's nodes (road network's structure).

•
Data that concern times needed to go from one node of the network to another (travel times).

•
Data that concern travel demands between any two nodes of the road network (see Table 1 as an example).  The optimization criteria-in order of significance-on which the performance evaluation of the presented CSO algorithm is based, are described as follows ( [13,14,26,31,36,38,43,46,47]): 1.
The percentage of unsatisfied demands (should be as low as possible, ideally equal to zero).

2.
The average travel time in minutes per transit user (should be as low as possible). 3.
The percentage of "with no transfers" satisfied demands (should be as high as possible).
Except for these three major criteria, there also some other constraints that should be fulfilled [47]: • A minimum and a maximum number of nodes (length) must be defined for each route (defining that each route must have minimum and maximum length ensures route network's connectivity and assists bus schedule adherence, respectively).
• In order passengers to be able to travel between any two nodes of the road network, there should be a path connecting any two of them (the road network should be a connected graph).

•
Individual routes should not have any cycles or backtracks.

•
To limit transportation cost, the service provider has usually predefined the number of routes in the route set.
The constraints mentioned above are considered as hard constraints that should never be violated by any acceptable solution, and are also considered by the presented CSO approach.

Solution Approaches and Drawbacks
As mentioned above, to design an effective route network, all criteria affecting its quality must be optimized. Consequently, one must formulate the mathematical model of the problem to include all these criteria so that its solution will result to an optimal set of routes for specific input data and under specific constraints. Researchers have attempted in the past to formulate and solve the UTRP problem using a mathematical approach ( [26,48]). In these approaches, however, the desired characteristics of the route network are expressed as specific constraints, and the aim is to optimize a given function containing criteria affecting its quality [31]. Unfortunately, these attempts fail to define the specific routes of the network through the mathematical optimization problem. In [26], the authors conclude that UTRP is a difficult problem to represent using a mathematical approach due to the problem's discrete nature. The basic variables of the problem are the nodes of the network and the edges that connect them. In [49], Newell stated that the UTRP, which is a non-convex optimization problem in its general form, is a difficult problem to solve. Consequently, there are some major UTRP features that increase its complexity. One of these features is its nonlinearity. Another one is that in many cases logical variables must be defined (e.g., to determine if two nodes are related to each other). Chakroborty claimed that conventional approaches find it difficult to solve the UTRP since they cannot represent it properly and effectively [31]. These methods often add extreme computational burden trying to represent it or even fail to depict it adequately. The UTRP is NP-complete in its general form concerning its computational complexity. This means that the difficulty to find a solution rises exponentially to its size, and there is no deterministic algorithm resulting in an acceptable solution in polynomial time.

The Proposed CSO-Based Algorithm
CSO-based algorithms imitate the collective behavior of non-distributed, self-organized physical systems. The capability of natural systems to demonstrate collective intelligent behavior is generally reported in publications as swarm intelligence [50]. These systems are usually composed of several autonomous agents: entities that are meant to execute different and specific operations. These agents interact locally either with each other or with their environment. Their interactions, although with no centralized control of behavior, often lead to a prospective collective one [51]. CSO algorithms were firstly proposed by Shu-Chuan et al. in 2006 [52] and have been successfully applied to other optimization problems, as presented in the related work section (see Section 1.1) According to the classic CSO approach [50,52], there are two different states in which a cat can be during the optimization process: a) seeking mode and b) tracing mode. In seeking mode, each cat rests while examining its environment in order to decide which move should be its next one. In tracing mode, each cat moves rapidly hunting for food. The value of a Boolean variable determines whether a cat is in seeking mode or tracing mode. The way a cat is acting in seeking mode is affected by four parameters: (i) seeking memory pool (SMP), (ii) seeking range of the selected dimension (RSD), (iii) count of dimensions to change (CDC) and (iv) self-position consideration (SPC) [50,52]. The steps (in pseudocode) that each cat k (cat k ) is going to execute while in seeking mode are presented below: Step 1: Create j copies of the current position of cat k (Note: j = SMP) If (the value of parameter SPC is true) j = (SMP − 1) Add the current position into the pool of candidate position to be moved to Step 2: For (each copy) Change the value of CDC dimensions at random (Note: these changes cannot exceed a percent ± RSD) Step 3: Calculate the fitness value for all candidate positions Step 4: If (the calculated finesses are not equal with each other) Calculate the probability P i of selecting each candidate position Else P i = 1.0, (Note: for all i) Step 5: Pick at random, among candidate positions, the position to which cat k will be moved Place cat k to this position Probability P i , which is the probability of each candidate position to be selected for moving cat kp there, is computed using Equation (1): where FS i is the fitness value of position i, FS max is the biggest fitness value found, FS min is the smallest fitness value found and FS b is equal to FS max for maximization problems and equal to FS min for minimization problems, respectively.
The steps (in pseudocode) that each cat k (cat k ) is going to execute while in tracing mode are presented below: Step 1: Update the velocity of each dimension of cat k Step 2: If (the value of a velocity is outside allowed range) Set it equal to the maximum allowed value Step 3: Update the position of cat k The velocity of each dimension of cat k is updated according to Equation (2): where x best,d is the position of the cat with the best fitness value up to that moment, x k,d is the position of cat k , c 1 is a constant affecting the change in the velocity of each dimension (most of the times set equal to 2.0 [50,52]) and r 1 is a random value belonging to [0, 1]. The position of cat k is updated according to Equation (3): These two modes (seeking mode and tracing mode) are combined in the original CSO algorithm using a ratio determined by the parameter mixture ratio (MR) [50,52]. This parameter is usually set to a very small value favoring the seeking mode and thus reflecting the true behavior of cats, which spend most of their time-when awake-monitoring their environment instead of hunting for food. A more detailed description concerning the structure and operation of both modes is referenced in [50,52].

Representation of Cats
A two-dimensional array is used for the representation of the route set. An analogous representation scheme was also adopted in [37]. If, for example, the route set consists of 5 routes, each one having a maximum of 9 nodes, we need an array of 5 × 9 = 45 cells to represent the solution.
In Table 2, a random route set example of five routes, each one consisting of nine nodes at most, is presented. The 1st route begins at node 1, goes on to node 3, and then continues to node 4, etc. The other four routes are described in the same way. In the case of an empty cell, we put the value of "−1" in that cell, meaning that the respective route has been completed in a previous node. The 2nd route, for example, consists of seven nodes. As mentioned in previous sections, the total number of routes in the network as well as the maximum length of each route are both predefined by the transportation company and the local authorities, respectively.

Initialization of the Agents (Cats)
Since the UTRP is a very difficult problem to solve (see Section 2.3), we decided not to generate the initial population of route sets at random but to construct them based on some specific logical principles. Each route set is created by defining the number of routes it consists of. The initialization procedure, which has been adopted in this contribution, has been mainly based on the approach presented in [31] by Chakroborty and Wivedi, partly changed using the make-small-change procedure, as described by Fan and Mumford [37] and Fan et al. [15] and as it was also modified by Kechagiopoulos and Beligiannis [43]. In these approaches, an initial node set (INS) with K nodes is created, from which the first node is picked with a probability proportionate to its activity level. The value of parameter K, which is user defined, was chosen equal to 14 for the Mandl's Swiss network, in line with the above works. Considering that Mandl's network has 15 nodes, setting a value of 14 for the K parameter means that, besides node 14 which does not attract or generate any transfers (see Table 1), all other nodes are possible first nodes. For the four Mumford's instances [13], all the nodes in each instance are considered as possible first nodes.
To maximize the number of nodes in each route, Kechagiopoulos and Beligiannis in [43] propose repetitive creation of a route 10 times to hold the one with the largest number of nodes. In the present approach, many original agents are created (limited by memory issues that are mainly affected by the size of the road network). Then, the initial whole population is filtered, and a smaller number of active agents remains (also limited by execution time issues). The filtering criterion is either the total number of nodes in the agent's route set (we pick agents with as many as possible total number of nodes) or its fitness value (we pick agents with as bigger as possible fitness value). The first criterion has been used in trials focusing on customers' interest, and the second criterion has been used in trials concerning the provider's interest.
The initialization procedure satisfies the following demands: • Establishment of a consistent route set.

•
The duration of each route does not exceed the allowed limit.

•
The minimum and maximum number of nodes on each route of the route set is within the predefined limits. • There are no circles or inversions on each route in the route set.
The above requirements are ensured throughout the evolution of the algorithm. Initialization also determines the initial mode of each cat (if it is in tracing or in seeking mode) according to the mix ratio specified by the user (see Section 3.1.5).

Route Set Evaluation
Since the path of each route is strongly dependent on the paths of the other routes belonging to the same route set, separate evaluation of each route has no meaning. Consequently, all routes belonging to the same route set are evaluated together. The evaluation of route sets is based on several criteria measuring the "quality" level of each route set (see Section 2.2).
All criteria used for route set evaluation by the proposed CSO method are the ones already adopted by other researchers in the literature ( [25,26,31,37,43,46]). To be more precise, we have adopted the evaluation procedure proposed in [31,37,43], where all evaluation criteria presented in Section 2.2 are put in a mathematical formula in order to calculate one single number, i.e., FIT(r). The arithmetic formula of FIT(r), which is used to calculate the fitness of each route set r, is the following equation: where F 1 (r) is the score obtained by evaluating the route set r using the second criterion only (that is, the average travel time in minutes per transit user; see Section 2.2), F 2 (r) is the score obtained by evaluating the route set r using the third criterion (that is, the percentage of demand satisfied without any transfers, see Section 2.2) and F 3 (r) is the score obtained by evaluating the route set r using the first criterion only (that is, the percentage of demands unsatisfied; see Section 2.2). Parameters ω 1 , ω 2 , and ω 3 are user specified weights for scores F 1 (r), F 2 (r) and F 3 (r), respectively. Based on the methodology introduced by Chakroborty and Wivedi [31], we estimate F 1 (r), F 2 (r) and F 3 (r). F 1 (r) represents the average time spent by each passenger when using a specific route set. If the respective average traveling time is big, F 1 (r) has a small value; otherwise, its value is big. To estimate F 1 (r), one has both to calculate the average traveling time and determine whether the calculated value can be considered as "big" or "small". Estimating F 1 (r) is performed according to the following steps: 1st Step: For every node pair (i,j) and a given route set r the in-vehicle travel time IVT i,j (r) is computed by determining the smallest time in which a passenger can travel from node i to node j using the routes of the rth transit route set. Traveling time T p (r), on a path p, is calculated using the next mathematical formula: where t a is the traveling time to node α starting from the previous node of the path, n is the number of transfers involved in path p and U is a time penalty paid for each transfer. In the current approach, parameter U is set to 5 min, as proposed in [25,26,31,37] and [43]. IVT i,j (r) is the smallest T p (r) of all possible paths connecting nodes i and j. 2nd Step: The absolute minimum traveling time minT i,j between nodes i and j is calculated. This parameter is determined totally by the road network and neither by the route set nor the respective transfer delays. 3rd Step: Index f i,j (r) is determined by comparing traveling time minT i,j with the respective IVT i,j (r). Its value indicates whether IVT i,j (r) is "big" or "small". Index f i,j (r) is estimated by the next mathematical formula: where x = IVT i,j (r) − minT i,j , x m is the upper limit value of x, K 1 is a positive user-defined parameter which defines the maximum value of f i,j (r) and is then used in order to calculate the value of F 1 (r) as follows: where S r is the set of node pairs (i,j) for which transfer demands, d i,j , are satisfied by route set r.
One must notice that calculating the minimum time using Equation (7) is not so obvious, since different results may appear if someone considers transfer times or not. The selection of a route by a passenger may be done either without considering possible transfer delays (method A) or by considering all delays introduced by possible transfers (method B). We decided to examine both route evaluation methods and concluded that method B is much more computational demanding. Consequently, we used method A to evaluate routes during the execution of the algorithm and, then, the resulting route sets were also evaluated with method B, as it is obvious that this method is used in the published literature ( [13,25,37]). F 2 (r) represents the percentage of passengers traveling to their destination from their origin directly by making a single transfer or by making two transfers. Estimating F 2 (r) is realized by the next mathematical formula: where K 2 is a positive user-defined parameter which defines the upper limit of F 2 (r), K 2 /α ≤ b 2 ≤ 2·K 2 /α and α is the upper limit of d T (r); d T (r) is estimated by the following formula: where d 0 (r), d 1 (r), d 2 (r) are the percentages of passengers who travel from their origin to their destination directly, with one transfer or with two transfers, respectively. Parameters a, b and c are determined by the user (a ≥ b ≥ c). They represent the respective importance of the ways in which the demand should be satisfied in a qualitative route set, for example, trying to have a low percentage of intermediate transfers may lead to a larger average travel time. F 3 (r) represents the percentage of passengers who are not able to go from their origin to their destination by route set r, and it is estimated as follows: where K 3 is a positive parameter determined by the user, defining the upper limit of F 3 (r); −K 3 ≤ b 3 ≤ 0; and d unsat (r) is the percentage of total transit demand which cannot be satisfied by route set r.

Calculating F 4 (r)-The Service Provider's Cost Actor
Kechagiopoulos and Beligiannis [43] proposed the addition of a fourth cost, namely F 4 (r), in the evaluation function trying to construct qualitative route sets from the service provider's scope. This fourth cost depends mainly on each route set's total length. At first, they calculate the total length L tot of each route set by summing all time distances between nodes belonging to the routes of each route set. L tot is proportional to fuel consumption, a crucial factor for each service provider. The mathematical formula calculating score F 4 (r) is the following equation: where K 4 is a positive parameter determined by the user defining the upper limit of F 4 (r), −K 4 /x 4m ≤ b 4 ≤ 0 and x 4m is a limit value for L tot according to which route sets having L tot values smaller than the value of x 4m are supposed to be optimal. In these cases, F 4 (r) reaches its maximum value.

Constructing the Best Route Procedure
The algorithm starts by reading the input data, i.e.,: • Data concerning the road network as well as the transfer requests.

•
Data concerning problem requirements (number of routes in the resulting route set, maximum number of nodes for each route, the maximum time length per route and the minimum nodes per rout). • Transfer penalty.
From the initialization phase, each cat acquires a property that determines its initial mode in terms of seeking mode or tracing mode. A small percentage of cats is put into seeking mode and the rest in tracing mode. This percentage is determined by the value of the mixture ratio (MR) parameter, which in this contribution was set to 0.04 (see Section 3.1.8 and Table 3). MR determines the proportion of cats in seeking mode in the total active cats' population [52]. An MR value equal to 0.04 means that about 4% of the population is in seeking mode.  The main phase of the route set configuration algorithm runs for a predetermined number of iterations (see Section 3.1.8 and Table 3). It consists essentially in separating the active cats according to the mode they are in (seeking mode or tracing mode) and applying to them the corresponding seeking mode procedure (see Section 3.1.6) or tracing mode procedure (see Section 3.1.7). After completing these procedures, each cat's mode is redefined, and the process is repeated. When the predetermined number of iterations is completed, the algorithm ends its main phase and enters an improvement phase of the resulting solution. In this phase, a refinement procedure which seeks to improve the resulting solution by adding one node at the end of each route of the final route set is applied (see Section 3.2). The pseudocode in Figure 2 demonstrates the structure of the proposed algorithm. Steps 1-4 constitute the initialization phase; steps 5-19 comprise the main phase of the algorithm, while step 20 concerns the execution of the refinement procedure.

2.
Create initial number of agents//see Table 3 3. Sort initial population due to decreasing total route set length or due to decreasing fitness value //depending if focus is on passenger's or provider's interest

4.
Pick from the sorted population "number of active agents"//see Table 3 5.
bestCat  agent with best fitness among the active agents

Seeking Mode Process
The seeking process intends to bring about changes to the cats that are in seeking mode in order to improve the global solution. In the seeking process, a random route is selected from the best agent (i.e., the best route set so far) and takes the place of the corresponding route of the cat to which the process is applied. If none of the hard constraints is violated and the cat's fitness improves, the substitution is finalized. Otherwise, the cat returns to its original state. If, in addition, replacement leads to better fitness than the existing bestFitness, then the bestCat is also updated and substituted by current cat in which seeking process is applied (Figure 3).

Seeking Mode Process
The seeking process intends to bring about changes to the cats that are in seeking mode in order to improve the global solution. In the seeking process, a random route is selected from the best agent (i.e., the best route set so far) and takes the place of the corresponding route of the cat to which the process is applied. If none of the hard constraints is violated and the cat's fitness improves, the substitution is finalized. Otherwise, the cat returns to its original state. If, in addition, replacement leads to better fitness than the existing bestFitness, then the bestCat is also updated and substituted by current cat in which seeking process is applied (Figure 3).

Tracing Mode Process
The tracing process seeks to make changes locally, slightly altering the structure of the cats in tracing mode. This increases the chance of improving the route sets represented by cats. The changes are probabilistic rather than deterministic. Thus, both the diversity of the population and its dispersion in as much of the solution space as possible are maintained. During the tracing process, several copies of the cat to which the tracing process is applied are created. This number of copies is equal to the seeking memory pool (SMP) size parameter [52], which in the current implementation is set to 10 (see Section 3.1.8 and Table 3). For each copy of these, a random route is selected from the route set that cat represents, and a random location on that route is marked which is not empty (i.e., has value not equal to −1). All the road network nodes are overlooked, and the first one that can replace the node in the chosen location is selected. If the route set consistency is not violated, the maximum allowed length of the selected route is not exceeded and no circles or inversions are created, the substitution is finalized. Otherwise, the process proceeds to the next road network node. The process for the current copy is completed when there is a successful replacement or when the road network nodes are exhausted. The above procedure is repeated for all copies of the SMP. The fitness value of each copy is then calculated, and if all fitness values are the same, the probability of choosing each copy is set to 1. Otherwise, the probability of choosing each copy is calculated according to the value of the ratio |Fitnesscopy− Fitness min| Fitness max− Fitness min , where Fitness min and Fitness max are respectively the minimum and maximum fitness values among all copies of cat, and Fitness copy is the fitness value of the current copy. The process is completed by selecting a copy replacing the cat to which the tracing process is applied, with a probability of selection equal to the one previously calculated (Figure 4).

Tracing Mode Process
The tracing process seeks to make changes locally, slightly altering the structure of the cats in tracing mode. This increases the chance of improving the route sets represented by cats. The changes are probabilistic rather than deterministic. Thus, both the diversity of the population and its dispersion in as much of the solution space as possible are maintained. During the tracing process, several copies of the cat to which the tracing process is applied are created. This number of copies is equal to the seeking memory pool (SMP) size parameter [52], which in the current implementation is set to 10 (see Section 3.1.8 and Table 3). For each copy of these, a random route is selected from the route set that cat represents, and a random location on that route is marked which is not empty (i.e., has value not equal to −1). All the road network nodes are overlooked, and the first one that can replace the node in the chosen location is selected. If the route set consistency is not violated, the maximum allowed length of the selected route is not exceeded and no circles or inversions are created, the substitution is finalized. Otherwise, the process proceeds to the next road network node. The process for the current copy is completed when there is a successful replacement or when the road network nodes are exhausted. The above procedure is repeated for all copies of the SMP. The

Setting the Algorithm's Parameters Values
Most parameters' values used by the algorithm (presented in Table 3) are determined based on efficiency requirements, constraints of computational resources and the respective literature [18,[50][51][52]. The determination of the rest parameters' values was conducted after exhaustive experiments (by trial and error), and the selected values where the ones that utilize the algorithm's performance. These parameters are the following:

28.
Set probability of peeking current copy equal to 1.0

30.
Set probability of peeking current copy equal to , where Fitnesscopy is the fitness of current copy 31. }

32.
Pick a copy from seeking memory pool according to its probability 33. Set current agent equal to picked copy

Setting the Algorithm's Parameters Values
Most parameters' values used by the algorithm (presented in Table 3) are determined based on efficiency requirements, constraints of computational resources and the respective literature [18,[50][51][52]. The determination of the rest parameters' values was conducted after exhaustive experiments (by trial and error), and the selected values where the ones that utilize the algorithm's performance. These parameters are the following: Number of iterations: The number of iterations is determined mainly by constraints related to the execution time of the algorithm. Because the size of the solution space is increasing as the size of the road network increases (see Figures 5 and 6 for example), the number of generations depends on the size of the network and is fixed to 300 + number of nodes of the network, since we noticed that after 300 iterations the improvement in the algorithm's performance, for the tested networks, is insignificant.
Number of iterations: The number of iterations is determined mainly by constraints related to the execution time of the algorithm. Because the size of the solution space is increasing as the size of the road network increases (see Figures 5 and 6 for example), the number of generations depends on the size of the network and is fixed to 300 + number of nodes of the network, since we noticed that after 300 iterations the improvement in the algorithm's performance, for the tested networks, is insignificant.  Initial number of agents (cats): The number of agents originally created, from which the most robust are selected, is mainly determined by constraints on computing resources (memory). For relatively small networks, this number can be set to a bigger value, and for larger networks, a smaller value should be selected. After many experiments, this number has values from 135 to 1000.
Number of active agents (cats): The number of active agents used throughout the execution of the algorithm is mainly determined by the requirement that the algorithm completion time must be limited to acceptable frames. For relatively small networks. this number can be set to a bigger value, and for larger networks, a smaller value should be selected. After many experiments, this number has values from 50 to 500.
Coefficients ω1, ω2, ω3: By adjusting the values of ω1, ω2and ω3, we can control the effect of the terms F1, F2 and F3 on the fitness value (see Section 3.1.3 and Equation (1)). When performing the experimental tests, it was observed that when the number of routes in the route set is small, solutions can be found in which there is a non-zero percentage of unsatisfactory transfer demands. After exhaustive testing, we decided to set the values of ω1 and ω2 to 1.0. It was also observed that in some cases, the algorithm produced infeasible solutions when the number of routes was limited to four. Therefore, the ω3 value was set at 121.0 when the algorithm had to deal with the four routes case. Alternatively, one could adjust the value of the K3 factor accordingly. Number of iterations: The number of iterations is determined mainly by constraints related to the execution time of the algorithm. Because the size of the solution space is increasing as the size of the road network increases (see Figures 5 and 6 for example), the number of generations depends on the size of the network and is fixed to 300 + number of nodes of the network, since we noticed that after 300 iterations the improvement in the algorithm's performance, for the tested networks, is insignificant.  Initial number of agents (cats): The number of agents originally created, from which the most robust are selected, is mainly determined by constraints on computing resources (memory). For relatively small networks, this number can be set to a bigger value, and for larger networks, a smaller value should be selected. After many experiments, this number has values from 135 to 1000.
Number of active agents (cats): The number of active agents used throughout the execution of the algorithm is mainly determined by the requirement that the algorithm completion time must be limited to acceptable frames. For relatively small networks. this number can be set to a bigger value, and for larger networks, a smaller value should be selected. After many experiments, this number has values from 50 to 500.
Coefficients ω1, ω2, ω3: By adjusting the values of ω1, ω2and ω3, we can control the effect of the terms F1, F2 and F3 on the fitness value (see Section 3.1.3 and Equation (1)). When performing the experimental tests, it was observed that when the number of routes in the route set is small, solutions can be found in which there is a non-zero percentage of unsatisfactory transfer demands. After exhaustive testing, we decided to set the values of ω1 and ω2 to 1.0. It was also observed that in some cases, the algorithm produced infeasible solutions when the number of routes was limited to four. Therefore, the ω3 value was set at 121.0 when the algorithm had to deal with the four routes case. Alternatively, one could adjust the value of the K3 factor accordingly. Initial number of agents (cats): The number of agents originally created, from which the most robust are selected, is mainly determined by constraints on computing resources (memory). For relatively small networks, this number can be set to a bigger value, and for larger networks, a smaller value should be selected. After many experiments, this number has values from 135 to 1000.
Number of active agents (cats): The number of active agents used throughout the execution of the algorithm is mainly determined by the requirement that the algorithm completion time must be limited to acceptable frames. For relatively small networks. this number can be set to a bigger value, and for larger networks, a smaller value should be selected. After many experiments, this number has values from 50 to 500.
Coefficients ω 1 , ω 2 , ω 3 : By adjusting the values of ω 1 , ω 2 and ω 3 , we can control the effect of the terms F 1 , F 2 and F 3 on the fitness value (see Section 3.1.3 and Equation (1)). When performing the experimental tests, it was observed that when the number of routes in the route set is small, solutions can be found in which there is a non-zero percentage of unsatisfactory transfer demands. After exhaustive testing, we decided to set the values of ω 1 and ω 2 to 1.0. It was also observed that in some cases, the algorithm produced infeasible solutions when the number of routes was limited to four. Therefore, the ω 3 value was set at 121.0 when the algorithm had to deal with the four routes case. Alternatively, one could adjust the value of the K 3 factor accordingly.

The Refinement Procedure
The algorithm has been enriched with a solution improvement process that is activated after the end of the main phase of the algorithm. This process seeks to improve the solution by adding one node at the end of each route of the final route set. There is a pre-refinement phase in which all routes of the route set are identified if they can be refined. A route can be improved (i.e., refined) when the number of nodes in the route is less than the maximum number of nodes per route. We check whether a node could be added after the last node of the route without a circle being created or violating the connectivity of the route set. If this is the case, then the route is considered a route that can be refined. If not, the route is reversed, and the same pre-refinement procedure is repeated once more. The refinement procedure is not applied in the case in which we are focusing on provider's interest (see Section 4.3).
The refinement procedure is described in Figure 7. Steps 1-12 constitute the pre-refinement phase, while steps 13-29 comprise the main refinement phase.

Computational Results
To evaluate the performance of the algorithm, five input files from the international bibliography containing five road networks data were used: the file of Mandl's road network [25] and four files introduced by Mumford [13], namely Mumford0, Mumford1, Mumford2 and Mumford3. Table 4 contains the attributes of these files.

Computational Results
To evaluate the performance of the algorithm, five input files from the international bibliography containing five road networks data were used: the file of Mandl's road network [25] and four files introduced by Mumford [13], namely Mumford0, Mumford1, Mumford2 and Mumford3. Table 4 contains the attributes of these files. Evaluation and comparison of route sets is made considering the following criteria (in order of significance):

1.
Unsatisfied travel requirements (d unsatt ), which should be as small as possible and ideally 0%.

2.
Average travelling time per user (att), which should be as small as possible and is equal to where 0≤ i ≤ number of nodes, 0≤ j ≤ number of nodes, d ij represents the transferring demands and t ij symbolizes the travelling time between node i and node j, respectively, using the route set which is under evaluation (transferring delays are included). Note that this is different from Equation (4), in which we consider only the travelling demands that are satisfied by the route set, i.e., travelling demands that are satisfied with less than three transfers. Alternatively, we use the total user cost (ttc in minutes [40,42]) which also should be as small as possible.

3.
Percentage of travels with zero transfers (d 0 ), which should be as large as possible and ideally 100%.
Code was created in Java, using the Eclipse IDE. All tests were performed in a Windows Vista Home Premium (SP2) 32-bit OS machine, with an Intel Core 2 CPU, T7200 @2.00 GHz and 2.00 GB RAM.
All five instances have been evaluated using the same method. If there are any discrepancies between the published results and our evaluation, both results are published (this is the case for Mumford [13] solutions). Details about how to run the proposed CSO algorithm, the input files used, the output files containing the best results achieved, as well as, the executable program of the proposed CSO algorithm are available online at [53], so as to be easy for the interested reader to reproduce the below-mentioned results.

Applying CSO to Mandl's Network Instance
There were executed 100 repeatable Monte Carlo runs per instance per required number of routes. In Tables 5-12, the best and worst results achieved by the proposed CSO algorithm are selected based on the three criteria presented in the former paragraphs using the proposed order of significance. Moreover, the average and standard deviation values presented are the ones calculated over all 100 repeatable Monte Carlo runs. In bold are the values which determine which of the proposed approaches achieve the best results.  [16] 93.14 6.86 0.00 0.00 10.42 Kilic and Gok [39] 97.37 2.63 0.00 0.00 10.20 Nikolic and Teodorovic [41] 98.97 1.03 0.00 0.00 10.09 Chew and Lee [17] 97 The best solution (route set) with four routes found by the proposed CSO algorithm, which is presented in Figure 8, is the following:  Table 5 shows, the route set that the CSO approach produced for a four-route route set is the second best achieved solution among other approaches. Even the worst achieved solution by the CSO algorithm is better than many of the best solutions achieved by other approaches. The route set that CSO produces has an average travelling time per user that differs by only 5.34% from the ideal one (see Table 4).
The best solution (route set) with six routes found by the proposed CSO algorithm is the following:  Figure 8. The best route set with four routes produced by CSO.
As Table 5 shows, the route set that the CSO approach produced for a four-route route set is the second best achieved solution among other approaches. Even the worst achieved solution by the CSO algorithm is better than many of the best solutions achieved by other approaches. The route set that CSO produces has an average travelling time per user that differs by only 5.34% from the ideal one (see Table 4). Table 5. Comparative results for the best four-route solutions for Mandl's network.

Solution with Four Routes d0(%) d1(%) d2(%) dunsatt(%) att(mpu ) ttc(mpu)
Mandl [25] 69 The best solution (route set) with six routes found by the proposed CSO algorithm is the following:  As Table 6 shows, the route set that the CSO approach produced for a six-route route set improves the best achieved solution ( [38]) to 0.06% (same average travelling time; better zero transfers' satisfied transfer demands). Even the worst achieved solution by the CSO algorithm is better than many of the best solutions achieved by other approaches, and all transfer demands are satisfied with no more than one transfer. The route set that CSO produced gives an average travelling time per user that differs by 2.14% from the ideal one (see Table 4).
The best solution (route set) with routes found by the proposed CSO algorithm is the following:  Table 7 shows, the route set that the CSO approach produced for a seven-route route set improves the best achieved solution ( [23]) to 0.3% (10.12 min per user from 10.15 min per user). Even the worst achieved solution by the CSO algorithm is better than many of the best solutions achieved by other approaches. All transfer demands are satisfied with no more than one transfer in both best and worst solutions. The route set that CSO produced gives an average travelling time per user that differs by 1.14% from the ideal one (see Table 4).
The best route set with eight routes created using the proposed CSO algorithm: As Table 8 shows, the route set that the CSO approach produced for an eight-route route set improves the best achieved solution ( [45]) to 0.1% for the average travelling time. Even the worst achieved solution by the CSO algorithm is better than all but one of the best solutions achieved by other approaches. All transfer demands are satisfied with no more than one transfer in both best and worst solutions. The route set that CSO produced gives an average travelling time per user that differs by 0.74% from the ideal one (see Table 4).
In Muhammed et al. [38] and Nikolic and Teodorovic [40], the maximum number of nodes for each route was limited to 13 and 11, respectively. For a fair comparison, we ran tests using CSO, using the same maximum number of nodes per route. In Tables 9-12, the comparative best results are presented. In bold are the values which determine which of the proposed approaches achieve the best results.

Applying CSO to Mumford's Instances
There were 100 Monte Carlo trials executed per instance per required number of routes. In Tables 13-16, the best results achieved by the proposed CSO algorithm are presented based on the three criteria presented at the beginning of Section 4 using the proposed order of significance. In bold are the values which determine which of the proposed approaches achieve the best results.

Considering the Provider's Operational Cost
The provider's operational cost is considered in this section by investigating to what extent the total length of all routes affects it. Of course, the smaller the total length of all routes, the bigger the profit for transport companies. As mentioned (see Section 1), Kechagiopoulos and Beligiannis [43] proposed a method of balancing costs between the customer and the provider, admitting a fourth term F 4 (r) in the evaluation function (Equation (8)). In this contribution, we adopt the same method using the same values for the coefficients for calculating the term F 4 (r) (see Table 3). Score F 4 (r) is summed in an analogous way to the rest of the F i scores (FIT(r) = ω 1 · F 1 (r) + ω 2 · F 2 (r) + ω 3 · F 3 (r) + ω 4 · F 4 (r)).
The ω 4 weight is set equal to 1. As seen in Section 3.1.3, the K i values are the maximum values that the respective F i scores can obtain. Thus, we can modify K i values to put analogous emphasis on a specific score. We can reach the same goal by changing the values of weights ω i too.
When considering score F 4 (r), one can clearly realize the multi-objective nature of the applied algorithm. As mentioned above, score F 4 (r) is mainly affected by the total length of the entire route set, which clearly affects operational costs, for example, average fuel consumption, and introduces in the evaluation function the service provider's point of view. Since the value of F 4 (r) increases as the total route set length decreases,F 4 (r) is competing with the other F i scores. Computational experiments show that if the algorithm's parameter values are properly chosen, the algorithm can succeed in constructing route sets that balance the conflicting goals (passenger's demands vs. operational costs). In order to obtain optimal results from the passengers' scope and be able to have a fair comparison of the algorithm's performance with other methods reported in the literature, the weight value of F 4 (r) was set to 0 (operational cost not taken into account) in all experimental results presented in Sections 4.1 and 4.2.
In Tables 17 and 18, the evaluation and comparison of route sets constructed by the proposed CSO algorithm and the approach by Kechagiopoulos and Beligiannis [43] is made considering the following criteria (in order of significance):

1.
Total length (L tot ) in minutes of each route set, which should be as small as possible.

2.
Unsatisfied travel requirements (d unsatt ), which should be as small as possible and ideally 0%.

3.
Average travelling time per user (att), which should be as small as possible.

4.
Percentage of travels with zero transfers (d 0 ), which should be as large as possible and ideally 100%.
Note that in this case, the agents that remain active during the execution of the algorithm were initially picked according to their fitness value (see Section 3.2), and the refinement phase (see Section 3.2) is not activated.
In the first set of experiments, parameter K 4 was set to 10, while x 4m varies from 20 to 200 min. In Table 17, we present the average values of all evaluation criteria. Both algorithms were executed for 100 Monte Carlo runs, the number of routes equals 4, and the maximum number of nodes is restricted to 8.
As can be seen from Table 17, the approach of the CSO algorithm produces in all cases better results according to the criteria set forth above. The cost for the transmission system provider is about 72.9% on average when compared to the corresponding costs generated by the approach of [43]. The volatility of L tot values is much smaller than the corresponding volatility displayed at [43]. This means that we could set x 4m to a value bigger than 160 and produce a route set with a cost for the provider equal to 75.45 on average. At the same time, we achieve route sets that meet the requirements of passengers with no more than two transfers. The time burden for passengers in this case reaches 10.3% on average when compared to [43]. In the second set of experiments, parameter K 4 was set to 10, while x 4m varies from 80 to 160 min. In Table 18, we present the average values of all evaluation criteria. Both algorithms were executed for 100 Monte Carlo runs, the number of routes equals 4, and the maximum number of nodes is restricted to 10.
As can be seen from Table 18, the approach of the CSO algorithm produces in all cases better results according to the criteria set forth above. The cost for the transmission system provider is about 65.6% on average compared to the corresponding costs generated by the approach of [43]. The volatility of L tot values is much smaller than the corresponding volatility displayed at [43]. This means that we could set x 4m to a value equal to 160 and produce a route set with a cost for the provider equal to 75.9 on average. The time burden for passengers in this case reaches 20.9% on average when compared to [43].
In conclusion, we can adjust algorithm's parameters to fulfill passengers demands and, at the same time, succeed in decreasing operational costs. However, designing route sets with very small route lengths should be avoided, since very small route lengths demand higher frequency of bus schedules, which is not acceptable in general.

Conclusions-Future Work
In this work, a robust CSO-based algorithm has been constructed and applied to the urban transit routing problem (UTRP). Efficient and feasible near-optimal route networks for public transit networks were designed using this novel approach. To our knowledge, this is the first time that a CSO-based algorithm is applied to solve the UTRP. Five input instances were used as benchmark problems, derived from international literature, which consist of well-known test instances for the UTRP in the respective literature. Experimental results showed that the proposed CSO-based algorithm achieves in all test instances very satisfactory results comparable with the best results found to date by other effective published approaches applied to the same test instances using the same evaluation criteria. This fact demonstrates that applying CSO-based algorithms is a very good choice to effectively solve the UTRP, excluding the fact that the proposed algorithm also considers the optimization of the provider's operational cost. Some experimental results show that by adjusting the algorithm's parameters, the proposed algorithm can both fulfill passengers' demands and, at the same time, succeed in decreasing operational costs. A weak spot of the proposed algorithm is the determination of parameters' values, which plays a significant role in the algorithm's optimization capability. A more systematic method to determine the (near) optimal values of all algorithms' parameters will be one of the main issues of our future work. Moreover, further research toward optimizing the cost of the service provider as well as applying the algorithm to relevant vehicle routing problems, such as the capacitated vehicle routing problem (CVRP) and the open vehicle routing problem, are future goals of the authors.