A Bi-Objective Home Health Care Routing and Scheduling Problem under Uncertainty

Home health care companies provide health care services to patients in their homes. Due to increasing demand, the provision of home health care services requires effective management of operational costs while satisfying both patients and caregivers. In practice, uncertain service times might lead to considerable delays that adversely affect service quality. To this end, this paper proposes a new bi-objective optimization problem to model the routing and scheduling problems under uncertainty in home health care, considering the qualification and workload of caregivers. A mixed-integer linear programming formulation is developed. Motivated by the challenge of computational time, we propose the Adaptive Large Neighborhood Search embedded in an Enhanced Multi-Directional Local Search framework (ALNS-EMDLS). A stochastic ALNS-EMDLS is introduced to handle uncertain service times for patients. Three kinds of metrics for evaluating the Pareto fronts highlight the efficiency of our proposed method. The sensitivity analysis validates the robustness of the proposed model and method. Finally, we apply the method to a real-life case and provide managerial recommendations.


Introduction
The Home Health Care (HHC) industry has been fast-growing worldwide in recent years with the development of information technology and transportation systems.HHC companies provide services at patients' homes, ranging from nursing care to specialized medical services [1].Patients can thus receive treatment in familiar surroundings.This also helps to decrease hospital admissions and duration of hospital stays [2].Demand for HHC has been increasing in recent years as the elderly increasingly prefer to grow old in their own home, underscoring the critical need for the efficient organization of caregivers' daily activities.
The integration and coordination of this health service logistic network is a complex task, and managers have to face many logistics decisions.Network design, transportation management, staff management, and inventory management are all described as the set of decision problems related to design and operation in HHC [3].Depending on the time horizon from long term to short term, the planning horizon can be categorized into three levels: the strategic level, the tactical level, and the operational level [4].The strategiclevel decisions involve defining facility locations, patient districts, transportation modes, staffing, and service levels.The fleet assignment to patients' districts, the shift scheduling, and the definition of inventory policies are considered at the tactical level.The decisions at the operational level include staff assignment and routing, as well as inventory control.A fourth level has recently been recognized as a real-time level, where very short-term planning is needed, according to actual execution.
The short-term daily activities in HHC services are shown in Figure 1.Once patients receive their medical treatment prescriptions, service times and staff qualifications are defined, and managers assign qualified caregivers to patients.They draw up the caregivers' schedule based on the services requested and on the patients' available times.Finally, each caregiver delivers their medical service, starting from the home health care facility and visiting patients in a planned sequence before returning to the facility.These daily activities in HHC refer to transportation management and can be planned at the tactical and operational levels.This planning can be modeled as a Home Health Care Routing and Scheduling Problem (HHCRSP).Caregivers should arrive and leave within the time frame of patients' availability, corresponding to a fixed time interval for receiving the care service, called a time window.A hard time window ensures that the patients are visited within their time windows.Caregivers are not permitted to perform the service if they arrive or leave outside of these time windows.A soft time window can be violated at the cost of a penalty.
When determining routes and schedules, decision-makers strive to identify the most cost-effective routes while taking into account time windows.This is crucial as transportation services constitute a substantial portion of a company's expenses and can impact operational efficiency.However, both patient satisfaction and caregiver satisfaction also play an important role in HHC companies' competitiveness.HHC companies may not be capable of providing service for all patients in their available time frames.All patients can, however, receive care if some of them accept visits outside of their available time frames.The less that arrival and departure times are outside the time window, the more satisfied patients and caregivers will be.It is therefore particularly important to strike a balance between costs and quality, which has motivated us to propose two objectives.
In practical care services, caregivers must possess appropriate qualifications to provide medical care that aligns with the patient's conditions.The level of treatment required should correspond to the capabilities of the caregivers.Fairness of caregivers' workload also has to be ensured to avoid overwork or refusal of certain treatment tasks.These commonly considered features, including different staff qualifications and workload balance for caregivers, contribute to the service quality but make the model more complex.
In addition to the operational constraints, uncertainty is prevalent in such applications.A caregiver may arrive or leave a patient's home earlier or later than the previous appointment time as the schedule and the route are static.The time that a caregiver spends on the road changes little, except for instances of major road accidents.The arrival time and service end time are hard to predetermine mainly because the service time of each patient is related to their physical condition, which is often unstable, especially when emergencies or acute diseases are involved.Moreover, providing services to address social isolation among elderly patients adds to the unpredictability of service time.Individual needs and preferences for social interaction and companionship can vary significantly.Some patients may require more time and attention for meaningful conversations or activities than the planned service times.If caregivers arrive prematurely due to the preceding service ending sooner, they have to wait.This inadvertently extends their working hours.If the actual service time is longer than the planned service time, the caregiver may arrive too late at the next patient's home.This case where the patient cannot receive service on time degrades the service quality.In diabetes management, maintaining strict adherence to time schedules is crucial for stable blood glucose level control [5].Similarly, time-regulated nutritional support is needed in palliative care [6].It is therefore essential to plan robust routes and schedules for real-life home health care activities.
To conclude, there is an urgent need to define effective routes and schedules while enhancing the quality of service under uncertain service times.To address this, a new bi-objective Mixed-Integer Linear Programming (MILP) model is developed, integrating routing and appointment scheduling.Due to the complexity of solving this model for practical-sized instances, the Adaptive Large Neighborhood Search within Enhanced Multidirectional Local Search (ALNS-EMDLS) has been developed and is proposed to speed up problem-solving and provide the approximation Pareto front.Additionally, a stochastic version of ALNS-EMDLS is introduced to generate robust solutions in the face of uncertain service times.These solutions may not be optimal for every service time scenario, but they perform better on average than the deterministic method.
The rest of this paper is organized as follows.Section 2 reviews the literature related to vehicle routing and scheduling problems in home health care activities.Section 3 addresses the model with two objective functions.A scenario-based multi-objective solution methodology is proposed in Section 4. The parameters setting, computational experiments, and results analysis are presented in Section 5.In Section 6, the parameters inherited from the sensitivity analysis are utilized in a real-world application, which results in managerial suggestions.The conclusion of this paper and the perspective of future studies are summarized in the Section 7.

Related Works
In this paper, we focus on the HHCRSP within a daily planning horizon, which is a variant of the Vehicle Routing Problem with Time Windows (VRPTW) [7].The VRPTW has many applications such as telecommunication, waste collection, and cross-docking [8,9].Determining the optimal solution to VRPTW is NP-hard.Solving the HHCRSP consists of designing optimal delivery routes from a central location to a set of geographically distributed patients with various constraints.This entails optimization problems that are complex and therefore of particular interest to stakeholders.It differs from VRPTW because of the features [10]: (1) the temporal dependency and the disjunctive nature of services; (2) the continuity, given that patients are assigned to a restricted set of caregivers; and (3) caregivers' skills and patients' requests.Various constraints and objectives are used in different studies due to the home care policies of the country under study.Exact, heuristic, and approximate methods have been proposed to solve related problems [11].

Operational Constraints
Liu et al. [12] addressed a routing and scheduling problem for home care workers with the consideration of lunch break requirements.They found an optimal solution by the Branch and Price (B&P) method.Shahnejat-Bushehri et al. [13] took into account Time Window (TW), Quality of Caregivers (QC), precedence, and synchronization constraints.A parameter was used to indicate whether a caregiver had the necessary qualifications to provide a certain service.Simulated Annealing (SA) and Tabu Search (TS) were applied in two phases.A Variable Neighborhood Search (VNS)-based heuristic was used to obtain a feasible solution that had to observe assignment constraints, working time restrictions, time windows, and mandatory break times [14].Constraint Programming (CP) and TS were used to solve the routing problem while considering the problem of matching the skills of caregivers and the patient's conditions [15].Workload Balance (WB) was considered as the fair workload assignment to each caregiver in [16].The constraints in existing research also include visits incompatibilities [17], multiple modes of transportation [18], and time-dependent travel times [19].
In most other studies, the time window is used to limit the service starting time.The service starting time outside the time window leads to a penalty due to patients' dissatisfaction.However, it is more reasonable that the time window is defined as the time frame during which the patient is available.There will also be a penalty if caregivers perform a service and then leave the patients' homes outside the time windows.We, therefore, define patients' and caregivers' satisfaction as minimizing the segmented penalty due to arrival times and departure times being out of the time windows.If the penalty is continuous, only one patient is likely to endure a long delay.The segmented penalty promotes a more equitable distribution of service punctuality and increases the fairness in scheduling for patients and caregivers.Different levels of caregivers are required by patients depending on the severity of their conditions.We assume a fixed number of caregivers to be assigned to the daily schedule.The number of patients to be served by one caregiver is limited for the sake of balancing the workload.We first consider these properties together.

Multi-Objective Optimization
Decision-makers balance various factors, including cost, requiring the simultaneous optimization of multiple objectives, an approach facilitated by Multi-Objective Optimization (MOO) [20].There is seldom a single global solution, and it is often necessary to determine a set of points that all fit a predetermined definition of an optimum (for more concepts, see [21]).One of the most intuitive methods is to optimize the weighted sum of all the objective functions as a single objective optimization.It is implemented simply but is highly dependent on weights.The bounded objective function method minimizes the single most important objective function, while others are used to form additional constraints.The bounded value and the most important objective function are hard to pre-select.The optimization process of the lexicographic method is carried out individually on each objective function following the order of importance and stops when a unique solution is obtained.In a game theoretic approach, objective functions are assumed to be the players that are ultimately controlled by the decision-maker and can be expected to reach an agreement, meaning the game is cooperative.This was proposed in [22], and an improvement was stated in [23].A single solution method can be used if the decisionmakers' preference is known.However, if they cannot explicitly express their preference, it is preferable to provide them with a range of solutions to choose from.Some multi-objective optimization algorithms are capable of obtaining Pareto solutions, which represent the set of non-dominated solutions in a multi-objective optimization problem.These solutions are essential for decision-makers to evaluate trade-offs between conflicting objectives.A normal-boundary intersection is a technique intended to find the portion that contains the Pareto optimal solutions, delineating the boundary of the set of attainable objective vectors [24].A genetic algorithm is also suitable to obtain Pareto fronts since it can process a set of solutions in parallel.Non-dominated sorting genetic algorithms and other algorithms based on the genetic algorithm combine the use the random numbers and information from previous iterations to evaluate and improve a population of points [25].Recently, the Pareto Q-learning algorithm was used for solving MOO, learning deterministic, non-stationary, and non-dominated multi-objective policies while mapping the entire Pareto front [26,27].
Recent advances in HHC research have moved to explore the multi-objective optimization method to obtain the Pareto front instead of using a weighted objective function.Decerle et al. [28] combined distance and visit penalties into a single objective function, finding genetic algorithms and local searches that yielded instance-flexible results.In [29], the weighted sum of travel time, a score of continuity of care, overtime, idle time, and penalty for unscheduled patients, was minimized.Yang et al. [30] minimized travel costs, inconsistency, and workload balance using an artificial bee colony metaheuristic to generate non-dominated solutions.Braekers et al. [31] analyzed the trade-off between operating costs and service levels.A bi-objective optimization model was developed to address the travel costs and downgrading costs of nurses in [32].This model was solved by an ϵ-constraint-based approach.Fathollahi-Fard et al. [33] considered travel costs and CO 2 emissions as objective functions, using hybrid versions of metaheuristics and developing four fast heuristics for Pareto optimal solutions.
Although improving service quality and patient satisfaction is as important as reducing costs, there is still less related research on the subject.This has motivated us to introduce a bi-objective model to reconcile the interests of different stakeholders in HHC.We do not depend on the preference of decision-makers (for example, weights assigned to the objectives) to aggregate the objectives into one.Decision-makers can select their preferred solution from a Pareto optimal set according to different operational situations.
Table 1 summarizes the research that built deterministic models for routing problems in HHC.Compared with the deterministic models in recent research, our work takes into account the time windows, workload balance, and skills matching.We propose a novel approach ALNS-EMDLS.It is easy to implement and ignores the gradient information and the nature of objective functions and constraints.It contains fewer hyperparameters compared to genetic algorithms.

Uncertainties
Some parameters that are represented in a stochastic manner can model uncertainties such as service times, travel times, and demands of patients.Two common models used in general formulations are the Chance Constrained Programming (CCP) model and the Stochastic Programming with Recourse (SPR).Li et al. [34] conducted a comparative study of these two models for the vehicle routing problem under uncertain travel times and service times.Based on the results obtained by TS, the authors concluded the CCP might not be a suitable model for the target problem.This is attributed to the computational challenges arising from the stringent constraints imposed by the confidence levels.Consequently, the stochastic programming model has been chosen for handling the uncertain service times in our study.
The robust optimization aims to obtain robust solutions that remain relatively unchanged under uncertainties.The uncertainties can be modeled deterministically, probabilistically, or possibilistically [35].The uncertainties in HHC can be mainly quantified based on certain distributions [30,36,37], the triangular fuzzy numbers [38,39], the budget uncertainty polytopes [40], and a set of scenarios with fixed probabilities [41,42].The worstcase philosophy and expected performance of all scenarios can be employed to construct the robust objective function of the original formulation.The former, though conservative, can yield impractical solutions when overly large domains are chosen.Therefore, we define the robust counterpart of the original objective function based on the expectation.We assume that the uncertainty in our study follows the normal distribution.If solutions are obtained with optimal expected values while involving uncertainties that are random variables or follow probability distributions, it can be also called stochastic optimization.
To solve the HHCRSP under uncertainties, some studies in the HHCRSP utilize numerical techniques to transform the robust optimization problem into a normal optimization problem by using strong mathematical assumptions.Yuan et al. [43] used an approximate formula to replace the expected penalty for late arrival and thus to reduce computing effort.In [30], the objective functions and constraints related to uncertain service time and travel time were considered and transformed into their deterministic equivalents based on uncertainty theory.This consistent HHCRSP was solved by an improved multi-objective Artificial Bee Colony (ABC) metaheuristic.The objective function was rewritten as a recursive function based on the theory of budget in [40].However, this approach may be limited by the need for strong mathematical assumptions like first-or second-order derivatives, which may not always be accessible.In most studies, the robust objective function is typically calculated by the simulation method, and then exact, matheuristic, or metaheuristic methods are applied [36,37].
In other fields, some researchers apply the scenario-based method to solve a stochastic multi-objective model [44][45][46].A multi-objective optimization problem under uncertainty in transmission expansion planning was proposed in [47].The objective functions were the total cost, the robustness, and the flexibility criterion.The proposed process for solving this problem considered the performance of solutions in all scenarios simultaneously.It could be applied to a situation where there were not too many scenarios because the model needs to be optimized under each scenario.In the micro-grid operation field, Niknam et al. [48] modeled load demand, available output power, and market price, by means of scenariobased stochastic programming.
In our study, we develop the stochastic model and method to obtain robust routes and schedules.Given a certain distribution of the service time, we aim to optimize the expectation of the objective functions instead of using only the mean of the service times for a deterministic model.It is hard to calculate the integration of complex objective functions.Sampling from the distribution into a finite set also generates a big increase in the number of variables and constraints.Therefore, we propose a stochastic ALNS-EMDLS that combines a scenario-based method to deal with the uncertain service times.
Notably, we conclude the main research gaps by comparing our study with the following literature that is closely related.The authors of [43] modified the objective function and constraints with stochastic service times by an approximate formula, while the authors of [37] used Monte Carlo simulation.However, they only focused on single-objective optimization problems using Monte Carlo simulation or the scenario-based method.Our study is the first to apply the scenario-based stochastic method in addressing the multiobjective optimization problem in this domain.The scenario-based stochastic method is data-driven and enhances decision-making under uncertainty by evaluating multiple scenarios, allowing for more informed strategies [49].The multi-objective optimization model and algorithm that we propose offers a broader range of choices where manager preferences are unknown.
The main contributions of our work are as follows: • We develop a new bi-objective MILP model that aims to optimize the travel cost as well as the satisfaction of caregivers and patients, considering the alignment of caregivers' qualifications with patients' requirements as well as workload balance.

•
We propose the ALNS-EMDLS to solve the problem.The effectiveness of the new approach is validated by experimental results thanks to the comparison with the Gurobi solver [50].The stochastic ALNS-EMDLS is proposed to deal with uncertainties.
The contrast between the stochastic and original versions demonstrates the stochastic method's robustness.

•
In order to refine and enhance the application of our method, we conduct a sensitivity analysis to identify suitable parameters and apply them to real-world data in a case study, providing actionable management recommendations to choose the suitable schedules.

Problem Statement
In this section, a mathematical model and its components representing the extension of VRPTW in HHC have been formulated.From the perspective of graph theory: let G = (V, A), where V = {0} ∪ N = {1, 2. . ., n} is the vertex set and represents the depot and the patients.A = {(i, j) ∈ V, i ̸ = j} is the set of arcs.We aim to find minimum-cost routes that serve the vertices once and satisfy the side constraints of arc R⊆A.
It is assumed that all caregivers leave the depot at time 0. Caregiver k ∈ K starts from the facility, moves once to each patient, and returns to the depot.Each caregiver has the qualification level Q k .Fixed levels of qualification Q = {1, ..., q} are defined in this paper.A visit is therefore allowed only if the patient's requirement is lower or equal to the qualification level of the caregiver.The skill level of nurses is mainly based on their experience in dealing with complex cases.Patients are assured of receiving appropriate levels of care.The qualification matching ensures the service quality and safety.
Patients are spread across different locations.Each patient i has requirement RC i for a specific level of the caregiver.The patients' requirement set is aligned with the qualification set of the caregivers.The patients indicate their preferred time for home care during the registration process.The time window [e i , l i ] of the patient i is defined as the earliest time of starting service and the latest time of ending service that can be tolerated by the patient.If the caregiver arrives earlier than e i , the service will start before reaching e i ; if the arrival is after e i , the service begins immediately.The caregiver will leave immediately after serving for service time δ i .The departure time of node i can be calculated as (1): The arrival time of patient j is calculated as (2): Note that a jk is meaningless when patient i is not visited by caregiver k, that is,  2 shows the penalties schematically.The notation to describe the model is summarized in Table 2.The objective functions and constraints are formulated by ( 5)- (38).
∑ i∈V,i̸ =j The first objective function ( 5) is to minimize the travel cost.The second objective function (6) represents the penalty cost to be minimized.A smaller penalty cost indicates greater satisfaction for both caregivers and patients.Constraints (8) ensure that a caregiver is assigned to exactly one route.Constraints (9) mean each caregiver visits the patient and then leaves the patient.Constraints (10) indicate that caregivers start from the depot and return to the depot after finishing services.Caregivers can perform the service only if their qualification levels are satisfied by constraints (11) and (12).Constraints (13) indicate that each caregiver must serve a certain number of patients in relation to the workload balance.Constraints ( 14)-( 17) linearize the Formula (2).The arrival time at node j is the sum of the departure time from node i and the travel time from i to j, when x ijk = 1.Constraint (18) specifies that caregivers start their routes from the depot at time 0. The departure times at patients' locations are defined by constraints ( 19)- (22), which convert the Formula (1) into linear.Constraints ( 14)-( 22) guarantee the schedule feasibility and make subtours impossible.( 23)-( 36) are the variants of ( 4) and (3).Constraints ( 37) and ( 38) set the domains of decision variables.
For the MILP model, the optimal solutions of each objective function can be obtained by Gurobi Solver.We use a weighted sum method to obtain the approximation of a Pareto optimal set.The sum of the weights of two objectives satisfies ω 1 + ω 2 = 1 for normalization.This normalization ensures that the relative importance of each objective is expressed as a fraction of the whole.In the next iteration, ω 2 − ∆.We keep only non-dominated solutions from T solutions, which are obtained after T iterations.

Multi-Objective Algorithms
Our proposed method ALNS-EMDLS is divided into two main components: the Enhanced Multi-directional Local Search (EMDLS) described in Section 4.1, and the Adaptive Large Neighborhood Search (ALNS) detailed in Section 4.2.To address uncertain service times, the stochastic version of ALNS-EMDLS is proposed in Section 4.3.

Enhanced Multi-Directional Local Search Algorithm (EMDLS)
The multi-directional local search method was first proposed by Tricoire [51].Each local search is performed for a single objective (direction) iteratively to improve the nondominated front F. Each local search works separately without considering the importance of the objectives.Only non-dominated solutions are kept after one iteration.This strategy has fewer parameters and can yield well-spread solutions.The savings algorithm is a kind of constructive heuristic and can be used to construct the initial solution.In each direction, we use the ALNS to improve solutions and put the solutions in F. The Deb non-dominated sorting method is used to keep the non-dominated front after each iteration.F is saved as an ordered list to reduce the number of times executing non-dominated sorting.
Our EMDLS differentiates from the original algorithm in two ways.Firstly, unlike the original algorithm where a single solution from set F initiates the next iteration, our approach retains multiple solutions in each direction to enhance diversity.Secondly, our method selects solutions from F based on crowding distance, drawing inspiration from NSGA-II [52], rather than making random selections.The crowding distance of point i in F can be regarded as the perimeter of the hypercube, which is surrounded by the two adjacent points i − 1 and i + 1.The two boundary points are assigned to a very large number.The solution with a larger crowding distance is more likely to be chosen.

Adaptive Large Neighborhood Search (ALNS)
In the ALNS, various destroy and repair operators are selected adaptively to construct new solutions, which are accepted if their objective function values meet the record-torecord criterion, as detailed in the following subsections.

Destroy and Repair Operators
Three destroy operators and three repair operators are designed based on the previous work in [53].We remove nodes from the solution by the destroy operators and then insert the removed nodes by the repair operators.To satisfy the constraints (13), we select the routes with over m patients for destruction and those with less than n patients for repair.We choose the routes where caregivers meet patients' demands to respect the constraints (11).
A certain number of nodes are randomly removed and inserted by the random destroy operator and the random repair operator, respectively.These operators can easily be implemented to run faster than others.The worst destroy operator chooses the nodes with the largest saving that appear to be placed in the wrong position in the solution, while the relatedness destroy operator tends to select the nodes that are similar and can easily be exchanged.The relatedness of the first objective function ( 5) can be calculated by 1 c ij /c max +v , while the objective function (6) by 1 (|e i −e j |+|l i −l j |)/tw max +v , where c max denotes the largest cost of all pairs of i and j, tw max , is the length of the longest time window.If the node i and the node j are in the same route, v = 0; otherwise, v = 1.We iteratively find the node with minimum cost position in the greedy repair operator.But for the nodes that are expensive to insert in the last iteration, there are not many opportunities for inserting them because many of the routes are "full".The regret operator chooses the nodes from the removal set by calculating i = arg max i∈u [∑ k j=1 (∆ where u is the removal set, and ∆ f j i denotes the insertion value of the node i in the j th cheapest insertion position.This method selects the insertion that has a larger possibility to improve the overall performance than the greedy method.Appendix A contains the details of the destroy operators and the repair operators.

Adaptive Weight Adjustment and Acceptance Criterion
Only one destroy operator and one repair operator are chosen by probability in one iteration, where w j is the weight of the j th operator to be chosen, i ∈ {1, 2, ..., k}.The entire search is divided into several segments.A segment is a number of iterations of the ALNS.The weight w j is automatically updated after a segment and is calculated by the Formula (39).
The variable o num means the usage frequency of operator i in the latest segment.The reaction factor γ controls how quickly the weight adjustment algorithm reacts to the changes in the effectiveness of the heuristics.The r score can take three values: r 1 , r 2 , and r 3 , corresponding to three types of acceptance criteria, which assess the heuristic's recent performance.
A high score corresponds to a better performance.More specifically, in the record-to-record method, a neighborhood solution generated by the destroy and repair operators is always accepted if it outperforms the current solution, the best solution, and the sum of the best solution and deviation.
We allow infeasible solutions where some patients in the removal set may not be scheduled by a repair operator.In this case, the number of remaining patients incurs penalties in two objectives by the following formulas: where z i = 1 if the patient i can not be inserted; otherwise, z i = 0. Algorithm 1 presents each step of the ALNS-EMDLS.

Algorithm 1 ALNS-EMDLS
Input: a set F only including an initial solution x, repair operators, destroy operators, deviation d, iter seg , r 1 , r 2 , r 3 Output: the Pareto front F 1: repeat 2: x cur ← crowd_distance(F) 3: for k ←1 to K do 6: for j ←1 to iter ALNS do 7: for i ←1 to iter seg do 8: choose destroy i and repair i based on weight w 9: x new ← repair i (destroy i (x cur )) 10: x cur ← x new 12: x F ← Deb_nondominated_sorting(F, G) 28: until stopping criterion is met

Stochastic Method
An objective function of a stochastic optimization problem can be written as f (x, Y(ω)), where x is the decision variable and Y is a random variable that associates a real number to each element ω of a sample space Ω.We simply write it as f (x, ω).Without loss of generality, we consider the form of the stochastic K-objective optimization problem as [54]: It can be reduced to a deterministic model by defining f i as an s-dimensional vector F (s) i ( f i (x, ω)) (for other transformations, see [55,56]).Expectation E( f i (x, ω)), which is a specific functional F (s) i , is used in this paper.To make the expectation computationally tractable, each objective function is estimated by a sample average approximation method.E( f i (x, ω)) can be replaced by an unbiased consistent estimator [57].
where S = {ω 1 , ω 2 , . . ., ω U } is a fixed finite set of scenarios drawn in advance.We assume the service time follows a certain probability distribution that is known in advance.We generate U scenarios from the distribution for each patient by means of the Monte Carlo method.The second objective function varies under different scenarios.The first objective function is not calculated by the service time, but it is indirectly affected.
The objective function ( 6) can be rewritten as: where P s r j denotes the penalty of node r j under scenario s; R = {R 1 , R 2 ..., R |K| } is a set of routes.Here, |K| represents the number of routes, and R k means the k th route in the set R.
In the stochastic version of ALNS-EMDLS, the second objective function f 2 is replaced by f2 , and it can be calculated by Algorithm 2 under different scenarios of service time.

Algorithm 2 Stochastic simulation for computing the expectation of penalty cost
Input: a solution, a number of scenario U, s = 1, sum = 0 Output: estimate expected value of penalty cost 1: while s ≤ U do generate δ (t) r j from sample space according to the probability measure.

5:
compute the arrival time and departure time according to (1) and (2), then calculate P r j by ( 4) and (3).end for 9: end while 10: the estimated expected value is sum = sum/U

Computational Study
In this section, our experiment datasets and settings are presented.Some metrics are used to evaluate the quality of non-dominated solutions and the efficiency of multiobjective algorithms.We analyze the results obtained by an exact solution approach (the Gurobi Solver) and two approximate solution approaches (the ALNS-EMDLS and the stochastic ALNS-EMDLS).All of the algorithms have been implemented in Python [58].For all the experiments, we have used an Intel(R) Core (TM) i5-10310U CPU (@ 2.21 GHz) CPU with 16GB of RAM memory.
The computational study was carried out on two types of data sets.First, and to extend the study of the performance and robustness of the proposed method, a series of tests was created based on literature instances.Second, a several dataset was taken directly from the field.It was used to analyze the proposed solutions in terms of business practices and to draw out managerial insights.

Data Sets and Experimental Setup
No benchmark results exist in the literature for our problem.Hence, we generate six different types of (C1, C2, R1, R2, RC1, and RC2) instances based on the Solomon dataset [59], which contains various instances with different sizes and characteristics for the VRPTW problem.This variety allows for comprehensive testing of our algorithm across different instances.
The Solomon dataset provides necessary details such as locations, time windows, and service times, aligning well with our proposed problem.The data sets that support the findings of this study are openly available in "figshare" at http://doi.org/10.6084/m9.figshare.21339072(accessed on 1 March 2024).We have compared problems involving 25, 50, and 100 patients, focusing on the characteristics including the geographical data, the length of the scheduling horizon, and the proportion of time-constrained patients.The geographical data are randomly generated in R1 and R2, clustered in C1 and C2, and a mix of both structures in RC1 and RC2.The sets R1, C1 and RC1 have a short scheduling horizon, while the sets R2, C2, and RC2 have a longer one.Each type comprises four sets.For example, C1 consists of C1-a, C1-b, C1-c, and C1-d.The only distinction between C1-a and the other three sets lies in the presence of some patients having time windows that are scarcely constrained in sets C1-b, C1-c, and C1-d.
Each patient is available only between the ready time and the due time.Distance is Euclidean, and the value of the travel time is equal to the value of distance between two nodes.We assign random values to patient requirements for caregiver levels and service times.To be more practical, we diversity the service time of each patient.The mean value of service time δ accounts for twenty to sixty percent of the time window.We assume that the service time of each patient is an independently normally distributed random variable and follows N(µ i , σ 2 i ).We set µ = δ and σ = δ 5 .We assume that three caregivers are assigned to 25 patients, five caregivers to 50 patients, and 10 caregivers to 100 patients.
Before starting the problem-solving, the parameters are set.For solving the MILP by Gurobi Solver, we set ∆ as 1/50.The ALNS is affected by random factors, so we utilize the average value of the metrics from five runs for every experiment.In each iteration, the number of nodes removed by destroy operators is randomly set between 2 and 4, as reflected in the variable q in Algorithms A1 and A2 (refer to Appendix A). η in Formula ( 40) is set to 1000.Given the impracticability of testing all hyperparameter combinations, we employ Bayesian optimization for efficient exploration [60].It is assumed the hyperparameters are in a black box (an unknown function), with the function's output evaluated via a metric known as the hypervolume indicator (detailed in Section 5.2).In Bayesian optimization, it assumes the unknown function stems from a Gaussian process prior, updating the posterior distribution with new observations.An acquisition function is chosen for the next evaluation point.The tuned hyperparameters of the proposed method and their best values after 50 iterations are shown in Table 3.

Notation Definition
Value

Performance Metrics
The metrics are used to measure the convergence and diversity (diversity includes ductility and uniformity) of the solutions [61], including the number of Pareto optimal points (N), the hypervolume indicator (HV), and the Spread metric (S) [62].These metrics provide a comprehensive assessment of an algorithm's performance in identifying a diverse and well-distributed set of solutions.
The objective values are scaled between 0 and 1 before calculating these metrics.The HV gives the volume x enclosed by a reference point and the solutions and is shown in the following Formula (45): The reference point R is commonly set to the point (1,1) for minimizing a bi-objective problem.This metric measures convergence and diversity.A larger value of hypervolume signifies the better quality of the solutions.The spread metric evaluates the diversity of solutions and is given by: where d f and d l are the Euclidean distances between the extreme solutions in true Pareto front and non-dominated solutions (NDS).d is the average of the whole distance d i , and d i is the Euclidean distance between one solution and the next nearest solution, where We assume the extreme values of a true Pareto front are (0,1) and (1,0).Smaller values indicate better distribution.

Deterministic Bi-Objective Solutions
This is a base case in which the service times are considered as deterministic quantities.We compare the ALNS-EMDLS and the Gurobi Solver to measure the performance of our proposed method.
The two extreme points of Pareto front ([ f min 1 , f 2 ] and [ f min 2 , f 1 ]), HV, S, the CPU execution time of the program measured by seconds (TCPU), and the number of Pareto points (N) for the small size (10 patients) and the real-life size (25, 50, and 100 patients) instances are shown in Table 4.We find that the running time of the Gurobi Solver is much longer than that of the ALNS-EMDLS.The extreme values of the two objectives are very close.The outcomes produced by the proposed method, which encompass a broad-range Pareto front, are remarkably comparable to those of the Gurobi Solver while requiring less time.Moreover, the results of the ALNS-EMDLS involve more solutions than the Gurobi Solver.Real-world instances typically contain a larger number of patients.With large instances, solving the problem to optimality is troublesome, and we found no solution within a limited time using the Gurobi Solver.We, therefore, considered the ALNS-EMDLS.Computation times shown in Table 4 indicate that the ALNS-EMDLS is effective.Other metrics show the Pareto fronts achieved by the proposed method are well distributed and have satisfactory diversity.
The numerical simulation processes of stochastic programming require a lot of time.Utilizing the Gurobi Solver to derive solutions for the stochastic model would require significantly more time compared to the deterministic model.The proposed method is adequate to find satisfactory solutions.The performances of the proposed method are very close to the best-found solutions obtained by the Gurobi Solver.Therefore, for large instances, we propose employing the stochastic version of ALNS-EMDLS to solve the problem under uncertainty.

Stochastic Bi-Objective Solutions
Table 5 shows the metrics and extreme objective function values of the Pareto front of the Stochastic ALNS-EMDLS (S_EMDLS).
On average, a considerable difference exists between the solution for minimum travel cost and the solution for minimum penalty cost in terms of both objectives.Hence, the quality of services that decision-makers decide to offer to patients has a significant impact on operating costs, underscoring the need for careful decision-making.The instances C1 and C2, which are clustered, have lower minimum travel costs than R and RC.We compared the results under different lengths of time windows.Data types 2, including C2, R2, and RC2, have longer time windows than type 1.Table 5 shows that most of the values of the minimum penalty of type 2 are less than those of type 1.We use Algorithm 2 to evaluate the solutions obtained by the Deterministic (original) ALNS-EMDLS (D*_EMDLS).The number of solutions and HV and S metrics are compared in Table 6.
It can be inferred from the results that the modeling of uncertain service times will increase computing time because the S_EMDLS needs to handle more information than only one scenario.Using the proposed stochastic framework, more scenarios contribute to the output.The S_EMDLS is therefore more realistic.In Table 6, the results of the ANOVA test show there is no significant difference in the means of HV and S between the solutions of D*_EMDLS and S_EMDLS since the p-values are greater than 0.005.This means that the stochastic method can yield solutions that are at least as good as the deterministic one.The proposed stochastic approach is designed to optimize the expected values of objective functions.Its solutions may not be global optimal solutions for the individual scenario, but they are robust, providing possible realizations despite uncertain service times.To identify the behavior of the proposed model and method and examine the influence of uncertain service times on objective values, several sensitivity analysis are performed on the main parameters.In this regard, a small test problem of 25 patients and three caregivers is selected.The parameters include the ending time of the loose time windows (ET) and variance of distribution (VD), which can indicate the range of uncertain service times.Each parameter has three levels, namely, small, medium, and large.To validate the robustness of solutions, we also compare the results of D*_EMDLS and S_EMDLS.We use the solutions obtained by the deterministic model to evaluate their sensibilities under uncertain service times.We normalize the Pareto set to [0,1] and calculate the distance between the origin and each point in the Pareto set.The solution with a minimum distance (D) is defined as a trade-off solution in our case.The objective values (travel cost (TC), penalty (P), normalized travel cost (NTC), and normalized penalty (NP)) of the trade-off solutions and minimum distances are summarized in Tables 7 and 8. l i means the latest time of the tight time window.If the departure time somehow exceeds the time window by a certain level, there will be a penalty cost.If the departure time lies within (e i , l i ], (l i , l i + c 1 ], (l i + c 1 , l i + c 2 ] or (l i + c 2 , ∞] (c 1 < c 2 ), the penalty cost will be 0, α 0 , α 1 , α 2 , α 3 , respectively (see Figure 2 and Formula 4).That is to say, we have loose time windows.If c 1 and c 2 are bigger, patients give more flexibility to the decision-makers and caregivers.Three levels of ET are compared when the VD is δ * 2 in Tables 7 and 8.The results of the two methods do not dominate each other when the ET levels are small and medium.But the result of D*_EMDLS is dominated by S_EMDLS when c 1 = 30 and c 2 = 45.If patients have more flexibility, the S_EMDLS is better to deal with uncertain service times.Regarding VD, sensitivity analysis has been performed by increasing the variance of normal distribution.We sample the service times from normal distributions.If the variance is bigger, which means patients are more likely to have larger or smaller service times that deviate from the average value, the results of S_EMDLS dominate D*_EMDLS (shown in the last rows of Tables 7 and 8). Figure 4 shows the Pareto points of D*_EMDLS and S_EMDLS with different VD and ET.The short lines in the box plots denote the median of TC or P. In (e), when VD = δ * 2, LD = (l i + 30, l i + 45), the median of S_EMDLS is obviously smaller than D*_EMDLS.However, the S_EMDLS is more realistic as it takes into account multiple scenarios, leading to increased computing time.If the variance of service times is not too large, the D_EMDLS can be chosen to save computing time.
The values of travel cost and penalty of trade-off solutions for different approaches are shown in Table 9 and Figure 5.The values of travel cost of the D*_EMDLS are less stable than those of S_EMDLS.When VD increases from δ/3 to δ * 2.5, the penalty of S_EMDLS increases by 28.24%, while the growth for the D*_EMDLS is 43.32%.The penalty of S_EMDLS changes more sluggishly than that of D*_EMDLS when VD changes.Therefore, S_EMDLS has better robustness than the D*_EMDLS.The objective values are more stable and perform better for the cases with large variations in service times by using the S_EMDLS, while D_EMDLS is more appropriate for cases with small variations in service times to save computing time.Decision-makers can select one of the solutions from the Pareto sets depending on their companies' operating profitability.They can select which method to use in order to attain better objective values based on varying conditions.In this section, we apply the results derived from Section 6.1 to a real-life case and provide some actionable management recommendations for choosing schedules.
Using the real-life data provided by "Soins et Santé", a home health care company located in Lyon, France, we implement our methods to create routes and schedules.A total of 27 patients receive home care services.The available data set includes patients' locations, service times, and available time sessions.The duration of each service varies, ranging from 5 to 46 minutes.The patients are visited in three time sessions: morning sessions from 7:30 to 12:00, afternoon sessions from 13:00 to 15:30, and evening sessions from 17:00 to 19:30.We create the time windows based on the patient's preferred time sessions for visits.The length of the time windows ranges from 30 to 150 minutes.We also create the required level of caregiver for each patient.
Figure 6 shows the results when VD = δ * 2, LD = (l i + 30, l i + 45), m = 4, and n = 10.In this figure, the Pareto front of the S_EMDLS dominates that of the D*_EMDLS algorithm.The S_EMDLS is capable of generating a more diverse range of solutions.In real-life cases, when implementing the solution obtained by the D_EMDLS, if the difference between the real service times and the planned service times is small, the actual objective function values are close to the original ones.However, if the real service times are significantly different from the planned service times, the solution obtained by the S_EMDLS can be chosen to achieve smaller objective values on average.If it is not possible to determine the extent to which the patients' service times differ from the planned service times, the solution obtained by S_EMDLS can be used by decision-makers to have higher stability.
By providing a more comprehensive view of the various solutions within the Pareto front, managers can gain a deeper understanding of the strengths and weaknesses of the solutions in the Pareto front.This knowledge can then be used to make decisions on selecting the most appropriate solution for their needs.We examine three points on the Pareto front, namely, the solution with the minimum travel cost (S 1 ), the trade-off solution (S 2 ), and the solution with the minimum penalty (S 3 ).These points are clearly marked with three star icons within Figure 6.We chose the trade-off solution with objective values closest to the origin.Table 10 presents the objective values and their respective indicators.The indicator PER e represents the percentage of patients who are visited by caregivers before the earliest time of their time windows.The indicator PER l denotes the percentage of patients whose services are completed by caregivers after the latest time of their time windows.The indicators WT min and WT max mean the minimum and maximum daily working hours, respectively.The values of f 1 for S 1 and S 2 are lower compared to S 3 , whereas f 2 and PER l for S 1 and S 2 exceed those of S 3 .This reveals that S 3 offers improved service punctuality but leads to higher travel costs.The values of WT min and WT max for S 3 are lower than those for S 1 and S 2 , signifying shorter waiting times for caregivers, contributing to a more equitable distribution of workload.The routes corresponding to these solutions are visualized in Figure 7.The locations of the patients (expressed by their longitude and latitude) and the planned routes are displayed on a map with a blank background to ensure their anonymity.The routes of S 3 are too complicated to be constructed manually, and our model and method proposed a useful tool to construct them.We offer actionable recommendations for choosing solutions as follows.

1.
If the majority of patients have a higher tolerance for exceeding their end time of the time windows, the manager may opt for a solution located on the left side of the Pareto front, which prioritizes minimizing travel costs.2.
Although the minimum and maximum numbers of patients that each caregiver needs to visit (m and n) are limited in the proposed model, the solution S 3 can be chosen to achieve a better workload balance.

3.
The routes of S 3 can be selected when the manager prefers better satisfaction for patients and caregivers.

Conclusions
We developed a bi-objective model for the HHC problem.This model aims to optimize the travel costs and satisfaction of both patients and caregivers, considering several practical constraints: the soft time windows, the matching of patient needs and caregiver skills, and the workload balance.Additionally, we consider uncertain service times to enhance the practicality and robustness.
To solve the bi-objective optimization problem, we developed an ALNS-EMDLS to obtain Pareto fronts.The Stochastic ALNS-EMDLS (S_EMDLS) was proposed to deal with the problem under the uncertain service times.First, we considered only one deterministic scenario: the average value of service times sampling from the normal distribution.The comparison between the Gurobi Solver and the ALNS-EMDLS revealed the latter's superior efficiency and competitively high-quality solutions.Second, we considered uncertain service times, assuming they follow the normal distribution.In the D*_EMDLS, the solutions obtained by the ALNS-EMDLS were evaluated under uncertain service times.The results showed that when the two parameters, i.e., the ending time of the loose time and variance of distribution, are bigger, the S_EMDLS outperforms since its solutions dominate those of D*_EMDLS.We evaluated the trade-off solutions of D*_EMDLS and S_EMDLS under varying variances.The outcomes confirmed S_EMDLS's robustness, effectively demonstrating its efficacy in managing uncertain service times.Finally, a real-life application was conducted to provide practical managerial suggestions for choosing routes and schedules.
As a future development, we will create new heuristics and use exact methods to compare the results of this study.More practical objectives and constraints motivated by the needs of HHC companies can be added to further studies.We will try to accelerate the computing time as the S_EMDLS takes over twenty times longer to implement than the D_EMDLS.At present, the study is conducted on a daily planning horizon; in the future, mid-term or long-term planning of HHC activities will be considered.For future applications, we aim to ensure seamless model integration into the current HHC management system, focusing on real-time data for more accurately and effectively predicting future uncertainties.

Algorithm A1 Worst destroy operator
Input: a solution x, the number of nodes to be removed q Output: the removal list D 1: while q > 0 do 2: L contains all nodes of the solution 3: for each node i in x do 4: remove node i from x 5: end for 7: sort L in descending order of cost i 8: random number y q in interval [0,1) for each node i in x do 5: sort L in descending order of cost i 8: random number y q in interval [0,1) find smallest cost(d, i) and its corresponding d and i 9: x new ← insert(d, i, x ′ ) find the biggest g and its corresponding d and i 15: end for 16: x new ← insert(d, i, x ′ ) 17: remove node d from D 18: end while

Figure 1 .
Figure 1.Daily activities in home health care.

Figure 2 .
Illustration of discrete penalty.(a) Discrete penalty for arrival time; (b) discrete penalty for departure time.

The data sets 25 -
C1-a, 25-C1-b, 25-C1-c, and 25-C1-d are used to test the effect caused by different percentages of patients with time windows.In 25-C1-a, all patients are available only at certain periods of the day.28%, 52%, and 68% patients are available for the full working time of caregivers for 25-C1-b, 25-C1-c, and 25-C1-d, respectively.Most patients do not have time windows in the 25-C1-d set.The Pareto fronts are shown in Figure 3.The solutions of 25-C1-d dominate 25-C1-a, 25-C1-b, and 25-C1-c.

Figure 3 .
Figure 3.Time windows comparison of 25 patients.

Figure 5 .
Objective values of trade-off solutions with changing of VD.(a) Travel cost; (b) penalty.
destroy operator Input: a solution x Output: the removal list D 1: d ← random_choose(x) 2: while q > 0 do

do 2 :each node d in D do 3 :do 4 : insert node d at position i 5 :
repair operator Input: the solution x ′ without the nodes in the removal list D Output: a new solution x new 1: while |D| > 0 for for each insert positioni ← 1 to |route| − 1 cost(d, i) ← f (x ′ ) − f (x ′ +i )

do 2 :node d in D do 3 : L ← ∅ 4 :do 5 : insert node d at position i 6 : 7 :for i ← 1 to regret_n do 12 :
the solution x ′ without the nodes in the removal list D, regret_n Output: a new solution x new 1: while |D| > 0 for for each insert positioni ← 1 to |route| − 1 cost(d, i) ← f (x ′ +i ) L ← L ∪ cost(d, i) g ← g + L[i] − L[0]

Table 1 .
Deterministic models and methods used in the latest research.
In other words, we have two loose time windows [e i − 30, l i + 30] and [e i − 15, l i + 15] and one tight time window [e i , l i ].Discrete penalties can prevent a few patients from suffering from large delays.When y ik = 1, the penalty is determined using (3) and (4); otherwise, p a ik and p a ijk = 0. Caregivers who arrive too early may face long waiting times, while arriving too late can lead to decreased patient satisfaction.A penalty cost is introduced in objective functions when caregivers arrive or leave outside the time window.Minimizing the penalty can enhance service punctuality and reduce waiting times, thereby improving the satisfaction of both patients and caregivers.Different penalty values are assigned to [0, e i − 30], (e i − 30, e i − 15], (e i − 15, e i ], (e i , l i ], (l i , l i + 15], (l i + 15, l i + 30], (l i + 30, ∞].ik are equal to 0. Figure

Table 2 .
Notation.p ik = w ik * y ik , ∀i ∈ N, ∀k ∈ K u ig , v ih binary decision variable: 1 if caregivers' arrival (departure) time at patient i is located at g th (h th ) interval, 0 otherwise r i binary decision variable: 1 if caregiver arrives after e i

Table 6 .
Deterministic model tested on uncertain environment.

Table 7 .
Results of different levels of ET and VD using D*_MDLS.

Table 8 .
Results of different levels of ET and VD using S_MDLS.

Table 9 .
Objective values of trade-off solutions with changing of VD.

Table 10 .
Indicators for three solutions.