A Learnheuristic Algorithm Based on Thompson Sampling for the Heterogeneous and Dynamic Team Orienteering Problem
Abstract
1. Introduction
2. Related Work
3. Describing the DTOP
3.1. An Illustrative Example of the DTOP
3.2. A Mathematical Model for the DTOP
4. Modeling Probabilities of Reward for Different Types of Nodes
- Parameter w describes the meteorological conditions: a value of 0 denotes good conditions, whereas a value of 1 denotes bad ones.
- Parameter c represents the congestion level in the urban network: a value of 1 denotes severe congestion at the node, whereas a value of 0 denotes no congestion.
- Parameter b indicates the remaining battery percentage of an electric vehicle: a value of 1 denotes a fully charged battery, whereas a value of 0 denotes an empty one.
- The intercept coefficient acts as a baseline term, representing the default scenario in which all variables are at their baseline levels.
- The weather coefficient modulates the influence of meteorological conditions (w).
- The congestion coefficient reflects the impact of congestion (c) at the specific node being visited.
- The battery coefficient adjusts the outcome based on the remaining battery percentage (b).
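
The parameters and coefficients listed above describe a linear predictor built from an intercept and three context variables. As a minimal sketch, assuming a logistic (sigmoid) link, which the text here does not state explicitly, the probability of collecting a reward at a node could be computed as follows. The names `beta_0`, `beta_w`, `beta_c`, and `beta_b`, as well as the numeric values, are illustrative assumptions and are not taken from the paper.

```python
import math


def reward_probability(w, c, b, beta_0, beta_w, beta_c, beta_b):
    """Probability of reward under an ASSUMED logistic link.

    w: weather (0 = good, 1 = bad)
    c: congestion (0 = none, 1 = severe)
    b: remaining battery (0 = empty, 1 = full)
    beta_*: hypothetical names for the intercept and the three coefficients.
    """
    # Linear predictor: intercept plus one weighted term per context variable.
    z = beta_0 + beta_w * w + beta_c * c + beta_b * b
    # Sigmoid maps the predictor to a value strictly between 0 and 1.
    return 1.0 / (1.0 + math.exp(-z))


# Illustrative coefficients only (not values from the paper).
coeffs = dict(beta_0=0.0, beta_w=-1.0, beta_c=-1.0, beta_b=1.0)

p_favourable = reward_probability(w=0, c=0, b=1, **coeffs)
p_adverse = reward_probability(w=1, c=1, b=0, **coeffs)
print(p_favourable > p_adverse)
```

With negative weather and congestion coefficients and a positive battery coefficient, favourable conditions (good weather, no congestion, full battery) give a higher reward probability than adverse ones.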
5. Solution Approaches for the Heterogeneous DTOP
5.1. Online Contextual Thompson Sampling
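
For intuition only, the sampling loop behind Thompson sampling can be sketched in its simplest form. The block below is a non-contextual Beta-Bernoulli version, which is a deliberate simplification and not the contextual algorithm this section is named after. The function name `thompson_sampling`, the uniform Beta(1, 1) priors, and the example probabilities are assumptions made for illustration.

```python
import random


def thompson_sampling(true_probs, rounds, seed=0):
    """Simplified (non-contextual) Beta-Bernoulli Thompson sampling.

    true_probs: hypothetical success probability of each arm (unknown to the learner).
    Returns per-arm pull counts, successes, and failures.
    """
    rng = random.Random(seed)
    n = len(true_probs)
    successes = [0] * n
    failures = [0] * n
    pulls = [0] * n
    for _ in range(rounds):
        # Draw one sample per arm from its Beta posterior (Beta(1, 1) prior assumed).
        samples = [rng.betavariate(successes[i] + 1, failures[i] + 1) for i in range(n)]
        # Play the arm whose sampled success probability is largest.
        arm = max(range(n), key=samples.__getitem__)
        pulls[arm] += 1
        # Observe a Bernoulli reward and update that arm's posterior counts.
        if rng.random() < true_probs[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return pulls, successes, failures


pulls, successes, failures = thompson_sampling([0.2, 0.5, 0.8], rounds=2000, seed=42)
print(pulls)
```

Over many rounds, most pulls concentrate on the arm with the highest success probability, while the other arms are still sampled occasionally early on.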
5.2. A Static Constructive Heuristic
Algorithm 1 Constructive Static Solution (, )
5.3. A Learnheuristic Constructive Heuristic
Algorithm 2 Constructive Dynamic Solution (, )
6. Numerical Experiments
7. Analysis of Results
8. Conclusions and Future Work
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Node Type | Low | | | Medium | | | High | | |
---|---|---|---|---|---|---|---|---|---|
N1 | 0 | −1 | 1 | 0 | −1.2 | 1.2 | 0 | −2 | 1 |
N2 | −0.2 | −0.8 | 1.1 | −0.4 | 1 | 1.4 | −0.6 | −1.5 | 2 |
N3 | −0.4 | −0.6 | 1.2 | −0.6 | −0.8 | 1.6 | −1.2 | −1 | 3 |
N4 | −0.6 | −0.4 | 1.3 | −0.8 | −0.6 | 1.8 | −1.8 | −0.8 | 4 |
N5 | −1 | −1.5 | 0 | −1.5 | −2 | 0 | −2 | −3 | 0 |
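Instance | Static Low (1) Time | Static Low (1) OF | Static Low (1) Dyn. OF | Static Medium (2) Time | Static Medium (2) OF | Static Medium (2) Dyn. OF | Static High (3) Time | Static High (3) OF | Static High (3) Dyn. OF | Learnheuristic Low (4) Time | Learnheuristic Low (4) OF | Learnheuristic Low (4) Dyn. OF | Learnheuristic Medium (5) Time | Learnheuristic Medium (5) OF | Learnheuristic Medium (5) Dyn. OF | Learnheuristic High (6) Time | Learnheuristic High (6) OF | Learnheuristic High (6) Dyn. OF | Gap (%) (1)–(4) | Gap (%) (2)–(5) | Gap (%) (3)–(6) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|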
p4.2 | 12.07 | 901.90 | 431.54 | 12.02 | 876.17 | 428.36 | 12.03 | 901.90 | 437.75 | 23.06 | 804.40 | 492.11 | 23.00 | 800.48 | 512.88 | 22.77 | 796.39 | 551.54 | 14.04% | 19.73% | 25.99% |
p4.3 | 10.71 | 804.40 | 384.37 | 10.64 | 804.40 | 381.14 | 10.66 | 804.40 | 388.59 | 19.71 | 728.19 | 449.91 | 19.68 | 726.84 | 464.95 | 19.57 | 724.11 | 503.12 | 17.05% | 21.99% | 29.47% |
p4.4 | 9.30 | 677.13 | 323.75 | 9.31 | 652.40 | 321.59 | 9.24 | 652.40 | 329.19 | 16.15 | 632.55 | 389.08 | 16.12 | 632.55 | 404.64 | 16.07 | 630.45 | 436.25 | 20.18% | 25.83% | 32.52% |
p5.2 | 4.98 | 995.00 | 460.21 | 4.99 | 995.00 | 456.15 | 4.98 | 995.00 | 470.05 | 9.59 | 808.99 | 491.18 | 9.58 | 805.59 | 509.95 | 9.59 | 804.18 | 554.34 | 6.73% | 11.79% | 17.93% |
p5.3 | 4.53 | 843.56 | 391.34 | 4.51 | 843.56 | 387.32 | 4.49 | 843.56 | 398.28 | 8.50 | 703.94 | 430.44 | 8.48 | 700.71 | 445.13 | 8.47 | 698.28 | 480.76 | 9.99% | 14.93% | 20.71% |
p5.4 | 4.17 | 718.94 | 331.94 | 4.16 | 718.94 | 328.33 | 4.17 | 718.94 | 337.54 | 7.49 | 607.85 | 372.75 | 7.49 | 607.20 | 386.45 | 7.49 | 602.68 | 416.14 | 12.29% | 17.70% | 23.28% |
p6.2 | 4.84 | 868.60 | 411.58 | 4.83 | 868.60 | 406.55 | 4.84 | 868.60 | 411.90 | 9.54 | 703.31 | 424.62 | 9.53 | 704.34 | 441.83 | 9.53 | 702.32 | 476.01 | 3.17% | 8.68% | 15.56% |
p6.3 | 4.74 | 766.58 | 363.46 | 4.71 | 766.58 | 358.49 | 4.69 | 766.58 | 364.34 | 9.08 | 670.13 | 391.32 | 9.06 | 667.54 | 403.94 | 9.06 | 662.22 | 426.72 | 7.67% | 12.68% | 17.12% |
p6.4 | 4.37 | 664.00 | 315.66 | 4.44 | 664.00 | 310.76 | 4.42 | 664.00 | 317.22 | 8.51 | 633.14 | 354.50 | 8.52 | 631.78 | 365.79 | 8.45 | 622.72 | 384.12 | 12.31% | 17.71% | 21.09% |
p7.2 | 82.17 | 700.16 | 332.47 | 84.15 | 700.16 | 329.20 | 84.10 | 700.16 | 335.54 | 157.27 | 581.28 | 360.32 | 157.12 | 579.53 | 373.86 | 155.19 | 570.09 | 401.66 | 8.37% | 13.57% | 19.70% |
p7.3 | 77.82 | 650.45 | 307.38 | 77.33 | 650.45 | 304.01 | 77.41 | 650.45 | 310.06 | 136.70 | 544.72 | 335.94 | 136.43 | 543.56 | 349.74 | 135.88 | 316.64 | 469.32 | 9.29% | 15.04% | 51.37% |
p7.4 | 67.27 | 547.23 | 259.14 | 67.15 | 547.21 | 256.89 | 67.05 | 547.21 | 262.05 | 108.88 | 472.65 | 293.58 | 109.25 | 472.42 | 306.31 | 107.99 | 467.08 | 332.55 | 13.29% | 19.24% | 26.90% |
Average | 23.92 | 761.49 | 359.40 | 24.02 | 757.29 | 355.73 | 24.01 | 759.43 | 363.54 | 42.87 | 657.60 | 398.81 | 42.86 | 656.04 | 413.79 | 42.50 | 633.10 | 452.71 | 11.20% | 16.57% | 25.14% |
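
The gap columns in the table above are not defined in the surviving text. As a sketch, assuming the gap is the relative difference between the learnheuristic and static Dyn. OF values, taken with respect to the static value, the reported percentages appear to be reproducible as shown below. The function name `gap_percent` is hypothetical.

```python
def gap_percent(static_value, learn_value):
    """Relative gap (in %) of the learnheuristic value over the static one.

    ASSUMPTION: gap = 100 * (learnheuristic - static) / static.
    """
    return 100.0 * (learn_value - static_value) / static_value


# Instance p4.2, low level: static Dyn. OF = 431.54, learnheuristic Dyn. OF = 492.11.
gap_p42_low = gap_percent(431.54, 492.11)
print(f"{gap_p42_low:.2f}%")  # 14.04%, matching the (1)–(4) column for p4.2

# Instance p5.2, low level: static Dyn. OF = 460.21, learnheuristic Dyn. OF = 491.18.
gap_p52_low = gap_percent(460.21, 491.18)
print(f"{gap_p52_low:.2f}%")  # 6.73%, matching the (1)–(4) column for p5.2
```

A positive gap under this reading means the learnheuristic achieved a higher Dyn. OF value than the static heuristic for the same instance and level.
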
Instance | Static Low (1) Nodes | Static Low (1) Fails | Static Medium (2) Nodes | Static Medium (2) Fails | Static High (3) Nodes | Static High (3) Fails | Learnheuristic Low (4) Nodes | Learnheuristic Low (4) Fails | Learnheuristic Medium (5) Nodes | Learnheuristic Medium (5) Fails | Learnheuristic High (6) Nodes | Learnheuristic High (6) Fails | Gap (%) (1)–(4) | Gap (%) (2)–(5) | Gap (%) (3)–(6) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
p4.2 | 60.62 | 33.23 | 60.62 | 33.52 | 60.62 | 32.93 | 59.62 | 27.59 | 59.19 | 26.36 | 58.71 | 23.93 | −16.97% | −21.37% | −27.33% |
p4.3 | 56.77 | 31.65 | 56.77 | 31.94 | 56.77 | 31.47 | 56.68 | 27.19 | 56.49 | 26.36 | 56.14 | 24.11 | −14.10% | −17.48% | −23.39% |
p4.4 | 51.33 | 29.31 | 51.33 | 29.52 | 51.33 | 29.00 | 52.76 | 26.77 | 52.71 | 25.94 | 52.46 | 24.07 | −8.66% | −12.12% | −16.99% |
p5.2 | 38.50 | 21.61 | 38.50 | 21.78 | 38.50 | 21.29 | 36.60 | 17.05 | 36.51 | 16.23 | 36.47 | 14.50 | −21.10% | −25.46% | −31.93% |
p5.3 | 36.48 | 20.91 | 36.48 | 21.07 | 36.48 | 20.67 | 36.07 | 17.70 | 35.99 | 16.96 | 35.92 | 15.41 | −15.34% | −19.49% | −25.45% |
p5.4 | 34.65 | 20.39 | 34.65 | 20.53 | 34.65 | 20.17 | 35.57 | 18.52 | 35.57 | 17.87 | 35.47 | 16.38 | −9.20% | −12.98% | −18.79% |
p6.2 | 38.50 | 21.18 | 38.50 | 21.40 | 38.50 | 21.20 | 37.89 | 17.63 | 37.85 | 16.88 | 37.78 | 15.25 | −16.75% | −21.14% | −28.05% |
p6.3 | 39.17 | 22.00 | 39.17 | 22.26 | 39.17 | 22.03 | 40.49 | 20.63 | 40.38 | 19.92 | 40.25 | 18.67 | −6.24% | −10.53% | −15.23% |
p6.4 | 38.13 | 21.81 | 38.13 | 22.06 | 38.13 | 21.80 | 42.89 | 23.51 | 42.87 | 22.82 | 42.59 | 21.52 | 7.83% | 3.43% | −1.28% |
p7.2 | 41.50 | 22.95 | 41.50 | 23.13 | 41.50 | 22.72 | 39.77 | 18.43 | 39.60 | 17.67 | 39.12 | 15.78 | −19.70% | −23.60% | −30.55% |
p7.3 | 38.54 | 21.81 | 38.54 | 21.97 | 38.54 | 21.61 | 38.64 | 19.19 | 38.52 | 18.45 | 38.23 | 16.65 | −12.03% | −16.03% | −22.93% |
p7.4 | 35.63 | 20.70 | 35.63 | 20.84 | 35.63 | 20.49 | 37.65 | 19.87 | 37.60 | 19.21 | 37.19 | 17.52 | −3.99% | −7.83% | −14.46% |
Average | 42.49 | 23.96 | 42.49 | 24.17 | 42.49 | 23.78 | 42.89 | 21.17 | 42.77 | 20.39 | 42.53 | 18.65 | −11.35% | −15.38% | −21.36% |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Uguina, A.R.; Gomez, J.F.; Panadero, J.; Martínez-Gavara, A.; Juan, A.A. A Learnheuristic Algorithm Based on Thompson Sampling for the Heterogeneous and Dynamic Team Orienteering Problem. Mathematics 2024, 12, 1758. https://doi.org/10.3390/math12111758