Photonics
  • Article
  • Open Access

22 June 2025

Multi-Link Fragmentation-Aware Deep Reinforcement Learning RSA Algorithm in Elastic Optical Network

1 School of Telecommunications Engineering, Xidian University, Xi’an 710071, China
2 Hangzhou Institute of Technology, Xidian University, Hangzhou 311231, China
* Author to whom correspondence should be addressed.
This article belongs to the Special Issue Advancements and Future Perspectives in All-Optical Detection and Reliability Improvement Technologies

Abstract

Deep reinforcement learning has been extensively applied to resource allocation in elastic optical networks. However, many studies focus on link-level state analysis and rarely discuss the influence between links, which may affect the performance of allocation algorithms. In this paper, we propose a multi-link fragmentation deep reinforcement learning-based routing and spectrum allocation algorithm (MFDRL-RSA). We number the links using a breadth-first numbering algorithm. Based on the numbering results, high-frequency links are selected to construct a network state matrix that reflects the resource distribution. From the state matrix, we calculate a multi-link fragmentation degree that quantifies resource fragmentation within a representative subset of the network. The MFDRL-RSA algorithm incorporates this fragmentation degree into the reward function, which improves the accuracy of the agent’s routing decisions and, in turn, the overall allocation performance. Simulation results show that MFDRL-RSA achieves lower blocking rates compared to the reference algorithms, with reductions of 16.34%, 13.01%, and 7.42% in the NSFNET network and 19.33%, 15.17%, and 9.95% in the Cost-239 network. It also improves spectrum utilization by 12.28%, 9.83%, and 6.32% in NSFNET and by 13.92%, 11.55%, and 8.26% in Cost-239.

1. Introduction

Elastic optical networks (EONs) have been extensively adopted in optical communication networks due to their flexibility in resource allocation [1]. As the utilization of EONs continues to increase, optimizing network performance has become a primary concern. Routing and spectrum allocation (RSA) is critical to enhancing network efficiency. Because of allocation constraints, the RSA process can produce spectrum fragmentation, which leaves spectrum underutilized and causes resource wastage and service blocking [2,3,4]. Consequently, addressing spectrum fragmentation has become a key area of research in RSA optimization. In the last decade, many heuristic algorithms have been developed to address the RSA problem. In recent years, the application of deep reinforcement learning (DRL) to network optimization problems has grown significantly, driven primarily by advancements in artificial intelligence technologies [5,6,7,8]. DRL learns optimal strategies from historical and real-time network data without predefined rules. Compared to heuristic-based algorithms, DRL has demonstrated superior adaptability, flexibility, and efficiency, which can effectively address the evolving demands of networks and complex resource allocation challenges [9,10]. As a result, DRL has become a powerful tool for solving the RSA problem in EONs, with great potential for future research and practical use [11]. However, current DRL-based allocation algorithms face issues such as reward convergence to fixed values and poorly designed rewards that overlook interdependence among network links, leading to insufficient guidance for the agent.
To optimize the reward function of DRL, we propose the MFDRL-RSA algorithm. This algorithm introduces a multi-link fragmentation degree into the reward function, aiming to help the agent focus on the heavily utilized parts of the network when making routing allocation decisions, rather than just the current candidate path. The innovations of the MFDRL-RSA algorithm are as follows:
  • Unlike other studies that evaluate spectrum fragmentation only on the links within the current candidate path, the MFDRL-RSA algorithm focuses on multiple links along different paths. Since the spectrum fragmentation of one link can influence that of others, focusing only on the candidate path may ignore the impact of allocation decisions on subsequent requests. In contrast, the multi-link fragmentation degree offers a more holistic and effective assessment of global allocation.
  • We propose a breadth-first numbering (BFN) algorithm to ensure that the network state matrix constructed based on link numbering can represent the connectivity of links. This is crucial for correlating the multi-link fragmentation degree in the MFDRL-RSA algorithm with the distribution of network resources.
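To illustrate the idea behind link numbering, the following is a minimal sketch of one possible breadth-first numbering, not the BFN algorithm as specified in this paper. It assumes an undirected topology given as an adjacency dictionary, an arbitrarily chosen start node, and neighbours visited in sorted order; the function name `breadth_first_link_numbering` is hypothetical. Links are numbered in the order in which a breadth-first traversal of the nodes first encounters them, so links sharing a node tend to receive nearby numbers.

```python
from collections import deque


def breadth_first_link_numbering(adjacency, start):
    """Number undirected links in the order a BFS over nodes first meets them.

    adjacency: dict mapping node -> iterable of neighbouring nodes (assumed input format).
    start: node at which the traversal begins (assumed choice).
    Returns a dict mapping frozenset({u, v}) -> link index (0-based).
    """
    numbering = {}
    visited = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in sorted(adjacency[node]):
            link = frozenset((node, neighbour))
            if link not in numbering:
                # First time this link is seen: give it the next free number.
                numbering[link] = len(numbering)
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append(neighbour)
    return numbering


# Toy 4-node topology (hypothetical, not NSFNET or Cost-239).
toy_adjacency = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
toy_numbering = breadth_first_link_numbering(toy_adjacency, start=0)
```

In this toy case the two links incident to node 0 receive numbers 0 and 1, and the links reached from node 1 follow, so rows of a state matrix ordered by these numbers keep neighbouring links close together.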
Simulation results show that in the NSFNET network, MFDRL-RSA reduces the blocking rate by 16.34%, 13.01%, and 7.42% and improves spectrum utilization by 12.28%, 9.83%, and 6.32% compared to KSP-FF, DeepRMSA, and DeepSF-PCE, respectively. In the Cost-239 network, it achieves greater improvements, with blocking rate reductions of 19.33%, 15.17%, and 9.95% and spectrum utilization improvements of 13.92%, 11.55%, and 8.26%, respectively. These findings underscore the efficacy of the MFDRL-RSA algorithm in enhancing the efficiency of EONs. Building upon the performance improvements already achieved by the basic DRL algorithm, MFDRL-RSA further reduces network blocking, positioning it as a promising solution for future optimization challenges.

4. Simulation and Analysis

4.1. Simulation Setup

In this paper, the 14-node NSFNET [31] and 11-node Cost-239 [32] networks are selected for simulation (see Figure 2a,b). Each optical fiber link is equipped with 320 FSs, and each FS is 12.5 GHz wide. Requests arrive according to a Poisson process with average arrival rate λ, and service durations follow an exponential distribution with mean 1/μ. Thus, the traffic load is λ/μ Erlang. The bandwidth demand for each request is uniformly distributed between 25 and 100 Gb/s. The neural network in the AC network consists of five hidden layers, each with 128 neurons. The ReLU activation function is used for hidden layers, and the Adam optimizer is employed for training. The batch size N is set to 200. The number of candidate paths K is set to 5. The discount factor γ and the entropy regularization coefficient α are set to 0.95 and 0.01, respectively. The exploration rate ε starts at 1 and is gradually decayed by 10⁻⁵, down to a minimum value ε_min of 0.05. A total of 500 training episodes are performed, each consisting of 10,000 connection requests.
Figure 2. Network topology in simulations. (a) NSFNET topology, (b) Cost-239 topology.
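The traffic model in the simulation setup can be sketched as follows. This is an illustrative generator rather than the simulator used in this paper: it assumes that source and destination nodes are drawn uniformly at random, that the bandwidth demand is a continuous uniform value, and that the names `Request`, `generate_requests`, and `offered_load` are hypothetical. The rates in the example (λ = 25, μ = 0.1) are chosen only so that the offered load equals 250 Erlang.

```python
import random
from dataclasses import dataclass


@dataclass
class Request:
    arrival_time: float    # time at which the request arrives
    holding_time: float    # service duration of the connection
    source: int            # source node index
    destination: int       # destination node index
    bandwidth_gbps: float  # requested bandwidth in Gb/s


def offered_load(arrival_rate, service_rate):
    """Traffic load in Erlang: lambda / mu."""
    return arrival_rate / service_rate


def generate_requests(num_requests, arrival_rate, service_rate, num_nodes, seed=0):
    """Poisson arrivals with rate lambda and exponential holding times with mean 1/mu."""
    rng = random.Random(seed)
    requests = []
    now = 0.0
    for _ in range(num_requests):
        # Inter-arrival gaps of a Poisson process are exponential with rate lambda.
        now += rng.expovariate(arrival_rate)
        holding = rng.expovariate(service_rate)      # mean 1/mu
        src, dst = rng.sample(range(num_nodes), 2)   # two distinct nodes (assumption)
        bandwidth = rng.uniform(25.0, 100.0)         # 25 to 100 Gb/s
        requests.append(Request(now, holding, src, dst, bandwidth))
    return requests


# One episode of 10,000 requests on a 14-node topology at 250 Erlang.
episode = generate_requests(10_000, arrival_rate=25.0, service_rate=0.1, num_nodes=14)
```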

4.2. Simulation Result

In this section, the simulation results are presented to evaluate the performance of MFDRL-RSA. Two metrics are used for the assessment: blocking rate and spectrum resource utilization. The blocking rate is defined as the ratio of the number of blocked requests to the total number of requests processed, while the spectrum resource utilization is defined as the ratio of occupied FSs to the total available FSs across all links in the network. These two metrics are commonly used to evaluate the effectiveness of RSA algorithms. A lower blocking rate indicates that the algorithm can successfully accommodate more requests, and a higher spectrum utilization implies that the algorithm makes more efficient use of available FSs, reducing spectrum waste caused by fragmentation.
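The two metrics defined above can be computed as in the following minimal sketch. The representation of the link state as one list of 0/1 values per link (1 for an occupied FS, 0 for a free FS) is an assumption made for illustration, as are the function names.

```python
def blocking_rate(num_blocked, num_processed):
    """Ratio of blocked requests to all requests processed."""
    if num_processed == 0:
        return 0.0
    return num_blocked / num_processed


def spectrum_utilization(link_states):
    """Ratio of occupied FSs to total FSs across all links.

    link_states: list of per-link lists, where 1 marks an occupied FS
    and 0 marks a free FS (assumed encoding).
    """
    total = sum(len(link) for link in link_states)
    if total == 0:
        return 0.0
    occupied = sum(sum(link) for link in link_states)
    return occupied / total


# Toy example: 2 links with 4 FSs each, 3 FSs occupied in total.
example_utilization = spectrum_utilization([[1, 1, 0, 0], [0, 0, 0, 1]])
example_blocking = blocking_rate(num_blocked=25, num_processed=1000)
```

For the toy values, 3 of 8 FSs are occupied, giving a utilization of 0.375, and 25 blocked requests out of 1000 give a blocking rate of 0.025.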
The substantial volume of link state data involved in the fragmentation calculation can prolong model training. To address this, we analyze the computational complexity of the fragmentation degree calculation defined in Equation (4), which measures the ratio of adjacent free FSs to occupied FSs across a set of links. Specifically, for each occupied FS, we traverse up to four adjacent FS positions to determine whether they are free, which results in a per-slot complexity of O(1). Let n denote the total number of links in the network and f denote the number of FSs per link. Then, the total number of slot positions is O(nf), and in the worst case, where most FSs are occupied, the overall complexity of calculating the fragmentation degree over all links is O(nf), which simplifies to O(n) when f is treated as a constant. To reduce the computational burden, we propose using high-frequency links, that is, links that appear more often in the k shortest paths between node pairs. For the NSFNET topology, the 11 links (out of 22 total) whose frequency of appearance exceeds the mean are designated as high-frequency links. Let n′ denote the number of high-frequency links. The complexity of the fragmentation calculation under this approach becomes O(n′), and, since n′ = n/2, it is approximately O(n/2), which roughly halves the computation cost.
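The traversal behind this complexity estimate can be sketched as follows. This is only one illustrative reading of the verbal description above and does not reproduce Equation (4): it assumes a state matrix with one row per link and one column per FS (1 for occupied, 0 for free), takes the four adjacent positions to be the slots directly above, below, left, and right, and counts a free slot once for each occupied slot it touches. The function name is hypothetical.

```python
def adjacent_free_ratio(state):
    """Ratio of free neighbouring FS positions to occupied FSs in a state matrix.

    state: list of rows (links), each a list of 0/1 values (assumed encoding:
    1 = occupied FS, 0 = free FS).
    Each occupied slot inspects at most four neighbours, so the cost is O(1)
    per slot and O(n * f) for n links with f slots each.
    """
    rows = len(state)
    cols = len(state[0]) if rows else 0
    occupied = 0
    free_neighbours = 0
    for r in range(rows):
        for c in range(cols):
            if state[r][c] != 1:
                continue
            occupied += 1
            # Up, down, left, right (assumed meaning of "adjacent").
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and state[nr][nc] == 0:
                    free_neighbours += 1
    return free_neighbours / occupied if occupied else 0.0


# Two toy matrices with the same number of occupied slots.
contiguous = adjacent_free_ratio([[1, 1, 0, 0], [0, 0, 0, 0]])
scattered = adjacent_free_ratio([[1, 0, 1, 0], [0, 0, 0, 0]])
```

Restricting the rows of `state` to the high-frequency links reduces the number of slots visited in proportion to the number of rows removed.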
The training results for the link statistic cases described above are presented in Figure 3. KSP-FF (k shortest path and first-fit) is a widely used heuristic algorithm for RSA in EONs [33]. Compared to it, the MFDRL-RSA algorithm achieves a lower blocking rate in both link statistic cases. MF-RSA is a heuristic algorithm developed based on the fragmentation degree proposed in this paper. In MF-RSA, for each request, the state matrix is constructed based on all links. The fragmentation degree of each feasible allocation is then calculated using the proposed equation, and the option with the highest value is selected for spectrum allocation. MFDRL-RSA-WA (short for MFDRL-RSA with all links) and MFDRL-RSA-WHF (short for MFDRL-RSA with high-frequency links) are two variants of the MFDRL-RSA algorithm that differ in how they calculate the multi-link fragmentation degree. MFDRL-RSA-WA computes fragmentation based on all links, and MFDRL-RSA-WHF focuses only on high-frequency links.
Figure 3. Blocking rate of KSP-FF, MF-RSA, and MFDRL-RSA algorithms with different link statistic cases in NSFNET.
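For reference, the first-fit step of KSP-FF can be sketched as below. The sketch assumes the usual RSA constraints of spectrum continuity (the same FS indices on every link of the path) and contiguity (adjacent FS indices), and an assumed data layout in which `link_states` maps a link identifier to a list of 0/1 values. The computation of the k shortest paths and the conversion of bandwidth to a number of FSs are outside its scope.

```python
def first_fit(path_links, link_states, num_slots_needed):
    """Return the lowest start index of a block of contiguous FSs that is
    free on every link of the path, or None if no such block exists.

    path_links: list of link identifiers along the candidate path.
    link_states: dict mapping link identifier -> list of 0/1 values
                 (1 = occupied FS, 0 = free FS; assumed encoding).
    """
    if not path_links or num_slots_needed <= 0:
        return None
    total_slots = len(link_states[path_links[0]])
    run = 0  # length of the current run of slots free on all links
    for idx in range(total_slots):
        if all(link_states[link][idx] == 0 for link in path_links):
            run += 1
            if run == num_slots_needed:
                return idx - num_slots_needed + 1
        else:
            run = 0
    return None


def allocate(path_links, link_states, start, num_slots_needed):
    """Mark the chosen block as occupied on every link of the path."""
    for link in path_links:
        for idx in range(start, start + num_slots_needed):
            link_states[link][idx] = 1


# Toy example with two links of six FSs each.
toy_states = {0: [1, 0, 0, 0, 0, 0], 1: [0, 0, 1, 0, 0, 0]}
start_index = first_fit([0, 1], toy_states, 2)
```

In the toy example, slots 1 and 2 are not both free on link 1, so the first block of two FSs that is free on both links starts at index 3.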
The results demonstrate that the algorithm incorporating fragmentation degree into the reward function of DRL achieves a lower blocking rate than the MF-RSA. Specifically, the MFDRL-RSA-WA algorithm using all links achieves an average blocking rate reduction of 5.15%, while the MFDRL-RSA-WHF algorithm using high-frequency links achieves an average reduction of 9.14%. This is because the fragmentation degree we designed is mainly calculated based on the resource status of a subset of links in the network. During the processing of each request, if the links involved are not within the scope of the state matrix, the allocation decision will not directly affect the value of the fragmentation degree. As a result, the metric fails to provide meaningful guidance. However, heuristic metrics are usually expected to clearly indicate the quality of different allocation options during each request, thus guiding decision-making. The fragmentation degree cannot fulfill this need, so it is not suitable for heuristic algorithms.
With its adaptive learning capability, DRL can learn the latent information embedded in the fragmentation degree during training, thereby developing more effective routing strategies to reduce network fragmentation. As a result, the MFDRL-RSA algorithm successfully leverages the characteristics and advantages of DRL. Meanwhile, the blocking rate is demonstrably lower for the high-frequency-link case than for the all-links case, with an average reduction of approximately 5.31%. In the all-links case, low-occupancy links inflate the reward assigned to a decision, which distorts the multi-link fragmentation degree and thereby impairs the agent’s decision optimization. In contrast, the multi-link fragmentation degree in the high-frequency-link case is more accurate, because the selected high-frequency links are key to determining whether a request is accepted or rejected and therefore provide the agent with more meaningful guidance for decision optimization. The fragmentation degree calculations in the following simulations are based on high-frequency links, which both accelerates the training process and enhances performance through a better-informed reward signal.
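The selection of high-frequency links used in these simulations, namely links whose frequency of appearance in the k shortest paths exceeds the mean, can be sketched as follows. The sketch assumes undirected links, takes the candidate paths as already computed node sequences, and averages over all links of the topology, including those that appear in no path; these choices and the function names are assumptions for illustration.

```python
from collections import Counter


def link_frequencies(candidate_paths):
    """Count how often each undirected link appears in the candidate paths."""
    counts = Counter()
    for path in candidate_paths:
        for u, v in zip(path, path[1:]):
            counts[frozenset((u, v))] += 1
    return counts


def high_frequency_links(candidate_paths, all_links):
    """Return the links whose appearance count exceeds the mean count.

    candidate_paths: list of node sequences (k shortest paths of all node pairs).
    all_links: iterable of (u, v) pairs describing the topology.
    """
    counts = link_frequencies(candidate_paths)
    freqs = {frozenset(link): counts.get(frozenset(link), 0) for link in all_links}
    if not freqs:
        return set()
    mean = sum(freqs.values()) / len(freqs)
    return {link for link, freq in freqs.items() if freq > mean}


# Toy topology and hand-written candidate paths (hypothetical values).
toy_links = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
toy_paths = [["A", "C"], ["A", "C", "D"], ["B", "C", "D"], ["A", "B"], ["B", "C"]]
selected = high_frequency_links(toy_paths, toy_links)
```

Here links A-C, C-D, and B-C each appear twice and A-B once, so the mean is 1.75 and the three links that appear twice are selected.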
Figure 4 and Figure 5 show the blocking rate variation of the different algorithms in the NSFNET and Cost-239 network environments when the traffic load is 250 Erlang. The blocking rate is calculated for every 10,000 request arrivals, and the entire training process consists of 500 episodes. It can be observed that in both network environments, the MFDRL-RSA, DeepSF-PCE, and DeepRMSA algorithms progressively converge to a stable performance as training advances. DeepRMSA is a widely referenced DRL-based allocation algorithm characterized by a simple reward function where a successful request receives a reward of 1, while a failed one receives −1 [14]. Compared to heuristic algorithms, DeepRMSA has been proven to more effectively reduce network blocking and enhance transmission performance, making it a fundamental benchmark in DRL resource allocation research. DeepSF-PCE is a fragmentation-aware DRL-based RMSA algorithm that enhances the reward function by incorporating Shannon entropy-based fragmentation metrics [17]. Compared to MFDRL-RSA, which constructs the network state matrix and computes the multi-link fragmentation degree based on representative links, DeepSF-PCE focuses only on the links within the current candidate path. This comparison highlights the advantage of the multi-link fragmentation degree, which provides a more holistic assessment of global allocation. Specifically, in the NSFNET network environment, the MFDRL-RSA algorithm reduces the blocking rate by 14.93% compared to the KSP-FF algorithm, by 12.11% compared to the DeepRMSA algorithm, and by 6.92% compared to the DeepSF-PCE algorithm. In the Cost-239 network environment, the MFDRL-RSA algorithm’s blocking rate is reduced by 18.11% compared to the KSP-FF algorithm, by 13.45% compared to the DeepRMSA algorithm, and by 9.27% compared to the DeepSF-PCE algorithm.
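The difference between a plain success/failure reward and a fragmentation-aware reward can be illustrated with the sketch below. Only the ±1 reward of DeepRMSA is taken from the description above. The shaped variant is a generic, hypothetical form (a base reward plus a weighted term supplied by the caller) and is not the reward function of MFDRL-RSA or DeepSF-PCE; the parameter `weight`, its default value, and the choice to leave blocked requests at the base reward are assumptions.

```python
def deeprmsa_reward(accepted):
    """Reward of 1 for a successfully served request and -1 for a blocked one."""
    return 1.0 if accepted else -1.0


def shaped_reward(accepted, fragmentation_term, weight=0.5):
    """Hypothetical shaped reward: base reward plus a weighted fragmentation term.

    fragmentation_term: value derived from a fragmentation metric after the
        allocation; its sign convention is left to the caller (assumption).
    weight: hypothetical scaling factor, not a value from this paper.
    """
    base = deeprmsa_reward(accepted)
    if not accepted:
        # Assumption: a blocked request changes no spectrum, so no extra term.
        return base
    return base + weight * fragmentation_term


plain = deeprmsa_reward(True)
shaped = shaped_reward(True, fragmentation_term=0.4)
```

With the plain reward, every accepted request looks identical to the agent, whereas the shaped form lets two accepted allocations on different candidate paths receive different rewards.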
Furthermore, the MFDRL-RSA algorithm has demonstrated remarkable adaptability across network environments, significantly enhancing transmission performance in NSFNET and Cost-239. Its superior capability highlights the potential to boost operational efficiency and service quality in EONs.
Figure 4. Blocking rate of KSP-FF, DeepRMSA, DeepSF-PCE, and MFDRL-RSA in NSFNET.
Figure 5. Blocking rate of KSP-FF, DeepRMSA, DeepSF-PCE, and MFDRL-RSA in Cost-239.
Figure 6 and Figure 7, respectively, show the blocking rate and spectrum resource utilization results of different algorithms under different load conditions in the NSFNET network. The figures illustrate that as the network load increases, the blocking rate of all four algorithms rises. An increase in network load results in a higher utilization of spectrum resources, rendering it increasingly challenging to identify available free spectrum for subsequent service connections. In the NSFNET network, the blocking rate of the MFDRL-RSA algorithm is lower than that of the KSP-FF, DeepRMSA, and DeepSF-PCE algorithms. The blocking rate of the MFDRL-RSA algorithm is, on average, 16.34% lower than that of the KSP-FF algorithm, 13.01% lower than that of the DeepRMSA algorithm, and 7.42% lower than that of the DeepSF-PCE algorithm. The spectrum resource utilization by the MFDRL-RSA algorithm is higher than that of the KSP-FF, DeepRMSA, and DeepSF-PCE algorithms. Specifically, the resource utilization of the MFDRL-RSA algorithm is, on average, 12.28% higher than that of the KSP-FF algorithm, 9.83% higher than that of the DeepRMSA algorithm, and 6.32% higher than that of the DeepSF-PCE algorithm. The findings demonstrate the efficacy of the MFDRL-RSA algorithm in varying load conditions, outperforming reference approaches in reducing network blocking and improving resource utilization.
Figure 6. Blocking rate of KSP-FF, DeepRMSA, DeepSF-PCE, and MFDRL-RSA under different traffic loads in NSFNET.
Figure 7. Spectrum resource utilization of KSP-FF, DeepRMSA, DeepSF-PCE, and MFDRL-RSA under different traffic loads in NSFNET.
Figure 8 and Figure 9 illustrate the blocking rate and spectrum resource utilization of the algorithms under varying load conditions in the Cost-239 network. As network load increases, the blocking rate and spectrum resource utilization of all algorithms rise accordingly. In the Cost-239 network, the MFDRL-RSA algorithm achieves an average blocking rate that is 19.33% lower than that of the KSP-FF algorithm, 15.17% lower than that of the DeepRMSA algorithm, and 9.95% lower than that of the DeepSF-PCE algorithm. Additionally, the spectrum resource utilization of the MFDRL-RSA algorithm surpasses that of KSP-FF, DeepRMSA, and DeepSF-PCE. On average, MFDRL-RSA improves spectrum utilization by 13.92% over KSP-FF, 11.55% over DeepRMSA, and 8.26% over DeepSF-PCE. These results further confirm the effectiveness of the MFDRL-RSA algorithm under varying load conditions in different networks.
Figure 8. Blocking rate of KSP-FF, DeepRMSA, DeepSF-PCE, and MFDRL-RSA under different traffic loads in Cost-239.
Figure 9. Spectrum resource utilization of KSP-FF, DeepRMSA, DeepSF-PCE, and MFDRL-RSA under different traffic loads in Cost-239.
It is noteworthy that in the Cost-239 network, which has a larger number of links and a more complex environment, MFDRL-RSA achieves a more substantial reduction in blocking rate and a greater improvement in spectrum resource utilization than it does in the NSFNET environment. This finding underscores the potential of MFDRL-RSA to enhance transmission performance in complex network environments. The simulation results confirm the effectiveness of MFDRL-RSA in improving DRL-based routing decisions through a reward design that includes the multi-link fragmentation degree. This refinement allows the model to capture the network state better, leading to more informed routing decisions. Compared with the other DRL-based resource allocation algorithms, MFDRL-RSA reduces network blocking and enhances resource utilization.

5. Conclusions

This paper proposes MFDRL-RSA, an improved DRL-based routing algorithm for RSA problems that addresses the lack of global network awareness in existing methods. It introduces a multi-link fragmentation degree to evaluate resource distribution within a representative subset of a network. A novel BFN algorithm is used to preserve link adjacency in the network state matrix. To reduce computation, only high-frequency links are considered in fragmentation calculation, which improves training efficiency and lowers blocking rates compared to using all links. This fragmentation metric is integrated into the reward function, guiding the agent toward routing decisions that minimize fragmentation, with spectrum allocated using the FF strategy.
Simulation results show that MFDRL-RSA achieves lower blocking rates than KSP-FF, DeepRMSA, and DeepSF-PCE under identical load conditions in both NSFNET and Cost-239 networks and maintains consistent superiority under varying traffic loads. Specifically, it reduces average blocking rates by 16.34% and 19.33% compared to KSP-FF, by 13.01% and 15.17% compared to DeepRMSA, and by 7.42% and 9.95% compared to DeepSF-PCE, in NSFNET and Cost-239, respectively. In terms of spectrum utilization, MFDRL-RSA improves by 12.28% and 13.92% over KSP-FF, by 9.83% and 11.55% over DeepRMSA, and by 6.32% and 8.26% over DeepSF-PCE in the same networks. The gains are more pronounced in the more complex Cost-239 network. These results confirm that MFDRL-RSA is a practical solution for solving RSA problems in dynamic EONs.
Although MFDRL-RSA has shown promising results, several aspects remain to be explored. First, we did not consider the real deployment of the algorithm. In real networks, decision delay may cause degraded service. Thus, we plan to evaluate the decision time of MFDRL-RSA under the same hardware and compare it with other approaches. We will assess training time to discuss if the performance gain justifies the cost. Additionally, the model’s generalization is unexplored. Since real environments vary, we will test generalization by training on one topology, testing on others, and evaluating performance under different traffic patterns like bursty or non-Poisson arrivals.

Author Contributions

Conceptualization, J.J. and Y.S.; methodology, J.J. and Y.S.; software, J.J., Y.S. and T.S.; validation, J.J. and Y.S.; formal analysis, J.J., J.C. and Y.S.; investigation, J.J. and Y.S.; resources, J.J., J.C. and T.S.; data curation, J.J. and Y.S.; writing—original draft preparation, J.J. and Y.S.; writing—review and editing, J.J. and Y.S.; visualization, J.J. and Y.S.; supervision, J.J., J.C. and T.S.; project administration, J.J., J.C. and T.S.; funding acquisition, J.J. and T.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China (62401428), Natural Science Basic Program of Shaanxi (2024JC-YBQN-0714) and the Fundamental Research Funds for the Central Universities (ZYTS25040).

Institutional Review Board Statement

Not applicable.

Data Availability Statement

There are no relevant datasets presented in this article. Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
MFDRL-RSA: Multi-link fragmentation deep reinforcement learning-based routing and spectrum allocation algorithm
EONs: Elastic optical networks
RSA: Routing and spectrum allocation
DRL: Deep reinforcement learning
BFN: Breadth-first numbering
FS: Frequency slot
TAM: Traversing the adjacency matrix
AC: Actor-critic
KSP-FF: k shortest path and first-fit
FF: First-fit

References

  1. Kumar, K.S.; Kalaivani, S.; Ibrahim, S.P.S.; Swathi, G. Traffic and fragmentation aware algorithm for routing and spectrum assignment in Elastic Optical Network (EON). Opt. Fiber Technol. 2023, 81, 103480.
  2. Lechowicz, P.; Tornatore, M.; Włodarczyk, A.; Walkowiak, K. Fragmentation metrics and fragmentation-aware algorithm for spectrally/spatially flexible optical networks. J. Opt. Commun. Netw. 2020, 12, 133–145.
  3. Kitsuwan, N.; Akaki, K.; Pavarangkoon, P.; Nag, A. Spectrum allocation scheme considering spectrum slicing in elastic optical networks. J. Opt. Commun. Netw. 2021, 13, 169–181.
  4. Mandloi, A. Routing and dynamic core allocation with fragmentation optimization in EON-SDM. Opt. Fiber Technol. 2024, 83, 103658.
  5. Mei, J.; Wang, X.; Zheng, K.; Boudreau, G.; Bin Sediq, A.; Abou-Zeid, H. Intelligent radio access network slicing for service provisioning in 6G: A hierarchical deep reinforcement learning approach. IEEE Trans. Commun. 2021, 69, 6063–6078.
  6. He, Q.; Wang, Y.; Wang, X.; Xu, W.; Li, F.; Yang, K.; Ma, L. Routing optimization with deep reinforcement learning in knowledge defined networking. IEEE Trans. Mob. Comput. 2023, 23, 1444–1455.
  7. Qureshi, K.I.; Lu, B.; Lu, C.; Lodhi, M.A.; Wang, L. Multi-agent DRL for Air-to-Ground Communication Planning in UAV-enabled IoT Networks. Sensors 2024, 24, 6535.
  8. Wang, S.; Yuen, C.; Ni, W.; Guan, Y.L.; Lv, T. Multiagent deep reinforcement learning for cost- and delay-sensitive virtual network function placement and routing. IEEE Trans. Commun. 2022, 70, 5208–5224.
  9. Hernández-Chulde, C.; Casellas, R.; Martínez, R.; Vilalta, R.; Muñoz, R. Experimental evaluation of a latency-aware routing and spectrum assignment mechanism based on deep reinforcement learning. J. Opt. Commun. Netw. 2023, 15, 925–937.
  10. Xu, L.; Huang, Y.C.; Xue, Y.; Hu, X. Hierarchical reinforcement learning in multi-domain elastic optical networks to realize joint RMSA. J. Lightw. Technol. 2023, 41, 2276–2288.
  11. Tanaka, T.; Shimoda, M. Pre-and post-processing techniques for reinforcement-learning-based routing and spectrum assignment in elastic optical networks. J. Opt. Commun. Netw. 2023, 15, 1019–1029.
  12. Khorasani, Y.; Rahbar, A.G.; Alizadeh, B. A novel adjustable defragmentation algorithm in elastic optical networks. Opt. Fiber Technol. 2024, 82, 103615.
  13. Bao, B.; Yang, H.; Yao, Q.; Yu, A.; Chatterjee, B.C.; Oki, E.; Zhang, J. SDFA: A service-driven fragmentation-aware resource allocation in elastic optical networks. IEEE Trans. Netw. Serv. Manag. 2021, 19, 353–365.
  14. Chen, X.; Li, B.; Proietti, R.; Lu, H.; Zhu, Z.; Ben Yoo, S.J. DeepRMSA: A deep reinforcement learning framework for routing, modulation and spectrum assignment in elastic optical networks. J. Lightw. Technol. 2019, 37, 4155–4163.
  15. Yan, W.; Li, X.; Ding, Y.; He, J.; Cai, B. DQN with prioritized experience replay algorithm for reducing network blocking rate in elastic optical networks. Opt. Fiber Technol. 2024, 82, 103625.
  16. Gonzalez, M.; Condon, F.; Morales, P.; He, J.; Cai, B. Improving multi-band elastic optical networks performance using behavior induction on deep reinforcement learning. In Proceedings of the 2022 IEEE Latin-American Conference on Communications (LATINCOM), Rio de Janeiro, Brazil, 30 November–2 December 2022; Volume 1, pp. 1–6.
  17. Errea, J.; Djon, D.; Tran, H.Q.; Verchere, D.; Ksentini, A. Deep reinforcement learning-aided fragmentation-aware RMSA path computation engine for open disaggregated transport networks. In Proceedings of the 2023 International Conference on Optical Network Design and Modeling (ONDM), Coimbra, Portugal, 8–11 May 2023; Volume 1, pp. 1–3.
  18. Johari, S.S.; Taeb, S.; Shahriar, N.; Chowdhury, S.R.; Tornatore, M.; Boutaba, R.; Mitra, J.; Hemmati, M. DRL-assisted reoptimization of network slice embedding on EON-enabled transport networks. IEEE Trans. Netw. Serv. Manag. 2023, 20, 800–814.
  19. Etezadi, E.; Natalino, C.; Diaz, R.; Lindgren, A.; Melin, S.; Wosinska, L.; Monti, P.; Furdek, M. Deep reinforcement learning for proactive spectrum defragmentation in elastic optical networks. J. Opt. Commun. Netw. 2023, 15, E86–E96.
  20. Shimoda, M.; Tanaka, T. Mask RSA: End-to-end reinforcement learning-based routing and spectrum assignment. In Proceedings of the European Conference on Optical Communication (ECOC), Bordeaux, France, 13–16 September 2021.
  21. Tang, B.; Huang, Y.C.; Xue, Y.; Song, H.; Xu, Z. Heuristic reward design for deep reinforcement learning-based routing, modulation and spectrum assignment of elastic optical networks. IEEE Commun. Lett. 2022, 26, 2675–2679.
  22. Asiri, A.; Wang, B. Deep Reinforcement Learning for QoT-Aware Routing, Modulation, and Spectrum Assignment in Elastic Optical Networks. J. Lightw. Technol. 2025, 43, 42–60.
  23. Wang, Y.; Cao, X.; Pan, Y. A study of the routing and spectrum allocation in spectrum-sliced elastic optical path networks. In Proceedings of the 2011 Proceedings IEEE Infocom, Shanghai, China, 10–15 April 2011; Volume 1, pp. 1503–1511.
  24. Zhang, C.; Wang, P. Fuzzy logic system assisted sensing resource allocation for optical fiber sensing and communication integrated network. Sensors 2022, 22, 7708.
  25. Chen, X.; Xu, Z.; Wu, Y.; Wu, Q. Heuristic algorithms for reliability estimation based on breadth-first search of a grid tree. Reliab. Eng. Syst. Saf. 2023, 232, 109083.
  26. Liao, J.; Zhao, J.; Gao, F.; Li, G.Y. Deep learning aided low complex breadth-first tree search for MIMO detection. IEEE Trans. Wirel. Commun. 2023, 23, 6266–6278.
  27. Zhou, C.; Huang, B.; Hassan, H.; Fränti, P. Attention-based advantage actor-critic algorithm with prioritized experience replay for complex 2-D robotic motion planning. J. Intell. Manuf. 2023, 34, 151–180.
  28. Wang, H.; Gao, W.; Wang, Z.; Zhang, K.; Ren, J.; Deng, L.; He, S. Research on Obstacle Avoidance Planning for UUV Based on A3C Algorithm. J. Mar. Sci. Eng. 2023, 12, 63.
  29. Wang, S.; Song, R.; Zheng, X.; Huang, W.; Liu, H. A3C-R: A QoS-oriented energy-saving routing algorithm for software-defined networks. Future Internet 2025, 17, 158.
  30. Labao, A.B.; Martija, M.A.M.; Naval, P.C. A3C-GS: Adaptive moment gradient sharing with locks for asynchronous actor–critic agents. IEEE Trans. Neural Netw. Learn. Syst. 2020, 32, 1162–1176.
  31. National Science Foundation. NSFNET: The Birth of the Commercial Internet. Available online: https://www.nsf.gov/impacts/internet (accessed on 3 June 2025).
  32. SNDlib—Survivable Network Design Library. Cost239 Topology. Available online: https://sndlib.zib.de/network.jsp?topology=cost239 (accessed on 3 June 2025).
  33. Vincent, R.J.; Ives, D.J.; Savory, S.J. Scalable capacity estimation for nonlinear elastic all-optical core networks. J. Lightw. Technol. 2019, 37, 5380–5391.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
