Network-Aware Gaussian Mixture Models for Multi-Objective SD-WAN Controller Placement
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
Dear authors, in the next paragraphs, my comments about your manuscript.
The article presents an original approach to the SD-WAN context based on Gaussian Mixture Models (GMM). The introduction of hybrid distance metrics (latency, geographical distance, topological cost, and reliability) is a notable contribution in regard to the optimization of controller placement. The NA-GMM model simultaneously considers latency, load balancing, fault tolerance, and scalability as criteria, addressing the complexity of the controller placement issue in distributed SD-WAN networks.
The article demonstrates exemplary theoretical rigour, with formal modelling of the GMM components, convergence analysis of the modified EM algorithm, weight normalization, and validation of the hybrid metric.
The employment of Mininet, Internet Topology Zoo, and well-known benchmarks (DRL, ACO, and CPCSA) brings solid and broad validation to the proposal, with concrete metrics (ACL, WCL, ICL, NDR). NA-GMM achieves improvements in average latency of up to 37.5%, load balancing with NDR near 1.0, and better computational efficiency, with lower memory usage and faster execution than all the other methods. The proposal can be considered for real-world application in large-scale enterprise networks, providing better results without incurring the computational requirements of deep learning-based approaches.
Points for improvement:
1. Although the normalization of α, β, γ, δ is mentioned, a sensitivity analysis of how the different combinations affect the results has yet to be performed. Parametric or impact analyses would be beneficial.
2. The value of k was assumed to be optimal based on elbow and silhouette coefficient analyses, so the effect of varying the number of controllers on other topologies is not considered.
3. An exponential distance decay-based reliability model is quite simplistic. It would be more practical to consider empirical link failure data or metrics taken from real networks.
4. Despite a brief mention in the literature gap subsection, there is no specific evaluation of fault resilience against malicious faults or attacks aimed at the controllers.
5. The paper does not test NA-GMM under topology changes (addition/removal of nodes), although it mentions that SD-WAN networks are highly dynamic.
6. The topologies were limited to at most 145 nodes, with only three topologies. Its application to a large-scale network (>1000 nodes), such as an operator backbone, was never explored.
7. While confidence intervals and significance tests are mentioned, quantitative details (e.g., p-values, effect size, data distribution) are still not given.
Author Response
Dear Esteemed Reviewer,
We sincerely thank you for your thorough and constructive review of our manuscript titled "Network-Aware Gaussian Mixture Models for Multi-Objective SD-WAN Controller Placement." Your detailed feedback and insightful comments have been invaluable in helping us improve the quality and clarity of our work. We greatly appreciate the time and effort you have invested in evaluating our research, and we recognize that your expertise has contributed significantly to strengthening this manuscript. We have carefully considered each point raised and are pleased to provide detailed responses below. We believe that addressing these concerns will enhance the accessibility and impact of our research for the broader scientific community. We are committed to ensuring that our work meets the highest standards of scientific rigor and clarity. Below, we address each of your specific comments with detailed explanations and references to the relevant sections, tables, and figures in our manuscript.
Comment 1: With the normalization of α, β, γ, δ having been mentioned, a sensitivity analysis on the way in which the different combinations affect the results is yet to be performed. Parametric or impact analyses would be beneficial.
Response 1: We appreciate the reviewer's insightful comment. To address this concern, we have added Section 3.3 "Sensitivity Analysis of Hybrid Distance Metric Parameters" which provides a comprehensive parametric analysis of the weight combinations.
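To make the kind of parametric analysis requested here concrete, the following is a minimal sketch of a weighted hybrid distance and a grid of normalized weight combinations. It assumes a simple linear weighted sum over four pre-normalized components; the exact functional form used in the manuscript is not given in this record, and the function names, the grid step, and the sample values are hypothetical.

```python
from itertools import product


def hybrid_distance(latency, geo, topo, reliability, weights):
    """Weighted sum of four components (assumed form, not the authors' exact metric).

    All inputs are assumed to be pre-scaled to [0, 1]. Reliability is turned
    into a cost (1 - reliability) so that a lower value is always better.
    """
    alpha, beta, gamma, delta = weights
    return alpha * latency + beta * geo + gamma * topo + delta * (1.0 - reliability)


def weight_grid(step=0.25):
    """Enumerate all (alpha, beta, gamma, delta) on a regular grid that sum to 1."""
    n = round(1 / step)
    grid = []
    for a, b, c in product(range(n + 1), repeat=3):
        d = n - a - b - c
        if d >= 0:
            grid.append((a / n, b / n, c / n, d / n))
    return grid


if __name__ == "__main__":
    # Hypothetical switch-to-controller pair, values already normalized.
    pair = dict(latency=0.40, geo=0.55, topo=0.30, reliability=0.90)
    for w in weight_grid(step=0.5):
        print(w, round(hybrid_distance(weights=w, **pair), 3))
```

Sweeping the grid and recording the resulting placement metrics for each weight tuple is one straightforward way to expose which components dominate the outcome.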
Comment 2: It was assumed that the value k is optimal based on elbow and silhouette coefficient analyses, so there is no consideration of the effect of varying the number of controllers on other topologies.
Response 2: We respectfully clarify our methodology for optimal k determination. The elbow and silhouette coefficient analyses were specifically employed to scientifically determine the optimal number of controllers rather than using arbitrary selection or exhaustive testing. As shown in Figure 7 and Section 4.2.1, we conducted rigorous clustering validation across all three topologies.
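As an illustration of how a silhouette coefficient separates good and poor partitions when choosing k, here is a small self-contained sketch. It uses plain Euclidean distance as a stand-in (the manuscript's clustering relies on its own hybrid metric), and the toy points and labels are hypothetical.

```python
from math import dist


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points, using Euclidean distance."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            continue  # singleton clusters contribute 0 by convention
        # a: mean distance to the other members of the same cluster
        a = sum(dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(dist(point, q) for q in other) / len(other)
            for key, other in clusters.items()
            if key != label
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
    print(silhouette_score(pts, [0, 0, 1, 1]))  # compact, well-separated clusters
    print(silhouette_score(pts, [0, 1, 0, 1]))  # clusters that cut across the gap
```

Repeating this score for each candidate k, per topology, gives the curve from which the preferred number of controllers is read.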
Comment 3: From another perspective, though, an exponential distance decay-based reliability model is quite simplistic. What would be more practical would be to consider empirical link failure data or some metrics taken from real networks.
Response 3: We acknowledge the reviewer's valid concern regarding the simplicity of our exponential decay reliability model. This approach ensures that our algorithm's core contributions (hybrid distance metric integration and probabilistic clustering) can be evaluated independently of complex failure modeling assumptions. We plan to incorporate real-world reliability datasets and advanced failure modeling as part of our future research agenda, where link failure considerations will be addressed as a primary objective rather than a secondary component of the distance metric.
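For readers unfamiliar with the model under discussion, a distance-based exponential decay can be sketched as follows. This assumes great-circle (Haversine) distance between node coordinates and a single decay constant; the constant `decay_per_km` is a hypothetical placeholder, not a value taken from the manuscript.

```python
from math import asin, cos, exp, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    h = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def link_reliability(distance_km, decay_per_km=1e-4):
    """Exponential decay: reliability is 1 at zero distance and falls with length.

    decay_per_km is a hypothetical constant chosen only for illustration.
    """
    return exp(-decay_per_km * distance_km)


if __name__ == "__main__":
    d = haversine_km(48.8566, 2.3522, 52.5200, 13.4050)  # Paris to Berlin
    print(round(d, 1), round(link_reliability(d), 4))
```

Replacing `link_reliability` with a lookup into measured per-link failure rates would be the natural place to plug in the empirical data the reviewer suggests.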
Comment 4: With a slight mention in the literature gap subsection, no specific evaluation of fault resilience is done towards malicious faults or attacks aimed at the controllers.
Response 4: We acknowledge that our current work does not explicitly address security-oriented fault resilience or malicious attacks targeting controllers. While we briefly mentioned security considerations in Section 2.8 as an emerging research gap, our primary focus was on optimizing controller placement for performance metrics (latency, load balancing, throughput) rather than security resilience.
Comment 5: The paper has not been able to test the NA-GMM under dynamics of topology change (addition/removal of nodes), although mentioning that SD-WAN networks are highly dynamic.
Response 5: We acknowledge this important limitation in our evaluation. While we highlighted the dynamic nature of SD-WAN environments in Section 1, our experimental evaluation focused on static topology configurations to establish baseline performance characteristics of the NA-GMM algorithm.
Comment 6: Limits on the number of nodes in the topologies were 145 with only three topologies. Its application on a large-scale network (>1000 nodes) such as an operator backbone was never explored.
Response 6: We thank the reviewer for raising the issue of scalability validation on large-scale networks. Our evaluation focused on medium-scale topologies (33-145 nodes) from the Internet Topology Zoo dataset, which represent infrastructure-level network elements (switches/routers) rather than end-user devices. In practical SD-WAN deployments, each topology node serves as an aggregation point managing hundreds to thousands of edge connections, effectively scaling the network's operational capacity beyond the core topology size. While our theoretical complexity analysis demonstrates favorable O(n²k) scaling characteristics compared to benchmark algorithms, we acknowledge that direct evaluation on operator backbone networks with >1000 infrastructure nodes requires specialized computational resources and represents a significant scalability challenge for all compared algorithms.
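A rough sense of what O(n²k) implies for the network sizes under discussion can be obtained with a back-of-the-envelope calculation. The sketch below only compares n²k operation counts between two problem sizes; it ignores constant factors, memory effects, and iteration counts, so it is an extrapolation under those assumptions rather than a runtime prediction. The function name is hypothetical.

```python
def relative_cost(n_new, n_ref, k_new=5, k_ref=5):
    """Ratio of n^2 * k operation counts between two problem sizes."""
    return (n_new ** 2 * k_new) / (n_ref ** 2 * k_ref)


if __name__ == "__main__":
    # Largest evaluated topology (145 nodes) versus a 1000-node backbone, same k.
    print(round(relative_cost(1000, 145), 1))
```

Under these assumptions, moving from 145 to 1000 nodes with the same number of controllers multiplies the dominant term by a factor of roughly 48.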
Comment 7: While the confidence intervals and significance tests are mentioned, some quantitative usage details (e.g., p-values, effect size, data distribution, etc.) are still not given.
Response 7: We thank the reviewer for their careful attention to our statistical reporting. We acknowledge an error in our statistical terminology - we incorrectly mentioned "confidence intervals" in our methodology when we actually computed and reported "mean/average" values from multiple simulation runs. We have corrected this terminology in the revised manuscript.
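For reference, the quantities the reviewer asks about can be derived from repeated simulation runs with very little code. The sketch below computes a mean with an approximate confidence interval and Cohen's d between two algorithms. It uses a normal approximation for the interval (a Student's t critical value would be more appropriate for a small number of runs), and the sample latencies are hypothetical, not taken from the manuscript.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_ci(samples, confidence=0.95):
    """Return (mean, lower, upper) using a normal-approximation interval."""
    m = mean(samples)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * stdev(samples) / sqrt(len(samples))
    return m, m - half_width, m + half_width


def cohens_d(a, b):
    """Standardized mean difference between two samples, using pooled std dev."""
    na, nb = len(a), len(b)
    pooled = sqrt(((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2))
    return (mean(a) - mean(b)) / pooled


if __name__ == "__main__":
    # Hypothetical per-run average latencies (ms) for two algorithms.
    runs_a = [10.0, 12.0, 14.0]
    runs_b = [20.0, 22.0, 24.0]
    print(mean_ci(runs_a))
    print(cohens_d(runs_a, runs_b))
```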
Reviewer 2 Report
Comments and Suggestions for Authors
See the comment.
Comments for author File: Comments.pdf
Author Response
Dear Esteemed Reviewers,
Comment 1: Several parts of the manuscript (particularly Sections 3 and 4) are overly descriptive and could benefit from condensation. For instance, the mathematical background of GMMs is explained at textbook-level detail, which may be unnecessary for the target readership.
Response 1: We thank the reviewer for this valuable feedback regarding manuscript conciseness. We acknowledge that certain sections contained excessive detail that may be redundant for readers familiar with machine learning fundamentals. We have condensed the manuscript by approximately 15%, specifically:
- Section 3.3.1: Reduced GMM mathematical background to essential formulations directly relevant to our network-aware modifications
- Sections 3.3.2-3.3.5: Streamlined theoretical explanations focusing on novel contributions rather than standard concepts
- Section 4: Condensed experimental descriptions while retaining critical implementation details
- Algorithm presentations: Emphasized innovative aspects rather than conventional procedural steps
Comment 2: Although the abstract and introduction mention dynamic networks, the algorithm does not support online learning or dynamic reconfiguration.
Response 2: We thank the reviewer for highlighting this important distinction regarding our use of "dynamic networks." We acknowledge the need to clarify our terminology and scope. When we referenced "dynamic networks" in SD-WAN deployments, we were referring to operational environment characteristics such as variable traffic patterns, changing QoS requirements, and evolving network conditions that require optimized controller placement. However, our NA-GMM algorithm addresses static topology optimization, assuming the physical network structure remains fixed during the optimization process. This differs from dynamic reconfiguration algorithms that handle real-time topology changes and from online learning approaches that adapt to network evolution. We have clarified the terminology to distinguish between dynamic operational conditions and dynamic topology changes, and have included this point in the future work in Section 6. We have also removed the subsection on adaptive/dynamic controller placement from the literature review.
Comment 3: Some existing work should be further considered: "Stability analysis of networked control systems under DoS attacks and security controller design with mini-batch machine learning supervision" and "Intelligent event-triggered control supervised by mini-batch machine learning and data compression mechanism for T-S fuzzy NCSs under DoS attacks."
Response 3: The reviewer highlighted important research in networked control systems security. However, the current work focuses on SD-WAN infrastructure optimization for performance metrics, without incorporating security considerations or attack resilience mechanisms. We have noted these areas as future work in Section 6.
Comment 4: While the hybrid distance metric is a key innovation, there is no detailed ablation study showing the individual contributions of α, β, γ, δ components.
Response 4: We thank the reviewer for recognizing the importance of understanding individual component contributions to our hybrid distance metric. This concern has been addressed through the addition of Section 3.3 "Sensitivity Analysis of Hybrid Distance Metric Parameters," which provides a comprehensive ablation study examining the individual and combined effects of the α, β, γ, δ components.
Comment 5: The computational complexity is discussed theoretically, but empirical runtime benchmarks (e.g., how it scales with number of nodes) are limited.
Response 5: We thank the reviewer for this observation. While we provided a theoretical O(n²k) complexity analysis, our empirical evaluation in Section 4.2.3 focuses on controller scaling (K=1 to K=5) as shown in Figure 18, where NA-GMM demonstrates 68.9% faster execution than DRL with a consistent 20.6% growth rate per additional controller. We acknowledge that our node count scalability assessment is limited to three topologies (33-145 nodes), which provides insufficient empirical validation of runtime scaling with network size. Comprehensive empirical benchmarking across varying node counts (100-1000+ nodes) represents an important validation step for our theoretical complexity claims and will be addressed in future large-scale evaluation studies requiring specialized computational infrastructure. This has also been included in the future work in Section 6.
Comment 6: Phrases like “network-aware” and “hybrid distance metric” are used repeatedly without variation, which can be stylistically fatiguing.
Response 6: We thank the reviewer for this stylistic feedback. We acknowledge the repetitive terminology and have introduced varied expressions while maintaining technical precision. "Network-aware" (referring to our α, β, γ, δ parameter-based topology node analysis) and "hybrid distance metric" (denoting the integration of four components with Haversine distance for accurate node positioning) are varied with alternatives such as "topology-conscious," "multi-dimensional distance framework," and "composite proximity measure" where contextually appropriate. These revisions improve stylistic variety while preserving the technical accuracy of our core contributions.
Comment 7: Figures (e.g., controller placement visualization) are informative but could benefit from color-coding clusters, clearer node labels, or adding legends.
Response 7: We thank the reviewer for this feedback on the visuals. The figures have been enhanced (Figures 11-13).
Comment 8: Some grammatical expressions should be further modified.
Response 8: The manuscript has been proofread and double-checked.
Reviewer 3 Report
Comments and Suggestions for Authors
This paper mainly investigates the performance of multiple SDN controller placement methods by implementing these methods in a Mininet emulation environment. This paper employs realistic traffic scenarios and consistent network settings to evaluate the impact of alternative clustering-based placement techniques on overall network performance. The purpose of this paper is to give practical insights into the usefulness of these algorithms beyond theoretical study, focusing on their influence on communication efficiency and system responsiveness in real-world scenarios. The paper is well organized, but there are some issues that need to be addressed.
1- There are no actual numerical results, such as latency, throughput, etc., to show the effectiveness and robustness of the proposed model.
2- There is no comparative figure or table that represents how each model will perform.
3- The presented idea needs to be compared with other benchmarks.
4- Some important information needs to be mentioned, such as topology size and type, and the number of switches, controllers, or nodes used in this model.
5- What are the limitations of the proposed model?
Author Response
Dear Esteemed Reviewers,
Comment 1: There are no actual numerical results, such as latency, throughput, etc., to show the effectiveness and robustness of the proposed model.
Response 1: We appreciate the reviewer's insightful comment. Our paper contains extensive numerical results demonstrating the effectiveness and robustness of the proposed NA-GMM model:
Latency Results:
- Table 5 (TATA topology)
- Table 6 (BICS topology)
- Table 7 (BESTEL topology)
Throughput Results:
- Table 8: Network performance metrics showing actual throughput values:
  - TATA: NA-GMM achieves 282.97 Req/Sec vs ACO's 260.20 Req/Sec
  - BICS: NA-GMM achieves 278.58 Req/Sec vs ACO's 236.86 Req/Sec
  - BESTEL: NA-GMM achieves 292.67 Req/Sec vs ACO's 259.44 Req/Sec
Additional Performance Metrics:
- Controller Utilization: 4.0677% average across topologies (Table 8)
- Memory Usage: 1,247 MB average vs DRL's 2,134 MB (41.5% reduction, Table 12)
- Execution Time: 1.83 seconds for 5-controller placement vs DRL's 5.89 seconds (68.9% reduction, Table 10)
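The percentage figures quoted in this response follow from simple relative differences, which can be reproduced as below. This is only a worked check using the values stated above; the helper names are hypothetical, and small differences in the last digit can arise from rounding of the underlying averages.

```python
def percent_reduction(baseline, value):
    """How much smaller `value` is than `baseline`, as a percentage of baseline."""
    return 100.0 * (baseline - value) / baseline


def percent_gain(baseline, value):
    """How much larger `value` is than `baseline`, as a percentage of baseline."""
    return 100.0 * (value - baseline) / baseline


if __name__ == "__main__":
    # Execution time: NA-GMM 1.83 s vs DRL 5.89 s
    print(round(percent_reduction(5.89, 1.83), 1))
    # Memory usage: NA-GMM 1,247 MB vs DRL 2,134 MB
    print(round(percent_reduction(2134, 1247), 1))
    # Throughput on TATA: NA-GMM 282.97 vs ACO 260.20 Req/Sec
    print(round(percent_gain(260.20, 282.97), 2))
```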
Comment 2: There is no comparative figure or table that represents how each model will perform.
Response 2: We respectfully clarify our paper includes multiple comparative visualizations and tables:
Comparative Tables:
- Tables 5, 6, 7: Direct performance comparison across all four algorithms (NA-GMM, ACO, DRL, CPCSA) for each topology
- Table 8: Comprehensive network performance comparison showing throughput, utilization, CLV, and LIR metrics
- Table 10: Execution time comparison across different controller configurations
- Table 12: System resource utilization comparison
Comparative Figures:
- Figures 8, 10, 12: Horizontal bar charts comparing ACL, WCL, ICL, and NDR metrics across all algorithms
- Figures 14-17: Network performance comparisons showing:
  - Figure 14: Throughput comparison across all topologies
  - Figure 15: Controller utilization comparison
  - Figure 16: Controller Load Variance comparison
  - Figure 17: Load Imbalance Ratio comparison
- Figure 18: Time complexity comparison showing scalability characteristics
- Figure 20: CPU and RAM utilization during simulations
Comment 3: The presented idea needs to be compared with other benchmarks.
Response 3: We acknowledge the concern and we have compared NA-GMM against three established benchmark algorithms:
- ACO (Ant Colony Optimization) [Reference 48]: A well-established metaheuristic approach
- DRL (Deep Reinforcement Learning) [Reference 50]: State-of-the-art machine learning approach using CNN-LSTM architecture
- CPCSA (Controller Placement with Critical Switch Awareness) [Reference 49]: Recent specialized approach for SDN controller placement
These benchmarks represent different algorithmic paradigms (metaheuristic, machine learning, and network-aware approaches) providing comprehensive comparative analysis.
Comment 4: Some important information needs to be mentioned, such as topology size and type, number to switches, controller or nodes used in this model.
Response 4: We note that our current work includes topology information, which is provided in multiple locations:
Network Specifications (Table 4, page 17):
- TATA topology: 145 nodes, 194 edges (large-scale enterprise network)
- BICS topology: 84 nodes, 101 edges (medium-scale regional network)
- BESTEL topology: 33 nodes, 48 edges (small-scale constrained network)
Experimental Configuration:
- Number of controllers: k=5 (determined through silhouette analysis, Figure 7)
- Controller capacity: 7,000 Req/Sec uniformly applied
- Network bandwidth: 1000 Mbps
- Topology sources: Internet Topology Zoo (ITZ) real-world network topologies
- Visual representation: Figure 6 shows geographical layouts of all three topologies
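The configuration values above can be related to the utilization figure quoted elsewhere in this response with a short calculation. The sketch assumes that controller utilization is the mean NA-GMM throughput divided by the uniform controller capacity; this definition is an assumption made for illustration and is not stated in this record, although it happens to be consistent with the reported 4.0677% average.

```python
CONTROLLER_CAPACITY = 7000.0  # Req/Sec, as stated in the experimental configuration

# NA-GMM throughput per topology (Req/Sec), as reported from Table 8.
throughput = {"TATA": 282.97, "BICS": 278.58, "BESTEL": 292.67}


def utilization_percent(load, capacity=CONTROLLER_CAPACITY):
    """Offered load as a percentage of controller capacity (assumed definition)."""
    return 100.0 * load / capacity


mean_throughput = sum(throughput.values()) / len(throughput)
mean_utilization = utilization_percent(mean_throughput)

if __name__ == "__main__":
    print(round(mean_throughput, 2), round(mean_utilization, 4))
```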
Comment 5: What are the limitations of the proposed model?
Response 5: We acknowledge the limitations of our proposed model, which we have addressed comprehensively in Section 6: Future Work (pages 31-32).
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The authors explained and added all the recommendations in the manuscript.
Reviewer 3 Report
Comments and Suggestions for Authors
The authors addressed all the required comments.