Article
Peer-Review Record

An OpenFlow-Based Elephant-Flow Monitoring and Scheduling Strategy in SDN

Electronics 2025, 14(13), 2663; https://doi.org/10.3390/electronics14132663
by Qinghui Chen 1, Mingyang Chen 2,3, Hong Wen 1,* and Yazhi Shi 4
Reviewer 2: Anonymous
Reviewer 3:
Reviewer 4:
Reviewer 5: Anonymous
Submission received: 2 April 2025 / Revised: 1 June 2025 / Accepted: 4 June 2025 / Published: 30 June 2025

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Review for

An Openflow-based elephant-flow monitoring and scheduling strategy in SDN

By Mingyang Chen, Qinghui Chen, Hong Wen, and Yazhi Shi

[Review period 26/4/2025-8/5/2025]

The authors propose monitoring and scheduling improvements in Software Defined Networking to address the elephant flow problem, i.e. the impact on the overall network quality and performance of transmitting very large flows of data (the elephant).

For a newcomer or a student in the field this report could be interesting, as it is relatively brief and it highlights a framework for studying or researching such a problem (e.g. mininet, iperf, allping, ryu controller etc). In this respect it could motivate a newcomer to study the problem further. However, for the paper to contribute towards that end, it would need some improvements to help the audience comprehend its contribution.

Similar issues have been studied in the research community at least since 2010 (e.g. [1]). From studying the paper, it is not clear whether the authors propose something innovative. The algorithms for detecting elephant flows and improving the scheduling are not described in enough detail for someone to understand what the innovation is. In addition, there are no comparisons with other proposals in the literature, as has been done in other works (e.g. [2]), to help someone comprehend the contribution.

I provide below some remarks that could help the readability or understandability of the paper.

At the end of subsection 2.1 there is an algorithm provided.  It is not clear if it is related to the module “el_monitor” mentioned earlier.  In addition, it is not clearly presented what improvements or additions have been made to the existing Ryu framework in order to come up with “el_monitor”.

Step 2 of the same algorithm defines an add_flow function. It is not clear what exactly this function does. Is it the one involved in Step 3 to add a flow table entry? Is it related to what is mentioned earlier in this subsection: “…the application uses the Address Resolution Protocol (ARP) to resolve the IP address contained in the traffic, which is crucial for the subsequent traffic type judgment”?

In a similar sense, Subsection 2.1 says: “For the flow table delivery logic, this application inherits and improves the simple_switch_13 application built into Ryu”. It is not clear what were the exact improvements to the Ryu component. This could have helped someone appreciate the contributions of the present work.

In the algorithm provided at the end of Subsection 2.2 it appears that the essential part is the step “Configure new traffic path n”. It is not clear what this configuration involves. Also, it is not clear whether the last two statements “while true:” and “all path_n” are part of the function path_n() [this is what the indentation suggests] or whether they are the main body of the algorithm, which looks more probable.

Section 3 attempts to compare the performance of the leaf-spine structure with a fat-tree structure. Figure 2 presents the underlying leaf-spine structure. It would help the reader if a diagram were also provided for the fat-tree structure it is compared against.

 

In section 3.2, third paragraph, it says: “When the network has an elephant flow, it is difficult for the Fat-Tree structure to perform the scheduling algorithm.” It is difficult to understand exactly what difficulty is implied and what its consequences are.

In section 3.2 Fig. 3 is missing its legend.

In Section 4, second paragraph, it says: “The experimental results show that the solution based on the Fat-Tree spanning tree under the traditional structure will cause network jitter, obvious packet loss, network congestion at multiple time points, and no data packets can pass.” However, the experimental results presented in section 3.2 compare throughput only. There should be an explanation of whether the degradation of the other metrics can be derived from the throughput results (or whether it was observed in other curves not listed in the paper).

[1] Hedera: Dynamic Flow Scheduling for Data Center Networks https://static.usenix.org/events/nsdi10/tech/full_papers/al-fares.pdf

[2] Elephant Flow Detection Mechanism in SDN-Based Data Center Networks https://www.researchgate.net/publication/344062865_Elephant_Flow_Detection_Mechanism_in_SDN-Based_Data_Center_Networks

Comments for author File: Comments.pdf

Author Response

Response:

We sincerely thank the reviewer for the detailed and thoughtful review. We have incorporated the following two references to support our discussion of elephant flow scheduling and detection mechanisms in SDN-based data center networks:

“… and dynamic flow scheduling systems like Hedera have shown to significantly improve bandwidth utilization in multi-rooted tree topologies [5]. ”

[5] Al-Fares, M., Radhakrishnan, S., Raghavan, B., Huang, N., & Vahdat, A. (2010). Hedera: Dynamic Flow Scheduling for Data Center Networks. NSDI. https://static.usenix.org/events/nsdi10/tech/full_papers/al-fares.pdf

We believe these references strengthen the technical depth and practical relevance of the revised manuscript.

The algorithm in Section 2.1 is indeed the core part of the el_monitor module. It is responsible for parsing flow statistics obtained from OpenFlow switches and classifying traffic types accordingly. We acknowledge that the original manuscript did not clearly articulate the relationship between the algorithm and the el_monitor module. To address this, we have added the following clarification in the revised manuscript:

“The following algorithm is the core implementation logic of the el_monitor application developed on top of the Ryu framework, which performs traffic classification based on flow statistics.”

The comment regarding the add_flow function is very pertinent. The add_flow() function is the standard method in the Ryu controller for installing flow entries. It is invoked in Step 3 to add flow rules that prevent repeated Packet-In events. However, it is not directly related to ARP resolution, which occurs during Packet-In handling to obtain destination IP addresses. We have revised the description to clarify its function and distinguish it from the ARP logic:

“Step 2: define the add_flow function. Add a flow table entry to the specified datapath to enable direct forwarding of known traffic and avoid repeated Packet-In events. This function is later invoked in Step 3.”

Additionally, in the ARP-related section, we now state:

“The ARP-based IP resolution step is used during packet decoding, and is not directly part of the add_flow function, but rather supports traffic classification decisions.”
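The division of labor described in this response (a Packet-In on a table miss, followed by add_flow installing an entry so that later packets bypass the controller) can be illustrated with a small pure-Python model. This is only a sketch: `Switch`, `Controller`, and their methods are hypothetical stand-ins rather than Ryu classes, and matching is reduced to an exact destination-MAC lookup.

```python
class Switch:
    """Toy switch: an exact-match flow table keyed by destination MAC."""

    def __init__(self, controller):
        self.flow_table = {}      # dst_mac -> output port
        self.controller = controller
        self.packet_in_count = 0  # packets that had to go to the controller

    def receive(self, dst_mac):
        """Forward a packet, asking the controller only on a table miss."""
        if dst_mac in self.flow_table:
            return self.flow_table[dst_mac]
        self.packet_in_count += 1
        return self.controller.packet_in(self, dst_mac)


class Controller:
    """Toy controller that already knows which port each host sits behind."""

    def __init__(self, mac_to_port):
        self.mac_to_port = mac_to_port

    def add_flow(self, switch, dst_mac, out_port):
        """Install a flow entry so later packets skip the controller."""
        switch.flow_table[dst_mac] = out_port

    def packet_in(self, switch, dst_mac):
        out_port = self.mac_to_port.get(dst_mac)
        if out_port is None:
            return "FLOOD"  # unknown destination: flood, install nothing
        self.add_flow(switch, dst_mac, out_port)
        return out_port


controller = Controller({"00:00:00:00:00:02": 2})
switch = Switch(controller)
ports = [switch.receive("00:00:00:00:00:02") for _ in range(3)]
```

In this model only the first of the three packets reaches the controller; the remaining two are forwarded from the installed entry, which is the behavior the revised Step 2 text attributes to add_flow.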

 

We appreciate the suggestion regarding the lack of elaboration on actual enhancements to Ryu components. To clarify our improvements over the simple_switch_13 example, we have added the following:

“Compared with the original simple_switch_13, the improved module el_monitor adds the following capabilities:

  1. A flow statistics polling function (triggered every 5 seconds) for dynamic monitoring.
  2. Flow classification logic based on both bandwidth and duration.
  3. Real-time console output of flow types, IP addresses, and protocol details.
  4. Integration with rerouting logic for dynamic path adjustment.”

Regarding the new path configuration, this refers to rewriting flow table rules on all switches in the network to forward elephant flows along alternative paths. This was not well explained in the original manuscript. We have supplemented this with a more precise algorithm description:

“Note: The step Configure new traffic path n refers to updating flow table entries on all switches to redirect the elephant flow to a different ECMP path. This involves specifying new output ports and destination rules in the flow entries for each involved switch.”

We have also clarified the scope of while true: all path_n():

“Note: The lines while true: and all path_n() represent the main loop of the scheduling process, not part of the path_n() function.”
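One way to picture the path-selection part of this scheduling step is the sketch below. It is not the authors' algorithm: `pick_spine` and `reroute` are hypothetical helpers, the least-loaded rule is an assumption, and the sketch only chooses a new spine and updates bookkeeping, whereas the real step would also rewrite flow entries on the affected switches.

```python
def pick_spine(spine_load_bps, current_spine, flow_rate_bps):
    """Return the least-loaded alternative spine, or the current one.

    The flow moves only if the target spine, after receiving the flow,
    would still carry less load than the current spine does now.
    """
    candidates = {s: load for s, load in spine_load_bps.items()
                  if s != current_spine}
    if not candidates:
        return current_spine
    # Break ties by name so the choice is deterministic.
    best = min(candidates, key=lambda s: (candidates[s], s))
    if candidates[best] + flow_rate_bps < spine_load_bps[current_spine]:
        return best
    return current_spine


def reroute(flow_paths, spine_load_bps, flow_id, flow_rate_bps):
    """Move one elephant flow and update the per-spine load bookkeeping."""
    old = flow_paths[flow_id]
    new = pick_spine(spine_load_bps, old, flow_rate_bps)
    if new != old:
        spine_load_bps[old] -= flow_rate_bps
        spine_load_bps[new] += flow_rate_bps
        flow_paths[flow_id] = new
    return new


# Two 4 Mbps elephants share spine1; spine3 already carries 2 Mbps.
spine_load = {"spine1": 8_000_000, "spine2": 0, "spine3": 2_000_000}
flow_paths = {"f1": "spine1", "f2": "spine1"}
first = reroute(flow_paths, spine_load, "f1", 4_000_000)
second = reroute(flow_paths, spine_load, "f2", 4_000_000)
```

Here the first flow moves to the idle spine, while the second stays put because moving it would not reduce the load on its link. A main loop like the one described in the note would call `reroute` once per polling cycle for each detected elephant flow.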

We fully agree with the recommendation to include a Fat-Tree topology diagram. The Fat-Tree is a commonly used architecture in traditional data center networks and significantly differs from the Leaf-Spine structure in terms of hierarchy and redundancy. To facilitate comparison, we have prepared a diagram (Figure 1) illustrating the three-tier structure of a Fat-Tree network: Core, Aggregation, and Edge. Each pod includes multiple switches and servers.

Due to length constraints, this figure is not embedded in the main text, but it will be included as an appendix with a caption highlighting the key differences from the Leaf-Spine architecture, particularly regarding path diversity, link utilization, and scheduling flexibility.

Figure 1: Fat-Tree Network Topology

Description: The Fat-Tree uses a layered structure and relies on Spanning Tree Protocol (STP) to eliminate loops, leading to fixed inter-layer paths and limited path diversity. This results in potential bottlenecks under elephant flow conditions. In contrast, the Leaf-Spine architecture supports parallel paths and better load balancing.

The original statement regarding the scheduling limitations of Fat-Tree was indeed vague. We intended to express that Fat-Tree, when combined with STP, becomes a single-path structure, making ECMP-based scheduling infeasible. We have revised the sentence as follows:

“Due to the single-path nature enforced by the Spanning Tree Protocol in Fat-Tree structures, it is infeasible to dynamically reroute elephant flows, resulting in overburdened links and increased congestion risks.”
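The effect of spanning-tree pruning on path diversity can be checked with a short graph sketch. The assumptions are ours: the fabric uses the 6-leaf, 3-spine layout reported in the responses, the spanning tree is a plain breadth-first tree rather than a real STP election, and the pruning is applied to the same leaf-spine graph purely to show what a loop-free tree does to any multi-path topology.

```python
from collections import deque


def count_shortest_paths(adj, src, dst):
    """Count distinct shortest paths from src to dst with a BFS."""
    dist = {src: 0}
    ways = {src: 1}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                ways[nxt] = ways[node]
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                ways[nxt] += ways[node]
    return ways.get(dst, 0)


def spanning_tree(adj, root):
    """Keep only the links of a BFS tree rooted at root (loop-free)."""
    tree = {node: set() for node in adj}
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in sorted(adj[node]):
            if nxt not in seen:
                seen.add(nxt)
                tree[node].add(nxt)
                tree[nxt].add(node)
                queue.append(nxt)
    return tree


def leaf_spine(num_leaves, num_spines):
    """Build a full-mesh leaf-spine adjacency map."""
    leaves = [f"leaf{i}" for i in range(1, num_leaves + 1)]
    spines = [f"spine{i}" for i in range(1, num_spines + 1)]
    adj = {node: set() for node in leaves + spines}
    for leaf in leaves:
        for spine in spines:
            adj[leaf].add(spine)
            adj[spine].add(leaf)
    return adj


fabric = leaf_spine(num_leaves=6, num_spines=3)
ecmp_paths = count_shortest_paths(fabric, "leaf1", "leaf2")
stp_paths = count_shortest_paths(spanning_tree(fabric, "spine1"),
                                 "leaf1", "leaf2")
```

With all links active there is one equal-cost path per spine between the two leaves; once the graph is pruned to a tree, a single path remains, so there is nothing to reroute an elephant flow onto.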

We also appreciate the suggestion to improve figure legends. While Figures 3 and 4 already distinguish throughput results under different network structures with clear titles, we agree that adding legends would enhance readability. We will improve the graphic design and include appropriate legends in the next version.

Additionally, although we observed packet loss via Wireshark, the original manuscript only presented throughput curves. To clarify the performance metrics, we have added the following note at the end of Section 3.2:

“In addition to throughput analysis, packet loss and jitter were measured via Wireshark logs. These showed packet drops and near-zero throughput under congestion, though detailed curves are omitted due to space constraints. Future versions of the paper will include a full set of performance indicators, including packet loss rate and jitter graphs.”

Finally, we again thank the reviewer for the invaluable feedback on structure, detail, and logic. We will implement the suggested revisions to clarify module functions, algorithm structure, and experimental data, and include the necessary diagrams to enhance the clarity and coherence of the paper. Although these revisions require considerable effort, we believe they are worthwhile and greatly appreciate the reviewer’s careful review, which has greatly benefited our work.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The authors proposed a new method for elephant flow monitoring and scheduling in Software-Defined Networking (SDN) through OpenFlow-based implementation. A centralized monitoring framework based on Ryu allows them to develop an application which they tested through simulations on Mininet. Several important points should be addressed in this fascinating paper.

-- The technical content of this paper remains strong, while grammar, consistency, and formatting need attention. For example, the title should begin with “An Openflow-based…” instead of “A Openflow-based…”.

-- The abstract must present essential numerical findings together with a description of how the proposed methods advance the field.

-- The introduction needs expanded details regarding how elephant flow management plays a vital role in modern data center networks.

-- The authors should create three to five bullet points with essential contributions of their paper which should be placed at the end of the Introduction section before the last paragraph.

-- The paper requires better organization of its argument structures. The section plan should appear at the conclusion of the introduction section where it specifies that section 2 covers... while section 3 presents...

-- The literature review requires an expansion that includes recently published literature in the field which should include the following research: Attack detection analysis in software-defined networks using various machine learning method; Evaluating acceptance of a more strict plate control policy among motorcycle riders in Tehran; The role of AI and machine learning in supply chain optimization.

-- The experimental setup needs detailed description which includes all simulation parameters.

-- The study would gain value by examining how its results affect existing network management protocols.

-- The proposed methods need to emphasize their potential enhancement spots.

-- Review implementations of machine learning algorithms that can improve traffic prediction alongside proactive scheduling methods.

-- If possible, the authors should analyze how proposed approaches affect multi-tenant cloud operations because resource allocation and flow management become significantly more intricate in such environments.

Author Response

Dear Editor and Reviewers:

We are very thankful for your in-depth review. We have carefully read the comments and revised the manuscript accordingly. All modifications to the text in the new version of the manuscript are underlined in red. Our point-by-point responses to the reviewers’ comments, and the actions taken, are as follows.

Reviewer 2:

The authors proposed a new method for elephant flow monitoring and scheduling in Software-Defined Networking (SDN) through OpenFlow-based implementation. A centralized monitoring framework based on Ryu allows them to develop an application which they tested through simulations on Mininet. Several important points should be addressed in this fascinating paper.

Comments 1: The technical content of this paper remains strong, while grammar, consistency, and formatting need attention. For example, the title should begin with “An Openflow-based…” instead of “A Openflow-based…”.

Response 1: Thank you for the suggestion. We have revised the grammar throughout the manuscript to improve clarity and consistency, and the title now reads “An OpenFlow-based Elephant-Flow Monitoring and Scheduling Strategy in SDN.”

 

Comments 2: The abstract must present essential numerical findings together with a description of how the proposed methods advance the field.

Response 2: Thank you for the suggestion. We have revised the abstract to highlight the contributions of the proposed scheme more clearly, including the observed throughput stability (e.g., negligible packet loss at 8 Mbps under the Leaf-Spine topology) and the identified congestion points (e.g., repeated drops to 0 Mbps in the Fat-Tree topology).

The detailed change to the abstract is as follows:

“Specifically, in the Leaf-Spine topology, the network throughput stabilized around 8 Mbps with minimal fluctuation and no congestion over a 120-second test, compared to multiple throughput drops to 0 Mbps under the Fat-Tree topology.”

 

Comments 3: The introduction needs expanded details regarding how elephant flow management plays a vital role in modern data center networks.

Response 3: Thank you for the suggestion. We have revised the introduction to elaborate on the role of elephant flow management in modern data center networks. The added text is as follows:

“Although elephant flows account for only a small fraction of total traffic, they consume a disproportionately large amount of bandwidth and directly cause performance bottlenecks and congestion in data center environments. This highlights the dynamic and bursty nature of data center traffic, where applications such as big data analytics, distributed storage, and real-time services frequently generate elephant flows, challenging traditional static routing and load balancing mechanisms.”

 

Comments 4: The authors should create three to five bullet points with essential contributions of their paper which should be placed at the end of the Introduction section before the last paragraph.

Response 4: Thank you for the suggestion. We have added the following to the revised manuscript:

“The main contributions of this work are concluded as:

  1. We designed and implemented an SDN-based elephant flow monitoring system using the Ryu controller. This system classifies traffic based on duration and bandwidth.
  2. We proposed a polling-based dynamic elephant flow scheduling algorithm that performs path rerouting across equal-cost multipath in Leaf-Spine topologies, avoiding congestion typical in traditional fat-tree structures.
  3. We conducted simulations in Mininet and iperf to validate our approach, demonstrating that our proposed strategy achieves stable throughput (8 Mbps) and zero packet loss under load, outperforming traditional scheduling methods in stability and link utilization.”

 

Comments 5: The paper requires better organization of its argument structures. The section plan should appear at the conclusion of the introduction section where it specifies that section 2 covers... while section 3 presents...

Response 5: Thank you for the suggestion. We have revised the last paragraph of the introduction as follows:

“This paper follows this organization: Section 2 introduces the design and implementation of our proposed elephant flow monitoring and scheduling strategy. Section 3 presents the experimental setup, network simulation environment, and evaluation results. Section 4 concludes the paper and discusses directions for future research.”

 

Comments 6: The literature review requires an expansion that includes recently published literature in the field which should include the following research: Attack detection analysis in software-defined networks using various machine learning method; Evaluating acceptance of a more strict plate control policy among motorcycle riders in Tehran; The role of AI and machine learning in supply chain optimization.

Response 6: Thank you for the suggestion. We have revised the introduction as follows.

Attack detection in SDN using machine learning: We reviewed and discussed recent contributions that explore intelligent detection frameworks for security threats in SDN environments using machine learning classifiers, such as decision trees, random forests, and deep learning approaches. These studies provide insight into flow analysis and real-time pattern recognition that indirectly support flow classification strategies, such as those used in our proposed method. We have added the following paragraph in the introduction section to reflect these updates (highlighted in red in the revised manuscript):

“Recent research has significantly advanced the use of machine learning (ML) for intelligent detection and optimization in diverse domains, including SDN and supply chain management. In the context of SDN, several studies have explored the use of ML for attack detection and traffic management. For example, Rahman et al. discussed machine learning classifiers such as decision trees, random forests, and support vector machines for detecting Distributed Denial of Service (DDoS) attacks in SDN environments[17]. These intelligent detection frameworks enhance the analysis of flow behavior and real-time pattern recognition, which indirectly support our approach to dynamic flow classification and congestion management in SDNs.”

AI/ML in supply chain optimization: We acknowledged recent advances in AI-driven supply chain management to draw a parallel with how machine learning and intelligent decision-making can optimize resource allocation and routing. We have added the following paragraph in the introduction section to reflect these updates (highlighted in red in the revised manuscript):

“Furthermore, in supply chain optimization, ML techniques have proven to be transformative. Recent studies have highlighted AI’s role in supply chain disruption management, focusing on how real-time data analytics, including AI and blockchain technologies, improve operational resilience and decision-making[18]. This is particularly relevant for our approach to elephant flow scheduling in SDNs, where intelligent decision-making can optimize the allocation of network resources and improve overall system performance.”

 

 

Comments 7: The experimental setup needs detailed description which includes all simulation parameters.

Response 7: Thank you for the suggestion. We have added the following description and parameter table to the revised manuscript:

“To ensure consistency in the emulation, the experiments employed a Leaf-Spine topology comprising 9 switches, with 6 leaf switches and 3 spine switches. This structure provided multiple equal-cost paths to support robust traffic distribution. The iperf tool generated a mix of elephant and mouse flows under varying load conditions. This study identified elephant flows as those that either transferred more than 10 MB of data or maintained a sustained throughput above 1 Mbps. The Ryu SDN controller managed the network and actively polled flow statistics every 5 seconds to detect and respond to emerging elephant flows. Each network link operated at 10 Gbps, with latency configured at 1 millisecond between edge and leaf switches and 2 milliseconds between leaf and spine switches, closely reflecting realistic data center conditions. Each switch port maintained a queue size of 1000 packets to simulate typical buffering behavior. Every simulation lasted for 120 seconds to capture both transient effects and steady-state traffic dynamics.”

 

| Parameter | Value / Description |
| --- | --- |
| Network Topology | Leaf-Spine |
| Total Number of Switches | 9 |
| Leaf Switches | 6 |
| Spine Switches | 3 |
| Traffic Generation Tool | iperf |
| Traffic Types | Mix of elephant flows and mice flows |
| Elephant Flow Definition | Transfer > 10 MB or sustained throughput > 1 Mbps |
| SDN Controller | Ryu |
| Flow Statistics Polling Interval | Every 5 seconds |
| Link Bandwidth | 10 Gbps |
| Link Latency (Edge ↔ Leaf) | 1 millisecond |
| Link Latency (Leaf ↔ Spine) | 2 milliseconds |
| Switch Port Queue Size | 1000 packets |
| Simulation Duration | 120 seconds |
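The elephant-flow definition and polling interval listed above can be turned into a small classification sketch. The thresholds and the 5-second interval come from the response; everything else is an assumption made for illustration: `classify` is a hypothetical helper, 10 MB is read as decimal megabytes, and "sustained throughput" is approximated by the rate over a single polling interval.

```python
ELEPHANT_BYTES = 10 * 1000 * 1000  # 10 MB, decimal megabytes assumed
ELEPHANT_RATE_BPS = 1_000_000      # 1 Mbps
POLL_INTERVAL_S = 5                # flow statistics polling interval


def classify(prev_bytes, curr_bytes, interval_s=POLL_INTERVAL_S):
    """Label a flow from two consecutive byte counters.

    Returns (label, rate_bps), where rate_bps is the average rate over
    the last polling interval.
    """
    delta_bytes = curr_bytes - prev_bytes
    rate_bps = delta_bytes * 8 / interval_s
    if curr_bytes > ELEPHANT_BYTES or rate_bps > ELEPHANT_RATE_BPS:
        return "elephant", rate_bps
    return "mouse", rate_bps


fast_flow = classify(0, 5_000_000)            # 5 MB in 5 s
small_flow = classify(0, 100_000)             # 100 kB in 5 s
large_slow_flow = classify(10_500_000, 10_500_100)  # large total, low rate
```

In a controller application the two counters would come from successive flow-statistics replies for the same flow entry; a stricter reading of "sustained" would require the rate condition to hold over several consecutive intervals.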

 

Comments 8: The study would gain value by examining how its results affect existing network management protocols.

Response 8: Thank you for the suggestion. We have revised the conclusion accordingly. Specifically, we elaborated on the following points:

Impact on Traditional Protocols (ECMP and OSPF)

Experimental results demonstrate that our SDN-based elephant flow scheduling strategy mitigates the shortcomings of Equal-Cost Multi-Path (ECMP) routing, which performs static and hash-based load distribution. By dynamically rerouting elephant flows based on real-time traffic conditions, our approach outperforms ECMP in both throughput consistency and congestion reduction.
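The limitation of static, hash-based ECMP mentioned here can be illustrated with a minimal sketch. CRC32 over the 5-tuple is only a stand-in chosen for this example (real switches use vendor-specific hash functions), and `ecmp_path` is a hypothetical helper.

```python
import zlib


def ecmp_path(flow, num_paths):
    """Static ECMP: hash the flow's 5-tuple onto one of num_paths paths."""
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % num_paths


# Four long-lived flows (src IP, dst IP, protocol, src port, dst port)
# competing for three equal-cost paths.
flows = [("10.0.0.1", "10.0.0.2", 6, 5001 + i, 5201) for i in range(4)]
assignment = {flow: ecmp_path(flow, 3) for flow in flows}
paths_used = set(assignment.values())
```

Because the mapping depends only on header fields, a flow keeps the same path no matter how loaded that path becomes, and with four flows on three paths at least two of them must share a link. A controller that reroutes by measured load is not bound by this fixed mapping.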

Enhancement of QoS Policies

Our findings indicate that integrating real-time flow classification and path adaptation into Quality of Service (QoS) frameworks improves service quality.

Complementary Flow Aggregation Techniques

Existing protocols typically rely on coarse-grained flow aggregation. Our system adds value by enabling fine-grained, flow-level control without overwhelming the controller, thereby bridging the gap between scalability and precision in current SDN traffic engineering solutions.

“The proposed elephant flow scheduling approach demonstrates notable performance improvements and offers valuable implications for existing network management protocols. By comparing traditional mechanisms such as ECMP and OSPF, we highlight the advantages of SDN-based dynamic flow rerouting in enhancing throughput consistency and mitigating congestion. Moreover, the method can effectively complement Quality of Service frameworks by enabling real-time flow classification and adaptive path selection. Its fine-grained control capabilities also address the limitations of coarse-grained flow aggregation, offering a practical path forward for refining future network protocol designs.”

 

Comments 9: The proposed methods need to emphasize their potential enhancement spots.

Response 9: Thank you for the suggestion. We have added the following points at the end of the conclusion in the revised manuscript.

Dynamic Threshold Adjustment

The current implementation employs fixed thresholds for elephant flow identification. Enhancements could incorporate context-aware threshold mechanisms responding to network traffic conditions, enhancing adaptability across varying operational scenarios.

Advanced Predictive Analysis

Our framework currently utilizes straightforward polling and classification techniques. Integrating sophisticated analytical models could anticipate flow characteristics, allowing for preemptive traffic redirection before bottlenecks materialize.

Expanded Validation Protocol

The evaluation framework relied primarily on controlled testing environments. Broadening experimental validation to encompass production-grade infrastructure would yield more comprehensive insights regarding performance characteristics under complex workloads.

 

Comments 10: Review implementations of machine learning algorithms that can improve traffic prediction alongside proactive scheduling methods.

Response 10: Thank you for the suggestion. We have added the following at the end of the conclusion in the revised manuscript:

“Future improvements include adopting adaptive thresholding and integrating machine learning to enhance flow prediction and responsiveness, as well as extending evaluation to large-scale deployments for better scalability insights.”

 

Comments 11: If possible, the authors should analyze how proposed approaches affect multi-tenant cloud operations because resource allocation and flow management become significantly more intricate in such environments.

Response 11: Thank you for the suggestion. We have revised the conclusion accordingly, covering the following points.

Tenant Isolation

Our SDN-based elephant flow scheduling mechanism supports flow-level granularity, which can be extended to enforce tenant-specific policies. By associating flows with tenant identifiers, the controller can ensure isolation and fair resource sharing among tenants.

Dynamic Resource Allocation

In multi-tenant scenarios where workloads are elastic and fluctuate frequently, our real-time monitoring approach provides the flexibility to adapt path decisions on the fly, thereby reducing the risk of one tenant’s elephant flows degrading others' performance.

Scalability Considerations

We also discussed potential scalability challenges. For example, the increased number of flows in a multi-tenant cloud may introduce additional control-plane overhead. To mitigate this, distributed SDN controller architectures or hierarchical flow management can be adopted.

“Additionally, applying our approach in multi-tenant cloud environments introduces new challenges in policy isolation, dynamic resource sharing, and control overhead, which merit further investigation.”

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

The paper presents a methodology of monitoring and scheduling in SDN. The results are presented for a particular experimental design shown in Figure 2.

Many questions remain and should be addressed:

(1) why is this set-up better than other alternatives? There is no numerical comparison with any of the methods mentioned in the introduction, even a single one

(2) Is it possible to consider a simple queueing model - for example, assume that traffic contains only two classes: elephant and mouse traffic, with some parameters. Make simplifying assumptions that each node can be modelled as an M/M/1 queue with two traffic classes. Perhaps such a model can be used to at least get a qualitative comparison between a conventional Fat-Tree and the proposed Leaf Spine.

(3) without such comparison, the paper has very little value 

Author Response

Dear Editor and Reviewers:

We are very thankful for your in-depth review. We have carefully read the comments and revised the manuscript accordingly. All modifications to the text in the new version of the manuscript are underlined in red. Our point-by-point responses to the reviewers’ comments, and the actions taken, are as follows.

Reviewer 3:

The paper presents a methodology of monitoring and scheduling in SDN. The results are presented for a particular experimental design shown in Figure 2.

Many questions remain and should be addressed:

Comments 1: why is this set-up better than other alternatives? There is no numerical comparison with any of the methods mentioned in the introduction, even a single one.

Response 1:

Thanks for your constructive suggestion. We agree that providing numerical comparisons with existing methods would significantly enhance the scientific rigor and persuasiveness of our work. However, in this work, due to differences in implementation environments, evaluation metrics, and architectural assumptions across the referenced studies, it was not feasible to perform a fair and consistent numerical comparison within the scope of our current experimental framework.

Our proposed method offers distinct advantages over the alternatives discussed in the introduction. It operates directly on the native OpenFlow protocol, requiring no additional hardware components or protocol layers, which simplifies deployment and ensures better compatibility with existing SDN infrastructure. Moreover, the approach utilizes the equal-cost multipath feature inherent in the Leaf-Spine architecture, enabling efficient load balancing and improving network resource utilization. This design aligns with the flattening trend of modern data center networks, where scalability, low latency, and high throughput are critical.

Regarding the lack of direct numerical comparison with methods in references [8-14], there are several reasons:

  1. Experimental Context Mismatch: Most referenced methods are implemented and evaluated under different simulation environments, controller platforms (e.g., POX, ODL), or hardware-assisted testbeds, which makes a fair, side-by-side quantitative comparison infeasible within our current setup.
  2. Diverse Evaluation Metrics: Some of the cited works focus on packet-level optimization, others emphasize machine learning-based prediction, or are designed for TCP-specific flows, while our method targets real-time elephant flow classification and dynamic rerouting across multiple paths using a polling strategy.
  3. Different Topological Assumptions: A number of these approaches are specifically tailored to Fat-Tree architectures, and thus their performance characteristics (e.g., path length, congestion points) differ fundamentally from the Leaf-Spine-based context of our work.

Indeed, the absence of such comparative analysis is a limitation of our current study. In future work, we plan to perform a more comprehensive and rigorous evaluation, including reimplementing selected baseline algorithms under a unified simulation framework (e.g., Mininet + Ryu), and comparing them with our proposed strategy in terms of throughput, packet loss rate and control overhead, across both Fat-Tree and Leaf-Spine topologies.

 

Comments 2: Is it possible to consider a simple queueing model - for example, assume that traffic contains only two classes: elephant and mouse traffic, with some parameters. Make simplifying assumptions that each node can be modelled as an M/M/1 queue with two traffic classes. Perhaps such a model can be used to at least get a qualitative comparison between a conventional Fat-Tree and the proposed Leaf Spine.

Response 2: Thank you for your valuable suggestion, which will significantly enhance the depth and completeness of our research. Modeling each node as an M/M/1 queue for elephant and mouse flows is entirely feasible, and it would allow qualitative comparisons between the traditional Fat-Tree and our proposed Leaf-Spine architecture without relying extensively on experiments.

In our subsequent research, we will develop such a queuing model, assuming different arrival and service rate parameters for elephant and mouse flows. By calculating key performance metrics including average queue length, average waiting time, and packet loss probability, we can theoretically compare how both architectures perform under various traffic loads.
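As a rough illustration of the kind of model discussed here, the following sketch treats a single node as an M/M/1 queue fed by two Poisson arrival classes. It assumes, purely for simplicity, that elephant and mouse packets share the same exponential service rate (with class-specific service rates the aggregate would no longer be M/M/1). The function name and the numerical rates are hypothetical, and packet loss probability is not covered because it would require a finite-buffer variant such as M/M/1/K.

```python
def mm1_two_class(lam_elephant, lam_mouse, mu):
    """Steady-state metrics of an M/M/1 queue with two Poisson arrival classes.

    Assumption: both classes are served at the same exponential rate mu,
    so the merged arrival stream is Poisson with rate lam_elephant + lam_mouse.
    """
    lam = lam_elephant + lam_mouse
    if lam >= mu:
        raise ValueError("unstable queue: total arrival rate must be below mu")
    rho = lam / mu                      # server utilization
    mean_in_system = rho / (1 - rho)    # L  = rho / (1 - rho)
    mean_sojourn = 1 / (mu - lam)       # W  = 1 / (mu - lambda)
    mean_wait = rho / (mu - lam)        # Wq = rho / (mu - lambda)
    return {"rho": rho, "L": mean_in_system, "W": mean_sojourn, "Wq": mean_wait}


# Hypothetical rates in packets per unit time, chosen only for illustration.
metrics = mm1_two_class(lam_elephant=2.0, lam_mouse=6.0, mu=10.0)
```

Evaluating such a function per node, with arrival rates that depend on how each topology spreads traffic, is one possible way to obtain the qualitative comparison the reviewer asks for.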

 

Comments 3: without such comparison, the paper has very little value.

Response 3: Thanks for your suggestions. We recognize this deficiency and will incorporate two key elements in our future work.

First, we will run numerical comparative experiments against representative methods to demonstrate the strengths and limitations of our approach. Second, as you previously suggested, we will introduce a simple queuing model for theoretical analysis to illustrate mathematically the performance differences between Fat-Tree and Leaf-Spine architectures when handling elephant flows.

We appreciate your identification of this critical issue, which proves essential for enhancing the quality of our research.

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors
  • The paper proposes an SDN-based strategy for monitoring and scheduling elephant flows, but it heavily relies on existing frameworks (e.g., Ryu, Mininet) and concepts (e.g., Leaf-Spine, Fat-Tree topologies). It fails to demonstrate a significant departure from prior work, such as load balancing mechanisms or ECMP-based scheduling [8, 9, 15].
  • The methodology for monitoring and scheduling elephant flows is vaguely described. The paper outlines general steps (e.g., flow table collection, polling strategy) but lacks detailed algorithms, parameter justifications, or implementation specifics.
  • The related works section is superficial, summarizing existing studies [8-14] without critically analyzing their limitations or positioning the proposed work as a novel solution.
  • The experimental results are limited to throughput and packet loss comparisons between Fat-Tree and Leaf-Spine topologies, lacking depth in performance metrics (e.g., scalability, computational overhead, or fault tolerance)
  • The evaluation is narrow, focusing only on a 120-second test with a single elephant flow scenario (9Mbps). It does not explore diverse traffic patterns, network sizes, or real-world conditions.

Author Response

Dear Editor and Reviewers:

 

 

We are very thankful for your in-depth review. We have carefully read and followed up on the comments to revise the manuscript. All the modifications made to the text in the new version of the manuscript are underlined in red. Our point-by-point responses and actions to the reviewers’ comments are as follows.

Reviewer 4:

Comments 1: The paper proposes an SDN-based strategy for monitoring and scheduling elephant flows, but it heavily relies on existing frameworks (e.g., Ryu, Mininet) and concepts (e.g., Leaf-Spine, Fat-Tree topologies). It fails to demonstrate a significant departure from prior work, such as load balancing mechanisms or ECMP-based scheduling [8, 9, 15].

Response 1: Thank you for your suggestion. Our research does rely on existing frameworks (Ryu, Mininet) and known network topologies (Leaf-Spine, Fat-Tree). However, we believe it still offers meaningful value and innovation.

Firstly, our work organically integrates SDN technology with Leaf-Spine architecture to present a comprehensive elephant flow monitoring and scheduling solution. Although individual technical components may not be entirely novel, our integrated approach provides a practical and effective solution.

Secondly, experimental results clearly demonstrate the advantages of this approach in improving network performance stability and resource utilization, particularly for elephant flow scenarios in data centers.

 

Comments 2: The methodology for monitoring and scheduling elephant flows is vaguely described. The paper outlines general steps (e.g., flow table collection, polling strategy) but lacks detailed algorithms, parameter justifications, or implementation specifics.

Response 2: Thank you for your suggestion. Our research presents an elephant flow monitoring and scheduling framework, encompassing critical steps such as flow table collection, traffic classification, and path selection. Although these methods showed some effectiveness in our experiments, we acknowledge that the implementation details and parameter selection criteria must be articulated more clearly. This applies particularly to traffic threshold settings, sampling interval determination, and the specific algorithms for polling strategies. To expedite deployment, we utilized the default configurations of the Ryu controller and OpenFlow protocol.

Nevertheless, our work achieved effective elephant flow management in a Leaf-Spine architecture, with experimental results demonstrating practical effects on network performance and resource utilization. We believe this research direction holds potential to provide reference solutions for data center network traffic management.

We appreciate your valuable feedback, which will guide us toward greater methodological rigor and descriptive thoroughness in future work.

 

Comments 3: The related works section is superficial, summarizing existing studies [8-14] without critically analyzing their limitations or positioning the proposed work as a novel solution.

Response 3: Thanks for your suggestion. We have made the corresponding modifications in the revised manuscript.

[8] Proposes an SDN-based load balancing mechanism for elephant flows. By leveraging the controller’s global network view, it dynamically evaluates link states and splits elephant flows proportionally across multiple paths to alleviate congestion and improve throughput and link utilization. Validation on an OpenFlow testbed demonstrates notable performance gains. However, the strategy depends on ratio-based link evaluations, which may be insufficiently responsive under highly dynamic or bursty traffic, and may introduce packet reordering due to multipath forwarding.

[9] Introduces the ADU mechanism, which triggers Packet_in events by forging source IP addresses at the host side, allowing the controller to detect elephant flows without consuming switch resources directly. This improves detection speed and accuracy, making it suitable for distributed scenarios. However, it requires additional deployment on hosts, increasing system management complexity, and may face limitations in security-sensitive environments due to IP spoofing.

[10] Proposes the DPLBAnt algorithm, which applies ant colony optimization for path selection and rerouting of detected elephant flows. A dual-classifier structure between the controller and switches enhances detection efficiency and reduces controller overhead. The algorithm outperforms traditional methods in terms of latency, throughput, and packet loss in complex networks. Nonetheless, its computational cost is high, potentially affecting real-time performance under frequent routing updates and placing greater demands on controller capabilities.

[11] Evaluates several elephant flow prediction models using real traffic from a Facebook data center and proposes a FARIMA-RNN hybrid model. This model achieves better short-term prediction accuracy than ARIMA and LSTM, supporting traffic scheduling and load balancing. Despite its effectiveness, the method is highly dependent on the quality of historical data and is less responsive to bursty or irregular traffic, with significant training costs.

[12] Summarizes a decade of Google’s data center network evolution, presenting an architecture built on multi-tier Clos topology and centralized control. It reduces costs by using commodity switches and simplifies operations through unified configuration, achieving scalability up to 1 Pbps. While offering excellent scalability and cost-efficiency, the design relies heavily on centralized control, which may limit fault tolerance and complicate coordination in case of controller failure.

[13] Introduces a machine learning-based knowledge plane in leaf-spine data center architectures, integrated with SDN controllers and southbound interfaces. It predicts traffic variations using historical link data and automatically adjusts the network structure to enable elastic scaling and performance optimization. The system shows good adaptability in simulations, but its effectiveness depends on the accuracy of the prediction model, which may degrade under insufficient training data or model failure.

[14] Conducts high-fidelity simulations to analyze how topology, link rate, and buffer size affect data path performance in leaf-spine networks. The results provide quantitative guidance for balancing performance and cost through appropriate parameter tuning. However, the study is based on simulations and lacks validation in large-scale real-world deployments, with limited exploration of complex parameter interactions.

In this context, this work proposes an OpenFlow-based monitoring and scheduling strategy for elephant flows, integrating SDN centralized control with the multipath benefits of the leaf-spine topology. The Ryu controller periodically collects flow table data from switches to enable efficient flow identification and dynamic scheduling. Simulations on the Mininet platform, compared with a traditional Fat-Tree structure, demonstrate clear advantages in throughput stability, link utilization, and congestion control, offering a new approach for traffic management in large-scale networks.
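To make the periodic collection step more concrete, the sketch below shows one way per-flow throughput could be estimated from two successive snapshots of cumulative byte counters, such as the `byte_count` field reported in OpenFlow flow statistics replies. It is not the authors' implementation: the function name, the flow keys, and the 5-second polling interval are assumptions made for illustration.

```python
def flow_rates_bps(prev, curr, interval_s):
    """Estimate per-flow throughput from two successive counter snapshots.

    prev, curr: dicts mapping a flow key to its cumulative byte count.
    interval_s: time between the two snapshots, in seconds.
    Returns a dict mapping each flow key in curr to a rate in bits per second.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    rates = {}
    for key, byte_count in curr.items():
        delta = byte_count - prev.get(key, 0)
        if delta < 0:
            # Counter went backwards, e.g. the flow entry was reinstalled;
            # fall back to the bytes counted since the reset.
            delta = byte_count
        rates[key] = delta * 8 / interval_s
    return rates


# Hypothetical snapshots taken 5 seconds apart.
previous = {("10.0.0.1", "10.0.0.2"): 1_000_000}
current = {("10.0.0.1", "10.0.0.2"): 7_250_000,
           ("10.0.0.3", "10.0.0.4"): 50_000}
rates = flow_rates_bps(previous, current, interval_s=5)
```

In a controller application, the snapshots would come from the switches' statistics replies rather than from hand-written dictionaries.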

 

Comments 4: The experimental results are limited to throughput and packet loss comparisons between Fat-Tree and Leaf-Spine topologies, lacking depth in performance metrics (e.g., scalability, computational overhead, or fault tolerance).

Response 4: Thanks for your suggestion. We primarily focused on throughput and packet loss as basic metrics to compare Fat-Tree and Leaf-Spine topologies without offering a more comprehensive performance evaluation. However, our experiments verified the effectiveness of the proposed solution in terms of basic network performance, particularly highlighting the advantages of Leaf-Spine architecture when handling elephant flows. Despite the limited experimental scope, these preliminary results validate the value of combining SDN with Leaf-Spine architecture.

 

Comments 5: The evaluation is narrow, focusing only on a 120-second test with a single elephant flow scenario (9 Mbps). It does not explore diverse traffic patterns, network sizes, or real-world conditions.

Response 5: Thanks for your suggestion. Our experiments demonstrate and validate our proposed solution's effectiveness under specific conditions, and our preliminary results demonstrate the potential of SDN controllers to manage elephant flows in Leaf-Spine architectures. These findings establish a valuable foundation for further research regarding traffic scheduling in multipath environments. In the future, more comprehensive testing protocols incorporating diverse traffic patterns, larger network topologies, and practical scenarios will be considered.

Author Response File: Author Response.pdf

Reviewer 5 Report

Comments and Suggestions for Authors

This paper addresses a timely and important challenge in data center networking: the identification and scheduling of elephant flows in Software-Defined Networking (SDN) environments. The authors propose a monitoring system that dynamically detects large flows (elephant flows) and reroutes them across available paths. 

The authors outline their experimental environment, detailing how elephant flows and mouse flows are simulated and measured. Furthermore, they show a comparison between network behaviors under Fat-Tree and Leaf-Spine architectures. The rerouting strategy using a polling mechanism across equal-cost multipaths, even though basic, outlines some improvement in throughput stability and a reduction in packet loss.

The elephant flow detection mechanism relies on static thresholds of flow duration and bandwidth, which are somewhat simplistic and potentially brittle under real-world, dynamic traffic patterns. Moreover, the rerouting algorithm is essentially a static polling or round-robin method that does not account for actual link utilization, existing traffic loads, or network latency variations. In highly dynamic or bursty environments, this could still lead to inefficient path selections and localized congestion. The paper would have benefitted from exploring more adaptive or intelligent scheduling strategies that adjust routing decisions based on live network conditions.

Another major weakness is the limited experimental scope. The paper primarily measures throughput and basic packet loss but does not consider other critical performance metrics like end-to-end latency, flow completion times, or scalability in larger networks. In addition, while the experimental validation is well-described, comparisons are made only against the Fat-Tree STP setup without benchmarking against other known elephant flow management solutions. 

In the proposed strategy, elephant flow scheduling is currently based on a polling mechanism to distribute large flows across multiple available equal-cost paths in the Leaf-Spine network. While this simple approach improves network resource utilization compared to traditional Fat-Tree structures, it lacks awareness of dynamic traffic conditions, path interdependencies, and overall network congestion states. As data center networks grow larger and more complex, relying solely on static or round-robin path selection may lead to suboptimal load balancing, localized congestion, and inefficient use of available bandwidth resources.

To address these limitations, the integration of intelligent search methods and advanced combinatorial meta-heuristics presents a promising future direction. Techniques such as constraint-based local search, genetic algorithms, and other evolutionary heuristics can optimize flow placement by considering multiple objectives, such as minimizing maximum link utilization, maintaining flow disjointness, and respecting delay constraints. All in all, it is essential to address this aspect. To this end, I’d recommend checking “Arbelaez et al. A constraint-based parallel local search for the edge-disjoint rooted distance-constrained minimum spanning tree problem. J. Heuristics 24(3): 359-394 (2018)” This work enables the exploration of complex solution spaces to build edge-disjoint, distance-constrained spanning trees, which could be directly adapted for smarter elephant flow scheduling in SDN environments. 

These advanced meta-heuristic algorithms might allow higher throughput and stability and also more scalable and adaptive network management as traffic patterns evolve dynamically.



Comments on the Quality of English Language

The English quality of the paper is somewhat poor. Some sentences are overly complex and wordy, which makes the paper harder to follow. Likewise, some sentences are too long, containing multiple ideas that could be split for better readability.

Author Response

Dear Editor and Reviewers:

 

 

We are very thankful for your in-depth review. We have carefully read and followed up on the comments to revise the manuscript. All the modifications made to the text in the new version of the manuscript are underlined in red. Our point-by-point responses and actions to the reviewers’ comments are as follows.

Reviewer 5:

This paper addresses a timely and important challenge in data center networking: the identification and scheduling of elephant flows in Software-Defined Networking (SDN) environments. The authors propose a monitoring system that dynamically detects large flows (elephant flows) and reroutes them across available paths.

Comments 1: The authors outline their experimental environment, detailing how elephant flows and mouse flows are simulated and measured. Furthermore, they show a comparison between network behaviors under Fat-Tree and Leaf-Spine architectures. The rerouting strategy using a polling mechanism across equal-cost multipaths, even though basic, outlines some improvement in throughput stability and a reduction in packet loss.

Response 1: Thanks for your comments. Our polling-based mechanism over equal-cost multipaths may appear basic regarding scheduling sophistication. Still, this work proposes a lightweight, controller-assisted, real-time scheduling method that leverages centralized control to dynamically balance elephant flows without requiring complex learning models or significant computational overhead. Despite its simplicity, this strategy yielded measurable performance improvements in throughput stability and packet loss reduction, especially under the Leaf-Spine architecture.

 

Comments 2: The elephant flow detection mechanism relies on static thresholds of flow duration and bandwidth, which are somewhat simplistic and potentially brittle under real-world, dynamic traffic patterns. Moreover, the rerouting algorithm is essentially a static polling or round-robin method that does not account for actual link utilization, existing traffic loads, or network latency variations. In highly dynamic or bursty environments, this could still lead to inefficient path selections and localized congestion. The paper would have benefitted from exploring more adaptive or intelligent scheduling strategies that adjust routing decisions based on live network conditions.

Response 2: Thanks for your suggestion. We fully acknowledge that the use of static thresholds for elephant flow detection may lack robustness in the face of dynamic and bursty real-world traffic patterns. The chosen thresholds (e.g., flows lasting more than 30 seconds with bandwidth exceeding 5 Mbps) were calibrated for our Mininet-based simulation environment, where absolute bandwidth and flow intensity are constrained. These configurations aimed to provide a controlled setting for proof-of-concept evaluation rather than complete generalization to production-scale data centers.
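The static-threshold rule described above can be sketched as follows. The two threshold values are the ones quoted in this response; the constant names, the function name, and the reading of "Mbps" as 10^6 bits per second are assumptions made for this illustration, not details confirmed by the authors.

```python
# Thresholds quoted in the response: duration above 30 s, bandwidth above 5 Mbps.
DURATION_THRESHOLD_S = 30
RATE_THRESHOLD_BPS = 5_000_000  # assumes 1 Mbps = 10**6 bits per second


def is_elephant(duration_s, rate_bps):
    """Return True when a flow exceeds both static thresholds."""
    return duration_s > DURATION_THRESHOLD_S and rate_bps > RATE_THRESHOLD_BPS


# Hypothetical flows as (duration in seconds, rate in bits per second).
flows = {
    "bulk_transfer": (120, 9_000_000),
    "short_burst": (4, 9_000_000),
    "slow_stream": (300, 200_000),
}
elephants = sorted(name for name, (d, r) in flows.items() if is_elephant(d, r))
```

Because both conditions are fixed constants, a short burst at a high rate or a long flow at a low rate is never classified as an elephant flow, which is the brittleness the reviewer points out.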

For the rerouting strategy, our work adopts a simple round-robin (polling-based) approach that does not explicitly reference live link metrics, in order to keep scheduling in SDN lightweight, deterministic, and easily reproducible. Our method avoids the limitations of static ECMP hashing, but lacks the real-time responsiveness that more sophisticated traffic engineering approaches (e.g., load-aware, latency-sensitive scheduling) can provide. Adaptive or intelligent rerouting could substantially improve path selection under fluctuating traffic loads and is a promising direction for future extensions to our work.
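A minimal sketch of round-robin path assignment is given below, assuming each detected elephant flow is pinned to the next equal-cost path in turn. The class name, method name, and path labels are hypothetical and are not taken from the manuscript or its code repository.

```python
from itertools import cycle


class RoundRobinPathSelector:
    """Assign each new elephant flow to the next equal-cost path in turn."""

    def __init__(self, paths):
        if not paths:
            raise ValueError("at least one path is required")
        self._next_path = cycle(paths)
        self._assigned = {}

    def path_for(self, flow_id):
        # A flow keeps its path once assigned, so rerouting happens only once
        # per flow and repeated lookups are deterministic.
        if flow_id not in self._assigned:
            self._assigned[flow_id] = next(self._next_path)
        return self._assigned[flow_id]


# Hypothetical Leaf-Spine fabric with two spine switches.
selector = RoundRobinPathSelector(["via_spine1", "via_spine2"])
assignments = [selector.path_for(f) for f in ("flowA", "flowB", "flowC", "flowA")]
```

The selector never inspects link load, so two heavy flows can still share a path while a lighter path stays underused.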

 

Comments 3: Another major weakness is the limited experimental scope. The paper primarily measures throughput and basic packet loss but does not consider other critical performance metrics like end-to-end latency, flow completion times, or scalability in larger networks. In addition, while the experimental validation is well-described, comparisons are made only against the Fat-Tree STP setup without benchmarking against other known elephant flow management solutions. 

Response 3:

Thanks for your suggestion. A comprehensive evaluation across broader performance metrics is not included in this work. Our primary objective was to validate the feasibility and effectiveness of a polling-based scheduling mechanism within a controllable simulation environment using Mininet and Ryu. In addition, throughput and packet loss indicate congestion levels and resource utilization during elephant flow transmission. These metrics offered clear visibility into the benefits of our rerouting strategy compared to the baseline Fat-Tree + STP scenario. We appreciate this recommendation, and our future work will include:

  • An extended experimental scope that includes latency and flow completion time (FCT) measurements under mixed traffic patterns
  • Scalability testing using extended topologies with higher node/link counts
  • Comparative analysis with recent detection and rerouting frameworks
  • Potential testbed deployment or simulation with more realistic workloads to bridge the gap between emulation and deployment scenarios

 

Comments 4: In the proposed strategy, elephant flow scheduling is currently based on a polling mechanism to distribute large flows across multiple available equal-cost paths in the Leaf-Spine network. While this simple approach improves network resource utilization compared to traditional Fat-Tree structures, it lacks awareness of dynamic traffic conditions, path interdependencies, and overall network congestion states. As data center networks grow larger and more complex, relying solely on static or round-robin path selection may lead to suboptimal load balancing, localized congestion, and inefficient use of available bandwidth resources.

Response 4: Thanks for your suggestion. Although the current polling-based scheduling approach lacks real-time awareness of traffic dynamics and congestion conditions, this work was intended as a lightweight, easily deployable mechanism within the SDN controller to demonstrate the feasibility of dynamic elephant flow rerouting in a Leaf-Spine topology. We are exploring enhancements incorporating real-time network feedback, such as link utilization metrics and congestion indicators, to enable more adaptive and efficient scheduling decisions.
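As an illustration of what a feedback-driven enhancement might look like, the sketch below picks the candidate path whose most heavily loaded link has the lowest utilization. This is only one possible load-aware rule and is not described in the manuscript; the function name, link identifiers, and utilization values are hypothetical.

```python
def least_utilized_path(paths, link_utilization):
    """Pick the path whose bottleneck link has the lowest utilization.

    paths: dict mapping a path name to the list of link ids it traverses.
    link_utilization: dict mapping a link id to a utilization in [0, 1];
    links without a measurement are treated as idle (an assumption).
    """
    if not paths:
        raise ValueError("at least one path is required")

    def bottleneck(name):
        return max((link_utilization.get(link, 0.0) for link in paths[name]),
                   default=0.0)

    # Sorting the names first makes tie-breaking deterministic.
    return min(sorted(paths), key=bottleneck)


# Hypothetical two-spine fabric with measured link utilizations.
candidate_paths = {
    "via_spine1": ["leaf1-spine1", "spine1-leaf2"],
    "via_spine2": ["leaf1-spine2", "spine2-leaf2"],
}
utilization = {
    "leaf1-spine1": 0.9, "spine1-leaf2": 0.2,
    "leaf1-spine2": 0.4, "spine2-leaf2": 0.5,
}
chosen = least_utilized_path(candidate_paths, utilization)
```

Unlike round-robin assignment, this rule depends on how fresh the utilization measurements are, so the polling interval becomes part of the scheduling behavior.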

 

Comments 5: To address these limitations, the integration of intelligent search methods and advanced combinatorial meta-heuristics presents a promising future direction. Techniques such as constraint-based local search, genetic algorithms, and other evolutionary heuristics can optimize flow placement by considering multiple objectives, such as minimizing maximum link utilization, maintaining flow disjointness, and respecting delay constraints. All in all, it is essential to address this aspect. To this end, I’d recommend checking “Arbelaez et al. A constraint-based parallel local search for the edge-disjoint rooted distance-constrained minimum spanning tree problem. J. Heuristics 24(3): 359-394 (2018).” This work enables the exploration of complex solution spaces to build edge-disjoint, distance-constrained spanning trees, which could be directly adapted for smarter elephant flow scheduling in SDN environments.

Response 5: Thanks for your suggestion. We have reviewed this work and cited it in the revised manuscript. Indeed, intelligent search techniques such as constraint-based local search, genetic algorithms, and other evolutionary heuristics offer a promising direction for optimizing flow placement across multiple objectives, including minimizing maximum link utilization and maintaining path diversity under delay constraints. We will draw on this reference in our future research to enhance the flexibility and scalability of the scheduling framework.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The authors have attempted to address most of the concerns I raised in my report. It would be worthwhile to make the technical details available in the paper, or through another channel, so that others can reproduce the reported results.

 

Author Response

We appreciate your valuable feedback. We recognize that sharing additional details, such as the complete implementation code or links to an open repository, can further aid in reproducing the results. Accordingly, the code is available at the following link: https://github.com/cmy-hhxx/el_monitor.

We hope these additions help clarify the technical aspects and enhance the reproducibility of our work.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The authors have addressed all previously raised concerns in a satisfactory manner. Therefore, I recommend acceptance of the manuscript for publication.

Author Response

We sincerely appreciate your positive feedback and are grateful for your recommendation to accept the manuscript. We are pleased that the revisions have addressed your concerns to your satisfaction. Thank you for your time and thoughtful evaluation of our work.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

I read the responses to all reviewer suggestions. The authors promise to address all of this in their future work. Under such conditions, I would recommend rejecting the paper, as a major revision will not be done.

Author Response

We thank you for your careful review and feedback. We acknowledge that some suggestions were marked for future work due to space limitations and the current scope of the study. However, we are fully committed to strengthening the current version and are prepared to incorporate the necessary revisions now, not postpone them. We respectfully request the opportunity to revise the manuscript accordingly to address your concerns in this round. Thank you again for your constructive comments.

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors

The authors revised the paper according to my suggestions. Some more comments to consider in the final version:

  • The algorithms for the el_monitor application (Section 2.1) and the polling-based scheduling (Section 2.2) are presented in pseudocode but lack detailed explanations of key steps. For example, the add_flow and packet_in_handler functions could benefit from inline comments or a step-by-step narrative to clarify their roles in traffic classification and forwarding.
  • Figures are not clear.
  • You should consider some related works and discuss in detail the differences between your work and other papers in the literature.

Author Response

Dear Editor and Reviewers:

We are very thankful for your in-depth review. We have carefully read and followed up on the comments to revise the manuscript. All the modifications made to the text in the new version of the manuscript are underlined in red. 

Author Response File: Author Response.pdf

Reviewer 5 Report

Comments and Suggestions for Authors

The authors address the majority of the comments. Nonetheless, some aspects still could be improved. While the use of static thresholds for elephant flow detection offers simplicity, it may not generalize well to dynamic and unpredictable network environments. The round-robin rerouting strategy also lacks real-time awareness of network load or latency conditions, which may lead to inefficient path utilization in certain scenarios.

Author Response

Dear Editor and Reviewers:

 

 

We are very thankful for your in-depth review. We have carefully read and followed up on the comments to revise the manuscript. All the modifications made to the text in the new version of the manuscript are underlined in red.

Author Response File: Author Response.pdf
