Over the recent decade, deployed Ethernet speeds have increased from 10 Gbps to 40 Gbps and 100 Gbps, and will move towards terabit-per-second rates in the future, as stated in the Ethernet Alliance roadmap [1]. However, the implementation and deployment of multi-gigabit applications, the optimization of their performance, and the accuracy of the transmission speed remain significant challenges [2].
As a representative network application whose performance rests on the accuracy of timing functions, we put forward the following requirements: the application shall have strict timing requirements, it must be based on high-speed data transmission, and it must have practical applicability in modern networks. Available bandwidth estimation meets these requirements, since its precision is strictly tied to the precision of timing operations.
Recent network measurement projects focusing on performance evaluation capture performance metrics such as the path capacity (maximum possible throughput) and the available bandwidth (maximum available throughput) of a network path. Estimating these parameters requires creating explicit probe packets, which consumes CPU resources and demands timers precise enough to generate accurate timestamps of probe packets under bursty traffic conditions. Moreover, this kind of application performs more than 200 time acquisition function calls per packet and about one sleep operation per issued data packet.
Because of virtualization, datacenter tenants no longer have direct access to the underlying physical network. This extra layer of indirection makes it hard to optimize application performance. AvB estimation can measure the underlying network and provide performance metrics to virtualized applications in order to optimize tasks such as VM placement and route selection [2].
Therefore, the general requirements of such time-sensitive networking applications are based on precise and rapid timing measurements: low latency, dealing with the limited performance of end systems (e.g., the frequency of kernel interrupts), adaptation of the inter-packet interval, etc. In this paper, we address this challenge and aim to evaluate how the accuracy of modern timers influences the measurement results of network performance under virtualization. For this, we have extended our previous research on timing operations to cover its performance influence on a real network application. Keeping the high time precision of control algorithms and meeting soft real-time requirements is a complex problem on modern general-purpose operating systems such as Linux. This task becomes even more challenging in virtual environments, which require efficient access to the physical platform through the software middleware of the hypervisor or host operating system [3]. Thus, the timing operations used by applications usually depend on certain software layers in the operating system. As our previous research has shown [6], even operating systems running on bare hardware are often not efficient enough for high-performance applications. Running such operations in virtual environments potentially adds significant overhead, both in the hypervisor OS passing data to the bare hardware and in the guest OS itself. However, the use of virtual machines as an intermediate layer that simplifies the deployment and operation of high-performance applications is tempting. Virtualization is widely used in data centers and cloud environments, and the possibility to run high-performance applications from within a virtual environment is an important factor for deployments in which timer performance plays a significant role. In the present work, we consider whether this is feasible and, if so, on which systems and at which cost. After all, direct access to the TSC
counter via the RDTSC instruction can reduce the CPU overhead of time acquisition and computation on Linux by avoiding expensive system calls, but precise sleep operations remain a more challenging task. The sleep procedures of the standard GNU C library have insufficient precision due to their long wake-up time, which can be aggravated further when resources are shared between the host and virtual machines (VMs). To overcome the issues with standard system calls, the HighPerTimer library has been used besides the standard timing means of the Linux OS, as an interface to the available time counters as well as for sleep operations. This library has been developed in the course of our previous research as an alternative to the Linux system timer functions; it implements timing functions by direct access to the hardware.
By applying different techniques for timekeeping on Linux machines, we intend to disclose problems that can arise for time-critical network applications running in a virtualized environment. Further, during the experiments, we investigated the impact of the timing implementation within the workflow of two AvB tools, Kite2 and Yaz, for estimating network performance. In our previous work, the accuracy of these tools in high-speed networks was compared [8]. By contrast, the target of this research is to explore which timing methods are more suitable for implementing high-speed applications in a virtual environment. The first tool, Kite2,
is our implementation of a modified AvB active probing algorithm and one of very few tools that can deal with 10 Gbps and faster links. The second one, Yaz, has proven to be an accurate tool that scales well with increasing cross traffic on links of dozens of Mbps, and it is available in open source [11]. The algorithms of Kite2 have been developed in the course of our previous data transport research [13] and use the HighPerTimer library, whereas Yaz is based on system calls for timing acquisition and rough timing error correction. For 1 Gbps links, a specific adaptation of the Yaz source code was required. Further details about that experiment setup can be found in Section 2.2.3.
A figure there shows the active probing model used by both the Yaz and Kite2 AvB tools, in which a train of probe packets is sent with a chosen inter-packet interval and the resulting inter-packet interval is observed on the receiver side. The main principle of this method is to find the optimal inter-packet interval, which is strictly dependent on the precision of timing operations. An accurately chosen inter-packet interval on the sender side allows achieving the actual available bandwidth of the network path. On the other hand, inaccurate timestamping of the packets leads to under- or overestimation, because the application is not able to match its sending data rate to the AvB. The goal of the AvB algorithm is to find the minimal sending inter-packet interval at which the difference between the receiving and sending intervals does not yet grow.
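The decision at the heart of this self-induced congestion principle can be sketched as follows. This is an illustrative simplification, not the exact Yaz or Kite2 logic, and the 5% tolerance is an assumed value:

```c
#include <stdbool.h>

/* If the receive-side inter-packet gap grows noticeably beyond the
 * send-side gap, queuing has occurred: the probing rate exceeds the
 * available bandwidth of the path. */
static bool rate_above_avb(double send_gap_s, double recv_gap_s) {
    const double eps = 0.05;                 /* 5% tolerance, assumed */
    return recv_gap_s > send_gap_s * (1.0 + eps);
}
```

An estimator would iterate over candidate intervals, e.g. `rate_above_avb(100e-6, 130e-6)` signals overload while `rate_above_avb(100e-6, 100e-6)` does not, and report the smallest interval (highest rate) that does not trigger the condition. The tolerance exists precisely because timestamping noise would otherwise be mistaken for congestion.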
Firstly, the study characterizes the timing behavior under stressful virtualization conditions, where the median wake-up time of the standard system sleep function is about 40–100 µs, which also implies the lower bound of the inter-packet interval. Applying Equation (1),

R = 8 · MTU / Δt,    (1)

where R is the data rate in bps and Δt is the inter-packet time interval in seconds, it is found that the corresponding data rate limit is 120–320 Mbps with the standard MTU size of 1500 bytes.
Replacing them with the alternative timing methods can provide a wake-up time as low as 56 ns; this increase in precision allows operating at 100 Gbps and higher, which better matches the requirements of modern Ethernet networks.
To increase network performance and synchronize the operation of applications such as video streaming, multiplayer video games, and content delivery services, it is necessary to use timing functions that are as precise as possible. The improved timing methods studied in this research (a direct call to the TSC hardware timer and the HPTSleep method from the HighPerTimer library) show half the time cost of a standard Linux system call and up to thousands of times lower miss-time values than the sleep function of the standard C library. Accordingly, the main objectives of this paper are: (a) to summarize empirical research on time acquisition and sleep methods within virtual machines; (b) to indicate the influence of timing methods on the performance and accuracy of available bandwidth estimation in multi-megabit and multi-gigabit networks; and (c) to provide recommendations for minimizing possible errors during performance measurements in a real network.
In this work, we summarize our previous research to show the efficiency of the proposed methods. The novelty of this research lies in a comprehensive analysis of how different virtualization methods impact the time and sleep measurements of two timing approaches: a direct call to the hardware timer and the standard timing system call. Furthermore, we investigate the impact of these approaches, in a virtual environment, on two real networking applications for available bandwidth estimation in middle- and high-speed networks.
The remainder of this paper is structured as follows. Section 2 reviews the materials and methods used for the experimental setup to identify which virtual environments are more or less feasible for the deployment of high-performance networking applications. In this section, the software and hardware environments, load types, and scenarios between remote hosts and VMs are described in detail. Section 3 is devoted to the results of the experiments on time acquisition, sleep function accuracy measurements, and the performance of the AvB tools in different virtual environments. Interpretation and discussion of the results can be found in Section 4.
The performed research and the achieved results led us to conclusions regarding challenges such as the selection of the timing approach for a high-performance network application, of a suitable virtualization method, and of available bandwidth estimation for networks with different path capacities. Various timer-related measurement results for the case where one virtual instance is located on the hardware allow the following conclusions: (i) While Xen and ESXi, which are both bare-metal hypervisors, have almost no base overhead in the majority of measurements, VirtualBox's access to the hardware adds the highest overhead among all virtualization platforms (QEMU, Xen, and VMware ESXi). (ii) In the case of sleep accuracy, the medians of the sleep methods prevail over the effects of the virtualization environment, except for VirtualBox, where the miss-time median values are 100 times higher than on the other virtualizations in the case of HPTSleep. (iii) The hypervisor-based approach brings clearly measurable benefits compared with the OS-based one. (iv) Different loads mostly affect the variance and maximum values of the measurements, while the median and minimum values do not change dramatically. (v) VirtualBox has the highest median time overheads and the highest deviation. In addition, based on the minimal overhead values, it is reasonable to assume that this platform does not provide direct access to the hardware; thus, it is not recommended for time-critical applications. (vi) The AvB estimation measurements on VirtualBox reveal that, independently of the timing operations, this type of virtualization is inapplicable for high-speed application implementations, due to an underestimation error of more than 50%. Nevertheless, even in the case of VirtualBox, TSC-based timers and HPTSleep are able to decrease the estimation variance.
(vii) The most accurate AvB estimation results were achieved on the QEMU OS-based virtualization: under any tested condition, the precision and variance of the AvB evaluation proved the most faithful. (viii) The experiments show that the MAD of AvB estimations with HPT is less than half that of the system timing operations. The rates of outliers are also noticeably higher for the standard timing functions in 1 Gbps networks. (ix) Due to the revealed limitation caused by the uSleep() miss time of 62 µs, the maximal achieved send data rate in 10 Gbps networks did not exceed 3.6 Gbps, which caused more than 60% AvB estimation error and points to the inapplicability of standard timing operations in terms of modern high-speed networking. Thus, the presented timing approaches and the AvB application can achieve the maximum performance in the researched virtual environments. Moreover, they are both software solutions that do not require significant hardware or software modifications of the Linux-based system structure. In the case of the HighPerTimer library, the costs are limited to the presence of RDTSCP support in the CPU. In addition, it should be taken into account that the precision of the sleep function achieved by busy-waiting is in the range of microseconds. The pitfall of Kite2 is its network intrusiveness due to the estimation samples; it does, however, use an adaptive train size to reduce the traffic generated by probes. Thereby, the low-resolution timing problem in a virtual environment, caused by a precision lower than that required by a high-performance network application, can be overcome by using direct access to the TSC and the busy-wait sleep operations implemented in the HighPerTimer library. This enables precise packet timestamping for analyzing high-speed networks with capacities of up to 120 Gbps, which is not possible with system timers.