Estimation of Adaptation Parameters for Dynamic Video Adaptation in Wireless Network Using Experimental Method

A wireless network gives users the flexibility of mobility, which attracts them to wireless communication. Video communication in a wireless network experiences Quality of Service (QoS) and Quality of Experience (QoE) issues due to network dynamics. Parameters such as node mobility, routing protocols, and the distance between nodes play a major role in the quality of video communication. Scalable Video Coding (SVC), an extension to H.264 Advanced Video Coding (AVC), allows partial removal of layers while still producing a valid adapted bit-stream. This adaptation feature enables video data to be streamed over a wireless network in line with the available resources. Video adaptation is a dynamic process and requires prior knowledge to decide the adaptation parameters used for extracting the video levels. This research work aims at building the adaptation parameters that are required by adaptation engines, such as Media Aware Network Elements (MANE), to perform adaptation on-the-fly. The prior knowledge improves the performance of the adaptation engines and, in turn, the quality of the video communication. The unique feature of this work is that we used an experimental evaluation method to identify the video levels that are suitable for a given network condition. In this paper, we estimated the adaptation parameters for streaming scalable video over a wireless network using the experimental method. The adaptation parameters are derived using node mobility, link bandwidth, and the motion level of the video sequences as deciding parameters. The experimentation is carried out with the OMNeT++ tool, and the Joint Scalable Video Model (JSVM) is used to encode and decode the scalable video data.


Introduction
Video communication applications, such as video conferencing, telemedicine, video chat, and video-on-demand, are attracting users more and more in this COVID-19 pandemic situation. The communication applications involve a wireless network to reach a large number of users and enable seamless communication. Wireless networks provide the flexibility of mobility and ease of use to the users, which increases the streaming challenges and issues in providing better communication quality [1,2].
The recent advances in wireless technology and video coding formats have opened many challenges in real-time video streaming over wireless networks. As video communication is very sensitive to jitter and throughput, it is difficult to achieve Quality of Service (QoS) and Quality of Experience (QoE) in wireless networks. The challenges of providing better quality can be handled with the help of technologies, such as Content-Aware Networking (CAN), Content-Centric Networking (CCN) [3,4], layered video coding, and

Scalable Video Coding (SVC), an Extension to H.264/Advanced Video Coding (AVC)
Layered video coding techniques enable video adaptation. The H.26x series of video compression techniques is popular and is the most widely used at present. In this series, H.264/AVC [5] is the most commonly used, and the majority of video communication applications use it for encoding and decoding. H.265/HEVC (High-Efficiency Video Coding) [10] is the latest compression technique that is available, but the number of applications that use it is limited. Recently, an international committee was formed to develop a new compression method called H.266/VVC (Versatile Video Coding) [11]. In this experiment, we considered Scalable Video Coding (SVC), an extension to H.264/AVC [6], as streaming and adaptation using SVC encoding is still a major research challenge that needs to be addressed. Adaptation is a method that can be used with any layered encoder.
SVC supports scalability in terms of spatial, temporal, and quality resolutions. Spatial scalability represents variations of the spatial resolution with respect to the original picture. The temporal scalability describes subsets of the bit-stream, which represent the source content with a varied frame rate. Quality scalability is also commonly referred to as fidelity or Signal-to-Noise Ratio (SNR) scalability.
In Scalable Video Coding, one base layer and multiple enhancement layers are generated. The base layer is the independent layer, and each enhancement layer is coded using the previous layers as reference layers. As a result, the encoder generates a single bit-stream and enables the removal of part of that bit-stream in such a way that the remainder still forms a valid bit-stream, as shown in Figure 1. The base layer consumes more bandwidth compared to the enhancement layer. Consequently, effective bandwidth consumption is much less. The increase in efficiency comes at the expense of some increase in complexity as compared to simulcast coding. In simulcast, multiple video sequences are generated to meet the different resolutions and frame rates.
The SVC standard enhances the temporal prediction feature of AVC. Here, instead of single-layer coding, a multi-layer method is followed. The major difference between AVC and SVC in terms of temporal scalability is the signaling of the temporal layer information. SVC uses the hierarchical prediction concept. Dyadic hierarchical prediction has a higher coding efficiency than other prediction structures, such as non-dyadic and no-delay prediction. The spatial layer represents the spatial resolution, and the dependency identifier used for it is D. The base layer is assigned level 0, and the level is increased for each enhancement layer. In each spatial layer, motion-compensated prediction and intra-prediction are employed as in single-layer coding. In simulcast, the same video is coded at different spatial resolutions, whereas in spatial scalability, inter-layer prediction mechanisms are incorporated to improve the coding efficiency. Inter-layer prediction includes techniques for motion parameter and residual prediction, and the temporal prediction structures of the spatial layers should be temporally aligned for efficient use of inter-layer prediction.
For quality or SNR scalability, coarse-grain SNR scalability (CGS) and medium-grain SNR scalability (MGS) are distinguished in scalable video coding. The quality scalable layer is identified by Q, and each spatial layer can have several quality layers. The decoder selects the Q value based on the requirement and decodes the corresponding quality for each spatially enhanced frame.
The SVC encoder combines all the above-mentioned scalability types to code a video sequence, as shown in Figure 2. The original video sequence is initially down-sampled to the minimum resolution expected in video communication. The base layer is then coded independently. Using the base layer and the spatial levels as references, the enhancement layers are generated. The number of enhancement layers is decided by the temporal levels. Finally, the coded video is packetized according to the Network Abstraction Layer standard and later stored or streamed over the network.
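The dyadic hierarchy described above can be illustrated with a short Python sketch. It assumes that frames are indexed in display order and that key pictures (temporal level 0) fall on multiples of the group-of-pictures size; the function names `temporal_level` and `frames_at_or_below` are hypothetical and are not part of JSVM.

```python
def temporal_level(frame_index: int, num_levels: int) -> int:
    """Return the temporal level T of a frame in a dyadic hierarchy.

    Assumption: the GOP size is 2**(num_levels - 1) and frames whose
    index is a multiple of the GOP size are key pictures (T = 0).
    """
    if num_levels < 1 or frame_index < 0:
        raise ValueError("num_levels must be >= 1 and frame_index >= 0")
    gop_size = 2 ** (num_levels - 1)
    pos = frame_index % gop_size
    if pos == 0:
        return 0
    # Number of trailing zero bits of pos: the more often pos divides
    # by two, the lower (more important) its temporal level.
    trailing_zeros = (pos & -pos).bit_length() - 1
    return (num_levels - 1) - trailing_zeros


def frames_at_or_below(frame_count: int, num_levels: int, max_level: int) -> list:
    """Indices of the frames that remain after dropping levels above max_level."""
    return [i for i in range(frame_count)
            if temporal_level(i, num_levels) <= max_level]


if __name__ == "__main__":
    print([temporal_level(i, 3) for i in range(8)])
    print(frames_at_or_below(8, 3, 1))
```

With three temporal levels, removing the highest level keeps every second frame and removing the two highest levels keeps every fourth frame, so each removed level halves the frame rate.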

Media Aware Network Elements (MANE)
MANEs are CAN-enabled intermediate devices that implement intelligent modules, such as routing, adaptation, and so on [12-14]. Figure 3 depicts the architecture of a MANE, which mainly implements an Adaptation Decision Engine (ADE) and an Extractor.
Modules such as the Network Analyzer and the SVC Header Analyzer support the ADE module. The Network Analyzer monitors the network conditions and the availability of network resources and feeds this information to the ADE, which decides the number of layers that need to be extracted. The module continuously monitors the congestion status and the Packet Delivery Ratio (PDR) of the network to understand the dynamic nature of the network resources. This monitored data helps in improving the efficiency of adaptation. Along with these parameters, bandwidth, buffer availability, and terminal availability are monitored to improve the decision-making. In wireless communication, reachability is also an important parameter because it decides the performance of routing protocols and the quality of the data received. Wireless routing protocols have many overheads, such as hello packets and echo packets. Hence, studying the influence of the routing protocol and bandwidth availability was the aim of this research.
The SVC Header Analyzer parses the packets and extracts the layer information from the bit-stream. The scalable video levels decided by the ADE are fed into the Extractor, which extracts the SVC levels accordingly. Unwanted scalable layers are removed from the fully scalable video bit-stream, and an adapted bit-stream is delivered to the network. The adapted video bit-stream provides the maximum video quality that can be achieved with the available resources and under the current conditions.
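The extraction step described above can be sketched in Python as a filter over parsed NAL units. The names `NalUnit`, `ExtractionPoint`, and `extract` are hypothetical, and the filtering rule is a simplifying assumption: all quality layers of lower spatial layers are kept because they may serve as inter-layer references. A real extractor, such as the one in JSVM, follows the dependencies signaled in the bit-stream.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class NalUnit:
    """Layer identifiers as read by a header analyzer (hypothetical type)."""
    d: int  # dependency (spatial) level
    t: int  # temporal level
    q: int  # quality level
    payload: bytes = b""


@dataclass(frozen=True)
class ExtractionPoint:
    """Target levels chosen by the decision engine (hypothetical type)."""
    max_d: int
    max_t: int
    max_q: int


def extract(units: Iterable[NalUnit], point: ExtractionPoint) -> List[NalUnit]:
    """Keep only the NAL units needed for the target extraction point.

    Assumed rule: drop temporal levels above max_t; keep every quality
    level of spatial layers below max_d; in the target spatial layer keep
    quality levels up to max_q.
    """
    kept = []
    for u in units:
        if u.t > point.max_t:
            continue
        if u.d < point.max_d or (u.d == point.max_d and u.q <= point.max_q):
            kept.append(u)
    return kept


if __name__ == "__main__":
    stream = [NalUnit(d, t, q) for d in range(3) for t in range(3) for q in range(2)]
    adapted = extract(stream, ExtractionPoint(max_d=1, max_t=1, max_q=0))
    print(len(stream), "->", len(adapted))
```

Because the filter inspects only the three identifiers of each unit, the Extractor does not need to decode any video data.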

Related Works
The majority of research works support receiver-driven and sender-driven adaptation methods, which are carried out at the end devices and at the server side, respectively. In the receiver-driven approach [15], the content is adapted by the receiving device just before it is displayed. Guo et al. [16] proposed a multi-quality streaming method using SVC video coding. In this approach, multiple qualities of video data are streamed in a multicast communication, and the receivers choose the quality of the video. Ruijian et al. [17] developed a Resource Allocation and Layer Selection method to choose the scalable video levels in a mobility scenario. In the sender-driven method [18], receivers signal the device capabilities while creating the session; the sender then adapts the content accordingly and streams the adapted content. A Video Optimizer Virtual Network Function [19] has been proposed to implement dynamic video optimization, where a kernel-level video processing module and Network Function Virtualization (NFV) are used to improve the quality of the video in 5G networks. There are many works on Hypertext Transfer Protocol (HTTP)-based dynamic video adaptation methods [20-24] that use the server-driven method. In these techniques, the server collects feedback on video quality and streams the video accordingly to improve the QoE of the communication.
There are many techniques for client-side adaptation, which mainly use Dynamic Adaptive Streaming over HTTP (DASH) and HTTP Adaptive Streaming (HAS) for streaming adapted video data. Pu et al. [25] proposed a Dynamic Adaptive Streaming over HTTP mechanism for the wireless domain (WiDASH). Similarly, Kim et al. [26] proposed a client-side adaptation technique to improve the QoE of HTTP Adaptive Streaming. They considered the dynamic variation of both the network bandwidth and the buffer capacity of the client. Tian et al. [27] demonstrated video adaptation using a feedback mechanism. Similarly, there are a few implementations that use a client-side adaptation method and a resource allocation technique [28,29]. These implementations display the adapted content once it is fully received by the receivers. However, these techniques stream the video at full quality from sender to receivers; hence, they consume more resources in the network. This leads us to explore in-network adaptation methods further.
Chen et al. [30] presented a dynamic adaptation mechanism to improve the QoE of video communication. The model considers multiple video rates for the communication. In Reference [31], a traffic engineering method has been proposed to support video adaptation; there, a study has been carried out to understand the importance of Software-Defined Networking (SDN) for video streaming. A physical layer-based dynamic adaptation has been proposed in Reference [32], in which a carrier sensing-based method has been developed. Quinlan et al. [33] proposed a streaming class-based method to stream scalable video, in which each quality level is streamed independently.
These research works aim at in-network adaptation techniques, but they fail to handle network dynamics and video motion together. Adaptation while streaming is difficult because the adaptation module requires dynamic network conditions and video metadata to decide the adaptation parameters. The literature does not discuss the role of video metadata and network parameters, such as mobility and bandwidth availability. Additionally, adaptation requires prior knowledge of the adaptation parameters to improve adaptation on-the-fly. The majority of the literature concentrates on adaptation techniques and does not discuss the prior knowledge required by the adaptation engine. Hence, in this research work, we carry out an experimental analysis and then derive the adaptation parameters.

Scalable Video Streaming over Wireless Network
Video adaptation over a wireless network faces the following major challenges:
• Node Mobility: The nodes in a wireless network are free to move, which disrupts communication. Node mobility affects the bandwidth available between the source and the receiver, and changes in bandwidth degrade the quality of the communication. This experimental work considers an ad-hoc wireless environment consisting of mobile nodes and forwarders. The routing algorithm considered for the experimentation is Ad-hoc On-Demand Distance Vector (AODV) [34], which is considered to be a stable routing algorithm in the wireless domain.
AODV is capable of routing both unicast and multicast packets. It is an on-demand algorithm, meaning that a route between source and destination is created only when the source has packets to send. The established routes are preserved as long as the source requires them for communication. Furthermore, AODV forms trees that connect multicast group members by removing routing loops. To obtain knowledge of the network topology, nodes exchange HELLO and Reply packets. Once the route is established, the source starts streaming the video packets. In the wireless domain, nodes act as source, forwarder, and destination, and they can read the packets. Hence, it is assumed that each node in the wireless network acts as a MANE.
The size of the scalable video bit-stream varies with the motion level of the video sequence. Here, we considered three video sequences, namely Honeybee, Jockey, and Bosphorus, which represent high, medium, and low motion, respectively. Figure 4 shows the video dataset that is considered in this work. SVC encodes motion parameters along with the video data; therefore, as the motion in the video increases, the number of packets that need to be transmitted over the network also increases. When these packets are transmitted over a bandwidth-limited network, dropping a packet that carries motion parameters adversely affects the decoded video at the receiver. For this reason, the decision taken at the ADE varies with the motion level of the video sequence.
With the above-mentioned methodology, we address the challenges listed in this section. The planned wireless network setup takes the listed network challenges and dynamic conditions into account. Additionally, we account for background communication and noise by explicitly creating background traffic. This setup makes the simulation environment more realistic.

Experimentation and Discussion
In order to study the performance of scalable video streaming over wireless networks, we chose the OMNeT++ tool [9]; for encoding and decoding the video sequences, the Joint Scalable Video Model (JSVM) tool, version 9.18 [8], was used.
OMNeT++ is a simulator that supports connecting real devices to a simulation environment. Hence, real network applications, such as VLC streaming, can be used. JSVM is the reference software developed by the Joint Video Team (JVT). It supports up- and down-sampling, SVC encoding and decoding, and bit-stream extraction of video data.
In this experimental study, video sequences are encoded with 3 temporal levels, 3 spatial levels, and 2 quality levels. The frame rates considered are 15 fps, 30 fps, and 60 fps, which are represented by the value T. For spatial scalability, the standard resolutions 480p, 720p, and 1080p are considered, and the value D denotes the spatial level. The value Q denotes the 2 levels of video quality, which are obtained with 2 different quantization levels. Table 1 lists the video levels and the bitrate of each level considered in this experimental study. The network scenario and parameters considered in this experimentation are shown in Table 2.
The wireless networking environment is created using the OMNeT++ tool, as shown in Figure 5. The source and destination nodes of the simulation environment are attached to real computers. The node mobility models are predefined in OMNeT++, and these are used to study the performance of video streaming. To make the wireless environment realistic, background communication over UDP is used. The UDP communication is set up in such a way that a device periodically broadcasts 100 kB of data; additionally, routing consumes resources for route management. The variation in network resources influences the delivery of the video data packets, and this influence is studied in this experimental work. The video streaming is carried out with the help of the VLC media player at the source and destination. The VLC player is used for generating the stream-ready video data and then streaming it over the simulation environment.
VLC uses the Real Time Streaming Protocol (RTSP) and UDP for streaming over the network. In addition, VLC is used to capture the video sequence at the receiver side. Since VLC does not support SVC encoding and decoding, we use VLC for streaming and capturing only. The Packet Delivery Ratio is calculated using Wireshark and the OMNeT++ tool from the total number of packets transmitted and received.
Figure 6 depicts the PDR obtained for streaming the Bosphorus video over a wireless network. Here, bandwidths of 24 Mbps and 48 Mbps are considered for streaming the video sequences. From the result, it is evident that a wireless network with 48 Mbps can transmit a fully scalable video sequence that contains all scalable levels. As there are background communications to keep the network active and ready for communication, a portion of the network bandwidth is allocated to routing overheads. Hence, video streaming experiences a lack of network resources for streaming the fully scalable bit-stream. At 24 Mbps, scalable video with lower frame rates and resolutions, i.e., up to 15 fps and 720p, gives a better PDR. However, higher quality levels and resolutions have a higher video bitrate and consume additional network resources for streaming. As a result, high-quality video streaming suffers from a lack of network resources and cannot provide better communication quality.
A further experiment is carried out to study the influence of node mobility and routing overheads on video communication. The results obtained are plotted in Figure 7. In a mobility scenario, the nodes keep changing their position and, hence, their connectivity. This change in node connectivity leads to route re-computation in the network, which affects the communication by dropping video data packets. Figure 7a,b show the PDR for the 24 Mbps and 48 Mbps networks, respectively. From the result, it is observed that the base layer is stable both with and without node mobility.
The bitrate of the base layer is lower than that of the higher levels. The increased bitrate of the higher levels leads to more packets in the communication and fails to provide stable quality in the mobility scenario. In the non-mobility scenario, the limited lifetime of a computed route leads to re-computation of the streaming path; hence, packet drops are also observed in that experiment.
A further experiment is carried out to study the influence of video bitrates on video streaming over a wireless network. The results are shown in Figure 8. In this experiment, all scalable video bit-streams are streamed over the network, the received video is captured at the receiver side, and the PDR is then calculated to analyze the streaming performance. A video sequence with more motion produces a higher bitrate; hence, more packets are generated while streaming over the network. The Bosphorus video sequence is more stable than the Jockey and Honeybee sequences. The Bosphorus video sequence has a slow-moving object in the foreground and very distant background objects; hence, the bitrate generated by the SVC encoder is lower than for Jockey and Honeybee. In Jockey and Honeybee, objects in the foreground and background move frequently; therefore, the bitrate generated is higher. Figure 9 shows the PDR for the non-mobility wireless scenario.
From these experiments, the influence of node mobility, bandwidth, and motion level is observed and used for deriving the prior knowledge required by the ADE. As the aim of this research work was to generate the extraction points for the ADE, the adaptation parameters were estimated on the assumption that a PDR of 80% is required for decoding the bit-stream and displaying video of acceptable visual quality. The adaptation parameters for the ADE are derived as shown in Tables 3-5. These tables show the adaptation parameters that give better QoE, in terms of PDR, in a bandwidth-constrained network.
Tables 3-5 give the adaptation parameters for video streaming over the wireless network. These parameters are estimated based on the PDR that is observed in the experimental evaluation process. The MANE can now use these tables as a reference to decide the extraction points for a given network resource availability and network condition. This helps the MANE to implement the adaptation on-the-fly, since the knowledge that has been built is available for reference when deciding which scalable levels to remove. Additionally, because the adaptation parameters are estimated from experimentation, the received video is confirmed to meet the quality requirement (PDR) of the video communication. The major advantage of the estimation is a reduction in the delay involved in the decision-making. The reduced processing delay makes it feasible to implement the adaptation process in-network and on-the-fly.
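The estimation rule described above can be sketched as follows: compute the PDR of each streamed level and keep the highest level whose PDR reaches the 80% threshold. This is only an illustrative sketch. The function names and the packet counts in the usage example are hypothetical and are not measurements from this study, and ranking levels by the tuple (D, T, Q) is an assumption about what "highest" means.

```python
from typing import Dict, Optional, Tuple

Level = Tuple[int, int, int]  # (D, T, Q): spatial, temporal, quality level


def packet_delivery_ratio(received: int, transmitted: int) -> float:
    """PDR = packets received / packets transmitted."""
    if transmitted <= 0:
        raise ValueError("transmitted must be positive")
    return received / transmitted


def select_extraction_point(pdr_by_level: Dict[Level, float],
                            threshold: float = 0.80) -> Optional[Level]:
    """Return the highest level whose PDR meets the threshold.

    Assumption: levels are ranked lexicographically by (D, T, Q), so
    spatial resolution is preferred over frame rate, then quality.
    Returns None if no level reaches the threshold.
    """
    acceptable = [lvl for lvl, pdr in pdr_by_level.items() if pdr >= threshold]
    return max(acceptable) if acceptable else None


if __name__ == "__main__":
    # Hypothetical packet counts, for illustration only.
    measured = {
        (0, 0, 0): packet_delivery_ratio(980, 1000),
        (1, 1, 0): packet_delivery_ratio(850, 1000),
        (2, 2, 1): packet_delivery_ratio(610, 1000),
    }
    print(select_extraction_point(measured))
```

Repeating this selection for each combination of bandwidth, mobility, and motion level yields one extraction point per network condition.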
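A minimal sketch of how a MANE might consult such pre-computed knowledge at run time is given below. The table entries are placeholders chosen only for illustration and are not the values of Tables 3-5; the key layout, the names `ADAPTATION_TABLE` and `decide`, and the fallback to the base layer for unknown conditions are likewise assumptions.

```python
from typing import Dict, Tuple

Level = Tuple[int, int, int]          # (D, T, Q)
Condition = Tuple[int, bool, str]     # (bandwidth in Mbps, mobility, motion level)

BASE_LAYER: Level = (0, 0, 0)

# Placeholder entries for illustration only, NOT the values of Tables 3-5.
ADAPTATION_TABLE: Dict[Condition, Level] = {
    (24, True, "high"): (0, 0, 0),
    (24, False, "low"): (1, 0, 1),
    (48, False, "low"): (2, 2, 1),
}


def decide(bandwidth_mbps: int, mobile: bool, motion: str,
           table: Dict[Condition, Level] = ADAPTATION_TABLE) -> Level:
    """Look up the extraction point for the observed condition.

    Assumption: conditions that are missing from the table fall back
    to the base layer.
    """
    return table.get((bandwidth_mbps, mobile, motion), BASE_LAYER)


if __name__ == "__main__":
    print(decide(48, False, "low"))
    print(decide(12, True, "medium"))  # unknown condition, falls back
```

A dictionary lookup takes constant time on average, which is in line with the reduced decision delay that motivates building the knowledge in advance.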

Conclusions
SVC is a suitable encoding technique to attain better QoE/QoS in wireless communication. The network topology is highly dynamic, and routing protocols have more overhead in the wireless network. In addition, much of the bandwidth is used for maintaining the topology knowledge. In this paper, we estimated the adaptation parameters considering mobility, bandwidth availability, and the motion level of the video sequences as deciding factors. The experimental method uses different scalable video levels and network conditions to stream the video over the wireless network. Here, high-definition video sequences are considered for estimating the adaptation parameters. The knowledge built in this work helps in the continuous streaming of video over a CAN-enabled wireless network. Hence, the adaptation can be performed on-the-fly, considering dynamic network conditions and resource availability.
In the future, the knowledge built will be used for developing a dynamic video adaptation method. We plan to use machine learning-based approaches to develop dynamic adaptation techniques. In addition, we will simulate various network scenarios and estimate more adaptation parameters to use in dynamic adaptation algorithms.