VCC-DASH: A Video Content Complexity-Aware DASH Bitrate Adaptation Strategy

Duan, Juzheng; Zhang, Min; Wang, Jing; Han, Shuai; Chen, Xun; Yang, Xiaolong

doi:10.3390/electronics9020230


by Juzheng Duan, Min Zhang *, Jing Wang, Shuai Han, Xun Chen and Xiaolong Yang

School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China

* Author to whom correspondence should be addressed.

Electronics 2020, 9(2), 230; https://doi.org/10.3390/electronics9020230

Submission received: 8 January 2020 / Revised: 26 January 2020 / Accepted: 28 January 2020 / Published: 31 January 2020

(This article belongs to the Section Computer Science & Engineering)


Abstract:

Traditional bitrate adaptation strategies for DASH (Dynamic Adaptive Streaming over HTTP (i.e., HyperText Transfer Protocol)) cannot differentiate segments by the complexity of their video content, so the user’s QoE (quality of experience) is worse for segments with high content complexity than for segments with low content complexity. To address this, this paper first studies video coding and puts forward a definition of video content complexity. Then the effect of content complexity on the user’s QoE is analyzed, and the QoE utility function of a segment is formulated from its MOS (mean opinion score, which depends on the content complexity and the bitrate) and the bitrate switching between consecutive segments. Finally, in order to maximize the user’s QoE, this paper proposes VCC-DASH (video content complexity-aware DASH bitrate adaptation strategy), which selects bitrates under the constraints of the network bandwidth and the buffer occupancy. In simulations, we compare VCC-DASH with the classical bitrate adaptation strategy proposed by Liu et al. (LIU’s strategy, for short). The simulation results show that the two strategies perform similarly in the number of bitrate switches, the number of playback interruptions, and the buffer length. More importantly, the average bitrate of VCC-DASH is higher than that of LIU’s strategy, which means that VCC-DASH makes fuller use of the network bandwidth. Moreover, the MOS distribution of VCC-DASH is more concentrated on the better scores “4~5”, which results from its content complexity-aware adaptation allocating more bandwidth resources to high-complexity segments.

Keywords:

DASH; QoE; bitrate; video content complexity

1. Introduction

Recently, DASH (Dynamic Adaptive Streaming over HTTP (i.e., HyperText Transfer Protocol))-based technology has been widely adopted to deliver video content over the Internet, since it automatically matches the video quality with the available network bandwidth for proper accessibility and delivery [1]. In the DASH system architecture shown in Figure 1, the video on the server side is divided into small segments, each encoded at several representation levels, all of which are described in an MPD (media presentation description) file [2]. When a video streaming session starts, the server sends the MPD file to the client to offer access to the segments, and the client sends an HTTP-get message to request the segment whose bitrate is closest to the available network bandwidth. In this way, users receive a video streaming service with a more satisfying QoE (quality of experience).

As the key component of DASH technology, the bitrate adaptation strategy has drawn much research attention in recent years. In the traditional bitrate adaptation strategy, the DASH client monitors the network conditions, predicts the available network bandwidth, and then requests the segment whose bitrate matches the predicted network bandwidth [3]. In general, the goal of the strategy is to maximize the viewing quality of the stream while avoiding unnecessary quality fluctuations [4]. There are many research works on DASH bitrate adaptation strategies. Liu et al. [5] proposed a step-wise switching-up and aggressive switching-down method, which switches the consumed representation among versions encoded at different bitrates and clearly specifies the conditions for switching up and switching down. Müller et al. [6] built a bitrate adaptation decision from the download time of each segment and the average bitrate of the whole video streaming session. Huang et al. [7] chose the segment bitrate directly from the current buffer occupancy, and also used an estimate of the available network capacity when the network throughput is relatively constant. Kumar et al. [8] proposed a QoE-driven bitrate adaptation strategy that jointly considers bandwidth saving and video quality adaptation when adjusting the bitrate, which benefits both video content service providers and subscribers.

The above-mentioned DASH bitrate adaptation strategies [5,6,7,8] can offer best-effort viewing quality assurance, but they cannot optimize the playback quality based on the video content complexity, which is a feature of the video content that reflects the motion intensity of video sequences. To reach a similar playback quality, videos with high content complexity need more encoding bits, since they carry more information [9]. Suppose that the video is divided into small segments with different media representations and that the complexity differs from segment to segment. Traditional DASH bitrate adaptation strategies (e.g., [5,6,7,8]) select the media representation of each segment by adapting only to the network bandwidth and ignore the different bitrate requirements imposed by the content complexity, which results in much worse playback quality for segments with high content complexity than for those with low content complexity. As a result, the QoE fluctuates over the whole video. Following the MOS (mean opinion score) metrics provided by Klaue et al. [10], we obtain the MOS of segments with different bitrates and content complexities. In addition, Kim et al. [11] indicate that bitrate switching between consecutive segments can reduce users’ QoE. Thus, at each decision epoch, the time-varying QoE of a viewer is computed from the MOS of the segment and the bitrate switching loss between segments. Based on the network bandwidth and the buffer occupancy, the bitrate is adapted to maximize the resulting QoE. On this basis, the bitrate adaptation strategy VCC (video content complexity)-DASH is proposed.

The contributions of this paper are as follows: (i) it defines the content complexity of a video using its encoding information; (ii) it formulates the QoE utility function of a segment based on its MOS (which depends on its content complexity and bitrate) and the bitrate switching between consecutive segments; and (iii) it establishes a QoE optimization model under the constraints of the network bandwidth and the buffer occupancy to adjust the bitrate dynamically and thereby maximize users’ QoE.

2. Background and Key Issues

Here, we analyze the feature of video content and propose how to measure the complexity of video content. Furthermore, we extend the MPD file by adding a VCC attribute for each segment functioning as its video content complexity tag for the convenience of the client to differentiate segments with different content complexities.

2.1. The Analysis and Measurement of Video Content Complexity

The content complexity is a feature of the video that reflects its motion intensity [12]. On the server side, the video is encoded into versions with different code rates, which are related to the number of coded bits, the bit rate, the frame rate, etc. According to the MPEG (Moving Picture Experts Group) video compression standard [13], the video is encoded in units of GOPs (groups of pictures). A GOP is a set of consecutive frames consisting of I-frames, P-frames, and B-frames. The I-frame uses intra-frame compression, contains a large amount of information, and reflects the texture characteristics of the video. The P-frame and B-frame use inter-frame prediction coding, which compresses pictures by reducing the temporal redundancy between frames; both contain less information and reflect the motion characteristics of the video. In general, the number of encoded bits of an I-frame is much larger than that of a P-frame or B-frame within a GOP. So, the video content complexity, which reflects the motion intensity of the video, can be characterized by the GOP-related ratio $r$ of the average number of encoded bits of P-frames and B-frames, $R_{P, B}$, to the average number of encoded bits of I-frames, $R_{I}$, which can be expressed as follows.

r = \frac{R_{P, B}}{R_{I}}

(1)

where $R_{P, B}$ and $R_{I}$ are defined as in Equation (2). $N_{I}$, $N_{P}$, and $N_{B}$ are the numbers of I-frames, P-frames, and B-frames within a GOP, respectively. $R_{I, i}$, $R_{P, i}$, and $R_{B, i}$ are the numbers of coded bits of the i-th I-frame, the i-th P-frame, and the i-th B-frame within a GOP, respectively.

\begin{aligned} R_{P, B} &= \frac{1}{N_{P} + N_{B}} \times \left( \sum_{i = 1}^{N_{P}} R_{P, i} + \sum_{i = 1}^{N_{B}} R_{B, i} \right) \\ R_{I} &= \frac{1}{N_{I}} \times \sum_{i = 1}^{N_{I}} R_{I, i} \end{aligned}

(2)
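As a rough illustration of Equations (1) and (2), the following Python sketch computes the GOP-related ratio r from a list of per-frame sizes. The function name `gop_ratio`, the `(frame_type, coded_bits)` input layout, and the frame sizes are assumptions made for this example; they are not taken from the paper's tooling or data.

```python
from statistics import mean


def gop_ratio(frames):
    """Return r = R_{P,B} / R_I for one GOP (Equations (1) and (2)).

    `frames` is a list of (frame_type, coded_bits) pairs, where frame_type
    is "I", "P", or "B". This input layout is an assumption of the sketch.
    """
    i_bits = [bits for kind, bits in frames if kind == "I"]
    pb_bits = [bits for kind, bits in frames if kind in ("P", "B")]
    if not i_bits or not pb_bits:
        raise ValueError("a GOP needs at least one I-frame and one P- or B-frame")
    # R_{P,B}: average coded bits over all P- and B-frames; R_I: average over I-frames.
    return mean(pb_bits) / mean(i_bits)


# Made-up frame sizes in bits, chosen only to contrast low and high motion.
low_motion_gop = [("I", 40000), ("B", 1000), ("B", 1000), ("P", 2000)]
high_motion_gop = [("I", 40000), ("B", 8000), ("B", 8000), ("P", 14000)]

r_low = gop_ratio(low_motion_gop)
r_high = gop_ratio(high_motion_gop)
```

With these made-up sizes, the high-motion GOP yields the larger ratio, which matches the intuition that inter-coded frames cost more bits when there is more motion.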

All the classic video sequences used in this paper are available from http://trace.kom.aau.dk/yuv. Using Equations (1) and (2), we compute the GOP-related ratio r of the sequences Akiyo, Container, Foreman, Coastguard, Soccer, and Football at different average bits/frame and draw the scatter plot shown in Figure 2.

Since the P-frame and the B-frame reflect the motion characteristics of the video, a higher motion intensity means a higher content complexity. From Equation (1), the bigger the GOP-related ratio r is, the larger the relative number of encoded bits of P-frames and B-frames is, and hence the higher the content complexity of the video. As shown in Figure 2, the GOP-related ratio r of the sequence Football is the biggest, so its content complexity is the highest. Conversely, the GOP-related ratio r of the sequence Akiyo is the lowest, so its content complexity is the lowest. In addition, r increases with the average bits/frame, but at a diminishing rate. However, since the content complexity is a feature of the video sequence itself, we take the mode of r over the tested average coded bits per frame as the content complexity value of each video sequence. If more than one mode exists, their average is taken. The content complexity of each video sequence is shown in Table 1.

Furthermore, we obtained the content complexity of more video sequences according to the method proposed above and used the k-means clustering algorithm to classify them into three levels: low-level complexity (VCC = 1), middle-level complexity (VCC = 2), and high-level complexity (VCC = 3), as shown in Table 2.
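The three-level classification can be sketched with a plain one-dimensional k-means (Lloyd's algorithm). This is only an illustration under stated assumptions: the ratio values in `sample_r` are invented for the example and are not the values in Table 1 or Table 2, and the deterministic initialisation is a convenience of the sketch rather than something the paper specifies.

```python
def kmeans_1d(values, k=3, iterations=100):
    """Plain Lloyd's k-means on scalars; returns the sorted cluster centres."""
    ordered = sorted(values)
    # Deterministic initialisation: spread the starting centres over the sorted data.
    centres = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centres[j]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centre.
        updated = [sum(c) / len(c) if c else centres[j] for j, c in enumerate(clusters)]
        if updated == centres:
            break
        centres = updated
    return sorted(centres)


def vcc_level(r, centres):
    """Map a ratio r to VCC = 1 (low), 2 (middle), or 3 (high) by its nearest centre."""
    return 1 + min(range(len(centres)), key=lambda j: abs(r - centres[j]))


# Hypothetical complexity values, NOT the ones reported in the paper.
sample_r = [0.04, 0.05, 0.06, 0.14, 0.15, 0.17, 0.30, 0.33, 0.35]
centres = kmeans_1d(sample_r)
levels = [vcc_level(r, centres) for r in sample_r]
```

Sorting the centres before labelling makes the smallest centre correspond to VCC = 1 and the largest to VCC = 3, so the labels follow the ordering used in the text.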

2.2. Tagging VCC for DASH Segments

In the ISO/IEC MPEG-DASH standard [2], an MPD file on the server side describes the collection of encoded and deliverable versions of the media content. The basic structure and components of the MPD, which follows an XML (i.e., Extensible Markup Language) schema, are shown in Figure 3. A sequence of Periods along the timeline makes up the Media Presentation. A Period typically represents a media content period during which a consistent set of encoded versions of the media content is available. Within a Period, material is arranged into Adaptation Sets, each of which describes a set of interchangeable encoded versions of one or several media content components. An Adaptation Set contains a set of Representations, each of which describes a deliverable encoded version of one or several media content components. Typically, this means that the client may switch dynamically from one Representation to another within an Adaptation Set in order to adapt to network conditions or other factors. Within a Representation, the content may be divided into multiple segments for proper accessibility and delivery. In order to access a segment, its URL (i.e., Uniform Resource Locator) is provided explicitly. Consequently, a segment is the largest unit of data that can be retrieved with a single HTTP request.

MPEG-DASH is an open standard that allows the extension of MPD file for adding components of the media content as needed. In each segment, we add a VCC attribute functioning as its video content complexity tag for the convenience of the client to differentiate segments with different content complexities, as shown in Figure 3. When the video streaming session starts, the MPD file is downloaded to the client, and the corresponding VCC attributes are parsed out as an input parameter for the decision of the media representation.
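To make the tagging idea concrete, the sketch below parses a small MPD fragment and collects one complexity tag per segment. The fragment is hypothetical: the attribute name `vcc`, its placement on `SegmentURL`, and the file names are assumptions of this example, since the exact syntax of the extension is given only in Figure 3. The namespace is the standard MPEG-DASH MPD namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical MPD fragment; the "vcc" attribute and its placement are assumed.
MPD_SNIPPET = """
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011">
  <Period>
    <AdaptationSet>
      <Representation id="360k" bandwidth="360000">
        <SegmentList duration="2">
          <SegmentURL media="seg1_360k.m4s" vcc="1"/>
          <SegmentURL media="seg2_360k.m4s" vcc="3"/>
        </SegmentList>
      </Representation>
    </AdaptationSet>
  </Period>
</MPD>
"""

NS = {"dash": "urn:mpeg:dash:schema:mpd:2011"}


def parse_vcc_tags(mpd_text):
    """Return {representation id: [VCC of segment 1, VCC of segment 2, ...]}."""
    root = ET.fromstring(mpd_text.strip())
    tags = {}
    for rep in root.iterfind(".//dash:Representation", NS):
        segments = rep.iterfind(".//dash:SegmentURL", NS)
        tags[rep.get("id")] = [int(seg.get("vcc")) for seg in segments]
    return tags


vcc_tags = parse_vcc_tags(MPD_SNIPPET)
```

Because the content complexity belongs to the segment rather than to a particular encoding, a client would in practice only need one tag per segment index, whichever Representation it is read from.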

3. Proposed Algorithm

Here we first define the QoE utility function of a segment as the user’s satisfaction with watching it, which is determined by its MOS and the bitrate switching between consecutive segments. Then, in order to maximize the user’s QoE, we propose the bitrate adaptation strategy VCC-DASH, which selects the media representation under the constraints of the network bandwidth and the buffer occupancy. Finally, we present the implementation of the proposed VCC-DASH.

3.1. QoE Utility Function

In a video streaming session, the actual viewing quality experienced by users (i.e., QoE) greatly depends on the segment’s MOS [14] and the segment bitrate switching loss [15].

The former is the average of the subjective scores given by a group of non-professionals after watching the segment in a standard test environment, and it can be used to evaluate the subjective quality of the segment. Following the MOS metrics method provided by Klaue et al. [10], we choose CIF (Common Intermediate Format) video sequences with different VCC, use the open-source video quality evaluation tool-set EvalVid [16] to calculate the MOS, and then draw a scatter plot of the relationship between the MOS, the VCC, and the encoding bitrate, as shown in Figure 4. For a fixed VCC, the MOS grows approximately logarithmically with the bitrate. Therefore, for segments with the same VCC, the relationship between the MOS and the bitrate can be formulated by a logarithmic function. The trend of the MOS curve varies distinctly with the VCC, which means that the logarithmic MOS–bitrate fitting functions under different VCC should have different fitting parameters, as expressed below.

\mathrm{MOS}(\mathrm{VCC}_{i}, BS_{i}) = a_{\mathrm{VCC}_{i}} \cdot \ln (BS_{i}) + b_{\mathrm{VCC}_{i}}

(3)

In Figure 4, the MOS–bitrate curves under different VCC are fitted according to the nearest neighbor principle. In Equation (3), the fitting parameters $a_{\mathrm{VCC}_{i}}$ and $b_{\mathrm{VCC}_{i}}$ are shown in Table 3.
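The form of Equation (3) can be illustrated by fitting a and b to (bitrate, MOS) samples for one VCC level. The sketch below uses ordinary least squares on ln(bitrate), which is a swapped-in technique chosen for simplicity and not necessarily the fitting procedure used for Table 3. The sample points are synthetic, generated from made-up parameters, and the bitrate values beyond those mentioned in the text are assumptions.

```python
import math


def fit_log_curve(bitrates, mos_scores):
    """Least-squares fit of mos = a * ln(bitrate) + b; returns (a, b)."""
    xs = [math.log(b) for b in bitrates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(mos_scores) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, mos_scores))
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Synthetic, noise-free samples generated from a = 1.2 and b = -3.5.
# These parameters are invented for the example and are not from Table 3.
sample_bitrates = [180, 360, 540, 720, 900]  # Kbps, partly assumed
sample_mos = [1.2 * math.log(b) - 3.5 for b in sample_bitrates]

a_fit, b_fit = fit_log_curve(sample_bitrates, sample_mos)
```

Running the fit once per VCC level would give one (a, b) pair per level, which is the shape of the parameter table the text refers to.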

In the DASH standard [2], the video is divided into segments with different media representations, so the client may switch from one representation to another in order to adapt to the network conditions and the playback environment. Therefore, two consecutive segments may have different bitrates, and such bitrate switching worsens the user’s QoE [15]. In this paper, we define the reduction of QoE resulting from bitrate switching as a loss, which is mainly determined by the number of switches and the switching range [17]. In general, the wider the switching range, the bigger the loss and the steeper its growth. For simplicity, the relationship between the switching range and the QoE loss is expressed by an exponential function. Overall, the loss of the i-th segment is expressed as follows.

\mathrm{Loss}(BS_{i}) = γ_{1} D_{i} + γ_{2} (e^{\frac{R_{i}}{R_{\max}}} - 1)

(4)

where $γ_{1}$ and $γ_{2}$ are the QoE loss weights of the bitrate switching event and the switching range, respectively. The indicator

D_{i} = \begin{cases} 0, & BS_{i} = BS_{i - 1} \\ 1, & BS_{i} \neq BS_{i - 1} \end{cases}

shows whether the bitrate of the current i-th segment differs from that of the (i-1)-th segment. $R_{i} = \left| \frac{BS_{i} - BS_{i - 1}}{ΔB} \right|$ denotes the bitrate switching range of the two consecutive segments, and $ΔB = \min \{ (B_{n} - B_{n - 1}), n = 2, \dots, M \}$ is the minimum difference between neighboring bitrates in the available bitrate set $ℜ = \{ B_{1}, B_{2}, \dots, B_{M} \}$. $R_{\max} = \frac{B_{M} - B_{1}}{ΔB}$ is the maximum switching range.

Combining Equations (3) and (4), the QoE of the i-th segment is written as follows.

\mathrm{QoE}_{i} = \mathrm{MOS}(\mathrm{VCC}_{i}, BS_{i}) - \mathrm{Loss}(BS_{i})

(5)
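Equations (3) to (5) can be combined into a small utility function, sketched below. The bitrate ladder, the weights `GAMMA1` and `GAMMA2`, and the per-VCC parameters in `FIT_PARAMS` are placeholders invented for the example; the actual values are those of Table 3 and Table 4, which are not reproduced here.

```python
import math

# All numbers below are hypothetical placeholders, not the paper's settings.
BITRATES = [180, 360, 540, 720, 900]  # available bitrate set in Kbps, ascending
GAMMA1, GAMMA2 = 0.1, 0.2             # loss weights γ1 and γ2
FIT_PARAMS = {1: (0.9, -1.0), 2: (1.0, -2.0), 3: (1.2, -3.5)}  # VCC -> (a, b)


def switching_loss(bs, bs_prev, bitrates=BITRATES, gamma1=GAMMA1, gamma2=GAMMA2):
    """Loss(BS_i) from Equation (4)."""
    # ΔB: smallest gap between neighbouring bitrates in the ladder.
    delta_b = min(hi - lo for lo, hi in zip(bitrates, bitrates[1:]))
    r_max = (bitrates[-1] - bitrates[0]) / delta_b
    d = 0 if bs == bs_prev else 1
    r = abs(bs - bs_prev) / delta_b
    return gamma1 * d + gamma2 * (math.exp(r / r_max) - 1)


def qoe(bs, bs_prev, vcc):
    """QoE_i = MOS(VCC_i, BS_i) - Loss(BS_i), Equations (3) and (5)."""
    a, b = FIT_PARAMS[vcc]
    mos = a * math.log(bs) + b
    return mos - switching_loss(bs, bs_prev)


no_switch = switching_loss(540, 540)
full_swing = switching_loss(900, 180)
```

Staying at the same bitrate costs nothing, while a jump across the whole ladder costs γ1 + γ2(e - 1), the largest loss the formula can produce.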

3.2. QoE Optimization Model

In essence, the main idea of the proposed bitrate strategy VCC-DASH is to select an optimized bitrate for each requested segment according to its content complexity, under the constraints of the network bandwidth and the buffer occupancy, with the final aim of maximizing users’ QoE during a video streaming session. Hence, the bitrate adaptation of VCC-DASH can be formulated as the following optimization model. In our model, each segment has the same duration of $τ$ seconds. The selected bitrate of the i-th segment is $BS_{i}$, which satisfies $BS_{i} \in ℜ$, where $ℜ = \{ B_{1}, B_{2}, \dots, B_{M} \}$ is the set of available bitrates. The delivery time of the segment is $t_{i}$ seconds, which is the time from when the client sends out an HTTP-get message for the i-th segment to when the segment is successfully received by the client. The current bandwidth $BC_{i}$ for delivering the i-th segment is then estimated as $\frac{τ}{t_{i}} BS_{i}$, that is, the $τ \cdot BS_{i}$ bits of the segment divided by the $t_{i}$ seconds needed to deliver them; this estimate is used as the decision basis for selecting the bitrate of the (i+1)-th segment. Further, the buffer occupancy after the i-th segment is received is expressed as $\sum_{i} (τ - t_{i})$.
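The two quantities defined above can be computed as in the following sketch. The function names and the numbers in the example (a 2 s segment duration and the delivery times) are assumptions made for illustration, not the settings of Table 4.

```python
def estimate_bandwidth(bitrate_kbps, segment_duration_s, delivery_time_s):
    """BC_i = (τ / t_i) * BS_i: segment size in kilobits divided by its delivery time."""
    return segment_duration_s * bitrate_kbps / delivery_time_s


def buffer_occupancy(segment_duration_s, delivery_times_s):
    """Buffered playback time, as the running sum of (τ - t_i) over received segments.

    This follows the expression in the text literally, so it does not include
    any media that was pre-buffered before playback started.
    """
    return sum(segment_duration_s - t for t in delivery_times_s)


# Hypothetical example: a 2 s segment at 540 Kbps delivered in 1.5 s.
bc = estimate_bandwidth(540, 2.0, 1.5)

# Three segments of 2 s each, delivered in 1.5 s, 1.0 s, and 2.5 s.
buffered = buffer_occupancy(2.0, [1.5, 1.0, 2.5])
```

A delivery that is faster than playback (t_i below τ) adds to the buffer, and a slower one drains it, which is why the third segment in the example reduces the total.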

\max \{ \mathrm{QoE}_{i} = \mathrm{MOS}(\mathrm{VCC}_{i}, BS_{i}) - \mathrm{Loss}(BS_{i}) \}

\begin{matrix} \text{s.t.} \quad BS_{i} \leq \begin{cases} \sum_{I = i - ω - 1}^{i - 1} (BC_{I} - BS_{I}) + BC_{i}, & \text{for } \mathrm{VCC} = 3 \\ BC_{i}, & \text{others} \end{cases}, \\ BS_{i} \in ℜ, \quad i = 1, 2, \dots, N \end{matrix}

(6)

0 \leq λ_{\min} \cdot τ \leq \sum_{i} (τ - t_{i}) \leq λ_{\max} \cdot τ, \quad i = 1, 2, \dots, N

(7)

BC_{i} \cdot \sum_{i - 1} (τ - t_{i}) \geq BS_{i} \cdot τ, \quad i = 1, 2, \dots, N

(8)

Equation (6) represents the VCC differentiation constraint of VCC-DASH (i.e., how to select the bitrate differently for segments with different content complexities). For segments that are not high-complexity (the “others” case), the bitrate selected from the available bitrate set $ℜ$ should be no more than the estimate of the current bandwidth $BC_{i}$. For high-complexity segments (VCC = 3), the differences between the estimated network bandwidth $BC_{I}$ and the selected bitrate $BS_{I}$ over the prior $ω$ segments are counted as the bitrate surplus, i.e., $\sum_{I = i - ω - 1}^{i - 1} (BC_{I} - BS_{I})$. The selected bitrate of a high-complexity segment should be no more than the sum of the bitrate surplus and the current network bandwidth $BC_{i}$, that is, $\sum_{I = i - ω - 1}^{i - 1} (BC_{I} - BS_{I}) + BC_{i}$.
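The upper bound in Equation (6) can be sketched as a small helper. Two points are assumptions of this sketch: the history lists hold the bandwidth estimates and selected bitrates of the segments before the current one, and the window follows the written summation limits (from i - ω - 1 to i - 1, i.e., ω + 1 terms), even though the prose speaks of the prior ω segments. All numbers are made up.

```python
def bitrate_cap(vcc, bc_current, bc_history, bs_history, omega):
    """Upper bound on BS_i according to Equation (6).

    bc_history / bs_history: estimated bandwidths and selected bitrates of the
    previous segments, oldest first (an assumed data layout).
    """
    if vcc != 3:
        # Segments that are not high-complexity are capped by the current estimate.
        return bc_current
    # The written limits I = i-ω-1 .. i-1 cover the last ω + 1 previous segments.
    window = omega + 1
    recent = list(zip(bc_history, bs_history))[-window:]
    surplus = sum(bc - bs for bc, bs in recent)
    return bc_current + surplus


# Hypothetical history in Kbps.
bc_hist = [600, 650, 700]
bs_hist = [540, 540, 540]

cap_high = bitrate_cap(3, 600, bc_hist, bs_hist, omega=1)
cap_low = bitrate_cap(1, 600, bc_hist, bs_hist, omega=1)
```

In this example the two most recent segments left 110 Kbps and 160 Kbps unused, so a high-complexity segment may request up to 870 Kbps while any other segment stays capped at 600 Kbps.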

Equation (7) represents the constraint on buffer occupancy (i.e., how to avoid buffer overflows and underflows). The buffer occupancy is the total playing duration of the segments loaded in the buffer, which is expressed as $\sum_{i} (τ - t_{i})$. When the segment duration $τ$ is greater than the download time $t_{i}$ of the segment, the buffer occupancy increases. If the buffer occupancy keeps increasing and exceeds the buffer size, buffer overflow occurs, which wastes resources. Conversely, if the segment duration $τ$ is less than the download time $t_{i}$, the buffer occupancy decreases. A continuous decrease of the buffer occupancy causes buffer underflow, which in turn causes playback interruptions. Therefore, VCC-DASH sets the upper bound $λ_{\max}$ and the lower bound $λ_{\min}$ on the number of buffered segments to restrict the selection of the media representation and thus avoid bandwidth waste and playback interruptions.

Equation (8) represents the constraint on the available bandwidth, which requires VCC-DASH to take both the available network bandwidth $BC_{i}$ and the buffer occupancy $\sum_{i - 1} (τ - t_{i})$ into account when selecting the media representation. Rearranged, it states that the expected download time $τ \cdot BS_{i} / BC_{i}$ must not exceed the buffered playback time. In effect, this constraint guarantees that the requested segment arrives at the buffer before the buffer is empty.

3.3. VCC-DASH Implementation

Here we present the implementation of the proposed strategy VCC-DASH, as shown in the textbox of Figure 5. The bitrate of the first segment is initialized as the minimum bitrate $B_{1}$ in the available bitrate set $ℜ$. VCC-DASH then selects a bitrate for each segment. Lines (5–17) get the available bitrate set under the constraint of Equation (6). Lines (18–23) show the joint constraints of Equations (7) and (8) on the bitrate selection. If there is no available bitrate, VCC-DASH directly assigns the lowest bitrate $B_{1}$ to the segment, as shown in lines (24–27). Otherwise, the bitrate with the maximum QoE is selected, as shown in lines (28–36). Suppose that the video has N segments and that the available bitrate set $ℜ$ contains M bitrate representations. Usually, M is much smaller than N. When the network bandwidth is so stable and sufficient that the biggest bitrate $B_{M}$ in $ℜ$ is available, VCC-DASH has to traverse the whole available bitrate set $ℜ$ to calculate the QoE for each segment according to the optimization model, and then select the bitrate with the optimal QoE. In this case, the time complexity is $O (N \cdot M^{2})$; since M is small, the proposed VCC-DASH is fast enough for practical deployment.
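The per-segment decision described above (filter by Equation (6), then by Equations (7) and (8), fall back to the lowest bitrate, otherwise maximize QoE) can be sketched as follows. This is not the pseudocode of Figure 5. In particular, checking Equation (7) against a predicted buffer level, with the download time predicted as τ·BS/BC, is an interpretation made for this sketch; the cap from Equation (6) and the QoE function are passed in as inputs, and all numbers are made up.

```python
import math


def select_bitrate(bitrates, cap, bc, buffer_s, tau, lam_min, lam_max, qoe_fn):
    """Pick the feasible bitrate with the highest QoE, or the lowest bitrate.

    bitrates: ascending available set; cap: upper bound from Equation (6);
    bc: estimated bandwidth; buffer_s: buffered playback time in seconds;
    qoe_fn: maps a candidate bitrate to its QoE (Equation (5)).
    """
    if bc <= 0:
        return bitrates[0]
    feasible = []
    for bs in bitrates:
        if bs > cap:                                   # Equation (6)
            continue
        predicted_t = tau * bs / bc                    # assumed download-time prediction
        predicted_buffer = buffer_s + tau - predicted_t
        if not (lam_min * tau <= predicted_buffer <= lam_max * tau):  # Equation (7)
            continue
        if bc * buffer_s < bs * tau:                   # Equation (8)
            continue
        feasible.append(bs)
    if not feasible:
        return bitrates[0]                             # no candidate: lowest bitrate
    return max(feasible, key=qoe_fn)


# Hypothetical inputs; a toy QoE that simply grows with the bitrate.
ladder = [180, 360, 540, 720, 900]
toy_qoe = math.log

pick_high_vcc = select_bitrate(ladder, 870, 600, 6.0, 2.0, 2, 10, toy_qoe)
pick_other = select_bitrate(ladder, 600, 600, 6.0, 2.0, 2, 10, toy_qoe)
pick_low_buffer = select_bitrate(ladder, 870, 600, 1.0, 2.0, 2, 10, toy_qoe)
```

With the same bandwidth estimate, the larger cap granted to a high-complexity segment lets the selector reach 720 Kbps, while a nearly empty buffer forces the fallback to the lowest bitrate.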

4. Performance Evaluation

In this section, we evaluate the proposed VCC-DASH and compare it with a classic bitrate adaptation strategy, known as LIU’s strategy [5], under the same simulation setup. The simulation results show that the two strategies perform similarly in the number of bitrate switches, the number of playback interruptions, and the buffer length. Moreover, the bitrate and the MOS of the segments selected by VCC-DASH are distinctly higher than those of LIU’s strategy, which means that VCC-DASH offers users a better QoE.

4.1. Simulation Scenarios Setup

The proposed strategy VCC-DASH is evaluated on 100 video segments with different content complexities, extracted from the video sequences Akiyo, Container, Foreman, Coastguard, Soccer, and Football and arranged according to the VCC distribution shown in Figure 6b. For a fast start, we assume that the video begins to play, and VCC-DASH comes into effect, once five segments with the lowest bitrate $B_{1}$ are in the buffer. The parameter settings are shown in Table 4.

Since LIU’s strategy [5] is one of the best-known existing DASH bitrate adaptation strategies, our simulations use it as a benchmark to investigate the performance of VCC-DASH. Both strategies adaptively match the media representation to the network conditions and pursue the same goals: fewer bitrate switches, a higher average bitrate, no buffer overflow or underflow, and no playback interruptions. Unlike LIU’s strategy, VCC-DASH differentiates segments by content complexity and considers the constraint of buffer occupancy. The two strategies are compared under the same setup, including the network bandwidth, buffer size, segment duration, content complexity, available bitrate set, etc.

4.2. Simulation Results Analysis

Under an adverse network condition (as shown in Figure 6a), where bandwidth fluctuations with high amplitudes occur frequently, our simulations compare the performance of the two strategies in terms of the selected bitrate, the QoE items (including MOS and loss), and the buffered media time. If a strategy performs well under this worst-case condition, it can also adapt to normal conditions. In the simulations, the distribution of VCC over the 100 segments is shown in Figure 6b.

(1) The selected bitrate: The bitrates selected by the two strategies are shown in Figure 6c. The average bitrate of VCC-DASH is 492.77 Kbps, which is about 8.9% higher than the 452.67 Kbps of LIU’s strategy. The statistical distribution of the bitrate is shown in Figure 7a. The bitrate selected by VCC-DASH is concentrated at 540 Kbps and 720 Kbps, while that of LIU’s strategy is concentrated at 360 Kbps and 540 Kbps. The reason is that LIU’s strategy uses a step-wise switching-up and aggressive switching-down method to change the media representation and prevent buffer underflow, which means that it switches down readily but switches up cautiously, so its bitrate is concentrated at relatively low values and its average bitrate is low. VCC-DASH directly selects the bitrate with the best QoE under the constraints of the network bandwidth and the buffer occupancy, without limiting the switching range between consecutive segments, so the network bandwidth is well used to transmit segments at higher bitrates and the selected bitrate is concentrated at higher values.

(2) QoE items: Statistics of the QoE items are shown in Table 5. The two strategies show no significant difference in the total number of switches or in the switching range, and both values are rather small. By contrast, the sum of the MOS and the sum of the QoE of VCC-DASH are clearly higher than those of LIU’s strategy.

Moreover, the distribution of MOS is shown in Figure 7b. The proportion of the subjective opinion “excellent” is as high as 86% for VCC-DASH, which is clearly higher than that of LIU’s strategy. This advantage stems from the fact that VCC-DASH collects the bitrate surpluses of the prior segments and provides them to segments with high VCC, so that the MOS of the segments with high VCC is visibly enhanced and, at the same time, the QoE over the whole video is more even. In general, the proposed VCC-DASH can improve users’ QoE and offer a more even viewing experience.

In addition to the above comparison, we add a comparative experiment under the constraint of equal transmitted bits for every requested segment, in order to further illustrate the advantage of VCC-DASH over LIU’s strategy. First, we obtain the bitrate selection decision of VCC-DASH (i.e., the bitrate sequence for the 100 segments under two network conditions). Then, in the experiment, LIU’s strategy requests and delivers each segment according to the bitrate sequence pre-defined by VCC-DASH. Thus, each segment received and played by LIU’s strategy has the same number of bits as the corresponding segment of VCC-DASH. Under this associated VCC-DASH bitrate sequence, we obtain the QoE statistics of LIU’s strategy shown in Table 6. Compared with the results in Table 5, LIU’s QoE under the associated VCC-DASH bitrate sequence is much worse than the QoE that each strategy achieves under the bitrate sequence determined by its own adaptation policy. Numerically, taking LIU’s QoE sum over the 100 segments at the associated VCC-DASH bitrate sequence (about 327.32 in Table 6) as the baseline, LIU’s QoE sum at its own optimal bitrate sequence (about 408.55 in Table 5) is 24.83% higher, and VCC-DASH’s QoE sum at its optimal bitrate sequence (about 422.44 in Table 5) is 29.06% higher. As Figure 7b and Figure 8b show in more detail, the number of excellent-level MOS values for LIU’s strategy falls from 74 to 31 when the bitrate of each segment is predefined by VCC-DASH instead of being adjusted by LIU’s strategy. Most of the lost excellent-level MOS values degrade to the good level (about 24) and the fair level (about 19). This degradation stems from the mismatch between the associated VCC-DASH bitrate sequence and LIU’s decision-making under the network bandwidth condition shown in Figure 6a. Hence, the QoE performance of LIU’s strategy is much worse than that of VCC-DASH when exactly the same bits are received.

(3) Buffered media time: The buffer occupancy is shown in Figure 6d, where both strategies keep the buffer occupancy at a proper level, without overflow or underflow. For LIU’s strategy, this results from two causes. The first is its step-wise switching-up and aggressive switching-down method, which avoids buffer underflow. The second is that the client pauses for a certain period of time before issuing the next request if the buffer occupancy is large enough to cover the maximum draining of buffered media time while fetching the segment, which prevents buffer overflow. For VCC-DASH, this results from one cause: VCC-DASH sets the upper bound $λ_{\max}$ and the lower bound $λ_{\min}$ on the buffer occupancy to prevent buffer overflow and underflow, respectively. As a result, when the network bandwidth suddenly drops from 500 Kbps to 0 Kbps at 105 s, the buffer occupancy decreases rapidly but stays above 0, which means that playback interruptions do not occur even though the network performs extremely badly. Overall, Figure 6d shows that both strategies control the filling level of the client buffer well enough to avoid overflow and underflow.

5. Conclusions

Traditional DASH bitrate adaptation strategies (e.g., LIU’s strategy) adapt only to the network bandwidth when downloading segments and cannot differentiate segments with different content complexities. Our proposed VCC-DASH strategy makes fuller use of the network bandwidth and allocates more bandwidth resources to segments with high video content complexity, thus offering users a better QoE. The simulations show that it performs well even under highly variable network throughput. Apart from content complexity, many other features of video content can be studied and integrated with DASH bitrate adaptation strategies to optimize users’ QoE in future work. Furthermore, other QoE items may be introduced to measure users’ subjective satisfaction.

Author Contributions

Conceptualization, J.D., M.Z., J.W., S.H., and X.Y.; data curation, J.D., J.W., and X.C.; formal analysis, J.D., J.W., and X.C.; funding acquisition, M.Z. and X.Y.; investigation, X.C.; methodology, J.D., M.Z., J.W., and X.Y.; project administration, X.Y.; resources, M.Z.; software, J.D., J.W., and S.H.; supervision, X.Y.; validation, X.C.; visualization, J.D., J.W., and S.H.; writing—original draft, J.D., M.Z., J.W., and X.Y.; writing—review and editing, M.Z. and X.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This research is funded by the National Natural Science Foundation of China, grant numbers 61671057 and 61941113.

Conflicts of Interest

The authors declare no conflict of interest.

References

Dubin, R.; Dvir, A.; Pele, O.; Hadar, O.; Katz, I.; Mashiach, O. Adaptation logic for HTTP dynamic adaptive streaming using geo-predictive crowdsourcing for mobile users. Multimedia Syst. 2018, 24, 19–31.
ISO/IEC 23009-1:2012: Information technology - Dynamic adaptive streaming over HTTP (DASH) - Part 1: Media presentation description and segment formats. Available online: https://www.iso.org/standard/75485.html (accessed on 29 January 2020).
Stockhammer, T. Dynamic adaptive streaming over HTTP--: Standards and design principles. In Proceedings of the Second Annual ACM Conference on Multimedia Systems, Santa Clara, CA, USA, 23–25 February 2011; pp. 133–144.
Miller, K.; Quacchio, E.; Gennari, G.; Wolisz, A. Adaptation algorithm for adaptive streaming over HTTP. In Proceedings of the 2012 19th International Packet Video Workshop (PV), Munich, Germany, 10–11 May 2012; pp. 173–178.
Liu, C.; Bouazizi, I.; Gabbouj, M. Rate adaptation for adaptive HTTP streaming. In Proceedings of the Second Annual ACM Conference on Multimedia Systems, Santa Clara, CA, USA, 23–25 February 2011; pp. 169–174.
Müller, C.; Lederer, S.; Timmerer, C. An evaluation of dynamic adaptive streaming over HTTP in vehicular environments. In Proceedings of the 4th Workshop on Mobile Video, Chapel Hill, NC, USA, 22–24 February 2012; pp. 37–42.
Huang, T.Y.; Johari, R.; McKeown, N.; Trunnell, M.; Watson, M. A buffer-based approach to rate adaptation: Evidence from a large video streaming service. ACM SIGCOMM Comput. Commun. Rev. 2015, 44, 187–198.
Kumar, V.P.M.; Mahapatra, S. Quality of Experience Driven Rate Adaptation for Adaptive HTTP Streaming. IEEE Trans. Broadcast. 2018, 64, 602–620.
Porter, T.; Peng, X.-H. An objective approach to measuring video playback quality in lossy networks using TCP. IEEE Commun. Lett. 2011, 15, 76–78.
Klaue, J.; Rathke, B.; Wolisz, A. Evalvid–A framework for video transmission and quality evaluation. In Lecture Notes in Computer Science, Proceedings of the 13th International Conference on Modelling Techniques and Tools for Computer Performance Evaluation, Urbana, IL, USA, 2–5 September 2003; Kemper, P., Sanders, W.H., Eds.; Springer: Berlin/Heidelberg, Germany, 2003; pp. 255–272.
Kim, H.J.; Choi, S.G. A study on a QoS/QoE correlation model for QoE evaluation on IPTV service. In Proceedings of the 12th International Conference on Advanced Communication Technology (ICACT), Phoenix Park, Korea, 7–10 February 2010; pp. 1377–1382.
Hu, J.; Wildfeuer, H. Use of content complexity factors in video over IP quality monitoring. In Proceedings of the 2009 International Workshop on Quality of Multimedia Experience, San Diego, CA, USA, 29–31 July 2009; pp. 216–221.
Richardson, I.E.G. H.264 and MPEG-4 Video Compression; John Wiley & Sons: Hoboken, NJ, USA, 2003; Chapter 10.
Mizoguchif, Y.; Kurosaka, T.; Bandai, M. A QoE-aware quality selection controller for HTTP adaptive streaming. In Proceedings of the 2018 IEEE International Conference on Consumer Electronics (ICCE), Las Vegas, NV, USA, 12–14 January 2018.
Huang, W.; Zhou, Y.; Xie, X.; Wu, D.; Chen, M.; Ngai, E. Buffer state is enough: Simplifying the design of QoE-aware HTTP adaptive video streaming. IEEE Trans. Broadcast. 2018, 64, 590–601.
EvalVid—A Video Quality Evaluation Tool-set. Available online: https://www.tkn.tu-berlin.de/research/evalvid (accessed on 26 January 2020).
Lekharu, A.; Kumar, S.; Sur, A.; Sarkar, A. A QoE aware LSTM based bit-rate prediction model for DASH video. In Proceedings of the 10th International Conference on Communication Systems & Networks (COMSNETS), Bengaluru, India, 3–7 January 2018; pp. 392–395.

Figure 1. DASH (dynamic adaptive streaming over HTTP) system architecture.

Figure 2. Relationship between the GOP (group of pictures)-related ratio r and the average coded bits per frame.

Figure 3. The XML-schema MPD.

Figure 4. The relationship between MOS (mean opinion score), bitrate, and VCC.

Figure 5. The implementation algorithm of VCC-DASH.

Figure 6. Simulation performances in terms of the selected bitrate and buffer occupancy.

Figure 7. Bitrate distribution vs. MOS distribution (under the independent bitrate sequences).

Figure 8. Bitrate distribution vs. MOS distribution (under an associated VCC-DASH bitrate sequence).

Table 1. The content complexity of each video.

Video	Akiyo	Container	Foreman	Coastguard	Soccer	Football
content complexity	0.70	0.65	0.75	0.80	0.90	0.90

Table 2. The sequences classification by VCC (video content complexity).

VCC	Complexity	Video
1	Low	Container, Hall-Monitor, Akiyo, News, Mother and Daughter
2	Middle	Coastguard, Foreman, Silent, Sign-Irene, Tempete
3	High	Carphone, Football, Soccer, Stephan, Rugby

Table 3. Fitting parameters for curves of different VCC.

VCC_i	a_VCCi	b_VCCi
1	0.4883	1.4461
2	0.872	−1.1834
3	1.2132	−3.598
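
The parameters in Table 3 can be turned into a small MOS estimator. The following is a minimal sketch, not the authors' implementation: it assumes the fitted curve has the logarithmic form MOS = a·ln(R) + b with the bitrate R in kbps, and it clamps the result to the 1 to 5 MOS scale. The curve form, the unit, the clamping, and the function names `estimate_mos` and `min_bitrate_for_mos` are assumptions made for illustration and should be checked against the MOS model defined earlier in the paper. The bitrate ladder is copied from Table 4.

```python
import math

# Fitting parameters (a, b) for each VCC level, copied from Table 3.
FIT_PARAMS = {
    1: (0.4883, 1.4461),
    2: (0.872, -1.1834),
    3: (1.2132, -3.598),
}

# Available bitrates in kbps, copied from Table 4.
BITRATES_KBPS = (90, 180, 360, 540, 720, 1080)


def estimate_mos(bitrate_kbps, vcc):
    """Estimate MOS under the ASSUMED model MOS = a * ln(bitrate) + b.

    The result is clamped to [1, 5]; the clamping is also an assumption.
    """
    a, b = FIT_PARAMS[vcc]
    raw = a * math.log(bitrate_kbps) + b
    return min(5.0, max(1.0, raw))


def min_bitrate_for_mos(target_mos, vcc):
    """Return the lowest available bitrate whose estimated MOS reaches
    target_mos for the given VCC level, or None if no bitrate does."""
    for bitrate in BITRATES_KBPS:
        if estimate_mos(bitrate, vcc) >= target_mos:
            return bitrate
    return None


if __name__ == "__main__":
    for vcc in sorted(FIT_PARAMS):
        row = ", ".join(f"{r}: {estimate_mos(r, vcc):.2f}" for r in BITRATES_KBPS)
        print(f"VCC {vcc} -> {row}")
```

Under these assumptions, a high-complexity segment (VCC 3) needs a higher bitrate than a low-complexity one (VCC 1) to reach the same estimated MOS, which is the effect the content complexity-aware adaptation is meant to exploit.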

Table 4. Settings of parameters.

Parameter	Value
Number of segments, n	100
Buffer size	20
The duration of the segment, $τ$	2 s
The lower bound of the buffered segments, $λ_{\min}$	3
The upper bound of the buffered segments, $λ_{\max}$	15
The segment number of bitrate surplus, ω	20
The complexity levels of the segments, VCC	1, 2, 3
The loss weight of video bitrate switching, γ₁	0.2
The loss weight of the switching range, γ₂	1
The available bitrate set, $ℜ$	(90, 180, 360, 540, 720, 1080) kbps
The maximum switching range, $R_{\max}$	11
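
For readers who want to reproduce the setup, some of the settings in Table 4 can be collected in one configuration object. The sketch below is illustrative only: the class name `SimulationSettings` and its field names are hypothetical, only a subset of the table is included, and the bitrate unit is read as kilobits per second. It converts the segment-based quantities into seconds and kilobits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimulationSettings:
    """Subset of the Table 4 settings (field names are hypothetical)."""

    num_segments: int = 100            # n
    segment_duration_s: float = 2.0    # tau, in seconds
    min_buffered_segments: int = 3     # lambda_min
    max_buffered_segments: int = 15    # lambda_max
    bitrates_kbps: tuple = (90, 180, 360, 540, 720, 1080)

    @property
    def playback_duration_s(self):
        """Total length of the streamed video in seconds."""
        return self.num_segments * self.segment_duration_s

    def buffer_bounds_s(self):
        """Lower and upper buffer bounds expressed as seconds of video."""
        return (
            self.min_buffered_segments * self.segment_duration_s,
            self.max_buffered_segments * self.segment_duration_s,
        )

    def segment_size_kbit(self, bitrate_kbps):
        """Nominal size of one segment at the given bitrate, in kilobits."""
        return bitrate_kbps * self.segment_duration_s


settings = SimulationSettings()

if __name__ == "__main__":
    print(settings.playback_duration_s)
    print(settings.buffer_bounds_s())
    print(settings.segment_size_kbit(max(settings.bitrates_kbps)))
```

With these values the simulated video lasts 200 s, and the buffer bounds correspond to 6 s and 30 s of buffered video.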

Table 5. Statistics of QoE (quality of experience) items with independent bitrate sequences.

	$\sum_{100} QoE_{i}$	$\sum_{100} MOS$	$\sum_{100} Loss$	$\sum_{100} D_{i}$	$\sum_{100} R_{i}$
VCC-DASH	422.44	427.90	6.25	12	15
LIU’s	408.55	414.75	6.20	11	14

Table 6. Statistics of QoE items with an associated VCC-DASH bitrate sequence.

	$\sum_{100} QoE_{i}$	$\sum_{100} MOS$	$\sum_{100} Loss$	$\sum_{100} D_{i}$	$\sum_{100} R_{i}$
VCC-DASH	422.44	427.90	6.25	12	15
LIU’s	327.32	333.57	6.25	12	15
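
To put the totals in Tables 5 and 6 on a common scale, the relative QoE gain of VCC-DASH over LIU’s strategy can be computed directly from the tabulated sums. The short sketch below does only that arithmetic. The dictionary keys and the helper name `relative_gain_percent` are hypothetical, and the numbers are copied from the two tables.

```python
# Total QoE over 100 segments, copied from Table 5 (independent bitrate
# sequences) and Table 6 (associated VCC-DASH bitrate sequence).
QOE_TOTALS = {
    "independent": {"VCC-DASH": 422.44, "LIU": 408.55},
    "associated": {"VCC-DASH": 422.44, "LIU": 327.32},
}


def relative_gain_percent(ours, baseline):
    """Relative improvement of `ours` over `baseline`, in percent."""
    return 100.0 * (ours - baseline) / baseline


gains = {
    scenario: relative_gain_percent(totals["VCC-DASH"], totals["LIU"])
    for scenario, totals in QOE_TOTALS.items()
}

if __name__ == "__main__":
    for scenario, gain in gains.items():
        print(f"{scenario}: {gain:.2f}% higher total QoE for VCC-DASH")
```

The gain is about 3.4% with independent bitrate sequences and about 29.1% when LIU’s strategy is evaluated on the associated VCC-DASH bitrate sequence.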

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

