Peer-Review Record

VCC-DASH: A Video Content Complexity-Aware DASH Bitrate Adaptation Strategy

Electronics 2020, 9(2), 230; https://doi.org/10.3390/electronics9020230

by Juzheng Duan, Min Zhang *, Jing Wang, Shuai Han, Xun Chen and Xiaolong Yang

Reviewer 1: Anonymous

Reviewer 2: Anonymous

Submission received: 8 January 2020 / Revised: 26 January 2020 / Accepted: 28 January 2020 / Published: 31 January 2020

(This article belongs to the Section Computer Science & Engineering)

Round 1

Reviewer 1 Report

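An algorithm for rate allocation in video transmission has been proposed. The idea is to dedicate more data to those parts of the video that are more important to the viewers. The results show QoE improvements. However, in my opinion, the experiments are not carried out in a fair context because the analyzed algorithms are not compared using the same amount of transmitted bits.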

Author Response

An algorithm for rate allocation in video transmission has been proposed. The idea is to dedicate more data to those parts of the video that are more important to the viewers. The results show QoE improvements. However, in my opinion, the experiments are not carried out in a fair context because the analyzed algorithms are not compared using the same amount of transmitted bits.

[Response]: In our experiments, we fairly compare the performance of VCC-DASH with that of LIU’s strategy in terms of bitrate selection optimization, QoE improvement, and fair buffer occupancy under the same simulation setup, in particular the same VCC distribution. Moreover, it is worth mentioning that a comparison under the same amount of transmitted bits would not be fair to either strategy, since the transmitted bits consist of the bits in transit en route and the bits buffered in the player buffer, and LIU’s strategy is far more indifferent to the bits in transit due to its rigid bitrate selection.

Reviewer 2 Report

Page 2, Figure 1: ‘devide’ should be corrected to ‘divide’

Page 3, line 96-87: please explain how the numbers of I/P/B frames are calculated: in a GOP, segment, entire sequence?

Page , line 99: I’ve checked http://trace.eas.asu.edu/yuv several times, and it’s not accessible. It’s not your fault, but please specify when you last accessed it.

Page 3, Figure 2: The plot is hardly readable; could you please use smaller markers? Please specify for what GOP structure (GOP length, number of B frames) video content complexity was calculated.

Page 4, lines 112-114: Please explain how the ‘r’ values from Figure 1 are converted into the values presented in Table 1.

Page 4, lines 131-134: It is worth mentioning that each segment contains all information needed for its decoding. It makes switching between segments/representation possible.

Page 5, Figure 3: The XML schema seems to me to be a little bit blurry. Could you improve it?

Page 5, line 155: The CIF test sequences have been used in the simulations. Is it possible to repeat them with SD or HD sequences?

Page 5, line 165+: The caption of the Figure 4 is on the next page. The plots are hardly readable, they should be enlarged.

Page 6, line 170: Please explain whether ‘two consecutive segments’ are from the same or different representations.

Page 7, line 197: What is the summation range (i) in the formula defining ‘buffer occupancy’?

Page 7, line 203: ‘\omega’ is used for the number of segments, whereas ‘w’ is used in the listing on page 9 for the same number of segments. I think the same symbol should be used in both cases.

Page 10, line 246: What test sequences are the ‘100 video segments’ from?

Page 13, line 360: http://www2.tkn.tu-berlin.de/reaearch/evalvid is not accessible.

Author Response

Page 2, Figure 1: ‘devide’ should be corrected to ‘divide’

[Response]: This spelling mistake has been corrected.

Page 3, line 96-87: please explain how the numbers of I/P/B frames are calculated: in a GOP, segment, entire sequence?

[Response]: The numbers of I/P/B frames are counted within a GOP.

Page , line 99: I’ve checked http://trace.eas.asu.edu/yuv several times, and it’s not accessible. It’s not your fault, but please specify when you last accessed it.

[Response]: As you mentioned, this website is no longer accessible. We were able to access its contents properly before Nov. 12, 2019. The same contents can now be found at http://trace.kom.aau.dk/yuv, so we have updated the link accordingly.

Page 3, Figure 2: The plot is hardly readable; could you please use smaller markers? Please specify for what GOP structure (GOP length, number of B frames) video content complexity was calculated.

[Response]: Following your suggestion, we have redrawn Figure 2. In addition, the YUV video sequences in Figure 2 have different GOP structures in terms of the number of frames per GOP and the number of B-frames between an I-frame and a P-frame within a GOP. Hence, the VCC of each video sequence is computed under that sequence's own GOP structure.

Page 4, lines 112-114: Please explain how the ‘r’ values from Figure 1 are converted into the values presented in Table 1.

[Response]: The values presented in Table 1 are the mode values of each video sequence, i.e., the most frequent of the r values obtained under different average coded bits per frame, as shown in Figure 2.
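As a minimal illustration of the mode-based summary described in this response, the sketch below picks the most frequent r value from a list of measurements. The sample values, the function name mode_of_r, and the rounding precision are hypothetical and are not taken from Figure 2 or Table 1; the response does not say how near-equal r values were binned.

```python
from collections import Counter


def mode_of_r(r_values, ndigits=1):
    """Return the most frequent r value after rounding to `ndigits` decimals.

    Rounding is an assumption made here so that near-equal measurements
    fall into the same bin; the precision is illustrative only.
    """
    rounded = [round(r, ndigits) for r in r_values]
    counts = Counter(rounded)
    # most_common(1) returns a list with one (value, count) pair.
    return counts.most_common(1)[0][0]


# Hypothetical r values for one sequence, measured at different
# average coded bits per frame.
sample_r = [0.52, 0.48, 0.51, 0.63, 0.49, 0.71]
print(mode_of_r(sample_r))
```

With this kind of summary, a single representative value per sequence is reported even when individual r values vary with the coded bits per frame.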

Page 4, lines 131-134: It is worth mentioning that each segment contains all information needed for its decoding. It makes switching between segments/representation possible.

[Response]: The decoding information related to video segments has been defined and listed in detail in the MPEG-DASH standard. Here, we do not consider it necessary to mention all the information of the Adaptation Set or Representation set, since we focus on the syntax structure and main components to describe how media content is encoded or decoded in response to an HTTP request.

Page 5, Figure 3: The XML schema seems to me to be a little bit blurry. Could you improve it?

[Response]: In fact, a Media Presentation Description file specifies many video frame attributes through its XML-related syntax structure and components. In the XML schema here, we present only the extension related to the video content complexity and omit the other attributes, which makes the schema more concise and keeps the focus on the VCC attribute.

Page 5, line 155: The CIF test sequences have been used in the simulations. Is it possible to repeat them with SD or HD sequences?

[Response]: Yes, it is.

Page 5, line 165+: The caption of the Figure 4 is on the next page. The plots are hardly readable, they should be enlarged.

[Response]: The flaw has been fixed, and we have redrawn the plots.

Page 6, line 170: Please explain whether ‘two consecutive segments’ are from the same or different representations.

[Response]: By the definition of a video segment, all video frames within a segment belong to the same representation. Two consecutive segments may therefore have the same or different representations, depending on the predicted available network bandwidth and the playback buffer at the moment they are played.

Page 7, line 197: What is the summation range (i) in the formula defining ‘buffer occupancy’?

[Response]: The summation index i ranges over [1, N].

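Page 7, line 203: ‘\omega’ is used for the number of segments, whereas ‘w’ is used in the listing on page 9 for the same number of segments. I think the same symbol should be used in both cases.

[Response]: You are right; the symbol w has the same meaning in both places.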

Page 10, line 246: What test sequences are the ‘100 video segments’ from?

[Response]: In the simulations, we extracted video segments with different content complexity from the video sequences Akiyo, Container, Foreman, Coastguard, Soccer, and Football, and re-arranged the 100 segments according to the VCC distribution shown in Figure 5b.

Page 13, line 360: http://www2.tkn.tu-berlin.de/reaearch/evalvid is not accessible.

[Response]: The link has been updated to https://www.tkn.tu-berlin.de/research/evalvid.

Round 2

Reviewer 1 Report

Obviously, two different transmission algorithms can send a different amount of data. This is reasonable. However, if this happens, you cannot provide a QoE comparison between the reconstructions using both algorithms. You must force your simulations to use exactly the same number of received bits. Please redo the experiments incorporating this constraint.

Author Response

Obviously, two different transmission algorithms can send a different amount of data. This is reasonable. However, if this happens, you cannot provide a QoE comparison between the reconstructions using both algorithms. You must force your simulations to use exactly the same number of received bits. Please redo the experiments incorporating this constraint.

[Response]:

Thank you for your comments.

In principle, the core difference between the two strategies lies in how they adaptively adjust the bitrate of video segments given the available download bandwidth and player buffer. The bitrate adjustment of VCC-DASH is more graceful and progressive than that of LIU’s strategy because VCC-DASH takes into account the bitrate surplus or balance of the buffered video segments within a window w, and selects the higher bitrate that best suits the current link bandwidth and player buffer. In contrast, LIU’s strategy adjusts the bitrate according to the ratio of the last segment’s download time to its playing time, which does not clearly reflect the real bitrate requirement of the current segment or its playback experience. Hence, for a requested segment, LIU’s strategy conservatively always selects a bitrate lower than the one matching its actual video delivery environment.
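To make the ratio-based rule mentioned in this response more concrete, the following is a rough sketch of a selector driven only by the ratio of the last segment's download time to its playing time. It is not LIU's published algorithm: the function name, the one-step up/down behaviour, and both thresholds are assumptions chosen for illustration.

```python
def ratio_based_select(bitrates, current_idx, download_time, play_time,
                       up_threshold=0.5, down_threshold=1.0):
    """Pick the next bitrate index from the last segment's timing only.

    bitrates      : available bitrates, sorted in ascending order
    current_idx   : index of the bitrate used for the last segment
    download_time : seconds taken to download the last segment
    play_time     : playback duration of the last segment in seconds
    The thresholds are hypothetical, not values from the paper.
    """
    ratio = download_time / play_time
    # Download was much faster than playback: step up one level.
    if ratio < up_threshold and current_idx < len(bitrates) - 1:
        return current_idx + 1
    # Download was slower than playback: step down one level.
    if ratio > down_threshold and current_idx > 0:
        return current_idx - 1
    # Otherwise keep the current bitrate.
    return current_idx


levels = [300, 600, 1200]  # kbit/s, hypothetical ladder
print(levels[ratio_based_select(levels, 1, download_time=0.5, play_time=2.0)])
```

A rule of this shape looks only at the most recent segment, which is the limitation the response points to when contrasting it with a decision based on the buffered segments within a window.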

To further illustrate the advantage of VCC-DASH over LIU’s strategy, we have added a comparative experiment under the constraint of equal transmitted bits for every requested segment. Using VCC-DASH, we first obtain the bitrate selection decision, i.e., the bitrate sequence for the 100 segments under the two network conditions. In the experiment, LIU’s strategy then requests and delivers each segment according to the bitrate sequence predefined by VCC-DASH, so each segment received and played by LIU’s strategy has the same number of bits as the corresponding segment under VCC-DASH. The resulting QoE statistics of LIU’s strategy under the associated VCC-DASH bitrate sequence are shown in Table 6. Compared with the results in Table 5, LIU’s QoE under the associated VCC-DASH bitrate sequence is much worse than the QoE each strategy achieves under the optimal bitrate sequence determined independently by its own adaptation policy. Numerically, LIU’s QoE sum over the 100 segments under the associated VCC-DASH bitrate sequence is about 327.32 (Table 6); LIU’s QoE sum under its own optimal bitrate sequence (about 408.55 in Table 5) is 24.83% higher than this value, and VCC-DASH’s QoE sum under its optimal bitrate sequence (about 422.44 in Table 5) is 29.06% higher. As shown in more detail in Figure 6(b) and Figure 7(b), the number of excellent-level MOS scores for LIU’s strategy drops from 74 to 31 when the segment bitrate selection changes from LIU’s own adjustment to the sequence predefined by VCC-DASH. Most of these excellent-level MOS scores degrade to the good level (about 24) and the fair level (about 19). This degradation stems from the mismatch between the associated VCC-DASH bitrate sequence and LIU’s decision-making under the network bandwidth condition shown in Figure 5(a). Hence, the QoE performance of LIU’s strategy is much worse than that of VCC-DASH when the received bits are exactly the same.
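The percentage gaps quoted in this response can be checked with a few lines of arithmetic. The sketch below uses only the three approximate QoE sums stated above; the helper name relative_gap and the choice to show both possible bases are additions made here for illustration.

```python
def relative_gap(higher, lower, base):
    """Return (higher - lower) as a percentage of `base`."""
    return (higher - lower) / base * 100.0


# Approximate QoE sums over the 100 segments, as quoted in the response.
liu_constrained = 327.32  # LIU's under the VCC-DASH bitrate sequence (Table 6)
liu_optimal = 408.55      # LIU's under its own bitrate sequence (Table 5)
vcc_optimal = 422.44      # VCC-DASH under its own bitrate sequence (Table 5)

# Gaps expressed relative to the constrained (lower) value.
gap_liu_low_base = relative_gap(liu_optimal, liu_constrained, liu_constrained)
gap_vcc_low_base = relative_gap(vcc_optimal, liu_constrained, liu_constrained)

# The same gaps expressed relative to the higher values.
gap_liu_high_base = relative_gap(liu_optimal, liu_constrained, liu_optimal)
gap_vcc_high_base = relative_gap(vcc_optimal, liu_constrained, vcc_optimal)

print(f"{gap_liu_low_base:.2f} {gap_vcc_low_base:.2f}")
print(f"{gap_liu_high_base:.2f} {gap_vcc_high_base:.2f}")
```

With these rounded sums, the quoted figures of 24.83% and 29.06% correspond, up to rounding, to gaps measured relative to the lower value of 327.32; measured relative to the higher sums, the gaps come out smaller, at roughly 20% and 23%.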

Author Response File: Author Response.pdf

Round 3

Reviewer 1 Report

OK, thank you for the new experiments and for your response.
