Search Results (48)

Search Parameters:
Keywords = bitrate streaming

24 pages, 19576 KiB  
Article
Evaluating HAS and Low-Latency Streaming Algorithms for Enhanced QoE
by Syed Uddin, Michał Grega, Mikołaj Leszczuk and Waqas ur Rahman
Electronics 2025, 14(13), 2587; https://doi.org/10.3390/electronics14132587 - 26 Jun 2025
Viewed by 983
Abstract
The demand for multimedia traffic over the Internet is growing exponentially. HTTP adaptive streaming (HAS) is the leading video delivery system that delivers high-quality video to the end user. The adaptive bitrate (ABR) algorithms running on the HTTP client select the highest feasible video quality by adjusting the quality according to fluctuating network conditions. Recently, low-latency ABR algorithms have been introduced to reduce the end-to-end latency commonly experienced in HAS. However, comprehensive studies of low-latency algorithms remain limited. This paper investigates the effectiveness of low-latency streaming algorithms in maintaining a high quality of experience (QoE) while minimizing playback delay. We evaluate these algorithms in the context of both Dynamic Adaptive Streaming over HTTP (DASH) and the Common Media Application Format (CMAF), with a particular focus on the impact of chunked encoding and transfer mechanisms on the QoE. We perform both objective and subjective evaluations of the low-latency algorithms and compare their performance with traditional DASH-based ABR algorithms across multiple QoE metrics, various network conditions, and diverse content types. The results demonstrate that low-latency algorithms consistently deliver high video quality across various content types and network conditions, whereas traditional ABR algorithms exhibit performance variability under fluctuating network conditions and diverse content characteristics. Although traditional ABR algorithms download higher-quality segments in stable network environments, their effectiveness declines significantly under unstable conditions. Furthermore, the low-latency algorithms maintained a high quality of experience regardless of segment duration. In contrast, the performance of traditional algorithms varied significantly with changes in segment duration. 
In summary, the results underscore that no single algorithm consistently achieves optimal performance across all experimental conditions. Performance varies depending on network stability, content characteristics, and segment duration, highlighting the need for adaptive strategies that can dynamically respond to varying streaming environments. Full article
(This article belongs to the Special Issue Video Streaming Service Solutions)
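The throughput-driven quality selection that traditional ABR clients perform can be sketched as follows. This is an illustrative rule of thumb, not one of the algorithms evaluated in the paper; the rendition ladder and safety margin are hypothetical.

```python
# Illustrative throughput-based ABR rule: pick the highest rendition whose
# bitrate fits within a safety fraction of the measured network throughput.

def select_bitrate(throughput_kbps, ladder_kbps, safety=0.8):
    """Return the highest rendition bitrate that fits the throughput budget."""
    budget = throughput_kbps * safety
    feasible = [b for b in sorted(ladder_kbps) if b <= budget]
    return feasible[-1] if feasible else min(ladder_kbps)

ladder = [300, 750, 1200, 2400, 4800]  # hypothetical rendition ladder (kbps)
print(select_bitrate(3500, ladder))  # 2400: 4800 exceeds 0.8 * 3500 = 2800
print(select_bitrate(200, ladder))   # 300: fall back to the lowest rendition
```

Low-latency variants must make this decision with far less buffered data in hand, which is one reason their behavior under fluctuating bandwidth differs from the traditional rule above.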

31 pages, 1200 KiB  
Article
Power-Efficient UAV Positioning and Resource Allocation in UAV-Assisted Wireless Networks for Video Streaming with Fairness Consideration
by Zaheer Ahmed, Ayaz Ahmad, Muhammad Altaf and Mohammed Ahmed Hassan
Drones 2025, 9(5), 356; https://doi.org/10.3390/drones9050356 - 7 May 2025
Viewed by 845
Abstract
This work proposes a power-efficient framework for adaptive video streaming in UAV-assisted wireless networks specially designed for disaster-hit areas where existing base stations are nonfunctional. Delivering high-quality videos requires higher video rates and more resources, which leads to increased power consumption. With the increasing demand for mobile video, efficient bandwidth allocation becomes essential. In shared networks, users with lower bitrates experience poor video quality when high-bitrate users occupy most of the bandwidth, leading to a degraded and unfair user experience. Additionally, frequent video rate switching can significantly impact user experience, making smooth transitions between video rates essential. The aim of this research is to maximize the overall users’ quality of experience in power-efficient adaptive video streaming through fair distribution and smooth transition of video rates. The joint optimization includes power minimization, efficient resource allocation, i.e., transmit power and bandwidth, and efficient two-dimensional positioning of the UAV while meeting system constraints. The formulated problem is non-convex and difficult to solve with conventional methods. Therefore, to avoid the curse of complexity, the block coordinate descent method, successive convex approximation technique, and an efficient iterative algorithm are applied. Extensive simulations are performed to verify the effectiveness of the proposed solution method. The simulation results reveal that the proposed method outperforms equal allocation by 95–97%, random allocation by 77–89%, and joint allocation schemes by 17–40%. Full article

19 pages, 5737 KiB  
Article
Improving the Quality of Experience of Video Streaming Through a Buffer-Based Adaptive Bitrate Algorithm and Gated Recurrent Unit-Based Network Bandwidth Prediction
by Jeonghun Woo, Seungwoo Hong, Donghyun Kang and Donghyeok An
Appl. Sci. 2024, 14(22), 10490; https://doi.org/10.3390/app142210490 - 14 Nov 2024
Cited by 1 | Viewed by 2530
Abstract
With the evolution of cellular networks and wireless-local-area-network-based communication technologies, services for smart device users have appeared. With the popularity of 4G and 5G, smart device users can now consume larger bandwidths than before. Consequently, the demand for various services, such as streaming, online games, and video conferences, has increased. For improved quality of experience (QoE), streaming services utilize adaptive bitrate (ABR) algorithms to handle network bandwidth variations. ABR algorithms use network bandwidth history for future network bandwidth prediction, allowing them to perform efficiently when network bandwidth fluctuations are minor. However, in environments with frequent network bandwidth changes, such as wireless networks, the QoE of video streaming often degrades because of inaccurate predictions of future network bandwidth. To address this issue, we utilize the gated recurrent unit, a time series prediction model, to predict the network bandwidth accurately. We then propose a buffer-based ABR streaming technique that selects optimized video-quality settings on the basis of the predicted bandwidth. The proposed algorithm was evaluated on a dataset provided by Zeondo by categorizing instances of user mobility into walking, bus, and train scenarios. The proposed algorithm improved the QoE by approximately 11% compared with the existing buffer-based ABR algorithm in various environments. Full article
(This article belongs to the Special Issue Multimedia Systems Studies)
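The buffer-based side of the approach above can be sketched with a minimal BBA-style mapping from buffer occupancy to quality level. This is a generic illustration under assumed reservoir/cushion values, not the paper's algorithm, which additionally feeds a GRU bandwidth forecast into the choice.

```python
# Minimal buffer-based rate map in the spirit of BBA-style ABR algorithms:
# low buffer -> lowest quality (avoid rebuffering), full buffer -> top
# quality, linear interpolation in between.

def buffer_based_level(buffer_s, reservoir_s=5.0, cushion_s=20.0, n_levels=5):
    """Map buffer occupancy (seconds) to a quality level index in [0, n_levels)."""
    if buffer_s <= reservoir_s:
        return 0                      # protect against rebuffering
    if buffer_s >= reservoir_s + cushion_s:
        return n_levels - 1           # buffer full: stream the top rendition
    frac = (buffer_s - reservoir_s) / cushion_s
    return int(frac * (n_levels - 1))

print(buffer_based_level(3.0))   # 0
print(buffer_based_level(15.0))  # 2
print(buffer_based_level(30.0))  # 4
```

A predicted-bandwidth term, as in the paper, would typically cap this buffer-derived level so the client never requests a rendition the forecast says the network cannot sustain.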

20 pages, 1346 KiB  
Article
MNCATM: A Multi-Layer Non-Uniform Coding-Based Adaptive Transmission Method for 360° Video
by Xiang Li, Junfeng Nie, Xinmiao Zhang, Chengrui Li, Yichen Zhu, Yang Liu, Kun Tian and Jia Guo
Electronics 2024, 13(21), 4200; https://doi.org/10.3390/electronics13214200 - 26 Oct 2024
Viewed by 1079
Abstract
With the rapid development of multimedia services and smart devices, 360-degree video has enhanced the user viewing experience, ushering in a new era of immersive human–computer interaction. These technologies are increasingly integrated into everyday life, including gaming, education, and healthcare. However, the uneven spatiotemporal distribution of wireless resources presents significant challenges for the transmission of ultra-high-definition 360-degree video streaming. To address this issue, this paper proposes a multi-layer non-uniform coding-based adaptive transmission method for 360° video (MNCATM). This method optimizes video caching and transmission by dividing non-uniform tiles and leveraging users’ dynamic field of view (FoV) information and the multi-bitrate characteristics of video content. First, the video transmission process is formalized and modeled, and an adaptive transmission optimization framework for non-uniform video is proposed. Based on this, the optimization problem addressed in the paper is formulated, and an algorithm is proposed to solve it. Simulation experiments demonstrate that the proposed method, MNCATM, outperforms existing transmission schemes in terms of bandwidth utilization and user quality of experience (QoE). MNCATM can effectively utilize network bandwidth, reduce latency, improve transmission efficiency, and maximize user experience quality. Full article

18 pages, 4103 KiB  
Article
Content-Adaptive Bitrate Ladder Estimation in High-Efficiency Video Coding Utilizing Spatiotemporal Resolutions
by Jelena Šuljug and Snježana Rimac-Drlje
Electronics 2024, 13(20), 4049; https://doi.org/10.3390/electronics13204049 - 15 Oct 2024
Viewed by 1233
Abstract
The constant increase in multimedia Internet traffic in the form of video streaming requires new solutions for efficient video coding to save bandwidth and network resources. HTTP adaptive streaming (HAS), the most widely used solution for video streaming, allows the client to adaptively select the bitrate according to the transmission conditions. For this purpose, multiple presentations of the same video content are generated on the video server, containing video sequences encoded at different bitrates with resolution adjustment to achieve the best Quality of Experience (QoE). This set of bitrate–resolution pairs is called a bitrate ladder. In addition to the traditional one-size-fits-all scheme for the bitrate ladder, context-aware solutions have recently been proposed that enable optimal bitrate–resolution pairs for video sequences of different complexity. However, these solutions use only spatial resolution for optimization, while the selection of the optimal combination of spatial and temporal resolution for a given bitrate has not been sufficiently investigated. This paper proposes bitrate ladder optimization that considers the spatiotemporal features of video sequences and uses the optimal spatial and temporal resolution for the given video content complexity. Optimizing along two dimensions of resolution significantly increases the complexity of the problem, and the approach of intensive encoding at all spatial and temporal resolutions over a wide range of bitrates for each video sequence is not feasible in real time. To reduce this complexity, we propose data augmentation using a neural network (NN)-based model. To train the NN model, we used seven video sequences of different content complexity, encoded with the HEVC encoder at five different spatial resolutions (SR) up to 4K. All video sequences were also encoded at four frame rates up to 120 fps, representing different temporal resolutions (TR). 
The Structural Similarity Index Measure (SSIM) is used as an objective video quality metric. After data augmentation, we propose NN models that estimate the optimal TR and bitrate values as switching points to a higher SR. These results can be further used as input parameters for bitrate ladder construction for video sequences of a certain complexity. Full article
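The "switching point" idea above can be illustrated with a toy search over rate–quality samples: the lowest bitrate at which a higher spatial resolution overtakes a lower one in SSIM. The measurements below are made up for illustration; the paper estimates such points with a trained NN model rather than by exhaustive encoding.

```python
# Hypothetical bitrate-ladder switching point: given SSIM-vs-bitrate samples
# for two spatial resolutions, find the lowest bitrate at which the higher
# resolution matches or beats the lower one.

def switching_point(low_res, high_res):
    """low_res/high_res: dicts mapping bitrate (kbps) -> SSIM at that rate."""
    for rate in sorted(set(low_res) & set(high_res)):
        if high_res[rate] >= low_res[rate]:
            return rate
    return None  # the higher resolution never wins in the sampled range

ssim_720p = {1000: 0.93, 2000: 0.95, 4000: 0.96}   # made-up measurements
ssim_1080p = {1000: 0.90, 2000: 0.94, 4000: 0.97}
print(switching_point(ssim_720p, ssim_1080p))  # 4000
```

At low bitrates the lower resolution wins (fewer pixels, fewer compression artifacts); past the crossover, the extra resolution pays off — which is exactly the structure a content-adaptive ladder exploits.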

13 pages, 6598 KiB  
Article
Region-of-Interest Based Coding Scheme for Live Videos
by Xiuxin Dou, Xixin Cao and Xianguo Zhang
Appl. Sci. 2024, 14(9), 3823; https://doi.org/10.3390/app14093823 - 30 Apr 2024
Cited by 1 | Viewed by 1861
Abstract
In this paper, we introduce a novel rate control scheme specifically tailored for live broadcasting scenarios. Notably, in high-definition live transmissions of sports events and video game competitions, which typically exceed 1080p resolution and run at frame rates of 60 fps or higher, the transcoding speed of encoders often becomes a limiting factor, leading to streams with substantial bitrates but unsatisfactory quality metrics. To enhance the overall Quality of Service (QoS) without increasing the bitrate, it is essential to improve the quality of Regions of Interest (ROI). Our proposed solution presents an ROI-based rate reservoir model that leverages Convolutional Neural Networks (CNNs) to predict rate control parameters. This approach aims to optimize the bitrate allocation within high-bitrate live broadcasts, thus enhancing the image quality within ROIs. Experimental outcomes demonstrate that this algorithm increases the bitrate by no more than 5%, effectively redistributing the reduced bitrate across the entire Group of Pictures (GOP). As a result, it ensures a gradual decrease in the quality of Regions of Uninterest (ROU), thereby maintaining a balanced quality experience throughout the broadcasted content. Full article
(This article belongs to the Special Issue AI for Video Compression and Its Applications)

18 pages, 3164 KiB  
Article
PixRevive: Latent Feature Diffusion Model for Compressed Video Quality Enhancement
by Weiran Wang, Minge Jing, Yibo Fan and Wei Weng
Sensors 2024, 24(6), 1907; https://doi.org/10.3390/s24061907 - 16 Mar 2024
Cited by 2 | Viewed by 2646
Abstract
In recent years, the rapid prevalence of high-definition video in Internet of Things (IoT) systems has been directly facilitated by advances in imaging sensor technology. To adapt to limited uplink bandwidth, most media platforms opt to compress videos to bitrate streams for transmission. However, this compression often leads to significant texture loss and artifacts, which severely degrade the Quality of Experience (QoE). We propose a latent feature diffusion model (LFDM) for compressed video quality enhancement, which comprises a compact edge latent feature prior network (ELPN) and a conditional noise prediction network (CNPN). Specifically, we first pre-train ELPNet to construct a latent feature space that captures rich detail information for representing sharpness latent variables. Second, we incorporate these latent variables into the prediction network to iteratively guide the generation direction, thus resolving the problem that the direct application of diffusion models to temporal prediction disrupts inter-frame dependencies, thereby completing the modeling of temporal correlations. Lastly, we innovatively develop a Grouped Domain Fusion module that effectively addresses the challenges of diffusion distortion caused by naive cross-domain information fusion. Comparative experiments on the MFQEv2 benchmark validate our algorithm’s superior performance in terms of both objective and subjective metrics. By integrating with codecs and image sensors, our method can provide higher video quality. Full article
(This article belongs to the Section Sensing and Imaging)

17 pages, 4019 KiB  
Article
QoE-Based Performance Comparison of AVC, HEVC, and VP9 on Mobile Devices with Additional Influencing Factors
by Omer Nawaz, Markus Fiedler and Siamak Khatibi
Electronics 2024, 13(2), 329; https://doi.org/10.3390/electronics13020329 - 12 Jan 2024
Cited by 4 | Viewed by 2515
Abstract
While current video quality assessment research predominantly revolves around resolutions of 4K and beyond, targeted at ultra-high-definition (UHD) displays, effective video quality for mobile video streaming remains primarily within the range of 480p to 1080p. In this study, we conducted a comparative analysis of the quality of experience (QoE) for widely implemented video codecs on mobile devices, specifically Advanced Video Coding (AVC), its successor High-Efficiency Video Coding (HEVC), and Google’s VP9. Our choice of 720p video sequences from a newly developed database, all with identical bitrates, aimed to maintain a manageable subjective assessment duration, capped at 35–40 min. To mimic real-time network conditions, we generated stimuli by streaming original video clips over a controlled emulated setup, subjecting them to eight different packet-loss scenarios. We evaluated the quality and structural similarity of the distorted video clips using objective metrics, including the Video Quality Metric (VQM), Peak Signal-to-Noise Ratio (PSNR), Video Multi-Method Assessment Fusion (VMAF), and Multi-Scale Structural Similarity Index (MS-SSIM). Subsequently, we collected subjective ratings through a custom mobile application developed for Android devices. Our findings revealed that VMAF accurately represented the degradation in video quality compared to other metrics. Moreover, in most cases, HEVC exhibited an advantage over both AVC and VP9 under low packet-loss scenarios. However, it is noteworthy that in our test cases, AVC outperformed HEVC and VP9 in scenarios with high packet loss, based on both subjective and objective assessments. Our observations further indicate that user preferences for the presented content contributed to video quality ratings, emphasizing the importance of additional factors that influence the perceived video quality of end users. Full article
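Of the objective metrics listed above, PSNR is the simplest to state precisely: 10·log10(MAX²/MSE) over the pixel differences. A pure-Python sketch on toy 8-bit pixel lists (not the authors' measurement pipeline):

```python
# PSNR over 8-bit samples: higher is better, identical frames -> infinity.
import math

def psnr(ref, dist, max_val=255.0):
    """Peak signal-to-noise ratio (dB) between two equal-length pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, dist)) / len(ref)
    if mse == 0:
        return float("inf")  # identical signals
    return 10.0 * math.log10(max_val ** 2 / mse)

ref = [52, 55, 61, 59]    # toy "reference frame"
dist = [52, 54, 60, 59]   # toy "distorted frame" (MSE = 0.5)
print(round(psnr(ref, dist), 2))  # 51.14
```

Perceptual metrics such as VMAF and MS-SSIM go beyond this pixel-wise error, which is why the study found VMAF tracked the subjective ratings more closely.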

17 pages, 2727 KiB  
Article
Adaptive Streaming Transmission Optimization Method Based on Three-Dimensional Caching Architecture and Environment Awareness in High-Speed Rail
by Jia Guo, Yexuan Zhu, Jinqi Zhu, Fan Shen, Hui Gao and Ye Tian
Electronics 2024, 13(1), 41; https://doi.org/10.3390/electronics13010041 - 20 Dec 2023
Cited by 1 | Viewed by 1380
Abstract
In high-mobility scenarios, a user’s media experience is severely constrained by the difficulty of network channel prediction, the instability of network quality, and other problems caused by the user’s fast movement, frequent base station handovers, the Doppler effect, etc. To this end, this paper proposes a video adaptive transmission architecture based on three-dimensional caching. In the temporal dimension, video data are cached to different base stations, and in the spatial dimension, video data are cached to base stations, high-speed trains, and clients, thus constructing a multilevel caching architecture based on spatio-temporal attributes. Then, this paper mathematically models the media stream transmission process and summarizes the optimization problems that need to be solved. To solve them, this paper proposes three optimization algorithms, namely, the placement algorithm based on three-dimensional caching, the video content selection algorithm for caching, and the bitrate selection algorithm. Finally, this paper builds a simulation system, which shows that the proposed scheme is more suitable for high-speed mobile networks, with better and more stable performance. Full article

21 pages, 4136 KiB  
Article
Deep Reinforcement Learning-Based Approach for Video Streaming: Dynamic Adaptive Video Streaming over HTTP
by Naima Souane, Malika Bourenane and Yassine Douga
Appl. Sci. 2023, 13(21), 11697; https://doi.org/10.3390/app132111697 - 26 Oct 2023
Cited by 10 | Viewed by 5022
Abstract
Dynamic adaptive video streaming over HTTP (DASH) plays a crucial role in delivering video across networks. Traditional adaptive bitrate (ABR) algorithms adjust video segment quality based on network conditions and buffer occupancy. However, these algorithms rely on fixed rules, making it challenging to achieve optimal decisions considering the overall context. In this paper, we propose a novel deep-reinforcement-learning-based approach for DASH streaming, with the primary focus of maintaining consistent perceived video quality throughout the streaming session to enhance user experience. Our approach optimizes quality of experience (QoE) by dynamically controlling the quality distance factor between consecutive video segments. We evaluate our approach through a comprehensive simulation model encompassing diverse wireless network environments and various video sequences. We also conduct a comparative analysis with state-of-the-art methods. The experimental results demonstrate significant improvements in QoE, ensuring users enjoy stable, high-quality video streaming sessions. Full article
(This article belongs to the Section Computing and Artificial Intelligence)
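DRL-based ABR agents like the one above are typically trained against a linear QoE reward: per-segment quality, minus a rebuffering penalty, minus a smoothness penalty on quality jumps between consecutive segments. The sketch below uses commonly cited coefficient values; they are assumptions for illustration, not the coefficients or reward used in this paper.

```python
# Linear QoE reward often used to train DRL-based ABR agents (coefficients
# mu and tau are hypothetical): quality - rebuffering - quality switching.

def qoe_reward(bitrates_kbps, rebuffer_s, mu=4.3, tau=1.0):
    """Sum per-segment reward over a session; higher is better."""
    total = 0.0
    for i, (rate, stall) in enumerate(zip(bitrates_kbps, rebuffer_s)):
        total += rate / 1000.0          # quality term (Mbps as a proxy)
        total -= mu * stall             # rebuffering penalty (seconds stalled)
        if i > 0:                       # smoothness penalty on quality jumps
            total -= tau * abs(rate - bitrates_kbps[i - 1]) / 1000.0
    return total

# One 0.5 s stall and one 1.2 Mbps quality jump eat into the quality sum.
print(qoe_reward([1200, 2400, 2400], [0.0, 0.5, 0.0]))  # 2.65
```

The smoothness term is the part this paper's quality-distance control targets: keeping consecutive segments close in quality directly reduces that penalty.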

18 pages, 683 KiB  
Article
Intelligent Video Streaming at Network Edge: An Attention-Based Multiagent Reinforcement Learning Solution
by Xiangdong Tang, Fei Chen and Yunlong He
Future Internet 2023, 15(7), 234; https://doi.org/10.3390/fi15070234 - 3 Jul 2023
Cited by 3 | Viewed by 2196
Abstract
Video viewing is currently the primary form of entertainment for modern people due to the rapid development of mobile devices and 5G networks. The combination of pervasive edge devices and adaptive bitrate streaming technologies can lessen the effects of network changes, boosting user quality of experience (QoE). Although edge servers can offer near-end services to local users, their restricted capacity makes it challenging to accommodate a large number of mobile users in a dynamic environment while maximizing long-term QoE. We are motivated to integrate user allocation and bitrate adaptation into one optimization objective and propose a multiagent reinforcement learning method combined with an attention mechanism to solve the problem of multiple edge servers cooperatively serving users. To tackle the edge user allocation problem, our method, attention-based multiagent reinforcement learning (AMARL), optimizes in two directions, i.e., maximizing the QoE of users and minimizing the number of leased edge servers. Through comparative experiments, we demonstrate the superiority of the proposed solution in various network configurations. Full article
(This article belongs to the Special Issue Edge and Fog Computing for the Internet of Things)

13 pages, 2690 KiB  
Article
E-Ensemble: A Novel Ensemble Classifier for Encrypted Video Identification
by Syed M. A. H. Bukhari, Waleed Afandi, Muhammad U. S. Khan, Tahir Maqsood, Muhammad B. Qureshi, Muhammad A. B. Fayyaz and Raheel Nawaz
Electronics 2022, 11(24), 4076; https://doi.org/10.3390/electronics11244076 - 8 Dec 2022
Cited by 2 | Viewed by 1997
Abstract
In recent years, video identification within encrypted network traffic has gained popularity for many reasons. For example, a government may want to track what content is being watched by its citizens, or businesses may want to block certain content for productivity. Many such reasons advocate the need to track users on the internet. However, with the introduction of the Secure Sockets Layer (SSL) and Transport Layer Security (TLS), it has become difficult to analyze traffic. In addition, dynamic adaptive streaming over HTTP (DASH), which creates abnormalities due to variable-bitrate (VBR) encoding, makes it difficult for researchers to identify videos in internet traffic. The default quality settings in browsers automatically adjust the quality of streaming videos depending on the network load. These auto-quality settings further increase the challenge of video detection. This paper presents a novel ensemble classifier, E-Ensemble, which overcomes the abnormalities in video identification in encrypted network traffic. To achieve this, three different classifiers are combined using two combination strategies: the hard-level and soft-level combinations. To verify the performance of the proposed classifier, the classifiers were trained on a video dataset collected over one month and tested on a separate video dataset captured over 20 days at a different date and time. The soft-level combination of classifiers showed more stable results in handling abnormalities in the dataset than the hard-level combination. Furthermore, the soft-level classifier combination technique outperformed the hard-level combination with a high accuracy of 81.81%, even in the auto-quality mode. Full article
(This article belongs to the Special Issue Feature Papers in "Networks" Section)
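The hard-level vs. soft-level combination idea can be sketched with generic voting (this is standard ensemble voting, not the paper's specific classifiers): hard voting takes the majority class label, while soft voting averages per-class probabilities before picking the argmax.

```python
# Hard voting vs. soft voting over three classifiers and two classes.
from collections import Counter

def hard_vote(labels):
    """Majority vote over predicted class labels."""
    return Counter(labels).most_common(1)[0][0]

def soft_vote(prob_rows):
    """prob_rows: one probability vector per classifier, same class order."""
    n = len(prob_rows)
    avg = [sum(col) / n for col in zip(*prob_rows)]
    return max(range(len(avg)), key=avg.__getitem__)

# Two classifiers weakly favor class 1, but the lone dissenter is very
# confident in class 0 -- so soft voting flips the hard-vote decision.
probs = [[0.9, 0.1], [0.45, 0.55], [0.48, 0.52]]
print(hard_vote([p.index(max(p)) for p in probs]))  # 1
print(soft_vote(probs))                             # 0
```

This confidence-weighting is exactly why soft combination tends to be more stable against abnormal samples, as the paper's results reflect.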

15 pages, 2653 KiB  
Article
HEVC Based Frame Interleaved Coding Technique for Stereo and Multi-View Videos
by Bruhanth Mallik, Akbar Sheikh-Akbari, Pooneh Bagheri Zadeh and Salah Al-Majeed
Information 2022, 13(12), 554; https://doi.org/10.3390/info13120554 - 25 Nov 2022
Cited by 2 | Viewed by 3421
Abstract
The standard HEVC codec and its extension for coding multiview videos, known as MV-HEVC, have proven to deliver improved visual quality compared to their predecessor, H.264/MPEG-4 AVC’s multiview extension H.264-MVC, at the same frame resolution with up to 50% bitrate savings. MV-HEVC’s framework is similar to that of H.264-MVC, which uses a multi-layer coding approach. Hence, MV-HEVC requires all frames from other reference layers to be decoded prior to decoding a new layer. Thus, the multi-layer coding architecture becomes a bottleneck when it comes to quicker frame streaming across different views. In this paper, an HEVC-based Frame Interleaved Stereo/Multiview Video Codec (HEVC-FISMVC) that uses a single-layer encoding approach to encode stereo and multiview video sequences is presented. The frames of stereo or multiview video sequences are interleaved in such a way that encoding the resulting monoscopic video stream maximizes the exploitation of temporal, inter-view, and cross-view correlations, thus improving the overall coding efficiency. The coding performance of the proposed HEVC-FISMVC codec is assessed and compared with that of the standard MV-HEVC for three standard multi-view video sequences, namely: “Poznan_Street”, “Kendo” and “Newspaper1”. Experimental results show that the proposed codec provides more substantial coding gains than the anchor MV-HEVC for coding both stereo and multi-view video sequences. Full article

19 pages, 12396 KiB  
Article
Intelligent Caching for Mobile Video Streaming in Vehicular Networks with Deep Reinforcement Learning
by Zhaohui Luo and Minghui Liwang
Appl. Sci. 2022, 12(23), 11942; https://doi.org/10.3390/app122311942 - 23 Nov 2022
Cited by 5 | Viewed by 2084
Abstract
Caching-enabled multi-access edge computing (MEC) has attracted wide attention as a means to support future intelligent vehicular networks, especially for delivering high-definition videos in the internet of vehicles with limited backhaul capacity. However, factors such as the constrained storage capacity of MEC servers and the mobility of vehicles pose challenges to caching reliability, particularly for supporting multiple-bitrate video streaming caching while achieving considerable quality of experience (QoE). Motivated by these challenges, in this paper, we propose an intelligent caching strategy that takes into account vehicle mobility, time-varying content popularity, and backhaul capability to effectively improve the QoE of vehicle users. First, based on the mobile video mean opinion score (MV-MOS), we designed an average download percentage (ADP) weighted QoE evaluation model. Then, the video content caching problem is formulated as a Markov decision process (MDP) to maximize the ADP-weighted MV-MOS. Since prior knowledge of video content popularity and channel state information may not be available at the roadside unit in practical scenarios, we propose a deep reinforcement learning (DRL)-based caching strategy that solves the problem while achieving a maximum ADP-weighted MV-MOS. To accelerate its convergence speed, we further integrate prioritized experience replay, dueling, and double deep Q-network techniques, which improve the performance of the DRL algorithm. Numerical results demonstrate that the proposed DRL-based caching strategy significantly improves QoE and achieves better video delivery reliability compared to existing non-learning approaches. Full article
(This article belongs to the Section Electrical, Electronics and Communications Engineering)
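A non-learning baseline of the kind such DRL caching strategies are compared against can be sketched as a greedy popularity-based cache fill. This is purely illustrative (the item names, counts, and sizes are made up); it ignores the mobility and channel dynamics that the DRL approach is designed to handle.

```python
# Greedy popularity-based edge cache: keep the most-requested video items
# that fit within the server's storage budget.

def popularity_cache(request_counts, sizes, capacity):
    """Return the set of items to cache, most popular first, within capacity."""
    cached, used = set(), 0
    for item in sorted(request_counts, key=request_counts.get, reverse=True):
        if used + sizes[item] <= capacity:
            cached.add(item)
            used += sizes[item]
    return cached

counts = {"v1": 120, "v2": 45, "v3": 300, "v4": 10}  # hypothetical popularity
sizes = {"v1": 2, "v2": 1, "v3": 2, "v4": 1}         # item sizes (GB)
print(sorted(popularity_cache(counts, sizes, capacity=4)))  # ['v1', 'v3']
```

Such a static policy reacts poorly when popularity shifts over time or vehicles move between road side units, which is the gap the MDP formulation and DRL agent address.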

22 pages, 29561 KiB  
Article
HoloKinect: Holographic 3D Video Conferencing
by Stephen Siemonsma and Tyler Bell
Sensors 2022, 22(21), 8118; https://doi.org/10.3390/s22218118 - 23 Oct 2022
Cited by 8 | Viewed by 3878
Abstract
Recent world events have caused a dramatic rise in the use of video conferencing solutions such as Zoom and FaceTime. Although 3D capture and display technologies are becoming common in consumer products (e.g., Apple iPhone TrueDepth sensors, Microsoft Kinect devices, and Meta Quest VR headsets), 3D telecommunication has not yet seen any appreciable adoption. Researchers have made great progress in developing advanced 3D telepresence systems, but often with burdensome hardware and network requirements. In this work, we present HoloKinect, an open-source, user-friendly, and GPU-accelerated platform for enabling live, two-way 3D video conferencing on commodity hardware and a standard broadband internet connection. A Microsoft Azure Kinect serves as the capture device and a Looking Glass Portrait multiscopically displays the final reconstructed 3D mesh for a hologram-like effect. HoloKinect packs color and depth information into a single video stream, leveraging multiwavelength depth (MWD) encoding to store depth maps in standard RGB video frames. The video stream is compressed with highly optimized and hardware-accelerated video codecs such as H.264. A search of the depth and video encoding parameter space was performed to analyze the quantitative and qualitative losses resulting from HoloKinect’s lossy compression scheme. Visual results were acceptable at all tested bitrates (3–30 Mbps), while the best results were achieved with higher video bitrates and full 4:4:4 chroma sampling. RMSE values of the recovered depth measurements were low across all settings permutations. Full article
(This article belongs to the Special Issue Kinect Sensor and Its Application)
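To see why storing depth in video frames is non-trivial, consider the naive alternative to HoloKinect's MWD encoding: splitting a 16-bit depth value across two 8-bit channels. The sketch below is illustrative only; it round-trips losslessly, but a lossy codec would corrupt the high byte and cause large depth errors, which is precisely what the sinusoid-based MWD representation is designed to avoid.

```python
# Naive 16-bit depth packed into two 8-bit channels (illustration only;
# NOT HoloKinect's MWD scheme, which encodes depth as sinusoidal phase so
# it degrades gracefully under lossy video compression).

def pack_depth(d16):
    """Split a 16-bit depth sample into (high byte, low byte)."""
    return (d16 >> 8) & 0xFF, d16 & 0xFF

def unpack_depth(hi, lo):
    """Reassemble the 16-bit depth sample from its two bytes."""
    return (hi << 8) | lo

depth = 0x2F5A                         # a hypothetical depth sample
hi, lo = pack_depth(depth)
print(hi, lo)                          # 47 90
print(unpack_depth(hi, lo) == depth)   # True
```

A one-unit codec error in `hi` shifts the recovered depth by 256 units, whereas the same error in `lo` shifts it by only 1 — the asymmetry that phase-based encodings like MWD smooth out.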
