What is it about?
Real-time video communication is becoming increasingly important. However, packet loss is prevalent, and retransmitting lost packets, especially over long-latency networks, causes visual stalls. Previous solutions perform suboptimally: they either add redundancy before sending the data, which wastes bitrate when no packet is lost, or fail to prevent video freezes when the redundancy is insufficient. User studies confirm that both a bitrate decrease and a video freeze significantly damage users' Quality of Experience (QoE).

Through a user study comparing different artifacts during a quality-drop period, we find that a moderate quality drop is preferred over a video freeze during packet loss. Inspired by this, we propose a new solution that trains a neural-network autoencoder to optimize frame quality under different packet loss rates. Our insight is that such training produces a Data Scalable codec, whose quality increases with each newly arriving packet and reaches its highest quality when no packet is lost. Specifically, when any x encoded bytes of a frame arrive, the decoded quality is close to what it would have been had the whole frame been encoded with x bytes in the first place. Thus, unless all packets are lost, our approach causes a moderate quality drop instead of a video freeze during packet loss. Finally, we identify the technical challenges remaining in this approach and point out future opportunities.
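To make the idea concrete, here is a minimal, illustrative sketch of how such loss-aware training could look. It is not the authors' implementation: the class name, network sizes, packet-chunking of the latent, and the MSE objective are all assumptions made for illustration. The key point it demonstrates is training the decoder to reconstruct a frame from whatever subset of "packets" (latent chunks) survives a randomly sampled loss rate.

```python
import torch
import torch.nn as nn

class DataScalableAutoencoder(nn.Module):
    """Toy autoencoder whose latent code is split into packet-sized chunks.

    Chunks simulated as lost are zeroed out before decoding, so the decoder
    learns to reconstruct a frame from whichever packets actually arrive.
    """
    def __init__(self, frame_dim=3 * 64 * 64, latent_dim=256, num_packets=8):
        super().__init__()
        self.num_packets = num_packets
        self.packet_size = latent_dim // num_packets
        self.encoder = nn.Sequential(nn.Linear(frame_dim, 512), nn.ReLU(),
                                     nn.Linear(512, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 512), nn.ReLU(),
                                     nn.Linear(512, frame_dim))

    def forward(self, frame, loss_rate):
        z = self.encoder(frame)  # encode the frame into a latent code
        # Simulate packet loss: drop each latent chunk independently
        # with probability `loss_rate` by zeroing it out.
        keep = (torch.rand(frame.size(0), self.num_packets,
                           device=frame.device) > loss_rate).float()
        mask = keep.repeat_interleave(self.packet_size, dim=1)
        return self.decoder(z * mask)

# Training sketch: sample a different loss rate per batch so the codec
# learns to degrade gracefully rather than fail when packets are missing.
model = DataScalableAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
for step in range(1000):
    frames = torch.rand(16, 3 * 64 * 64)          # stand-in for video frames
    loss_rate = float(torch.rand(1))              # vary packet loss per batch
    recon = model(frames, loss_rate)
    loss = nn.functional.mse_loss(recon, frames)  # frame-quality objective
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Because the decoder only ever sees a masked latent during training, its output quality scales with the number of chunks received, which is the data-scalable behavior described above.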
Why is it important?
Video conferencing is a growing need, and both video freezes and quality drops damage the user experience. We leverage growing GPU power to solve this problem with autoencoder-based Data Scalable Codecs.
Read the Original
This page is a summary of: Optimizing Real-Time Video Experience with Data Scalable Codec, September 2023, ACM (Association for Computing Machinery), DOI: 10.1145/3609395.3611108.