HEVERL – Viewport Estimation Using Reinforcement Learning for 360-degree Video Streaming



Bibliographic Details
Published in: Informatika i avtomatizaciâ (Online), 2025-01, Vol. 24 (1), p. 302-328
Main authors: Hung, Nguyen Viet; Dat, Pham; Tan, Nguyen; Quan, Nguyen; Trang, Le Thi Huyen; Nam, Le
Format: Article
Language: English
Online access: Full text
Description
Abstract: 360-degree video content has become a pivotal component of virtual reality environments, offering viewers an immersive and engaging experience. However, streaming such comprehensive video content presents significant challenges due to substantial file sizes and varying network conditions. To address these challenges, viewport adaptive streaming has emerged as a promising solution aimed at reducing the burden on network capacity. This technique streams lower-quality video for peripheral views while delivering high-quality content for the specific viewport the user is actively watching. It therefore requires accurately predicting the user’s viewing direction and enhancing the quality of that particular segment, underscoring the significance of Viewport Adaptive Streaming (VAS). Our research investigates the application of incremental learning techniques to predict the scores required by the VAS system, with the aim of optimizing the streaming process by ensuring that the most relevant portions of the video are rendered in high quality. Our approach is further supported by a thorough analysis of human head and facial movement behaviors. Leveraging these insights, we have developed a reinforcement learning model designed to anticipate user view directions and improve the quality of experience in the targeted regions. The effectiveness of the proposed method is evidenced by our experimental results, which show significant improvements over existing reference methods. Specifically, our approach improves the Precision metric by 0.011 to 0.022, reduces the Root Mean Square Error (RMSE) by 0.008 to 0.013 and the Mean Absolute Error (MAE) by 0.012 to 0.018, and raises the F1-score by 0.017 to 0.028. Furthermore, we observe an increase in overall accuracy of 2.79 to 16.98. These improvements highlight the potential of our model to make 360-degree video streaming in virtual reality environments more efficient and user-friendly.
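
To make the viewport-adaptive idea in the abstract concrete, the following Python sketch shows how a predicted view direction could drive per-tile quality selection over an equirectangular tile grid. This is a hypothetical illustration only: the 8x4 tile grid, the 100-degree field of view, and all function names are assumptions, not details taken from the paper.

```python
# Hypothetical sketch of viewport-adaptive tile quality selection.
# Tiles of an equirectangular 360-degree frame that fall inside the
# predicted viewport are requested at high quality, the rest at low
# quality. The grid size, field of view, and names are illustrative
# assumptions, not taken from the paper.

def angular_distance(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def select_tile_qualities(pred_yaw: float, pred_pitch: float,
                          n_cols: int = 8, n_rows: int = 4,
                          fov: float = 100.0):
    """Return an n_rows x n_cols grid of 'high'/'low' labels."""
    grid = []
    for r in range(n_rows):
        # Tile-center pitch runs from +90 (top) down to -90 (bottom).
        tile_pitch = 90.0 - (r + 0.5) * (180.0 / n_rows)
        row = []
        for c in range(n_cols):
            # Tile-center yaw runs from -180 to +180 degrees.
            tile_yaw = -180.0 + (c + 0.5) * (360.0 / n_cols)
            inside = (angular_distance(tile_yaw, pred_yaw) <= fov / 2.0
                      and abs(tile_pitch - pred_pitch) <= fov / 2.0)
            row.append("high" if inside else "low")
        grid.append(row)
    return grid

if __name__ == "__main__":
    # Predicted view direction: 30 degrees right of center, level pitch.
    for row in select_tile_qualities(pred_yaw=30.0, pred_pitch=0.0):
        print(" ".join(f"{q:>4}" for q in row))
```

Running the example prints a grid in which only the tiles around the predicted direction are marked "high", mirroring how a VAS client would concentrate its bitrate budget on the viewport the user is expected to watch.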
ISSN: 2713-3192, 2713-3206
DOI: 10.15622/ia.24.1.11