5G networks empowering XR (3) - Delay-sensitive cache management for VR - Dr. Yang Junchao

2021-01-19

1. Introduction

In VR systems, latency refers specifically to MTP (Motion-to-Photon) latency: the time from the start of a user's motion to the display of the corresponding picture on the screen. It comprises motion sensor latency, network latency, rendering latency and display latency. Recent studies indicate that a high-quality VR experience requires MTP latency to stay within roughly 17 to 20 ms. Besides a high data rate, MTP latency is one of the most important factors affecting the quality of the VR experience: if it exceeds 20 ms, users are prone to motion sickness and the experience degrades. Most HMD manufacturers claim that their devices can keep latency within 20 ms [1]. However, unless network latency falls to the order of a few milliseconds (it is reported that, at present, 80% of network delays in China exceed 90 ms), sending the full 360-degree panoramic video to the HMD to mask network latency will remain an unavoidable last resort.
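The latency budget described above can be made concrete with a small calculation. The following is a minimal sketch, assuming the four components simply add up; the class name `MtpBreakdown` and all per-component values are hypothetical and chosen only for illustration, while the 20 ms bound and the 90 ms network figure come from the text.

```python
from dataclasses import dataclass

MTP_BUDGET_MS = 20.0  # upper bound on MTP latency cited in the text


@dataclass
class MtpBreakdown:
    """Hypothetical per-component latencies, in milliseconds."""
    sensor_ms: float
    network_ms: float
    rendering_ms: float
    display_ms: float

    def total_ms(self) -> float:
        # Assumption: the four components are sequential and additive.
        return (self.sensor_ms + self.network_ms
                + self.rendering_ms + self.display_ms)

    def within_budget(self, budget_ms: float = MTP_BUDGET_MS) -> bool:
        return self.total_ms() <= budget_ms

    def network_allowance_ms(self, budget_ms: float = MTP_BUDGET_MS) -> float:
        """Latency left for the network once the other parts are paid for."""
        return budget_ms - (self.sensor_ms + self.rendering_ms + self.display_ms)


# Illustrative values only: 90 ms is the network delay quoted in the text,
# the other three numbers are assumptions.
sample = MtpBreakdown(sensor_ms=1.0, network_ms=90.0,
                      rendering_ms=7.0, display_ms=5.0)
print(sample.total_ms(), sample.within_budget(), sample.network_allowance_ms())
```

With these assumed values the non-network components already consume most of the budget, which is why a network delay of tens of milliseconds cannot be absorbed on the request path and content has to be cached ahead of time.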

Because VR is interactive, its latency requirements are stricter than those of traditional video, and current mainstream 4K VR video needs a transmission rate of at least several hundred megabits per second. To preserve the interactive experience under these constraints, VR content today is mainly played from local storage or delivered over a wired connection, which limits the scenarios in which VR can be applied. Current VR video transmission satisfies only the basic experience of VR users, and a large gap remains between it and the industry's demand for a high-quality VR experience anytime and anywhere. Real-time wireless transmission of VR over 5G networks is the key to closing this gap. However, wireless networks are unstable and error-prone, so content must be cached (prefetched) for the user sensibly according to the network conditions, guaranteeing video quality while keeping the interactive experience smooth.

Caching VR content is more challenging than caching traditional video. On the one hand, the accuracy of viewpoint prediction based on the user's head movement trajectory decreases significantly as the prediction horizon grows, which makes it harder to cache video content in advance. On the other hand, most current VR video platforms still use the transmission mechanism of traditional video, which reduces the interactive experience of VR to some extent. Therefore, given VR's latency requirements, more accurate viewpoint prediction methods and caching mechanisms are urgently needed to provide a higher-quality VR experience.

Fig. 1 Schematic diagram of wired transmission of current VR content

2. Adaptive VR content transmission strategy based on two-level cache

For delay-sensitive VR applications, client-side caching can provide a smooth user experience. However, when the user's bandwidth is good, caching a long duration of low-quality video blocks may cause cache overflow. When bandwidth is poor, caching a short duration of high-quality video blocks means that the blocks cannot be downloaded in time, which leads to stalling. Both cases degrade the user experience. Therefore, on the basis of viewpoint prediction, we propose an adaptive transmission algorithm based on a two-level cache, which jointly considers the user's bandwidth, the predicted viewpoint and the state of the cache.
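The two failure modes named above, overflow and stalling, can be reproduced with a toy playback-buffer model. The sketch below is an illustration under simplifying assumptions (fixed-length blocks, one download at a time, playback draining the buffer during each download); the function `simulate_buffer` and every numeric value are hypothetical and are not part of the proposed algorithm.

```python
from typing import Sequence, Tuple


def simulate_buffer(bandwidths_mbps: Sequence[float],
                    bitrate_mbps: float,
                    block_s: float,
                    max_buffer_s: float,
                    start_buffer_s: float = 0.0) -> Tuple[float, float]:
    """Return (stall_seconds, overflow_seconds) for a fixed-bitrate download.

    One video block of `block_s` seconds is downloaded per bandwidth sample.
    """
    buffer_s = start_buffer_s
    stall_s = 0.0
    overflow_s = 0.0
    for bw in bandwidths_mbps:
        download_s = block_s * bitrate_mbps / bw  # time to fetch one block
        if download_s > buffer_s:
            # Playback ran dry before the block arrived: stalling.
            stall_s += download_s - buffer_s
            buffer_s = 0.0
        else:
            buffer_s -= download_s
        buffer_s += block_s
        if buffer_s > max_buffer_s:
            # Content that does not fit into the cache: overflow.
            overflow_s += buffer_s - max_buffer_s
            buffer_s = max_buffer_s
    return stall_s, overflow_s


# Good bandwidth, low-quality blocks: the buffer fills up and overflows.
print(simulate_buffer([50.0] * 10, bitrate_mbps=5.0, block_s=1.0,
                      max_buffer_s=3.0, start_buffer_s=1.0))
# Poor bandwidth, high-quality blocks: every download outlasts the buffer.
print(simulate_buffer([10.0] * 10, bitrate_mbps=40.0, block_s=1.0,
                      max_buffer_s=3.0, start_buffer_s=1.0))
```

A fixed bitrate fails in one direction or the other depending on the bandwidth, which is the motivation for adapting the request to both the bandwidth and the cache state.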

Fig. 2 Cache management strategy for VR delay determination in 5G scenario

Specifically, the user's bandwidth is predicted with a machine learning method that uses the bandwidth observed in the previous time period to predict the bandwidth in the next one. The server records the user's viewpoint trajectory, predicts the viewpoint from the user's historical trajectory and the trajectories of similar users, and feeds the predicted viewpoint back to the adaptive decision module on the user terminal in a timely manner. The user terminal adopts a two-level cache strategy that divides the entire cache into two regions, [0, B_th] and [B_th, B_max], where B_th and B_max denote here the threshold between the two levels and the maximum cache value respectively. [0, B_th] is the first-level cache region. When the user's current cache state is less than B_th, the content in the cache will be consumed quickly, and stalling will occur if new video blocks are not downloaded in time.

In this case, the two-level cache management strategy updates the tiles in the [0, B_th] region according to the viewpoint prediction result, that is, it transfers a higher-quality version of the tiles that yield the greater gain in utility value, and it downloads new video blocks so that the cache reaches the threshold B_th in time. For this download task, the utility value of the overall cache is optimized mainly through the choice of tile bitrate within each video block. When the user's current cache state is greater than B_th, there are enough video blocks in the cache for the user to watch, so the tiles in the [0, B_th] region are updated. For video blocks in the [B_th, B_max] region, the viewpoint prediction accuracy is low at download time and tile selection based on the predicted viewpoint deviates considerably, so these blocks can be downloaded in a uniform lowest-bitrate version. On the one hand, when the cache is about to be exhausted, a conservative strategy should be adopted to update the cached tiles and download new video blocks; on the other hand, when the cache is about to reach the maximum cache value, a more aggressive strategy should be adopted. Therefore, the bitrate requested for each download must be determined jointly from the current cache state and the predicted bandwidth.
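The decision logic described above might be sketched as follows. This is a simplified illustration, not the authors' implementation: the utility optimization over individual tiles is omitted, and the bitrate ladder, the linear safety factor and the names `Decision` and `choose_action` are all assumptions. In the code, `b_th` stands for the cache threshold and `b_max` for the maximum cache value.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Decision:
    level: int                       # 1: cache below threshold, 2: above
    tile_update_bitrate_mbps: float  # for upgrading tiles cached in [0, b_th]
    new_block_bitrate_mbps: float    # for newly downloaded video blocks
    new_block_uses_viewpoint: bool   # per-tile selection from predicted viewpoint


def choose_action(buffer_s: float, b_th: float, b_max: float,
                  predicted_bw_mbps: float,
                  ladder_mbps: Sequence[float] = (5.0, 10.0, 20.0, 40.0)
                  ) -> Decision:
    if not (0.0 < b_th < b_max) or not (0.0 <= buffer_s <= b_max):
        raise ValueError("expected 0 < b_th < b_max and 0 <= buffer_s <= b_max")

    lowest = min(ladder_mbps)
    # Assumed heuristic: use between 50% and 100% of the predicted bandwidth,
    # conservative when the cache is nearly empty, aggressive when nearly full.
    safety = 0.5 + 0.5 * (buffer_s / b_max)
    budget_mbps = safety * predicted_bw_mbps
    affordable = [r for r in ladder_mbps if r <= budget_mbps]
    best = max(affordable) if affordable else lowest

    if buffer_s < b_th:
        # First level: refill quickly; near-term viewpoint prediction is still
        # reliable, so new blocks use viewpoint-driven tile bitrates.
        return Decision(1, best, best, True)
    # Second level: far-ahead prediction is unreliable, so new blocks are
    # fetched at the uniform lowest bitrate while near-term tiles are upgraded.
    return Decision(2, best, lowest, False)


print(choose_action(buffer_s=1.0, b_th=2.0, b_max=6.0, predicted_bw_mbps=30.0))
print(choose_action(buffer_s=5.0, b_th=2.0, b_max=6.0, predicted_bw_mbps=30.0))
```

The linear safety factor is only one way to express "conservative when nearly empty, aggressive when nearly full"; any monotone mapping from cache fill level to the usable share of bandwidth would fit the description equally well.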

3. Conclusion

For delay-sensitive VR applications, we propose an adaptive algorithm based on a two-level caching strategy that combines bandwidth status and cache state. Compared with traditional caching mechanisms, the proposed algorithm can guarantee a better QoE for VR users when bandwidth changes frequently.

References:

[1]A. TaghaviNasrabadi, A. Mahzari, J. D. Beshay, et al. Adaptive 360-degree video streaming using layered video coding[C]//2017 IEEE Virtual Reality (VR). Los Angeles, USA, IEEE, 2017: 347-348.

[2]R. Guntur, W. T. Ooi. On tile assignment for region-of-interest video streaming in a wireless LAN[C]//22nd International Workshop on Network and Operating System Support for Digital Audio and Video. Toronto, Canada, 2012: 59-64.

[3]T. Alshawi, Z. Long, G. Alregib. Understanding spatial correlation in eye-fixation maps for visual attention in videos[C]// IEEE International Conference on Multimedia and Expo. Seattle, USA, IEEE, 2016: 1-6.

[4]Q. Zhao, L. Wan, W. Feng, J. Zhang, T. T. Wong. Cube2Video: Navigate between cubic panoramas in real-time[J]. IEEE Trans. Multimedia, 2013, 15(8): 1745-1754.

[5]C. Ozcinar, A. De Abreu, A. Smolic. Viewport-aware adaptive 360 video streaming using tiles for virtual reality[C]//2017 IEEE International Conference on Image Processing (ICIP). Beijing, China, IEEE, 2017: 2174-2178.

[6]J. D. Praeter, P. Duchi, G. Wallendael, J. Macq, P. Lambert. Efficient encoding of interactive personalized views extracted from immersive video content[C]. ACM 1st International Workshop on Multimedia Alternate Realities. Amsterdam, Netherlands, 2016: 25-30.

[7]L. Xie, Z. Xu, Y. Ban, X. Zhang, Z. Guo. 360ProbDASH: Improving QoE of 360 video streaming using tile-based HTTP adaptive streaming[C]//Proceedings of the 25th ACM International Conference on Multimedia. Mountain View, California, USA, ACM, 2017: 315-323.

[8]F. Qian, L. Ji, B. Han, V. Gopalakrishnan. Optimizing 360 video delivery over cellular networks[C]//Proceedings of the 5th Workshop on All Things Cellular: Operations, Applications and Challenges. New York City, New York, ACM, 2016: 1-6.

[9]R. Skupin, Y. Sanchez, D. Podborski. HEVC tile based streaming to head mounted displays[C]//Consumer Communications & Networking Conference. New York City, New York, IEEE, 2017: 613-615.

[10]L. Xie, X. Zhang, Z. Guo. CLS: A cross-user learning based system for improving QoE in 360-degree video adaptive streaming[C]//2018 ACM Multimedia Conference on Multimedia Conference. Seoul, Korea, ACM, 2018: 564-572.