School of Electrical, Computer and Telecommunications Engineering
Pourashraf, Pedram, Minimisation of Video Downstream Bitrate for Large Scale Immersive Video Conferencing, thesis, School of Electrical, Computer and Telecommunications Engineering, University of Wollongong, 2014. http://ro.uow.edu.au/theses/4225
Video conferencing, in particular multiparty video conferencing, is now seen as an attractive alternative to face-to-face meetings. However, in traditional video conferencing systems, the required network capacity grows as the square of the number of participants, which limits scalability.
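The quadratic growth can be seen with a short calculation: in a full-mesh conference each of the N participants sends a stream to every other participant. A minimal sketch (function name is illustrative):

```python
def total_streams(n_participants: int) -> int:
    """Full-mesh conferencing: every participant sends one stream
    to each of the other participants, so the total number of
    streams grows as N * (N - 1), i.e. quadratically in N."""
    return n_participants * (n_participants - 1)

# 10 participants already require 90 simultaneous streams.
print(total_streams(10))  # 90
```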
In this thesis, an immersive video conferencing (IVC) system is introduced, which employs a 3D virtual environment to provide an intuitive space in which participants can interact naturally. The real-time video stream of each participant is displayed on his/her avatar in IVC. The avatars can be moved and rotated by the participants within this 3D environment; hence, a video conferencing session may include multiple simultaneous conversations (breakout groups), and participants can move from one conversation to another. IVC can potentially scale to a larger number of participants provided that each participant only receives appropriately adjusted videos within his/her field of view. This is achieved by a number of techniques developed in this research.
In the technique referred to as area of interest (AOI) management, the transmission of unnecessary video streams to each user is avoided. Whether a video stream is required at a particular time depends on the current perspective of the user. The criteria employed in this research to cull a video stream are that its avatar is: (i) beyond the visual distance of the viewer, (ii) outside the viewer's view frustum, (iii) facing away from the viewer, or (iv) occluded by opaque objects.
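The first three culling criteria can be sketched with simple 2D geometry. This is a minimal illustration, not the thesis's implementation: the threshold values, the 2D simplification, and the function names are all assumptions, and the occlusion test (iv) is omitted because it requires full scene geometry.

```python
import math

# Illustrative thresholds only; the actual values would come from
# the IVC system's configuration and perceptual experiments.
MAX_VISUAL_DISTANCE = 30.0
HALF_FOV = math.radians(45)  # half of a 90-degree view frustum

def needs_stream(viewer_pos, viewer_dir, avatar_pos, avatar_dir):
    """Return True if the avatar's video stream should be sent to
    the viewer, applying culling criteria (i)-(iii) in 2D."""
    dx = avatar_pos[0] - viewer_pos[0]
    dy = avatar_pos[1] - viewer_pos[1]

    # (i) beyond the viewer's visual distance
    if math.hypot(dx, dy) > MAX_VISUAL_DISTANCE:
        return False

    # (ii) outside the view frustum: angle between the viewing
    # direction and the direction to the avatar exceeds the half-FOV
    angle = math.atan2(dy, dx) - math.atan2(viewer_dir[1], viewer_dir[0])
    angle = (angle + math.pi) % (2 * math.pi) - math.pi  # wrap to [-pi, pi]
    if abs(angle) > HALF_FOV:
        return False

    # (iii) facing away: the avatar's facing vector has no component
    # toward the viewer (non-positive dot product)
    to_viewer = (-dx, -dy)
    if avatar_dir[0] * to_viewer[0] + avatar_dir[1] * to_viewer[1] <= 0:
        return False

    # (iv) occlusion by opaque objects is omitted in this sketch.
    return True
```

A viewer at the origin facing +x would, for example, receive the stream of a nearby avatar at (5, 0) facing back toward it, but not that of the same avatar turned away.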
The subjective video quality assessments presented in this research show that, within the visual range of a participant, the required video quality depends on the relative distance and orientation of each avatar with respect to the viewer. Therefore, it may not be necessary for all video streams to be at the highest quality or rate. Hence, another technique known as video quality differentiation (VQD) is proposed that predicts the required video quality of avatars from their respective 3D situations.
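The shape of such a VQD mapping can be sketched as a function from an avatar's 3D situation (distance and orientation relative to the viewer) to a quality tier. The tiers and threshold values below are purely illustrative assumptions, not the model fitted from the thesis's subjective experiments:

```python
def required_quality(distance: float, orientation_deg: float) -> str:
    """Hypothetical VQD-style mapping: nearer and more frontal
    avatars require a higher video quality tier.  Thresholds are
    illustrative; a real model would be fitted to subjective
    quality-assessment data."""
    if distance < 5.0 and abs(orientation_deg) < 30.0:
        return "high"    # close and nearly facing the viewer
    if distance < 15.0 and abs(orientation_deg) < 60.0:
        return "medium"  # moderately distant or turned
    return "low"         # far away or sharply angled
```

Differentiating quality this way means only the few avatars a participant is actually conversing with consume high-rate streams, while peripheral avatars are served at reduced rates.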
Finally, the results of the VQD model are improved for arbitrary 3D situations by exploiting the 3D transformation in the video quality adjustment process. The proposed perceptual pruning mechanism partitions each frame into spatial regions and calculates the required spatial resolution of each region based on its projection size on the screen. Consequently, a non-uniform, spatially adjusted video quality can be achieved for 3D situations in which the projections of avatars are distorted.
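The core idea of perceptual pruning can be sketched as follows: there is no benefit in transmitting a region at a resolution higher than the number of screen pixels its projection occupies. The function names, the uniform grid partition, and the `project` callback below are all assumptions for illustration; the thesis's mechanism computes projections from the actual 3D transformation of each avatar.

```python
def region_target_resolution(region_w, region_h, proj_w, proj_h):
    """Cap a region's resolution at its on-screen projection size:
    pixels beyond the projected footprint are perceptually wasted."""
    return (min(region_w, max(1, round(proj_w))),
            min(region_h, max(1, round(proj_h))))

def prune_frame(frame_w, frame_h, rows, cols, project):
    """Partition a frame into a rows x cols grid of regions and
    compute each region's target resolution from its projected
    screen size.  `project(r, c)` is a caller-supplied function
    returning the projected (width, height) of region (r, c)."""
    region_w, region_h = frame_w // cols, frame_h // rows
    targets = {}
    for r in range(rows):
        for c in range(cols):
            proj_w, proj_h = project(r, c)
            targets[(r, c)] = region_target_resolution(
                region_w, region_h, proj_w, proj_h)
    return targets
```

For a distorted projection (e.g. an avatar viewed at a sharp angle), regions near the far edge project onto fewer screen pixels and so receive lower target resolutions, yielding the non-uniform spatial quality described above.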