Faculty of Engineering and Information Sciences - Papers: Part B

A psychoacoustic-based multiple audio object coding approach via intra-object sparsity

Maoshen Jia, Beijing University of Technology
Jiaming Zhang, Beijing University of Technology
Changchun Bao, Beijing University of Technology
Xiguang Zheng, University of Wollongong, Dolby Laboratories

RIS ID

118199

Publication Details

M. Jia, J. Zhang, C. Bao & X. Zheng, "A psychoacoustic-based multiple audio object coding approach via intra-object sparsity," Applied Sciences (Switzerland), vol. 7, no. 12, pp. 1301-1-1301-21, 2017.

Abstract

Rendering spatial sound scenes via audio objects has become popular in recent years because it provides more flexibility across auditory scenarios such as 3D movies, spatial audio communication and virtual classrooms. To facilitate high-quality, bitrate-efficient distribution of spatial audio objects, this paper proposes an encoding scheme based on intra-object sparsity (approximate k-sparsity of the audio object itself). A statistical analysis is presented to validate the notion that audio objects are sparser in the Modified Discrete Cosine Transform (MDCT) domain than in the Short Time Fourier Transform (STFT) domain. By exploiting intra-object sparsity in the MDCT domain, multiple simultaneously occurring audio objects are compressed into a mono downmix signal with side information. To ensure balanced perceptual quality across audio objects, a psychoacoustic-based time-frequency instant sorting algorithm and an energy-equalized Number of Preserved Time-Frequency Bins (NPTF) allocation strategy are proposed and employed in the underlying compression framework. The downmix signal can be further encoded via the Scalar Quantized Vector Huffman Coding (SQVH) technique at a desired bitrate, and the side information is transmitted losslessly. Both objective and subjective evaluations show that the proposed encoding scheme outperforms the Sparsity Analysis (SPA) approach and Spatial Audio Object Coding (SAOC) when eight objects are jointly encoded.
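
The sparsity-based downmix described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy coefficient lists (standing in for one frame of MDCT coefficients per object), and the collision rule (the larger magnitude wins a shared bin) are all assumptions made for illustration. The psychoacoustic sorting, NPTF allocation and SQVH coding stages are omitted.

```python
def keep_k_largest(coeffs, k):
    """Zero all but the k largest-magnitude coefficients (approximate k-sparsity)."""
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(order[:k])
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]


def downmix(objects, k):
    """Merge sparsified objects into one mono frame plus per-bin owner indices."""
    sparse = [keep_k_largest(obj, k) for obj in objects]
    n_bins = len(objects[0])
    mono = [0.0] * n_bins
    owner = [-1] * n_bins  # side information; -1 marks an empty bin
    for b in range(n_bins):
        for idx, obj in enumerate(sparse):
            # Assumed collision rule (not from the paper): larger magnitude wins.
            if abs(obj[b]) > abs(mono[b]):
                mono[b] = obj[b]
                owner[b] = idx
    return mono, owner


def separate(mono, owner, n_objects):
    """Scatter downmix bins back to their owning objects using the side information."""
    out = [[0.0] * len(mono) for _ in range(n_objects)]
    for b, idx in enumerate(owner):
        if idx >= 0:
            out[idx][b] = mono[b]
    return out


# Hypothetical toy frames: two objects, six bins each, two bins preserved per object.
frame_a = [0.9, 0.0, 0.1, -0.7, 0.05, 0.0]
frame_b = [0.0, 0.8, -0.6, 0.1, 0.0, 0.02]
mono, owner = downmix([frame_a, frame_b], k=2)
recovered = separate(mono, owner, 2)
```

In this toy case the preserved bins of the two objects do not overlap, so each object's k largest coefficients are recovered exactly from the mono frame and the owner indices. The smaller coefficients dropped by the sparsification step are the approximation error.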

Included in

Engineering Commons, Science and Technology Studies Commons

Link to publisher version (DOI)

http://dx.doi.org/10.3390/app7121301
