A general compression approach to multi-channel three-dimensional audio

Cheng, B; Ritz, C; Burnett, I; Zheng, X

Publication Type: Journal Article
Citation: IEEE Transactions on Audio, Speech and Language Processing, 2013, 21 (8), pp. 1676-1688
Issue Date: 2013-05-22

Closed Access

Filename	Description	Size	Format
06508842.pdf	Published Version	2.13 MB	Adobe PDF

This item is closed access and not available.

Full metadata record

Field	Value	Language
dc.contributor.author	Cheng, B	en_US
dc.contributor.author	Ritz, C	en_US
dc.contributor.author	Burnett, I https://orcid.org/0000-0003-3795-7722	en_US
dc.contributor.author	Zheng, X	en_US
dc.date.issued	2013-05-22	en_US
dc.identifier.citation	IEEE Transactions on Audio, Speech and Language Processing, 2013, 21 (8), pp. 1676 - 1688	en_US
dc.identifier.issn	1558-7916	en_US
dc.identifier.uri	http://hdl.handle.net/10453/116153
dc.description.abstract	This paper presents a technique for low bit rate compression of three-dimensional (3D) audio produced by multiple loudspeaker channels. The approach is based on the time-frequency analysis of the localization of spatial sound sources within the 3D space as rendered by a multi-channel audio signal (in this case 16 channels). This analysis results in the derivation of a stereo downmix signal representing the original 16 channels. Alternatively, a mono-downmix signal with side information representing the location of sound sources within the 3D spatial scene can also be derived. The resulting downmix signals are then compressed with a traditional audio coder, resulting in a representation of the 3D soundfield at bit rates comparable with existing stereo audio coders while maintaining the perceptual quality produced from separate encoding of each channel. © 2006-2012 IEEE.	en_US
dc.relation.ispartof	IEEE Transactions on Audio, Speech and Language Processing	en_US
dc.relation.isbasedon	10.1109/TASL.2013.2260156	en_US
dc.subject.classification	Speech-Language Pathology & Audiology	en_US
dc.title	A general compression approach to multi-channel three-dimensional audio	en_US
dc.type	Journal Article
utslib.citation.issue	8	en_US
utslib.citation.volume	21	en_US
utslib.for	0913 Mechanical Engineering	en_US
utslib.for	0906 Electrical and Electronic Engineering	en_US
utslib.for	0801 Artificial Intelligence and Image Processing	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Engineering and Information Technology
utslib.copyright.status	closed_access
pubs.issue	8	en_US
pubs.publication-status	Published	en_US
pubs.volume	21	en_US

Abstract:

This paper presents a technique for low bit rate compression of three-dimensional (3D) audio produced by multiple loudspeaker channels. The approach is based on the time-frequency analysis of the localization of spatial sound sources within the 3D space as rendered by a multi-channel audio signal (in this case 16 channels). This analysis results in the derivation of a stereo downmix signal representing the original 16 channels. Alternatively, a mono-downmix signal with side information representing the location of sound sources within the 3D spatial scene can also be derived. The resulting downmix signals are then compressed with a traditional audio coder, resulting in a representation of the 3D soundfield at bit rates comparable with existing stereo audio coders while maintaining the perceptual quality produced from separate encoding of each channel. © 2006-2012 IEEE.
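The mono-downmix variant described in the abstract pairs a single audio signal with side information about where sound sources sit in the 3D scene. The following is a minimal illustrative sketch of that general idea, not the paper's algorithm: it assumes the per-channel energies of one time-frequency tile are already available, and it uses an energy-weighted mean of the loudspeaker directions as the location estimate. The function names, the loudspeaker layout, and the estimator itself are all assumptions made for illustration.

```python
import math


def unit_vector(azimuth_deg, elevation_deg):
    """Cartesian unit vector for a direction given in degrees."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    return (math.cos(el) * math.cos(az),
            math.cos(el) * math.sin(az),
            math.sin(el))


def estimate_direction(energies, speaker_angles):
    """Energy-weighted mean direction for one time-frequency tile.

    energies: per-channel energy in the tile (non-negative floats).
    speaker_angles: per-channel (azimuth_deg, elevation_deg) pairs.
    Returns (azimuth_deg, elevation_deg), or None for a silent tile.
    This estimator is an assumption, not the method from the paper.
    """
    if sum(energies) <= 0.0:
        return None
    x = y = z = 0.0
    for energy, (az, el) in zip(energies, speaker_angles):
        ux, uy, uz = unit_vector(az, el)
        x += energy * ux
        y += energy * uy
        z += energy * uz
    azimuth = math.degrees(math.atan2(y, x))
    elevation = math.degrees(math.atan2(z, math.hypot(x, y)))
    return (azimuth, elevation)


def mono_downmix(channel_samples):
    """Sum the channels sample by sample into one mono signal."""
    return [sum(frame) for frame in zip(*channel_samples)]


# Hypothetical 16-channel layout: two rings of eight loudspeakers,
# one at ear height and one elevated by 45 degrees.
LAYOUT = ([(az, 0.0) for az in range(0, 360, 45)]
          + [(az, 45.0) for az in range(0, 360, 45)])
```

In a scheme of this shape, an encoder would pass the mono signal to a conventional audio coder and transmit one (azimuth, elevation) pair per tile as side information, so the payload grows with the number of tiles rather than with the number of loudspeaker channels.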

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/116153