Abstract:Directional Audio Coding (DirAC) is a proven method for parametrically representing a 3D audio scene in B-format and is capable of reproducing it on arbitrary loudspeaker layouts. Although such a method seems well suited for low bitrate Ambisonic transmission, little work has been done on the feasibility of building a real system upon it. In this paper, we present a DirAC-based coding for Higher-Order Ambisonics (HOA), developed as part of a standardisation effort to extend the 3GPP EVS codec to immersive communications. Starting from the first-order DirAC model, we show how to reduce algorithmic delay, the bitrate required for the parameters and complexity by bringing the full synthesis in the spherical harmonic domain. The evaluation of the proposed technique for coding 3\textsuperscript{rd} order Ambisonics at bitrates from 32 to 128 kbps shows the relevance of the parametric approach compared with existing solutions.
Abstract:This paper introduces an innovative method for reducing the computational complexity of deep neural networks in real-time speech enhancement on resource-constrained devices. The proposed approach utilizes a two-stage processing framework, employing channelwise feature reorientation to reduce the computational load of convolutional operations. By combining this with a modified power law compression technique for enhanced perceptual quality, this approach achieves noise suppression performance comparable to state-of-the-art methods with significantly less computational requirements. Notably, our algorithm exhibits 3 to 4 times less computational complexity and memory usage than prior state-of-the-art approaches.