Abstract:The traditional adaptive algorithms will face the non-uniqueness problem when dealing with stereophonic acoustic echo cancellation (SAEC). In this paper, we first propose an efficient multi-input and multi-output (MIMO) scheme based on deep learning to filter out echoes from all microphone signals at once. Then, we employ a lightweight complex spectral mapping framework (LCSM) for end-to-end SAEC without decorrelation preprocessing to the loudspeaker signals. Inplace convolution and channel-wise spatial modeling are utilized to ensure the near-end signal information is preserved. Finally, a cross-domain loss function is designed for better generalization capability. Experiments are evaluated on a variety of untrained conditions and results demonstrate that the LCSM significantly outperforms previous methods. Moreover, the proposed causal framework only has 0.55 million parameters, much less than the similar deep learning-based methods, which is important for the resource-limited devices.