Abstract: Optical fiber sensing is a technology that detects audio, vibrations, and temperature using an optical fiber; sensing of audio and vibrations in particular is called distributed acoustic sensing (DAS). In DAS, the observed multichannel data suffers from severe noise owing to optical noise and the installation conditions. Conventional methods for denoising DAS data are based on either signal processing or deep neural networks (DNNs). Signal-processing-based methods are interpretable, i.e., not black boxes. DNN-based methods offer flexibility in designing network architectures and objective functions, that is, priors. However, no existing DAS study balances interpretability with flexibility of priors, and DNN-based methods generally require a large amount of training data. To address these problems, we propose a signal-processing-based denoising method with a DNN-like structure. As DAS priors, we employ spatial knowledge, namely low-rankness and channel-dependent sensitivity, through the DNN-like structure. Results on fiber-acoustic sensing show that the proposed method outperforms conventional methods and is robust to the assumed number of spatial ranks. Moreover, the optimized parameters of the proposed method reflect the channel sensitivity, demonstrating its interpretability.
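The two spatial priors named above can be illustrated with a minimal sketch (not the authors' method): denoising channels-by-time DAS data by weighting each channel by an assumed sensitivity, truncating to a low spatial rank via the SVD, and undoing the weighting. The function name, the equal-sensitivity default, and the test signal are all hypothetical.

```python
import numpy as np

# Illustrative sketch, NOT the proposed method: low-rank denoising of
# multichannel DAS data with hypothetical channel-dependent sensitivities.
# X: (channels x time) observed data; r: assumed number of spatial ranks.
def lowrank_denoise(X, r, sensitivity=None):
    if sensitivity is None:
        sensitivity = np.ones(X.shape[0])   # hypothetical: equal sensitivity
    W = np.diag(sensitivity)                # channel-dependent weighting
    U, s, Vt = np.linalg.svd(W @ X, full_matrices=False)
    s[r:] = 0.0                             # low-rank prior: keep r components
    return np.linalg.inv(W) @ (U * s) @ Vt  # undo the channel weighting

# Toy example: a rank-1 spatial pattern buried in sensor noise.
rng = np.random.default_rng(0)
clean = np.outer(rng.standard_normal(8), np.sin(np.linspace(0, 10, 200)))
noisy = clean + 0.1 * rng.standard_normal(clean.shape)
denoised = lowrank_denoise(noisy, r=1)
```

In this toy setting the rank-1 truncation discards most of the noise energy while retaining the shared spatial pattern, which is the intuition behind using low-rankness as a DAS prior.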
Abstract: In many sound event detection (SED) methods, a segmented time frame is regarded as one data sample for model training. The duration of a sound event greatly depends on its class, e.g., the sound event "fan" has a long duration, whereas the sound event "mouse clicking" is instantaneous. Thus, the difference in duration between sound event classes results in a serious data imbalance in SED. Moreover, most sound events occur only occasionally; therefore, there are many more inactive time frames than active ones. This causes a second severe data imbalance, between active and inactive frames. In this paper, we investigate the impact of sound duration and inactive frames on SED performance by introducing four loss functions: simple reweighting loss, inverse frequency loss, asymmetric focal loss, and focal batch Tversky loss. We then provide insights into how to tackle these imbalance problems.
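To make the active/inactive imbalance concrete, here is a minimal sketch of an asymmetric focal variant of frame-level binary cross-entropy. It is only an illustration of the idea, not the paper's exact formulation: the focusing exponent is assumed to apply only to inactive frames, so confidently predicted inactive frames contribute little to the loss.

```python
import numpy as np

# Sketch (assumptions: frame-level binary targets y, sigmoid outputs p).
# The focusing term p**gamma down-weights easy inactive frames, countering
# the dominance of inactive frames over active ones in SED training.
def asymmetric_focal_loss(p, y, gamma=2.0, eps=1e-7):
    p = np.clip(p, eps, 1 - eps)
    active = -y * np.log(p)                              # active frames: plain BCE
    inactive = -(1 - y) * (p ** gamma) * np.log(1 - p)   # focus on hard negatives
    return np.mean(active + inactive)

y = np.array([1.0, 0.0, 0.0, 0.0])  # one active frame, three inactive
p = np.array([0.9, 0.1, 0.1, 0.8])  # last inactive frame is a hard negative
loss = asymmetric_focal_loss(p, y)
```

For an easy inactive frame (p = 0.1), the focal term scales the plain cross-entropy by p**gamma = 0.01, so the many trivially inactive frames no longer dominate the gradient.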
Abstract: This paper proposes a determined blind source separation method using Bayesian non-parametric modelling of sources. Conventionally, source signals are separated from a given set of mixture signals by modelling them using non-negative matrix factorization (NMF). In NMF, however, a latent variable signifying model complexity must be specified appropriately to avoid over-fitting or under-fitting. As real-world sources can be of varying and unknown complexity, we propose a Bayesian non-parametric framework that is invariant to such latent variables. We show that our proposed method adapts to different source complexities, whereas conventional methods require parameter tuning for optimal separation.
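The latent complexity variable at issue is the number of NMF bases, which must be fixed before fitting. A minimal sketch (illustrative only, not the paper's method) of Euclidean-distance NMF with multiplicative updates makes this explicit: K is a hand-set hyperparameter, which is exactly what the Bayesian non-parametric framing seeks to avoid.

```python
import numpy as np

# Sketch: basic NMF via multiplicative updates (Euclidean cost).
# V ~ W @ H with V: (freq x time), W: (freq x K), H: (K x time).
# K, the number of bases, encodes model complexity and must be chosen
# in advance -- the latent variable the abstract argues should be inferred.
def nmf(V, K, n_iter=200, seed=0, eps=1e-9):
    rng = np.random.default_rng(seed)
    F, T = V.shape
    W = rng.random((F, K)) + eps
    H = rng.random((K, T)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)  # update activations
        W *= (V @ H.T) / (W @ H @ H.T + eps)  # update bases
    return W, H

# Toy non-negative "spectrogram"; K=4 is an arbitrary, hand-picked choice.
V = np.abs(np.random.default_rng(1).standard_normal((16, 32)))
W, H = nmf(V, K=4)
```

Too small a K under-fits the sources; too large a K over-fits noise. Since the multiplicative updates preserve non-negativity, the sketch also shows why the factors remain valid spectral bases and activations throughout optimization.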