Linhui Ye

MASS: Multi-task Anthropomorphic Speech Synthesis Framework

May 10, 2021

Jinyin Chen, Linhui Ye, Zhaoyan Ming

Abstract: Text-to-speech (TTS) synthesis plays an important role in human-computer interaction. Most current TTS technologies focus on the naturalness of speech, namely, making the speech sound human. However, the key tasks of expressing emotion and speaker identity are ignored, which limits the application scenarios of TTS synthesis technology. To make synthesized speech more realistic and expand its application scenarios, we propose a multi-task anthropomorphic speech synthesis framework (MASS), which can synthesize speech from text with a specified emotion and speaker identity. The MASS framework consists of a base TTS module and two novel voice conversion modules: the emotional voice conversion module and the speaker voice conversion module. We propose a deep emotion voice conversion model (DEVC) and a deep speaker voice conversion model (DSVC) based on convolutional residual networks, which address the problem of feature loss during voice conversion. Model training does not require parallel datasets, and the models are capable of many-to-many voice conversion. In the emotional voice conversion, speaker voice conversion, and multi-task speech synthesis experiments, the results show that DEVC and DSVC convert speech effectively. The quantitative and qualitative evaluation results of the multi-task speech synthesis experiments show that MASS can effectively synthesize speech with the specified text, emotion, and speaker identity.

Non-recurrent Traffic Congestion Detection with a Coupled Scalable Bayesian Robust Tensor Factorization Model

May 10, 2020

Qin Li, Huachun Tan, Xizhu Jiang, Yuankai Wu, Linhui Ye

Abstract: Non-recurrent traffic congestion (NRTC) usually brings unexpected delays to commuters. Hence, it is critical to detect and recognize NRTC accurately and in real time. Advances in road traffic detectors and loop detectors provide researchers with large-scale multivariable spatial-temporal traffic data, which makes in-depth research on NRTC possible. However, it remains challenging to construct an analytical framework in which the natural spatial-temporal structural properties of multivariable traffic information can be effectively represented and exploited to better understand and detect NRTC. In this paper, we present a novel training-free analytical framework based on coupled scalable Bayesian robust tensor factorization (Coupled SBRTF). The framework couples multivariable traffic data, including traffic flow, road speed, and occupancy, by having them share a similar or identical sparse structure, and it naturally captures the high-dimensional spatial-temporal structural properties of traffic data through tensor factorization. The entries of the shared sparse structure reveal the distribution and magnitude of NRTC, so this structure encompasses abundant information about NRTC, while the low-rank part of the framework expresses the distribution of the generally expected traffic condition as an auxiliary product. Experimental results on real-world traffic data show that the proposed method outperforms coupled Bayesian robust principal component analysis (coupled BRPCA), rank sparsity tensor decomposition (RSTD), and standard normal deviates (SND) in detecting NRTC. The proposed method performs even better when only weekday traffic data are used, and hence can provide a more precise estimate of NRTC for daily commuters.
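
The low-rank plus sparse idea described in this abstract can be illustrated with a much simpler stand-in. The sketch below is not Coupled SBRTF: it is a non-Bayesian, single-variable, matrix (not tensor) heuristic that alternates a rank-1 fit with soft-thresholding of the residual. The function names, the toy speed profile, the injected drops, and the threshold of 5.0 are all assumptions made for illustration only.

```python
import math


def rank1_fit(A, n_iter=20):
    """Rank-1 approximation u v^T of A (a list of rows), fitted by
    alternating least squares. Stand-in for the 'expected traffic' part."""
    rows, cols = len(A), len(A[0])
    v = [1.0] * cols
    for _ in range(n_iter):
        vv = sum(x * x for x in v)
        u = [sum(A[i][j] * v[j] for j in range(cols)) / vv for i in range(rows)]
        uu = sum(x * x for x in u)
        v = [sum(A[i][j] * u[i] for i in range(rows)) / uu for j in range(cols)]
    return [[u[i] * v[j] for j in range(cols)] for i in range(rows)]


def soft_threshold(x, t):
    """Shrink x toward zero by t; values with |x| <= t become zero."""
    return math.copysign(max(abs(x) - t, 0.0), x)


def low_rank_plus_sparse(M, threshold, n_outer=30):
    """Heuristic split M ~ L + S with L rank-1 and S sparse (illustrative only)."""
    rows, cols = len(M), len(M[0])
    S = [[0.0] * cols for _ in range(rows)]
    for _ in range(n_outer):
        # Fit the recurrent pattern to the data with current anomalies removed.
        L = rank1_fit([[M[i][j] - S[i][j] for j in range(cols)] for i in range(rows)])
        # Whatever the pattern cannot explain, beyond the threshold, is sparse.
        S = [[soft_threshold(M[i][j] - L[i][j], threshold) for j in range(cols)]
             for i in range(rows)]
    return L, S


# Hypothetical toy data: 14 days x 24 hours of road speed with two daily
# rush-hour dips (the recurrent part) and two injected one-off drops.
daily = [60 - 20 * math.exp(-(h - 8) ** 2 / 8) - 20 * math.exp(-(h - 17) ** 2 / 8)
         for h in range(24)]
M = [list(daily) for _ in range(14)]
M[2][13] -= 30.0   # assumed non-recurrent event: day 2, hour 13
M[5][3] -= 25.0    # assumed non-recurrent event: day 5, hour 3

L, S = low_rank_plus_sparse(M, threshold=5.0)
anomalies = [(i, j) for i in range(14) for j in range(24) if S[i][j] != 0.0]
print(anomalies)
```

In this toy setting the nonzero entries of `S` mark where and by how much the data departs from the recurrent daily profile, which mirrors the role the abstract assigns to the shared sparse structure, while `L` plays the role of the expected traffic condition. The threshold trades sensitivity against false alarms and would need tuning on real data.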
