With the internet becoming easily accessible to many, voice calling over the internet is slowly gaining momentum, and individuals have been engaging in video communication across the world in different languages. The past decade also saw the emergence of language translation using neural networks. With more data being generated in audio and visual forms, analysing such information has become both a need and a challenge for researchers in academia and industry. The availability of video-chat corpora is limited, as organizations must protect user privacy and ensure data security. For this reason, an audio-visual communication system (VidALL) was developed and audio speech samples were extracted. To understand human behaviour when answering a video call, an analysis was conducted using polarity and vocal intensity as parameters. In parallel, a neural translation model was developed to translate English sentences into French, using simple-RNN and embedded-RNN architectures. BLEU scores and comparison against target sentences were used to check sentence correctness. The embedded-RNN model achieved an accuracy of 88.71% and predicted correct sentences. A key finding suggests that polarity is a good estimator of human emotion.
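To make the embedded-RNN approach concrete, the following is a minimal sketch of an embedding-based RNN translation model in a Keras-style setup. The vocabulary sizes, sequence lengths, and layer widths below are illustrative assumptions, not values reported in this work.

```python
# A minimal sketch of an embedded-RNN English-to-French translation model.
# All sizes and hyperparameters are illustrative assumptions, not values
# taken from the paper.
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Embedding, GRU, RepeatVector,
                                     TimeDistributed, Dense)

en_vocab = 200          # assumed English vocabulary size
fr_vocab = 350          # assumed French vocabulary size
en_len, fr_len = 15, 21 # assumed padded sequence lengths

model = Sequential([
    Input(shape=(en_len,)),
    # Map English token ids to dense vectors instead of one-hot inputs;
    # this embedding layer is what distinguishes the embedded RNN from
    # the simple RNN variant.
    Embedding(en_vocab, 64),
    # Encode the source sentence into a single summary vector.
    GRU(128),
    # Repeat the summary once per target time step.
    RepeatVector(fr_len),
    # Decode into a French-length sequence of hidden states.
    GRU(128, return_sequences=True),
    # Predict a French token distribution at each time step.
    TimeDistributed(Dense(fr_vocab, activation="softmax")),
])
model.compile(loss="sparse_categorical_crossentropy",
              optimizer="adam", metrics=["accuracy"])
model.summary()
```

Sentence correctness can then be estimated by comparing predicted sentences against target sentences, for example with a sentence-level BLEU score such as nltk.translate.bleu_score.sentence_bleu.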