Dhruv Ramani

VFNet: A Convolutional Architecture for Accent Classification

Oct 15, 2019

Asad Ahmed, Pratham Tangri, Anirban Panda, Dhruv Ramani, Samarjit Karmakar

Abstract: Understanding accents is a problem that can derail any human-machine interaction. Accent classification makes this task easier by identifying the accent a person is speaking, so that further processing can identify the correct words, since the same sounds can mean entirely different words in different accents of the same language. In this paper, we present VFNet (Variable Filter Net), a convolutional neural network (CNN) based architecture which captures a hierarchy of features to beat the previous benchmarks of accent classification, through a novel and elegant technique of applying variable filter sizes along the frequency band of the audio utterances.
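
The idea of filters of several sizes running along the frequency axis can be illustrated with a small NumPy sketch. This is only one possible reading of the abstract and not the authors' architecture: the function names, the kernel sizes (3, 5, 7), the fixed averaging weights, and the max-pooling step are all assumptions made for illustration, and a real CNN would learn its filter weights.

```python
import numpy as np


def conv_along_frequency(spec, kernel):
    """Slide `kernel` along the frequency axis of `spec` (valid mode).

    spec has shape (n_freq, n_time); every time frame is filtered independently.
    """
    k = len(kernel)
    n_freq, n_time = spec.shape
    out = np.empty((n_freq - k + 1, n_time))
    for i in range(n_freq - k + 1):
        # Weighted sum of k neighbouring frequency bins, for all frames at once.
        out[i] = kernel @ spec[i:i + k]
    return out


def variable_filter_features(spec, kernel_sizes=(3, 5, 7)):
    """Hypothetical helper: one feature row per filter size.

    Assumption: fixed averaging filters stand in for learned weights.
    """
    feats = []
    for k in kernel_sizes:
        kernel = np.full(k, 1.0 / k)
        response = conv_along_frequency(spec, kernel)
        # Max-pool over frequency, leaving one value per time frame.
        feats.append(response.max(axis=0))
    return np.stack(feats)  # shape: (len(kernel_sizes), n_time)


# Toy spectrogram: 40 frequency bins, 10 frames, energy in a single bin.
spec = np.zeros((40, 10))
spec[20, 0] = 1.0
features = variable_filter_features(spec)
```

Small filters respond strongly to narrow spectral peaks while larger ones average over wider bands, which is the intuition behind combining several sizes.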

* Accepted at IEEE INDICON 2019

A Short Survey On Memory Based Reinforcement Learning

Apr 14, 2019

Dhruv Ramani

Abstract: Reinforcement learning (RL) is a branch of machine learning used to solve sequential decision-making problems without explicit supervision. Owing to recent advances in deep learning, newly proposed Deep-RL algorithms perform extremely well in sophisticated high-dimensional environments. However, even after successes in many domains, one of the major challenges in these approaches is the large number of interactions with the environment required for efficient decision making. Taking inspiration from the brain, this problem can be addressed by incorporating instance-based learning, biasing decision making toward memories of high-reward experiences. This paper surveys recent reinforcement learning methods that incorporate external memory for decision making. We provide an overview of the different methods, along with their advantages and disadvantages, applications, and the standard experimental settings used for memory-based models. We hope this review serves as a helpful resource that offers key insights into recent advances in the field and aids its further development.
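
Biasing decisions toward memories of high-reward experiences can be sketched with a tiny tabular memory. This is a minimal illustration under stated assumptions, not a method taken from the survey: the class name `EpisodicMemory`, its methods, and the toy states and actions are all hypothetical.

```python
from collections import defaultdict


class EpisodicMemory:
    """Hypothetical tabular memory: highest return seen per (state, action)."""

    def __init__(self, actions):
        self.actions = list(actions)
        self.best_return = defaultdict(dict)  # state -> {action: best return}

    def store_episode(self, trajectory, gamma=1.0):
        """Record one episode given as (state, action, reward) tuples in time order."""
        g = 0.0
        # Walk backwards so g is the discounted return from each step onward.
        for state, action, reward in reversed(trajectory):
            g = reward + gamma * g
            previous = self.best_return[state].get(action, float("-inf"))
            self.best_return[state][action] = max(previous, g)

    def act(self, state):
        """Pick the action with the best remembered return; fall back to the first action."""
        known = self.best_return.get(state)
        if not known:
            return self.actions[0]
        return max(known, key=known.get)


memory = EpisodicMemory(["left", "right"])
memory.store_episode([("s0", "left", 0.0), ("s1", "left", 1.0)])
memory.store_episode([("s0", "right", 0.0), ("s1", "right", 5.0)])
chosen = memory.act("s0")
```

Because the memory is written after a single episode, a rewarding experience can influence behaviour immediately, rather than after many gradient updates.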

* arXiv admin note: text overlap with arXiv:1803.10760, arXiv:1803.01846, arXiv:1702.08360, arXiv:1805.12375, arXiv:1507.06527, arXiv:1810.02274, arXiv:1711.06677 by other authors

Autoencoder Based Architecture For Fast & Real Time Audio Style Transfer

Dec 26, 2018

Dhruv Ramani, Samarjit Karmakar, Anirban Panda, Asad Ahmed, Pratham Tangri

Abstract: Recently, there has been great interest in audio style transfer, where stylized audio is generated by imposing the style of a reference audio on the content of a target audio. We improve on current approaches, which use neural networks to extract the content and style of the audio signal, and propose a new autoencoder-based architecture for the task. This network generates stylized audio for a content audio in a single forward pass. The proposed architecture proves advantageous in both the quality of the audio produced and the time taken to train the network. We evaluate the network on speech signals to confirm the validity of our proposal.
