Augusto Seben da Rosa

FreeSVC: Towards Zero-shot Multilingual Singing Voice Conversion

Jan 09, 2025

Alef Iury Siqueira Ferreira, Lucas Rafael Gris, Augusto Seben da Rosa, Frederico Santos de Oliveira, Edresson Casanova, Rafael Teixeira Sousa, Arnaldo Candido Junior, Anderson da Silva Soares, Arlindo Galvão Filho

Abstract: This work presents FreeSVC, a promising multilingual singing voice conversion approach that leverages an enhanced VITS model with Speaker-invariant Clustering (SPIN) for better content representation and the State-of-the-Art (SOTA) speaker encoder ECAPA2. FreeSVC incorporates trainable language embeddings to handle multiple languages and employs an advanced speaker encoder to disentangle speaker characteristics from linguistic content. Designed for zero-shot learning, FreeSVC enables cross-lingual singing voice conversion without extensive language-specific training. We demonstrate that a multilingual content extractor is crucial for optimal cross-language conversion. Our source code and models are publicly available.

No Saved Kaleidosope: an 100% Jitted Neural Network Coding Language with Pythonic Syntax

Sep 17, 2024

Augusto Seben da Rosa, Marlon Daniel Angeli, Jorge Aikes Junior, Alef Iury Ferreira, Lucas Rafael Gris, Anderson da Silva Soares, Arnaldo Candido Junior, Frederico Santos de Oliveira, Gabriel Trevisan Damke, Rafael Teixeira Sousa

Abstract: We developed a jitted compiler for training Artificial Neural Networks using C++, LLVM and CUDA. It features object-oriented characteristics, strong typing, parallel workers for data pre-processing, pythonic syntax for expressions, PyTorch-like model declaration and Automatic Differentiation. We implement caching and pooling mechanisms to manage VRAM, and use cuBLAS for high-performance matrix multiplication and cuDNN for convolutional layers. In our experiments with Residual Convolutional Neural Networks on ImageNet, we reach similar speed but degraded performance. Also, the GRU network experiments show similar accuracy, but our compiler has degraded speed in that task. However, our compiler demonstrates promising results on the CIFAR-10 benchmark, in which we reach the same performance and about the same speed as PyTorch. We make the code publicly available at: https://github.com/NoSavedDATA/NoSavedKaleidoscope

* 12 pages, 3 figures and 3 tables

Yin Yang Convolutional Nets: Image Manifold Extraction by the Analysis of Opposites

Oct 24, 2023

Augusto Seben da Rosa, Frederico Santos de Oliveira, Anderson da Silva Soares, Arnaldo Candido Junior

Abstract: Computer vision has seen several advances, such as training optimizations and new architectures (pure attention, efficient blocks, vision-language models, and generative models, among others). These have improved performance in several tasks, such as classification. However, the majority of these models focus on modifications that move away from realistic neuroscientific approaches related to the brain. In this work, we adopt a more bio-inspired approach and present the Yin Yang Convolutional Network, an architecture that extracts the visual manifold; its blocks are intended to separate the analysis of colors and forms at its initial layers, simulating the occipital lobe's operations. Our results show that our architecture provides State-of-the-Art efficiency among low-parameter architectures on the CIFAR-10 dataset. Our first model reached 93.32% test accuracy, 0.8% more than the previous SOTA in this category, while having 150k fewer parameters (726k in total). Our second model uses 52k parameters, losing only 3.86% test accuracy. We also performed an analysis on ImageNet, where we reached 66.49% validation accuracy with 1.6M parameters. We make the code publicly available at: https://github.com/NoSavedDATA/YinYang_CNN.

* 12 pages, 5 tables and 6 figures
