Alessio Brutti

MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages

Oct 01, 2024
Via arXiv

Federating Dynamic Models using Early-Exit Architectures for Automatic Speech Recognition on Heterogeneous Clients

May 27, 2024
Via arXiv

Efficient Fine-tuning of Audio Spectrogram Transformers via Soft Mixture of Adapters

Feb 01, 2024
Via arXiv

Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers

Dec 07, 2023
Via arXiv

Continual Contrastive Spoken Language Understanding

Oct 04, 2023
Via arXiv

Training dynamic models using early exits for automatic speech recognition on resource-constrained devices

Sep 18, 2023
Via arXiv

An Experimental Review of Speaker Diarization methods with application to Two-Speaker Conversational Telephone Speech recordings

May 29, 2023
Via arXiv

Sequence-Level Knowledge Distillation for Class-Incremental End-to-End Spoken Language Understanding

May 23, 2023
Via arXiv

End-to-End Integration of Speech Separation and Voice Activity Detection for Low-Latency Diarization of Telephone Conversations

Mar 21, 2023
Via arXiv

Improving the Intent Classification accuracy in Noisy Environment

Mar 12, 2023
Via arXiv