Julien Pinquier

IRIT-SAMoVA

EMVD dataset: a dataset of extreme vocal distortion techniques used in heavy metal

Jun 24, 2024
Via arXiv

CoNeTTE: An efficient Audio Captioning system leveraging multiple datasets with Task Embedding

Sep 01, 2023
Via arXiv

Killing two birds with one stone: Can an audio captioning system also be used for audio-text retrieval?

Aug 29, 2023
Via arXiv

Multitask learning in Audio Captioning: a sentence embedding regression loss acts as a regularizer

May 02, 2023
Via arXiv

Is my automatic audio captioning system so bad? spider-max: a metric to consider several caption candidates

Nov 14, 2022
Via arXiv

Audio-video fusion strategies for active speaker detection in meetings

Jun 09, 2022
Via arXiv

End-to-end acoustic modelling for phone recognition of young readers

Mar 04, 2021
Via arXiv

Deep Neural Networks for Automatic Speech Processing: A Survey from Large Corpora to Limited Data

Mar 09, 2020
Via arXiv

Improving Vehicle Re-Identification using CNN Latent Spaces: Metrics Comparison and Track-to-track Extension

Oct 21, 2019
Via arXiv