Sara Papi

MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages

Oct 01, 2024

What the Harm? Quantifying the Tangible Impact of Gender Bias in Machine Translation with a Human-centered Study

Oct 01, 2024

How to Connect Speech Foundation Models and Large Language Models? What Matters and What Does Not

Sep 25, 2024

SimulSeamless: FBK at IWSLT 2024 Simultaneous Speech Translation

Jun 20, 2024

StreamAtt: Direct Streaming Speech-to-Text Translation with Attention-based Audio History Selection

Jun 10, 2024

SBAAM! Eliminating Transcript Dependency in Automatic Subtitling

May 17, 2024

How do Hyenas deal with Human Speech? Speech Recognition and Translation with ConfHyena

Feb 20, 2024

Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing?

Feb 19, 2024

Integrating Language Models into Direct Speech Translation: An Inference-Time Solution to Control Gender Inflection

Oct 24, 2023

Leveraging Timestamp Information for Serialized Joint Streaming Recognition and Translation

Oct 23, 2023