Patrick Pérez

DANTE

Vision-Speech Models: Teaching Speech Models to Converse about Images

Mar 19, 2025
Via arXiv

High-Fidelity Simultaneous Speech-To-Speech Translation

Feb 05, 2025
Via arXiv

Domain Adaptation with a Single Vision-Language Embedding

Oct 28, 2024
Via arXiv

Fine-Tuning CLIP's Last Visual Projector: A Few-Shot Cornucopia

Oct 07, 2024
Via arXiv

Winner-takes-all learners are geometry-aware conditional density estimators

Jun 07, 2024
Via arXiv

OccFeat: Self-supervised Occupancy Feature Prediction for Pretraining BEV Segmentation Networks

Apr 22, 2024
Via arXiv

POP-3D: Open-Vocabulary 3D Occupancy Prediction from Images

Jan 17, 2024
Via arXiv

Manipulating Trajectory Prediction with Backdoors

Jan 03, 2024
Via arXiv

CLIP-DINOiser: Teaching CLIP a few DINO tricks

Dec 19, 2023
Via arXiv

Reliability in Semantic Segmentation: Can We Use Synthetic Data?

Dec 14, 2023
Via arXiv