Alan Yuille

Johns Hopkins University

KeyVID: Keyframe-Aware Video Diffusion for Audio-Synchronized Visual Animation

Apr 13, 2025
Via arXiv

DINeMo: Learning Neural Mesh Models with no 3D Annotations

Mar 26, 2025
Via arXiv

X-LRM: X-ray Large Reconstruction Model for Extremely Sparse-View Computed Tomography Recovery in One Second

Mar 09, 2025
Via arXiv

Beyond Next-Token: Next-X Prediction for Autoregressive Visual Generation

Feb 27, 2025
Via arXiv

CoCa-CXR: Contrastive Captioners Learn Strong Temporal Structures for Chest X-Ray Vision-Language Understanding

Feb 27, 2025
Via arXiv

Dictionary-based Framework for Interpretable and Consistent Object Parsing

Feb 26, 2025
Via arXiv

PulseCheck457: A Diagnostic Benchmark for 6D Spatial Reasoning of Large Multimodal Models

Feb 13, 2025
Via arXiv

PulseCheck457: A Diagnostic Benchmark for Comprehensive Spatial Reasoning of Large Multimodal Models

Feb 12, 2025
Via arXiv

EigenLoRAx: Recycling Adapters to Find Principal Subspaces for Resource-Efficient Adaptation and Inference

Feb 07, 2025
Via arXiv

Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More

Feb 06, 2025
Via arXiv