Daniel Z. Kaplan

Robin: a Suite of Multi-Scale Vision-Language Models and the CHIRP Evaluation Benchmark

Jan 16, 2025
Via arXiv

SER Evals: In-domain and Out-of-domain Benchmarking for Speech Emotion Recognition

Aug 14, 2024
Via arXiv

Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers

Jan 21, 2024
Via arXiv