Salman Khan

Tracking Meets Large Multimodal Models for Driving Scenario Understanding

Mar 18, 2025

How Good is my Histopathology Vision-Language Foundation Model? A Holistic Benchmark

Mar 17, 2025

O-TPT: Orthogonality Constraints for Calibrating Test-time Prompt Tuning in Vision-Language Models

Mar 15, 2025

DriveLMM-o1: A Step-by-Step Reasoning Dataset and Large Multimodal Model for Driving Scenario Understanding

Mar 13, 2025

Hierarchical Self-Supervised Adversarial Training for Robust Vision Models in Histopathology

Mar 13, 2025

Handwritten Digit Recognition: An Ensemble-Based Approach for Superior Performance

Mar 08, 2025

LLMVoX: Autoregressive Streaming Text-to-Speech Model for Any LLM

Mar 06, 2025

LLM Post-Training: A Deep Dive into Reasoning Large Language Models

Feb 28, 2025

C-Drag: Chain-of-Thought Driven Motion Controller for Video Generation

Feb 27, 2025

AirCast: Improving Air Pollution Forecasting Through Multi-Variable Data Alignment

Feb 25, 2025