Afshin Dehghan

Cubify Anything: Scaling Indoor 3D Object Detection

Dec 05, 2024
Via arXiv

MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning

Sep 30, 2024
Via arXiv

SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models

Jul 22, 2024
Via arXiv

Understanding Alignment in Multimodal LLMs: A Comprehensive Study

Jul 02, 2024
Via arXiv

4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities

Jun 14, 2024
Via arXiv

4M: Massively Multimodal Masked Modeling

Dec 11, 2023
Via arXiv

GAUDI: A Neural Architect for Immersive 3D Scene Generation

Jul 27, 2022
Via arXiv

ARKitScenes -- A Diverse Real-World Dataset For 3D Indoor Scene Understanding Using Mobile RGB-D Data

Nov 17, 2021
Via arXiv

License Plate Detection and Recognition Using Deeply Learned Convolutional Neural Networks

Mar 28, 2017
Via arXiv

DAGER: Deep Age, Gender and Emotion Recognition Using Convolutional Neural Network

Mar 04, 2017
Via arXiv