Yong Jae Lee

Do Vision Models Develop Human-Like Progressive Difficulty Understanding?

Mar 17, 2025

Stay-Positive: A Case for Ignoring Real Image Features in Fake Image Detection

Feb 11, 2025

LASER: Lip Landmark Assisted Speaker Detection for Robustness

Jan 21, 2025

Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs

Jan 08, 2025

On the Effectiveness of Dataset Alignment for Fake Image Detection

Oct 15, 2024

TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models

Oct 15, 2024

Vinoground: Scrutinizing LMMs over Dense Temporal Reasoning with Short Videos

Oct 03, 2024

Removing Distributional Discrepancies in Captions Improves Image-Text Alignment

Oct 01, 2024

Cross-Modal Self-Supervised Learning with Effective Contrastive Units for LiDAR Point Clouds

Sep 10, 2024

VGBench: Evaluating Large Language Models on Vector Graphics Understanding and Generation

Jul 15, 2024