Sangyun Chung

Revisiting Misalignment in Multispectral Pedestrian Detection: A Language-Driven Approach for Cross-modal Alignment Fusion
Nov 27, 2024

SPARK: Multi-Vision Sensor Perception and Reasoning Benchmark for Large-scale Vision-Language Models
Aug 23, 2024

TroL: Traversal of Layers for Large Language and Vision Models
Jun 18, 2024

MSCoTDet: Language-driven Multi-modal Fusion for Improved Multispectral Pedestrian Detection
Mar 22, 2024