Wooyoung Kang

Honeybee: Locality-enhanced Projector for Multimodal LLM

Dec 11, 2023
Via arXiv

Large Language Models are Temporal and Causal Reasoners for Video Question Answering

Nov 06, 2023
Via arXiv

NICE: CVPR 2023 Challenge on Zero-shot Image Captioning

Sep 11, 2023
Via arXiv

Open-Vocabulary Object Detection using Pseudo Caption Labels

Mar 23, 2023
Via arXiv

Noise-aware Learning from Web-crawled Image-Text Data for Image Captioning

Dec 27, 2022
Via arXiv

Dense but Efficient VideoQA for Intricate Compositional Reasoning

Oct 19, 2022
Via arXiv