Kanta Kaneda

DM2RM: Dual-Mode Multimodal Ranking for Target Objects and Receptacles Based on Open-Vocabulary Instructions

Aug 15, 2024
Via arXiv

Polos: Multimodal Metric Learning from Human Feedback for Image Captioning

Feb 28, 2024
Via arXiv

Learning-To-Rank Approach for Identifying Everyday Objects Using a Physical-World Search Engine

Dec 26, 2023
Via arXiv

DialMAT: Dialogue-Enabled Transformer with Moment-Based Adversarial Training

Nov 12, 2023
Via arXiv

JaSPICE: Automatic Evaluation Metric Using Predicate-Argument Structures for Image Captioning Models

Nov 07, 2023
Via arXiv