Kiyoharu Aizawa

Harnessing PDF Data for Improving Japanese Large Multimodal Models

Feb 20, 2025

A Benchmark and Evaluation for Real-World Out-of-Distribution Detection Using Vision-Language Models

Jan 30, 2025

JMMMU: A Japanese Massive Multi-discipline Multimodal Understanding Benchmark for Culture-aware Evaluation

Oct 22, 2024

FoodMLLM-JP: Leveraging Multimodal Large Language Models for Japanese Recipe Generation

Sep 27, 2024

Training-Free Sketch-Guided Diffusion with Latent Optimization

Aug 31, 2024

Generalized Out-of-Distribution Detection and Beyond in Vision Language Model Era: A Survey

Jul 31, 2024

MangaUB: A Manga Understanding Benchmark for Large Multimodal Models

Jul 26, 2024

Zero-Shot Character Identification and Speaker Prediction in Comics via Iterative Multimodal Fusion

Apr 24, 2024

Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models

Mar 29, 2024

Entity-NeRF: Detecting and Removing Moving Entities in Urban Scenes

Mar 24, 2024