Yuta Nakashima

Measure Twice, Cut Once: Grasping Video Structures and Event Semantics with LLMs for Video Temporal Localization

Mar 12, 2025
Via arXiv

No Annotations for Object Detection in Art through Stable Diffusion

Dec 09, 2024
Via arXiv

VASCAR: Content-Aware Layout Generation via Visual-Aware Self-Correction

Dec 05, 2024
Via arXiv

ReLayout: Towards Real-World Document Understanding via Layout-enhanced Pre-training

Oct 14, 2024
Via arXiv

Putting People in LLMs' Shoes: Generating Better Answers via Question Rewriter

Aug 20, 2024
Via arXiv

SANER: Annotation-free Societal Attribute Neutralizer for Debiasing CLIP

Aug 19, 2024
Via arXiv

DiReCT: Diagnostic Reasoning for Clinical Notes via Large Language Models

Aug 06, 2024
Via arXiv

Explainable Image Recognition via Enhanced Slot-attention Based Classifier

Jul 08, 2024
Via arXiv

Resampled Datasets Are Not Enough: Mitigating Societal Bias Beyond Single Attributes

Jul 04, 2024
Via arXiv

From Descriptive Richness to Bias: Unveiling the Dark Side of Generative Image Caption Enrichment

Jun 20, 2024
Via arXiv