Yulei Niu

Where do Large Vision-Language Models Look at when Answering Questions?

Mar 18, 2025
Via arXiv

DiffLM: Controllable Synthetic Data Generation via Diffusion Language Models

Nov 05, 2024
Via arXiv

Unveiling Narrative Reasoning Limits of Large Language Models with Trope in Movie Synopses

Sep 22, 2024
Via arXiv

AIPO: Improving Training Objective for Iterative Preference Optimization

Sep 13, 2024
Via arXiv

Investigating Video Reasoning Capability of Large Language Models with Tropes in Movies

Jun 16, 2024
Via arXiv

WIDIn: Wording Image for Domain-Invariant Representation in Single-Source Domain Generalization

May 28, 2024
Via arXiv

RAP: Retrieval-Augmented Planner for Adaptive Procedure Planning in Instructional Videos

Mar 27, 2024
Via arXiv

SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional Videos

Mar 03, 2024
Via arXiv

Language Models are Causal Knowledge Extractors for Zero-shot Video Question Answering

Apr 07, 2023
Via arXiv

DiGeo: Discriminative Geometry-Aware Learning for Generalized Few-Shot Object Detection

Mar 16, 2023
Via arXiv