Zhiyang Chen

The 1st Solution for 4th PVUW MeViS Challenge: Unleashing the Potential of Large Multimodal Models for Referring Video Segmentation

Apr 07, 2025 (via arXiv)

Self-Guidance: Boosting Flow and Diffusion Generation on Their Own

Dec 08, 2024 (via arXiv)

Schedule On the Fly: Diffusion Time Prediction for Faster and Better Image Generation

Dec 02, 2024 (via arXiv)

SentiXRL: An advanced large language Model Framework for Multilingual Fine-Grained Emotion Classification in Complex Text Environment

Nov 27, 2024 (via arXiv)

Openstory++: A Large-scale Dataset and Benchmark for Instance-aware Open-domain Visual Storytelling

Aug 07, 2024 (via arXiv)

Griffon: Spelling out All Object Locations at Any Granularity with Large Language Models

Nov 27, 2023 (via arXiv)

Mitigating Hallucination in Visual Language Models with Visual Supervision

Nov 27, 2023 (via arXiv)

Efficient Masked Autoencoders with Self-Consistency

Feb 28, 2023 (via arXiv)

Obj2Seq: Formatting Objects as Sequences with Class Prompt for Visual Tasks

Sep 28, 2022 (via arXiv)

UniVIP: A Unified Framework for Self-Supervised Visual Pre-training

Mar 14, 2022 (via arXiv)