Liangke Gui

VideoAuteur: Towards Long Narrative Video Generation

Jan 10, 2025
Via arXiv

Is Your Text-to-Image Model Robust to Caption Noise?

Dec 27, 2024
Via arXiv

Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward

Apr 02, 2024
Via arXiv

Mega: Moving Average Equipped Gated Attention

Sep 26, 2022
Via arXiv

Training Vision-Language Transformers from Captions Alone

May 19, 2022
Via arXiv

KAT: A Knowledge Augmented Transformer for Vision-and-Language

Dec 16, 2021
Via arXiv