Shengqiong Wu

PanoSent: A Panoptic Sextuple Extraction Benchmark for Multimodal Conversational Aspect-based Sentiment Analysis

Aug 18, 2024

Enhancing Video-Language Representations with Structural Spatio-Temporal Alignment

Jun 27, 2024

OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding

Jun 27, 2024

Towards Semantic Equivalence of Tokenization in Multimodal LLM

Jun 07, 2024

Modeling Unified Semantic Discourse Structure for High-quality Headline Generation

Mar 23, 2024

NExT-GPT: Any-to-Any Multimodal LLM

Sep 13, 2023

Empowering Dynamics-aware Text-to-Video Diffusion with Large Language Models

Aug 26, 2023

DialogRE^C+: An Extension of DialogRE to Investigate How Much Coreference Helps Relation Extraction in Dialogs

Aug 12, 2023

LayoutLLM-T2I: Eliciting Layout Guidance from LLM for Text-to-Image Generation

Aug 12, 2023

ECQED: Emotion-Cause Quadruple Extraction in Dialogs

Jun 10, 2023