Caren Han

'No' Matters: Out-of-Distribution Detection in Multimodality Long Dialogue

Oct 31, 2024
Via arXiv

ChuLo: Chunk-Level Key Information Representation for Long Document Processing

Oct 14, 2024
Via arXiv

GEM-VPC: A dual Graph-Enhanced Multimodal integration for Video Paragraph Captioning

Oct 12, 2024
Via arXiv

Text-guided 3D Human Motion Generation with Keyframe-based Parallel Skip Transformer

May 24, 2024
Via arXiv

Game-MUG: Multimodal Oriented Game Situation Understanding and Commentary Generation Dataset

Apr 30, 2024
Via arXiv

PEACH: Pretrained-embedding Explanation Across Contextual and Hierarchical Structure

Apr 21, 2024
Via arXiv

M3-VRD: Multimodal Multi-task Multi-teacher Visually-Rich Form Document Understanding

Feb 28, 2024
Via arXiv

SceneGATE: Scene-Graph based co-Attention networks for TExt visual question answering

Dec 16, 2022
Via arXiv

An Analysis of Deep Reinforcement Learning Agents for Text-based Games

Sep 12, 2022
Via arXiv

RoViST: Learning Robust Metrics for Visual Storytelling

May 08, 2022
Via arXiv