Xinglin Hou

Towards Efficient and Effective Text-to-Video Retrieval with Coarse-to-Fine Visual Representation Learning

Jan 01, 2024
Via arXiv

Visual Captioning at Will: Describing Images and Videos Guided by a Few Stylized Sentences

Jul 31, 2023
Via arXiv

Edit As You Wish: Video Description Editing with Multi-grained Commands

May 15, 2023
Via arXiv

Attract me to Buy: Advertisement Copywriting Generation with Multimodal Multi-structured Information

May 07, 2022
Via arXiv

Dual-Level Decoupled Transformer for Video Captioning

May 06, 2022
Via arXiv

CapOnImage: Context-driven Dense-Captioning on Image

Apr 27, 2022
Via arXiv