Ying Cheng

CT2C-QA: Multimodal Question Answering over Chinese Text, Table and Chart

Oct 28, 2024

ADSNet: Cross-Domain LTV Prediction with an Adaptive Siamese Network in Advertising

Jun 15, 2024

Modality-Aware Contrastive Instance Learning with Self-Distillation for Weakly-Supervised Audio-Visual Violence Detection

Jul 12, 2022

IDEA: Increasing Text Diversity via Online Multi-Label Recognition for Vision-Language Pre-training

Jul 12, 2022

Self-Supervised Learning of Music-Dance Representation through Explicit-Implicit Rhythm Synchronization

Jul 07, 2022

Self-Supervised Video Representation Learning with Motion-Contrastive Perception

Apr 10, 2022

MM-Pyramid: Multimodal Pyramid Attentional Network for Audio-Visual Event Localization and Video Parsing

Nov 24, 2021

Domain Adaptive Cascade R-CNN for MItosis DOmain Generalization Challenge

Sep 29, 2021

MPN: Multimodal Parallel Network for Audio-Visual Event Localization

Apr 07, 2021

Look, Listen, and Attend: Co-Attention Network for Self-Supervised Audio-Visual Representation Learning

Aug 13, 2020