Dan Guo

Linguistics-Vision Monotonic Consistent Network for Sign Language Production

Dec 22, 2024

MOL-Mamba: Enhancing Molecular Representation with Structural & Electronic Insights

Dec 21, 2024

Prototypical Calibrating Ambiguous Samples for Micro-Action Recognition

Dec 19, 2024

Sign-IDD: Iconicity Disentangled Diffusion for Sign Language Production

Dec 19, 2024

Multimodal Class-aware Semantic Enhancement Network for Audio-Visual Video Parsing

Dec 17, 2024

ASAP: Advancing Semantic Alignment Promotes Multi-Modal Manipulation Detecting and Grounding

Dec 17, 2024

Dense Audio-Visual Event Localization under Cross-Modal Consistency and Multi-Temporal Granularity Collaboration

Dec 17, 2024

Patch-level Sounding Object Tracking for Audio-Visual Question Answering

Dec 14, 2024

Moderating the Generalization of Score-based Generative Model

Dec 10, 2024

Repetitive Action Counting with Hybrid Temporal Relation Modeling

Dec 10, 2024