Jiefeng Ma

RFL: Simplifying Chemical Structure Recognition with Ring-Free Language

Dec 10, 2024

EmotiveTalk: Expressive Talking Head Generation through Audio Information Decoupling and Emotional Video Diffusion

Nov 23, 2024

DAWN: Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation

Oct 17, 2024

See then Tell: Enhancing Key Information Extraction with Vision Grounding

Sep 29, 2024

DocMamba: Efficient Document Pre-training with State Space Model

Sep 18, 2024

SRFUND: A Multi-Granularity Hierarchical Structure Reconstruction Benchmark in Form Understanding

Jun 13, 2024

SEMv3: A Fast and Robust Approach to Table Separation Line Detection

May 20, 2024

A Study of Dropout-Induced Modality Bias on Robustness to Missing Video Frames for Audio-Visual Speech Recognition

Mar 07, 2024

Bidirectional Trained Tree-Structured Decoder for Handwritten Mathematical Expression Recognition

Dec 31, 2023

Hierarchical Audio-Visual Information Fusion with Multi-label Joint Decoding for MER 2023

Sep 11, 2023