Gongpeng Zhao

Ostrakon-VL: Towards Domain-Expert MLLM for Food-Service and Retail Stores

Jan 29, 2026
Via arXiv

DualDiff: Dual-branch Diffusion Model for Autonomous Driving with Semantic Fusion

May 03, 2025
Via arXiv

DualDiff+: Dual-Branch Diffusion for High-Fidelity Video Generation with Reward Guidance

Mar 05, 2025
Via arXiv

Multimodal Fusion Method with Spatiotemporal Sequences and Relationship Learning for Valence-Arousal Estimation

Mar 20, 2024
Via arXiv

AUD-TGN: Advancing Action Unit Detection with Temporal Convolution and GPT-2 in Wild Audiovisual Contexts

Mar 20, 2024
Via arXiv

Exploring Facial Expression Recognition through Semi-Supervised Pretraining and Temporal Modeling

Mar 19, 2024
Via arXiv

Local Region Perception and Relationship Learning Combined with Feature Fusion for Facial Action Unit Detection

Mar 19, 2023
Via arXiv

Exploring Large-scale Unlabeled Faces to Enhance Facial Expression Recognition

Mar 19, 2023
Via arXiv

A Dual Branch Network for Emotional Reaction Intensity Estimation

Mar 16, 2023
Via arXiv