Zhenan Sun

Affinity Contrastive Learning for Skeleton-based Human Activity Understanding

Jan 23, 2026

Dual-Phase LLM Reasoning: Self-Evolved Mathematical Frameworks

Jan 09, 2026

3SGen: Unified Subject, Style, and Structure-Driven Image Generation with Adaptive Task-specific Memory

Dec 22, 2025

TTP: Test-Time Padding for Adversarial Detection and Robust Adaptation on Vision-Language Models

Dec 18, 2025

Quantization Meets dLLMs: A Systematic Study of Post-training Quantization for Diffusion LLMs

Aug 20, 2025

ReMeREC: Relation-aware and Multi-entity Referring Expression Comprehension

Jul 22, 2025

TokLIP: Marry Visual Tokens to CLIP for Multimodal Comprehension and Generation

May 08, 2025

Learning Unknown Spoof Prompts for Generalized Face Anti-Spoofing Using Only Real Face Images

May 06, 2025

Learning Knowledge-based Prompts for Robust 3D Mask Presentation Attack Detection

May 06, 2025

VividListener: Expressive and Controllable Listener Dynamics Modeling for Multi-Modal Responsive Interaction

Apr 30, 2025