Hongsheng Li

GS-DiT: Advancing Video Generation with Pseudo 4D Gaussian Fields through Efficient Dense 3D Point Tracking

Jan 05, 2025

EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation

Jan 03, 2025

A3: Android Agent Arena for Mobile GUI Agents

Jan 02, 2025

GaussianPainter: Painting Point Cloud into 3D Gaussians with Normal Guidance

Dec 23, 2024

VividFace: A Diffusion-Based Hybrid Framework for High-Fidelity Video Face Swapping

Dec 15, 2024

EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM

Dec 12, 2024

SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding

Dec 12, 2024

StreamChat: Chatting with Streaming Video

Dec 11, 2024

FreeSim: Toward Free-viewpoint Camera Simulation in Driving Scenes

Dec 04, 2024

TimeWalker: Personalized Neural Space for Lifelong Head Avatars

Dec 03, 2024