Baining Guo

MageBench: Bridging Large Multimodal Models to Agents

Dec 05, 2024
Via arXiv

UniGraspTransformer: Simplified Policy Distillation for Scalable Dexterous Robotic Grasping

Dec 03, 2024
Via arXiv

CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation

Nov 29, 2024
Via arXiv

RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models

Jul 11, 2024
Via arXiv

Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms

Jun 13, 2024
Via arXiv

VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time

Apr 16, 2024
Via arXiv

GaussianCube: Structuring Gaussian Splatting using Optimal Transport for 3D Generative Modeling

Apr 05, 2024
Via arXiv

Simplified Diffusion Schrödinger Bridge

Mar 27, 2024
Via arXiv

RelationVLM: Making Large Vision-Language Models Understand Visual Relations

Mar 19, 2024
Via arXiv

VisualCritic: Making LMMs Perceive Visual Quality Like Humans

Mar 19, 2024
Via arXiv