Baining Guo

Diffusion Models without Classifier-free Guidance

Feb 17, 2025
Via arXiv

Optimizing Large Language Model Training Using FP4 Quantization

Jan 28, 2025
Via arXiv

MageBench: Bridging Large Multimodal Models to Agents

Dec 05, 2024
Via arXiv

UniGraspTransformer: Simplified Policy Distillation for Scalable Dexterous Robotic Grasping

Dec 03, 2024
Via arXiv

CogACT: A Foundational Vision-Language-Action Model for Synergizing Cognition and Action in Robotic Manipulation

Nov 29, 2024
Via arXiv

RodinHD: High-Fidelity 3D Avatar Generation with Diffusion Models

Jul 11, 2024
Via arXiv

Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms

Jun 13, 2024
Via arXiv

VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time

Apr 16, 2024
Via arXiv

GaussianCube: Structuring Gaussian Splatting using Optimal Transport for 3D Generative Modeling

Apr 05, 2024
Via arXiv

Simplified Diffusion Schrödinger Bridge

Mar 27, 2024
Via arXiv