Peng Gao

University of Massachusetts Amherst

Bandwidth-Adaptive Spatiotemporal Correspondence Identification for Collaborative Perception

Feb 17, 2025
Via arXiv

MME-CoT: Benchmarking Chain-of-Thought in Large Multimodal Models for Reasoning Quality, Robustness, and Efficiency

Feb 13, 2025
Via arXiv

A Data-Efficient Pan-Tumor Foundation Model for Oncology CT Interpretation

Feb 10, 2025
Via arXiv

Lumina-Video: Efficient and Flexible Video Generation with Multi-scale Next-DiT

Feb 10, 2025
Via arXiv

ReGNet: Reciprocal Space-Aware Long-Range Modeling and Multi-Property Prediction for Crystals

Feb 04, 2025
Via arXiv

Can We Generate Images with CoT? Let's Verify and Reinforce Image Generation Step by Step

Jan 23, 2025
Via arXiv

IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models

Jan 23, 2025
Via arXiv

Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models

Jan 14, 2025
Via arXiv

EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation

Jan 03, 2025
Via arXiv

TAR3D: Creating High-Quality 3D Assets via Next-Part Prediction

Dec 22, 2024
Via arXiv