Ziqi Pang

Aligning Generative Denoising with Discriminative Objectives Unleashes Diffusion for Visual Perception

Apr 15, 2025
Via arXiv

AgMMU: A Comprehensive Agricultural Multimodal Understanding and Reasoning Benchmark

Apr 14, 2025
Via arXiv

GLUS: Global-Local Reasoning Unified into A Single Large Language Model for Video Segmentation

Apr 10, 2025
Via arXiv

RandAR: Decoder-only Autoregressive Visual Generation in Random Orders

Dec 02, 2024
Via arXiv

InstructG2I: Synthesizing Images from Multimodal Attributed Graphs

Oct 09, 2024
Via arXiv

RMem: Restricted Memory Banks Improve Video Object Segmentation

Jun 12, 2024
Via arXiv

Frozen Transformers in Language Models Are Effective Visual Encoder Layers

Oct 19, 2023
Via arXiv

Streaming Motion Forecasting for Autonomous Driving

Oct 02, 2023
Via arXiv

MV-Map: Offboard HD-Map Generation with Multi-view Consistency

May 15, 2023
Via arXiv

Standing Between Past and Future: Spatio-Temporal Modeling for Multi-Camera 3D Multi-Object Tracking

Feb 07, 2023
Via arXiv