Wang Zhu

MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks

Oct 14, 2024

TLDR: Token-Level Detective Reward Model for Large Vision Language Models

Oct 07, 2024

ST-RetNet: A Long-term Spatial-Temporal Traffic Flow Prediction Method

Jul 13, 2024

Language Models can Infer Action Semantics for Classical Planners from Environment Feedback

Jun 04, 2024

Hybrid Transformer and Spatial-Temporal Self-Supervised Learning for Long-term Traffic Prediction

Jan 29, 2024

Does VLN Pretraining Work with Nonsensical or Irrelevant Instructions?

Dec 02, 2023

Efficient End-to-End Visual Document Understanding with Rationale Distillation

Nov 16, 2023

Chain-of-Questions Training with Latent Answers for Robust Multistep Question Answering

May 24, 2023

Generalization Differences between End-to-End and Neuro-Symbolic Vision-Language Reasoning Systems

Oct 26, 2022

Navigating Memory Construction by Global Pseudo-Task Simulation for Continual Learning

Oct 16, 2022