Zhizheng Zhang

Southeast University, China

A General Theory for Compositional Generalization

May 20, 2024

Text Grouping Adapter: Adapting Pre-trained Text Detector for Layout Analysis

May 13, 2024

VisualCritic: Making LMMs Perceive Visual Quality Like Humans

Mar 19, 2024

RelationVLM: Making Large Vision-Language Models Understand Visual Relations

Mar 19, 2024

NaVid: Video-based VLM Plans the Next Step for Vision-and-Language Navigation

Mar 01, 2024

SeD: Semantic-Aware Discriminator for Image Super-Resolution

Feb 29, 2024

Reinforced UI Instruction Grounding: Towards a Generic UI Task Automation API

Oct 07, 2023

Adaptive Frequency Filters As Efficient Global Token Mixers

Jul 26, 2023

When and Why Momentum Accelerates SGD: An Empirical Study

Jun 15, 2023

Responsible Task Automation: Empowering Large Language Models as Responsible Task Automators

Jun 02, 2023