Debing Zhang

MLLM-Selector: Necessity and Diversity-driven High-Value Data Selection for Enhanced Visual Instruction Tuning

Mar 26, 2025

The Devil Is in the Details: Tackling Unimodal Spurious Correlations for Generalizable Multimodal Reward Models

Mar 05, 2025

LoRA-Null: Low-Rank Adaptation via Null Space for Large Language Models

Mar 04, 2025

Cheems: A Practical Guidance for Building and Evaluating Chinese Reward Models from Scratch

Feb 24, 2025

Scalable Oversight for Superhuman AI via Recursive Self-Critiquing

Feb 07, 2025

SedarEval: Automated Evaluation using Self-Adaptive Rubrics

Jan 26, 2025

RedStar: Does Scaling Long-CoT Data Unlock Better Slow-Reasoning Systems?

Jan 20, 2025

Diff-Instruct*: Towards Human-Preferred One-step Text-to-image Generative Models

Oct 28, 2024

Rethinking Reward Model Evaluation: Are We Barking up the Wrong Tree?

Oct 08, 2024

CodePMP: Scalable Preference Model Pretraining for Large Language Model Reasoning

Oct 03, 2024