Xiangzheng Zhang

Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond

Mar 13, 2025

Utilizing Jailbreak Probability to Attack and Safeguard Multimodal LLMs

Mar 10, 2025

TinyR1-32B-Preview: Boosting Accuracy with Branch-Merge Distillation

Mar 06, 2025

Chain-of-Thought Matters: Improving Long-Context Language Models with Reasoning Path Supervision

Feb 28, 2025

Stress Testing Generalization: How Minor Modifications Undermine Large Language Model Performance

Feb 18, 2025

Expand VSR Benchmark for VLLM to Expertize in Spatial Rules

Dec 24, 2024