Hanze Dong

Automatic Curriculum Expert Iteration for Reliable LLM Reasoning

Oct 10, 2024
Via arXiv

MathHay: An Automated Benchmark for Long-Context Mathematical Reasoning in LLMs

Oct 07, 2024
Via arXiv

FIRST: Teach A Reliable Large Language Model Through Efficient Trustworthy Distillation

Aug 22, 2024
Via arXiv

ThinK: Thinner Key Cache by Query-Driven Pruning

Jul 30, 2024
Via arXiv

Faster Sampling via Stochastic Gradient Proximal Sampler

May 27, 2024
Via arXiv

Reverse Transition Kernel: A Flexible Framework to Accelerate Diffusion Inference

May 26, 2024
Via arXiv

RLHF Workflow: From Reward Modeling to Online RLHF

May 13, 2024
Via arXiv

An Improved Analysis of Langevin Algorithms with Prior Diffusion for Non-Log-Concave Sampling

Mar 10, 2024
Via arXiv

MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance

Jan 17, 2024
Via arXiv

Faster Sampling without Isoperimetry via Diffusion-based Monte Carlo

Jan 12, 2024
Via arXiv