Dacheng Tao

JD Explore Academy, JD.com, China

Enhancing Input-Label Mapping in In-Context Learning with Contrastive Decoding

Feb 19, 2025
Via arXiv

Reasoning with Reinforced Functional Token Tuning

Feb 19, 2025
Via arXiv

HRP: High-Rank Preheating for Superior LoRA Initialization

Feb 11, 2025
Via arXiv

Near-Optimal Online Learning for Multi-Agent Submodular Coordination: Tight Approximation and Communication Efficiency

Feb 07, 2025
Via arXiv

Leveraging Reasoning with Guidelines to Elicit and Utilize Knowledge for Enhancing Safety Alignment

Feb 06, 2025
Via arXiv

Quantum Machine Learning: A Hands-on Tutorial for Machine Learning Practitioners and Researchers

Feb 03, 2025
Via arXiv

The Energy Loss Phenomenon in RLHF: A New Perspective on Mitigating Reward Hacking

Jan 31, 2025
Via arXiv

TeZO: Empowering the Low-Rankness on the Temporal Dimension in the Zeroth-Order Optimization for Fine-tuning LLMs

Jan 31, 2025
Via arXiv

Panacea: Mitigating Harmful Fine-tuning for Large Language Models via Post-fine-tuning Perturbation

Jan 30, 2025
Via arXiv

JustLogic: A Comprehensive Benchmark for Evaluating Deductive Reasoning in Large Language Models

Jan 24, 2025
Via arXiv