Nan Jiang

Faculty of Information Technology, Beijing University of Technology, Beijing, China; Beijing Key Laboratory of Trusted Computing, Beijing, China; National Engineering Laboratory for Critical Technologies of Information Security Classified Protection, Beijing, China

A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce

Apr 15, 2025
Via arXiv

Context-Aware Adaptive Sampling for Intelligent Data Acquisition Systems Using DQN

Apr 12, 2025
Via arXiv

Improving Harmful Text Detection with Joint Retrieval and External Knowledge

Apr 03, 2025
Via arXiv

Is Best-of-N the Best of Them? Coverage, Scaling, and Optimality in Inference-Time Alignment

Mar 27, 2025
Via arXiv

Dynamic Motion Blending for Versatile Motion Editing

Mar 26, 2025
Via arXiv

Statistical Tractability of Off-policy Evaluation of History-dependent Policies in POMDPs

Mar 03, 2025
Via arXiv

An Exact Solver for Satisfiability Modulo Counting with Probabilistic Circuits

Mar 02, 2025
Via arXiv

Self-rewarding correction for mathematical reasoning

Feb 26, 2025
Via arXiv

Improving LLM General Preference Alignment via Optimistic Online Mirror Descent

Feb 24, 2025
Via arXiv

Model Selection for Off-policy Evaluation: New Algorithms and Experimental Protocol

Feb 11, 2025
Via arXiv