Kunhao Zheng

Optimizing Language Models for Inference Time Objectives using Reinforcement Learning

Mar 25, 2025
Via arXiv

The KoLMogorov Test: Compression by Code Generation

Mar 18, 2025
Via arXiv

Soft Policy Optimization: Online Off-Policy RL for Sequence Models

Mar 07, 2025
Via arXiv

PILAF: Optimal Human Preference Sampling for Reward Modeling

Feb 06, 2025
Via arXiv

What Makes Large Language Models Reason in (Multi-Turn) Code Generation?

Oct 10, 2024
Via arXiv

RLEF: Grounding Code LLMs in Execution Feedback with Reinforcement Learning

Oct 02, 2024
Via arXiv

D4FT: A Deep Learning Approach to Kohn-Sham Density Functional Theory

Mar 01, 2023
Via arXiv

Distilling Vision-Language Pre-training to Collaborate with Weakly-Supervised Temporal Action Localization

Dec 19, 2022
Via arXiv

Formal Mathematics Statement Curriculum Learning

Feb 03, 2022
Via arXiv

Prompting Visual-Language Models for Efficient Video Understanding

Dec 08, 2021
Via arXiv