Hugh Zhang

Michael Pokorny

Humanity's Last Exam

Jan 24, 2025
Via arXiv

Planning In Natural Language Improves LLM Search For Code Generation

Sep 05, 2024
Via arXiv

LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet

Aug 27, 2024
Via arXiv

Learning Goal-Conditioned Representations for Language Reward Models

Jul 18, 2024
Via arXiv

NATURAL PLAN: Benchmarking LLMs on Natural Language Planning

Jun 06, 2024
Via arXiv

A Careful Examination of Large Language Model Performance on Grade School Arithmetic

May 02, 2024
Via arXiv

Q-Probe: A Lightweight Approach to Reward Maximization for Language Models

Feb 22, 2024
Via arXiv

Easy as ABCs: Unifying Boltzmann Q-Learning and Counterfactual Regret Minimization

Feb 19, 2024
Via arXiv

Chain-of-Thought Reasoning is a Policy Improvement Operator

Sep 15, 2023
Via arXiv

Trading Off Diversity and Quality in Natural Language Generation

Apr 22, 2020
Via arXiv