Kevin Jamieson

Overcoming the Sim-to-Real Gap: Leveraging Simulation to Learn to Explore for Real-World RL

Oct 26, 2024

Cost-Effective Proxy Reward Model Construction with On-Policy and Active Learning

Jul 02, 2024

Humor in AI: Massive Scale Crowd-Sourced Preferences and Benchmarks for Cartoon Captioning

Jun 15, 2024

Sample Complexity Reduction via Policy Difference Estimation in Tabular Reinforcement Learning

Jun 11, 2024

CLIPLoss and Norm-Based Data Selection Methods for Multimodal Contrastive Learning

May 29, 2024

Variance Alignment Score: A Simple But Tough-to-Beat Data Selection Method for Multimodal Contrastive Learning

Feb 03, 2024

An Experimental Design Framework for Label-Efficient Supervised Finetuning of Large Language Models

Jan 12, 2024

Fair Active Learning in Low-Data Regimes

Dec 13, 2023

Minimax Optimal Submodular Optimization with Bandit Feedback

Oct 27, 2023

Near-Optimal Pure Exploration in Matrix Games: A Generalization of Stochastic Bandits & Dueling Bandits

Oct 25, 2023