Jeff Schneider

Carnegie Mellon University

Bayes Adaptive Monte Carlo Tree Search for Offline Model-based Reinforcement Learning

Oct 15, 2024

Decentralized Uncertainty-Aware Active Search with a Team of Aerial Robots

Oct 11, 2024

Measure Preserving Flows for Ergodic Search in Convoluted Environments

Sep 13, 2024

Beyond Parameter Count: Implicit Bias in Soft Mixture of Experts

Sep 02, 2024

Assigning Credit with Partial Reward Decoupling in Multi-Agent Proximal Policy Optimization

Aug 08, 2024

Soft-QMIX: Integrating Maximum Entropy For Monotonic Value Function Factorization

Jun 20, 2024

Planning with Adaptive World Models for Autonomous Driving

Jun 15, 2024

What is Your Data Worth to GPT? LLM-Scale Data Valuation with Influence Functions

May 22, 2024

Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data

Apr 23, 2024

Full Shot Predictions for the DIII-D Tokamak via Deep Recurrent Networks

Apr 18, 2024