Martin Müller

ETGL-DDPG: A Deep Deterministic Policy Gradient Algorithm for Sparse Reward Continuous Control

Oct 07, 2024
Via arXiv

Efficiently Training Neural Networks for Imperfect Information Games by Sampling Information Sets

Jul 08, 2024
Via arXiv

Neural Network-based Information Set Weighting for Playing Reconnaissance Blind Chess

Jul 08, 2024
Via arXiv

Contrastive Learning of Preferences with a Contextual InfoNCE Loss

Jul 08, 2024
Via arXiv

Learning With Generalised Card Representations for "Magic: The Gathering"

Jul 08, 2024
Via arXiv

Expected Work Search: Combining Win Rate and Proof Size Estimation

May 09, 2024
Via arXiv

Monte Carlo Tree Search in the Presence of Transition Uncertainty

Dec 18, 2023
Via arXiv

Supervised and Reinforcement Learning from Observations in Reconnaissance Blind Chess

Aug 03, 2022
Via arXiv

Quantity vs Quality: Investigating the Trade-Off between Sample Size and Label Reliability

Apr 20, 2022
Via arXiv

Cedille: A large autoregressive French language model

Feb 07, 2022
Via arXiv