Wonseok Jeon

AdaEDL: Early Draft Stopping for Speculative Decoding of Large Language Models via an Entropy-based Lower Bound on Token Acceptance Probability

Oct 24, 2024
Via arXiv

On Speculative Decoding for Multimodal Large Language Models

Apr 13, 2024
Via arXiv

Direct Alignment of Draft Model for Speculative Decoding with Chat-Fine-Tuned LLMs

Mar 08, 2024
Via arXiv

Recursive Speculative Decoding: Accelerating LLM Inference via Sampling Without Replacement

Mar 05, 2024
Via arXiv

Local Metric Learning for Off-Policy Evaluation in Contextual Bandits with Continuous Actions

Oct 25, 2022
Via arXiv

Neural Topological Ordering for Computation Graphs

Jul 13, 2022
Via arXiv

OptiDICE: Offline Policy Optimization via Stationary Distribution Correction Estimation

Jun 21, 2021
Via arXiv

Regularized Inverse Reinforcement Learning

Oct 07, 2020
Via arXiv

Adversarial Soft Advantage Fitting: Imitation Learning without Policy Optimization

Jun 23, 2020
Via arXiv

Scalable Multi-Agent Inverse Reinforcement Learning via Actor-Attention-Critic

Feb 24, 2020
Via arXiv